What is Digitization?


Digitize - To translate into a digital form. For example, optical scanners digitize images by translating them into bit maps. It is also possible to digitize sound, video, and any type of movement. In all these cases, digitization is performed by sampling at discrete intervals. To digitize sound, for example, a device measures a sound wave's amplitude many times per second. These numeric values can then be recorded digitally. - from Webopedia http://www.webopedia.com/TERM/d/digitize.html
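The sampling idea in the definition above can be sketched in a few lines of Python. This is only an illustration, not how any particular device works: the function name sample_wave, the 440 Hz tone, the 8000 samples-per-second rate, and the 256 amplitude levels are all assumptions chosen for the example.

```python
import math

def sample_wave(frequency_hz, sample_rate_hz, duration_s, levels=256):
    """Sample a sine wave at discrete intervals and store each amplitude as an integer."""
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz  # time of the n-th sample, in seconds
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # between -1.0 and 1.0
        # Map the continuous amplitude onto one of `levels` integer steps (0 .. levels - 1).
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples

# Hypothetical values: one millisecond of a 440 Hz tone, measured 8000 times per second.
samples = sample_wave(frequency_hz=440, sample_rate_hz=8000, duration_s=0.001)
print(len(samples), samples)
```

Each number in the list is one measurement of the wave. Raising the sample rate or the number of levels records the wave more faithfully, at the cost of more data.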

Digitization
- Professor Mike Gerhard's Definition of Digitization:  http://www.tcom.bsu.edu/tcom101/trends.htm  - "Digitization or computerization, refers to the shift to a society where computers are ubiquitous; to carry out, control, or conduct by means of a computer.  Digital refers to communication signals or information presented in a discrete form--usually in a binary or two-state way--0 or 1." - from Digitization trends http://www.bsu.edu/web/jladams2/trends.html

The words digitize and digitization are subjective terms, used by different people to mean different activities in the following continuum:

Scan
File: 1.jpg 2.jpg 3.jpg

Scan
Archive or master file: srp05_moms.tif
Web or access file: srp05_moms.jpg srp05_moms10.jpg

Scan
Archive or master file: srp05_moms.tif
Web or access file: srp05_moms.jpg srp05_moms10.jpg
Descriptive text file: srp05_moms.txt

Scan
Archive or master file: srp05_moms.tif
Web or access file: srp05_moms.jpg srp05_moms10.jpg
Descriptive text file: srp05_moms.txt
Web page with description in body and/or meta tags: srp05_moms.html
List of pages, organized, searchable.

Scan
Archive or master file: srp05_moms.tif
Web or access file: srp05_moms.jpg srp05_moms10.jpg
Descriptive text file: srp05_moms.txt
Web page with description in body and/or meta tags: srp05_moms.html
Entry in library catalog, database.
Online exhibit with introductory and related material, lesson plans, etc.

Digitization - (1) The process of creating a digital image; (2) the process of creating a digital image and then presenting it on a computer, local area network or the Internet; (3) the process of creating a digital image with an accompanying description of the original and the image file, in an organized and/or searchable fashion, and presenting it on a computer, local server or the Internet.
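The file names in the continuum above share one base name (srp05_moms) and differ only in their extension. A minimal Python sketch of that naming pattern follows. The function name companion_files and the role labels are assumptions made for illustration. The second access file, srp05_moms10.jpg, is left out because the handout does not say how it is derived.

```python
from pathlib import PurePosixPath

def companion_files(master_name):
    """Return the file names that accompany a master scan, keyed by role."""
    stem = PurePosixPath(master_name).stem  # "srp05_moms.tif" -> "srp05_moms"
    return {
        "master": master_name,         # archive or master file
        "access": f"{stem}.jpg",       # web or access file
        "description": f"{stem}.txt",  # descriptive text file
        "web_page": f"{stem}.html",    # web page presenting the image
    }

for role, name in companion_files("srp05_moms.tif").items():
    print(f"{role}: {name}")
```

Keeping one base name per item makes it easy to find every file that belongs to the same original.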


A glossary of terms

The following are from the Colorado Digitization Project - http://www.cdpheritage.org/resource/introduction/rsrc_glossary.html

Archival Image - An image meant to have lasting utility. Archival images are usually kept off-line on a cheaper storage medium such as CD-ROM or magnetic tape, in a secure environment. Archival images are of a higher resolution and quality than the digital image delivered to the user on-screen. The file format most often associated with archival images is TIFF, or Tagged Image File Format, as compared to on-screen viewing file formats, which are usually JPEGs and GIFs. [Also now called a Master copy. The copy for the web is the Access copy and for printing is the Print copy]

Digital Image - An electronic photograph scanned from an original document... a representation of whatever is being scanned, whether it be manuscripts, text, photographs, maps, drawings, blueprints, halftones, musical scores, 3-D objects, etc.

Dots per inch (dpi) - A measurement of the scanning resolution of an image or the quality of an output device. DPI expresses the number of dots a printer can print per inch, or that a monitor can display, both horizontally and vertically.

GIF - Graphics Interchange Format. A widely supported image storage format promoted by CompuServe for use on the web. [Access copy]

HTML - Hypertext Markup Language. An encoding format for linking and identifying electronic documents and used to deliver information on the World Wide Web.
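As an illustration of a web page that carries a description in both its body and its meta tags, here is a minimal Python sketch that writes such a page as text. The function name build_page and the sample title and description are hypothetical. Only the file name srp05_moms.jpg comes from this handout.

```python
from html import escape

def build_page(title, description, image_file):
    """Build a small HTML page with the description in a meta tag and in the body."""
    safe_title = escape(title)
    safe_description = escape(description)  # also escapes quotes, so it is safe in attributes
    safe_image = escape(image_file)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        f"<title>{safe_title}</title>\n"
        f'<meta name="description" content="{safe_description}">\n'
        "</head>\n"
        "<body>\n"
        f'<img src="{safe_image}" alt="{safe_title}">\n'
        f"<p>{safe_description}</p>\n"
        "</body>\n"
        "</html>\n"
    )

# Hypothetical title and description, for illustration only.
page = build_page("Sample photograph", "Scanned photograph & caption.", "srp05_moms.jpg")
print(page)
```

The meta tag is read by search tools, while the paragraph in the body is what a visitor sees next to the image.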

JPEG - Joint Photographic Experts Group. A compression algorithm for condensing the size of image files. JPEGs are helpful in allowing access to full screen image files on-line because they require less storage and are therefore quicker to download into a web page. [Access copy]

Pixel - Often referred to as dot, as in "dots per inch". "Pixel" is short for picture element. Pixels make up an image, similar to grains in a photograph or dots in a half-tone. Each pixel can represent a number of different shades or colors, depending on how much storage space is allocated for it. Pixels per inch (ppi) is sometimes the preferred term, as it more accurately describes the digital image.

Resolution - The number of pixels (in both height and width) making up an image. The more pixels in an image, the higher the resolution, and the higher the resolution of an image, the greater its clarity and definition (and the larger the file size).
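The dpi, pixel, and resolution entries above combine into simple arithmetic: the size of the original times the scanning resolution gives the pixel dimensions, and the pixel count times the storage per pixel gives the uncompressed file size. The Python sketch below assumes a 4 x 6 inch print scanned at 300 dpi with 24 bits of storage per pixel. These numbers and the function names are illustrative assumptions, not recommendations from the glossary.

```python
def scan_dimensions(width_in, height_in, dpi):
    """Pixel width and height of a scan of the given size at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)

def uncompressed_bytes(width_px, height_px, bits_per_pixel=24):
    """Storage needed before any compression (8 bits per byte)."""
    return width_px * height_px * bits_per_pixel // 8

# Assumed example: a 4 x 6 inch print scanned at 300 dpi in 24-bit color.
width_px, height_px = scan_dimensions(4, 6, 300)
print(width_px, height_px)                      # 1200 1800
print(uncompressed_bytes(width_px, height_px))  # 6480000
```

Doubling the dpi doubles both the width and the height in pixels, so the uncompressed size grows fourfold.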

Scanner - A device for capturing a digital image. There are many types of scanners, such as flatbed scanners, drum scanners, slide scanners, and microfilm scanners.

TIFF - Tagged Image File Format. A file storage format implemented on a wide variety of computer systems, usually used for archival scans. [Master copy]

URL - Uniform Resource Locator. A standard addressing scheme used to locate or reference files on the Internet. Used in World Wide Web documents to locate files. A URL gives the type of resource being used and the path to the file. The syntax used is: scheme://host.domain/path/filename.
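The parts of a URL named in the entry above can be pulled apart with Python's standard urllib.parse module. The sketch uses the Webopedia address cited at the top of this handout. The function name describe_url is an assumption for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into its scheme, host, path, and file name."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,                      # type of resource, e.g. http
        "host": parts.hostname,                      # host.domain
        "path": parts.path,                          # path to the file, including its name
        "filename": PurePosixPath(parts.path).name,  # last segment of the path
    }

print(describe_url("http://www.webopedia.com/TERM/d/digitize.html"))
```

For this address the scheme is http, the host is www.webopedia.com, and the file name is digitize.html.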

World Wide Web (WWW) - An interconnected network of electronic hypermedia documents available on the Internet. WWW documents are marked up in HTML. Cross references or hyperlinks between documents are recorded in the form of URLs.


The following are from the TechWeb TechEncyclopedia http://www.techweb.com/encyclopedia/

Database - A set of related files that is created and managed by a database management system (DBMS). Today, DBMSs can manage any form of data including text, images, sound and video. Database and file structures are always determined by the software. As far as the hardware is concerned, it's all bits and bytes.

OCR (Optical Character Recognition) - The machine recognition of printed characters. OCR systems can recognize many different OCR fonts, as well as typewriter and computer-printed characters. Advanced OCR systems can recognize hand printing. When a text document is scanned into the computer, it is turned into a bitmap, which is a picture of the text. OCR software analyzes the light and dark areas of the bitmap in order to identify each alphabetic letter and numeric digit. When it recognizes a character, it converts it into ASCII text. Hand printing is much more difficult to analyze than machine-printed characters. Old, worn and smudged documents are also difficult. Scanning documents and processing them with OCR is sometimes as much an art as it is a science.

Diane Berry for WNYLRC 9/29/05