7. Preservation images: grayscale and color

Grayscale images. The committee accepted the Picture Elements proposal to use 8-bit-per-pixel capture for grayscale images. The capture of 8-bit grayscale information (providing 256 shades of gray) is sufficient, since studies have shown that a person can only distinguish 32 to 64 shades of gray on a display screen. Furthermore, since the source documents are primarily modal in nature (characterized by dark markings from a limited set of relatively uniform darknesses on a relatively uniform light background), a smaller number of shades would probably be sufficient. JPEG (although only the lossless version, not the widely implemented baseline JPEG used in this project) supports up to 12-bit luminances, which would be more useful in scientific or pictorial-image applications.
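
The relation between bit depth and the number of available shades is simple arithmetic. The following Python sketch is offered only as an illustration; the function name shades is hypothetical and is not part of any software used in the project.

```python
def shades(bits_per_pixel: int) -> int:
    """Return the number of distinct gray levels at a given bit depth."""
    if bits_per_pixel < 1:
        raise ValueError("bit depth must be at least 1")
    return 2 ** bits_per_pixel


# Bit depths mentioned in the discussion above, plus 1-bit (binary) for contrast.
for bits in (1, 5, 6, 8, 12):
    print(f"{bits:>2} bits per pixel: {shades(bits)} shades")
```

By this arithmetic, the 32 to 64 shades cited above correspond to 5 or 6 bits per pixel, so 8-bit capture leaves some headroom beyond what a viewer can distinguish on screen.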

Color images. The committee discussed but was unable to fully resolve the question of when images should be captured in color. The software used to view JPEG image files generally supports both color and grayscale, so there is no technical impediment to having a mixture of color and grayscale images within the same collection. The argument against color is that it typically takes extra time to capture each color image and that color images require more storage space.

The consultants' initial proposal regarding color was that color information be captured whenever color was incorporated in the document by its creator(s), what one consultant called intentional color. For example, the Federal Theatre Project includes typed playscripts in which a production company--in effect, a kind of second-echelon creator of the archived document--underlined elements in red. This, the consultants argued, warranted color capture. In contrast, yellowing paper "just happened," and the consultants suggested that color capture was not warranted.

The committee response to this proposed principle was mixed. Some agreed, while others stated their belief that elements like the red underlining need not be captured in color. One curator pointed out that many similar examples existed in microfilmed collections and that there was no evidence that the lack of color on the film had impeded historical research. Other committee members, however, advanced an argument that would have the opposite effect and warrant color capture in a far greater number of cases. These members stated that the yellowing of paper, for example, is significant to understanding the artifact and thus should be captured. But this argument did not prevail and the consultants were instructed to produce the testbed images using their rule for "intentional color" to determine when to capture color.

There was no particular discussion of the relative value of so-called true color, generally understood to mean the capture of 24 bits of data for each pixel, versus the capture of reduced color information, e.g., 8 bits per pixel. The consultants reported that 8-bit color is often called paletted color, because an appropriate palette of colors is selected for each individual image. This process is called color quantization. A rich set of algorithms or methods exists for choosing this palette, but each introduces a different distortion. True color avoids this decision-making process (the creation of the palette), which could be error-prone. The argument is parallel to the one that holds that thresholding a grayscale image into a binary image increases the risk of losing information. In fact, thresholding may be viewed as an extreme type of color quantization.
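
The parallel between quantization and thresholding can be made concrete with a small Python sketch. It is an illustration under stated assumptions: it uses a fixed, evenly spaced gray palette rather than the per-image palette selection the consultants described, and the function name quantize and the sample pixel values are hypothetical.

```python
def quantize(pixels, levels):
    """Map 8-bit gray values (0-255) onto `levels` evenly spaced values.

    This is uniform quantization with a fixed palette, assumed here for
    simplicity; real palette selection adapts the palette to each image.
    """
    if levels < 2:
        raise ValueError("need at least two levels")
    step = 255 / (levels - 1)  # spacing between adjacent palette entries
    return [round(round(p / step) * step) for p in pixels]


row = [12, 60, 130, 200, 250]   # hypothetical pixel values
print(quantize(row, 4))         # four-entry palette: 0, 85, 170, 255
print(quantize(row, 2))         # two-entry palette: every pixel becomes 0 or 255
```

With a two-entry palette the function reduces to thresholding at the midpoint of the range. In the four-entry case the values 130 and 200 both collapse to 170, a small instance of the distortion that any palette choice introduces.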

A strong argument can be made that a limited palette is well suited to the manuscript case, where few distinctions of shading exist as compared to, say, a color photograph. Unfortunately, no sophisticated compression algorithm (based on the human visual model) exists for paletted images. JPEG compensates considerably for the surfeit of color information in a 24-bit image: it retains the chrominance components of the image at only half the spatial resolution of the luminance (grayscale) information (so-called 4:2:2 color). In addition, JPEG quantizes the chrominance information more coarsely than it does the luminance information.
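
The effect of 4:2:2 sampling on the amount of data can be sketched in Python as follows. This is an illustration only: it assumes that the halving is applied horizontally (the usual meaning of 4:2:2) and that adjacent chrominance samples are simply averaged; the function names and the sample values are hypothetical, and an actual JPEG encoder may filter differently.

```python
def samples_422(width, height):
    """Return (luminance, chrominance) sample counts for a 4:2:2 image."""
    luma = width * height
    chroma_width = (width + 1) // 2      # half the columns, rounded up
    chroma = 2 * chroma_width * height   # two chrominance components
    return luma, chroma


def subsample_row(chroma_row):
    """Halve one chrominance row by averaging adjacent pairs of samples."""
    out = []
    for i in range(0, len(chroma_row), 2):
        pair = chroma_row[i:i + 2]       # the last pair may hold one sample
        out.append(sum(pair) // len(pair))
    return out


luma, chroma = samples_422(640, 480)
full_color = 3 * 640 * 480               # three full-resolution components
print(luma + chroma, "samples instead of", full_color)
print(subsample_row([100, 104, 90, 94, 80]))
```

Under these assumptions a 4:2:2 image carries two samples per pixel on average instead of three, before the coarser quantization of the chrominance is taken into account.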

Interestingly, for document-type images a JPEG color image is typically only 20 percent larger than a JPEG grayscale image of the same item. This argues strongly for including occasional color in the collection. Militating against this view is the observation that many scanners take three times as long to capture a color scan as a grayscale scan.
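
The tradeoff between storage and capture time can be estimated with a short Python sketch. The 20 percent size increase and the threefold scan time come from the figures above; the collection size, the share of color images, the per-image file size, and the per-image scan time are hypothetical inputs chosen only for illustration, as is the function name.

```python
def collection_cost(n_images, color_fraction, gray_kb, gray_scan_s,
                    size_factor=1.2, time_factor=3.0):
    """Estimate total storage (KB) and scan time (s) for a mixed collection.

    size_factor and time_factor reflect the figures quoted in the text;
    every other argument is a hypothetical input.
    """
    n_color = n_images * color_fraction
    n_gray = n_images - n_color
    storage_kb = n_gray * gray_kb + n_color * gray_kb * size_factor
    scan_s = n_gray * gray_scan_s + n_color * gray_scan_s * time_factor
    return storage_kb, scan_s


# Hypothetical collection: 1,000 images, 10 percent of them in color,
# 100 KB and 10 seconds per grayscale image.
mixed = collection_cost(1000, 0.10, gray_kb=100, gray_scan_s=10)
all_gray = collection_cost(1000, 0.0, gray_kb=100, gray_scan_s=10)
print("storage increase:", mixed[0] / all_gray[0] - 1)
print("scan time increase:", mixed[1] / all_gray[1] - 1)
```

With these hypothetical inputs, capturing one image in ten in color adds about 2 percent to storage but about 20 percent to scanning time, which suggests that capture time rather than storage is the larger cost of occasional color.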

Although it is possible that 8 bits or fewer of color information would suffice for many manuscript documents, the widespread use of 24-bit color in the capture of pictorial matter led this project to adopt a true-color approach.

A section on color images is found in Appendix A.
