Text Quality Review Reference Docs
 
 

Filenaming Structures Summary

Relevant Sections from RFP 96-18
Section C.10 Filenames and Delivery Directories (pp. C-23 to C-25)
Section J, Attachment 4 (pp. J-24 to J-35)
Note on Filenames
These structures will be used wherever possible. Additional structures may be added if the existing structures lack certain features required by a given collection.

All item IDs and filenames are lowercase and consist of not more than 8 characters.

Filename Pattern Legend:
c  Control page number
p  Print page number
f  Feature identifier
x  Horizontal grid coordinate, alpha (for segmented images)
y  Vertical grid coordinate, numeric (for segmented images)
  1. Filename/Directory Structure 1: Unnumbered Documents in Folders
    1. Typically used: manuscript documents in folders described to the folder level.
    2. Existing collection using this structure: Margaret Mead Collection
    3. Item ID: Based on original container and folder number, becomes a directory name.
      1. Example 1 (4th folder found in container 102): 10204
      2. Example 2 (10th folder found in container 50): 05010
    4. Filename Pattern: ccccf
      1. Based on the sequential position of the page within the folder; a trailing feature identifier marks each page that starts a new document.
      2. Example 1 (1st page in the folder, starts new document): 0001d.tif
      3. Example 2 (3rd page in the folder): 0003.tif
      4. Example 3 (12th page in folder, starts new document): 0012d.tif
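
The Structure 1 rules above can be sketched in Python as follows. This is a minimal illustration, not part of the RFP: the function names are hypothetical, and the three-digit container width, two-digit folder width, and the use of "d" as the new-document feature identifier are inferred from the examples.

```python
def folder_item_id(container: int, folder: int) -> str:
    """Build a Structure 1 item ID (directory name).

    Assumes 3 digits for the container and 2 for the folder, as in the examples.
    """
    if not 0 <= container <= 999:
        raise ValueError(f"container out of range: {container}")
    if not 0 <= folder <= 99:
        raise ValueError(f"folder out of range: {folder}")
    return f"{container:03d}{folder:02d}"


def folder_page_filename(page: int, starts_document: bool = False,
                         ext: str = "tif") -> str:
    """Build a ccccf filename: 4-digit sequential page plus optional feature."""
    if not 1 <= page <= 9999:
        raise ValueError(f"page out of range: {page}")
    feature = "d" if starts_document else ""  # "d" inferred from the examples
    return f"{page:04d}{feature}.{ext}"


print(folder_item_id(102, 4))                          # 10204
print(folder_page_filename(12, starts_document=True))  # 0012d.tif
```

Both results stay within the 8-character limit because the field widths are fixed.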

  2. Filename/Directory Structure 2A: Bibliographic Record/Print-Page Number Structure
    1. Typically used: books without machine-readable text
    2. Existing collection using this structure: none
    3. Item ID: Based on LCCN, becomes a directory name.
      1. Example 1 (California all the way back to 1828. LCCN: ca4-14356): 014356
      2. Example 2 (Along the bowstring or south shore of Lake Superior. LCCN: 03-6059): 06059
    4. Filename Pattern: cccppppf
      1. Based on the sequential page position in the book, the print page number, and any special page feature (e.g., title page, table of contents).
      2. Example 1 (4th page, no print page number, table of contents): 0040000n.tif
      3. Example 2 (25th page, print page number 18, no feature): 0250018.tif
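
A minimal sketch of the cccppppf pattern, assuming that a print page value of 0000 stands for "no print page number" (inferred from Example 1) and that the feature identifier is a single lowercase letter. The function name is hypothetical; the valid feature letters are defined in the RFP, not here.

```python
import re


def print_page_filename(control_page: int, print_page: int = 0,
                        feature: str = "", ext: str = "tif") -> str:
    """Build a cccppppf filename.

    print_page=0 is assumed to mean "no print page number".
    feature is empty or one lowercase letter; its meaning is not checked.
    """
    if not 0 <= control_page <= 999:
        raise ValueError(f"control page out of range: {control_page}")
    if not 0 <= print_page <= 9999:
        raise ValueError(f"print page out of range: {print_page}")
    if re.fullmatch(r"[a-z]?", feature) is None:
        raise ValueError(f"feature must be empty or one lowercase letter: {feature!r}")
    return f"{control_page:03d}{print_page:04d}{feature}.{ext}"


print(print_page_filename(4, feature="n"))  # 0040000n.tif
print(print_page_filename(25, 18))          # 0250018.tif
```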

  3. Filename/Directory Structure 2B: Bibliographic Record/No-Print-Page Number Structure
    1. Typically used: books with machine-readable text
    2. Existing collection using this structure: Local History; Upper Midwest
    3. Item ID: Based on LCCN, becomes a directory name.
      1. Example 1 (California all the way back to 1828. LCCN: ca4-14356): 014356
      2. Example 2 (Along the bowstring or south shore of Lake Superior. LCCN: 03-6059): 06059
    4. Filename Pattern: cccc
      1. Based on sequential pages in the book
      2. Example 1 (4th page, no print page number, table of contents): 0004.tif
      3. Example 2 (25th page, print page number 18, no feature): 0025.tif

  4. Filename/Directory Structure 3A: Serial Page Images/Print Pages Tracked
    1. Typically used: Serials without machine-readable text
    2. Existing collection using this structure: none
    3. Item ID: Based on Issue year and number (yyyynn)
    4. Filename Pattern: cccppppf

  5. Filename/Directory Structure 3B: Serial Page Images/No Print Page Numbers Tracked
    1. Typically used: Serials with machine-readable text
    2. Existing collection using this structure: none
    3. Item ID: Based on Issue year and number (yyyynn)
    4. Filename Pattern: cccc

  6. Filename/Directory Structure 3C: Collation Records and/or Cumulative Indexes
    1. Typically used: Serial collation records and indexes
    2. Existing collection using this structure: none
    3. Item ID: Based on Issue year and number (yyyynn)
    4. Filename Pattern: cccpppp
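
Structures 3A, 3B, and 3C share the same yyyynn item ID. A minimal sketch, assuming "nn" is the zero-padded issue number within the year; the source gives no worked example, so the function name and the sample values are illustrative only.

```python
def serial_item_id(year: int, issue: int) -> str:
    """Build a yyyynn item ID: 4-digit year followed by 2-digit issue number."""
    if not 1 <= year <= 9999:
        raise ValueError(f"year out of range: {year}")
    if not 0 <= issue <= 99:
        raise ValueError(f"issue number out of range: {issue}")
    return f"{year:04d}{issue:02d}"


# Hypothetical issue: the 3rd issue of 1897.
print(serial_item_id(1897, 3))  # 189703
```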

  7. Filename/Directory Structure 4: Copyright Registration and Technical Document Number Structure
    1. Typically used: Copyright deposits, technical reports
    2. Existing collection using this structure: none
    3. Item ID: Based on registration number or document number
    4. Filename Pattern: cccppppf

  8. Filename/Directory Structure 5: Large Volumes
    1. Typically used: Law Library materials
    2. Existing collection using this structure: none
    3. Item ID: Based on Law Library collation item number
      1. Example 1 (House Journal volume 1): 001
      2. Example 2 (Congressional Globe Volume 5): 005
      3. Note: although item numbers repeat for every group of material, the aggregate names distinguish them.
    4. Filename Pattern: ccccpppp
      1. Based on sequential pages in the volume and print page numbers
      2. Example 1 (5th page scanned, print page number 3): 00050003.tif
      3. Example 2 (200th page, print page number 185): 02000185.tif
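
Going the other way, a ccccpppp filename can be split back into its two fields. The sketch below is one way to do that with a regular expression; the names PageRef and parse_large_volume_filename are hypothetical, and accepting any lowercase extension is an assumption.

```python
import re
from typing import NamedTuple


class PageRef(NamedTuple):
    control_page: int  # sequential page in the volume
    print_page: int    # page number printed on the page


# Four digits, four digits, then a lowercase extension (extension set assumed).
_LARGE_VOLUME = re.compile(r"(\d{4})(\d{4})\.([a-z]+)")


def parse_large_volume_filename(name: str) -> PageRef:
    """Split a ccccpppp filename into control page and print page."""
    match = _LARGE_VOLUME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a ccccpppp filename: {name!r}")
    return PageRef(int(match.group(1)), int(match.group(2)))


print(parse_large_volume_filename("02000185.tif"))
# PageRef(control_page=200, print_page=185)
```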

  9. Target Filenames
    1. ID Targets: 0000.tif or 0000.jpg
    2. Resolution Targets: tg300bt.tif (see details on J-35)

  10. Segmented images
    1. Filename Pattern: cccfxy
    2. Example 1 (bottom half of the 79th page, scanned in two parts): 079sb1.tif
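
The cccfxy pattern can be decomposed field by field, as in this minimal sketch. The names Segment and parse_segment_filename are hypothetical, and the code only separates the fields according to the legend; it does not interpret which part of the page a given grid coordinate denotes.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    control_page: int  # ccc
    feature: str       # f
    x: str             # horizontal grid coordinate, alpha
    y: int             # vertical grid coordinate, numeric


# Three digits, feature letter, alpha x, single-digit y, lowercase extension.
_SEGMENT = re.compile(r"(\d{3})([a-z])([a-z])(\d)\.([a-z]+)")


def parse_segment_filename(name: str) -> Segment:
    """Split a cccfxy filename into its legend fields."""
    match = _SEGMENT.fullmatch(name)
    if match is None:
        raise ValueError(f"not a cccfxy filename: {name!r}")
    page, feature, x, y = match.group(1, 2, 3, 4)
    return Segment(int(page), feature, x, int(y))


print(parse_segment_filename("079sb1.tif"))
# Segment(control_page=79, feature='s', x='b', y=1)
```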

  11. SGML and Associated Files
    1. SGML filename: itemid.sgm
    2. Page Info Group filename: itemid.pgi
    3. Reference filename: itemid.ref
    4. Omission Report filename: itemid.omi
    5. Entity filename: itemid.ent
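
Since every associated file is named after the item ID, the full set can be derived from it. A minimal sketch, with hypothetical names, that also enforces the lowercase and 8-character rules stated in the Note on Filenames:

```python
# Extensions taken from the list above; the dictionary keys are illustrative labels.
ASSOCIATED_EXTENSIONS = {
    "sgml": "sgm",
    "page_info_group": "pgi",
    "reference": "ref",
    "omission_report": "omi",
    "entity": "ent",
}


def associated_filenames(item_id: str) -> dict:
    """Return the SGML and associated filenames for one item ID."""
    if not item_id or len(item_id) > 8 or item_id != item_id.lower():
        raise ValueError(f"item ID must be lowercase, 1 to 8 characters: {item_id!r}")
    return {kind: f"{item_id}.{ext}" for kind, ext in ASSOCIATED_EXTENSIONS.items()}


for kind, filename in associated_filenames("10204").items():
    print(kind, filename)
```

For the Structure 1 item ID 10204, this yields 10204.sgm, 10204.pgi, 10204.ref, 10204.omi, and 10204.ent.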



Last revised April 23, 1997