Text Quality Review Guidelines
 

Online Presentation Issues: Parse | Deliverables | Filenames | Page Info | Image References | .ent Syntax | <OMIT> | Header | <DIV>

Content Issues: General Tags | Special Characters | Paragraphs | Keying Errors

NDLP Text Quality Review Reference Guide

Note: Quality Factors are presented hierarchically by "technical functionality" and importance.

Quality Factors | Who? | What file? | How to Check? | Reference | Quality Measures
Do all the files in a batch parse?
Who: CR2U [Central Receipt and Review Unit]
What file: .sgm and .ent files
How to check: Run NSGMLS and OmniMark parsers.
Reference: Parsing Instructions
Accuracy: 100%
Errors Allowed: 0
Review Action: Return to vendor
Are all the deliverables (.sgm and accompanying files) present?
Who: CR2U
What file: .sgm, .pgi, .ref, .ent, .omt files and a directory list of all files in batch
How to check: Review directory on disk.
Reference: RFP 96-18 Section C.11
Accuracy: 100%
Errors Allowed: 0
Review Action: Return for rework
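
The directory review above can be partly automated. The following is a minimal sketch, assuming (hypothetically) that every text in a batch uses one base name for its .sgm, .pgi, .ref, .ent, and .omt files; the function name and the sample listing are illustrative only and do not come from RFP 96-18.

```python
from collections import defaultdict
from pathlib import PurePath

# Companion file types named in the deliverables check above.
REQUIRED_EXTENSIONS = (".sgm", ".pgi", ".ref", ".ent", ".omt")


def missing_deliverables(file_names):
    """Map each base name to the required extensions it lacks.

    Assumption: all files that belong to one text share a base name.
    """
    found = defaultdict(set)
    for name in file_names:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix in REQUIRED_EXTENSIONS:
            found[path.stem].add(suffix)
    return {
        stem: [ext for ext in REQUIRED_EXTENSIONS if ext not in suffixes]
        for stem, suffixes in found.items()
        if len(suffixes) < len(REQUIRED_EXTENSIONS)
    }


# Hypothetical directory listing for a two-text batch.
listing = [
    "abc01.sgm", "abc01.pgi", "abc01.ref", "abc01.ent", "abc01.omt",
    "abc02.sgm", "abc02.pgi", "abc02.ent",
]
print(missing_deliverables(listing))
```

An empty result means every base name in the listing has all five file types; any entry reported is a reason to return the batch for rework.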
Do file names follow requirements?
Who: Project team
What file: Directory list of files or text-check report
How to check: Review directory list. Or run text_check script and review report.
Reference: Keying Instructions, Section B.2.
Accuracy: 99.95%
Errors Allowed: 1 in 2000 file names
Review Action: Return for rework
Is the page break information correct?
  • Are control page numbers in sequence and complete? 
  • Do print page numbers reflect the actual print page format of the original work? 
Who: Project team
What file: .pgi file
How to check: Review .pgi file: check for missing or repeated <controlpgno> content; review <printpgno> content for format.
Reference: Keying Instructions, Section C.3.4
Accuracy: 100%
Errors Allowed: 0
Review Action: Return for rework
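
The check for missing or repeated <controlpgno> content could be scripted along these lines. This is only a sketch: it assumes each control page number appears in the .pgi file as numeric content between explicit <controlpgno> start and end tags, which may not hold for every delivered file.

```python
import re

# Assumption: numeric content with explicit start and end tags.
CONTROLPGNO = re.compile(
    r"<controlpgno[^>]*>\s*(\d+)\s*</controlpgno>", re.IGNORECASE
)


def check_control_pages(pgi_text):
    """Return (missing, repeated) control page numbers found in the text."""
    numbers = [int(n) for n in CONTROLPGNO.findall(pgi_text)]
    if not numbers:
        return [], []
    seen = set()
    repeated = []
    for n in numbers:
        if n in seen and n not in repeated:
            repeated.append(n)
        seen.add(n)
    # Every number between the lowest and highest should occur once.
    missing = [n for n in range(min(numbers), max(numbers) + 1) if n not in seen]
    return missing, repeated


# Hypothetical .pgi content: page 2 is repeated and page 3 is missing.
sample = (
    "<controlpgno>0001</controlpgno>\n"
    "<controlpgno>0002</controlpgno>\n"
    "<controlpgno>0002</controlpgno>\n"
    "<controlpgno>0004</controlpgno>\n"
)
missing, repeated = check_control_pages(sample)
print("missing:", missing, "repeated:", repeated)
```

Print page numbers still need a visual review, since their correct format depends on the original work.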
Do references to image files for pages, illustrations, and tables follow the correct syntax? 
  • Do <illus> entity values begin with i? 
  • Do <table> entity values begin with p? 
  • Do <note> anchor.ids values begin with n? 
Who: Project team
What file: .ref file
How to check: Review .ref file for entity values.
Reference: Keying Instructions, Sections B.3, G.2-4, H.1 & I.1.3.
Accuracy: 100%
Errors Allowed: 0
Review Action: Return for rework
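
The three prefix rules above lend themselves to a quick scripted pass. The sketch below assumes, hypothetically, that the .ref file lists the <illus>, <table>, and <note> start tags with their values in attributes named entity and anchor.ids; the actual layout of a .ref file may differ, and only the first value of a multi-valued attribute is examined.

```python
import re

# Expected first letter of each reference value, per the checklist above.
EXPECTED_PREFIX = {
    ("illus", "entity"): "i",
    ("table", "entity"): "p",
    ("note", "anchor.ids"): "n",
}

TAG = re.compile(r"<(illus|table|note)\b([^>]*)>", re.IGNORECASE)
ATTR = re.compile(r"([\w.]+)\s*=\s*\"?([^\s\">]+)\"?")


def bad_references(ref_text):
    """Return (element, attribute, value) for values with the wrong prefix."""
    problems = []
    for tag in TAG.finditer(ref_text):
        element = tag.group(1).lower()
        for attr, value in ATTR.findall(tag.group(2)):
            prefix = EXPECTED_PREFIX.get((element, attr.lower()))
            if prefix and not value.lower().startswith(prefix):
                problems.append((element, attr.lower(), value))
    return problems


# Hypothetical .ref content: the table reference wrongly starts with "t".
sample = (
    '<illus entity="i0001">\n'
    '<table entity="t0002">\n'
    '<note anchor.ids="n0003">\n'
)
print(bad_references(sample))
```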
Is the syntax of the .ent file correct?
  • Do the file names in the .ent file properly reflect the image file names? 
  • Are all file names in lowercase? 
  • Are appropriate file types designated (e.g., .tif, .jpg)? 
Who: CR2U and project team
What file: .ent file
How to check: Review .ent file.
  • CR2U: Parser checks .ent file.
  • Project team: After the files are copied to the server, check them with the OmniMark entity checker routine.
Reference:
1. Image Filenaming Instruction Sheet
2. Directory list of image batch
Accuracy: 100%
Errors Allowed: 0
Review Action: Return for rework
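
As a rough illustration of the three .ent checks above, the sketch below scans SGML external entity declarations of the form <!ENTITY name SYSTEM "file" NDATA type> and compares them with a directory list of the image batch. The entity names, file names, and function name are made up for the example, and this script is not the OmniMark entity checker routine.

```python
import re

ENTITY_DECL = re.compile(
    r"<!ENTITY\s+(\S+)\s+SYSTEM\s+[\"']([^\"']+)[\"']", re.IGNORECASE
)
ALLOWED_TYPES = (".tif", ".jpg")  # the two file types the checklist mentions


def check_ent(ent_text, image_names):
    """Return a list of (entity, file name, problem) tuples."""
    available = set(image_names)
    problems = []
    for entity, file_name in ENTITY_DECL.findall(ent_text):
        if file_name != file_name.lower():
            problems.append((entity, file_name, "not lowercase"))
        if not file_name.lower().endswith(ALLOWED_TYPES):
            problems.append((entity, file_name, "unexpected file type"))
        if file_name not in available:
            problems.append((entity, file_name, "not in image directory list"))
    return problems


# Hypothetical .ent content and image directory list.
ent_sample = """
<!ENTITY i0001 SYSTEM "abc01i01.tif" NDATA tiff>
<!ENTITY p0001 SYSTEM "ABC01p01.TIF" NDATA tiff>
<!ENTITY i0002 SYSTEM "abc01i02.gif" NDATA gif>
"""
images = ["abc01i01.tif", "abc01p01.tif"]
for problem in check_ent(ent_sample, images):
    print(problem)
```

Because the Errors Allowed value is 0, any reported tuple is enough to return the batch for rework.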
Has <omit> been properly used?
  • Is there a reasonable number of <omit> uses for the document condition? 
  • Is <omit> used to show omitted words and ? used to show omitted characters? 
Who: Project team
What file: .omt file
How to check: Review .omt file.
Reference: Keying Instructions, Section E.5.
Accuracy: 99.95%
Errors Allowed: 5 unjustified uses 
Review Action: Advise vendor and correct in-house
Is the header information correct?
  • Does <amid> element content match file name? 
  • Is <title> content consistent with information keyed in <div type=idinfo> element? 
  • Have dates in header been rendered in numbers? 
Who: Project team
What file: .sgm file
How to check: Open the .sgm file in an ASCII editor (Notepad, Rules Builder, or Codewright) and look for key elements in <teiheader>.
Reference: Keying Instructions, Sections B.1-2, D.9.1, D.2.2b.
Accuracy: 100%
Errors Allowed: 0
Review Action: Return for rework
Have <div> tags been properly applied?
  • Do the <div> elements reflect a reasonable division of the work into its component parts? 
  • Does the application of the <div> element followed by <head> indicate a reasonable Table of Contents for the work? 
Who: Project team
What file: .sgm file
How to check: Search for <head> in .sgm file and, if it is a <div><head>, compare to table of contents of the work, if one exists. Or load file into Panorama and check the navigator contents.
Reference:
1. Keying Instructions, Sections D.7-9
2. Tag Library <div> definition, p.24 and Appendix A.
Accuracy: 99.95%
Errors Allowed: 1 tag per 10,000 characters or approx. 1 tag per 5 pages 
Review Action: Advise vendor, correct in-house, or request rework
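
The <div><head> search can be turned into a draft table of contents for comparison with the printed one. The following is a sketch under the assumption that each <head> has an explicit end tag and that divisions are tagged with a plain <div> start tag (with or without attributes); it uses regular expressions rather than a real SGML parser, so unusual markup may be missed.

```python
import re

# A <head> that directly follows a <div> start tag (whitespace allowed).
DIV_HEAD = re.compile(
    r"<div\b[^>]*>\s*<head\b[^>]*>(.*?)</head>", re.IGNORECASE | re.DOTALL
)
ANY_TAG = re.compile(r"<[^>]+>")


def div_headings(sgm_text):
    """List the text of every <head> that opens a <div>, in document order."""
    return [
        " ".join(ANY_TAG.sub("", head).split())
        for head in DIV_HEAD.findall(sgm_text)
    ]


# Hypothetical fragment: two chapter heads and one list head.
sample = (
    '<div type="chapter"><head>Chapter <hi>I</hi></head><p>Text</p></div>\n'
    "<div>\n<head>Chapter II</head>\n"
    "<list><head>Supplies</head><item>Rope</item></list></div>"
)
for heading in div_headings(sample):
    print(heading)
```

The list head is skipped because it does not directly follow a <div>, which mirrors the "if it is a <div><head>" condition in the check.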
In general, have tags been applied correctly?
  • Do you know what features occur in your text (e.g., tables, lists, notes, and stamped, handwritten, or emphasized text)? 
  • Is there a <front> or <back> if the work has front matter or back matter? 
  • Is <hi> used in <head> to show special emphasis for only certain words? 
  • Are <note> tags used appropriately? 
  • Is the <blankpage> element used for pages with the z feature code? 
  • Has <table> markup been used to indicate the text of tables? 
  • Are items in <list> correctly identified? 
  • Is the table of contents for the work identified as a <list>? 
Who: Project team
What file: .sgm file
How to check: Search for elements that you expect to find in your text. If they are present, check their application. If they are not present, identify where they should have occurred.
Reference:
1. Keying Instructions, Sections D, E, G, I, and J.
2. Tag Library definitions for appropriate elements.
3. Special Keying Instructions
Accuracy: 99.95%
Errors Allowed: 1 tag per 10,000 characters or approx. 1 tag per 5 pages 
Review Action: Advise vendor, correct in-house, or request rework
Are special characters captured properly?
  • Do the character entities represent the correct characters? 
  • If character entity is not available, has [???] been used to identify unknown characters, or <omit reason="untranscribable"> used? 
Who: Project team
What file: Text_check report or .sgm file
How to check: Search for the ampersand (&) character in the .sgm file.
Reference:
1. Keying Instructions, Section A.4.
2. ISO 8879
NOTE: Most common characters are &rdquo;, &ldquo;, &apos;, &amp;.
Accuracy: 99.95%
Errors Allowed: 1 tag per 10,000 characters or approx. 1 tag per 5 pages 
Review Action: Advise vendor, correct in-house, or request rework
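
Searching for the ampersand by hand is slow in a long file, so a tally of entity references can help focus the review. The sketch below counts each reference, lists those outside the four common ones named in the NOTE, and counts ampersands that do not start a reference at all. The function name and sample text are illustrative, and the script does not verify that an entity represents the correct character; that still requires comparison with the source.

```python
import re
from collections import Counter

ENTITY_REF = re.compile(r"&(#?[A-Za-z0-9.]+);")
BARE_AMPERSAND = re.compile(r"&(?!#?[A-Za-z0-9.]+;)")
COMMON = {"rdquo", "ldquo", "apos", "amp"}  # from the NOTE above


def entity_report(sgm_text):
    """Return (counts per entity, entities outside COMMON, bare '&' count)."""
    counts = Counter(ENTITY_REF.findall(sgm_text))
    unusual = sorted(name for name in counts if name not in COMMON)
    bare = len(BARE_AMPERSAND.findall(sgm_text))
    return counts, unusual, bare


# Hypothetical keyed text with one less common entity and one bare ampersand.
sample = "&ldquo;Tom &amp; Jerry&rdquo; cost &pound;5 at A & P"
counts, unusual, bare = entity_report(sample)
print(dict(counts))
print("review these:", unusual)
print("bare ampersands:", bare)
```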
Are paragraphs closed properly if they contain lists?
Who: Project team
What file: .sgm file
How to check: Search for : and look for </p><list> directly following.
Reference: Keying Instructions, Section D.10.2.
Accuracy: 99.95%
Errors Allowed: 1 tag per 10,000 characters or approx. 1 tag per 5 pages 
Review Action: Advise vendor, correct in-house, or request rework
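
The colon search can be narrowed to the suspicious cases only. This minimal sketch reports the line numbers where a colon runs straight into a <list> start tag with no </p> in between; it assumes that lists introduced by a colon are the cases of interest, as the check above implies, and the function name is hypothetical.

```python
import re

# A colon followed (after optional whitespace) by <list>, with no </p> between.
UNCLOSED = re.compile(r":\s*<list\b", re.IGNORECASE)


def unclosed_paragraph_lines(sgm_text):
    """Return 1-based line numbers where ':' is followed directly by <list>."""
    return [
        sgm_text.count("\n", 0, match.start()) + 1
        for match in UNCLOSED.finditer(sgm_text)
    ]


# Hypothetical fragment: line 1 is closed correctly, line 2 is not.
sample = (
    "<p>Bring the following:</p><list><item>Rope</item></list>\n"
    "<p>Also pack:<list><item>Tent</item></list>\n"
)
print(unclosed_paragraph_lines(sample))
```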
Is the file keyed with less than 1 incorrect character per 2000?
Who: Project team
What file: .sgm file
How to check: Sample a page or two of the batch to check spelling and keying. Or load your .sgm file into Word or WordPerfect and look at the underlined words.
Reference: Original source document
Accuracy: 99.95%
Errors Allowed: 1 character per 2000 characters or approx. 1 character per page 
Review Action: Advise vendor, correct in-house, or request rework
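
The 99.95% accuracy figure and the allowance of 1 character per 2000 are the same threshold stated two ways: 1 - 1/2000 = 0.9995. A small worked example follows, with hypothetical function names, showing how a sample's error count could be compared against that allowance.

```python
def accuracy_percent(errors, total):
    """Accuracy as a percentage of correct items in the sample."""
    return 100.0 * (1 - errors / total)


def within_allowance(errors, total, per=2000):
    """True if the sample has at most 1 error per `per` items."""
    # Integer comparison avoids floating-point rounding at the threshold.
    return errors * per <= total


# A hypothetical 6000-character sample allows up to 3 incorrect characters.
for errors in (3, 4):
    print(
        f"{errors} errors in 6000 characters: "
        f"{accuracy_percent(errors, 6000):.2f}% accurate, "
        f"within allowance: {within_allowance(errors, 6000)}"
    )
```

The same arithmetic applies to the file name check, where the allowance is 1 error in 2000 file names.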


NDLP, Library of Congress - revised May 1998--ma