ATTACHMENT 7
KEYING INSTRUCTIONS
For an example of SGML-Encoded texts, refer to ATTACHMENT 3.
1. Text to key and tag:
- Unless otherwise instructed, all the words in the document, left to
right, top to bottom.
Words are to be keyed in intelligent clusters. For example, text in
each cell of a table should be keyed as a unit, rather than reading across
a row and concatenating words in different table cells. Words are to be
keyed exactly as they appear. Retain all the variant and incorrect spellings
in the original text;
- Each first occurrence of letterhead;
- Text of advertisements, unless Document Instructions say to omit advertising
text. (For complicated advertising formats, key the text as table text
in cells.); and
- Masthead of a newspaper, telegram, etc.
2. Text that will NOT be keyed or retained:
- Running heads;
- Text in illustrations;
- Telephone book-style "ears";
- Hyphens that appear only because a word was too big to fit on a line
(Note: When a word is hyphenated as the last word on the page, complete
the word before beginning the page information group tags.);
- Letterhead (forms heads or personal printed stationery) except for
each first appearance; and
- Immediate corrections. (Note: Where typos have been struck over, key
the corrected letter and ignore the wrong letter.)
3. Marks to ignore during keying include:
- Incidental marks such as coffee stains, blood, doodles, fingerprints,
etc. Other types of marks, such as stamped, embossed, and perforated marks,
should be keyed and tagged appropriately.
4. Line breaks:
- Will be preserved only where they are unusual (title page text, for
example) or important (poetry, for example).
5. Special characters:
- For non-ASCII characters (omega, less-than-or-equal symbol, smiley
face, pointing hand, etc.), the keyers should key the character entity,
if there is one, for example, &omega; and &le;. If there is no
publicly declared entity (see ISO 8879 character sets), the contractor
should insert "[???]";
- Illuminated characters (and other odd-sized or decorated letters) should
be tagged as
. The entire word should appear between the
tags, not just the initial letter.
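The special-character rule above can be sketched as a small lookup. This is only an illustration: the table ENTITY_NAMES is hypothetical and holds two entries (omega and le, assumed here to be the publicly declared entity names), whereas a real project would work from the full ISO 8879 character sets.

```python
# Hypothetical, deliberately tiny lookup table; a real keying project would
# use the full publicly declared ISO 8879 entity sets instead.
ENTITY_NAMES = {
    "\u03c9": "omega",  # Greek small letter omega (assumed entity name)
    "\u2264": "le",     # less-than-or-equal sign (assumed entity name)
}

UNKNOWN_MARKER = "[???]"  # inserted when no declared entity is known


def key_character(ch: str) -> str:
    """Return what would be keyed for a single character."""
    if ord(ch) < 128:
        return ch  # ASCII characters are keyed as they appear
    name = ENTITY_NAMES.get(ch)
    return f"&{name};" if name else UNKNOWN_MARKER


def key_text(text: str) -> str:
    """Apply the rule to every character of a string."""
    return "".join(key_character(ch) for ch in text)
```

Characters missing from the table, such as a smiley face, come out as "[???]", so a reviewer can later search for that marker.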
6. Standard attributes:
- The "creator" attribute for the should read: "Library
of Congress"; and
- The "date.created" attribute for the should be
set to current date.
7. Page numbers:
- All page numbers on a page will be keyed, using as many tags
as necessary. An unnumbered page is indicated by empty tags;
- When tagging a page number, keep the number and discard any characters
such as brackets, braces, or the word "page" that are used to
set off the number. For example, all the following would be tagged as 3:
PAGE:3 -3- {3} [3] -page 3-
- The sequential number of the page surface in the document (excluding
blank pages), starting from 1, will be recorded in the element,
padded with zeroes to four digits (for example, 0001). Controlpgno
ENTITY values should consist of the image filename, without the extension.
- should be keyed at the beginning of a page, but not mid-word.
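The page-number rules above can be illustrated with a short sketch. The function names and the sample filename "0019.tif" are hypothetical, and the set of set-off characters that gets stripped is an assumption drawn from the examples given (brackets, braces, hyphens, colons, and the word "page").

```python
import re
from pathlib import PurePath


def page_number(printed: str) -> str:
    """Keep the number from a printed page label and discard the set-off
    characters and the word "page" (assumed list, based on the examples)."""
    without_word = re.sub(r"\bpage\b", "", printed, flags=re.IGNORECASE)
    return without_word.strip(" \t-:[]{}()")


def control_page_number(surface_index: int) -> str:
    """Sequential page-surface number, starting from 1, zero-padded to four digits."""
    return f"{surface_index:04d}"


def entity_value(image_filename: str) -> str:
    """ENTITY value: the image filename without its extension."""
    return PurePath(image_filename).stem


if __name__ == "__main__":
    for label in ["PAGE:3", "-3-", "{3}", "[3]", "-page 3-"]:
        print(label, "->", page_number(label))
    print(control_page_number(1), entity_value("0019.tif"))
```

Because the function strips characters rather than searching for digits, a label such as "[iii]" keeps its roman numeral.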
8. Letterhead:
- Every time there is letterhead that is not identical to that on the
previous page (of a letter, for example), it should be keyed and tagged
as text.
9. Rules, leader dots, and other spacing:
- Leader dots: replace with an tag;
- Ellipses: key as a series of periods;
- Vertical ellipses: use the ISO special character entity;
- Rules, vines, borders, and other decorations: ignore;
- Gaps in text, where items are not tabular but are deliberately and
clearly separated by various amounts of white space: replace all gaps, whatever
the size, with the tag;
- Braces (grouping items) are the same as a table structure and should
be keyed accordingly; and
- Spaces in between the letters spelling a word, usually on title pages
of typed documents, should not be keyed. The text should be tagged as highlighted
text.
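As a rough illustration of the letter-spacing rule above, the following sketch closes up words whose letters were typed with single spaces between them. It assumes that words are separated by two or more spaces in the letter-spaced original; the function name is hypothetical, and tagging the result as highlighted text is left to the keyer.

```python
import re


def collapse_letterspaced(text: str) -> str:
    """Join letters typed with single spaces between them into words.

    Assumption: in letter-spaced text, separate words are divided by two
    or more spaces. Ordinary text is returned with its words unchanged.
    """
    words = re.split(r"\s{2,}", text.strip())
    collapsed = []
    for word in words:
        letters = word.split(" ")
        # Only collapse when every space-separated piece is a single character.
        if len(letters) > 1 and all(len(piece) == 1 for piece in letters):
            collapsed.append("".join(letters))
        else:
            collapsed.append(word)
    return " ".join(collapsed)
```

The check is a heuristic: a run of genuine one-letter words would also be joined, so the result still needs a human look.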
10. Tagging of forms (where form is defined as preprinted questions
or statements where a user response is required):
11. The tag - footnotes and endnotes:
- Footnote text, which is referenced in the document text and printed
at the bottom of the page, will be tagged as and incorporated into
the document text after the paragraph in which it is referenced. The tag
will be used to mark the reference to the footnote where it occurs in the
document text.
- Endnote text, which is referenced in the document text but printed
at the end of a major division such as a chapter, will be tagged as and
incorporated into the document text at the division end. The tag
will be used to mark the reference to the endnote where it occurs in the
document text.
- References to and shall be as follows:
anchor ID="n0019-01"
The ID always starts with n (for note), followed by the control page
number (padded with zeroes to make a four-digit number), followed by a
hyphen, followed by 01 if it is the first or only note on that page.
If it is the second note on that page, it will be n0019-02. Number all
subsequent notes on the page in ascending order. Type the actual reference
character or entity (e.g., *, 1, or &dag;) in between the start and
end tags.
note anchor.ids="n0019-01"
The anchor.ids value should match exactly the ID value in the anchor tag.
If the actual reference character (e.g., *, 1, or &dag;) appears before
the note text, type it at the beginning of the note text, after the start
tag. Subsequent anchor IDs for an established note should be numbered
sequentially in the regular manner. Type the actual reference character or
entity (e.g., *, 1, or &dag;) in between the start and end tags.
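The ID pattern described above (n, the four-digit control page number, a hyphen, and a two-digit note sequence) can be generated mechanically. The sketch below is only illustrative; the function names are hypothetical.

```python
def note_id(control_page: int, note_on_page: int) -> str:
    """Build an ID such as n0019-01: 'n', the control page number padded
    to four digits, a hyphen, and the note's sequence number on that page."""
    return f"n{control_page:04d}-{note_on_page:02d}"


def note_ids_for_page(control_page: int, note_count: int) -> list[str]:
    """IDs for all notes on one page, in ascending order starting at 01."""
    return [note_id(control_page, n) for n in range(1, note_count + 1)]


if __name__ == "__main__":
    print(note_ids_for_page(19, 2))
```

The same string is used for the ID value in the anchor and for the matching anchor.ids value in the note, which keeps the two in exact agreement.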
12. Insertion of tags:
- Tags must never be inserted into the middle of a word; they should
either surround the word or follow it.
- Tags must never replace a space between words.
13. Catch words:
- The odd words repeated at the end of a column or page of text to indicate
the first word on the next column or page will be treated as a new line
of text, set off by the line break tag.
14. Columns:
- Treat newspaper-style columns by typing the first column top to bottom
and then the second top to bottom, etc., page by page.
- Treat parallel columns by typing the entire first column even if it
spans pages, then the entire second column, etc., unless there is significance
to the horizontal placement - then treat it as table.
15. Bookplates:
- Key all the text contained in bookplates. Use the line break tag
(linebreak) to separate short lines of text.
16. Targets:
- Do not treat targets as the first page of a document. (Page images
of targets should always have filenames that end with at least two zeroes,
"00".)
- The text provided on the target should be keyed in the appropriate
part of the document teiheader. Most targets will contain the text for
the entire teiheader.
17. Typographical design of original:
- Do not try to mimic the typographical design or format of the original
by using extra hard returns, spaces or other typing feats.
18. Text type:
- The National Digital Library Program uses only two attribute values
for the attribute "type" in the tag: publication or manuscript.
The Library will specify which text type is appropriate for each collection
or set of documents. (This information is generally provided on document
targets following header contents.)
20. Illustration and table ENTITY values:
- Illustration and table ENTITY values should consist of the image filename
without the extension.
21. Title pages:
- Key the text on title pages using paragraph tags to indicate logical
groupings of information. For example, for a centered title and author
statement on a title page, begin with the paragraph start tag, type the
text using the line break tag to indicate where the lines end, and close
the paragraph when the statement is complete. Using this approach, most
title pages are likely to have at least one paragraph containing the title
and author information and another paragraph containing the publication
information.
22. Delivery of completed document texts:
- Each document must be provided to the Library in a single ASCII text
file. If you break up a document into multiple parts for keying and/or
tagging, you must reassemble it into a single file before delivery. Associated
files for each document shall also be delivered in single, separate files
for each document.