The Library of Congress >> Especially for Librarians and Archivists >> Standards
MARC Standards
MARC 21 HOME >> Specifications >> Character Sets >> Part 5

MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media

Code Table Basic Latin (ASCII)

December 2007

The columns of this table are, in order: MARC-8, the MARC-8 code (in hex) for the character; UCS, the UCS/Unicode 16-bit code (in hex); UTF-8, the UTF-8 code (in hex) for the UCS character; CHAR, a representation of the character (where possible); and NAME, the MARC character name followed by the UCS name. If the MARC name is the same as or very similar to the UCS name, only the UCS name is given.


Not all characters display in all browsers. We have attempted to allow for font families that show each character set, but you must have one of these fonts on your computer. See the W3C site for a discussion of fonts: http://www.w3.org/TR/REC-CSS2/fonts.html#generic-font-families.

MARC-8  UCS   UTF-8  CHAR  NAME
1B      001B  1B           ESCAPE (Unlikely to occur in UCS/Unicode)
1D      001D  1D           RECORD TERMINATOR / GROUP SEPARATOR
1E      001E  1E           FIELD TERMINATOR / RECORD SEPARATOR
1F      001F  1F           SUBFIELD DELIMITER / UNIT SEPARATOR
20      0020  20           SPACE, BLANK / SPACE
21      0021  21     !     EXCLAMATION MARK
22      0022  22     "     QUOTATION MARK
23      0023  23     #     NUMBER SIGN
24      0024  24     $     DOLLAR SIGN
25      0025  25     %     PERCENT SIGN
26      0026  26     &     AMPERSAND
27      0027  27     '     APOSTROPHE
28      0028  28     (     OPENING PARENTHESIS / LEFT PARENTHESIS
29      0029  29     )     CLOSING PARENTHESIS / RIGHT PARENTHESIS
2A      002A  2A     *     ASTERISK
2B      002B  2B     +     PLUS SIGN
2C      002C  2C     ,     COMMA
2D      002D  2D     -     HYPHEN-MINUS
2E      002E  2E     .     PERIOD, DECIMAL POINT / FULL STOP
2F      002F  2F     /     SLASH / SOLIDUS
30      0030  30     0     DIGIT ZERO
31      0031  31     1     DIGIT ONE
32      0032  32     2     DIGIT TWO
33      0033  33     3     DIGIT THREE
34      0034  34     4     DIGIT FOUR
35      0035  35     5     DIGIT FIVE
36      0036  36     6     DIGIT SIX
37      0037  37     7     DIGIT SEVEN
38      0038  38     8     DIGIT EIGHT
39      0039  39     9     DIGIT NINE
3A      003A  3A     :     COLON
3B      003B  3B     ;     SEMICOLON
3C      003C  3C     <     LESS-THAN SIGN
3D      003D  3D     =     EQUALS SIGN
3E      003E  3E     >     GREATER-THAN SIGN
3F      003F  3F     ?     QUESTION MARK
40      0040  40     @     COMMERCIAL AT
41      0041  41     A     LATIN CAPITAL LETTER A
42      0042  42     B     LATIN CAPITAL LETTER B
43      0043  43     C     LATIN CAPITAL LETTER C
44      0044  44     D     LATIN CAPITAL LETTER D
45      0045  45     E     LATIN CAPITAL LETTER E
46      0046  46     F     LATIN CAPITAL LETTER F
47      0047  47     G     LATIN CAPITAL LETTER G
48      0048  48     H     LATIN CAPITAL LETTER H
49      0049  49     I     LATIN CAPITAL LETTER I
4A      004A  4A     J     LATIN CAPITAL LETTER J
4B      004B  4B     K     LATIN CAPITAL LETTER K
4C      004C  4C     L     LATIN CAPITAL LETTER L
4D      004D  4D     M     LATIN CAPITAL LETTER M
4E      004E  4E     N     LATIN CAPITAL LETTER N
4F      004F  4F     O     LATIN CAPITAL LETTER O
50      0050  50     P     LATIN CAPITAL LETTER P
51      0051  51     Q     LATIN CAPITAL LETTER Q
52      0052  52     R     LATIN CAPITAL LETTER R
53      0053  53     S     LATIN CAPITAL LETTER S
54      0054  54     T     LATIN CAPITAL LETTER T
55      0055  55     U     LATIN CAPITAL LETTER U
56      0056  56     V     LATIN CAPITAL LETTER V
57      0057  57     W     LATIN CAPITAL LETTER W
58      0058  58     X     LATIN CAPITAL LETTER X
59      0059  59     Y     LATIN CAPITAL LETTER Y
5A      005A  5A     Z     LATIN CAPITAL LETTER Z
5B      005B  5B     [     OPENING SQUARE BRACKET / LEFT SQUARE BRACKET
5C      005C  5C     \     REVERSE SLASH / REVERSE SOLIDUS
5D      005D  5D     ]     CLOSING SQUARE BRACKET / RIGHT SQUARE BRACKET
5E      005E  5E     ^     SPACING CIRCUMFLEX / CIRCUMFLEX ACCENT
5F      005F  5F     _     SPACING UNDERSCORE / LOW LINE
60      0060  60     `     SPACING GRAVE / GRAVE ACCENT
61      0061  61     a     LATIN SMALL LETTER A
62      0062  62     b     LATIN SMALL LETTER B
63      0063  63     c     LATIN SMALL LETTER C
64      0064  64     d     LATIN SMALL LETTER D
65      0065  65     e     LATIN SMALL LETTER E
66      0066  66     f     LATIN SMALL LETTER F
67      0067  67     g     LATIN SMALL LETTER G
68      0068  68     h     LATIN SMALL LETTER H
69      0069  69     i     LATIN SMALL LETTER I
6A      006A  6A     j     LATIN SMALL LETTER J
6B      006B  6B     k     LATIN SMALL LETTER K
6C      006C  6C     l     LATIN SMALL LETTER L
6D      006D  6D     m     LATIN SMALL LETTER M
6E      006E  6E     n     LATIN SMALL LETTER N
6F      006F  6F     o     LATIN SMALL LETTER O
70      0070  70     p     LATIN SMALL LETTER P
71      0071  71     q     LATIN SMALL LETTER Q
72      0072  72     r     LATIN SMALL LETTER R
73      0073  73     s     LATIN SMALL LETTER S
74      0074  74     t     LATIN SMALL LETTER T
75      0075  75     u     LATIN SMALL LETTER U
76      0076  76     v     LATIN SMALL LETTER V
77      0077  77     w     LATIN SMALL LETTER W
78      0078  78     x     LATIN SMALL LETTER X
79      0079  79     y     LATIN SMALL LETTER Y
7A      007A  7A     z     LATIN SMALL LETTER Z
7B      007B  7B     {     OPENING CURLY BRACKET / LEFT CURLY BRACKET
7C      007C  7C     |     VERTICAL BAR (FILL) / VERTICAL LINE
7D      007D  7D     }     CLOSING CURLY BRACKET / RIGHT CURLY BRACKET
7E      007E  7E     ~     SPACING TILDE / TILDE
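
In every row of the Basic Latin table the MARC-8 and UTF-8 values are the same single byte, and the UCS value is that byte zero-extended to 16 bits. A minimal Python sketch of that relationship follows; the function name `basic_latin_row` is hypothetical and is not part of the MARC 21 specification.

```python
def basic_latin_row(byte):
    """Return (MARC-8, UCS, UTF-8, CHAR) for a 7-bit Basic Latin code.

    Hypothetical helper for illustration only: it rebuilds the first
    four columns of a table row from the MARC-8 byte value.
    """
    if not 0x00 <= byte <= 0x7F:
        raise ValueError("Basic Latin codes are 7-bit values")
    char = chr(byte)
    marc8_hex = f"{byte:02X}"                        # MARC-8 column
    ucs_hex = f"{byte:04X}"                          # UCS column (16-bit)
    utf8_hex = char.encode("utf-8").hex().upper()    # UTF-8 column
    return marc8_hex, ucs_hex, utf8_hex, char


print(basic_latin_row(0x41))
```

For 0x41 this reproduces the row for LATIN CAPITAL LETTER A: 41, 0041, 41, A.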

Code Table Extended Latin (ANSEL)

December 2007

The columns of this table are, in order: MARC-8, the MARC-8 code (in hex) for the character; UCS, the UCS/Unicode 16-bit code (in hex); UTF-8, the UTF-8 code (in hex) for the UCS character; CHAR, a representation of the character (where possible); C?, which contains a C when the character is a combining character; NAME, the MARC character name followed by the UCS name; and ALT and ALT UTF-8, the alternate Unicode and UTF-8 encodings where they exist. If the MARC name is the same as or very similar to the UCS name, only the UCS name is given.

Revised June 2004 to add the Eszett (M+C7) and the Euro Sign (M+C8) to the MARC-8 set.

Revised September 2004 to change the mapping from MARC-8 to Unicode for the Ligature (M+EB and M+EC) from U+FE20 and U+FE21 to U+0361.

Revised September 2004 to change the mapping from MARC-8 to Unicode for the Double Tilde (M+FA and M+FB) from U+FE22 and U+FE23 to U+0360.

Revised March 2005 to change the mapping from MARC-8 to Unicode for the Alif (M+2E) from U+02BE to U+02BC.


MARC-8  UCS     UTF-8   CHAR  C?  NAME                                                            ALT   ALT UTF-8
88      0098    C298              NON-SORT BEGIN / START OF STRING
89      009C    C29C              NON-SORT END / STRING TERMINATOR
8D      200D    E2808D            JOINER / ZERO WIDTH JOINER
8E      200C    E2808C            NON-JOINER / ZERO WIDTH NON-JOINER
A1      0141    C581    Ł         UPPERCASE POLISH L / LATIN CAPITAL LETTER L WITH STROKE
A2      00D8    C398    Ø         UPPERCASE SCANDINAVIAN O / LATIN CAPITAL LETTER O WITH STROKE
A3      0110    C490    Đ         UPPERCASE D WITH CROSSBAR / LATIN CAPITAL LETTER D WITH STROKE
A4      00DE    C39E    Þ         UPPERCASE ICELANDIC THORN / LATIN CAPITAL LETTER THORN (Icelandic)
A5      00C6    C386    Æ         UPPERCASE DIGRAPH AE / LATIN CAPITAL LIGATURE AE
A6      0152    C592    Œ         UPPERCASE DIGRAPH OE / LATIN CAPITAL LIGATURE OE
A7      02B9    CAB9    ʹ         SOFT SIGN, PRIME / MODIFIER LETTER PRIME
A8      00B7    C2B7    ·         MIDDLE DOT
A9      266D    E299AD  ♭         MUSIC FLAT SIGN
AA      00AE    C2AE    ®         PATENT MARK / REGISTERED SIGN
AB      00B1    C2B1    ±         PLUS OR MINUS / PLUS-MINUS SIGN
AC      01A0    C6A0    Ơ         UPPERCASE O-HOOK / LATIN CAPITAL LETTER O WITH HORN
AD      01AF    C6AF    Ư         UPPERCASE U-HOOK / LATIN CAPITAL LETTER U WITH HORN
AE      02BC    CABC    ʼ         ALIF / MODIFIER LETTER APOSTROPHE
B0      02BB    CABB    ʻ         AYN / MODIFIER LETTER TURNED COMMA
B1      0142    C582    ł         LOWERCASE POLISH L / LATIN SMALL LETTER L WITH STROKE
B2      00F8    C3B8    ø         LOWERCASE SCANDINAVIAN O / LATIN SMALL LETTER O WITH STROKE
B3      0111    C491    đ         LOWERCASE D WITH CROSSBAR / LATIN SMALL LETTER D WITH STROKE
B4      00FE    C3BE    þ         LOWERCASE ICELANDIC THORN / LATIN SMALL LETTER THORN (Icelandic)
B5      00E6    C3A6    æ         LOWERCASE DIGRAPH AE / LATIN SMALL LIGATURE AE
B6      0153    C593    œ         LOWERCASE DIGRAPH OE / LATIN SMALL LIGATURE OE
B7      02BA    CABA    ʺ         HARD SIGN, DOUBLE PRIME / MODIFIER LETTER DOUBLE PRIME
B8      0131    C4B1    ı         LOWERCASE TURKISH I / LATIN SMALL LETTER DOTLESS I
B9      00A3    C2A3    £         BRITISH POUND / POUND SIGN
BA      00F0    C3B0    ð         LOWERCASE ETH / LATIN SMALL LETTER ETH (Icelandic)
BC      01A1    C6A1    ơ         LOWERCASE O-HOOK / LATIN SMALL LETTER O WITH HORN
BD      01B0    C6B0    ư         LOWERCASE U-HOOK / LATIN SMALL LETTER U WITH HORN
C0      00B0    C2B0    °         DEGREE SIGN
C1      2113    E28493  ℓ         SCRIPT SMALL L
C2      2117    E28497  ℗         SOUND RECORDING COPYRIGHT
C3      00A9    C2A9    ©         COPYRIGHT SIGN
C4      266F    E299AF  ♯         MUSIC SHARP SIGN
C5      00BF    C2BF    ¿         INVERTED QUESTION MARK
C6      00A1    C2A1    ¡         INVERTED EXCLAMATION MARK
C7      00DF    C39F    ß         ESZETT SYMBOL
C8      20AC    E282AC  €         EURO SIGN
E0      0309    CC89    ̉     C   PSEUDO QUESTION MARK / COMBINING HOOK ABOVE
E1      0300    CC80    ̀     C   GRAVE / COMBINING GRAVE ACCENT (Varia)
E2      0301    CC81    ́     C   ACUTE / COMBINING ACUTE ACCENT (Oxia)
E3      0302    CC82    ̂     C   CIRCUMFLEX / COMBINING CIRCUMFLEX ACCENT
E4      0303    CC83    ̃     C   TILDE / COMBINING TILDE
E5      0304    CC84    ̄     C   MACRON / COMBINING MACRON
E6      0306    CC86    ̆     C   BREVE / COMBINING BREVE (Vrachy)
E7      0307    CC87    ̇     C   SUPERIOR DOT / COMBINING DOT ABOVE
E8      0308    CC88    ̈     C   UMLAUT, DIAERESIS / COMBINING DIAERESIS (Dialytika)
E9      030C    CC8C    ̌     C   HACEK / COMBINING CARON
EA      030A    CC8A    ̊     C   CIRCLE ABOVE, ANGSTROM / COMBINING RING ABOVE
EB      0361    CDA1    ͡     C   LIGATURE, FIRST HALF / COMBINING DOUBLE INVERTED BREVE          FE20  EFB8A0
EC      Note 1                C   LIGATURE, SECOND HALF / COMBINING LIGATURE RIGHT HALF           FE21  EFB8A1
ED      0315    CC95    ̕     C   HIGH COMMA, OFF CENTER / COMBINING COMMA ABOVE RIGHT
EE      030B    CC8B    ̋     C   DOUBLE ACUTE / COMBINING DOUBLE ACUTE ACCENT
EF      0310    CC90    ̐     C   CANDRABINDU / COMBINING CANDRABINDU
F0      0327    CCA7    ̧     C   CEDILLA / COMBINING CEDILLA
F1      0328    CCA8    ̨     C   RIGHT HOOK, OGONEK / COMBINING OGONEK
F2      0323    CCA3    ̣     C   DOT BELOW / COMBINING DOT BELOW
F3      0324    CCA4    ̤     C   DOUBLE DOT BELOW / COMBINING DIAERESIS BELOW
F4      0325    CCA5    ̥     C   CIRCLE BELOW / COMBINING RING BELOW
F5      0333    CCB3    ̳     C   DOUBLE UNDERSCORE / COMBINING DOUBLE LOW LINE
F6      0332    CCB2    ̲     C   UNDERSCORE / COMBINING LOW LINE
F7      0326    CCA6    ̦     C   LEFT HOOK (COMMA BELOW) / COMBINING COMMA BELOW
F8      031C    CC9C    ̜     C   RIGHT CEDILLA / COMBINING LEFT HALF RING BELOW
F9      032E    CCAE    ̮     C   UPADHMANIYA / COMBINING BREVE BELOW
FA      0360    CDA0    ͠     C   DOUBLE TILDE, FIRST HALF / COMBINING DOUBLE TILDE               FE22  EFB8A2
FB      Note 2                C   DOUBLE TILDE, SECOND HALF / COMBINING DOUBLE TILDE RIGHT HALF   FE23  EFB8A3
FE      0313    CC93    ̓     C   HIGH COMMA, CENTERED / COMBINING COMMA ABOVE (Psili)
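
The UTF-8 column in this table can be derived mechanically from the UCS column. The following Python sketch recomputes it for a few rows copied from the table; `utf8_hex` and `SAMPLE_ROWS` are hypothetical names used only for this illustration.

```python
def utf8_hex(ucs_hex):
    """Encode a UCS code point (given as hex) as upper-case UTF-8 hex."""
    code_point = int(ucs_hex, 16)
    return chr(code_point).encode("utf-8").hex().upper()


# (MARC-8, UCS, UTF-8) triples copied from the table above, plus one
# alternate mapping (FE20) taken from the ALT / ALT UTF-8 columns.
SAMPLE_ROWS = [
    ("A1", "0141", "C581"),
    ("A9", "266D", "E299AD"),
    ("C8", "20AC", "E282AC"),
    ("EB", "0361", "CDA1"),
    ("EB", "FE20", "EFB8A0"),
    ("FE", "0313", "CC93"),
]

for marc8, ucs, utf8 in SAMPLE_ROWS:
    # Each computed value should match the UTF-8 value listed in the table.
    assert utf8_hex(ucs) == utf8, (marc8, ucs, utf8)
```

A check of this kind is a cheap way to catch transcription errors when the table is copied into a conversion program.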

Note 1: The Ligature that spans two characters is constructed of two halves in MARC-8: EB (Ligature, first half) and EC (Ligature, second half). The preferred Unicode/UTF-8 mapping is to the single character Ligature that spans two characters, U+0361. The single character Ligature is encoded between the two characters to be spanned. The two half Ligatures in Unicode, to which the Ligature has been mapped since 1996, are indicated in the mapping as alternatives, but their use is not recommended. It is expected that font support for the single character Ligature mark will be more easily obtained than for the two halves.

Note 2: The Double Tilde that spans two characters is constructed of two halves in MARC-8: FA (Double Tilde, first half) and FB (Double Tilde, second half). The preferred Unicode/UTF-8 mapping is to the single character Double Tilde that spans two characters, U+0360. The single character Double Tilde is encoded between the two characters to be spanned. The two half Double Tildes in Unicode, to which the MARC-8 Double Tilde has been mapped since 1996, are indicated in the mapping as alternatives, but their use is not recommended. It is expected that font support for the single character Double Tilde mark will be more easily obtained than for the two halves.
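
The preferred mappings in Notes 1 and 2 can be sketched as follows. This is a minimal illustration, not a full MARC-8 converter. It assumes that each MARC-8 half precedes the base character it modifies (the usual MARC-8 ordering for combining characters, which is not stated on this page) and that the spanned characters are Basic Latin. The names `FIRST_HALF`, `SECOND_HALF`, and `convert_spanning_marks` are hypothetical.

```python
# First halves map to the single spanning character; second halves are dropped.
FIRST_HALF = {0xEB: "\u0361", 0xFA: "\u0360"}  # Ligature, Double Tilde
SECOND_HALF = {0xEC, 0xFB}


def convert_spanning_marks(marc8):
    """Convert MARC-8 bytes that use EB/EC or FA/FB to a Unicode string.

    Assumption: each half precedes its base character in MARC-8, and all
    base characters are 7-bit Basic Latin. Other bytes are rejected.
    """
    out = []
    pending = None  # spanning mark waiting for the first spanned character
    for byte in marc8:
        if byte in FIRST_HALF:
            pending = FIRST_HALF[byte]
        elif byte in SECOND_HALF:
            continue  # already represented by the single spanning mark
        elif byte < 0x80:
            out.append(chr(byte))
            if pending is not None:
                # Encode the mark between the two characters to be spanned.
                out.append(pending)
                pending = None
        else:
            raise ValueError(f"byte {byte:02X} is outside this sketch")
    return "".join(out)


ligature = convert_spanning_marks(bytes([0xEB, 0x74, 0xEC, 0x73]))
print([f"U+{ord(c):04X}" for c in ligature])
```

For the MARC-8 bytes EB 74 EC 73 the result is t, U+0361, s: the single Ligature character sits between the two spanned letters, and the second half is not emitted.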
