ISO 8859-1
ISO 8859-1, more formally cited as ISO/IEC 8859-1 or less formally as Latin-1, is part 1 of ISO/IEC 8859, a standard character encoding defined by ISO. It encodes what it refers to as Latin alphabet no. 1, consisting of 191 characters from the Latin script, each encoded as a single 8-bit code value. These code values can be used in almost any data interchange system to communicate in the following European languages: Albanian, Basque, Catalan, Danish, Dutch, English, Faroese, Finnish, German, Icelandic, Irish, Italian, Norwegian, Portuguese, Rhaeto-Romanic, Scottish, Spanish, and Swedish. Other languages covered include Afrikaans and Swahili. Thus, this character encoding is used throughout the American continent, Western Europe, Australia, and much of Africa.
ISO/IEC 8859-1
ISO/IEC 8859-1 suffers from a number of deficiencies, including the omission of a few French and Finnish letters and the lack of a Euro symbol. For this reason, ISO/IEC 8859-15 was developed as an update of ISO/IEC 8859-1 to add the missing characters. (This, however, required removing some less-used characters from ISO/IEC 8859-1, including fraction symbols and letter-free diacritics: ¤, ¦, ¨, ´, ¸, ¼, ½, and ¾.)
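The replacements made by ISO/IEC 8859-15 can be listed mechanically. The following is a minimal sketch, assuming Python 3 and that its built-in `iso8859-1` and `iso8859-15` codecs faithfully reflect the two standards; the variable names are illustrative only.

```python
# Compare the two codecs byte by byte over the A0-FF range.
# Assumption: Python's "iso8859-1" and "iso8859-15" codecs match the standards.
changed = {}
for value in range(0xA0, 0x100):
    raw = bytes([value])
    old = raw.decode("iso8859-1")
    new = raw.decode("iso8859-15")
    if old != new:
        changed[value] = (old, new)

# Print each code value whose character was replaced in ISO/IEC 8859-15.
for value, (old, new) in sorted(changed.items()):
    print(f"{value:02X}: {old!r} replaced by {new!r}")
```

Only the positions that differ are reported, so the output doubles as a checklist of the characters that were given up.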
Since all 191 characters encoded by ISO/IEC 8859-1 are graphic and compatible with most web browsers, they can be shown as glyphs in the following table. The row and column headings indicate the hexadecimal digit combinations to produce the 8-bit code value; e.g., "L" is hex 4C, or binary 01001100.
In the table above, 20 is the regular SPACE character, and A0 is the NO-BREAK SPACE. AD is a
SOFT HYPHEN, which should not appear at all in compliant web browsers.
Code values 00-1F, 7F, and 80-9F are not assigned to characters by ISO/IEC 8859-1.
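The row and column arithmetic described above can be sketched in a few lines. This is an illustration only, assuming Python 3; `code_value` is a hypothetical helper name, not part of any standard or library.

```python
def code_value(row: int, column: int) -> int:
    """Combine a row digit (high four bits) and a column digit (low four bits)."""
    return (row << 4) | column


# Row 4, column C is the letter "L".
value = code_value(0x4, 0xC)
hex_form = f"{value:02X}"      # two hexadecimal digits
binary_form = f"{value:08b}"   # eight binary digits
character = bytes([value]).decode("latin-1")
print(hex_form, binary_form, character)
```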
ISO 8859-1 vs ISO-8859-1
The IANA has approved ISO-8859-1 (note the extra hyphen), a superset of ISO/IEC 8859-1, for use on the Internet. This character map (also called a character set or code page) supplements the assignments made by ISO/IEC 8859-1, mapping control characters to code values 00-1F, 7F, and 80-9F. It thus provides for 256 characters via every possible 8-bit value.
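The split between control and graphic code values can be checked with a short sketch. It assumes Python 3, its built-in `iso-8859-1` codec, and the Unicode general category `Cc` as a stand-in for "control character"; none of these come from the standard itself.

```python
import unicodedata

# Decode every possible 8-bit value with the ISO-8859-1 character map.
text = bytes(range(256)).decode("iso-8859-1")

# Assumption: Unicode category "Cc" identifies the control characters.
controls = [ord(ch) for ch in text if unicodedata.category(ch) == "Cc"]
graphic = [ord(ch) for ch in text if unicodedata.category(ch) != "Cc"]

print(len(controls), "control values,", len(graphic), "graphic values")
```

Under these assumptions the control values fall exactly in 00-1F and 7F-9F, leaving the 191 graphic characters of ISO/IEC 8859-1.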
The IANA allows all of the following aliases for ISO-8859-1 to be used case-insensitively:
* ISO_8859-1:1987
* ISO_8859-1
* ISO-8859-1
* iso-ir-100
* csISOLatin1
* latin1
* l1
* IBM819
* CP819
The name Latin-1 is an informal alias unrecognized by ISO or the IANA, though some computer software may accept it.
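As one illustration of alias handling in software, the sketch below asks Python 3's `codecs` module to resolve each IANA alias in several spellings. It shows how one particular runtime behaves, not what the IANA registry requires of other software.

```python
import codecs

# The IANA aliases listed above.
ALIASES = [
    "ISO_8859-1:1987", "ISO_8859-1", "ISO-8859-1", "iso-ir-100",
    "csISOLatin1", "latin1", "l1", "IBM819", "CP819",
]

# Name under which Python registers the codec for "iso-8859-1".
canonical = codecs.lookup("iso-8859-1").name

# Resolve every alias as written, in lower case, and in upper case.
resolved = {
    spelling: codecs.lookup(spelling).name
    for alias in ALIASES
    for spelling in (alias, alias.lower(), alias.upper())
}

for spelling, name in resolved.items():
    print(f"{spelling:>16} -> {name}")
```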
The following table shows ISO-8859-1, with control characters and other non-printing characters represented by their abbreviations.
| ISO-8859-1 | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | TAB | LF | VT | FF | CR | SO | SI |
| 1x | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \\ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
| 8x | PAD | HOP | BPH | NBH | IND | NEL | SSA | ESA | HTS | HTJ | VTS | PLD | PLU | RI | SS2 | SS3 |
| 9x | DCS | PU1 | PU2 | STS | CCH | MW | SPA | EPA | SOS | SGCI | SCI | CSI | ST | OSC | PM | APC |
| Ax | NBSP | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | SHY | ® | ¯ |
| Bx | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
| Cx | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
| Dx | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
| Ex | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
| Fx | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ |
In the table above, 20 is the regular SPACE character, and A0 is the NO-BREAK SPACE. AD is a SOFT HYPHEN, which may not appear at all in some web browsers.
There are additional parts to the ISO/IEC 8859 standard that have corresponding IANA-approved character sets, e.g. ISO/IEC 8859-10 (Latin alphabet no. 6) is very similar to character set ISO-8859-10. Each of the ISO/IEC 8859-x parts encodes characters in the same way: they cover the ASCII range (hex 20-7E) plus 96 additional characters in the A0-FF range, for a total of 191 characters. The ISO-8859-x sets each add the ISO 646 C0 "control" characters from 00-1F, a control character at 7F, and control characters in the 80-9F range, thus encompassing a total of 256 characters. ISO-8859-1 is unique among these sets in that its coded characters are equivalent to the first 256 code points of Unicode.
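The equivalence with the first 256 Unicode code points means that decoding ISO-8859-1 amounts to reading each byte value as a code point. A minimal sketch, assuming Python 3 and its built-in `iso-8859-1` codec:

```python
# Decode every 8-bit value and compare with the Unicode code points 0-255.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso-8859-1")
code_points = [ord(ch) for ch in decoded]
is_identity = code_points == list(range(256))

# Encoding is the reverse: a code point below 256 becomes the byte of the same value.
e_acute = "\u00e9".encode("iso-8859-1")   # LATIN SMALL LETTER E WITH ACUTE

# Code points of 256 and above cannot be represented at all.
try:
    "\u20ac".encode("iso-8859-1")         # EURO SIGN
    euro_encodable = True
except UnicodeEncodeError:
    euro_encodable = False

print(is_identity, e_acute, euro_encodable)
```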
ISO-8859-1 is the standard encoding used by the X Window System on most Unix machines.
Windows-1252
The legacy components of Microsoft Windows use, by default, an encoding that is a superset of ISO/IEC 8859-1 but differs from ISO-8859-1 in using displayable characters rather than control characters in the 80-9F range. Windows calls it ANSI generically, but depending on where the operating system was sold, the character set has another name, e.g. CP1252 in the US and Western European markets, with the IANA-approved name Windows-1252.
The following table shows Windows-1252, with changes from ISO-8859-1 highlighted:
In the table above, 20 is the regular SPACE character, and A0 is the NO-BREAK SPACE. AD is a SOFT HYPHEN, which should not appear at all in compliant web browsers. 81, 8D, 8F, 90, and 9D are unused. The Euro character at position 80 was not present in earlier versions of this code page.
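The differences from ISO-8859-1 can also be found programmatically. The sketch below assumes Python 3 and that its `cp1252` codec treats the unused positions as undecodable; other implementations may map those bytes differently.

```python
unused = []      # code values the cp1252 codec refuses to decode
remapped = {}    # code values that decode differently from ISO-8859-1

for value in range(256):
    raw = bytes([value])
    latin1_char = raw.decode("iso-8859-1")
    try:
        cp1252_char = raw.decode("cp1252")
    except UnicodeDecodeError:
        unused.append(value)
        continue
    if cp1252_char != latin1_char:
        remapped[value] = cp1252_char

print("unused:", [f"{v:02X}" for v in unused])
for value, char in sorted(remapped.items()):
    print(f"{value:02X}: {char!r}")
```

Every difference reported this way lies in the 80-9F range; outside it the two character sets agree.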
MacRoman
Older Apple Macintosh computers use an encoding, MacRoman, that differs from ISO 8859-1 in the first 32 and beyond the first 127 characters. It does include most of the characters present in ISO 8859-1, at other locations, but not all of them: the soft hyphen, the fractions ¼, ½, and ¾, and letters such as Ð, Þ, and Ý are among those missing. In contrast, MacRoman includes multiple characters which are not in ISO 8859-1. The Euro glyph replaced the previous generic currency sign.
The following table shows MacRoman, with the differences from ISO-8859-1 highlighted:
In the table above, 20 is the regular SPACE character, and CA is the NO-BREAK SPACE. F0 is a glyph depicting the Apple Computer logo. This character does not exist in Unicode and is therefore mapped into the Private Use Area. If your user agent displays anything there, it may or may not be the Apple Computer logo.
00–08, 0B and 0C, 0E–1F and 7F are unused.
The distinction between ISO 8859-1, ISO-8859-1, Windows-1252, and MacRoman is a common source of confusion among computer programmers.
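A short experiment can make the MacRoman differences concrete. This sketch assumes Python 3 and its `mac_roman` codec, which may not match every historical version of the Apple character set; the variable names are illustrative.

```python
# Decode a few MacRoman code values mentioned above.
nbsp = b"\xca".decode("mac_roman")   # expected to be NO-BREAK SPACE
logo = b"\xf0".decode("mac_roman")   # Apple logo, in the Private Use Area
euro = b"\xdb".decode("mac_roman")   # position of the former currency sign

# Collect the ISO 8859-1 graphic characters that this codec cannot encode.
missing = []
for code_point in list(range(0x20, 0x7F)) + list(range(0xA0, 0x100)):
    try:
        chr(code_point).encode("mac_roman")
    except UnicodeEncodeError:
        missing.append(chr(code_point))

print(f"NBSP at CA: U+{ord(nbsp):04X}")
print(f"Logo at F0: U+{ord(logo):04X}")
print("Not encodable in MacRoman:", [f"U+{ord(ch):04X}" for ch in missing])
```

The same bytes decoded as ISO-8859-1 or Windows-1252 yield different characters, which is why text is easily garbled when the wrong one of these encodings is assumed.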
External links
* ISO/IEC 8859-1:1998 final draft of the standard (PDF)
* Windows Codepages
* Differences between ANSI, ISO-8859-1 and MacRoman Character Sets
* The Letter Database
* ASCII - ISO 8859-1 Table with HTML Entity Names