charsets.7: Minor tweaks

And restore a piece about Biblical Hebrew that was inadvertently deleted by Marko Myllynen's patch.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2014-06-05 14:25:56 +02:00 · 42d940faf8
parent a8ed5f7430
commit 42d940faf8
1 changed file with 14 additions and 12 deletions
--- a/man7/charsets.7
+++ b/man7/charsets.7
@@ -1,3 +1,4 @@
+'\" t -*- coding: UTF-8 -*-
 .\" Copyright (c) 1996 Eric S. Raymond <esr@thyrsus.com>
 .\" and Copyright (c) Andries Brouwer <aeb@cwi.nl>
 .\"
@@ -46,7 +47,7 @@ supersets of ASCII.
 As Unicode, when using UTF-8, is ASCII-compatible, plain ASCII text
 still renders properly on modern UTF-8 using systems.
 .SS ISO 8859
-ISO 8859 is a series of 15 8-bit character sets all of which have ASCII
+ISO 8859 is a series of 15 8-bit character sets, all of which have ASCII
 in their low (7-bit) half, invisible control characters in positions
 128 to 159, and 96 fixed-width graphics in positions 160-255.
 .LP
@@ -79,12 +80,12 @@ Slovak, and Slovene.
 Replacing Romanian ș/ț with ş/ţ was considered tolerable.
 .TP
 8859-3 (Latin-3)
-Latin-3 was designed to cover of Esperanto, Maltese, and Turkish but
+Latin-3 was designed to cover of Esperanto, Maltese, and Turkish, but
 8859-9 later superseded it for Turkish.
 .TP
 8859-4 (Latin-4)
 Latin-4 introduced letters for North European languages such as
-Estonian, Latvian, Lithuanian but was superseded by 8859-10 and
+Estonian, Latvian, and Lithuanian, but was superseded by 8859-10 and
 8859-13.
 .TP
 8859-5
@@ -99,19 +100,20 @@ letter forms, but a proper display engine should combine these
 using the proper initial, medial, and final forms.
 .TP
 8859-7
-Was created for modern Greek in 1987, updated in 2003.
+Was created for Modern Greek in 1987, updated in 2003.
 .TP
 8859-8
-Supports modern Hebrew without niqud (punctuation signs).
+Supports Modern Hebrew without niqud (punctuation signs).
 Niqud and full-fledged Biblical Hebrew were outside the scope of this
-character set.
+character set;
+under Linux, UTF-8 is the preferred encoding for these.
 .TP
 8859-9 (Latin-5)
 This is a variant of Latin-1 that replaces Icelandic letters with
 Turkish ones.
 .TP
 8859-10 (Latin-6)
-Latin-6 added Inuit (Greenlandic) and Sami (Lappish) letters that were 
+Latin-6 added the Inuit (Greenlandic) and Sami (Lappish) letters that were 
 missing in Latin-4 to cover the entire Nordic area.
 .TP
 8859-11
@@ -130,7 +132,7 @@ This is the Celtic character set, covering Old Irish, Manx, Gaelic,
 Welsh, Cornish, and Breton.
 .TP
 8859-15 (Latin-9)
-Latin-9 is similar to widely used Latin-1 but replaces some less
+Latin-9 is similar to the widely used Latin-1 but replaces some less
 common symbols with the Euro sign and French and Finnish letters that
 were missing in Latin-1.
 .TP
@@ -142,7 +144,7 @@ KOI8-R is a non-ISO character set popular in Russia before Unicode.
 The lower half is ASCII;
 the upper is a Cyrillic character set somewhat better designed than
 ISO 8859-5.
-KOI8-U, based off KOI8-R, has better support for Ukrainian.
+KOI8-U, based on KOI8-R, has better support for Ukrainian.
 Neither of these sets are ISO-2022 compatible,
 unlike the ISO-8859 series.
 .LP
@@ -198,7 +200,7 @@ It is not ISO 2022 compliant.
 .SS TIS-620
 TIS-620 is a Thai national standard character set and a superset
 of ASCII.
-Like in the ISO 8859 series, Thai characters are mapped into
+In the same fashion as the ISO 8859 series, Thai characters are mapped into
 0xa1-0xfe.
 .SS Unicode
 Unicode (ISO 10646) is a standard which aims to unambiguously represent
@@ -262,9 +264,9 @@ Rendering of Unicode data streams is typically handled through
 "subfont" tables which map a subset of Unicode to glyphs.
 Internally
 the kernel uses Unicode to describe the subfont loaded in video RAM.
-This means that the Linux console in UTF-8 mode one can use a character 
+This means that in the Linux console in UTF-8 mode, one can use a character 
 set with 512 different symbols.
-This is not enough for Japanese, Chinese and
+This is not enough for Japanese, Chinese, and
 Korean, but it is enough for most other purposes.
 .LP
 .SS ISO 2022 and ISO 4873
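
Several claims in the text touched by this patch can be checked directly. The following is a minimal sketch, not part of the commit, using Python's standard codecs. The codec names (`iso8859_1`, `iso8859_8`, `iso8859_15`, `koi8_r`, `tis_620`) are Python's own spellings, and the helper `encodable` and all variable names are made up for illustration.

```python
# Sketch only: checks a few statements from charsets.7 with Python's codecs.

def encodable(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# ASCII text yields identical bytes in UTF-8, the ISO 8859 parts,
# KOI8-R, and TIS-620, since all of them keep ASCII in the low half.
sample = "plain ASCII text"
ascii_bytes = sample.encode("ascii")
same_everywhere = all(
    sample.encode(codec) == ascii_bytes
    for codec in ("utf-8", "iso8859_1", "iso8859_15", "koi8_r", "tis_620")
)

# Latin-9 replaces some less common Latin-1 symbols: position 0xA4 is
# the currency sign in Latin-1 and the euro sign in Latin-9.
latin1_a4 = bytes([0xA4]).decode("iso8859_1")
latin9_a4 = bytes([0xA4]).decode("iso8859_15")

# 8859-8 covers Hebrew letters but not niqud; UTF-8 covers both.
alef = "\u05d0"    # HEBREW LETTER ALEF
hiriq = "\u05b4"   # HEBREW POINT HIRIQ (a niqud mark)
alef_in_8859_8 = encodable(alef, "iso8859_8")
hiriq_in_8859_8 = encodable(hiriq, "iso8859_8")
hiriq_in_utf8 = encodable(hiriq, "utf-8")

# TIS-620 maps Thai characters into the upper half, starting at 0xA1.
ko_kai = "\u0e01"  # THAI CHARACTER KO KAI
ko_kai_bytes = ko_kai.encode("tis_620")

if __name__ == "__main__":
    print(same_everywhere, latin1_a4, latin9_a4)
    print(alef_in_8859_8, hiriq_in_8859_8, hiriq_in_utf8)
    print(ko_kai_bytes)
```

The niqud check is the case behind the restored sentence: text that needs vowel points cannot be stored in 8859-8, which is why the page points to UTF-8 for it.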