utf-8.7: Minor rewordings in the opening paragraph

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk 2014-02-26 10:55:52 +01:00
parent 99c2f1a20f
commit 76f6db57c7
1 changed file with 6 additions and 6 deletions

@@ -26,7 +26,7 @@
 .\" 2001-05-11 Markus Kuhn <mgk25@cl.cam.ac.uk>
 .\" Update
 .\"
-.TH UTF-8 7 2012-04-30 "GNU" "Linux Programmer's Manual"
+.TH UTF-8 7 2014-02-26 "GNU" "Linux Programmer's Manual"
 .SH NAME
 UTF-8 \- an ASCII compatible multibyte Unicode encoding
 .SH DESCRIPTION
@@ -37,11 +37,10 @@ The most obvious
 Unicode encoding (known as
 .BR UCS-2 )
 consists of a sequence of 16-bit words.
-Such strings can contain as
-parts of many 16-bit characters bytes
-like \(aq\\0\(aq or \(aq/\(aq which have a
+Such strings can contain\(emas part of many 16-bit characters\(embytes
+such as \(aq\\0\(aq or \(aq/\(aq, which have a
 special meaning in filenames and other C library function arguments.
-In addition, the majority of UNIX tools expects ASCII files and can't
+In addition, the majority of UNIX tools expect ASCII files and can't
 read 16-bit words as characters without major modifications.
 For these reasons,
 .B UCS-2
@@ -50,7 +49,8 @@ is not a suitable external encoding of
 in filenames, text files, environment variables, and so on.
 The
 .BR "ISO 10646 Universal Character Set (UCS)" ,
-a superset of Unicode, occupies even a 31-bit code space and the obvious
+a superset of Unicode, occupies an even larger code
+space\(em31\ bits\(emand the obvious
 .B UCS-4
 encoding for it (a sequence of 32-bit words) has the same problems.
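
The reworded paragraph says that UCS-2 strings can contain the bytes '\0' and '/' as part of ordinary 16-bit characters, and that UCS-4 has the same problems. The Python sketch below is only an illustration of that claim, not part of the commit. The helper names `ucs2_be` and `ucs4_be`, the big-endian byte order, and the sample characters are assumptions chosen for the example.

```python
import struct


def ucs2_be(text):
    """Serialize text as big-endian 16-bit words (UCS-2, BMP only)."""
    units = [ord(ch) for ch in text]
    if any(u > 0xFFFF for u in units):
        raise ValueError("UCS-2 covers only code points up to U+FFFF")
    return struct.pack(">%dH" % len(units), *units)


def ucs4_be(text):
    """Serialize text as big-endian 32-bit words (UCS-4)."""
    units = [ord(ch) for ch in text]
    return struct.pack(">%dI" % len(units), *units)


# U+0041 is "A"; U+262F was picked because its low byte is 0x2F, i.e. "/".
sample = "A\u262f"

ucs2 = ucs2_be(sample)
ucs4 = ucs4_be(sample)

# Both encodings embed a NUL byte and a slash byte inside ordinary characters.
print(ucs2.hex(), b"\x00" in ucs2, b"/" in ucs2)
print(ucs4.hex(), b"\x00" in ucs4, b"/" in ucs4)
```

A C function that treats such a buffer as a NUL-terminated string or as a pathname would stop at, or split on, those embedded bytes, which is the problem the man page text describes.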