mirror of https://github.com/mkerrisk/man-pages
utf-8.7: Minor rewordings in the opening paragraph
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
parent 99c2f1a20f
commit 76f6db57c7
12
man7/utf-8.7
@@ -26,7 +26,7 @@
 .\" 2001-05-11 Markus Kuhn <mgk25@cl.cam.ac.uk>
 .\" Update
 .\"
-.TH UTF-8 7 2012-04-30 "GNU" "Linux Programmer's Manual"
+.TH UTF-8 7 2014-02-26 "GNU" "Linux Programmer's Manual"
 .SH NAME
 UTF-8 \- an ASCII compatible multibyte Unicode encoding
 .SH DESCRIPTION
@@ -37,11 +37,10 @@ The most obvious
 Unicode encoding (known as
 .BR UCS-2 )
 consists of a sequence of 16-bit words.
-Such strings can contain as
-parts of many 16-bit characters bytes
-like \(aq\\0\(aq or \(aq/\(aq which have a
+Such strings can contain\(emas part of many 16-bit characters\(embytes
+such as \(aq\\0\(aq or \(aq/\(aq, which have a
 special meaning in filenames and other C library function arguments.
-In addition, the majority of UNIX tools expects ASCII files and can't
+In addition, the majority of UNIX tools expect ASCII files and can't
 read 16-bit words as characters without major modifications.
 For these reasons,
 .B UCS-2
@@ -50,7 +49,8 @@ is not a suitable external encoding of
 in filenames, text files, environment variables, and so on.
 The
 .BR "ISO 10646 Universal Character Set (UCS)" ,
-a superset of Unicode, occupies even a 31-bit code space and the obvious
+a superset of Unicode, occupies an even larger code
+space\(em31\ bits\(emand the obvious
 .B UCS-4
 encoding for it (a sequence of 32-bit words) has the same problems.
 