mirror of https://github.com/mkerrisk/man-pages
utf-8.7: Minor rewordings in the opening paragraph
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
parent 99c2f1a20f
commit 76f6db57c7
12
man7/utf-8.7
@@ -26,7 +26,7 @@
 .\" 2001-05-11 Markus Kuhn <mgk25@cl.cam.ac.uk>
 .\" Update
 .\"
-.TH UTF-8 7 2012-04-30 "GNU" "Linux Programmer's Manual"
+.TH UTF-8 7 2014-02-26 "GNU" "Linux Programmer's Manual"
 .SH NAME
 UTF-8 \- an ASCII compatible multibyte Unicode encoding
 .SH DESCRIPTION
@@ -37,11 +37,10 @@ The most obvious
 Unicode encoding (known as
 .BR UCS-2 )
 consists of a sequence of 16-bit words.
-Such strings can contain as
-parts of many 16-bit characters bytes
-like \(aq\\0\(aq or \(aq/\(aq which have a
+Such strings can contain\(emas part of many 16-bit characters\(embytes
+such as \(aq\\0\(aq or \(aq/\(aq, which have a
 special meaning in filenames and other C library function arguments.
-In addition, the majority of UNIX tools expects ASCII files and can't
+In addition, the majority of UNIX tools expect ASCII files and can't
 read 16-bit words as characters without major modifications.
 For these reasons,
 .B UCS-2
@@ -50,7 +49,8 @@ is not a suitable external encoding of
 in filenames, text files, environment variables, and so on.
 The
 .BR "ISO 10646 Universal Character Set (UCS)" ,
-a superset of Unicode, occupies even a 31-bit code space and the obvious
+a superset of Unicode, occupies an even larger code
+space\(em31\ bits\(emand the obvious
 .B UCS-4
 encoding for it (a sequence of 32-bit words) has the same problems.
 