mirror of https://github.com/openbsd/src.git synced 2024-12-21 23:18:00 -08:00

Even though US-ASCII (= ANSI X3.4-1986) only defines 128 characters,
the POSIX standard explicitly requires in section 6.2 that "the POSIX
locale shall contain 256 single-byte characters", see:
https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap06.html#tag_06_02

So the current behaviour of treating non-ASCII bytes in an LC_CTYPE=POSIX
input stream as if they were characters is not a POSIX violation, but
actually required by the standard - and not just for awk(1), but for
utility programs in general and even for library functions in general.
Consequently, delete the incorrect sentence I added to the STANDARDS
section last year.
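
For illustration only, here is a minimal C sketch of the distinction the
deleted sentence was about: under LC_CTYPE=POSIX, a byte such as 0xe9 is
a character, but isalpha(3) does not classify it as a letter. The helper
name count_letters_posix is hypothetical and is not part of awk(1) or of
this commit.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from awk(1): count the bytes in buf
 * that isalpha(3) classifies as letters under LC_CTYPE=POSIX.
 * Returns (size_t)-1 if the POSIX locale cannot be selected.
 */
static size_t
count_letters_posix(const unsigned char *buf, size_t len)
{
	size_t i, n = 0;

	if (setlocale(LC_CTYPE, "POSIX") == NULL)
		return (size_t)-1;
	for (i = 0; i < len; i++) {
		/* Each byte is one character; only A-Z and a-z are letters. */
		if (isalpha(buf[i]))
			n++;
	}
	return n;
}
```

For the three input bytes 'a', 0xe9, 'b', the helper counts two letters:
the byte 0xe9 is processed like any other character, but it is not a
letter.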

Thanks to millert@ and jmc@ for making me realize my mistake.
OK millert@ jmc@
schwarze 2024-08-11 18:24:43 +00:00
parent f3825f8693
commit 76e9942174

@@ -1,4 +1,4 @@
-.\" $OpenBSD: awk.1,v 1.69 2024/07/30 13:55:11 jmc Exp $
+.\" $OpenBSD: awk.1,v 1.70 2024/08/11 18:24:43 schwarze Exp $
 .\"
 .\" Copyright (C) Lucent Technologies 1997
 .\" All Rights Reserved
@@ -22,7 +22,7 @@
 .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
 .\" THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: July 30 2024 $
+.Dd $Mdocdate: August 11 2024 $
 .Dt AWK 1
 .Os
 .Sh NAME
@@ -1041,11 +1041,6 @@ and
 .Fn srand
 has been changed to support non-deterministic random numbers.
 .Pp
-In
-.Ev LC_CTYPE Ns Li =POSIX
-mode, treating non-ASCII input bytes as non-letter characters rather
-than as input encoding errors intentionally violates the specification.
-.Pp
 The flags
 .Op Fl \&dV ,
 .Op Fl -csv ,