Hi Andries,
[Just for my own reference, I reinclude the pointer to Philippe
Troin's patch
http://marc.theaimsgroup.com/?l=linux-kernel&m=108380640603164&w=2
]
> > > Except of course for fcntl(fd, F_GETOWN) where the owner is a
> > > (negative) process group... If the owning process group has a "low
> > > enough" PGID, it collides with errors and glibc reports an error and
> > > sets errno to -PGID. One might argue that in this instance, that the
> > > BSD's overloading of the pid field with pgids is at fault, but the
> > > bug
> > > still remains :-)
> >
> > I believe that practically speaking this is a non-issue. The
> > lowest PID / PGID that can be allocated to a process other than
> > init or a kernel thread is 300. (RESERVED_PID in kernel/pid.c
> > in 2.6, details differ, but same limit in <= 2.4.)
>
> Hmm. RESERVED_PIDS is used as starting value after overflow,
> not as a starting value at the beginning. I think you are mistaken.

Hmm -- yes. And I was in any case assuming the notion
of a process that might do an F_SETOWN assigning
its own PGID to the socket -- but that might not be so.
And I was overlooking a comment in the fs/fcntl.c
sources that reiterates the point:

        case F_GETOWN:
                /*
                 * XXX If f_owner is a process group, the
                 * negative return value will get converted
                 * into an error.  Oops.  If we keep the
                 * current syscall conventions, the only way
                 * to fix this will be in libc.
                 */
                err = filp->f_owner.pid;
                force_successful_syscall_return();
                break;

And now I've actually reproduced the error in userland code.
It seems that whenever the owner is a process group whose PGID
is smaller than 4096, the value returned by F_GETOWN (-PGID, which
then lies in -4095 .. -1) is interpreted as an error.
Now I see the relevant code in
sysdeps/unix/sysv/linux/i386/sysdep.h:
==
/* Linux uses a negative return value to indicate syscall errors,
unlike most Unices, which use the condition codes' carry flag.
Since version 2.1 the return value of a system call might be
negative even if the call succeeded. E.g., the `lseek' system call
might return a large offset. Therefore we must not anymore test
for < 0, but test for a real error by making sure the value in %eax
is a real error number. Linus said he will make sure the no syscall
returns a value in -1 .. -4095 as a valid result so we can savely
test with -4095. */
[...]
    DO_CALL (syscall_name, args);
    cmpl $-4095, %eax;
    jae SYSCALL_ERROR_LABEL;
==
Ugh.
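
To spell out what that cmpl/jae pair does: jae is an unsigned
comparison, so only raw return values in -4095 .. -1 are taken as
errors. A rough C equivalent follows. The names raw_is_error() and
getown_pgid_misread() are invented for illustration and do not appear
in glibc, and long is assumed to stand in for the register width.

```c
#include <stdbool.h>

/* Rough equivalent of `cmpl $-4095, %eax; jae SYSCALL_ERROR_LABEL`:
   compare as unsigned, so exactly the raw values -4095 .. -1 are
   classified as errors. */
static bool raw_is_error(long raw)
{
    return (unsigned long) raw >= (unsigned long) -4095L;
}

/* F_GETOWN returns -pgid when the owner is a process group, so the
   result is misread as an error exactly when 1 <= pgid <= 4095. */
static bool getown_pgid_misread(long pgid)
{
    return raw_is_error(-pgid);
}
```

So a process group with PGID 300 is misread (errno ends up as 300),
while one with PGID 4096 or above comes back intact.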
of several Section 2 pages using the _syscallN() macros.
In addition:
-- erroneous semicolons at the end of _syscallN() were removed
on various pages.
-- types such as "uint" in _syscallN() declarations were changed
to "unsigned int", etc.
-- various other minor breakages in the synopses were fixed.