Let us return to path_resolution.2...
> From: Andries Brouwer <Andries.Brouwer@cwi.nl>
> Subject: Re: ***UNCHECKED*** man-pages-2.11
> Date: Mon, 24 Oct 2005 20:43:42 +0200
>
> On Mon, Oct 24, 2005 at 05:27:56PM +0200, Michael Kerrisk wrote:
>
> > PS I changed some text in path_resolution.2, where it seems to
> > me that you made an error. But I could be wrong -- you
> > might like to double check it?
>
> Hmm, I think it was precisely correct and no longer is.
>
> I see some change in wording that does not actually change anything,
> and the addition of "as well" that may be incorrect.
Let's begin with a diff:
=====
--- man-pages-2.10/man2/path_resolution.2 2005-07-18 18:17:52.000000000 +0200
+++ man-pages-2.11/man2/path_resolution.2 2005-10-24 13:18:13.000000000 +0200
@@ -185,11 +185,13 @@
Traditional systems do not use capabilities and root (user ID 0) is
all-powerful. Such systems are presently (2.6.7) handled by giving root
-all capabilities except for CAP_SETPCAP. More precisely, at exec time
-a process gets all capabilities except CAP_SETPCAP and the five capabilities
+all capabilities except for CAP_SETPCAP. More precisely,
+a process gets all capabilities except CAP_SETPCAP
+and the five capabilities
CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER, CAP_FSETID,
-in case it has zero effective UID, and it gets these last five capabilities
-in case it has zero fsuid, while all other processes get no capabilities.
+if its effective UID is 0,
+and it gets these last five capabilities if its fsuid is 0 as well,
+while all other processes get no capabilities.
The CAP_DAC_OVERRIDE capability overrides all permission checking,
but will only grant execute permission when at least one
====
The main points of change are the following:
1. Removal of discussion of "exec time".
2. Addition of "as well".
I'll start with point 2. I'm wrong. I had it in my mind that
fsuid could only be made 0 if euid was already 0. But that isn't
true; setfsuid(x) allows us to turn this (somewhat unusual, but
theoretically possible) scenario:
    Real  Eff  Saved  FS
    0     y    y      y
into this (setfsuid() allows us to set the fsuid to any of the R/E/S
UID values):
    Real  Eff  Saved  FS
    0     y    y      0
And indeed the process then has the 5 CAP_FS_MASK capabilities
in its effective set, but none of the others.
I've removed the words "as well".
On to point 1.
I removed "exec time" because it seems misleading. As far as I can
tell, exec is not directly relevant, except inasmuch as we exec
a set-user-ID-root program. The real point is that effective
capabilities change as a result of changes to the euid and
fsuid. Those can happen because we exec a set-user-ID-root program,
or through calls to seteuid(), setfsuid(), and friends.
As such, that change still seems to me to be correct. But
perhaps I have still missed something that you were trying to
say. If so, let me know.
Cheers,
Michael
Added text on effect of NULL for 'set' argument.
Added text noting effect of ignoring SIGBUS, SIGFPE, SIGILL,
and SIGSEGV.
Noted that sigprocmask() can't be used in multithreaded process.
Fixed EINVAL error diagnostic.
Changed CONFORMING TO.
Rewrote description of MREMAP_MAYMOVE.
Rewrote description of EAGAIN error.
Added discussion of resizing of memory locks.
Added entries to SEE ALSO.
Some formatting fixes.
To: Olivier Croquette <ocroquette@free.fr>
Subject: Re: 2.6.12 and setitimer
Date: Mon, 4 Jul 2005 08:36:35 +0200 (MEST)
Hi Olivier,
> You will probably consider adding also a note to point out that the bug
> will stay a known bug of the 2.4 series:
>
> http://lkml.org/lkml/2005/7/1/165
First off, I _very_ much appreciate the fact that you keep
informing me of the progress of this bug! Thank you.
At the moment, I'm inclined to leave the manual page as it is.
It currently reads:
On certain systems (including x86), Linux kernels
before version 2.6.12 have a bug which
will produce premature timer expirations of up
to one jiffy under some circumstances. This
bug is fixed in kernel 2.6.12.
To me that implies that the bug also affects kernels before
2.6 -- e.g., 2.4.x. Now, what would be interesting is if the
bug *does* get fixed in 2.4; then I could also add a note
about the 2.4.x version where it is fixed.
In the meantime, I have added a note to myself (i.e., a comment
in the man page source) about this point.
If the bug *does* eventually get fixed in 2.4.x, and you
hear of it, please do let me know.
Thanks,
Michael
"file status flags", and "file descriptor flags"
Some rewriting of discussion of file descriptor flags
Under F_DUPFD, replaced some text duplicated in dup.2 with a cross ref to dup.2
Minor wording and formatting fixes
Regarding man page documentation of the problem of short sleeps
for setitimer(2)...
> > -- pointers to those threads
>
> http://bugzilla.kernel.org/show_bug.cgi?id=4569
> http://lkml.org/lkml/2005/4/29/163
>
> > -- indications of which kernel versions show this behaviour
>
> AFAIK, all versions as far as x86 is concerned.
> Dunno if it is hardware specific.
>
> > -- a (short) test program to demonstrate it, if you have one.
>
> See the bugzilla bug's attachments
Sorry for the long delay in following this up, but I've got to
it now. I tweaked your suggestions slightly:
{{
Timers will never expire before the requested time,
-instead expiring some short, constant time afterwards, dependent
-on the system timer resolution (currently 10ms).
+but may expire some (short) time afterwards, which depends
+on the system timer resolution and on the system load.
+Upon expiration, a signal will be generated and the timer reset.
+If the timer expires while the process is active (always true for
+On certain systems (including x86), the Linux kernel has a bug which will
+produce premature timer expirations of up to one jiffy under some
+circumstances.
}}
Thanks for this bug report.
Nishanth: if and when your changes are accepted, and the problem
is thus fixed, could you please send me a notification of that
fact, and I can then further amend the manual pages.
Cheers,
Michael
/* itimer_short_interval_bug.c
June 2005
In current Linux kernels, an interval timer set using setitimer()
can sometimes sleep *less* than the specified interval.
This program demonstrates the behaviour by looping through all
itimer values from 1 microsecond upwards, in one microsecond steps.
*/
/* Adapted from a program by Olivier Croquette, June 2005 */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/wait.h>
typedef unsigned long long int u_time_t; /* in microsecs */
static volatile sig_atomic_t handler_flag; /* incremented by the SIGALRM handler */
/* return time as a number of microsecs */
static u_time_t
gettime(void )
{
struct timeval tv;
if ( gettimeofday(&tv, NULL) == -1) {
perror("gettimeofday()");
return 0;
}
return (tv.tv_usec + tv.tv_sec * 1000000LL);
}
static void
handler (int sig, siginfo_t *siginfo, void *context)
{
handler_flag++;
return ;
}
/* Sleep for 'time' microsecs. */
static int
isleep(u_time_t time)
{
struct itimerval newtv;
sigset_t sigset;
struct sigaction sigact;
if (time == 0)
return 0;
/* block SIGALRM */
sigemptyset (&sigset);
sigaddset (&sigset, SIGALRM);
sigprocmask (SIG_BLOCK, &sigset, NULL);
/* set up our handler */
sigact.sa_sigaction = handler;
sigemptyset(&sigact.sa_mask);
sigact.sa_flags = SA_SIGINFO;
sigaction (SIGALRM, &sigact, NULL);
newtv.it_interval.tv_sec = 0;
newtv.it_interval.tv_usec = 0;
newtv.it_value.tv_sec = time / 1000000;
newtv.it_value.tv_usec = time % 1000000;
if (setitimer(ITIMER_REAL,&newtv,NULL) == -1) {
perror("setitimer(set)");
return 1;
}
sigemptyset (&sigset);
sigsuspend (&sigset);
return 0;
}
int
main(int argc, char *argv[]) {
u_time_t wait;
int loop, numLoops;
u_time_t t1, t2;
u_time_t actual;
long long minDiff, maxDiff, totDiff, diff;
int numFail = 0;
if (argc != 2) {
fprintf(stderr, "Usage: %s num-loops\n", argv[0]);
exit(EXIT_FAILURE);
} /* if */
numLoops = atoi(argv[1]);
setbuf(stdout, NULL);
for (wait = 1; ; wait++) {
maxDiff = 0;
numFail = 0;
totDiff = 0;
minDiff = -(long long) wait; /* convert before negating; 'wait' is unsigned */
if (wait % 10000 == 0)
printf("%llu\n", wait);
for (loop = 0; loop < numLoops; loop++) {
t1 = gettime();
handler_flag = 0;
isleep(wait);
if ( handler_flag != 1 )
printf("Problem with the handler flag (%d)!\n", handler_flag);
t2 = gettime();
actual = t2 - t1;
if ( actual < wait ) {
diff = actual - wait;
if (diff < maxDiff)
maxDiff = diff;
if (diff > minDiff)
minDiff = diff;
totDiff += diff;
numFail++;
} /* if */
} /* for */
if (numFail > 0)
printf("%llu: %3d fail (%4lld %4lld; avg=%6.1f)\n",
wait, numFail, minDiff, maxDiff,
(double) totDiff / numFail);
} /* for */
return 0;
} /* main */
> The question came up whether execve of a suid binary while being ptraced
> would fail or ignore the suid part. The answer today seems to be the
> latter:
>
> E.g. (in 2.6.11) security/dummy.c:
>
> static void dummy_bprm_apply_creds (struct linux_binprm *bprm, int
> unsafe)
> {
> if (bprm->e_uid != current->uid || bprm->e_gid != current->gid) {
> if ((unsafe & ~LSM_UNSAFE_PTRACE_CAP) &&
> !capable(CAP_SETUID)) {
> bprm->e_uid = current->uid;
> bprm->e_gid = current->gid;
> }
> }
> }
>
> and fs/exec.c:
>
> void compute_creds(struct linux_binprm *bprm) {
> int unsafe;
>
> unsafe = unsafe_exec(current);
> security_bprm_apply_creds(bprm, unsafe);
> }
>
> static inline int unsafe_exec(struct task_struct *p) {
> int unsafe = 0;
> if (p->ptrace & PT_PTRACED) {
> if (p->ptrace & PT_PTRACE_CAP)
> unsafe |= LSM_UNSAFE_PTRACE_CAP;
> else
> unsafe |= LSM_UNSAFE_PTRACE;
> }
> return unsafe;
> }
>
> That is: if the process that calls execve() is being traced,
> the LSM_UNSAFE_PTRACE bit is set in unsafe and security_bprm_apply_creds()
> will make sure the suid/sgid bits are ignored.
>
> ---
>
> In my man page I do not read anything like that. It says
>
> EPERM The process is being traced, the user is not the superuser and
> the file has an SUID or SGID bit set.
> and
>
> If the current program is being ptraced, a SIGTRAP is sent to it after
> a successful execve().
>
> If the set-uid bit is set on the program file pointed to by filename
> the effective user ID of the calling process is changed to that of the
> owner of the program file.
>
> So, maybe this sentence should be amended to read
>
> If the set-uid bit is set on the program file pointed to by filename
> and the current process is not being ptraced, the effective user ID
> of the calling process is changed to ...
I changed your "current" to "calling" (to be consistent with the
rest of the page), but otherwise applied as you suggest.
The revision will appear in man-pages-2.03, which I can release
any time now. Are you available to do an upload tomorrow?
Added text on permissions required to send signal to owner.
====
Hello Johannes,
> Betreff: Inaccuracy of fcntl man page
> Datum: Mon, 2 May 2005 20:07:12 +0200
Thanks for your note.
Sorry for the delay in getting back to you. I needed to find time
to set aside to look at the details. Now I've finally got there.
> I have attached a simple program
Thanks -- a little program is always helpful.
> that uses the fcntl system call in order
> to kill an arbitrary process of the same user.
> According to the fcntl man page, fcntl(fd,F_SETOWN,pid) returns zero if
> it has success.
Yes.
> If you strace the program while killing for example man running in another
> terminal, you will see that man is killed, but fcntl(fd,F_SETOWN,pid)
> will return EPERM,
I confirm that I see this problem in 2.4, with both Unix domain
and Internet domain sockets.
> where you can only find a very confusing explanation
> in the fcntl man page.
I'm not sure what explanation you mean here. As far as I can
tell, the manual page just doesn't cover this point.
> I have looked into the kernel source of 2.4.30 and found out, that
> net/core/socket::sock_no_fcntl is the culprit if you use fcntl on Unix
> sockets.
Yes, looks that way to me as well. And the 2.2 code looks
similar.
> If pid is not your own pid or not your own process group,
> the system call will return EPERM but will also set the pid
> as you wanted to.
Yes.
> In the 2.6 kernel line, fcntl will react according the specification in
> the manual page.
Yes.
> If you also think, that one should clarify the return specification of
> fcntl(fd,F_SETOWN,pid) for 2.4.x kernels, please tell me and I will
> provide you with a patch for the manual page.
In fact I've written some new text under BUGS, which describes
the problem:
In Linux 2.4 and earlier, there is a bug that can occur when an
unprivileged process uses F_SETOWN to specify the owner of a
socket file descriptor as a process (group) other than the
caller. In this case, fcntl() can return -1 with errno set to
EPERM, even when the owner process (group) is one that the
caller has permission to send signals to. Despite this error
return, the file descriptor owner is set, and signals will be
sent to the owner.
Does that seem okay to you?
> Furthermore, it would be interesting to write there, what permissions
> one needs in order to send signals to processes via fcntl
Good idea. I added the following new text:
Sending a signal to the owner process (group) specified by
F_SETOWN is subject to the same permissions checks as are
described for kill(2), where the sending process is the one that
employs F_SETOWN (but see BUGS below).
====
#define _GNU_SOURCE /* needed to get the F_SETSIG define */
#include <fcntl.h> /* in glibc 2.2 this has the needed
                      values defined */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h> /* for atoi() */
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
/**
 * Funnykill kills a program with fcntl
 **/
/* Report a failed call. Execution deliberately continues: on Linux 2.4,
   F_SETOWN can fail with EPERM and still set the owner. */
static void
errMsg (const char *msg)
{
  perror (msg);
}
int
main (int argc, char **argv)
{
  if (argc != 2)
    {
      fprintf (stderr, "Usage: funnykill <pid>\n");
      return 1;
    }
  int sockets[2];
  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sockets) == -1)
    {
      perror ("socketpair");
      return 1;
    }
  if (fcntl (sockets[0], F_SETFL, O_ASYNC | O_NONBLOCK) == -1)
    errMsg ("fcntl-F_SETFL");
  if (fcntl (sockets[0], F_SETOWN, atoi (argv[1])) == -1)
    errMsg ("fcntl-F_SETOWN");
  // fcntl (sockets[0], F_SETOWN, getpid());
  if (fcntl (sockets[0], F_SETSIG, SIGKILL) == -1)
    errMsg ("fcntl-F_SETSIG");
  /* 8 characters plus the terminating null byte */
  if (write (sockets[1], "good bye", 9) == -1)
    errMsg ("write");
  return 0;
}
.\" For Unix domain sockets and regular files, EPERM is only returned in
.\" Linux 2.2 and earlier; in Linux 2.4 and later, unprivileged processes can
.\" use mknod() to make these files.