The length of this page means that it's becoming difficult to parse
which info is specific to mount() versus umount()/umount2(), so split
the umount material out into its own page.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Refer the reader to new text in execve(2) that describes how
(since Linux 2.6.23) RLIMIT_STACK determines the value of ARG_MAX.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
POSIX.1-2001 says that the values returned by sysconf()
are constant for the life of the process.
But the fact that, since Linux 2.6.23, ARG_MAX is settable
via RLIMIT_STACK means _SC_ARG_MAX is no longer constant,
since it can change at each execve().
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Starting with Linux 2.6.23, the ARG_MAX limit became settable via
(1/4 of) RLIMIT_STACK. This broke ABI compatibility if RLIMIT_STACK
was set such that ARG_MAX was < 32 pages. Document the fact that
since 2.6.25 Linux imposes a floor on ARG_MAX, so that the old limit
of 32 pages is guaranteed.
For some background on the changes to ARG_MAX in kernels 2.6.23 and
2.6.25, see:
http://sourceware.org/bugzilla/show_bug.cgi?id=5786
http://bugzilla.kernel.org/show_bug.cgi?id=10095
http://thread.gmane.org/gmane.linux.kernel/646709/focus=648101,
checked into 2.6.25 as commit a64e715fc74b1a7dcc5944f848acc38b2c4d4ee2.
Also some reordering/rewording of the discussion of ARG_MAX.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The old sentence sat on its own in an odd place, and anyway the
modern BSDs use the name RLIMIT_NOFILE.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
A sentence clarifying that pending signal set is union of
per-thread and process-wide pending signal sets.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The page was previously fuzzy about whether these interfaces
have process-wide or per-thread semantics. (E.g., now the
page states that the calling *thread* (not process) is suspended
until the signal is delivered.)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
These words are slightly bogus: although the interface is obsolete,
for ABI-compatibility reasons, the kernel folk should never be changing
this interface.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
glibc doesn't provide any support for readdir(2),
so remove these header files (which otherwise suggest
that glibc does provide the required pieces).
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The location of the fields after d_name varies according to
the size of d_name. We can't properly declare them in C;
therefore, put those fields inside a comment.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The structure isn't currently defined in glibc headers, and the kernel
name of the structure is 'linux_dirent' (as was already used in some,
but not all, places in this page).
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Maxin suggested a patch, which I've rewritten and expanded.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Reported-by: Maxin B. John <maxin.john@ap.sony.com>
As at kernel 2.6.27, only ext[234] support d_type.
On other file systems, d_type is always set to DT_UNKNOWN (0).
Reported-by: Ricardo Catalinas Jiménez <jimenezrick@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Make it clear that the POSIX.1 revision that is likely
to affect the feature test macro requirements for futimens() is
POSIX.1-2008.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Reported-by: Nicolas François <nicolas.francois@centraliens.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The times argument points to *an array of* structures, and the
man-page should say that consistently.
(The '&' before sop in the semop() call is unneeded.)
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Summary from mtk: recent work on mlock caused Maxin to notice that
the EAGAIN error was not documented. KOSAKI Motohiro noted
that this behavior is longstanding.
=====
Dear Michael,
As per the mlock(2) implementation bugfix which is present in
Linux 2.6.27-rc2 git commit,
(http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a477097d9c37c1cf289c7f0257dffcfa42d50197),
the mlock(2) man page should be modified to reflect the latest changes
in the kernel.
See the LKML thread regarding this commit :
http://www.nabble.com/mlock()-return-value-issue-in-kernel-2.6.23.17-td18751601.html
This patch modifies the mlock(2) behaviour as per the SUSv3 specification.
[ENOMEM]
Some or all of the address range specified by the addr and
len arguments does not correspond to valid mapped pages
in the address space of the process.
[EAGAIN]
Some or all of the memory identified by the operation could not
be locked when the call was made.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Maxin B. John <maxin.john@ap.sony.com>
=====
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: "Maxin John" <maxin.john@gmail.com>
Subject: Re: mlock(2) man page modifications
Cc: kosaki.motohiro@jp.fujitsu.com,
"Michael Kerrisk" <mtk.manpages@googlemail.com>, man@vger.kernel.org
Date: Thu, 25 Sep 2008 15:04:49 +0900 (JST)
Hi Maxin,
Thank you for your attention.
I think your point and your patch are right.
However, my patch is trivial regression fix, not behavior change.
An older kernel can return EAGAIN at memory starvation.
my patch has following hunk.
> +++ b/mm/mlock.c
> @@ -78,8 +78,6 @@ success:
>
> mm->locked_vm -= pages;
> out:
> - if (ret == -ENOMEM)
> - ret = -EAGAIN;
In addition, 2.6.11 (oldest code of git repository) has following code.
static int mlock_fixup(struct vm_area_struct * vma,
unsigned long start, unsigned long end, unsigned int newflags)
{
(snip)
vma->vm_mm->locked_vm -= pages;
out:
if (ret == -ENOMEM)
ret = -EAGAIN;
return ret;
}
that behavior is linux mlock's behavior for long long time.
Thanks!
The error by getpid() in the presence of clone() occurs
only for a fork-like clone (one that omits CLONE_VM from the flags.)
This is a low-level detail, but there is no problem [known-to-me]
for thread-like clone().
getpid() caches the PID after the first call. This relies
on support in the glibc wrappers for fork()/vfork()/clone().
However, if syscall() is used to directly invoke fork()/vfork()/clone(),
the cache is not updated, and getpid() in the child produces the wrong
result.
> > Linux, lstat(2) will generally not trigger automounter action, whereas
> > stat(2) will.
>
> I don't understand this last piece. Can you say some more. (I'm not
> familiar with automounter details.)
An automounter (either an explicit one, like autofs, or an implicit
one, such as are used by AFS or NFSv4) is something that triggers
a mount when something is touched.
However, it's undesirable to automount, say, everyone's home
directory just because someone opened up /home in their GUI
browser or typed "ls -l /home". The early automounters simply
didn't list the contents until you accessed it by name;
this is still the case when you can't enumerate a mapping
(say, all DNS names under /net). However, this is extremely
inconvenient, too.
The solution we ended up settling on is to create something
that looks like a directory (i.e. reports S_IFDIR in stat()),
but behaves somewhat like a symlink. In particular, when it is
accessed in a way where a symlink would be dereferenced,
the automount triggers and the directory is mounted. However,
system calls which do *not* cause a symlink to be dereferenced,
like lstat(), also do not cause the automounter to trigger.
This means that "ls -l", or a GUI file browser, can see a list
of directories without causing each one of them to be automounted.
-hpa
links in 'oldpath'; see also http://lwn.net/Articles/294667.
POSIX.1-2008 makes it implementation-dependent whether or not
'oldpath' is dereferenced if it is a symbolic link.
Another attempt to rationalize description of MPOL_DEFAULT.
Since ~2.6.25, the system default memory policy is "local allocation".
MPOL_DEFAULT itself is a request to remove any non-default policy and
"fall back" to the surrounding context. Try to say that without delving
into implementation details.
Update the get_mempolicy(2) man page to add in the description of
the MPOL_F_MEMS_ALLOWED flag, added in 2.6.23.
mtk
Document additional EINVAL error that occurs if MPOL_F_MEMS_ALLOWED
is specified with either MPOL_F_ADDR or MPOL_F_NODE.
Misc cleanup of get_mempolicy(2):
+ mention that any mode flags will be saved with mode.
I don't bother to document mode flags here because we
already have a pointer to set_mempolicy(2) for more info
on memory policy. mode flags are discussed there.
+ remove some old, obsolete [IMO] NOTES and 'roff comments.
The AF_* and PF_* constants have always had the same values; there never
has been a protocol family that had more than one address family,
and POSIX.1-2001 only specifies the AF_* constants.
nodes outside the task's cpuset, as long as one valid node remains.
Now that cpuset man page exists, we can refer to it. Remove
stale comment regarding lack thereof.
Fix up the error return for nodemask containing nodes disallowed by
the process' current cpuset. Disallowed nodes are now silently ignored,
as long as the nodemask contains at least one node that is on-line,
allowed by the process' cpuset and has memory.
Now that we have a cpuset man page, we can refer to cpusets directly
in the man page text.
Document PR_GET_TSC and PR_SET_TSC.
Document PR_SET_SECCOMP and PR_GET_SECCOMP.
PR_SET_KEEPCAPS and PR_GET_KEEPCAPS operate on a per-thread
setting, not a per-process setting.
Clarify fork(2) details for PR_SET_PDEATHSIG.
Add description of PR_SET_SECUREBITS and PR_GET_SECUREBITS,
as well as pointer to further info in capabilities(7).
PR_GET_ENDIAN returns endianness info in location pointed to by
arg2 (not as function result, as was implied by previous text).
Expand description of PR_SET_NAME and PR_GET_NAME.
RETURN VALUE: bring up to date for various options.
Various improvements in ERRORS.
Note that PR_SET_TIMING setting of PR_TIMING_TIMESTAMP is not
currently implemented.
Minor changes:
* Clarify wording for PR_GET_UNALIGN, PR_GET_FPEMU, and PR_GET_FPEXC.
* Some reformatting of kernel version information.
* Reorder PR_GET_ENDIAN and PR_SET_ENDIAN entries.
handler for SIGCHLD.
Describe POSIX specification, and Linux semantics for
SA_NOCLDWAIT when establishing a handler for SIGCHLD.
Add pointer under SA_RESTART to new text in signal(7)
describing system call restarting.
Other minor edits.