Summary from mtk: recent work on mlock caused Maxin to notice that
the EAGAIN error was not documented. KOSAKI Motohiro noted
that this behavior is longstanding.
=====
Dear Michael,
Following the mlock(2) implementation bug fix included in
Linux 2.6.27-rc2 (git commit
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a477097d9c37c1cf289c7f0257dffcfa42d50197),
the mlock(2) man page should be updated to reflect the latest changes
in the kernel.
See the LKML thread regarding this commit:
http://www.nabble.com/mlock()-return-value-issue-in-kernel-2.6.23.17-td18751601.html
This patch modifies the mlock(2) behaviour as per the SUSv3 specification.
[ENOMEM]
Some or all of the address range specified by the addr and
len arguments does not correspond to valid mapped pages
in the address space of the process.
[EAGAIN]
Some or all of the memory identified by the operation could not
be locked when the call was made.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Maxin B. John <maxin.john@ap.sony.com>
=====
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: "Maxin John" <maxin.john@gmail.com>
Subject: Re: mlock(2) man page modifications
Cc: kosaki.motohiro@jp.fujitsu.com,
"Michael Kerrisk" <mtk.manpages@googlemail.com>, man@vger.kernel.org
Date: Thu, 25 Sep 2008 15:04:49 +0900 (JST)
Hi Maxin,
Thank you for your attention.
I think your point and your patch are right.
However, my patch is a trivial regression fix, not a behavior change.
Older kernels can also return EAGAIN on memory starvation.
My patch has the following hunk:
> +++ b/mm/mlock.c
> @@ -78,8 +78,6 @@ success:
>
> mm->locked_vm -= pages;
> out:
> - if (ret == -ENOMEM)
> - ret = -EAGAIN;
In addition, 2.6.11 (the oldest code in the git repository) has the following code:
static int mlock_fixup(struct vm_area_struct * vma,
        unsigned long start, unsigned long end, unsigned int newflags)
{
        (snip)
        vma->vm_mm->locked_vm -= pages;
out:
        if (ret == -ENOMEM)
                ret = -EAGAIN;
        return ret;
}
So this has been Linux mlock()'s behavior for a very long time.
Thanks!
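
For callers, the practical consequence of the two errors discussed above
can be sketched as follows. This is only an illustration: the enum and the
helper names (classify_mlock_errno, try_mlock) are hypothetical and appear
neither in the kernel nor in the man page; only mlock(2) and the errno
values are real.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical classification of mlock() outcomes for a caller. */
enum lock_result { LOCK_OK, LOCK_RETRY, LOCK_NOMEM, LOCK_OTHER };

/* Map an errno value from mlock() to what the caller might do next. */
enum lock_result classify_mlock_errno(int err)
{
    switch (err) {
    case 0:
        return LOCK_OK;
    case EAGAIN:
        /* Some or all of the memory could not be locked when the
           call was made; trying again later may succeed. */
        return LOCK_RETRY;
    case ENOMEM:
        /* Part of the range is not mapped (the SUSv3 wording quoted
           above); on Linux this can also mean a locking limit such
           as RLIMIT_MEMLOCK was hit. Retrying unchanged is unlikely
           to help. */
        return LOCK_NOMEM;
    default:
        return LOCK_OTHER;
    }
}

/* Attempt to lock [addr, addr + len) and classify the result. */
enum lock_result try_mlock(const void *addr, size_t len)
{
    if (mlock(addr, len) == 0)
        return LOCK_OK;
    return classify_mlock_errno(errno);
}
```

A caller would typically treat LOCK_RETRY as transient and the other
failures as requiring a change to the request itself.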
The incorrect result from getpid() in the presence of clone() occurs
only for a fork-like clone() (one that omits CLONE_VM from the flags).
This is a low-level detail, but there is no problem [known-to-me]
for a thread-like clone().
getpid() caches the PID after the first call. This relies
on support in the glibc wrappers for fork()/vfork()/clone().
However, if syscall() is used to directly invoke fork()/vfork()/clone(),
the cache is not updated, and getpid() in the child produces the wrong
result.
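
The situation described above can be probed with a small sketch like the
following. The function names (kernel_getpid, raw_fork_getpid_agrees) are
hypothetical. The sketch assumes Linux and uses SYS_fork where the
architecture provides it, falling back to the fork() wrapper otherwise.
On glibc versions that still cache the PID, the child is expected to see
a stale value; glibc 2.25 and later removed the cache, so there the two
values should agree.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask the kernel directly, bypassing any PID cache in the C library. */
pid_t kernel_getpid(void)
{
    return (pid_t) syscall(SYS_getpid);
}

/*
 * Create a child with a raw system call, skipping the glibc fork()
 * wrapper, and compare getpid() with the kernel's answer in the child.
 * Returns 1 if they agree, 0 if getpid() is stale, -1 on error.
 */
int raw_fork_getpid_agrees(void)
{
    pid_t child;
    int status;

    (void) getpid();            /* prime the cache, if there is one */

#ifdef SYS_fork
    child = (pid_t) syscall(SYS_fork);
#else
    child = fork();             /* no raw fork on this architecture */
#endif
    if (child == -1)
        return -1;
    if (child == 0)             /* child: report via exit status */
        _exit(getpid() == kernel_getpid() ? 0 : 1);

    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Code that must invoke fork()/vfork()/clone() through syscall() can use
the direct system call, as kernel_getpid() does, to obtain the correct
PID in the child.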
> > Linux, lstat(2) will generally not trigger automounter action, whereas
> > stat(2) will.
>
> I don't understand this last piece. Can you say some more? (I'm not
> familiar with automounter details.)
An automounter (either an explicit one, like autofs, or an implicit
one, such as those used by AFS or NFSv4) is something that triggers
a mount when something is touched.
However, it's undesirable to automount, say, everyone's home
directory just because someone opened up /home in their GUI
browser or typed "ls -l /home". The early automounters simply
didn't list the contents until you accessed an entry by name;
this is still the case when you can't enumerate a mapping
(say, all DNS names under /net). However, this is extremely
inconvenient, too.
The solution we ended up settling on is to create something
that looks like a directory (i.e. reports S_IFDIR in stat()),
but behaves somewhat like a symlink. In particular, when it is
accessed in a way where a symlink would be dereferenced,
the automount triggers and the directory is mounted. However,
system calls which do *not* cause a symlink to be dereferenced,
like lstat(), also do not cause the automounter to trigger.
This means that "ls -l", or a GUI file browser, can see a list
of directories without causing each one of them to be automounted.
-hpa
links in 'oldpath'; see also http://lwn.net/Articles/294667.
POSIX.1-2008 makes it implementation-dependent whether or not
'oldpath' is dereferenced if it is a symbolic link.