old-www/LDP/LG/issue31/tag_uidgid.html

<!--startcut ======================================================= -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html><head>
<META NAME="generator" CONTENT="lgazmail v1.1pre9c">
<TITLE>The Answer Guy 31:
UID/GID Synchronization and Management
</TITLE>
<!-- ORIGINAL SUBJECT:
Assigning UID/GID
JTD SUBTITLE:

-->
</head>

<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#A000A0"
ALINK="#FF0000">
<H4>"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>
<P> <hr> <P>
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<H1 align="center"><A NAME="answer">
	<img src="../gx/dennis/qbubble.gif" alt="" border="0" align="middle">
	<a href="./index.html">The Answer Guy</a>
	<img src="../gx/dennis/bbubble.gif" alt="" border="0" align="middle">
</A></H1>
<BR>
<H4 align="center">By James T. Dennis,
	<a href="mailto:linux-questions-only@ssc.com">linux-questions-only@ssc.com</a>
	<BR>Starshine Technical Services, <A HREF="http://www.starshine.org/">http://www.starshine.org/</A>
</H4>
<p><hr><p>
<!--endcut ========================================================= -->
<H3><img src="../gx/dennis/qbub.gif" alt="(?)"
width="50" height="28" align="left" border="0"
>UID/GID Synchronization and Management</H3><p><strong>From Gordon Haverland on 16 Jul 1998
 in the</strong>
	<a href="news:comp.unix.questions">comp.unix.questions</a>
	<strong>newsgroup</strong></p>

<!-- begin body -->

<strong><p>Hi:
</p></strong>

<strong><p>I inherited sys admin stuff as part of a job.  At first, this
wasn't a problem: GIS work on a single Linux machine.  I did
development and analysis, others did just analysis.  Soon we got
another Linux machine, so development moved to there.  To share
printing, Ethernet was installed and LPRng.  Then a Solaris 2.5.1
machine was added.  So, the 2 linux machines have a handful of
users, the Sun has those plus a few other groups of users, and I
plan to add a Beowulf cluster "real soon now".  Is there any
rationale out there for assigning UID and GID in a hetrogeous
cluster/network like this?  It sure looks like users common among
machines have to have the same UID and GIDs.  The Solaris has NIS
on it, so I guess whatever I do should get administered from
there.  Thanks for any light you might shed on this.
</p></strong>


<P><STRONG>
Gordon Haverland
</strong></p>


<BLOCKQUOTE><IMG SRC="../gx/dennis/bbub.gif" ALT="(!)" width="50" height="28" border="0" lign="bottom"
>I'm not sure what you mean by "rationale" on this context.
</blockquote>

<blockquote>Do you mean:
</blockquote>

<blockquote>"Why should I co-ordinate and synchronize the account
management on the systems throughout my network?"
</blockquote>

<blockquote>... or do you mean:
</blockquote>


<blockquote>"How should I ....."
</blockquote>


<blockquote>... or do you mean something else entirely?
</blockquote>

<blockquote>I'll answer the first two questions (probably in
far more detail than you wanted):
</blockquote>

<blockquote>There are two principle reasons why you <em>want</em>
to co-ordinate the user/UID and group/GID management across your network.
The first is relatively obvious --- it has to do with user
and administrative convenience.
</blockquote>

<blockquote>If each of your users are expected to have relatively uniform
access to the systems throughout the network, then they'll
expect the same username and password to work on each system
that they are supposed to use.  If they change their password
they will expect that change to be global.
</blockquote>

<blockquote>When you --- as the admin --- add, remove, disable, or change
an account, you want to do it once, in one place.  You don't
want to have to manually copy those changes to every system.
</blockquote>

<blockquote>Of course these reasons don't require that the UID/GID's
match.  As you probably know names and group names in Unix and
Linux are mapped into numeric forms (UID's and GID's respectively).
All file ownership (inodes) and processes use these numerics for
all access and identity determination throughout the kernel and
drivers.   These numeric values are reverse mapped back to their
corresponding principle symbolic representations (the names) by
the utilities that display or process that information.  Thus
the 'ls -l' command is doing a lookup on each directory entry to
find the name that corresponds to the the owner and group ID's.
</blockquote>

<blockquote>Most of the commands you use actually do this through library
calls.  In deed most of these commands are "dynamically linked"
(use shared libraries) which perform the calls through common
external files (the <tt>libc</tt>).  As we'll see this is very important
as we look at the implications of consolidating the account
mapping information into a networked model (such as NIS).
</blockquote>

<blockquote>As I said, you could maintain a network of systems which
co-ordinated username/password data, and group membership
lists without synchronizing the UID's and GID's across the
systems.  Most network protocols and utilities (the r* gang:
rsh, rlogin, rcp, and things like telnet, ftp, etc) exchange
this data in "text" (symbolic) form.
</blockquote>

<blockquote>However, we then come to NFS!
</blockquote>

<blockquote>The NFS protocols use numeric forms to represent ownership.
Therefore an NFS server provides access based on an implicit
trust that the NFS client is providing a compatible and
legimate mapping of the cient's UID/GID to the server's.
</blockquote>

<blockquote>It is possible in Linux' NFS implementation to run a ugidd
(a UID/GID mapping daemon).  Thus you could create maps for
every NFS server to map each client<TT>s UID</TT>s to this server's
UID's, etc.  Yes, that idea is as ugly as it sounds!
</blockquote>

<blockquote>I won't go into the security implications of NFS' mechanism
here.  I'll just point out that my pet expansion of NFS is
"no flippin' security."  I'm told that it is possible to enable
a "secure RPC" portmapper which implements host-to-host
authentication.  I'd like to know more about that.
</blockquote>

<blockquote>However, it is still the case that any users who can get root
access to any trusted NFS client can impersonate any non-root
user so far as the NFS servers in that domain are concerned.
Since "sufficient" physical access virtually guarantees that
workstation users <em>can</em> get root access (possibly by resorting
to a screwdriver and CMOS battery jumper) I come to the
conclusion that NFS hopelessly insecure in today's common
network configurations (which workstations and PC's at
everyone's desks).
</blockquote>

<blockquote>(In defense of NFS I should point out that its security
model, and the one's we see in the r* gang were not
unreasonable when most Unix installations had a small
cluster of multi-user systems locked in a server room ---
and all user access was via terminals and X-terminals.
This suggests that there are some situations where they
are still justified).
</blockquote>

<blockquote>Despite these limitations and implications, NFS is the
most commonly deployed networked filesystem between Unix
and Linux systems.  I have high hopes for CODA, but
even the most optimistic dreams reveal that it will take
a long time to be widely adopted.
</blockquote>

<blockquote>So, it is in your best interests to synchronize your UID/GID
to user/group name mappings throughout your enterprise.  It is
also recommended that you adopt a policy that UID's are not
re-used.  When a user leaves your organization you "retire"
their UID (disabling their access by *'ing out their passwd,
removing them from the groups maps, setting their "shell" to
some <TT>/bin/denied</TT> binary and their "home" directory to a
secured "graveyard" --- I use <TT>/home/.graveyard</TT> on my systems).
The reason for this may not be obvious.  However, if you are
maintaining archival backups for several years (or indefinitely)
you'll want to avoid any ambiguities and confusion that might
result from restoring one (long gone) user's files and finding
them owned by one of your new users.
</blockquote>

<blockquote>(This "UID retirement" policy is obviously not feasible for
larger ISP's and usually difficult for Universities and other
high turnover environments.  You can still make it a policy to
cycle all the way around the UID/GID space before re-use).
</blockquote>

<blockquote>That should answer the questions about "why" we want to
co-ordinate account information (user/password, and
group/membership data) and why many (most) of us want to
synchronize the UID's and GID's that the accounts map to.
</blockquote>

<blockquote>Now, we think about "how" to do so.
</blockquote>

<blockquote>One common method is to use '<tt>rdist</tt>' to distribute a set of
files (usually <TT>/etc/passwd</TT>, <TT>/etc/group</TT>, and
<TT>/etc/hosts</TT>) to every machine in a "domain" (this being the
"administrative" sense of the term, which might or might not match a DNS
domain or subdomain).  For this to work we have to declare one system
to be the "master" and we have to ensure that all account
changes occur on that system.
</blockquote>

<blockquote>This can be done by manually training everyone to always issue
their '<tt>passwd</TT>' '<TT>chfn</tt>' '<tt>chsh</tt>' and similar
commands from a shell on that system, or you can create wrappers for each
of the affected commands (replacing the client copies of these
commands with a script that doesn't something like: '<tt>ssh
$master "$0"</tt>' for example).
</blockquote>

<blockquote>The nice things about this approach are:
</blockquote>

<blockquote>It works for just about any Unix and any Linux
(regardless of the libraries and programs running
on the client).
</blockquote>

<blockquote>The new risks and protocols are explicitly put
in place by the sysadmin --- we don't introduce
new protocols that might affect our security.
</blockquote>

<blockquote>There is no additional network latency and
overhead for most programs running most of
the time.  You are never waiting for '<tt>ls</tt>' to
resolve user and group names over the network!
</blockquote>


<blockquote>The concerns about this method are:
</blockquote>

<blockquote>You have to ensure the integrity and security of
the master --- I'd suggest requiring '<TT>ssh</tt>' access
to it and using PAM and possibly a chroot jail to
limit the access of most users to just the
appropriate commands.
</blockquote>

<blockquote>All clients must "trust" the master -- they
must allow that system to "push" new <em>root owned</em>
system configuration files to them.  I'd use
'<tt>rdist</TT>' or '<TT>rsync</tt>' over '<tt>ssh</tt>' for this as well.
</blockquote>

<blockquote>You may have unacceptable propagation delays
(a user's new password may take hours to get
propagated to all systems).
</blockquote>

<blockquote>It doesn't "scale" well and it doesn't conform
to any standards.  You (as the sysadmin) will
have to do your own scripting to deploy it.  Any
bugs in your scripts are quite likely to take
down the entire administrative domain.
</blockquote>


<blockquote>Then there's NIS.
</blockquote>

<blockquote>NIS is a protocol and a set of utilities and libraries which
basically implement exactly the features we've just described.
I've deliberately used several NIS terms in my preceding discussion.
</blockquote>

<blockquote>NIS distributes various sorts of "maps" (different "maps" for
passwords, groups, hosts, etc).  The primary NIS server for a
domain is called the "master" --- and secondary <em>servers</em> are
called "slaves."  Nodes (hosts, workstations, etc) that request
data from these "maps" are called "clients."
</blockquote>

<blockquote>One of the big features of glibc (the
<a href="http://www.gnu.org/">GNU</a> libc version 2.x
which is being integrated into Linux distributions as <tt>libc.6.x</tt>)
is support for NIS.  It used to be the case that supporting
NIS on a Linux <em>client</em> required a special version of the
shared libraries (a variant compilation of libc.5).
</blockquote>

<blockquote>In <A HREF="http://www.redhat.com/">Red Hat</A> 5.x and
<A HREF="http://www.debian.org/">Debian</A> 2.x this will not be
necessary.  We expect that most other Linux distributions will follow suit
in their next major releases.  (This transition is similar to
the a.out to ELF transition we faced a couple of years ago,
and much less of a hassle than the infamous "<tt>procps</tt>" fiasco
that we went through between the 1.x and 2.x kernels.  Notably
it is possible to have <tt>libc.5</tt> and <tt>glibc</tt> concurrently
installed on a system --- the major issue is which way your base system
binaries and utilities are linked).
</blockquote>

<blockquote>The advantages of NIS:
</blockquote>

<blockquote>It's a standard.  Most modern forms of Unix support it.
</blockquote>

<blockquote>It's scaleable and robust.  It automatically
deals with capacity and availability issues
by having two tiers of servers (master and slave).
</blockquote>

<blockquote>It's already been written.  You won't be
re-inventing this wheel.  (At the same time
it is more generalized --- so this wheel may
have more spokes, lug nuts, and axle trimmings
than you needed or wanted).
</blockquote>


<blockquote>The disadvantages of NIS:
</blockquote>

<blockquote>NIS is designed to do more than you might
want.  It will default to providing host
mapping services (which might conflict with
your DNS scheme and might give you a bit of
extra grief while configuring '<tt>sendmail</tt>'
--- at least the Solaris default version of
'<tt>sendmail</tt>').  These are relatively easy
issues to resolve --- once you understand
the underlying model.  However they are
cause for sysadmin confusion and frustration
in the early stages.
</blockquote>

<blockquote>It's not terribly secure.  There is a NIS+ which
uses cryptographic means to tighten up some of
that.  However, NIS+ doesn't seem to be available
for Linux yet.  That is probably largely the result
of the U.S. federal government's unpopular and idiotic
attitudes towards cryptography --- which has a generally
chilling effect on the development and deployment of
robust security.  The fact that U.S. policy also recognizes
patents on software and algorithms (particularly the
very broad RSA held patents on public key cryptography)
also severely constrains our programmers (they are liable
if they <em>re-invent</em> any protected algorithm --- no matter how
"obvious" it seemed to them nor how "independently" their
derivation). Regardless of these political issues, I still
have technical concerns about NIS security.
</blockquote>

<blockquote>Hybrid:
</blockquote>

<blockquote>You can use NIS within your domain, and you can distribute
your NIS maps out to systems that are on the periphery (for
example out to your web servers and bastion/proxy systems
out on the "firewall" or "perimeter network segment."  This
can be combined with some custom filtering (to disable
shell access by most users to these machines --- helping to
ensure that the UID/GID mappings are used solely for marking
file ownership --- for example).
</blockquote>

<blockquote>NIS maps are is the same format as the files to which they
correspond.  Thus the NIS passwd map is a regular looking
passwd file, and the NIS group map is in the conventional
format you'd expect in your <TT>/etc/group</TT> file.
</blockquote>

<blockquote>You might have to fuss with these files a bit to
"shadow" them (or "star out" the passwords on accounts
that shouldn't be give remote access to a given host).
</blockquote>

<blockquote>Ideally I'd like to see a hybrid of NIS and Kerberos.
We'd see NIS used to provide the names/UID's --- and Kerberos
used for the authentication.  However, I haven't yet heard
of any movement to do this.  I have heard rumblings of
LDAP used in a way that might overlap with NIS quite a bit
(and I'd hope that there'd be an LDAP to NIS gateway so we
wouldn't have to transition all those libraries <em>again</em>).
</blockquote>

<blockquote>Back to your case.
</blockquote>

<blockquote>NIS sounds like a natural choice.  However, you don't have to
pick the Solaris system for the administration.  You can use
any of the Linux systems or any Solaris system (among others)
as the NIS master.  Since your Solaris system is probably
installed on more expensive SPARC hardware, and it probably
was purchased to run some services or applications that aren't
readily available on your Linux systems --- it would probably
be wiser to put up an extra Linux box as a dedicated NIS
master and administrative console.
</blockquote>

<blockquote>It doesn't sound like internal security is even on your
roadmap.  That's fine and fairly common.  All the members
of your team probably have <em>sufficient physical access</em> to
all of the systems in your group that significant efforts at
intranet (internal) security in software would probably be pointless.
</blockquote>

<blockquote>I'd still recommend that you use "private net" addressing
(RFC1918 --- <tt>10.*.*.*</tt>, <tt>192.168.*.*</tt> and the range of
class B's from <tt>172.16.*.*</tt> through <tt>172.31.*.*</tt>) --- and
make your systems go through a masquerading router (Linux or any of several
others) or a set of proxies or some combination of these.
</blockquote>

<blockquote>In fact I highly recommend that you fire up a DNS caching
server on at least one system --- and point all of your
clients at that, and that you install a caching web proxy
(<A HREF="http://www.apache.org/">Apache</A> can be configured for this,
or you can use Squid --- which is my personal favorite).  These caches
can save a significant amount of bandwidth for even a small workgroup and
they only cost a little bit of installation and configuration
time and a bit of disk space and memory.
</blockquote>

<blockquote>(The default <A HREF="http://www.redhat.com/">Red Hat</A>
configuration for their '<tt>named</tt>' rc file
is to just run in caching mode.  So that's truly a no brainer
--- just distribute a new <tt>resolv.conf</tt> file to all the clients
so that it refers *first* to the host that runs the cache.  My
squid configuration on a <A HREF="http://www.suse.com/">S.u.S.E.</A>
machine and has run, unmodified, for months.  I vaguely remember having
to edit a configuration file.  It must not have been <em>too</em> bad.
Naturally you have to get users to point their web browsers at
the proxy --- that might be a hassle.  With '<tt>lynx</tt>' I just edit
the global <tt>lynx.cfg</tt> file and send it to each host.  Similar
features are available in Netscape Navigator --- but you have
to touch everyone's configuration at least once).
</blockquote>

<blockquote>Once you have your workgroup/LAN isolated on its own group of
addresses and working through proxies --- it is relatively
easy to configure your router to filter most sorts of traffic
that should not be trusted across domains and, especially, to
prevent "address spoofing" (<em>incoming</em> packets that claim to
be <em>from</em> some point inside of your domain).
</blockquote>

<blockquote>You can certainly spend <em>all</em> of your time learning about
and implementing security.  However, the cost of that effort
may exceed your management's valuation of the resources that
are accessible on your LAN.  Obviously they'll have to do their
own risk and cost/benefit analyses on those issues.
</blockquote>

<blockquote>I pay an undue amount of attention to systems security because
it is my hobby.  As a consultant it turns out to be useful since
I can explain these concerns and concepts to my customers, and
refer to them to specialists when they want "real" security.
</blockquote>

<blockquote>To learn more details about how to setup and use NIS under
Linux read the &quot;The Linux NIS(YP)/NYS/NIS+ HOWTO&quot; at:
(<A HREF="http://www.ssc.com/linux/LDP/HOWTO/NIS-HOWTO.html"
	>http://www.ssc.com/linux/LDP/HOWTO/NIS-HOWTO.html</A>).  This
was just updated a couple of weeks ago.
</blockquote>

<blockquote>I guess there is support for NIS+ <em>clients</em> in glibc
--- so that's new to me.  I've copied Thorsten Kukuk
(the author of this HOWTO) so he can correct any errors
I've made or otherwise comment.
</blockquote>

<blockquote>By the way:  What is GIS?  I've heard references to it
--- and I gather that it has to do with geography and
informations systems.  Would you consider writing an
overview of how Linux is being used in GIS related
work for <a href="http://www.ssc.com/lj/">LJ</a> or
<a HREF="http://www.linuxgazette.com/">LG</a>?
</blockquote>
<!-- end body -->

<!--================================================================-->
<P> <hr> <P>
<H5 align="center"><a href="http://www.linuxgazette.com/copying.html"
	>Copyright &copy;</a> 1998, James T. Dennis <BR>
Published in <I>Linux Gazette</I> Issue 31 August 1998</H5>
<P> <hr> <P>
<!--================================================================-->
<table width="98%"><tr valign="center" align="center">
<td rowspan="3"><A HREF="./lg_answer31.html"><IMG
        SRC="../gx/dennis/answernew.gif"
        ALT="[ Answer Guy Index ]"></A></td>
  <td><A HREF="tag_backup.html">backup</A></td>
  <td><A HREF="tag_uidgid.html">uidgid</A></td>
  <td><A HREF="tag_connect.html">connect</A></td>
  <td><A HREF="tag_95slow.html">95slow</A></td>
  <td><A HREF="tag_badblock.html">badblock</A></td>
  <td><A HREF="tag_trident.html">trident</A></td>
  <td><A HREF="tag_sound.html">sound</A></td>

</tr><tr valign="center" align="center">
  <td><A HREF="tag_kernel.html">kernel</A></td>
  <td><A HREF="tag_solprint.html">solprint</A></td>
  <td><A HREF="tag_idescsi.html">idescsi</A></td>
  <td><A HREF="tag_distrib.html">distrib</A></td>
  <td><A HREF="tag_modem.html">modem</A></td>
  <td><A HREF="tag_NDS.html">NDS</a></td>
  <td><A HREF="tag_rpm.html">rpm</A></td>

</tr><tr valign="center" align="center">
  <td><A HREF="tag_guy.html">guy</A></td>
  <td><A HREF="tag_maildns.html">maildns</A></td>
  <td><A HREF="tag_memleak.html">memleak</a></td>
  <td><A HREF="tag_multihead.html">multihead</A></td>
  <td><A HREF="tag_cdr.html">cdr</A></td>
</tr></table>
<P> <hr> <P>
<!--================================================================-->
<A HREF="./index.html"><IMG SRC="../gx/indexnew.gif"
        ALT="[ Table Of Contents ]"></A>
<A HREF="../index.html"><IMG SRC="../gx/homenew.gif"
        ALT="[ Front Page ]"></A>
<A HREF="lg_bytes31.html"><IMG SRC="../gx/back2.gif"
        ALT="[ Previous Section ]"></A>
<A HREF="./searls.html"><IMG SRC="../gx/fwd.gif"
        ALT="[ Next Section ]"></A>
<!--startcut =======================================================  -->
</body>
</html>
<!--endcut =========================================================  -->