<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
|
<HTML
|
|
><HEAD
|
|
><TITLE
|
|
>Set Up Slave Nodes</TITLE
|
|
><META
|
|
NAME="GENERATOR"
|
|
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
|
|
REL="HOME"
|
|
TITLE="The Beowulf HOWTO"
|
|
HREF="index.html"><LINK
|
|
REL="PREVIOUS"
|
|
TITLE="Set Up The Head Node"
|
|
HREF="x70.html"><LINK
|
|
REL="NEXT"
|
|
TITLE="Verification"
|
|
HREF="x195.html"></HEAD
|
|
><BODY
|
|
CLASS="sect1"
|
|
BGCOLOR="#FFFFFF"
|
|
TEXT="#000000"
|
|
LINK="#0000FF"
|
|
VLINK="#840084"
|
|
ALINK="#0000FF"
|
|
><DIV
|
|
CLASS="NAVHEADER"
|
|
><TABLE
|
|
SUMMARY="Header navigation table"
|
|
WIDTH="100%"
|
|
BORDER="0"
|
|
CELLPADDING="0"
|
|
CELLSPACING="0"
|
|
><TR
|
|
><TH
|
|
COLSPAN="3"
|
|
ALIGN="center"
|
|
>The Beowulf HOWTO</TH
|
|
></TR
|
|
><TR
|
|
><TD
|
|
WIDTH="10%"
|
|
ALIGN="left"
|
|
VALIGN="bottom"
|
|
><A
|
|
HREF="x70.html"
|
|
ACCESSKEY="P"
|
|
>Prev</A
|
|
></TD
|
|
><TD
|
|
WIDTH="80%"
|
|
ALIGN="center"
|
|
VALIGN="bottom"
|
|
></TD
|
|
><TD
|
|
WIDTH="10%"
|
|
ALIGN="right"
|
|
VALIGN="bottom"
|
|
><A
|
|
HREF="x195.html"
|
|
ACCESSKEY="N"
|
|
>Next</A
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><HR
|
|
ALIGN="LEFT"
|
|
WIDTH="100%"></DIV
|
|
><DIV
|
|
CLASS="sect1"
|
|
><H1
|
|
CLASS="sect1"
|
|
><A
|
|
NAME="AEN137"
|
|
></A
|
|
>5. Set Up Slave Nodes</H1
|
|
><P
|
|
>Get your network cables out. Install Linux on the first non-head
node. Follow these steps for each non-head node.</P
|
|
><DIV
|
|
CLASS="sect2"
|
|
><H2
|
|
CLASS="sect2"
|
|
><A
|
|
NAME="AEN140"
|
|
></A
|
|
>5.1. Base Linux Install</H2
|
|
><P
|
|
>Going with my example node names and IP addresses, this is what I
chose during setup:</P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="screen"
|
|
>Workstation
auto partition
remove all partitions on system
use LILO as the boot loader
put boot loader on the MBR
host name wolf01
ip address 192.168.0.101
add the user "wolf"
same password as on all other nodes
NO firewall</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><P
|
|
>Install ONLY one package group: network servers. Deselect all other
packages.</P
|
|
><P
|
|
>It doesn't matter what else you choose; this is the minimum that
you need. Why fill the box up with non-essential software you will never
use? My research has concentrated on finding the minimal configuration
needed to get up and running.</P
|
|
><P
|
|
>Here's another very important point: when you move on to an
automated install and config, you really will NEVER log in to the box.
Only during setup and install do I type anything directly on the
box.</P
|
|
></DIV
|
|
><DIV
|
|
CLASS="sect2"
|
|
><H2
|
|
CLASS="sect2"
|
|
><A
|
|
NAME="AEN147"
|
|
></A
|
|
>5.2. Hardware</H2
|
|
><P
|
|
>When the computer starts up, it will complain if it does not have
a keyboard connected. I was not able to change the BIOS settings, because
I had older discarded boxes with no documentation, so I just connected a
"fake" keyboard.</P
|
|
><P
|
|
>I am in the computer industry, and see hundreds of keyboards come
and go, and some occasionally end up in the garbage. I get an old dead
keyboard out of the garbage and remove JUST the cord with the tiny circuit
board up in the corner, where the num lock and caps lock lights
are. Then I plug the cord in, and the computer boots without incident,
thinking it has a complete keyboard.</P
|
|
><P
|
|
>Again, you would be better off changing your BIOS settings, if you are
able to. This is just a trick to use in case you cannot get into the BIOS
setup program.</P
|
|
></DIV
|
|
><DIV
|
|
CLASS="sect2"
|
|
><H2
|
|
CLASS="sect2"
|
|
><A
|
|
NAME="AEN152"
|
|
></A
|
|
>5.3. Post Install Commands</H2
|
|
><P
|
|
>After your newly installed box reboots, log on as root again,
and do the following:</P
|
|
><UL
><LI
><P
>Run the same chkconfig commands used on the head node to set up the
right services.</P
></LI
><LI
><P
>Modify /etc/hosts: remove "wolfnn" from the localhost line, and add
separate entries for wolfnn and wolf00.</P
></LI
><LI
><P
>Install LAM.</P
></LI
><LI
><P
>Create the /mnt/wolf directory and set up security for it.</P
></LI
><LI
><P
>Do the ssh configuration.</P
></LI
></UL
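><P
>As an illustration of the hosts step, the sketch below prints the
entries such a file might contain on the first slave node. The address
192.168.0.100 for wolf00 is an assumption (only 192.168.0.101 for wolf01
appears on this page), and the helper name hosts_entries is hypothetical;
substitute the names and addresses you chose for your own cluster.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>
```shell
#!/bin/sh
# Print /etc/hosts entries for one slave node (hypothetical helper).
# $1 = node name, $2 = node address, $3 = head node address (assumed value).
hosts_entries() {
    # The localhost line no longer carries the node name.
    printf '127.0.0.1\tlocalhost.localdomain\tlocalhost\n'
    # The head node and this node each get a line of their own.
    printf '%s\t%s\n' "$3" "wolf00"
    printf '%s\t%s\n' "$2" "$1"
}

hosts_entries wolf01 192.168.0.101 192.168.0.100
```
</PRE
></FONT
></TD
></TR
></TABLE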
|
|
><P
|
|
>Up to this point, the setup is pretty much the same as for the head
node. On the slave nodes, however, I do NOT modify the exports file.</P
|
|
><P
|
|
>Also, do NOT add this line to the .bash_profile:</P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="screen"
|
|
>sh -c 'ssh-add && bash'</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
><DIV
|
|
CLASS="sect2"
|
|
><H2
|
|
CLASS="sect2"
|
|
><A
|
|
NAME="AEN173"
|
|
></A
|
|
>5.4. SSH On Slave Nodes</H2
|
|
><P
|
|
>Recall that on the head node, we created a file "authorized_keys".
Copy that file, created on your head node, to the ~/.ssh directory on
the slave nodes. The HEAD node will log on to all the SLAVE
nodes.</P
|
|
><P
|
|
>The requirement, as stated in the LAM user manual, is that there
should be no interaction required when logging in from the head to any
of the slaves. So, copying the public key from the head node into each
slave node, in the file "authorized_keys", tells each slave
that "the wolf user on wolf00 is allowed to log on here without any
password; we know it is safe."</P
|
|
><P
|
|
>However, you may recall that the documentation states that the
first time you log on, ssh will ask for confirmation. So, only once, after
doing the above configuration, go back to the head node and type ssh
wolfnn, where "wolfnn" is the name of your newly configured slave node.
It will ask you for confirmation; simply answer "yes", and
that will be the last time you have to interact.</P
|
|
><P
|
|
>Prove it by logging off and then using ssh to connect to that node
again; it should log you in immediately, with no dialog whatsoever.</P
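><P
>The same proof can be scripted for several nodes at once. What follows
is only a sketch, not part of the original procedure: the function names
and the SSH variable are hypothetical, and it relies on the OpenSSH
BatchMode option, which makes ssh fail rather than stop and ask a
question.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>
```shell
#!/bin/sh
# Check that the head node can log in to each slave with no prompt at all.
# SSH may be overridden for testing; it defaults to the real ssh client.
SSH=${SSH:-ssh}

# Returns 0 if running "true" on the node succeeds without interaction.
# BatchMode=yes makes ssh fail instead of prompting.
passwordless_ok() {
    $SSH -o BatchMode=yes "$1" true
}

# Report the status of every node named in the arguments.
check_nodes() {
    for node in "$@"; do
        if passwordless_ok "$node"; then
            echo "$node: ok"
        else
            echo "$node: needs attention"
        fi
    done
}

# Example (run as user wolf on wolf00): check_nodes wolf01 wolf02
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>A node reported as needing attention has probably not had its
authorized_keys file copied yet, or has not yet been confirmed with the
one-time "yes" described above.</P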
|
|
></DIV
|
|
><DIV
|
|
CLASS="sect2"
|
|
><H2
|
|
CLASS="sect2"
|
|
><A
|
|
NAME="AEN179"
|
|
></A
|
|
>5.5. NFS Settings On Slave Nodes</H2
|
|
><P
|
|
>As root, enter these commands:</P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="screen"
|
|
>cat &gt;&gt; /etc/fstab
wolf00:/mnt/wolf /mnt/wolf nfs rw,hard,intr 0 0
&lt;control d&gt;</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><P
|
|
>This entry makes the slave node automatically mount, at boot, the
directory that we exported in the /etc/exports file on the head node.
NFS is discussed further later in this document.</P
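><P
>Typing the line into cat works, but repeating the step by accident
adds the line a second time. A possible alternative, sketched here with a
hypothetical helper name, appends the entry only when it is not already
present. The demonstration runs against a scratch file; on a real slave
node you would pass /etc/fstab instead, as root.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>
```shell
#!/bin/sh
# Append the NFS mount line to an fstab file only if it is not already there.
# $1 = path of the fstab file (hypothetical helper, not from the HOWTO).
add_wolf_mount() {
    entry='wolf00:/mnt/wolf /mnt/wolf nfs rw,hard,intr 0 0'
    # -x matches whole lines only, -F treats the entry as a fixed string.
    if grep -qxF "$entry" "$1"; then
        echo "already present"
    else
        echo "$entry" >> "$1"
        echo "added"
    fi
}

# Demonstration on a scratch file rather than the real /etc/fstab.
scratch=$(mktemp)
add_wolf_mount "$scratch"    # prints: added
add_wolf_mount "$scratch"    # prints: already present
rm -f "$scratch"
```
</PRE
></FONT
></TD
></TR
></TABLE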
|
|
></DIV
|
|
><DIV
|
|
CLASS="sect2"
|
|
><H2
|
|
CLASS="sect2"
|
|
><A
|
|
NAME="AEN184"
|
|
></A
|
|
>5.6. Lilo Modifications On Slave Nodes</H2
|
|
><P
|
|
>Then modify /etc/lilo.conf.</P
|
|
><P
|
|
>The 2nd line of this file says</P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="screen"
|
|
>timeout=nn</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><P
|
|
>Modify that line to say:</P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="screen"
|
|
>timeout=1200</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
|
|
><P
|
|
>After the file is modified, apply the changes. Type
"/sbin/lilo", and it will display "Added linux *" to confirm that
it accepted the changes you made to the lilo.conf file:</P
|
|
><TABLE
|
|
BORDER="0"
|
|
BGCOLOR="#E0E0E0"
|
|
WIDTH="100%"
|
|
><TR
|
|
><TD
|
|
><FONT
|
|
COLOR="#000000"
|
|
><PRE
|
|
CLASS="screen"
|
|
>/sbin/lilo
Added linux *</PRE
|
|
></FONT
|
|
></TD
|
|
></TR
|
|
></TABLE
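><P
>If you would rather not edit the file by hand on every node, the
timeout line can be rewritten with sed. This is a sketch under
assumptions: the function name is hypothetical, and the sample file
contents are made up for the demonstration. LILO counts its timeout in
tenths of a second, so the helper multiplies the delay in seconds by
10.</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>
```shell
#!/bin/sh
# Rewrite the timeout line of a lilo.conf-style file (hypothetical helper).
# $1 = file to edit, $2 = delay in seconds.
set_lilo_timeout() {
    # LILO expects tenths of a second: 120 seconds becomes 1200.
    tenths=$(( $2 * 10 ))
    tmp=$(mktemp)
    sed "s/^timeout=.*/timeout=$tenths/" "$1" > "$tmp"
    # Copy the contents back so the original file keeps its permissions.
    cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a scratch file, not the real /etc/lilo.conf.
conf=$(mktemp)
printf 'prompt\ntimeout=50\ndefault=linux\n' > "$conf"
set_lilo_timeout "$conf" 120
grep '^timeout=' "$conf"    # prints: timeout=1200
rm -f "$conf"
```
</PRE
></FONT
></TD
></TR
></TABLE
><P
>On a real node you would run this against /etc/lilo.conf as root, and
you would still need to run /sbin/lilo afterwards for the change to take
effect.</P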
|
|
><P
|
|
>Why do I do this lilo modification? If you have been researching
Beowulf on the web, and understand everything I have done so far, you
may wonder, "I don't remember reading anything about lilo.conf."</P
|
|
><P
|
|
>All my Beowulf nodes share a single power strip. I turn on the
power strip, and every box in the cluster starts up immediately. As the
startup procedure progresses, it mounts file systems. Since the
non-head nodes mount the shared directory from the head node, they all
have to wait a little bit until the head node is up, with NFS ready
to go. So I make each slave node wait 2 minutes in the lilo step (LILO
counts the timeout in tenths of a second, so timeout=1200 means 120
seconds). Meanwhile, the head node comes up and makes the shared directory
available. By then, the slave nodes finally start booting because
lilo has waited 2 minutes.</P
|
|
></DIV
|
|
></DIV
|
|
><DIV
|
|
CLASS="NAVFOOTER"
|
|
><HR
|
|
ALIGN="LEFT"
|
|
WIDTH="100%"><TABLE
|
|
SUMMARY="Footer navigation table"
|
|
WIDTH="100%"
|
|
BORDER="0"
|
|
CELLPADDING="0"
|
|
CELLSPACING="0"
|
|
><TR
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="left"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="x70.html"
|
|
ACCESSKEY="P"
|
|
>Prev</A
|
|
></TD
|
|
><TD
|
|
WIDTH="34%"
|
|
ALIGN="center"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="index.html"
|
|
ACCESSKEY="H"
|
|
>Home</A
|
|
></TD
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="right"
|
|
VALIGN="top"
|
|
><A
|
|
HREF="x195.html"
|
|
ACCESSKEY="N"
|
|
>Next</A
|
|
></TD
|
|
></TR
|
|
><TR
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="left"
|
|
VALIGN="top"
|
|
>Set Up The Head Node</TD
|
|
><TD
|
|
WIDTH="34%"
|
|
ALIGN="center"
|
|
VALIGN="top"
|
|
> </TD
|
|
><TD
|
|
WIDTH="33%"
|
|
ALIGN="right"
|
|
VALIGN="top"
|
|
>Verification</TD
|
|
></TR
|
|
></TABLE
|
|
></DIV
|
|
></BODY
|
|
></HTML
|
|
> |