gferg 2005-09-06 00:37:39 +00:00
parent 7321b40075
commit 4b3a2416c5
5 changed files with 63 additions and 314 deletions


@ -87,7 +87,7 @@ rendering and modelling development environment using RedHat Linux. </Para>
AI-Alife-HOWTO</ULink>,
<CiteTitle>Linux AI &amp; Alife HOWTO</CiteTitle>
</Para><Para>
<CiteTitle>Updated: Aug 2004</CiteTitle>.
<CiteTitle>Updated: Aug 2005</CiteTitle>.
Information about, and links to, various AI related software libraries,
applications, etc. that work on the Linux platform. </Para>
</ListItem>


@ -160,7 +160,7 @@ advocate the use of Linux. </Para>
AI-Alife-HOWTO</ULink>,
<CiteTitle>Linux AI &amp; Alife HOWTO</CiteTitle>
</Para><Para>
<CiteTitle>Updated: Aug 2004</CiteTitle>.
<CiteTitle>Updated: Aug 2005</CiteTitle>.
Information about, and links to, various AI related software libraries,
applications, etc. that work on the Linux platform. </Para>
</ListItem>
@ -807,7 +807,7 @@ partition images to and from a TFTP server. </Para>
Cluster-HOWTO</ULink>,
<CiteTitle>Linux Cluster HOWTO</CiteTitle>
</Para><Para>
<CiteTitle>Updated: Nov 2004</CiteTitle>.
<CiteTitle>Updated: Sep 2005</CiteTitle>.
How to set up high-performance Linux computing clusters. </Para>
</ListItem>


@ -103,7 +103,7 @@ This is a Red Hat and LAM specific version of this document.
Cluster-HOWTO</ULink>,
<CiteTitle>Linux Cluster HOWTO</CiteTitle>
</Para><Para>
<CiteTitle>Updated: Nov 2004</CiteTitle>.
<CiteTitle>Updated: Sep 2005</CiteTitle>.
How to set up high-performance Linux computing clusters. </Para>
</ListItem>


@ -762,7 +762,7 @@ test cases for developing accessible Linux applications. </Para>
AI-Alife-HOWTO</ULink>,
<CiteTitle>Linux AI &amp; Alife HOWTO</CiteTitle>
</Para><Para>
<CiteTitle>Updated: Aug 2004</CiteTitle>.
<CiteTitle>Updated: Aug 2005</CiteTitle>.
Information about, and links to, various AI related software libraries,
applications, etc. that work on the Linux platform. </Para>
</ListItem>


@ -1,4 +1,3 @@
<!doctype Linuxdoc system>
<article>
@ -6,7 +5,7 @@
<title> Linux Cluster HOWTO </title>
<author>Ram Samudrala <tt>(me@ram.org)</tt> </author>
<date> v1.31, November 7, 2004 </date>
<date> v1.5, September 5, 2005 </date>
<abstract>
How to set up high-performance Linux computing clusters.
@ -18,9 +17,9 @@ How to set up high-performance Linux computing clusters.
<sect> Introduction
<p> This document describes how I set up my Linux computing clusters
for high-performance computing which I need for <htmlurl
url="http://compbio.washington.edu" name="my research">. </p>
<p> This document describes how we set up our Linux computing clusters
for high-performance computing which we need for <htmlurl
url="http://compbio.washington.edu" name="our research">. </p>
<p> Use the information below at your own risk. I disclaim all
responsibility for anything you may do after reading this HOWTO. The
@ -173,262 +172,28 @@ the following setups:
<!-- ************************************************************* -->
<sect1> Desktop hardware
<sect1> Desktop and terminal hardware
<p> 1 desktop with the following setup:
<p> We have identified at least two kinds of users of our cluster:
those who need (i.e., take advantage of) permanent local processing
power and disk space in conjunction with the cluster to speed up
processing, and those who need only the cluster's processing
power. The former are assigned "desktops", which are essentially
high-performance machines, and the latter are assigned dumb
"terminals". Our desktops are usually dual- or quad-processor machines;
the current high-end configuration has 1.6 GHz Opteron CPUs, as much
as 10 GB of RAM, and over 1 TB of local disk space. Our terminals are
essentially machines where a user can log in and then run jobs on our
farm. In this setup, people may also use laptops as dumb terminals. </p>
<itemize>
<item> 4 AMD 842 Opteron 1.6 GHz CPUs </item>
<item> Tyan S4880UG2NR motherboard </item>
<item> 80GB MAX 7200 HD </item>
<item> 2 250GB WD 7200 HD </item>
<item> 10 GB DDR PC3200 REG ECC RAM </item>
<item> Chenbro SR107 BLACK 550W 4u case </item>
</itemize>
</p>
<p> 2 desktops with the following setup:
<itemize>
<item> 2 AMD Opteron 240 1.4 GHz CPUs </item>
<item> K8T MASTER2-FAR K8T800 ATX motherboard </item>
<item> 2 KINGSTON 512MB PC2700 REG. ECC RAM </item>
<item> 550W Antec Xeon power supply </item>
<item> ANTEC SX630II 300W mid-tower case </item>
<item> 1.44mb floppy drive </item>
<item> PRO 660 TV/DVI FX5200T 128MB video card </item>
<item> 1 80GB SEA 7200 harddisk </item>
<item> 2 200GB WD 7200 8MB harddisk </item>
<item> CREATIVE SB 128 5.1 PCI soundcard </item>
</itemize>
</p>
<p> 1 desktop with the following setup:
<itemize>
<item> 2 AMD XP 2600 MP CPUs </item>
<item> MSI K7D Master-L DUAL MS-6501 motherboard </item>
<item> 4 1024MB PC2100 DDR REG ECC RAM </item>
<item> 1 40GB SEA 7200 Maxtor harddisk </item>
<item> 2 120GB SEA 7200 Maxtor hardidks </item>
<item> PIONEER DVR-AO5 IDE DVD-RW </item>
<item> 1.44mb floppy drive </item>
<item> ATI Expert 2000 Rage 128 32mb video card </item>
<item> IN-WIN P4 300ATX Mid Tower case </item>
<item> Intel PCI PRO-100 10/100Mbps network card </item>
<item> 450W ENERMAX P4-430ATX power supply </item>
<item> CREATIVE SB 128 5.1 PCI soundcard </item>
</itemize>
</p>
<p> 2 desktops with the following setup:
<itemize>
<item> 2 AMD XP 2600 MP CPUs </item>
<item> MSI K7D Master-L DUAL MS-6501 motherboard </item>
<item> 2 512MB PC2100 DDR REG ECC RAM </item>
<item> 1 40GB SEA 7200 Maxtor harddisk </item>
<item> 2 120GB SEA 7200 Maxtor hardidks </item>
<item> MSI 52X24X52X CR52-A2 CD-RW </item>
<item> 1.44mb floppy drive </item>
<item> ATI Expert 2000 Rage 128 32mb video card </item>
<item> IN-WIN P4 300ATX Mid Tower case </item>
<item> Intel PCI PRO-100 10/100Mbps network card </item>
<item> 450W ENERMAX P4-430ATX power supply </item>
<item> CREATIVE SB 128 5.1 PCI soundcard </item>
</itemize>
</p>
<p> 1 desktop with the following setup:
<itemize>
<item> 2 AMD Palamino MP XP 2000+ 1.67 GHz CPUs </item>
<item> Asus A7M266-D w/LAN Dual DDR </item>
<item> 2 Kingston 512mb PC2100 DDR-266MHz REG ECC RAM </item>
<item> Ricoh 32x12x10 CDRW/DVD Combo EIDE </item>
<item> 1.44mb floppy drive </item>
<item> 1 41 GB Maxtor 7200rpm ATA100 HD </item>
<item> 1 120 GB Maxtor 5400rpm ATA100 HD </item>
<item> ATI Expert 2000 Rage 128 32mb video card </item>
<item> IN-WIN P4 300ATX Mid Tower case </item>
<item> Intel PCI PRO-100 10/100Mbps network card </item>
<item> Enermax P4-430ATX power supply </item>
</itemize>
</p>
<p> 1 desktop with the following setup:
<itemize>
<item> 2 Intel Xeon 1.7 GHz 256K 400FS </item>
<item> Supermicro P4DCE Dual Xeon motherboard </item>
<item> 4 256mb RAMBUS 184-Pin 800 MHz memory </item>
<item> 2 120 GB Maxtor ATA/100 5400 RPM HD </item>
<item> 1 60 GB Maxtor ATA/100 7200 RPM HD </item>
<item> 52X Asus CD-A520 INT IDE CDROM </item>
<item> 1.4 MB floppy drive </item>
<item> Leadtex 64 MB GF2 MX400 AGP </item>
<item> Creative SB LIVE Value PCI 5.1 </item>
<item> Microsoft Natural Keyboard </item>
<item> Microsoft Intellimouse Explorer </item>
<item> Supermicro SC760 full-tower case with 400W PS </item>
</itemize>
</p>
<p> 2 desktops with the following setup:
<itemize>
<item> 2 AMD K7 1.2g/266 MP Socket A CPU </item>
<item> Tyan S2462NG Dual Socket A motherboard </item>
<item> 4 256mb PC2100 REG ECC DDR-266Mhz </item>
<item> 3 40 GB Maxtor UDMA/100 7200 RPM HD </item>
<item> 50X Asus CD-A520 INT IDE CDROM </item>
<item> 1.4 MB floppy drive </item>
<item> Chaintech Geforce2 MX200 32mg AGP </item>
<item> Creative SB LIVE Value PCI </item>
<item> Full-tower case with 300W PS </item>
</itemize>
</p>
<p> 2 desktops with the following setup:
<itemize>
<item> 2 Pentium III 1 GHz Intel CPUs </item>
<item> Supermicro 370 DLE Dual PIII-FCPGA motherboard </item>
<item> 4 256 MB 168-pin PC133 Registered ECC Micron RAM </item>
<item> 3 40 GB Maxtor UDMA/100 7200 RPM HD </item>
<item> Asus CD-S500 50x CDROM </item>
<item> 1.4 MB floppy drive </item>
<item> Jaton Nvidia TNT2 32mb PCI </item>
<item> Creative SB LIVE Value PCI </item>
<item> Full-tower case with 300W PS </item>
</itemize>
</p>
<p> 2 desktops with the following setup:
<itemize>
<item> 2 Pentium III 1 GHz Intel CPUs </item>
<item> Supermicro 370 DLE Dual PIII-FCPGA motherboard </item>
<item> 4 256 MB 168-pin PC133 Registered ECC Micron RAM </item>
<item> 3 40 GB Maxtor UDMA/100 7200 RPM HD </item>
<item> Mitsumi 8x/4x/32x CDRW </item>
<item> 1.4 MB floppy drive </item>
<item> Jaton Nvidia TNT2 32mb PCI </item>
<item> Creative SB LIVE Value PCI </item>
<item> Full-tower case with 300W PS </item>
</itemize>
</p>
<p> 1 desktop with the following setup:
<itemize>
<item> 2 Pentium III 1 GHz Intel CPUs </item>
<item> Supermicro 370 DE6 Dual PIII-FCPGA motherboard </item>
<item> 4 256 MB 168-pin PC133 Registered ECC Micron RAM </item>
<item> 3 40 GB Maxtor UDMA/100 7200 RPM HD </item>
<item> Ricoh 32x12x10 CDRW/DVD Combo EIDE </item>
<item> Asus CD-A520 52x CDROM </item>
<item> 1.4 MB floppy drive </item>
<item> Asus V7700 64mb GeForce2-GTS AGP video card </item>
<item> Creative SB Live Platinum 5.1 sound card </item>
<item> Full-tower case with 300W PS </item>
</itemize>
</p>
<p> 3 desktops with the following setup:
<itemize>
<item> 2 Pentium III 1 GHz Intel CPUs </item>
<item> Supermicro 370 DE6 Dual PIII-FCPGA motherboard </item>
<item> 4 256 MB 168-pin PC133 Registered ECC Micron RAM </item>
<item> 3 40 GB Maxtor UDMA/100 7200 RPM hard disk </item>
<item> Ricoh 32x12x10 CDRW/DVD Combo EIDE </item>
<item> 1.4 MB floppy drive </item>
<item> Asus V7700 64mb GeForce2-GTS AGP video card </item>
<item> Creative SB Live Platinum 5.1 sound card </item>
<item> Full-tower case with 300W PS </item>
</itemize>
</p>
</sect1>
<!-- ************************************************************* -->
<sect1> Firewall/gateway hardware
<p> 1 firewall with the following setup:
<itemize>
<item> AMD Palamino XP 1700+ 1.47GHz CPU </item>
<item> MSI KT3 Ultra2 KT333 MS-6380E motherboard </item>
<item> 512 MB PC2100 DDR-266MHz DIMM RAM </item>
<item> 40GB Seagate 7200rpm ATA/100 hard disk </item>
<item> Asus 52X CD-A520 INT IDE cdrom </item>
<item> 1.44 MB floppy drive </item>
<item> ATI Expert 2000 Rage 128 32mb video card </item>
<item> 3 Intel Pro/1000T Gigabit Server ethernet cards </item>
<item> 4U Black Rackmount Steel case </item>
</itemize>
</p>
<p> 1 gateway with the following setup. The gateway is a mirror of
the firewall in case the firewall breaks.
<itemize>
<item> AMD Palamino XP 1800+ 1.57GHz CPU </item>
<item> MSI KT3 Ultra2 KT333 MS-6380E motherboard </item>
<item> 512 MB PC2100 DDR-266MHz DIMM RAM </item>
<item> 40GB Seagate 7200rpm ATA/100 hard disk </item>
<item> Asus 52X CD-A520 INT IDE cdrom </item>
<item> 1.44 MB floppy drive </item>
<item> ATI Expert 2000 Rage 128 32mb video card </item>
<item> 3 Intel Pro/1000T Gigabit Server ethernet cards </item>
<item> 4U Black Rackmount Steel case </item>
</itemize>
</p>
</sect1>
<!-- ************************************************************* -->
<sect1> Miscellaneous/accessory hardware
<p> Backup:
<itemize>
<item> 2 Sony 20/40 GB DSS4 SE LVD DAT drives </item>
</itemize>
</p>
<p> Monitors:
<itemize>
<item> 3 17" Viewsonic VG700 LCB monitor </item>
<item> 2 17" Viewsonic VE700 LCD monitor </item>
<item> 1 20.1" Viewsonic VP201M LCD monitor </item>
<item> 1 22" Viewsonic P220F 0.25-0.27m monitor </item>
<item> 4 21" Sony CPD-G500 .24mm monitor </item>
<item> 2 18" Viewsonic VP181 LCD monitor </item>
<item> 1 17" Viewsonic VE170 LCD monitor </item>
<item> 2 21" Sun monitors </item>
</itemize>
</p>
<p> Printers:
<itemize>
<item> HP colour laserject 4600dn </item>
</itemize>
</p>
<p> Keyboards/mice:
<itemize>
<item> Microsoft Internet/Natural Keyboard </item>
<item> Microsoft Intellimouse Explorer </item>
</itemize>
</p>
<p> We generally use/prefer Viewsonic monitors, Microsoft Intellimouse
mice, and Microsoft Natural keyboards. These generally have worked
quite reliably for us. </p>
</sect1>
@ -436,35 +201,14 @@ the firewall in case the firewall breaks.
<sect1> Putting-it-all-together hardware
<p> We used to use KVM switches with a cheap monitor to connect up and "look"
at all the machines:
<p> For visual access to the nodes, we initially used KVM
switches with a cheap monitor to connect up and "look" at all the
machines. While this was a nice solution, it did not scale. We
currently wheel a small monitor around and hook up cables as needed.
What we need is a small handheld monitor that can plug into the back
of the PC (operated with a stylus, like the Palm). </p>
<itemize>
<item> 15" .28dp XLN CTL monitor </item>
<item> 3 Belkin Omniview 16-Port Pro switches </item>
<item> Belkin Omniview 2-Port switch </item>
<item> 2 APC AR203 netshelter rack units </item>
</itemize>
</p>
<p> While this is a nice solution, I think it's kind of needless. What
we need is a small hand held monitor that can plug into the back of
the PC (operated with a stylus, like the Palm). I don't plan to use
more monitor switches/KVM cables. </p>
<p> Networking is important:
<itemize>
<item> 1 Netgear FS750ATNA 48 port/1 git network switch </item>
<item> 2 Netgear FS750NA 48 port/1 git network switch </item>
<item> 1 Netgear FSM750S 48 port/2 git network switch </item>
<item> 1 Netgear FS517TS 16 port/1 git network switch </item>
<item> 1 Netgear FS524 24 port network switch </item>
<item> 1 Cisco Catalyst 3448 XL Enterprise Edition 48 port network switch </item>
<item> 1 Netgear ME102NA Wireless Access Point </item>
<item> 1 Netgear MA401NA Wireless PCMCIA network card </item>
</itemize>
</p>
<p> For networking, we generally use Netgear and Cisco switches. </p>
</sect1>
@ -500,7 +244,6 @@ of each processor to below $1000 (including housing it). </p>
<item> Kernel 2.4.18-10, distribution KRUD 7.3
<item> Kernel 2.4.20-13.9, distribution KRUD 9.0
<item> Kernel 2.4.22-1.2188, distribution KRUD 2004-05
<item> Kernel 2.6.5-1.358, distribution Fedora Core 2
</itemize>
These distributions work very well for us since updates are sent to us
@ -553,40 +296,45 @@ copiable. </p>
<sect1> Disk configuration
<p> This section describes disk partitioning strategies. </p>
<p> This section describes disk partitioning strategies. Our goal is
to keep the logical (virtual) structure of every machine organised
consistently. We are finding that fixed mappings from physical devices
to these logical structures are not sustainable as hardware and
software (operating system) change. Currently, our strategy is as follows:
<p>
<tscreen><verb>
farm/cluster machines:
hda1 - swap (2 * RAM)
hda2 - / (remaining disk space)
hdb1 - /maxa (total disk)
partition 1 on system disk - swap (2 * RAM)
partition 2 on system disk - / (remaining disk space)
partition 1 on additional disk - /maxa (total disk)
desktops (without windows):
servers:
hda1 - swap (2 * RAM)
hda2 - / (4-8 GB)
hda3 - /spare (remaining disk space)
hdb1 - /maxa (total disk)
hdd1 - /maxb (total disk)
partition 1 on system disk - swap (2 * RAM)
partition 2 on system disk - / (4-8 GB)
partition 3 on system disk - /home (remaining disk space)
partition 1 on additional disk 1 - /maxa (total disk)
partition 1 on additional disk 2 - /maxb (total disk)
partition 1 on additional disk 3 - /maxc (total disk)
partition 1 on additional disk 4 - /maxd (total disk)
partition 1 on additional disk 5 - /maxe (total disk)
partition 1 on additional disk 6 - /maxf (total disk)
partition 1 on additional disk(s) - /maxg (total disk space)
desktops (with windows):
desktops:
hda1 - /win (total disk)
hdb1 - swap (2 * RAM)
hdb2 - / (4 GB)
hdb3 - /spare (remaining disk space)
hdd1 - /maxa (total disk)
laptops (single disk):
hda1 - /win (half the total disk size)
hda2 - swap (2 * RAM)
hda3 - / (remaining disk space)
partition 1 on system disk - swap (2 * RAM)
partition 2 on system disk - / (4-8 GB)
partition 3 on system disk - /spare (remaining disk space)
partition 1 on additional disk 1 - /maxa (total disk)
partition 1 on additional disk(s) - /maxb (total disk space)
</verb></tscreen>
</p>
<p> Note that in the case of servers and desktops, maxg and maxb can
be a single disk or a conglomeration of disks. </p>
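<p> As a rough illustration only, the scheme above could be expressed
as a small planning helper. This is a sketch under stated assumptions,
not something we run: the function name <tt>plan_layout</tt>, the role
names, and the default 8 GB root size (the upper end of the 4-8 GB
range) are all hypothetical. A size of <tt>None</tt> stands for "total
disk". </p>

```python
from string import ascii_lowercase

# Mount point that receives the remaining space on the system disk.
# Role names are illustrative labels for the layouts listed above.
REST_MOUNT = {"server": "/home", "desktop": "/spare"}


def plan_layout(role, ram_gb, system_disk_gb, extra_disks=0, root_gb=8):
    """Return (partition, mount point, size in GB) tuples for one machine.

    Hypothetical helper: swap is 2 * RAM, farm nodes give the rest of the
    system disk to /, and servers/desktops use a fixed-size / plus a
    remainder partition. Additional disks become /maxa, /maxb, ...
    """
    swap_gb = 2 * ram_gb
    if role == "farm":
        system = [("swap", swap_gb), ("/", system_disk_gb - swap_gb)]
    elif role in REST_MOUNT:
        rest_gb = system_disk_gb - swap_gb - root_gb
        system = [("swap", swap_gb), ("/", root_gb), (REST_MOUNT[role], rest_gb)]
    else:
        raise ValueError(f"unknown role: {role!r}")
    if system[-1][1] <= 0:
        raise ValueError("system disk too small for this layout")

    plan = [
        (f"partition {i} on system disk", mount, size)
        for i, (mount, size) in enumerate(system, start=1)
    ]
    # Each additional disk is used whole; None means "total disk".
    for n in range(extra_disks):
        plan.append(
            (f"partition 1 on additional disk {n + 1}",
             f"/max{ascii_lowercase[n]}", None)
        )
    return plan


if __name__ == "__main__":
    for row in plan_layout("server", ram_gb=4, system_disk_gb=80, extra_disks=2):
        print(row)
```

<p> Describing partitions by their role ("partition 2 on system disk")
rather than by device name follows the same idea as the table above:
the plan stays valid when the device names change with new hardware or
a new kernel. </p>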
</sect1>
<!-- ************************************************************* -->
@ -594,7 +342,8 @@ hda3 - / (remaining disk space)
<sect1> Package configuration
<p> Install a minimal set of packages for the farm. Users are allowed
to configure desktops as they wish. </p>
to configure desktops as they wish, provided the virtual structure
described above is kept the same. </p>
</sect1>
@ -799,8 +548,8 @@ have an impact) and the supporting hardware is different. </p>
<p> These machines are incredibly stable both in terms of hardware and
software once they have been debugged (usually some in a new batch of
machines have hardware problems), running constantly under very heavy
loads. One example is given below. Reboots have generally occurred
when a circuit breaker is tripped.
loads. One common example is given below. Reboots have generally
occurred when a circuit breaker is tripped.
<tscreen><verb>
2:29pm up 495 days, 1:04, 2 users, load average: 4.85, 7.15, 7.72