From 4b3a2416c54c9007c7e21295de38cd3aaab51b53 Mon Sep 17 00:00:00 2001 From: gferg <> Date: Tue, 6 Sep 2005 00:37:39 +0000 Subject: [PATCH] updated --- LDP/howto/docbook/HOWTO-INDEX/appsSect.sgml | 2 +- LDP/howto/docbook/HOWTO-INDEX/howtoChap.sgml | 4 +- LDP/howto/docbook/HOWTO-INDEX/hwSect.sgml | 2 +- .../docbook/HOWTO-INDEX/programmSect.sgml | 2 +- LDP/howto/linuxdoc/Cluster-HOWTO.sgml | 367 +++--------------- 5 files changed, 63 insertions(+), 314 deletions(-) diff --git a/LDP/howto/docbook/HOWTO-INDEX/appsSect.sgml b/LDP/howto/docbook/HOWTO-INDEX/appsSect.sgml index c1ef8aa4..a303c67c 100644 --- a/LDP/howto/docbook/HOWTO-INDEX/appsSect.sgml +++ b/LDP/howto/docbook/HOWTO-INDEX/appsSect.sgml @@ -87,7 +87,7 @@ rendering and modelling development environment using RedHat Linux. AI-Alife-HOWTO, Linux AI & Alife HOWTO -Updated: Aug 2004. +Updated: Aug 2005. Information about, and links to, various AI related software libraries, applications, etc. that work on the Linux platform. diff --git a/LDP/howto/docbook/HOWTO-INDEX/howtoChap.sgml b/LDP/howto/docbook/HOWTO-INDEX/howtoChap.sgml index 848aaf41..3d29e058 100644 --- a/LDP/howto/docbook/HOWTO-INDEX/howtoChap.sgml +++ b/LDP/howto/docbook/HOWTO-INDEX/howtoChap.sgml @@ -160,7 +160,7 @@ advocate the use of Linux. AI-Alife-HOWTO, Linux AI & Alife HOWTO -Updated: Aug 2004. +Updated: Aug 2005. Information about, and links to, various AI related software libraries, applications, etc. that work on the Linux platform. @@ -807,7 +807,7 @@ partition images to and from a TFTP server. Cluster-HOWTO, Linux Cluster HOWTO -Updated: Nov 2004. +Updated: Sep 2005. How to set up high-performance Linux computing clusters. diff --git a/LDP/howto/docbook/HOWTO-INDEX/hwSect.sgml b/LDP/howto/docbook/HOWTO-INDEX/hwSect.sgml index 5d7d3bf6..d2a39622 100644 --- a/LDP/howto/docbook/HOWTO-INDEX/hwSect.sgml +++ b/LDP/howto/docbook/HOWTO-INDEX/hwSect.sgml @@ -103,7 +103,7 @@ This is a Red Hat and LAM specific version of this document. 
Cluster-HOWTO, Linux Cluster HOWTO -Updated: Nov 2004. +Updated: Sep 2005. How to set up high-performance Linux computing clusters. diff --git a/LDP/howto/docbook/HOWTO-INDEX/programmSect.sgml b/LDP/howto/docbook/HOWTO-INDEX/programmSect.sgml index 4ed461cf..4777bc0d 100644 --- a/LDP/howto/docbook/HOWTO-INDEX/programmSect.sgml +++ b/LDP/howto/docbook/HOWTO-INDEX/programmSect.sgml @@ -762,7 +762,7 @@ test cases for developing accessible Linux applications. AI-Alife-HOWTO, Linux AI & Alife HOWTO -Updated: Aug 2004. +Updated: Aug 2005. Information about, and links to, various AI related software libraries, applications, etc. that work on the Linux platform. diff --git a/LDP/howto/linuxdoc/Cluster-HOWTO.sgml b/LDP/howto/linuxdoc/Cluster-HOWTO.sgml index fa446e1a..1e9cb929 100644 --- a/LDP/howto/linuxdoc/Cluster-HOWTO.sgml +++ b/LDP/howto/linuxdoc/Cluster-HOWTO.sgml @@ -1,4 +1,3 @@ -
@@ -6,7 +5,7 @@ Linux Cluster HOWTO Ram Samudrala (me@ram.org) - v1.31, November 7, 2004 + v1.5, September 5, 2005 How to set up high-performance Linux computing clusters. @@ -18,9 +17,9 @@ How to set up high-performance Linux computing clusters. Introduction -

This document describes how I set up my Linux computing clusters -for high-performance computing which I need for .

+

This document describes how we set up our Linux computing clusters +for high-performance computing which we need for .

Use the information below at your own risk. I disclaim all responsibility for anything you may do after reading this HOWTO. The @@ -173,262 +172,28 @@ the following setups: - Desktop hardware + Desktop and terminal hardware -

1 desktop with the following setup: +

We have identified at least two kinds of users of our cluster: +those that need (i.e., take advantage of) permanent local processing +power and disk space in conjunction with the cluster to speed up +processing, and those that need only the cluster processing +power. The former are assigned "desktops" which are essentially +high-performance machines, and the latter are assigned dumb +"terminals". Our desktops are usually dual or quad processor machines +with the current high-end CPU being a 1.6 GHz Opteron, having as much +as 10 GB of RAM, and over 1 TB of local disk space. Our terminals are +essentially machines where a user can log in and then run jobs on our +farm. In this setup, people may also use laptops as dumb terminals.

- - 4 AMD 842 Opteron 1.6 GHz CPUs - Tyan S4880UG2NR motherboard - 80GB MAX 7200 HD - 2 250GB WD 7200 HD - 10 GB DDR PC3200 REG ECC RAM - Chenbro SR107 BLACK 550W 4u case - -

- -

2 desktops with the following setup: - - - 2 AMD Opteron 240 1.4 GHz CPUs - K8T MASTER2-FAR K8T800 ATX motherboard - 2 KINGSTON 512MB PC2700 REG. ECC RAM - 550W Antec Xeon power supply - ANTEC SX630II 300W mid-tower case - 1.44mb floppy drive - PRO 660 TV/DVI FX5200T 128MB video card - 1 80GB SEA 7200 harddisk - 2 200GB WD 7200 8MB harddisk - CREATIVE SB 128 5.1 PCI soundcard - -

- -

1 desktop with the following setup: - - - 2 AMD XP 2600 MP CPUs - MSI K7D Master-L DUAL MS-6501 motherboard - 4 1024MB PC2100 DDR REG ECC RAM - 1 40GB SEA 7200 Maxtor harddisk - 2 120GB SEA 7200 Maxtor hardidks - PIONEER DVR-AO5 IDE DVD-RW - 1.44mb floppy drive - ATI Expert 2000 Rage 128 32mb video card - IN-WIN P4 300ATX Mid Tower case - Intel PCI PRO-100 10/100Mbps network card - 450W ENERMAX P4-430ATX power supply - CREATIVE SB 128 5.1 PCI soundcard - -

- -

2 desktops with the following setup: - - - 2 AMD XP 2600 MP CPUs - MSI K7D Master-L DUAL MS-6501 motherboard - 2 512MB PC2100 DDR REG ECC RAM - 1 40GB SEA 7200 Maxtor harddisk - 2 120GB SEA 7200 Maxtor hardidks - MSI 52X24X52X CR52-A2 CD-RW - 1.44mb floppy drive - ATI Expert 2000 Rage 128 32mb video card - IN-WIN P4 300ATX Mid Tower case - Intel PCI PRO-100 10/100Mbps network card - 450W ENERMAX P4-430ATX power supply - CREATIVE SB 128 5.1 PCI soundcard - -

- -

1 desktop with the following setup: - - - 2 AMD Palamino MP XP 2000+ 1.67 GHz CPUs - Asus A7M266-D w/LAN Dual DDR - 2 Kingston 512mb PC2100 DDR-266MHz REG ECC RAM - Ricoh 32x12x10 CDRW/DVD Combo EIDE - 1.44mb floppy drive - 1 41 GB Maxtor 7200rpm ATA100 HD - 1 120 GB Maxtor 5400rpm ATA100 HD - ATI Expert 2000 Rage 128 32mb video card - IN-WIN P4 300ATX Mid Tower case - Intel PCI PRO-100 10/100Mbps network card - Enermax P4-430ATX power supply - -

- -

1 desktop with the following setup: - - - 2 Intel Xeon 1.7 GHz 256K 400FS - Supermicro P4DCE Dual Xeon motherboard - 4 256mb RAMBUS 184-Pin 800 MHz memory - 2 120 GB Maxtor ATA/100 5400 RPM HD - 1 60 GB Maxtor ATA/100 7200 RPM HD - 52X Asus CD-A520 INT IDE CDROM - 1.4 MB floppy drive - Leadtex 64 MB GF2 MX400 AGP - Creative SB LIVE Value PCI 5.1 - Microsoft Natural Keyboard - Microsoft Intellimouse Explorer - Supermicro SC760 full-tower case with 400W PS - -

- -

2 desktops with the following setup: - - - 2 AMD K7 1.2g/266 MP Socket A CPU - Tyan S2462NG Dual Socket A motherboard - 4 256mb PC2100 REG ECC DDR-266Mhz - 3 40 GB Maxtor UDMA/100 7200 RPM HD - 50X Asus CD-A520 INT IDE CDROM - 1.4 MB floppy drive - Chaintech Geforce2 MX200 32mg AGP - Creative SB LIVE Value PCI - Full-tower case with 300W PS - -

- -

2 desktops with the following setup: - - - 2 Pentium III 1 GHz Intel CPUs - Supermicro 370 DLE Dual PIII-FCPGA motherboard - 4 256 MB 168-pin PC133 Registered ECC Micron RAM - 3 40 GB Maxtor UDMA/100 7200 RPM HD - Asus CD-S500 50x CDROM - 1.4 MB floppy drive - Jaton Nvidia TNT2 32mb PCI - Creative SB LIVE Value PCI - Full-tower case with 300W PS - -

- -

2 desktops with the following setup: - - - 2 Pentium III 1 GHz Intel CPUs - Supermicro 370 DLE Dual PIII-FCPGA motherboard - 4 256 MB 168-pin PC133 Registered ECC Micron RAM - 3 40 GB Maxtor UDMA/100 7200 RPM HD - Mitsumi 8x/4x/32x CDRW - 1.4 MB floppy drive - Jaton Nvidia TNT2 32mb PCI - Creative SB LIVE Value PCI - Full-tower case with 300W PS - -

- -

1 desktop with the following setup: - - - 2 Pentium III 1 GHz Intel CPUs - Supermicro 370 DE6 Dual PIII-FCPGA motherboard - 4 256 MB 168-pin PC133 Registered ECC Micron RAM - 3 40 GB Maxtor UDMA/100 7200 RPM HD - Ricoh 32x12x10 CDRW/DVD Combo EIDE - Asus CD-A520 52x CDROM - 1.4 MB floppy drive - Asus V7700 64mb GeForce2-GTS AGP video card - Creative SB Live Platinum 5.1 sound card - Full-tower case with 300W PS - - -

- -

3 desktops with the following setup: - - - 2 Pentium III 1 GHz Intel CPUs - Supermicro 370 DE6 Dual PIII-FCPGA motherboard - 4 256 MB 168-pin PC133 Registered ECC Micron RAM - 3 40 GB Maxtor UDMA/100 7200 RPM hard disk - Ricoh 32x12x10 CDRW/DVD Combo EIDE - 1.4 MB floppy drive - Asus V7700 64mb GeForce2-GTS AGP video card - Creative SB Live Platinum 5.1 sound card - Full-tower case with 300W PS - -

- - - - - - Firewall/gateway hardware - -

1 firewall with the following setup: - - - AMD Palamino XP 1700+ 1.47GHz CPU - MSI KT3 Ultra2 KT333 MS-6380E motherboard - 512 MB PC2100 DDR-266MHz DIMM RAM - 40GB Seagate 7200rpm ATA/100 hard disk - Asus 52X CD-A520 INT IDE cdrom - 1.44 MB floppy drive - ATI Expert 2000 Rage 128 32mb video card - 3 Intel Pro/1000T Gigabit Server ethernet cards - 4U Black Rackmount Steel case - -

- -

1 gateway with the following setup. The gateway is a mirror of -the firewall in case the firewall breaks. - - - AMD Palamino XP 1800+ 1.57GHz CPU - MSI KT3 Ultra2 KT333 MS-6380E motherboard - 512 MB PC2100 DDR-266MHz DIMM RAM - 40GB Seagate 7200rpm ATA/100 hard disk - Asus 52X CD-A520 INT IDE cdrom - 1.44 MB floppy drive - ATI Expert 2000 Rage 128 32mb video card - 3 Intel Pro/1000T Gigabit Server ethernet cards - 4U Black Rackmount Steel case - -

- -
Miscellaneous/accessory hardware -

Backup: - - - 2 Sony 20/40 GB DSS4 SE LVD DAT drives - -

- -

Monitors: - - - 3 17" Viewsonic VG700 LCB monitor - 2 17" Viewsonic VE700 LCD monitor - 1 20.1" Viewsonic VP201M LCD monitor - 1 22" Viewsonic P220F 0.25-0.27m monitor - 4 21" Sony CPD-G500 .24mm monitor - 2 18" Viewsonic VP181 LCD monitor - 1 17" Viewsonic VE170 LCD monitor - 2 21" Sun monitors - -

- -

Printers: - - - HP colour laserject 4600dn - -

- -

Keyboards/mice: - - - Microsoft Internet/Natural Keyboard - Microsoft Intellimouse Explorer - -

+

We generally use/prefer Viewsonic monitors, Microsoft Intellimouse +mice, and Microsoft Natural keyboards. These generally have worked +quite reliably for us.

@@ -436,35 +201,14 @@ the firewall in case the firewall breaks. Putting-it-all-together hardware -

We used to use KVM switches with a cheap monitor to connect up and "look" -at all the machines: +

For visual access to the nodes, we initially used KVM +switches with a cheap monitor to connect up and "look" at all the +machines. While this was a nice solution, it did not scale. We +currently wheel a small monitor around and hook up cables as needed. +What we need is a small hand-held monitor that can plug into the back +of the PC (operated with a stylus, like the Palm).

- - 15" .28dp XLN CTL monitor - 3 Belkin Omniview 16-Port Pro switches - Belkin Omniview 2-Port switch - 2 APC AR203 netshelter rack units - -

- -

While this is a nice solution, I think it's kind of needless. What -we need is a small hand held monitor that can plug into the back of -the PC (operated with a stylus, like the Palm). I don't plan to use -more monitor switches/KVM cables.

- -

Networking is important: - - - 1 Netgear FS750ATNA 48 port/1 git network switch - 2 Netgear FS750NA 48 port/1 git network switch - 1 Netgear FSM750S 48 port/2 git network switch - 1 Netgear FS517TS 16 port/1 git network switch - 1 Netgear FS524 24 port network switch - 1 Cisco Catalyst 3448 XL Enterprise Edition 48 port network switch - 1 Netgear ME102NA Wireless Access Point - 1 Netgear MA401NA Wireless PCMCIA network card - -

+

For networking, we generally use Netgear and Cisco switches.

@@ -500,7 +244,6 @@ of each processor to below $1000 (including housing it).

Kernel 2.4.18-10, distribution KRUD 7.3 Kernel 2.4.20-13.9, distribution KRUD 9.0 Kernel 2.4.22-1.2188, distribution KRUD 2004-05 - Kernel 2.6.5-1.358, distribution Fedora Core 2 These distributions work very well for us since updates are sent to us @@ -553,40 +296,45 @@ copiable.

Disk configuration -

This section describes disk partitioning strategies.

+

This section describes disk partitioning strategies. Our goal is +to keep the virtual structures of the machines organised such that +they are all logical. We're finding that the physical mappings to the +logical structures are not sustainable as hardware and software +(operating system) change. Currently, our strategy is as follows: -

farm/cluster machines: -hda1 - swap (2 * RAM) -hda2 - / (remaining disk space) -hdb1 - /maxa (total disk) +partition 1 on system disk - swap (2 * RAM) +partition 2 on system disk - / (remaining disk space) +partition 1 on additional disk - /maxa (total disk) -desktops (without windows): +servers: -hda1 - swap (2 * RAM) -hda2 - / (4-8 GB) -hda3 - /spare (remaining disk space) -hdb1 - /maxa (total disk) -hdd1 - /maxb (total disk) +partition 1 on system disk - swap (2 * RAM) +partition 2 on system disk - / (4-8 GB) +partition 3 on system disk - /home (remaining disk space) +partition 1 on additional disk 1 - /maxa (total disk) +partition 1 on additional disk 2 - /maxb (total disk) +partition 1 on additional disk 3 - /maxc (total disk) +partition 1 on additional disk 4 - /maxd (total disk) +partition 1 on additional disk 5 - /maxe (total disk) +partition 1 on additional disk 6 - /maxf (total disk) +partition 1 on additional disk(s) - /maxg (total disk space) -desktops (with windows): +desktops: -hda1 - /win (total disk) -hdb1 - swap (2 * RAM) -hdb2 - / (4 GB) -hdb3 - /spare (remaining disk space) -hdd1 - /maxa (total disk) - -laptops (single disk): - -hda1 - /win (half the total disk size) -hda2 - swap (2 * RAM) -hda3 - / (remaining disk space) +partition 1 on system disk - swap (2 * RAM) +partition 2 on system disk - / (4-8 GB) +partition 3 on system disk - /spare (remaining disk space) +partition 1 on additional disk 1 - /maxa (total disk) +partition 1 on additional disk(s) - /maxb (total disk space)

+

Note that in the case of servers and desktops, maxg and maxb can +be a single disk or a conglomeration of disks.

+
@@ -594,7 +342,8 @@ hda3 - / (remaining disk space) Package configuration

Install a minimal set of packages for the farm. Users are allowed -to configure desktops as they wish.

+to configure desktops as they wish, provided the virtual structure +described above is kept the same.

@@ -799,8 +548,8 @@ have an impact) and the supporting hardware is different.

These machines are incredibly stable both in terms of hardware and software once they have been debugged (usually some in a new batch of machines have hardware problems), running constantly under very heavy -loads. One example is given below. Reboots have generally occurred -when a circuit breaker is tripped. +loads. One common example is given below. Reboots have generally +occurred when a circuit breaker is tripped. 2:29pm up 495 days, 1:04, 2 users, load average: 4.85, 7.15, 7.72