Working Schedule for UPS test on July 14th, 2008 The "Plan" This is a rough plan of actions which need to be performed on that day (all times are CEST = UTC+02:00) ...
Central benchmark page All our benchmark results should be linked from this page to help rediscover benchmarks that have already been performed. Ideally, a summary page will ...
Benchmarks Network Disk Performance ffsb We are also using ffsb to simulate different workloads. The following tests were performed so far: * Testing diffe...
Measured performance with this command line: bonnie -s 32768 -d /path/to/dir -u 1000 The results are # data server locally store03,32G,76714,99,358710,47,160723,3...
Compute cluster racks Each compute cluster rack will be filled with 42 compute nodes. Each compute node needs: * Height units needed: 1 * Ethernet conn...
Monday 2008-02-04 30 nodes in rack down at about 16:00 UTC All computers and switches were off by about 16:11 UTC Tuesday 2008-02-05 Computers were started at th...
DNS scheme The internal domain is atlas.local (managed by us); the external domain might be atlas.uni-hannover.de (managed by RRZN) Nodes Nodes have three networking...
What to do if a drive failure occurs? First signs A drive failure might be reported via fmdump or zpool status, e.g. # fmdump -v Jul 16 22:18:52.7769 f52c874e 91...
For Users * General Introduction for Users * Useful Items * How ATLAS stores files * ErrorMessages and how to fix them (not updated) General Document...
ATLAS Hardware Resources And Photo Gallery This is about the hardware resources that ATLAS is based on and related photos Computer node There are 1680 Computer n...
$HOME file systems Where does the data live? We currently have 12 Sun Thumper X4500 servers, which are used to store users' $HOME file systems. The users are distributed ...
HOWTO Update ILOM Firmware cf. ILOM Howto Update ILOM Firmware * Log in to the ILOM CLI and type "version" to check * Download new Firmware versions http://w...
IP Scheme The goal is to create a scheme which eases daily life. IP mapping should be more or less straightforward and extensible. For servers look at: * indi...
What is ATLAS? ATLAS is a general purpose compute cluster, located in the Albert Einstein Institute for Gravitational Physics, in Hannover, Germany, on the campus ...
jumpstart A jumpstart server provides the information and configuration used to install other nodes. It serves tftp requests to provide a kernel and installatio...
Enable blastwave package repository pkgadd -d http://www.blastwave.org/pkg_get.pkg With this you can install nice tools such as gtar, gcc, wget with pkg-get ins...
Boundary Conditions Racks * Cooling: up to 42*220W = 9.24 kW per rack * Electrical Power: up to 9.24 kW per rack * 42 horizontal and a few vertical heig...
Abstract This documentation describes the servers, their functions, and their locations. Table of Server name location function * ip * FAI manage...
Here are some links to useful Solaris documentation: Thumper * Sun Fire X4500 Server Administration Guide * Specifications ZFS * Manpage zfs * Tips for ...
Sun Fire X4500 We have a Sun Fire X4500, a dual Opteron storage server with 48 SATA 500GB drives from Sun Microsystems. Installation It weighs about 80kg...
How to determine which user to put onto which Sun server The idea is pretty simple and should work well. From our 12 Sun servers currently set up for users, we wi...
Backup Strategy for users' $HOME Rolling snapshots Each thumper will loop over all users' file systems and create a new snapshot every 6 hours. Old snapshots wil...
SNMP OIDs on SunFire X4500 Here are some OIDs for Thumper Events: FAN front: OK=7 FAIL=5 enterprises.42.2.70.101.1.1.2.1.3.29 FAN 0 enterprises.42.2.70.101.1.1.2.1.3....
Case ID 38101108 X4500 reboots infinitely when jumpstarting with mirror setup We encountered a problem: when setting up a mirrored system disk during jumpst...
Case 38104140 Sun in principle acknowledged this error, but said this is Neterion's business. Neterion is working to reproduce this error. I love hotline ping pong...
Main.CarstenAulbert 31 Dec 2008 After adding extra nodes in December 2008 we discovered that the SUN servers sent out a continuous stream of ARP requests into the...
Case 72065766 Opened 2009-12-01 Synopsis After running zpool scrub on s13 a vdev was marked degraded due to excessive read errors. After exchanging the supposedl...
Recover from Neterion NIC running at low MTU size Just run this: ifconfig xge0 unplumb # ifconfig xge1 unplumb cd /root/xge-2.0.7.6641-solbin/ ./install.sh jumbo ...
Mirroring system disks Right now jumpstart fails to create a mirrored system on our thumpers (see the corresponding Case ID). Thus it is required to create the mirror after...
User mapping scheme This is the first idea, it should already work nicely, but we may need to adjust it in the future. Initial user mapping We will start with 10...
Fresh install of an X4500 Files to add/overwrite to the default installation root root/.ssh root/.ssh/id_rsa # define local key root/.ssh/authorized_key...
Performance problems with X4500 using ZFS with NFSv3 When moving a large number of small files onto our X4500 boxes we found very bad performance numbers: Job ...
Possible zpool configurations In the standard setup there are two system disks (either on controller c5 or c6 depending on whether one installs the box via a USB device ...