NodeTests

Tasks to do:

Initial work

(HP) Manual work
  • Blank disk of node, wipe by:
 dd if=/dev/zero of=/dev/sda; sync

  • Put MAC address into DHCP table on server
  • PXE boot

Automatic from here (check list):

  • Clone/Install operating system (FAI) -> in progress
  • Reboot -> DONE

Slave tests

Sensors

(HP) remotely get
  • temperatures? -> DONE
  • fan speeds? -> DONE
  • voltages? -> DONE
  • SMART values?
by using IPMI
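
The remote readings above could be collected with ipmitool's `sdr type` subcommand. This is a minimal sketch, not the script used on the cluster: the function name `read_sensors`, the `IPMITOOL` override variable, the `lanplus` interface and the BMC host/user arguments are assumptions. The password is taken from the `IPMI_PASSWORD` environment variable (ipmitool's `-E` option). SMART values are not covered here.

```shell
# Sketch only: read_sensors, IPMITOOL and the lanplus interface are assumptions.
# IPMITOOL can be overridden (e.g. IPMITOOL=echo) for a dry run.
IPMITOOL=${IPMITOOL:-ipmitool}

# read_sensors BMC_HOST USER
# Prints temperature, fan and voltage sensor records from a remote BMC.
# -E makes ipmitool read the password from IPMI_PASSWORD.
read_sensors() {
    for sensor_type in Temperature Fan Voltage; do
        "$IPMITOOL" -I lanplus -H "$1" -U "$2" -E sdr type "$sensor_type" || return 1
    done
}
```

A call would look like `read_sensors bmc-s1234 admin`, where both names are placeholders.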

Warnings

(all) Are there any warnings in
  • dmesg
  • /var/log/{messages|syslog}
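
One way to run the warning check is a small filter over the kernel ring buffer and the system logs. The helper name `scan_warnings` and its keyword list are assumptions; which messages count as harmless has to be decided per cluster.

```shell
# Sketch only: scan_warnings and its keyword list are assumptions.

# scan_warnings [FILE...]
# Prints lines that mention warnings, errors or failures (case-insensitive).
# Reads standard input when no file is given.
# Exit status is 0 if at least one line matched, 1 otherwise.
scan_warnings() {
    grep -i -E 'warn|error|fail' "$@"
}

# Usage on a node:
#   dmesg | scan_warnings
#   scan_warnings /var/log/messages /var/log/syslog
```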

Partitions

  • correct partition tables, inode number customization -> DONE (config in FAI)

Clean?

  • are /boot, /lib/modules clean? -> DONE

X/tools

  • does startx work? Which WindowManagers? dwm -> DONE
  • not standard way to boot -> DONE
  • gcc/ddd/gdb/prof/gprof/valgrind/g++ + other vital tools on the head nodes??? -> DONE

Network

  • networking runs at wire speed, full duplex (netperf/netpipe)
  • (VLAN?) wait for the core switch
  • correct identity for machine (hostname, IP) -> DONE
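
Before measuring throughput with netperf or netpipe, the negotiated link can be checked from `ethtool` output. The sketch below only inspects the reported speed and duplex; it does not replace a throughput measurement. The function name `check_link`, the interface `eth0` and the expected speed `1000Mb/s` are assumptions.

```shell
# Sketch only: check_link, eth0 and the expected speed are assumptions.

# check_link EXPECTED_SPEED
# Reads `ethtool <iface>` output on standard input and succeeds only if
# the link reports EXPECTED_SPEED and full duplex.
check_link() {
    awk -v want="$1" '
        $1 == "Speed:"  { speed = $2 }
        $1 == "Duplex:" { duplex = $2 }
        END {
            if (speed == want && duplex == "Full") exit 0
            exit 1
        }'
}

# Usage on a node:
#   ethtool eth0 | check_link 1000Mb/s || echo "eth0: link not at 1000Mb/s full duplex"
```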

Power

(MS)
  • shutdown -hf now (maybe only shutdown -h) -> DONE
  • After shutdown, power cycle (disconnect, reconnect cable): box needs to stay off -> DONE
  • shutdown -rf now reboots -> DONE
  • unplug box, plug back in, box should stay off -> DONE
  • remotely power on machines (IPMI, etherwake)? Under any of the given conditions (except reboot) -> DONE
  • cut UPS power for less than 60s, nodes should stay on -> DONE
  • cut UPS power for more than 60s, nodes should shut down -> DONE
  • What about a UPS that is not fully charged?
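
The remote power-on test could be driven with ipmitool's `chassis power` subcommand or with etherwake. This is a hedged sketch: the function names `node_power` and `wake_node`, the `IPMITOOL`/`ETHERWAKE` override variables, the `lanplus` interface and the host, user and MAC values are assumptions. The IPMI password is read from `IPMI_PASSWORD` via `-E`.

```shell
# Sketch only: node_power, wake_node and the override variables are assumptions.
IPMITOOL=${IPMITOOL:-ipmitool}
ETHERWAKE=${ETHERWAKE:-etherwake}

# node_power BMC_HOST USER ACTION   (ACTION: status | on | off)
# Queries or changes the chassis power state through the node's BMC.
node_power() {
    case "$3" in
        status|on|off)
            "$IPMITOOL" -I lanplus -H "$1" -U "$2" -E chassis power "$3"
            ;;
        *)
            echo "node_power: unsupported action '$3'" >&2
            return 2
            ;;
    esac
}

# wake_node MAC
# Sends a Wake-on-LAN packet to the given MAC address (usually needs root).
wake_node() {
    "$ETHERWAKE" "$1"
}
```

For example, `node_power bmc-s1234 admin status` after each power test shows whether the box stayed off.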

Time

  • no files with dates in the future -> DONE
 cd /; touch temp.dat; find / -xdev -cnewer temp.dat
  • no files with "early dates" (e.g. 1980) -> DONE
  find / -xdev -type f -printf "%TY %p\n" | grep "^19[0-8][0-9] "
  • ntp working? -> DONE
 ntpq -p
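
The two date checks can also be wrapped as functions that take the directory to scan. This is a sketch under assumptions: the names `future_files` and `early_files` and the default cutoff year 1990 are hypothetical, and `-printf` requires GNU find. Note that `future_files` compares modification times (`-newer`), not the status-change time that `-cnewer` uses, and that it keeps its reference file outside the scanned tree.

```shell
# Sketch only: future_files, early_files and the 1990 cutoff are assumptions.

# future_files DIR
# Lists entries under DIR whose modification time is later than "now".
# The reference file is created with mktemp and removed afterwards.
future_files() {
    ref=$(mktemp) || return 1
    find "$1" -xdev -newer "$ref"
    rm -f "$ref"
}

# early_files DIR [YEAR]
# Lists regular files whose modification year is before YEAR (default 1990).
early_files() {
    find "$1" -xdev -type f -printf '%TY %p\n' |
        awk -v cutoff="${2:-1990}" '$1 < cutoff'
}

# Usage on a node:
#   future_files /
#   early_files / 1990
```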

Benchmarks/Tests

  • benchmarks run at full speed?
  • disk speeds (guesstimate: > 50 MB/s for buffered disk reads and > 800 MB/s for cached reads) -> DONE
  hdparm -tT /dev/sda
  • big file support (>2 GB) -> DONE
  • /proc/meminfo should show full memory
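
The memory check can be automated by comparing `MemTotal` against an expected minimum. The names `mem_total_kb` and `mem_at_least` and the threshold in the usage line are assumptions. `MemTotal` is somewhat smaller than the installed RAM because the kernel reserves part of it, so the threshold should sit a little below the nominal size.

```shell
# Sketch only: mem_total_kb, mem_at_least and the threshold are assumptions.

# mem_total_kb [FILE]
# Prints the MemTotal value in kB from FILE (default /proc/meminfo).
mem_total_kb() {
    awk '$1 == "MemTotal:" { print $2 }' "${1:-/proc/meminfo}"
}

# mem_at_least MIN_KB [FILE]
# Succeeds if MemTotal is at least MIN_KB.
mem_at_least() {
    total=$(mem_total_kb "$2")
    [ -n "$total" ] && [ "$total" -ge "$1" ]
}

# Usage on a node expected to have about 8 GB:
#   mem_at_least 8000000 || echo "memory below expected size"
```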

Automount

  • Automounting works? E.g. /net/s1234/data
  • cd to automounted directory?
 cd /net/s1234/data
  • Copy to/from automounted partitions (permissions for users/root correct - what is correct?)
  • low NFS time-out values

rsh/ssh

  • Can root on master rsh to any node? -> DONE
  • Can an ordinary user rsh to any node? -> DONE
  • Node to node rsh should work as well (hosts.allow/hosts.deny) -> DONE
  • rsh uptime on any nodes? -> DONE
 master $ rsh s1234 uptime
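
To repeat the uptime check over many nodes, a loop like the following could be run on the master. The function name `check_nodes` and the `RSH` override variable are assumptions, and the node names in the usage line are placeholders. The sketch checks for output rather than the exit status, because classic rsh does not pass on the exit status of the remote command.

```shell
# Sketch only: check_nodes and the RSH override are assumptions.
RSH=${RSH:-rsh}

# check_nodes NODE...
# Runs uptime on each node and reports nodes that returned no output.
# Returns non-zero if any node failed.
check_nodes() {
    failed=0
    for node in "$@"; do
        if [ -n "$("$RSH" "$node" uptime 2>/dev/null </dev/null)" ]; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Usage on the master:
#   check_nodes s1234 s1235
```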

Misc

  • does /root/cloned-date (svn-tagnumber) exist? Maybe better in /etc? -> Puts date in /etc/cloned-date -> DONE
  • does recloning preserve data? -> DONE if it is covered by fai softupdate
  • does email work on the nodes (outgoing)? (X) -> DONE
  • garbage bag test: computer starts to overheat and should shut down cleanly. -> DONE

Topic revision: 29 Nov 2007, Sebastian