Summary of Atlas Servers/Services
If you want a quick overview, please refer to our server matrix page; otherwise, just read on.
External machines/services
We have four head nodes named atlasX.atlas.aei.uni-hannover.de, where X can be 1, 2, 3, or 4, as well as two "Titans" named titanX.atlas.aei.uni-hannover.de with X=1,2. These are the main machines to connect to from the outside. All of them run GSI-enabled ssh, and some run web servers or gsiftp servers. From here you can "leap frog" to other machines, submit your Condor jobs, or run Octave, Matlab, and other applications interactively.
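A typical session might look like this (a sketch, assuming you have a valid grid proxy and a GSI-enabled ssh client such as gsissh installed locally; n0523 stands for any internal node):

   # log into a head node from the outside via GSI-enabled ssh
   gsissh atlas1.atlas.aei.uni-hannover.de
   # from the head node, "leap frog" to an internal machine
   ssh n0523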
titan3
is a bit special, as it is not reachable externally; however, you can still use it for Condor submission.
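Condor submission works the same on all submit machines. A minimal vanilla-universe submit file might look like this (a sketch; my_job and the file names are placeholders):

   universe   = vanilla
   executable = my_job
   output     = my_job.out
   error      = my_job.err
   log        = my_job.log
   queue

Submit it with condor_submit my_job.sub.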
More to come: Git, Webserver, GSIftp, LDR
Internal machines/services
Each machine should be reachable via rsh or ssh (preferred), e.g. ssh n0523 should be all you need to log into n0523. If it asks you for a password, please set up an atlas-internal ssh keypair.
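Setting up such a keypair might look like this (a sketch, assuming your home directory is shared across the cluster, so the public key only needs to be authorized once):

   # generate a passphrase-less keypair for atlas-internal use
   # (skip this step if ~/.ssh/id_rsa already exists)
   ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
   # authorize the key; with a shared home directory this covers all nodes
   cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
   chmod 600 ~/.ssh/authorized_keys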
Compute nodes
We have many compute nodes (currently 1680), almost all of which are used as Condor nodes. Each machine has 4 CPU cores (2.4 GHz), 8 GB RAM, and a 500 GB hard drive. Except for some nodes, all resources are shared equally between the four Condor slots, but the limits are currently not enforced. This allows jobs to be "creative" about their resource requirements, i.e. it may be OK if a job uses more than the 2 GB of RAM allocated to it; however, please don't exploit this too much! Nodes beyond n1000 are also set up to run MPI jobs, but experience with this so far has not been great. The current status can always be queried with condor_status.
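For example (standard Condor command-line options):

   # overview of all slots and their state
   condor_status
   # list only slots that are currently available
   condor_status -avail
   # print just the summary totals
   condor_status -total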
gpu001..gpu066
We are currently bringing the machines named gpu001..gpu066 into Condor as special execute machines. These nodes will prefer "checkpointable" jobs over standard vanilla ones and will evict non-GPU jobs as soon as jobs needing GPU capabilities arrive. The machine ads of these nodes will announce what kind of GPU they support and bind to. Please honor these settings, otherwise we will need to be much more restrictive here! Each node is equipped with up to 4 Nvidia Tesla cards (C1060 or the newer C2050). Until further notice, most of these can be used interactively, so watch for announcements on atlas-users and/or DASWG.
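To see what a node announces, you can dump its machine ad; the exact GPU attribute names are site-specific, so check the full ad rather than relying on the grep pattern below:

   # dump the complete machine ad of one GPU node
   condor_status -long gpu001
   # or filter for GPU-related attributes (the pattern is only a guess)
   condor_status -long gpu001 | grep -i gpu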
gpudev1..gpudev4
As the name hints, these machines are aimed at development with CUDA/OpenCL. Each of these is equipped with a C1060 and a C2050. Please note that
- these systems are already running Debian Squeeze (needed for the CUDA debugger to work), so the LSCsoft software stack is currently NOT available on these machines
- these machines are meant for developers who might need dedicated access to these machines/cards. We are currently finalizing a scheme which will automatically notify users at login if the machine is blocked.
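A typical development cycle on one of these machines might look like this (a sketch, assuming the CUDA toolkit is in your PATH; mykernel.cu is a placeholder for your own source file):

   # compile with host (-g) and device (-G) debug info so cuda-gdb can step through kernels
   nvcc -g -G -o mykernel mykernel.cu
   # run it under the CUDA debugger
   cuda-gdb ./mykernel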
More to come: LDR, ...
--
CarstenAulbert - 27 Jun 2009