Summary of Atlas Servers/Services

If you want a quick overview, please refer to our server matrix page; otherwise, just read on.

External machines/services

Head nodes: AtlasX, TitanX

We have four head nodes named atlasX.atlas.aei.uni-hannover.de, where X is 1, 2, 3 or 4, as well as two "Titans" named titanX.atlas.aei.uni-hannover.de with X=1,2. These are the main machines to connect to from the outside: all of them run GSI-enabled SSH, and some also run web servers or gsiftp servers. From here you can "leap frog" to other machines, submit your Condor jobs, or run Octave, Matlab and other tools interactively. titan3 is a bit special: it is not reachable externally, but you can still use it for Condor submission.
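For example, "leap frogging" to an internal node just means connecting to a head node first and hopping on from there (a minimal sketch; the user name is a placeholder, and n0523 is the example node used further down):

    # from your own machine (GSI or regular ssh, depending on your setup)
    ssh user.name@atlas1.atlas.aei.uni-hannover.de
    # now on atlas1; internal node names resolve from here
    ssh n0523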

More to come: Git, Webserver, GSIftp, LDR

Internal machines/services

Each machine should be reachable by rsh or ssh (preferred), e.g. ssh n0523 should be all you need to log into n0523. If it asks you for a password, please set up an Atlas-internal ssh keypair.
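A minimal sketch for setting up such a keypair from a head node, assuming your home directory is shared across the cluster (if it is not, copy the public key to each target node instead):

    # generate a passphrase-less keypair for Atlas-internal hops only
    ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ""
    # authorize the new key; with a shared home this covers all nodes
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys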

Compute nodes

We have many compute nodes (currently 1680), almost all of which are used as Condor nodes. Each machine has 4 CPU cores (2.4 GHz), 8 GB RAM and a 500 GB hard drive. Except for some nodes, all resources are shared equally between the four Condor slots, but these limits are currently not enforced. This allows jobs to be "creative" about their resource requirements, i.e. it may be OK if a job uses more than the 2 GB of RAM allocated to it, but please don't exploit this too much! Nodes beyond n1000 are also set up to run MPI jobs, although experiences with this so far have not been great. The current status can always be queried with condor_status.
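For example (these are standard Condor commands, not Atlas-specific):

    # one line per slot, plus a summary at the end
    condor_status
    # only slots that are currently free
    condor_status -avail
    # just the summary totals, grouped by state
    condor_status -total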

GPU nodes

gpu001..gpu066

We are currently trying to bring the machines named gpu001..gpu066 into Condor as special execute machines. These nodes will prefer "checkpointable" jobs over standard vanilla ones and will evict non-GPU jobs as soon as jobs needing GPU capabilities arrive. The machine ads of these nodes announce what kind of GPU they support and bind to. Please honor these settings, otherwise we will need to be much more restrictive here! Each node is equipped with up to 4 Nvidia Tesla cards (C1060 or the newer C2050). Until further notice most of these can be used interactively, so watch for announcements on atlas-users and/or DASWG.
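To see what a given GPU node announces, you can dump its machine ad; the exact GPU attribute names are not fixed here, so check the full output rather than relying on the grep pattern below, which is only a guess:

    # print the full machine ClassAd of one gpu node
    condor_status -long gpu001
    # or narrow it down to GPU-related attributes
    condor_status -long gpu001 | grep -i gpu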

gpudev1..gpudev4

As the name hints, these machines are aimed at development with CUDA/OpenCL. Each of these is equipped with a C1060 and a C2050; a quick way to check the installed cards is sketched after the list below. Please note that
  • these systems are already running Debian Squeeze (needed for the CUDA debugger to work), so the LSCsoft software stack is currently NOT available on these machines
  • these machines are meant for developers who may need dedicated access to these machines/cards. We are currently finalizing a scheme which will automatically notify users on login if the machine is blocked.
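Assuming the Nvidia driver and its nvidia-smi tool are installed on these boxes (an assumption, not something this page guarantees), you can list the cards directly:

    # list the installed Nvidia GPUs (e.g. one C1060 and one C2050)
    nvidia-smi -L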

More to come: LDR, ...

-- CarstenAulbert - 27 Jun 2009