Atlas Boinc Condor Scheduling As Condor's fetchwork does not seem to work with dynamic slots, we are working on our own "scheduling" system for BOINC Initial tho...
Summary of Atlas Servers/Services If you want a quick overview, please refer to our server matrix page, otherwise just read on. External machines/services Head ...
Atlas Compute nodes Compute Node 2008 In 2008 we bought 1680 Supermicro-based machines from Pyramid, getting a total of 6720 2.4GHz compute cores, 13TB R...
HTCondor configuration updates in 2015 (1) Using cgroups to softly enforce memory and core limits Reasoning In the past, we either relied on users' jobs to obey...
IP Scheme The goal is to create a scheme which eases daily life. IP mapping should be more or less straightforward and extensible. For servers look at: * indi...
What is ATLAS? ATLAS is a general-purpose compute cluster, located in the Albert Einstein Institute for Gravitational Physics, in Hannover, Germany, on the campus ...
Atlas basic usage guide First things first Be nice to others, others should be nice to you as well :) Please read this aloud: I will be nice to other users, and ...
Monitoring for Jessie and Beyond What do we want/need to monitor (metrics/checks) A non-exhaustive list of metrics and checks we need/would like to have, e.g....
Detailed list of metrics we want to monitor Compute nodes (61) * CPU: user / nice / system / wait (4) * disk: * space available/free per locally defi...
Move from Debian Lenny to Debian Squeeze Changes * updated packages from upstream Debian * Condor 7.6 with dynamic slots on most execute machines (exceptio...
Shutdown priorities The following list puts priorities on computers, equipment and other items of interest. Computers in racks, which will stay powered up, should...
Work planned for cluster shutdown on 2013 01 15 shutdown plan The following services will be shut down 1 all compute nodes possibly with the exception of "r...