Working Schedule for UPS test on July 14th, 2008 The "Plan" This is a rough plan of actions which need to be performed on that day (all times are CEST=UTC+0200) ...
How to add a new host (salt era) This example will use einstein12 as a sample machine which before was known as ra15. Before you begin, you need to have ssh agent...
Aptly For the new LDG repo set, we are trying to use aptly as a potential successor to reprepro. The goals are: * support various Debian and Ubuntu releases ...
Atlas Boinc Condor Scheduling As Condor's fetchwork does not seem to work with dynamic slots, we are working on our own "scheduling" system for BOINC Initial tho...
Small tour of Atlas Atlas is a computer cluster situated in the basement of a university building near the Max Planck institute in Hannover. Since the ceiling is ...
Data available via LDR Our LDR server is named ldr.aei.uni-hannover.de (externally) and ldr.atlas.local (internally). You may need to use this environment variabl...
All times given are in UTC, usage and backup sizes in full GByte. If the last backup was done within the past 48 hours, the times are green, otherwise red. Main.Ca...
Central benchmark page All our benchmark results should be linked from this page to help rediscovering already performed benchmarks. Ideally, a summary page will ...
Benchmarks Network Disk Performance ffsb We are also using ffsb to simulate different workloads. The following tests were performed so far: * Testing diffe...
Cluster File Systems A long list of available file systems can be found on Wikipedia; for us, these file systems seem to be interesting enough for further...
Common guidelines for cluster usage This document describes common pitfalls and guidelines when using a large computing cluster. Some of the details are specifi...
Condor Accounting Groups on Atlas In May 2015, LIGO introduced mandatory accounting groups for jobs running on the LIGO data grid (LDG). As Atlas is part of the L...
HTCondor configuration updates in 2015 (1) Using cgroups to softly enforce memory and core limits Reasoning In the past, we either relied on users' jobs to obey...
Configuration Management (primer/summary/brainstormer) What's out there? These are not really meant for configuration mgmt (alone) and have their strengths somew...
Create a hybrid USB Image The goal is to create an image file which can be copied onto a USB stick and booted both via legacy BIOS as well as UEFI. This document ...
New networking set up for Atlas From 2008 till 2013 we used a flat networking structure, i.e. all computers on the data network were connected "directly" to the c...
Main.CarstenAulbert 05 Jun 2008 Data Servers Currently we have 30 data servers up and running. Within the cluster you can get these file areas automounted via /a...
How to disable KM1/2 and use KM4 manually In order to disable KM1/2 and temporarily run with KM4 only, the following steps are needed (please monitor that each ch...
Distributed/clustered file systems This page should summarize what scenarios such file systems could fulfill within Atlas and what we expect from it. Properties s...
Error messages and (likely) ways of fixing them Hardware Software You can find a very short description of the symptoms here; for more information, follow ...
FAI Jessie set up 1 base install via old fai jessie 1 base minimal config via salt 1 echo 'deb http://repo.atlas.local/reprepro fai contrib' /etc/apt/...
Collection of HowTos OS hangs Try to reset node. OS hangs, even after reset Possible causes 1. hdd broken look if everything is well wired, change hdd, ma...
HSM file system check stats (last update: 2016-07-26T18:42Z) Planned steps (starting at 2016-07-26T11:00Z): 1 Issuing condor_hold to all jobs on all submit hos...
HSM upgrade Current status 2014-02-01T12:33Z: samfsdump/final backup stopped due to too many non-archived files. Rushing to archive those as fast as possible. 20...
What is ATLAS? ATLAS is a general purpose compute cluster, located in the Albert Einstein Institute for Gravitational Physics, in Hannover, Germany, on the campus ...
Atlas basic usage guide First things first Be nice to others, others should be nice to you as well :) Please read this aloud: I will be nice to other users, and ...
Cluster upgrade to Debian 8/Jessie We plan to use this page for keeping a record of where we are with respect to our full cluster upgrade to Debian Jessie. Curr...
Rebuilding Debian's kernel (loosely following https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s-common-official) pbuilder environment I...
Directory hierarchy for LSC files Storage structure for S4/S5/S6 data (past) In the past we used paths like these H/H1/RDS/C03/L1/H H1_RDS_C03_L1 822092472 60.gw...
Overview of Topics on LSC Software packaging for Debian * Packaging Howto 1 * Packaging Howto 2 * How to build personal lal/lalapps on ATLAS * How to ...
llldd * TCP connection from CIT (or sites) to special receiver machine (possibly need root access for John Zweizig) possibly SL6???? * from there UDP multic...
Monitoring for Jessie and Beyond What do we want/need to monitor (metrics/checks) A non-exhaustive list of metrics and checks we need/would like to have, e.g....
Move from Debian Lenny to Debian Squeeze Changes * updated packages from upstream Debian * Condor 7.6 with dynamic slots on most execute machines (exceptio...
Trying to get iPXE working as the default method for netinstalls (based on http://ipxe.org/howto/chainloading and https://doc.rogerwhittaker.org.uk/ipxe installati...
Planned downtimes This page summarizes planned/ongoing work within the Atlas cluster along with a few details. Usually, we will issue condor_off -peaceful at lea...
Rack layout which racks contained what and when Basic information * rack rows are numbered 1 to 10 * Water cooled racks are numbered 1 to 102 * open ...
Simple ZFS ZVol testing creating baseline Create simple test data set in RAM: mkdir -p /dev/shm/data for i in $(seq -w 30); do dd if=/dev/null base64)" nosalt...
First steps with spack Please note that all this was tested on an extremely minimally installed server. I.e. just installing something like doxygen can take a very...
Main.CarstenAulbert 08 Jan 2009 Special nodes The following nodes are considered special, please help to keep this list up to date! node name assigned job...
SQLDump tests for Einstein@home On einstein db1 the following was found: # no compression /usr/bin/mysqldump opt master data=2 EinsteinAtHome mbuffer /dev/nu...
Useful tools making your life easier dsh the distributed/dancer shell Introduction The dancer shell is very useful to run identical commands on the whole cl...
ATLAS Web Preferences The following settings are web preferences of the ATLAS web. These preferences overwrite the site level preferences in . and , and c...
Wiki Evaluation for pulgroup or getting away from the evil elog ... Motivation and overview There are various well known problems with our current pulgroup page...
Work planned for cluster shutdown on 2013-01-15 shutdown plan The following services will be shut down 1 all compute nodes possibly with the exception of "r...
Sun Fire X4500 The Sun Fire X4500 Servers (nicknamed "Thumper") are the heart of our $HOME. All user data reside on these boxes. To ensure that there are as few ...