HSM upgrade

Current status

2014-02-01T12:33Z: samfsdump/final backup stopped due to too many non-archived files. Rushing to archive those as fast as possible.

2014-02-01T21:09Z: Still about 7 million files (10 TB) to be archived to tape. Hopefully done within 12 hours.

2014-02-02T06:20Z: Overnight, archiving finished and the backup then took 45 minutes. Phase 2 can finally start: reinstalling the central server and upgrading the array firmware.

2014-02-02T08:50Z: Oracle 6780 array upgrade done, DDN SFA10000 update underway

2014-02-02T11:09Z: DDN SFA10000 update done, a few warnings persist, contacted DDN.

2014-02-02T17:11Z: Main server reinstalled, SAM/QFS installed. Next step: configuring it properly.

2014-02-03T10:05Z: The main server can mount all file systems just fine; the QFS clients, however, cannot. Investigating together with Oracle.

2014-02-03T19:10Z: Some progress has been made and we have a promising route to reestablish services soon.

2014-02-04T13:10Z: We are currently reinstalling all QFS clients to bring them into the same state. Please bear with us a little longer; we should be back soon.

2014-02-04T15:00Z: We are back in action. The exceptions so far are titan1 and titan2, which still have problems with their new configuration.

Plans/Details

Scheduled start time: 2014-01-30T09:00Z

Our hierarchical storage management (HSM) will be upgraded in the days following 2014-01-30. We will try to keep the downtime as short as possible, but we expect a downtime of at least 2 days.

Please ensure that you
  • stop your Condor jobs or put them on hold
  • do not have screen sessions running
  • are logged out of the head nodes
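
As a rough illustration of the first item, the sketch below builds the command line that puts all of one user's Condor jobs on hold, plus the matching release command for after the downtime. It assumes the standard `condor_hold` and `condor_release` tools are on your PATH; the helper names and the user name `jane.doe` are hypothetical and not part of this page.

```python
import shutil
import subprocess


def hold_command(user):
    """Return the argv that puts all Condor jobs of `user` on hold."""
    return ["condor_hold", user]


def release_command(user):
    """Return the argv that releases all held Condor jobs of `user`."""
    return ["condor_release", user]


def run_if_available(argv):
    """Run argv if the tool exists on PATH.

    Returns the exit code, or None when the tool is not installed
    (for example on a machine without Condor).
    """
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, check=False).returncode


# Hypothetical user name, for illustration only.
print("before the downtime:", " ".join(hold_command("jane.doe")))
print("after the downtime: ", " ".join(release_command("jane.doe")))
```

Calling `run_if_available(hold_command("jane.doe"))` would actually execute the command; the sketch only prints it so that nothing is changed by accident.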

During the work you may log in to the head nodes and store files under /local/user/; however, you will not be able to perform many tasks.

work to be performed

Our HSM consists of 6 computers (QFS clients) which act as NFS servers to the cluster, a central computer ("meta data server" (MDS), SAM/QFS server) and a tape library. The MDS acts as the central organizer between disk arrays (about 750 disks), the tape library (8 drives and more than 2000 tapes) and the client requests (your jobs).

During the upgrade we will
  • put Condor jobs on hold (for any user affected, see below) [done]
  • remove NFS mounts for all users on the HSM [done]
  • shut down QFS clients [done]
  • perform full file system dumps (20 file systems in total) [done]
  • perform full backup of meta data server [done]
  • install Solaris 11 on all QFS servers [done]
  • perform firmware upgrades on disk arrays (Oracle 6780) [done]
  • perform firmware upgrades on disk arrays (DDN SFA10000) [done]
  • install Solaris 11 on SAMFS server [done]
  • install SAM/QFS on SAMFS server [done]
  • install QFS on all QFS clients [done]
  • ensure all file systems are mountable and distributable [in progress]
  • bring system up again

users affected

The following users are affected by this work:

accadia                 fdonovan                matthew.edwards
adam.mullavey           fehrmann                maxime.fays
afina.neunzert          forte                   max.isi
ajith                   francesco.direnzo       mbebronn
alessandra.corsi        francesco.piergiovanni  mcoughlin
alexander.cole          frank.ohme              mdetert
alexander.mellus        frederick.coburn        michael.puerrer
alexander.urban         fritz.miot              michele
alex.nielsen            gaborg                  millerd
almir.alemic            gabor.szeifert          min-a.cho
anamaria                gabriela.gonzalez       muhammed.saleem
anderson                gabriela.hernandez      mwas
andrew.miller           gabriela.serna          namgyu.kim
andrew.rodger           gabriel.islas           nathaniel.indik
andrew.williamson       gareth.pickford         nce
andri.gretarsson        geodc                   nicole.darman
anirban.ain             gharry                  none
anthony.lefeld          giancarlocella          oriella.torre
antonio.perreca         gimazz                  paleac
anuradha.samajdar       giovanni.rabuffo        patricia.porter
ashikuzzaman.idrisy     gmartini                patricia.schmidt
ashish.mahabal          graef                   patrick.meyers
ashley.disbrow          grant.meadors           paul.hopkins
asperanz                greenley                paul.lasky
astroeer                grifonator              pbrem
atbraack                guillermo.valdes        pehrens
avecchio                halston.lim             peng.geng
ballen                  hannah.middleton        pfreire
bangalore.sathyaprakash harald.pfeiffer         praffai
bastiaan.swinkels       haris.k                 prathamesh.dalvi
bbehnke                 hbeggenstein            qi.chu
belinda.cheeseboro      hcmarroc                quitzow
bema                    hjkim                   rajesh.nayak
benacquista             hoff                    ramesh
benjamin.aylott         hpletsch                rana.adhikari
benno.puetz             hunter.gabbard          re
bernard.hall            igor.andreoni           reatough
bgarcia                 igorbilenko             reedessick
bhubbert                irene                   rhondale.tso
bianca.danilet          irina.ene               ripeschke
bose                    isantiago               robert.coyne
boyang                  jackson.henry           rolland
brandi.dunnington       jaclyn.sanders          ryan.darragh
branson.stephens        jacob.peoples           ryan.goetz
brevilo                 jade.powell             ryan.lynch
bsomhegy                jaehyun.lee             ryan.magee
byuan                   james.bell              rynge
carl.brannen            james.cowley            saeed.mirshekari
carl-johan.haster       jason.tye               salemi
carsten                 jayanti.prasad          salvatore.vitale
cbiwer                  jeong-su.ha             samantha.usman
chandramishra           jeroen.meidam           sanjit.mitra
charlton                jhcscargill             satya
chase.kernan            jiafrate                scaudill
chericoni.domizia       jialun.luo              scottmsul
chmahr                  jing.ming               sebastian
chmess                  jlogue                  serena.vinciguerra
chohs                   joey.key                sfischet
christian               john.le                 sfranco
christopher.berry       jonathan.bayless        s.gwynne.crowder
chunglee.kim            jonathan.hanks          shaltev
ckim                    josephb                 shaon
claudia.lazzaro         joseph.bowers           shi
claudio.casentini       joshua.kerrigan         shinkee.chung
clio                    jotradov                shivaraj.kandhasamy
colin.clark             jslutsky                simon
connor.skeehan          juan.bustillo           simon.stevenson
cristian.maureira       justing                 sinead.walsh
cristiano.palomba       justin.tervala          slawomir.gras
daniel.duddleston       justin.wagner           smorriss
daniele.trifiro         juve                    soenke.schuster
daniel.evans            kalina.nedkova          surabhi.sachdev
dantonio                karla.guardado          swetha.bhagwat
david.groden            katherine.grover        sydney.chamberlin
david.kelley            kawies                  szabolcs.marka
david.morate            keiko                   tania
david.stiles            kendall.ackley          tdent
dbrown                  kent                    teresa.symons
deborah.good            kg.arun                 thomas
deborah.hamm            kgrover                 thomas.adams
dietz                   kloew                   thomas.downes
dkeitel                 koutarou.kyutoku        tito
dkeppel                 kremin                  tom.wantock
dmeacher                laleh.sadeghian         tsidery
dmsima                  laszlo.gondan           vaibhav
drago                   laura.spitler           vansuch
dtalukder               lauro.salazar           vedovato
edaw                    leroy                   veronica.lockett-ruiz
eddy                    lesteves                vicere
egoetz                  lex                     vihan.pandey
eharstad                lrodriguez              vincent.roma
eheaton                 lucas                   violet.poole
einstein                lucas.giolas            wademc
einstein.temp           lucas.johns             walter
einstein.work           lwade                   weigang.liu
emacayeal               magathos                william.tritch
emaros                  manca                   wpozzo
eric                    marcel.kehl             xian.chen
eric.lebigot            marc.normandin          xiangyu.guo
evan.anders             marco.tompitak          xiaoge.wang
evan.foley              marek.szczepanczyk      xilong.fan
evan.keane              maria.tringali          yatish
fabian.magana-sandoval  marion                  yingsheng.ji
fabio.ricci             marissa.walker          yuanhao.zhang
fabrizia                matthew                 zijing.yang
fan.zhang               matthew.cowart

Misc

(Friday/Saturday) We encountered problems with two file systems. Instead of taking the risk of losing up to 20 TB of data, we stopped the dump and are trying to force as many of these files to tape as fast as possible.
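
To give a sense of scale for the archiving backlog, the following back-of-the-envelope sketch uses the figures from the status log (about 7 million files, 10 TB, hoped-for completion within 12 hours) and the 8 tape drives mentioned in the work description. It assumes decimal units (1 TB = 10^12 bytes) and that all 8 drives share the load evenly; neither assumption is confirmed on this page, and the function name is made up for illustration.

```python
def archive_rates(files, total_bytes, hours, drives):
    """Average rates needed to archive `files` files within `hours`."""
    seconds = hours * 3600
    aggregate = total_bytes / seconds  # bytes per second over all drives
    return {
        "aggregate_MB_per_s": aggregate / 1e6,
        "per_drive_MB_per_s": aggregate / drives / 1e6,
        "files_per_s": files / seconds,
        "mean_file_MB": total_bytes / files / 1e6,
    }


# Figures from the status log; drive count from the work description.
rates = archive_rates(files=7_000_000, total_bytes=10e12, hours=12, drives=8)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the target works out to roughly 230 MB/s in aggregate, about 29 MB/s per drive, and about 160 files per second, with a mean file size of about 1.4 MB.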

-- CarstenAulbert - 28 Jan 2014
Topic revision: r12 - 05 Feb 2014, CarstenAulbert