fs3 failure recovery and infrastructure upgrade plan

Work priority/schedule

  • Set up database replication on the second DB server at UWM and do daily dumps (a dump sketch follows this list)
  • Order project front-end servers (email Carsten and get info about nodes here)
  • Set up a cron job to GridFTP canonical results from UWM to Hannover (talk with Carsten and Scott; a transfer sketch follows this list)
After all canonical results are copied, move the other results.
  • Set up a file server on the spare Supermicro box using the XFS file system (xfs1)
  • Copy canonical results from fs3 to xfs1 in a careful way (ionice, sleep, ...; a throttled-copy sketch follows this list)
  • Set up a new download mirror [einstein-dl.phys.uwm.edu] (this can be a cheap recycled box)
  • Research and order database hardware (8 cores, 64GB memory)
  • Make a short-term backup of the project directory before the hardware is upgraded (set up cron, no delete; a backup sketch follows this list)
  • Research how to duplicate data on the file server (rsync, RAID 1, GFS, High-Availability Linux, SAS, DRBD: http://www.drbd.org, ...)
A student at Hannover may do DRBD testing.
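
A minimal sketch of the daily-dump part of the replication item, assuming the project database is MySQL (as is typical for BOINC projects); the database name and dump directory are placeholders:

    #!/usr/bin/env python
    # Daily dump of the project database, run from cron on the replica
    # so the dump does not load the master. Names below are placeholders.
    import datetime
    import subprocess

    DB = "einstein"                # placeholder database name
    DEST = "/backup/db-dumps"      # placeholder dump directory

    outfile = "%s/%s-%s.sql.gz" % (DEST, DB, datetime.date.today().isoformat())

    # --single-transaction gives a consistent dump of InnoDB tables
    # without locking them; gzip keeps the daily dumps manageable.
    dump = subprocess.Popen(["mysqldump", "--single-transaction", DB],
                            stdout=subprocess.PIPE)
    with open(outfile, "wb") as out:
        subprocess.check_call(["gzip", "-c"], stdin=dump.stdout, stdout=out)
    dump.stdout.close()
    if dump.wait() != 0:
        raise RuntimeError("mysqldump failed")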
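
For the cron GridFTP transfer, something along these lines could work; the hostnames and paths are placeholders, and globus-url-copy flags vary between Globus Toolkit versions, so treat this as a sketch:

    #!/usr/bin/env python
    # Cron-driven GridFTP copy of canonical results from UWM to Hannover.
    # Hostnames and paths are placeholders for the real endpoints.
    import subprocess

    SRC = "gsiftp://fs.phys.uwm.edu/data/canonical_results/"     # placeholder
    DST = "gsiftp://gridftp.aei.mpg.de/data/canonical_results/"  # placeholder

    # -cd creates missing destination directories, -r recurses into
    # subdirectories. Recent toolkit versions also offer -sync, so only
    # new or changed files are transferred on each cron run.
    subprocess.check_call(["globus-url-copy", "-cd", "-r", SRC, DST])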
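
For the careful copy off the failing fs3, one option is to wrap rsync in ionice and pause between directories; paths and limits below are placeholders to tune:

    #!/usr/bin/env python
    # Gentle copy of canonical results from fs3 to xfs1: idle I/O class,
    # capped bandwidth, and a pause between top-level directories so the
    # copy never monopolizes the old disks. Paths are placeholders.
    import os
    import subprocess
    import time

    SRC = "/fs3/canonical_results"    # placeholder source on fs3
    DST = "/xfs1/canonical_results"   # placeholder destination on xfs1

    for entry in sorted(os.listdir(SRC)):
        subprocess.check_call([
            "ionice", "-c3",                  # idle I/O scheduling class
            "rsync", "-a", "--bwlimit=5000",  # cap at ~5 MB/s, tune as needed
            os.path.join(SRC, entry), DST + "/",
        ])
        time.sleep(10)                        # let the disks breathe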
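
And for the short-term project-directory backup, a no-delete rsync from cron is probably enough; a sketch with placeholder paths:

    #!/usr/bin/env python
    # Nightly backup of the project directory, e.g. from a crontab entry
    # like "0 3 * * * /usr/local/bin/backup_project.py". There is
    # deliberately no --delete flag, so files removed from the project
    # directory survive in the backup. Paths are placeholders.
    import subprocess

    SRC = "/home/project/"           # placeholder project directory
    DST = "/backup/project_mirror/"  # placeholder backup location

    subprocess.check_call(["rsync", "-a", SRC, DST])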

Timeline for infrastructure upgrades

  • Database replication on current spare (< 1 week)
  • Cron script to copy new canonical_results to Hannover (1 week)
  • Download mirror einstein-dl.phys.uwm.edu (1 - 2 weeks)
  • Set up the new file server with XFS (1 week); use it as the production project file server after all important results are pulled from the old file server and copied to Hannover (1 month)
Move canonical results first, then valid results, then the others.
  • Move the project directory from the project server to the file server (6 weeks)
  • Set up new database hardware (2-3 months, arrival time +2 weeks?)
  • Set up new project server hardware (2-3 months, arrival time +2 weeks?)