Distributed/clustered file systems
This page summarizes which scenarios such file systems could fulfill within Atlas and what we expect from them. Each property listed for a scenario is classified as a must have (M), an important wishlist item (I), or a nice to have (N).
We need to discuss, and finally agree upon, the methods each candidate applies and the risks associated with it, whether it is an open-source project driven by a small community or a commercial file system backed by a company that is too large or too small.
Scratch space
function
Fast storage of short-lived files; performance is more important than long-term integrity.
properties
- POSIX-compliant file system (the exact subset of POSIX semantics required still needs to be defined) (M)
- local mount point on each client (M)
- client runnable on Debian (M)
- server runnable on Debian ((I) if the number of servers is <= 10, (M) if larger)
- if commercial, our configuration must be supported by the vendor (M)
- file integrity checks (N)
- multiple file copies (N)
- metrics for deciding where to store files (I)
- no single point of failure (M)
- if a server goes down, the file system should keep working for files on non-affected servers ((I) for a low number of servers, (M) for a large number)
open questions
- what happens when multiple clients access the same file (all read-only, one writing, multiple trying to write)?
- how do file locking and (attribute) caching behave?
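The locking question can be probed directly once a candidate is mounted. The sketch below is only an illustration; the test path is an assumption and should point at a file on the candidate mount. It takes a POSIX advisory lock via fcntl; a second client running the same code against the same file should fail with EAGAIN if the file system propagates locks between clients.

```python
# Sketch: probe whether a mounted file system honours POSIX advisory
# locks (fcntl). The path is an assumed test location -- replace it
# with a file on the candidate mount point.
import fcntl
import os

path = "/tmp/locktest"  # assumption: use the candidate mount in practice

fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
try:
    # Take an exclusive, non-blocking lock on the whole file.
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    print("exclusive lock acquired")
    # A second client calling lockf() on the same file should now get
    # EAGAIN/EACCES if locks are propagated across clients.
finally:
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```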
application
- 16 radio servers exporting their 12x3 TB disks as JBOD or RAID
- O(1000) servers exporting their SSDs and/or HDDs
candidates (alphabetical order)
pros
cons
Ceph
pros
cons
pros
cons
- broke last time around, but have new backers
GPFS
pros
cons
Lustre
pros
cons
pros
cons
Quobyte
pros
cons
SAM/QFS
pros
cons
pros
cons
Home file system
function
Long-term data integrity is much more important than fast access.
properties
- POSIX-compliant file system (the exact subset of POSIX semantics required still needs to be defined) (M)
- local mount point on each client (M)
- client runnable on Debian (M)
- server runnable on Debian ((I) if the number of servers is <= 10, (M) if larger)
- if commercial, our configuration must be supported by the vendor (M)
- file integrity checks (I)
- multiple file copies (either on disk or with a tape backend) (I)
- hierarchical storage management (I)
- metrics for deciding where to store files (N)
- no single point of failure (I)
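As an illustration of what a user-level file integrity check (property above) amounts to: record a digest per file at write time and compare it on a later scrub. Checksumming file systems do this transparently at block level; the sketch below, with an assumed test path, only shows the principle.

```python
# Minimal sketch of a user-level integrity check: store a SHA-256
# digest per file and re-verify it later. The path is an assumed
# test location, not a real Atlas path.
import hashlib

def digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

path = "/tmp/integritytest"  # assumption: use a file on the home FS
with open(path, "wb") as f:
    f.write(b"home directory data\n")

stored = digest(path)              # recorded at write time
assert digest(path) == stored      # later scrub: re-read and compare
print("integrity ok")
```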
open questions
- what happens when multiple clients access the same file (all read-only, one writing, multiple trying to write)?
- how do file locking and (attribute) caching behave?
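The "multiple clients trying to write" question can be made concrete with a small experiment: two processes append fixed-size records to one file. On a POSIX-compliant local file system, O_APPEND writes are atomic, so no record is lost or torn; a candidate network file system may behave differently. The path and sizes below are arbitrary test values.

```python
# Sketch: two writers append fixed-size records to the same file via
# O_APPEND; afterwards we check that no record was lost or torn.
# Run it with both writers on different clients of a candidate mount
# to probe its write semantics. Path and counts are test values.
import os

path = "/tmp/appendtest"  # assumption: use the candidate mount
n, rec_len = 1000, 64

if os.path.exists(path):
    os.remove(path)

pids = []
for tag in (b"A", b"B"):
    pid = os.fork()
    if pid == 0:  # child: append n records consisting only of its tag byte
        fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
        for _ in range(n):
            os.write(fd, tag * rec_len)
        os.close(fd)
        os._exit(0)
    pids.append(pid)

for pid in pids:
    os.waitpid(pid, 0)

# Every record should contain a single byte value; mixed bytes mean
# two writes were interleaved ("torn"), a missing record means data loss.
with open(path, "rb") as f:
    data = f.read()
records = [data[i:i + rec_len] for i in range(0, len(data), rec_len)]
torn = [r for r in records if len(set(r)) != 1]
print(len(records), "records,", len(torn), "torn")
```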
application
- FC-based SAN exported via NFS or other means to the cluster
candidates (alphabetical order)
pros
cons
Ceph
pros
cons
pros
cons
- broke last time around, but have new backers
GPFS
pros
cons
Lustre
pros
cons
pros
cons
Quobyte
pros
cons
SAM/QFS
pros
cons
pros
cons
--
CarstenAulbert - 29 Jun 2014