[gpfsug-discuss] Online data migration tool
Jonathan Buzzard
jonathan.buzzard at strath.ac.uk
Thu Nov 30 22:02:35 GMT 2017
On 30/11/17 18:01, Skylar Thompson wrote:
[SNIP]
> To be fair, a lot of our biomedical/informatics folks have no choice in the
> matter because the vendors are imposing a filesystem-as-a-database paradigm
> on them. Each of our Illumina sequencers, for instance, generates a few
> million files per run, many of which are images containing raw data from
> the sequencers that are used to justify refunds for defective reagents.
> Sure, we could turn them off, but then we're eating $$$ we could be getting
> back from the vendor.
>
Been there too. What worked was having a find script that ran through
their files, found directories that had not been accessed for a week and
zipped them all up, before nuking the original files.
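
A minimal sketch of that sort of job, written in Python rather than as a
find one-liner. The function names, the "<dir>.zip" naming and the
verify-before-delete step are assumptions for illustration, not the script
that was actually run. It also assumes access times are trustworthy on the
filesystem in question (not mounted noatime, for instance) and that the
trees hold regular files only; empty subdirectories are not preserved.

```python
import os
import shutil
import time
import zipfile
from pathlib import Path

SECONDS_PER_DAY = 86400


def newest_atime(directory):
    """Most recent access time of any file under directory, or None if it holds no files."""
    newest = None
    for root, _dirs, files in os.walk(directory):
        for name in files:
            atime = os.lstat(os.path.join(root, name)).st_atime
            if newest is None or atime > newest:
                newest = atime
    return newest


def zip_directory(directory, zip_path):
    """Store every file under directory in zip_path; return how many were stored."""
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                full = os.path.join(root, name)
                # Keep the directory's own name as the top-level folder in the archive.
                zf.write(full, os.path.relpath(full, directory.parent))
                count += 1
    return count


def archive_stale_dirs(parent, max_age_days=7, now=None):
    """Zip each immediate subdirectory of parent whose files have all gone
    unread for max_age_days, then delete the original tree."""
    parent = Path(parent)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    archived = []
    for entry in sorted(parent.iterdir()):
        if not entry.is_dir() or entry.is_symlink():
            continue
        newest = newest_atime(entry)
        if newest is None or newest >= cutoff:
            continue  # empty, or something was read within the window
        zip_path = entry.parent / (entry.name + ".zip")
        if zip_path.exists():
            continue  # never overwrite an earlier archive
        stored = zip_directory(entry, zip_path)
        # Only nuke the originals once the archive checks out.
        with zipfile.ZipFile(zip_path) as zf:
            intact = zf.testzip() is None and len(zf.namelist()) == stored
        if intact:
            shutil.rmtree(entry)
            archived.append(zip_path)
    return archived
```

Run from cron as something like archive_stale_dirs("/path/to/runs"), where
the path is a placeholder for wherever the instrument output lands. It
treats each top-level directory as one unit, so a run is only archived once
nothing inside it has been read for a week.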
The other thing I would suggest is if they want to buy sequencers from
vendors who are brain dead, then that's fine but they are going to have
to pay extra for the storage because they are costing way more than the
average to store their files. Far too much buying of kit goes on without
any thought of the consequences of how to deal with the data it generates.
Then there were the proteomics bunch who basically just needed a good
thrashing with a very large clue stick, because the zillions of files
were the result of their own Perl scripts.
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG