French genome sequencing lab SeqOIA deployed quad-level cell (QLC) flash-based Universal Storage from Vast Data to meet government targets of 6,000 patient analyses per annum by 2025. In fact, it was on course to meet that objective well before year-end and has been able to add storage controller resources without needing to add array capacity.
“Genome sequencing helps identify genetic events in patients to better characterise their pathology and find new treatments,” said Alban Lermine, director of information systems at SeqOIA. “We can find genetic explanations for a cancer, for example, or for rare illnesses. And thanks to these analyses, doctors can determine better treatments or advise families of conditions that may run in the family.”
In 2017, the French government decided to aim for leadership in genomic sequencing and announced the Médecine France Génomique 2025 programme. The challenge was to build a network of very high-throughput sequencing platforms. Three major French health research establishments joined forces to tackle the job and created SeqOIA in 2018 with the aim of breaking down technological barriers.
A giant storage cluster that collapsed under I/O requests
“The principle is that for each genome sequence you have to launch several analyses on very large volumes of data,” said Lermine.
“In 2018, we deployed a 400TB [terabyte] storage cluster managed by Lustre, which is common in the scientific community. That storage was accessed by 2,000 processor cores in the compute cluster via 40Gbps Ethernet,” added Lermine.
“Bit by bit, we added to the load on the compute cluster. We started with one sequencing procedure, then two in parallel, then three. But when we got to four, the system collapsed.”
The hard drives in SeqOIA’s storage cluster could no longer cope with the input/output (I/O) demands of the compute servers. While it waited for the drives’ write heads to finish writing files, Lustre would cache any further data that arrived. But the cache could not grow without limit, so it wrote what it could in the time it had. That resulted in incomplete writes, said Lermine: “So, basically, we ended up with corrupted files.”
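The failure mode described here, truncated writes that silently produce corrupt result files, is the kind of problem a pipeline can only catch by checking each output against its expected size and checksum. Below is a minimal illustrative sketch of such a check in Python; the manifest format and file names are hypothetical and not taken from SeqOIA’s actual tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file in 1 MiB chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_outputs(manifest: dict[str, tuple[int, str]], out_dir: Path) -> list[str]:
    """Compare each written file against the size and checksum recorded upstream.

    `manifest` maps filename -> (expected_size_bytes, expected_sha256).
    Returns the names of files that are missing, truncated or corrupted.
    """
    bad = []
    for name, (expected_size, expected_hash) in manifest.items():
        path = out_dir / name
        if not path.exists() or path.stat().st_size != expected_size:
            bad.append(name)   # missing file or incomplete (truncated) write
        elif sha256_of(path) != expected_hash:
            bad.append(name)   # file is the right size but its content is corrupted
    return bad
```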
To start with, Lermine’s team considered adding more storage nodes to increase the bandwidth available for access requests. “Was this a good idea? We didn’t even have the ability to ask the question,” said Lermine. “Our storage vendor couldn’t supply more anyway, so we had to look for another solution.”
The challenge: To add throughput but not capacity
By then, it was the end of 2021. SeqOIA had been established, had bought its medical equipment, deployed its IT, carried out tests and grown in its ability to handle data. But during that time, storage had evolved too, most notably with the falling price of flash, which is much faster than conventional hard disk drives.
“Flash storage interested us because it allowed us to multiply throughput while not having to augment storage capacity,” said Lermine. SeqOIA needs 400TB as working capacity, but once results are obtained, data is archived elsewhere on Scality Ring object storage.
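For illustration, moving finished results off the working tier to object storage is typically just a copy to an S3-style endpoint, an interface Scality Ring exposes. The sketch below assumes that interface; the endpoint, bucket and key layout are placeholders rather than details from SeqOIA’s environment.

```python
import boto3  # assumes the Scality Ring archive tier is reached via its S3-compatible API

# Endpoint, bucket and key layout below are illustrative placeholders only.
s3 = boto3.client(
    "s3",
    endpoint_url="https://ring-archive.example.internal",
)


def archive_result(local_path: str, run_id: str) -> None:
    """Copy a finished analysis result off the working tier into object storage."""
    key = f"analyses/{run_id}/{local_path.rsplit('/', 1)[-1]}"
    s3.upload_file(local_path, Bucket="sequencing-archive", Key=key)
```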
So, SeqOIA’s IT chief set out to meet suppliers. HPE, Pure Storage and Vast Data responded.
“HPE proposed a complex solution,” said Lermine. “Meanwhile, Pure Storage would have given us the same problem as with the hard drives. They offered a solution with plenty of bandwidth, but if you got to its limits you had to add an entire array complete with storage we would not have used. Only Vast Data would allow us to add I/O management modules without adding to the number of SSDs.”
Vast Data offers arrays based on bulk storage using NVMe-connected QLC flash. QLC, while relatively cheap, is the least durable of all the flash generations and best used for sequential I/O.
To get around this limitation, Vast does its best to sequentialise traffic, using data processing units (DPUs) that reshape incoming I/O into fewer, larger and less randomised writes.
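As a rough illustration of the general idea of sequentialising writes, the sketch below shows a generic log-structured write buffer in Python: many small writes are staged in memory and flushed as one large sequential append, the pattern QLC flash handles best. It illustrates the concept only and is not how Vast’s DPUs or buffering layer are actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class WriteBuffer:
    """Illustrative log-structured buffer: small random writes are staged in memory
    and flushed to the backing store as one large sequential append."""
    flush_threshold: int = 64 * 1024 * 1024   # flush once 64 MiB has accumulated
    staged: list[bytes] = field(default_factory=list)
    staged_bytes: int = 0

    def write(self, payload: bytes, sink) -> None:
        """Stage one small write; flush to `sink` when enough data has built up."""
        self.staged.append(payload)
        self.staged_bytes += len(payload)
        if self.staged_bytes >= self.flush_threshold:
            self.flush(sink)

    def flush(self, sink) -> None:
        """Emit everything staged so far as a single large sequential write."""
        if not self.staged:
            return
        sink.write(b"".join(self.staged))
        self.staged.clear()
        self.staged_bytes = 0
```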
No errors: An objective already achieved
All meetings took place in November 2021. Less than 30 days later, SeqOIA agreed to deploy a Vast Data Universal Storage array of 500TB.
The purchase was outright, without the monthly subscriptions, leasing or the like that suppliers often push but which don’t always suit European public sector organisations, whose budgets can change from year to year.
“The purchase price included a five-year support contract and that’s all we care about,” said Lermine.
That wasn’t the only point of satisfaction, however. “Not only have corrupted files disappeared, but analysis times have been cut to a quarter of what they were,” added Lermine.
“The target set by the government is to process 6,000 patient records a year by 2025. This year, 2022, is not at an end and we have already processed 5,500.”