BGI, based in China, can sequence the equivalent of 2,000 human genomes per day. While sequencing is faster and cheaper than ever, it produces so much data that BGI cannot transmit results to clients over the internet because it takes too long; often it ships computer disks by FedEx instead. The field of genomics is in a 'data deluge,' where it costs more to transmit and analyze a genome than to sequence it. The cost of sequencing a human genome has dropped to $10,500 from $8.9 million in 2007. While this is a decline by a factor of more than 800 over 4 years, computing costs only dropped by a factor of about 4 in that same time. Few human genomes are being sequenced for individual use; most are part of large-scale genome studies, such as searches for the causes of cancer. The federal online archive of raw sequencing data has borne the brunt: since the beginning of 2011 it has grown to 300 trillion DNA bases, occupying about 700 trillion bytes of computer storage. Federal officials were considering shutting the archive because of budget constraints, but it will now stay open, with big sequencing projects bearing the cost. Meanwhile, the Human Microbiome Project, which is sequencing the microbial populations in the human digestive tract, has produced a million times as much sequencing data as a single human genome. Scientists admit that they are not even sure exactly what to do with all of the data.
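To make the gap between sequencing costs and computing costs concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the variable names are made up for illustration, and the per-year rates assume a constant rate of decline over the 4 years, which is a simplification.

```python
# Figures quoted in the post (taken from the linked article).
cost_2007 = 8_900_000  # dollars to sequence one human genome in 2007
cost_2011 = 10_500     # dollars to sequence one human genome 4 years later
years = 4

# Overall decline factors over the 4-year span.
sequencing_factor = cost_2007 / cost_2011
computing_factor = 4  # the post's rough figure for computing costs

# Equivalent per-year factors, assuming a constant rate of decline.
sequencing_per_year = sequencing_factor ** (1 / years)
computing_per_year = computing_factor ** (1 / years)

# How many times faster sequencing got cheaper than computing.
gap = sequencing_factor / computing_factor

print(f"Sequencing cost fell by a factor of about {sequencing_factor:.0f}")
print(f"Per year: sequencing x{sequencing_per_year:.1f}, computing x{computing_per_year:.1f}")
print(f"Sequencing outpaced computing by a factor of about {gap:.0f}")
```

On these numbers, sequencing got cheaper roughly 200 times faster than computing did, which is why analysis and storage have become the bottleneck rather than the sequencing itself.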
Some scientists are looking to Google to help solve the problem, since it is said to have the capacity to do all of genomics in a day. Google's venture capital arm recently invested in a bioinformatics company. Researchers are storing all of the raw data in the hope that new technology will emerge to help analyze it; until then, they will keep amassing unmanageable amounts of data and hoping that a processing solution comes soon.
Article: http://www.nytimes.com/2011/12/01/business/dna-sequencing-caught-in-deluge-of-data.html?_r=1