Linux Today: Linux News On Internet Time.

Linux Magazine: Natural Open Source

Sep 03, 2002, 09:00
(Other stories by Robert McMillan)

"Within ten years, sequencing machines were fast enough that researchers could seriously consider cataloging the sequence of three billion nucleotides that make up the 30,000-35,000 human genes and creating a kind of blueprint of humanity. The Human Genome Project was born. And with it came the task of making sense of the three gigabytes of data that comprise our DNA. Add to that all of the contextual information about the human genome -- published research about certain sequences and information on the relationship between sequences -- and the various algorithms for analyzing the data, and finally the fact that the human genome is merely one of many genomes to be mapped (the mouse genome is being finished now) and you begin to have some very large and messy data management problems. This is why, today, the computer has joined the microscope and the rat's cage as an essential part of the biologist's toolbox.

"University research environments; vast amounts of data that need to be manipulated in customizable ways; a community of technical people with shared goals; new data analysis techniques cropping up on a regular basis. These are the hallmarks of the open source problem set, and if ever there was a world ready for open source software, the biological sciences are it. In the last ten years bioinformaticists (people who use computers to process biological information) have wholeheartedly embraced open source tools; in turn, the work done by biologists has begun to have an impact in the larger open source world.

"Established open source projects have proved particularly useful to biologists in two areas: in number crunching, where Linux-based Beowulf clusters are providing a high-performance and inexpensive alternative to proprietary RISC systems, and in scripting, where biologically-focused scripting libraries like BioPerl and BioPython have become extremely popular tools for writing quick queries to the numerous publicly available genomic databases..."
