I’ve been working as a Computational Biologist at Covance’s Biomarker Center of Excellence since April. I’m the first computational scientist they’ve brought on and, perhaps not surprisingly, I’ve been asked several times over the past three months, “What is a Computational Biologist?”
Biology in the post-genome world has transformed from a largely laboratory-based science to one that integrates experimental and information science. Computational biology is an interdisciplinary field that uses the techniques of applied mathematics, informatics, statistics and computer science to address biological problems. A computational biologist analyzes, integrates and interprets large-scale biological data, including clinical data, literature and high-throughput genomic and/or proteomic data. Computational biology focuses on hypothesis testing and discovery in the biological domain; the goal is to learn something new about the biology of the system under investigation.
The large size of biological data sets, the inherent complexity of biological problems and the need to deal with error-prone data all result in large run-time and memory requirements. As such, a powerful computing environment is essential for computational biology; it enables computational biologists to perform many complex, memory-intensive tasks, including:
- Gene expression profiling: measuring the activity of tens of thousands of genes at once to create a global picture of cellular function. Statistics and other analytical approaches are used to identify changes between various biological states.
- Functional genomics: integrating genomic and proteomic knowledge to understand the relationship between an organism’s genome and its phenotype. Statistics and the vast wealth of data produced by genomic projects are used to describe gene and protein functions and interactions.
- Comparative genomics: establishing a correspondence between genes (i.e. orthology analysis) or other genomic features in different organisms (e.g. an animal disease model and human patients).
- Analysis of protein expression: protein microarrays and high-throughput mass spectrometry can provide a snapshot of the proteins present in a biological sample. These methods involve matching large amounts of mass data against predicted masses from protein sequence databases, and the complicated statistical analysis of samples in which multiple but incomplete peptides from each protein are detected.
- Modeling of biological systems: using computer simulations of cellular subsystems (e.g. the networks of metabolites and enzymes that make up metabolism, signal transduction pathways and gene regulatory networks) to both analyze and visualize the complex connections among these cellular processes.
- Integrative biology: integrating multi-dimensional data from a variety of disparate sources to provide a broader view of a biological system, with a focus on pathways and networks.
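To make the first task in the list a little more concrete, here is a minimal sketch of how changes in gene expression between two biological states might be scored. It assumes log2-transformed expression values and uses Welch's t-statistic, which is just one of many possible approaches; the gene names, sample layout and numbers are made up for illustration, and the function names `welch_t` and `rank_genes` are hypothetical. A real analysis would cover tens of thousands of genes and would need p-values and multiple-testing correction, which this sketch leaves out.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # No variation in either group: the statistic is undefined,
        # so report 0 for identical means and +/- infinity otherwise.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def rank_genes(expression, control_idx, treated_idx):
    """Score each gene and sort by the absolute size of its t-statistic."""
    results = []
    for gene, values in expression.items():
        control = [values[i] for i in control_idx]
        treated = [values[i] for i in treated_idx]
        # Values are assumed to be log2 already, so the difference of
        # means is the log2 fold change (treated relative to control).
        log2_fc = mean(treated) - mean(control)
        results.append((gene, log2_fc, welch_t(treated, control)))
    return sorted(results, key=lambda r: abs(r[2]), reverse=True)


# Toy data: three control samples followed by three treated samples.
expression = {
    "GENE_A": [5.0, 5.2, 4.9, 8.1, 7.9, 8.3],
    "GENE_B": [6.1, 5.9, 6.0, 6.0, 6.2, 5.8],
    "GENE_C": [9.0, 9.3, 8.8, 7.0, 7.2, 6.9],
}

ranked = rank_genes(expression, control_idx=[0, 1, 2], treated_idx=[3, 4, 5])
for gene, log2_fc, t in ranked:
    print(f"{gene}: log2 fold change = {log2_fc:+.2f}, t = {t:+.2f}")
```

In this toy example, GENE_A (up in the treated samples) and GENE_C (down) rise to the top of the list, while GENE_B, which barely changes, falls to the bottom.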
As I work to develop the computational infrastructure in my new position, I find I’m also fulfilling the role of a bioinformatician. Although the terms bioinformatics and computational biology are often used interchangeably, the two are distinct: bioinformatics focuses on the creation of tools (i.e. algorithms and specific computational methods) that work on biological data.
A bioinformatician enables the discovery of new biological insights through the creation and improvement of algorithms, databases, computational and statistical techniques and theory to solve formal and practical problems arising from the management and analysis of biological data. Bioinformatics focuses on developing and applying computationally intensive techniques such as pattern recognition, machine learning, data mining and visualization.
For a list of computational biology tools and resources that I’ve found useful, see the Computational Biology Resources page.
If you’re in need of computational support for a research project or require computational analysis to inform and guide a study, I’m interested in collaborating. My CV is available here on the site along with a list of publications. Feel free to send me a message using the Contact form on the homepage.
Walter Jessen is a digital strategist, writer, web developer and data scientist. You can typically find him behind the screen of something with an internet connection.