Top N Reasons To Do A Ph.D. or Post-Doc in Bioinformatics/Computational Biology

For the last few years I’ve given a talk to incoming Ph.D. students in Molecular Biology on why they should consider doing Computational Biology research. I’m fairly passionate about making this pitch, since I strongly believe all 21st century Biologists should have a greater (or lesser) degree of computational training, and that the best time to gain that training is during a Ph.D. or a Post-Doc.

I’ve decided to post an expanded version of the reasons I give for why Biology trainees should gain computational skills in hopes of encouraging a wider audience to consider a research path in Computational Biology. For simplicity, I define the field of Computational Biology to include Bioinformatics as well, although there are important distinctions between these two disciplines. Also, I note that this list is geared towards convincing students with a background in Molecular Biology to consider moving into Computational Biology, but core aspects and variants of the arguments here should apply to people with backgrounds in other disciplines (e.g. Ecology, Neuroscience) as well. Here we go…

0. Computing is the key skill set for 21st century biology: As time progresses, Biology is becoming a more quantitative science. Over the last three centuries, biology has transformed from an observational science into an experimental science into a data science. As the low-hanging fruit gets picked, fundamental discoveries are getting harder to make using observation and experiment alone. In the future, new discoveries will require leveraging big datasets and using advanced analytical methods. Big data and complex models require computational skills. Full stop. There is no way to escape this reality.

But if you don’t take my word for it, listen to Nobel-prize winning pioneer of molecular biology Walter Gilbert, who made this same argument about the future of biology over 20 years ago:

To use this flood of [sequence] knowledge, which will pour across the computer networks of the world, biologists not only must become computer literate, but also change their approach to the problem of understanding life.

Or listen to Nobel-prize winning pioneer of molecular biology Sydney Brenner, who has been banging on about this issue for years:

I spent many hours persuading people that computing was not only going to be the essential tool for biological research but would also provide models for analyzing complexity…The development of sequencing techniques and their widespread application has generated enormous databases of information, and the need for computers is no longer questioned

1. Computational skills are highly transferable: Let’s face it, not everyone doing a Ph.D. or Post-Doc. in Biology is going to go on to a career in academic research. The Washington Post recently reported that “only 14 percent of those with a Ph.D. in biology and the life sciences now land a coveted academic position within five years”. So if there is a high probability that your Ph.D. or Post-Doc training will need to be used outside of academic research, why not acquire the most broadly applicable skill set that you can? Experimental skills only transfer to laboratory jobs in the biosciences or medical job market. Computational skills transfer across this sector, plus a much wider market outside of the (bio)sciences. Increasing your computational chops won’t just give you a better chance at landing a job. It will have added benefits in your own life as well, since you will have a deeper appreciation for how computers work and more mastery when you interact with computers in your daily life.

2. Computing will help improve your core scientific skills: Biology is inherently a messy subject. While some Biologists are rigorously trained in how to cope with this messiness through good experimental design and statistical analysis (here’s looking at you my Ecologist sisters and brothers), the sad truth is that many (most?) Biologists have bad habits when it comes to data collection and analysis. Computing forces you to confront and tame the very human tendency to do science in ad hoc ways and therefore it naturally develops core scientific skills such as: logically planning experiments, collecting data consistently, developing reproducible methodology, and analysing your data with proper statistical methodology. So even if you can’t be convinced to abandon the bench or field forever, computational training will develop scientific best practice that crosses over and enhances your experimental skill set.

3. You should use your Ph.D./Post-Doc to develop new skills: Most Biologists come into their Ph.D. with some experimental training from high school and undergraduate studies. OK, so maybe this training isn’t cutting edge and you haven’t done advanced research to really hone your experimental skills, but nevertheless you do have some amount of training under your belt. In contrast, the vast majority of Biology Ph.D. students have no training in scientific computing skills beyond using Excel or a GUI-based statistics package. So use your Ph.D. or Post-Doc. time for what it should be: training in something new, not just further developing a skill set that you already have.

My view is that the best time to train in Computational Biology is during a Ph.D., and the last chance to do this is likely to be as a Post-Doc. This is because during your Ph.D. you have time, secure funding and a departmental structure to protect you that you will never have again in your career. Gaining computational skills as a Post-Doc is also a great option, but shorter contracts, greater PI dependency, and higher expectations to publish mean that you typically don’t have as much time to re-train as you would during a Ph.D. Good luck finding the time to re-tool as a PI.

4. You will develop a more unique skill set in Biology: As noted above, the vast majority of Biologists have experimental training, but very few have advanced Computational training. While this is (thankfully!) changing, you will still be at a competitive advantage for at least a decade in terms of getting results in post-genomic Biology if you can code. And because you will be able to get results that many others cannot, plus the fact that you will have skills that set you apart from the herd, you will be more competitive on the job market. Straight up.

5. You will publish more papers: While it may not always feel like it, a Ph.D. or Post-Doc goes by quickly. Therefore, you don’t have a lot of time to waste on experiments that fail, if you want to stay in the game. Don’t get me wrong, Computational Biology will provide you more than your fair share of failed experiments, but crucially they will fail in hours/days instead of weeks/months, and therefore allow you to move on to something that works more quickly. As a result, you are very likely to publish more papers per unit time in Computational Biology. Whether or not you believe the old chestnut that experimental papers are somehow “harder” and therefore have more worth (I don’t), it is clear that publication remains the hard currency of science. Moreover, the adage that search committees “know how to count even if they can’t read” is still as true as ever. More seriously, what employers and funding agencies want to see is junior researchers who have good ideas and can take them to completion. Publication is the proof that you can finish projects. Computational Biology will allow you to demonstrate that you are a finisher, and that you have what it takes to succeed in science, a little bit faster than the next guy or gal.

6. You will have more flexibility in your research: I would say one of the greatest things about being a Computational Biologist is that you are not as constrained in your research as you are when you do Experimental Biology. Sure, you can only work on projects that are amenable to computational analysis, but this scope is vast, from Computational Neuroscience to Theoretical Ecology and anything and everything in between. You can also move flexibly from topic to topic more easily than you can if your skill set is linked to specific experimental techniques. This flexibility in scope allows you to satisfy your intellectual curiosity or chase the latest trend as you wish. Most importantly for trainees, the flexibility (and low cost, see below) afforded by Computational Biology research allows you to make the case to your PI to develop your own research programme earlier in your career. This is crucial since the more experience you have designing independent projects early in your career, the more likely you will be to succeed if/when you make it to the big time.

7. You will have more flexibility in working practices: ‘Nuff said:

Seriously though, Computational Biology has many pluses when it comes to balancing work and life while still maintaining a high level of productivity. Unlike being chained to the bench, you can do Computational Biology from pretty much anywhere, and telecommuting/working from home are standard practices in Computational Biology. Over the longer term, this flexibility in work practice helps you to accommodate career breaks, manage the tough times life will throw at you, and make big life decisions like starting a family easier, since you can integrate coding and submitting jobs to the cluster into your life much better than you can integrate racing back to the lab to flip stocks or harvest cells. Let me say it loud and clear right here: if you want to have a career in academic Biological research and also have a family, choosing to do a Ph.D. or Post-Doc in Computational Biology will be more likely to get you to this goal than if you are stuck in the lab. This is not just true for women, as I and others can attest to:

8. Computational research is cost-effective: With the wealth of publicly available data, Computational Biology research is cheaper than most experimental work that requires a large consumables budget. This is important for a number of reasons. Primarily, work in Computational Biology is less dependent on grant funding, and therefore you don’t have to be a slave to trends or waste inordinate time chasing grant funding; you can actually just get on with the job of doing the science you want to do. This is especially important in tough economic times like the present moment. As mentioned above, the reduced cost of Computational Biology research also allows trainees to design their own research at an earlier career stage, since you will not be as reliant on a PI to authorize expenditure for your project. Cost-efficiency is also very important when you are starting your group and for maintaining continuity of productivity when riding out troughs in funding or group size. Finally, the cost-efficiency of Computational Biology allows researchers in developing scientific economies to be on an equal footing with researchers in rich countries. In my opinion, trainees from BRICS nations and other developing economies (sorry to use this somewhat judgemental term) should really consider choosing Computational Biology as a way to get to the top of the class globally without being limited by the need for big budgets.

9. A successful scientist ends up in an office: This is the kicker. If you succeed and get that “coveted” PI position, you will ultimately end up stuck in an office. True, some brave souls still find time to make it into the lab to do experiments, but they are a rare breed. The truth is that the native habitat for an academic researcher is sitting in their office in front of their computer. You can’t do a lick of wet lab or field work from the office, but you can still do Computational Biology research from behind a desk! As noted by Webb Miller, one of the most highly-cited bioinformaticians ever, continuing to do your own research is also one of the best ways to stay motivated about your work over the long haul of a career. Remember that the long term goal is to be a “Principal Investigator”, not an “In Principle Investigator,” so if you’ve really wanted to do research since you were young, then ask yourself: why train in skills you will never ultimately use for the majority of your career, while somebody else in your lab gets to have fun making all the discoveries?

[10. You will understand why lists should start with the number zero.]

A major reason I have for posting this list is to start more discussion about the benefits of doing research in Computational Biology. I have deliberately made this a top N list (not a top 10 list) so that good ideas can be added to the above. I’ll update this post with good suggestions from the comments, and give full credit to the originator.


38 Responses to “Top N Reasons To Do A Ph.D. or Post-Doc in Bioinformatics/Computational Biology”


  1. 1 Nick Harris August 1, 2012 at 12:55 am

    I’m interested in what you have to say about the same conundrum (wet lab science vs. computational science) for currently training MD/PhDs still choosing our future labs. I’m very interested in doing it but am nervous about the big jump it involves and the fair amount of risk in trying something completely new.

    • 2 caseybergman August 1, 2012 at 8:11 pm

      If you still have any remaining rotations to do, I’d suggest approaching a Computational Biology group leader and discussing a short project that you could use to get your feet wet. If you are looking at making your final lab selection and want to do a computational project, then you’ll need to find a supervisor who will support this plan and provide the right expertise. Even if you aren’t fully committed to taking the plunge, see if you can incorporate some computational techniques into your thesis. I guess the key thing I am trying to get across here is that there is less risk to making the switch to Computational Biology than you think, and that making the switch during your PhD may be the best time to do it, since you have less support and more pressure to deliver later in your career. Also keep in mind that virtually all programmers are self-taught, so there is nothing stopping you from learning to code besides your personal motivation. Find a good problem that you really want to solve in your research, read some good tutorials like those at http://software-carpentry.org/, and then start using Google, StackOverflow and local experts to find the answers to your questions. There is no time like the present to start programming!

  2. 3 Karmel August 1, 2012 at 7:08 am

    And what about the flipside– the top N reasons for a computer scientist/software engineer to get a PhD in computational biology? From my personal experience, I can start the list:

    0. No problem in media, finance, or *-apps is as wonderfully intricate and complex as the problem of understanding the mechanisms that sustain life.
    1. You get to stand up and break eye contact with your computer screen every once in a while.

    10. You will understand that whether a list is zero- or one-indexed is a function of context and conditions, and liable to be affected by whether you plated your cells in 5% or 10% FBS.

    :) Great list, thanks!

    • 4 caseybergman August 1, 2012 at 8:16 pm

      Great idea for a riposte (any takers???). On point 0, Donald Knuth fully agrees:

      “There’s millions and millions of unsolved problems [in Biology]. Biology is so digital, and incredibly complicated, but incredibly useful…It is hard for me to say confidently that, after fifty more years of explosive growth of computer science, there will still be a lot of fascinating unsolved problems at peoples’ fingertips, that it won’t be pretty much working on refinements of well-explored things. Maybe all of the simple stuff and the really great stuff has been discovered. It may not be true, but I can’t predict an unending growth. I can’t be as confident about computer science as I can about biology. Biology easily has 500 years of exciting problems to work on, it’s at that level.

      From http://tex.loria.fr/litte/knuth-interview

  3. 5 Ethan White August 1, 2012 at 4:11 pm

    Awesome list. Another big reason for me is:

    You get to draw general conclusions from your research: There is no doubt that science regularly benefits from the expertise that comes from a person working in a particular system or with a particular organism for an extended period of time. However the results from this kind of research are inherently specific to the system of interest. Being a computational biologist allows you to draw on data from across many different systems and organisms and lets you say things more generally about the patterns and processes of life.

  4. 7 christopherbare (@christopherbare) August 1, 2012 at 7:19 pm

    Nice article. I agree completely, but we shouldn’t tell everyone. Someone has to go into the lab and generate the data.

    • 8 caseybergman August 1, 2012 at 8:22 pm

      Sorry for letting the cat out of the bag :) But with NGS, I don’t think we have to fear being out of work any time soon. In fact, I’d argue if we stopped all data collection in Biology we could still do science for a decade or more.

  5. 9 Irene Gabashvili PhD (@igabashvili) August 1, 2012 at 9:18 pm

    According to FastFuture report, computational biology is not going to be among the most popular jobs of the future, rather a skill that almost all professions will master. Their list includes “Body part maker”, “nanomedic”, “recombinant farmer”, and “memory augmentation surgeon”. So wet bench training might come in handy. Besides, most NGS data will sit untouched and unloved unless there are better methods to collect information about phenotype.

  6. 11 ndsimons August 3, 2012 at 8:06 pm

    Reblogged this on A Human Canvas.

  7. 12 Sebastián August 4, 2012 at 10:51 am

    I agree that biologists should be trained more in computer science or in math beyond adding and subtracting. I had the experience of working on both sides, and here goes my critique for you guys: most bioinformaticians and computational biologists, with few exceptions, think that data comes from heaven, that whatever you find in papers is absolutely true, without even considering the experimental system, because you have no idea how the experiment was done. After dealing with computational biologists for my masters, in a group made up of physicists, I realized that most of them, if not all, think of a problem as a mathematical issue, whose biological output is not meaningful. In my lectures in systems biology the professor showed us several examples of wonderful mathematical models that predicted some behaviour in the cell, but most of them ended in “unfortunately, it didn’t work”. I know that in the lab many things don’t work either, even when the whole path of in vitro-in cell-in animals has been followed, but honestly, if the first step, the transition in silico-in vitro doesn’t work, as happens in most cases, it’s very hard to believe any outcome.

    An additional criticism of your list concerns “publishing more papers”: of course we, as scientists, always want to publish more papers, because unfortunately, that is the measure of success in this business. But for the sake of the science, the discovery, is it the number of papers that determines that you are a good scientist? I personally think that it is better (although harder) to have 1 paper in a PhD, fully controlled, in a relatively high-impact journal, than 4 or 5 papers based on simulations that end up in computational biology journals, which as of today, most researchers do not believe.

    In summary, I’m not totally against computational biology, I especially encourage the training of biology-related scientists in math and programming, but I still think that at least my field, Biochemistry and Cell Biology, is an experimental science that requires people in the lab, regardless of how many simulations or how much data management you can do.

    Best Regards,

    Sebastián

    • 13 Florian Markowetz December 12, 2013 at 10:15 am

      Hi Sebastian, very good points!

      And they are the reason why computational biology and systems biology as theoretical sciences have no future.

      All the successful labs will have a dry-lab and a wet-lab component. In these mixed labs people know how the experiment was done and how to handle the data.

      Florian

  8. 15 Torsten Seemann (@torstenseemann) August 12, 2012 at 1:07 am

    You write: “Computing forces you to not do science in ad hoc ways and skills such as: logically planning experiments, collecting data consistently, developing reproducible methodology, and analysing your data with proper statistical methodology.” (paraphrased)

    I’d argue that it doesn’t force you to any of those. Good scientists do them, no matter what their field of research. Bad scientists of any breed will leave a trail of mess – including computing based ones. Many don’t even use a proper folder/file structure for their raw data, let alone their analyses output, or comments in code :)

  9. 16 Rob August 14, 2012 at 4:47 pm

    On a related note, any suggestions on how an established software developer can break into computational biology? After being in the world of business software for 10 years, I’m interested in switching to computational biology. Do I need to stop my career and go back to school? Or are there other avenues you’d suggest?

  10. 17 caseybergman August 26, 2012 at 9:47 am

    Mason Vail notes on G+ another very good reason for doing computational work: it is safer. “You’re far less likely to spill a pathogen on yourself if you avoid the wet lab altogether.” https://plus.google.com/108557417188243014050/posts/AfgvHuD9TzF

    See also: http://www.biocomicals.com/comics/2012/07/13/

  11. 18 caseybergman August 26, 2012 at 9:50 am

    Another reason that came up in recent discussions is that computational work is more environmentally friendly. Less plastic waste, less energy spent on autoclaving, etc. Perhaps this is offset by increased energy usage of computers, especially clusters. But in terms of workstations, most wet-lab biologists have a dedicated machine on all day anyway, most likely idle or effectively idle from a scientific perspective (e.g. used for surfing the web or Facebook).

  12. 19 Sudeep September 5, 2012 at 5:02 pm

    Istvan’s post on biostars took me here, I have to say I really enjoyed reading it

  13. 21 Alex January 20, 2013 at 12:09 am

    You don’t have to sell compbio to me; I want to make the transition from benchwork (molbio) to compbio, but I think it’s difficult. Why would a PI on a compbio lab hire for a postdoc an experimental biologist with no computational knowledge whatsoever (my case)? I mean, I’d love if there are PIs out there that are willing to take you anyway and bear with you while you learn, but I think it’s just not practical for them

    • 22 Ethan White (@ethanwhite) January 20, 2013 at 3:21 pm

      Alex – it’s true that without any computational background it might be difficult to make the switch unless you have some other knowledge/skill that the lab really needs. But that’s the great thing about computational knowledge, you can learn it yourself relatively easily. Most computational biologists have the equivalent of a semester or two of coursework and some experience working on particular problems. So, I’d recommend finding a good book (I like http://pragprog.com/book/gwpy/practical-programming) or something online and teaching yourself the basics of programming. Then you can do things like start working on Rosalind (http://rosalind.info/about/) problems, which will get you to work on simple to increasingly complex bioinformatics problems, and find a Software Carpentry workshop (http://software-carpentry.org; I’m on the advisory board) or something equivalent nearby.

      Once you’ve picked up some basic skills (and you can do all of the above pretty easily in your spare time in less than a year; probably a semester if you’re dedicated to it) then you just need to demonstrate your new skills. Get a GitHub (http://github.com) account and write some small programs. If you did this and contacted me as a prospective postdoc telling me that you wanted to make the switch and had done all of this work to be prepared to do so, I’d be pretty excited at the prospect of hiring you.

    • 23 caseybergman January 20, 2013 at 8:18 pm

      Ethan has beaten me to the punch on most of the things I would say, but let me add a few more comments to encourage you that the transition from wet->dry as a post-doc is eminently doable.

      First, we are talking about computational biology here, which is a hybrid discipline where (unlike in pure bioinformatics) the emphasis is on the biology not the programming. Learning to think like a biologist takes many years, and as a molecular biologist it is something that you already have in your toolkit. Domain specific knowledge and intuition in biology is typically lacking in someone coming from a computer science/physics/math background. So whereas they have an advantage technically, you have an advantage conceptually. I have trained people in my lab from both backgrounds, and actually lean towards taking molecular biologists with limited (i.e. self-taught) programming skills over computer scientists who don’t know the difference between replication and transcription.

      Second, when it comes to hiring post-docs at most universities and institutes (outside the Harvards, MITs, Cambridges, EMBLs, etc.), it is an open secret among PIs that finding really talented post-docs is a real challenge. From the standpoint of a PhD student coming out, you may think the post-doc market is really competitive and you’d never be able to compete against other people if you don’t have the full set of skills in the job description. But the reality is the applicant pool for a post-doc position is much bigger than the talent pool, and the set of applicants who get shortlisted and considered for jobs are those that have a proven track record of publication and finishing projects. I’d rather take on a molecular biologist who is a proven “finisher” but has limited programming skills, than the most talented programmer who has not demonstrated that they can take a project to completion (i.e. publication).

      So in addition to making sure you have some publications under your belt, I’d do exactly as Ethan suggests and start self-training in programming, use github as your online programming “CV”, and go ahead and apply for some computational biology post-docs. You may be surprised where this path may take you. Good luck!

  14. 24 D March 20, 2013 at 9:28 pm

    I’ve just graduated with a B.Sc. in bioinformatics. Ultimately I would like a career in industry. I have been given an interview for a junior bioinformatician job for a 3-4 year contract. I’ve also been given an opportunity to pursue a Masters/PhD in bioinformatics. If I’m given offers for both, I’m not sure which one to go for. What would you recommend I do?

  15. 26 seemauoh July 25, 2013 at 6:26 am

    I am also passionate about Bioinformatics teaching and research. I have written some in my blog here: http://seemauoh.wordpress.com/2013/03/29/teaching-to-honour-bioinformatics-and-computational-biology-as-a-science-and-an-art/.
    Let me know if you might have something up your sleeve how to teach those students who have no interest at all in Bioinformatics ( majority in wet lab courses where Bioinformatics is also one of the core courses), and they think that wet lab and only the wet lab is the be-all and end-all of scientific life.

  16. 28 Oscar December 3, 2013 at 10:58 am

    Hi,

    I read both articles, To and NOT to get a PhD in Bioinformatics.

    I have a B.S. in Molecular and cell biology, and a Specialization in Programming in Computing.
    Right now I’m in process of finishing a Master in Bioinformatics.

    Honestly speaking, I want to go into industry rather than academia, for my interest and for the money.
    I haven’t touched math for about 10 years, so I am not at all good at math modeling or coming up with new algorithms.
    I’m one of those you called “applied Bioinformatician”, using what people have developed and improving/tweaking it for research. I don’t mind the wet lab part of biology, as I liked doing it during my Bachelor’s, and I’m comfortable doing computer programming, I just don’t know how good I am. I should also mention that I know C++, Perl, R, and a bit of Java and website programming (html, xml, etc.)

    I am thinking about getting a PhD in molecular biology or cancer biology related PhD, is that a good option?
    or should I get a PhD in bioinformatics?

    Thanks in advance!

  17. 29 Heather Vincent March 7, 2014 at 7:37 am

    We run distance training courses suitable for post-docs who need to develop new skills in computational biology. There has been a huge amount of interest in the new NGS one.

    http://octette.cs.man.ac.uk/bioinformatics/

    We usually have a great mix of people from different backgrounds, so different points of view can be covered in the course discussion. Some people use the courses as a step back into work after taking a career break.

    http://octette.cs.man.ac.uk/bioinformatics/students/index.html


  1. 1 Top N Reasons To Do A Ph.D. or Post-Doc in Bioinformatics/Computational Biology | Plant Biology Teaching Resources (Higher Education) | Scoop.it Trackback on August 5, 2012 at 5:40 am
  2. 2 Top N Reasons To Do A Ph.D. or Post-Doc in Bioinformatics/Computational Biology | No silences at lunch | Scoop.it Trackback on August 5, 2012 at 3:25 pm
  3. 3 Top N Twitter Accounts (Academic) to Follow For Genomics Trackback on August 10, 2012 at 2:56 am
  4. 4 Top N Reasons NOT to do a Ph.D. in Bioinformatics/Computational Biology « Homologus Trackback on August 12, 2012 at 12:49 am
  5. 5 Useful for referring–8-12-2012 « Honglang Wang's Blog Trackback on August 13, 2012 at 2:21 am
  6. 6 Top N Reasons To Do A Ph.D. or Post-Doc in Bioinformatics/Computational Biology | plant metabolism | Scoop.it Trackback on August 17, 2012 at 11:52 am
  7. 7 Top N Reasons To Do A Ph.D. or Post-Doc in Bioi... Trackback on September 7, 2013 at 6:11 pm
  8. 8 Several articles about computational biology | Comp Bio Trackback on December 4, 2013 at 10:22 pm
  9. 9 Top N Reasons NOT to do a Ph.D. in Bioinformatics/Computational Biology — cite from homogous | Computational Biology Trackback on December 9, 2013 at 4:45 am
