Wednesday, April 2, 2014

Raspberry Bioinformatics: what's wrong with it?



I rarely get upset about someone being wrong on the Internets because, well, my patience is finite, and if I spend it on trivial things I will run out of it pretty fast. However, sometimes one needs to react.

While the article I am going to quote was published last August, a colleague pointed it out to me only today.

Simon Harold writes on the BioMed Central blog about the coming revolution in bioinformatics education, all because of the mighty Raspberry Pi (at the time of this writing, the RPi site is still in the throes of its April Fool's joke, but hopefully it will recover):


"Now, an open access, open learning method developed by Daniel Barker and colleagues aims to strip this teaching back to basics by using the newly-developed Raspberry Pi computing system to let students experience full administrator rights and gain valuable insights into real-world bioinformatics. The low-costs involved (each computer typically costs around £30/$40/€35) also means that large-scale teaching may be achieved without university costing departments having to worry about whether their laptops will be returned in full working order at the end of the semester."
Elsewhere in the text, the article identifies the actual issues with bioinformatics education, specifically with teaching Computer Science to biologists: there are few courses; it is hard to fit the necessary content into a single course (impossible, actually: if we cannot teach future computer scientists the entirety of computer science in one course, why would it be possible for biologists?); and there is a lack of faculty with the right set of skills. These are all real issues.

None of these is addressed by using Raspberry Pis as the computers of choice for teaching bioinformatics. The two features of Raspberry Pis listed in the quote above, their price point and the immediate admin privileges they confer, are not actually relevant to training biologists in computing. Let me break it down:

  1. Cost. First and foremost, I do not buy the premise that biologists do not have access to better computers. If we are talking universities, every student, or almost every student, has access to either a personal laptop or a desktop (or both), regardless of their major. Similarly, universities feature computing facilities, be they CS labs or university-run computer labs, in which the computers are perfectly adequate for the amount of software development that needs to happen in any bioinformatics course.
  2. Cost revisited. The way Raspberry Pis are postulated to be the next best thing in bioinformatics education also raises the actual end cost of the computing environment significantly. Yes, the Pi board itself is $35 or so, plus another $10 for a nice-looking enclosure. Yet, if you actually want to use the Pi as a desktop, here are some other things that need to be purchased to make it work:

      • a monitor ($150 - $200)
      • a keyboard ($20 - $40)
      • a mouse ($15 - $30)
      • various cords and wires ($10 - $20)

    Optionally, there may be a need for a wi-fi dongle and a few other accessories to turn a Pi board into a desktop computer capable of doing what is implied in the article. So you are looking at an investment of at least $200-250. That's only about $100 less than some pretty reasonable, and certainly more powerful, laptops.
  3. Privileges. Yes, popping in an SD card and booting a Pi gives you root. This is a great way to train budding Unix system administrators. How is this relevant to my ability to teach a group of biologists to implement the Smith-Waterman algorithm in (say) Python? Fewer than before, now that everyone and their grandmother are spinning up AWS instances, but still a sizable percentage of computer scientists and software developers go through their professional careers without ever having root privileges or sudo on anything other than their home machine (if they have one that runs Linux). Why in the world would we start teaching computer science to biologists with Unix administration?
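
For a sense of scale, the exercise mentioned in item 3 might look roughly like the sketch below. It is an illustration only, not course material: the function name smith_waterman, the scoring values (match=2, mismatch=-1, gap=-1), and the simple linear gap penalty are all my assumptions. It uses nothing beyond a stock Python interpreter, so it behaves the same on any machine, with or without root.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment of strings a and b.

    Returns (best_score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions; gaps use a simple linear penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                        # start a fresh alignment here
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTTT", "GGACGGG"))
```

Nothing in it depends on the hardware or on admin rights; the hard part for students is the dynamic programming, not the machine.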
More important, and more glaring to me in all of this, is the underlying assumption that the computer you are using is the deciding influence on how you learn Computer Science. It is not and never should be. While clearly we don't want to teach "would-be bioinformaticians" (to quote the author) to program single-tape Turing machines, the actual conventional hardware they will use to run their programs, written in conventional programming languages, is irrelevant. A Pi is no better for this purpose than any other machine running Linux or, perhaps, any other machine running any other conventional OS. Python and Java are ubiquitous these days.

I have my own set of strongly held opinions about bioinformatics education, having been on the front lines for a while. These opinions boil down to the following two observations:
  1. Trained computer scientists and software engineers are better at designing and implementing any software, including software for bioinformatics purposes. If you need software written, use professionals in the field of computing if possible.
  2. It takes at least the equivalent of a CS minor for someone to be a competent software developer. If biologists want to be competent software developers, they need to go through about that much CS education.
There is no magic bullet here, and even if there were, Raspberry Pis ain't it for bioinformatics education.

Sorry.
