Hadoop 101 for bioinformaticians: 1-hour crash course, code and slides

Earlier this year (February-April) I ran nine short, one-hour hands-on sessions (five people per session) called Hadoop 101 for bioinformaticians at the Genome Campus, for people from the European Bioinformatics Institute and the Sanger Institute. The participants were bioinformaticians, developers and sysadmins. My idea was to start with a ~20-minute theoretical introduction that would give participants some handles on whether their particular computational problems might fit the MapReduce/Hadoop distributed computing paradigm. This was followed by a ~40-minute practical session, where I aimed to provide enough code and examples to get people started with Hadoop development. I set up a GitHub repo for this, called Hadoop 101 for bioinformaticians, and here are the slides I used throughout: