Hadoop 101 for bioinformaticians: 1 hour crash course, code and slides

Earlier this year (February to April) I ran nine short, 1-hour hands-on sessions (5 people per session) called Hadoop 101 for bioinformaticians at the Genome Campus for people from the European Bioinformatics Institute and the Sanger Institute. The participants were bioinformaticians, developers and sysadmins. My idea was to start with a ~20-minute theoretical introduction so it provides some handles on whether…

Pleasingly Parallel MCMC: cracked wide open for MapReduce and Hadoop

MCMC methods guarantee an accurate enough result (say, parameter estimation for a phylogenetic tree). But they usually deliver it only in the long run, and many burn-in steps might be necessary before they perform acceptably. And as the data size grows, the number of operations needed to draw a sample grows too (N -> O(N)…