Big list of Markov chain Monte Carlo (MCMC) applications

I have become quite obsessed with Markov chain Monte Carlo methods lately. It is said that MCMC methods form the most frequently used class of algorithms in computer science. However, when I searched for a comprehensive list of MCMC applications across different domains, to my surprise I found none. So I’d like to ask…

Hadoop 101 for bioinformaticians: 1 hour crash course, code and slides

Earlier this year (February-April) I ran nine short, 1-hour hands-on sessions (5 people per session) called Hadoop 101 for bioinformaticians at the Genome Campus for people from the European Bioinformatics Institute and the Sanger Institute. The participants were bioinformaticians, developers and sysadmins. My idea was to start with a ~20-minute theoretical introduction so it provides some handles on whether…

Changing the game: absolute protein quantification by relating histone mass spec signals to DNA amounts and cell numbers

One thing systems biologists want is to have, by and large, absolute protein concentrations or copy numbers per cell available cheaply for their models, which leverage all sorts of omics data. It looks like such results can now be easily delivered, based on a study published on the 15th of September by the Mann lab in Molecular & Cellular…

Pleasingly Parallel MCMC: cracked wide open for MapReduce and Hadoop

MCMC methods guarantee an accurate enough result (say, parameter estimation for a phylogenetic tree). But they usually give it to you only in the long run, and many burn-in steps might be necessary before they perform well. And if the data size grows larger, the number of operations to draw a sample grows larger too (N -> O(N)…

2 recent Global Alliance for Genomics and Health standard candidates: ADAM and Google Genomics

The Global Alliance for Genomics and Health includes more than 150 health and research organizations working to accelerate the secure and responsible sharing of genomic and clinical data. GA4GH (for short) is something you will hear about more and more in the near future. In the context of genomics standards, think mainly of data formats and code to access and process…