Changing the game: absolute protein quantification by relating histone mass spec signals to DNA amounts and cell numbers

One thing systems biologists want is to have, by and large, absolute protein concentrations or copy numbers per cell available cheaply for their models, which leverage all sorts of omics data. It looks like such results can now be delivered easily, based on a study published on the 15th of September by the Mann lab in Molecular & Cellular Proteomics entitled “A ‘proteomic ruler’ for protein copy number and concentration estimation without spike-in standards”.

This is the best, and certainly the simplest, idea in proteomics I’ve seen in a while and, assuming it is on safe methodological ground, it has the potential to disrupt the field of quantitative proteomics and give a final push to dry, label-free approaches over wet and expensive labeling methods.

On one hand, the ratio of a particular protein’s (or group of proteins’) cellular abundance to the total protein mass can be approximated with mass spectrometry proteomics as the proportion of the mass spec (MS) signal intensity of that protein to the total MS signal. On the other hand, the amount of cellular DNA can be approximated as 6.5 pg for a diploid human cell, using the human genome size multiplied by the average mass of a base pair. The number of cells in a sample can then be inferred by dividing the total DNA mass by the DNA mass per cell, and the cellular protein mass follows by dividing the total protein mass by the number of cells. At this point comes the out-of-the-box, simple and terrific idea: if the group of proteins picked is the histones, then their cumulative mass, i.e. the summed MS signal of all histone-derived peptides, can serve as a proxy for the amount of DNA, since the mass of the DNA is roughly equal to the combined mass of the histones. So we have

$$\text{cellular protein mass} \approx \text{DNA mass per cell} \times \frac{\text{total MS signal}}{\text{histone MS signal}}$$
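To make the arithmetic concrete, here is a minimal Python sketch of the reasoning above. It is my own illustration, not code from the paper: the function names are hypothetical, the haploid genome size (3.2e9 bp) and average base pair mass (615.9 Da) are assumed values picked to land near the 6.5 pg quoted above, and the MS signal numbers are made up.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def dna_mass_per_cell_pg(haploid_genome_bp, ploidy=2, avg_bp_mass_da=615.9):
    """DNA mass per cell in picograms: genome size x ploidy x base pair mass.

    avg_bp_mass_da is an assumed average mass of one base pair in daltons.
    """
    grams = haploid_genome_bp * ploidy * avg_bp_mass_da / AVOGADRO
    return grams * 1e12  # g -> pg


def protein_mass_per_cell_pg(total_ms_signal, histone_ms_signal, dna_mass_pg):
    """Cellular protein mass in picograms, taking histone mass ~ DNA mass."""
    if histone_ms_signal <= 0:
        raise ValueError("histone MS signal must be positive")
    return dna_mass_pg * total_ms_signal / histone_ms_signal


# Assumed haploid human genome size of ~3.2e9 bp.
dna_pg = dna_mass_per_cell_pg(3.2e9)

# Made-up intensities: histones carry 3% of the total MS signal.
protein_pg = protein_mass_per_cell_pg(1.0e12, 3.0e10, dna_pg)

print(f"DNA per cell: {dna_pg:.2f} pg")
print(f"Protein per cell: {protein_pg:.1f} pg")
```

Note that only the ratio of total to histone signal matters here, so the absolute scale of the intensities drops out.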

Wiśniewski et al. used four mouse cell lines to test this argument, and the cellular protein masses obtained were within a factor of 1.24 ± 0.29 of the values derived from actual cell counting. One peculiarity of histones is, obviously, their many post-translational modifications (PTMs) such as phosphorylation, acetylation and methylation; since modifications are usually not searched for extensively in mass spec data, the histone MS signal could be skewed by them. So the authors searched the data both with and without these modifications, and it turns out that, with the exception of histone H3, the cumulative histone fraction changed by only 5-10%, so the modifications could be ignored in the calculations. According to the measurements, the histone fraction can be recovered in a stable manner at a depth of ~12,000 or more peptides. With this, the mass per cell of each protein can be estimated as its MS-signal fraction multiplied by the cellular protein mass. And so protein copy numbers per cell can be calculated as

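$$\text{copies per cell} \approx \frac{\text{protein MS signal}}{\text{total MS signal}} \times \text{cellular protein mass} \times \frac{N_A}{\text{molar mass of the protein}}$$

Applying this to different mouse organ proteomes (brain, liver, thymus), they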

“found the quantitative results of our proteomic ruler approach to be typically within a factor of two of precision measurements or literature values.”
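The copy number step described above can be sketched in a few lines of Python. Again this is only my illustration under stated assumptions: the function name is hypothetical, and the example inputs (a protein carrying 0.1% of the total signal, 200 pg of protein per cell, a molar mass of 50 kDa) are made up rather than taken from the paper.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_cell(protein_ms_signal, total_ms_signal,
                    cellular_protein_mass_pg, molar_mass_da):
    """Estimate protein copies per cell from its share of the MS signal."""
    # Mass of this protein per cell: signal fraction x cellular protein mass.
    mass_g = (protein_ms_signal / total_ms_signal) * cellular_protein_mass_pg * 1e-12
    # Mass -> moles -> molecules (a molar mass in Da equals g/mol).
    return mass_g / molar_mass_da * AVOGADRO


# Made-up example values, for illustration only.
n_copies = copies_per_cell(
    protein_ms_signal=1.0e9,
    total_ms_signal=1.0e12,
    cellular_protein_mass_pg=200.0,
    molar_mass_da=50_000.0,
)
print(f"{n_copies:.3g} copies per cell")
```

With these inputs the estimate comes out at roughly 2.4 million copies per cell, which shows how quickly a small signal fraction turns into a large molecule count.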

This approach makes the isotope-labeled spike-in reference peptides, cell counting and protein concentration measurements used for absolute protein quantification look slightly out of date. It also opens up the space for re-analysing already deposited whole-proteome mass spec datasets in search of systems biology clues, guided by cellular and absolute protein copy number estimates.

The associated dataset is available via PRIDE/ProteomeXchange under the identifier PXD000661.