Sage Bionetworks Update: building an OA standard for human disease biology

Sage Bionetworks is a not-for-profit organization developing an open-access “pre-competitive” platform for networked and annotated models of human disease. It’s a huge and unparalleled bioinformatics enterprise: it starts with an anonymous $5 million donation and will soon make high-throughput, large-scale human and mouse biological data (largely from Merck) available, on a scale comparable to what is already in the public domain today. The co-founders are real big shots: Stephen Friend, a former successful Merck executive, and Eric Schadt, now Chief Scientific Officer of Pacific Biosciences, who is “an industry leader in network biology with a number of high-profile publications over the past 5 years that have energized the systems biology community.”

For the last couple of months only minimal information was available on the Sage website, but interested scientists can now get the big picture in more detail via a significant update.

The strong motivation behind it is to build an open-access standard platform for human disease biology because

human disease biology has no common languages, no accessible communal repositories and no government, corporate or foundation investment in generating an inclusive resource….The experimental data underlying disease biology, like the genome itself, needs to be open access because the data is simply the beginning of the process….

Human disease biology is so complex, interconnected and expensive to research that the existing dominant business strategies of building and patenting unique models need to be replaced by a common standard. Like the internet, disease biology models will gain strength by their very nature as public platforms for interoperability and communication – this approach is at the very heart of that strength.

At the heart of the Sage model are the so-called Global Coherent Datasets, which will for the first time be available to scientists working all around the world. We’re talking about a real goldmine here for researchers:

[Image: Sage disease models]

And if that doesn’t sound good enough for a start, then the following Sage datasets will be available in 1 to 2 years:

  • Extended D&O datasets to include >2,000 additional mouse and >1,000 additional human individuals (totals: >3,000 & >3,500 respectively)
  • Extended neurological datasets for mouse to include sleep, anxiety and depression traits
  • Extended CVD phenotypes for mouse and human to include additional relevant tissue-types (kidney, arterial wall, heart, plaque, etc)
  • Age-related phenotypes relevant to sarcopenia (Mouse)
  • Oncology datasets relevant to hepatocellular carcinoma including paired tumor/adjacent normal tissue and networks predictive of outcome. Also breast cancer and colon cancer datasets
  • Human/mouse datasets relevant to respiratory and inflammatory disease

You name it.

As an approximate, working definition, Global Coherent Data Sets are those “where DNA variation, genome-wide molecular phenotypes such as gene-expression, and clinical phenotypes have been measured in a sizable population of genetically diverse individuals.” Sizable: hundreds of individuals at least.
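To make that working definition a bit more concrete, here is a minimal Python sketch of how one might check whether a collection of measurements qualifies. Everything in it is an assumption for illustration: the function names, the dictionary layout (individual ID mapped to measurements), and in particular the `min_individuals=200` cutoff, which is just one reading of "hundreds of individuals at least". It is not taken from Sage.

```python
def coherent_individuals(genotypes, expression, phenotypes):
    """Return the IDs of individuals measured in all three data layers.

    Each argument is a dict keyed by individual ID (hypothetical layout).
    """
    return set(genotypes) & set(expression) & set(phenotypes)


def is_globally_coherent(genotypes, expression, phenotypes, min_individuals=200):
    """Check the working definition: DNA variation, molecular phenotypes and
    clinical phenotypes all measured in a sizable shared population.

    min_individuals is an assumed threshold, not a figure from Sage.
    """
    shared = coherent_individuals(genotypes, expression, phenotypes)
    return len(shared) >= min_individuals


# Toy data: 250 genotyped individuals, 230 with expression data,
# 210 with clinical phenotypes (IDs are made up).
genotypes = {f"ind{i}": "..." for i in range(250)}
expression = {f"ind{i}": "..." for i in range(230)}
phenotypes = {f"ind{i}": "..." for i in range(210)}

shared = coherent_individuals(genotypes, expression, phenotypes)
print(len(shared), is_globally_coherent(genotypes, expression, phenotypes))
```

The point of the intersection is that only individuals with all three layers count toward "sizable"; a large genotype panel with expression data for a handful of people would not qualify under this reading.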

Meet the Sage Repository and Commons concepts: Sage Commons will contain the Global Coherent Data Sets, the network models derived from those sets, and the analytical methods and code used to generate the network models. Sage Repository will contain the same triplets, only ones that “may not be annotated or coherent to the degree required for the Sage Commons, but which are nevertheless useful to the biological research community.”


How can you use the data once it’s available, probably starting around 2010? Two case studies are mentioned (identifying a novel target for obesity and repositioning a drug), of which I’d like to emphasize drug repurposing:

The pharmaceutical industry has an ever expanding portfolio of compounds that have been shown to be safe in human testing and to effectively modulate particular targets. In some cases these drugs show efficacy and go on to become marketed drugs whilst in other cases they may fail to show efficacy precluding their development for the selected indication. In either case there is a significant value to the industry in finding new indications for drugs with these characteristics (safe & modulate a known target) as a means to realize value on investment. Using network models tied to disease traits it is possible to generate potential new indications for compounds. Figure 2 illustrates how this was done for a compound originally developed for asthma. A molecular signature for the drug was interrogated against disease networks and a significant enrichment of the drug signature was noted for a particular pancreatic islet network linked to obesity and insulin resistance traits in a mouse F2 population. This led to the prediction that this drug may modulate these phenotypes through an effect on the islet module. This was tested in a mouse DIO model in which the compound helped normalize insulin and glucose levels.
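The quoted passage does not say which statistic was used to "interrogate" the drug signature against the disease networks. A common choice for this kind of question is a hypergeometric (one-sided Fisher) enrichment test, so here is a rough Python sketch under that assumption. The gene IDs, module names and function names are all hypothetical and only serve to show the idea: count how many signature genes fall in each network module and ask how surprising that overlap is.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, module_size, signature_size, overlap):
    """P(X >= overlap) when signature_size genes are drawn without replacement
    from a universe that contains module_size module genes."""
    max_overlap = min(module_size, signature_size)
    total = comb(universe_size, signature_size)
    tail = sum(
        comb(module_size, k) * comb(universe_size - module_size, signature_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def rank_modules(signature, modules, universe):
    """Rank network modules by enrichment of the drug signature.

    Returns (module name, overlap, p-value) tuples, smallest p-value first.
    """
    universe = set(universe)
    signature = set(signature) & universe
    results = []
    for name, genes in modules.items():
        genes = set(genes) & universe
        overlap = len(signature & genes)
        p = hypergeom_enrichment_p(len(universe), len(genes), len(signature), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Toy example with made-up gene IDs: 20 genes, two modules of 5 genes each.
universe = [f"g{i}" for i in range(20)]
modules = {
    "islet_module": ["g0", "g1", "g2", "g3", "g4"],    # hypothetical module
    "other_module": ["g10", "g11", "g12", "g13", "g14"],
}
drug_signature = ["g0", "g1", "g2", "g3", "g10"]        # hypothetical signature

ranked = rank_modules(drug_signature, modules, universe)
for name, overlap, p in ranked:
    print(f"{name}: overlap={overlap}, p={p:.4g}")
```

In a real analysis the universe would be all genes measured on the platform, there would be many modules, and the p-values would need a multiple-testing correction. A module that comes out on top would then be a hypothesis to test experimentally, as was done with the mouse DIO model in the case study.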

