Google’s Palimpsest project: promiscuous distribution of all science data sets

Google’s Palimpsest project, once realized (in the near future), has the potential to change the way science is done by accepting gigantic (raw?) data sets from all disciplines and making them open and free (including dark data?). Jon Trowbridge of Google Inc. gave a presentation on it at SciFoo 2007 at the Googleplex. That talk was not well documented, but you can download his slides on the project from XTech 2007, held in Paris this May: Making Massive Datasets Universally Accessible and Useful Presentation. You are not restricted to the zip file, as Jon kindly gave permission to publish his slides with SlideShare here. From his intro: This talk will discuss a project underway at Google to collect and distribute large scientific datasets using a 21st century “Sneakernet”: multi-terabyte disk arrays shipped via FedEx and other common carriers.
The project is strictly non-profit, but fits well with Google’s mission.

Other links:

Scifoo: Google and large scientific datasets

Google helps terabyte data swaps

43 thoughts on “Google’s Palimpsest project: promiscuous distribution of all science data sets”

  1. Thanks for posting the slides. It is interesting. This is still very much for very large data volumes, but maybe whatever they build around this (maybe a GBase segment for scientific data) could be used for lower data volumes uploaded via the net.

  2. Pedro,

    At least for now they don’t necessarily have plans to do much with the data other than make it available on the web. Ideally, I think a Freebase/GBase-type approach would be great. With an appropriate API and knowledge of the data structure, people could start building apps, and of course Google would do a great job of indexing the whole thing.

  4. That’s great.
    I am curious what browse/search features Google will provide. It would be nice if the data were well annotated using semantic web technology.

  5. We at the Ctr. for Inherited Disease Research routinely ship data from genome scans to PIs and back-and-forth to NLM/NCBI on large encrypted disk arrays. We also continually archive and will eventually have to delete all the level 0 or “raw” data – the actual image data from which the genomic data is derived. I think someday we will regret deleting this data, since better algorithms are developed every day, yet many of these studies use the very last of the available DNA from a given research subject, who may be dead or otherwise no longer available for more to be extracted. Having someplace to store them for future re-analysis would, imho, be a great service.

    Now, what about the data from extremely high-res 3D scans of the world’s entire collection of several hundred thousand cuneiform tablets, the world’s oldest written records and in many ways the foundational documents of human civilization? It might be a few petabytes or so: http://www.jhu.edu/digitalhammurabi/

  8. Hey guys
    what happened to your project? Did you cancel it? Or does it run under another name? This was a really cool idea.

  9. This is interesting, but sadly 2 years old, and I am still unable to find solid information on the project. Guessing it was abandoned. If not, feel free to contact me. I’ll keep digging and emailing, since I have a 1.2 TB dataset that a lot of people would like to see hosted.
