Pimm – Partial immortalization

A Biotech Geek (micro)Blogger’s adventures through science, technology and the web…

  • email me

    [attilacsordas][at][gmail.com]
  • licence

    Creative Commons License

Google’s Palimpsest project: promiscuous distribution of all science data sets

Posted by attilachordash on September 25, 2007

Google’s Palimpsest project, once realized (in the near future), has the potential to change the way science is done by accepting gigantic (raw?) data sets from all disciplines and making them open and free (including dark data?). Jon Trowbridge from Google Inc. gave a presentation on it at SciFoo 2007 at the Googleplex. That talk is not well documented, but you can download his slides on the project as presented at XTech 2007 in Paris this May: Making Massive Datasets Universally Accessible and Useful. You are not restricted to the zip file, as Jon kindly gave permission to publish his slides on SlideShare here.

From his intro: This talk will discuss a project underway at Google to collect and distribute large scientific datasets using a 21st century “Sneakernet”: multi-terabyte disk arrays shipped via FedEx and other common carriers.
The project is strictly non-profit, but fits well with Google’s mission.
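
To get a feel for why a “Sneakernet” can make sense, here is a rough back-of-the-envelope sketch in Python. The numbers are my own assumptions for illustration, not figures from Jon's slides: a 3 TB disk array, a 24-hour courier transit, and a 10 Mbit/s upload link. The function names are made up for this example.

```python
def sneakernet_throughput_mbps(payload_tb: float, transit_hours: float) -> float:
    """Effective throughput (megabits per second) of shipping disks by courier."""
    bits = payload_tb * 1e12 * 8          # decimal terabytes -> bits
    seconds = transit_hours * 3600        # hours -> seconds
    return bits / seconds / 1e6


def network_transfer_hours(payload_tb: float, link_mbps: float) -> float:
    """Hours needed to push the same payload over a network link."""
    bits = payload_tb * 1e12 * 8
    return bits / (link_mbps * 1e6) / 3600


if __name__ == "__main__":
    payload_tb = 3.0       # assumed size of one shipped disk array
    transit_hours = 24.0   # assumed door-to-door courier time
    link_mbps = 10.0       # assumed sustained upload speed

    shipped = sneakernet_throughput_mbps(payload_tb, transit_hours)
    uploaded = network_transfer_hours(payload_tb, link_mbps)
    print(f"Courier: about {shipped:.0f} Mbit/s effective")
    print(f"Upload:  about {uploaded / 24:.1f} days at {link_mbps:.0f} Mbit/s")
```

Under these assumptions the courier wins by a wide margin, and the gap grows with every extra disk in the box. The sketch ignores the time needed to copy data on and off the disks, which would narrow the gap somewhat.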

Other links:

Scifoo: Google and large scientific datasets

Google helps terabyte data swaps

24 Responses to “Google’s Palimpsest project: promiscuous distribution of all science data sets”

  1. Deepak said

    Attila

    This is great. Thanks for posting.

  2. [...] Google about Google’s efforts in this direction. While the talk from Scifoo is not available, Attila got permission to upload Jon’s talk up on Slideshare. The presentation is quite similar to the talk at [...]

  3. Thanks for posting the slides. It is interesting. This is still very much for very large data volumes, but maybe whatever they build around this (maybe a GBase segment for scientific data) could be used for lower data volumes uploaded via the net.

  4. Deepak said

    Pedro,

    At least for now they don’t necessarily have plans to do much with the data other than make it available on the web. Ideally, I think a Freebase/GBase type approach would be great. With an appropriate API and knowledge of the data structure, people could start building apps and, of course, Google would do a great job of indexing the whole thing.

  5. [...] Google’s Palimpsest project [...]

  6. [...] (12:01pm): Attila Csordas of Pimm has a lot more details on the project, including a set of slides that Jon Trowbridge of Google gave at a presentation in [...]

  7. [...] “Palimpset”. The Wired piece also links to this blog, “Pimm”, which has a presentation about this project available on Slideshare. Pimm’s  blog said that this project is strictly nonprofit (I [...]

  8. [...] For more on Google and its data efforts, keep tabs on Attila’s blog, including his post on the promiscuous distribution of large datasets [...]

  9. [...] annotation and commentary solution. What does that mean, exactly? Heck if I know. Venture over to this Pimm blog post to cycle through a brief slide show to get some measure of what one will likely encounter on launch [...]

  10. [...] data on http://research.google.com/. Alexis Madrigal mentions more details in a post on Wired. Pimm’s post on the same topic displays the device to send data on slide 10/16. Google seems to plan to collect [...]

  11. [...] (including about why Google intend to import data by shipping RAID arrays around the world) here and (more up to date) [...]

  12. [...] The project will be open to all scientists who wish to share their data with the community. Given the volume of data to be transported, the project will exploit the old computing adage: “nothing has the bandwidth of a truck driving down the highway loaded with hard disks”, and that is what they will do: scientists will receive a case containing a black box onto which to load 3 TB of data, which will then be physically shipped to Google for inclusion in the large database. The dataset the project will launch with is the Hubble Space Telescope photographs. More details and some slides on the topic are available here. [...]

  13. [...] Wired is reporting that Google will begin hosting terabytes of open-source data at http://research.google.com (the project was supposed to open this week but missed the deadline, and will debut soon). The space will be free to scientists and the data will be available to all. The project is known as Palimpsest. The storage will allow scientists to explore amazing amounts of data. Tons more information on the project is available at Pimm. [...]

  14. [...] Csordas of Pimm has more details on the project, including a set of slides that Jon Trowbridge of Google showed at a presentation in [...]

  15. [...] [via Wired and Pimm] [...]

  16. [...] and discuss their studies and data with other scientists around the world. The project is named Palimpsest, and will be run from the domain [...]

  17. Google is playing to win in the 700 MHz auctions

    Many say Google will bid to lose in the upcoming 700 MHz auctions and many more are equivocating. The idea is Google’s entry alone will induce enough openness, and besides they couldn’t afford to become an operator. This shows a

  18. Laser said

    That’s great.
    I am curious what browse/search features Google will provide. It would be nice if the data were well annotated using semantic web technology.

  19. [...] “Wired” reports, the project named Palimpsest was actually supposed to go online last week. But the launch had to be postponed, and is to [...]

  20. We at the Ctr. for Inherited Disease Research routinely ship data from genome scans to PIs and back-and-forth to NLM/NCBI on large encrypted disk arrays. We also continually archive and will eventually have to delete all the level 0 or “raw” data – the actual image data from which the genomic data is derived. I think someday we will regret deleting this data, since better algorithms are developed every day, yet many of these studies use the very last of the available DNA from a given research subject who may be dead or otherwise no longer available to extract more. Having someplace to store them for future re-analysis would, imho, be a great service.

    Now, what about the data from extremely high-res 3D scans of the world’s entire collection of several hundred thousand cuneiform tablets, the world’s oldest written records and in many ways the foundational documents of human civilization? It might be a few petabytes or so: http://www.jhu.edu/digitalhammurabi/

  21. [...] method works well for all large datasets. A presentation by Jon Trowbridge at SciFoo (slides available here) makes a compelling argument that disk hardware capacity has consistently outpaced network [...]

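  22. [...] Google has revealed a new project aimed at the scientific community: Palimpsest, at research.google.com, for storing terabytes (3 TB at first, possibly expanding later to 20 TB) of open scientific data. Scientists can store data for free and anyone can access it freely; the site was already pre-tested by scientists last August. Palimpsest will be based on the data visualization technology developed by Trendalyzer, the data analysis company Google acquired last year, plus Google's own information inspection and query algorithms. The new site will offer YouTube-style annotation and commenting features. Here is a review, with slides, written last year by a biologist who took part in the testing. [...]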

  23. [...] by Google engineer Jon Trowbridge at SciFoo 2007 (the slides from a later version of the talk are archived on the Partial Immortalization blog), the project was going to store, for free, some of the world’s largest scientific [...]

  24. [...] And additional information here January 20, 2008 | Filed Under Google, Open Source, OpenAcademic, hosting, research, science, [...]
