O'Reilly Product Metadata Interface
Most publishers are familiar with the ONIX standard for exchanging metadata about books among trading partners. Anyone who's actually spent time working with ONIX knows that its syntax is abstruse at best. While ONIX does use XML, there are more modern, more general, and more immediately comprehensible standards out there, particularly for the basic details like "author," "title," and "edition."
One of those standards is RDF, or "Resource Description Framework." This experimental O'Reilly Product Metadata Interface (OPMI) exposes RDF for all of O'Reilly's titles, organized by ISBN. Here's a snippet of the RDF metadata for iPhone: The Missing Manual, 2e from the OPMI at http://opmi.labs.oreilly.com/product/9780596521677:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <om:Product xmlns:om="http://purl.oreilly.com/ns/meta/"
              rdf:about="urn:x-domain:oreilly.com:product:9780596521677.BOOK"
              xmlns:dc="http://purl.org/dc/terms/"
              xml:lang="en">
    <dc:isFormatOf rdf:resource="urn:x-domain:oreilly.com:product:955988693.IP"/>
    <dc:issued>2008-08-13</dc:issued>
    <dc:creator>
      <rdf:Seq rdf:ID="creator">
        <rdf:li rdf:resource="urn:x-domain:oreilly.com:agent:pdb:350"/>
      </rdf:Seq>
    </dc:creator>
    <dc:rightsHolder>David Pogue</dc:rightsHolder>
    <dc:description>The new iPhone 3G is here, and bestselling author David Pogue is back with a thoroughly updated edition of <em>iPhone: The Missing Manual</em>. With its faster downloads, touch-screen iPod, and best-ever mobile Web browser, the new affordable iPhone is packed with possibilities. But without an objective guide like this one, you'll never unlock all it can do for you. Each custom designed page helps you accomplish specific tasks for everything from web browsing, to new apps, to watching videos.</dc:description>
    <dc:extent>376 pages</dc:extent>
    <dc:type rdf:resource="http://purl.org/dc/dcmitype/PhysicalObject"/>
    <dc:format>6 x 9 in</dc:format>
    ...
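To give a feel for how approachable these fields are, here is a minimal sketch that reads a few Dublin Core values with nothing but Python's standard library. It runs against a trimmed, well-formed copy of the snippet above rather than the live service; the `SAMPLE` string and the `dublin_core_fields` helper are illustrative names, not part of the OPMI.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the snippet above.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "om": "http://purl.oreilly.com/ns/meta/",
    "dc": "http://purl.org/dc/terms/",
}

# A trimmed copy of the snippet, closed off so that it parses.
# Assumption: the real document carries many more elements than this.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <om:Product xmlns:om="http://purl.oreilly.com/ns/meta/"
              rdf:about="urn:x-domain:oreilly.com:product:9780596521677.BOOK"
              xmlns:dc="http://purl.org/dc/terms/" xml:lang="en">
    <dc:issued>2008-08-13</dc:issued>
    <dc:rightsHolder>David Pogue</dc:rightsHolder>
    <dc:extent>376 pages</dc:extent>
    <dc:format>6 x 9 in</dc:format>
  </om:Product>
</rdf:RDF>
"""


def dublin_core_fields(rdf_xml):
    """Return {rdf:about URN: {Dublin Core field: text}} per om:Product."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % NS["rdf"]
    dc_prefix = "{%s}" % NS["dc"]
    products = {}
    for product in root.findall("om:Product", NS):
        fields = {}
        for child in product:
            # Keep only Dublin Core elements that carry plain text;
            # elements that point at other resources are skipped.
            text = (child.text or "").strip()
            if child.tag.startswith(dc_prefix) and text:
                fields[child.tag[len(dc_prefix):]] = text
        products[product.get(about)] = fields
    return products


for urn, fields in dublin_core_fields(SAMPLE).items():
    print(urn)
    for name, value in sorted(fields.items()):
        print("  %s: %s" % (name, value))
```

A dedicated RDF library would handle the graph structure more faithfully; plain XML parsing is used here only to keep the sketch dependency-free.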
In addition to Dublin Core, the OPMI includes elements from:
- FOAF ("Friend of a Friend"), for describing the people who contribute to a title
- MARC Relator codes, used to describe how a person contributed to a work, such as Cover Designer or Editor
- MODS (Metadata Object Description Schema), used for all sorts of... no, not really; it's used for exactly one thing: to specify the edition of the work
rdf:about="urn:x-domain:oreilly.com:product:9780596521677.SAF"
rdf:about="urn:x-domain:oreilly.com:product:9780596153960.EBOOK"
rdf:about="urn:x-domain:oreilly.com:product:9780596521677.BOOK"
rdf:about="urn:x-domain:oreilly.com:product:9780596801007.APP"
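The `rdf:about` values shown here all read as a fixed prefix, an identifier, a dot, and a short suffix. Assuming that pattern holds in general (it is inferred from these examples, not from a published rule), a small sketch like the following pulls the two parts apart. The `split_product_urn` name is hypothetical, and the meaning of each suffix is not spelled out here, so the code treats it as an opaque label.

```python
# Prefix shared by every product URN shown in this article.
URN_PREFIX = "urn:x-domain:oreilly.com:product:"


def split_product_urn(urn):
    """Split a product URN into (identifier, suffix).

    Assumption: the part after the prefix is "<identifier>.<SUFFIX>",
    as in the examples in this article.
    """
    if not urn.startswith(URN_PREFIX):
        raise ValueError("not an O'Reilly product URN: %r" % urn)
    identifier, dot, suffix = urn[len(URN_PREFIX):].rpartition(".")
    if not dot or not identifier or not suffix:
        raise ValueError("no '<identifier>.<SUFFIX>' part in %r" % urn)
    return identifier, suffix


EXAMPLES = [
    "urn:x-domain:oreilly.com:product:9780596521677.SAF",
    "urn:x-domain:oreilly.com:product:9780596153960.EBOOK",
    "urn:x-domain:oreilly.com:product:9780596521677.BOOK",
    "urn:x-domain:oreilly.com:product:9780596801007.APP",
]

for example in EXAMPLES:
    identifier, suffix = split_product_urn(example)
    print("%-6s %s" % (suffix, identifier))
```

Note that the identifier is not always a 13-digit ISBN: the `dc:isFormatOf` target in the earlier snippet uses `955988693.IP`, which is why the sketch does not validate the identifier's length.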
The URLs are structured by ISBN. Once you have the ISBN for an O'Reilly book, you can get the full metadata via HTTP request to:
To get you started, here are direct links to the public RDF for our current top five bestsellers:
- Mac OS X Leopard: The Missing Manual
- iPhone: The Missing Manual, 2e
- iPod: The Missing Manual, 7e
- Head First HTML with CSS & XHTML
- Photoshop Elements 7: The Missing Manual
After working through those five, start exploring with the ISBN of a title you find interesting. Identifying which ISBNs interest you is up to you for now, but we're brainstorming on filtering and querying. Here's a snapshot of more than 1,100 titles available from the O'Reilly Store on 12 February 2009, if you'd like to go whole hog (be kind to our servers, please). You can find an O'Reilly ISBN on the back of any of your O'Reilly books, while reading on Safari Books Online, on the oreilly.com Store, or on Amazon in the "Product Details" section. For this example, let's use Practical RDF's ISBN, 9780596002633.
Using your favorite browser, programming language, or command-line utility, do an HTTP GET of http://opmi.labs.oreilly.com/product/9780596002633.
Client                                      Server
  |                                           |
  | 1.) GET to OPMI URI                       |
  |------------------------------------------>|
  |                                           |
  | 2.) 200 OK                                |
  |     RDF Product Representation            |
  |<------------------------------------------|
  |                                           |
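The same exchange can be sketched in a few lines of Python using only the standard library. The helper names `opmi_url` and `fetch_rdf` are hypothetical, the 13-digit check simply mirrors the ISBNs used in this article, and the `Accept` header is an assumption: nothing here says whether the service looks at it.

```python
import urllib.request

# URL prefix taken from the example request in this article.
BASE_URL = "http://opmi.labs.oreilly.com/product/"


def opmi_url(isbn):
    """Build the product URL for an ISBN, tolerating hyphens and spaces.

    Assumption: the service is keyed by 13-digit ISBNs, as in every
    example shown in this article.
    """
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected a 13-digit ISBN, got %r" % isbn)
    return BASE_URL + digits


def fetch_rdf(isbn, timeout=10):
    """Step 1 and 2 of the diagram: GET the URL, return the body as bytes."""
    request = urllib.request.Request(
        opmi_url(isbn),
        # Assumption: asking for RDF/XML explicitly is harmless even if
        # the server ignores content negotiation.
        headers={"Accept": "application/rdf+xml"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


print(opmi_url("9780596002633"))
# To perform the request itself (needs network access and a live service):
# body = fetch_rdf("9780596002633")
```

Building the URL is kept separate from fetching it so that the first part can be checked without touching the network, which also makes it easier to be kind to the servers.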
You'll get back an RDF/XML document containing all the metadata not only for the product you asked about, but for all of its directly related products as well. For example, while we asked about the print form of Practical RDF, the document we get back also includes information about the eBook and Safari Books Online versions of the product. There will also be a foaf:Person record for every person who contributed to the work, as well as an author biography.
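As a rough sketch of what "one document, several resources" means in practice, the code below lists every product and every person found in a response. The response body is HYPOTHETICAL: it reuses URNs that appear earlier in this article (from the iPhone example, not Practical RDF), the person's name is a placeholder, and the layout of the real `foaf:Person` records is an assumption. Only the namespace URIs and the `summarize` logic are meant to carry over.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OM = "http://purl.oreilly.com/ns/meta/"
FOAF = "http://xmlns.com/foaf/0.1/"

# HYPOTHETICAL response body, for illustration only. The URNs are taken
# from examples in this article; the name is a placeholder and the real
# records are assumed to contain far more detail.
SAMPLE_RESPONSE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:om="http://purl.oreilly.com/ns/meta/"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <om:Product rdf:about="urn:x-domain:oreilly.com:product:9780596521677.BOOK"/>
  <om:Product rdf:about="urn:x-domain:oreilly.com:product:9780596153960.EBOOK"/>
  <om:Product rdf:about="urn:x-domain:oreilly.com:product:9780596521677.SAF"/>
  <foaf:Person rdf:about="urn:x-domain:oreilly.com:agent:pdb:350">
    <foaf:name>Placeholder Name</foaf:name>
  </foaf:Person>
</rdf:RDF>
"""


def summarize(rdf_xml):
    """Return (product URNs, [(person URN, name)]) in document order."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % RDF
    # iter() searches the whole tree, so this works whether records sit
    # at the top level or nested inside one another.
    products = [p.get(about) for p in root.iter("{%s}Product" % OM)]
    people = [
        (person.get(about), person.findtext("{%s}name" % FOAF))
        for person in root.iter("{%s}Person" % FOAF)
    ]
    return products, people


products, people = summarize(SAMPLE_RESPONSE)
print("Products:")
for urn in products:
    print("  " + urn)
print("People:")
for urn, name in people:
    print("  %s (%s)" % (name, urn))
```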
If you're frightened by RDF, we have a few links that might help. The RDF Primer from the W3C is an excellent, if dense, guide to getting started with RDF. Our own Practical RDF was read extensively by our developers while working on our applications. We've posted the second chapter, RDF: Heart and Soul, an overview of RDF, to get your feet wet. For more advanced topics, including reasoning and the use of OWL in conjunction with RDF, we found Semantic Web for the Working Ontologist to be invaluable.
Happily, there are a large number of open source tools for working with RDF. Some of the ones we use here at O'Reilly are:
- The Tabulator Extension, a Firefox extension that allows for good visualization and browsing of RDF data.
- The Jena framework, for our Java-based applications.
- RDFLib, for our Python-based applications.
There's a lot more we'll be doing here to both provide more data and to add some human-friendly views into the data, but we wanted to let this out in the wild, if a bit unpolished, in time with the 2009 TOC Conference. If you're familiar with XML and RDF and don't mind poking around among the angle brackets, we'd love to hear what you come up with!
We've described a very, very simple use in O'Reilly Product Metadata Interface (OPMI) Usage At Tweet Length.
Stay tuned to the O'Reilly Labs blog for updates and more information on experimental projects from O'Reilly.