Building a world wide database using RSS

Adam Bosworth's talk at the MySQL Users Conference in April revealed some interesting thinking going on at Google regarding RSS.  I don't know if he's making a case for using MySQL to take down Oracle, but he's definitely making a case for using MySQL to leverage RSS to expose more data on the Internet.

He starts by discussing ways in which database engines need to scale in a more distributed way as opposed to centralizing functions into monolithic and isolated structures.  The key to scaling databases is to simplify and standardize how databases present data and then let them talk openly and freely.  Databases should be flexible enough to serve up data however and whenever the wider world wants to use it whether that is a machine making queries, a PHP programmer creating web pages that talk to databases or the end-user trying to find information.

Paraphrasing: "What's not open today is how you talk to the database.  Databases are not something that anybody with any kind of query engine can talk to.  The lesson of the last ten years is that you open up your data and amazing things start happening.  If we make it easy to serve up any kind of information you can think of, if that information is effortlessly available, then you won't just be querying your own tables.  You'll be querying any information that is out there.  If this happens, just as we saw the web change computing, we will see an equally important change in computing happen again.  But we need a simple grammar that everyone can understand, a simple, perhaps sloppy, scalable, and updatable format for data."

RSS and Atom are the data formats that make this vision possible.  These feed formats are going to be to data what HTML was to content.  Anyone who can code an HTML web page can build an RSS feed.  Furthermore, a URL can carry the query that tells the site what a user wants to receive.  If the results come back as RSS, you suddenly have an open database model that anybody on the planet can use at any time, on any device, with any web-based software.
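To make the "query in the URL, results as RSS" idea a little more concrete, here is a rough sketch in Python.  It is my own illustration, not anything Bosworth showed: the example.com endpoint, the parameter names (q, max_price) and the sample feed contents are all made up, and the feed is a hand-written string rather than a real server response.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, for illustration only (not a real service).
BASE_URL = "http://example.com/products.rss"


def build_query_url(base_url, **params):
    """Express the query as ordinary URL parameters (sorted for a stable URL)."""
    return base_url + "?" + urlencode(sorted(params.items()))


# Made-up RSS 2.0 document standing in for what such a site might return.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Query results</title>
    <link>http://example.com/products</link>
    <description>Items matching the query</description>
    <item>
      <title>iPod mini</title>
      <link>http://example.com/products/1</link>
      <description>Example item with a made-up description</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Turn each <item> in the feed into a plain dict of tag -> text."""
    root = ET.fromstring(feed_xml)
    return [
        {child.tag: (child.text or "") for child in item}
        for item in root.iter("item")
    ]


url = build_query_url(BASE_URL, q="ipod", max_price="200")
items = parse_items(SAMPLE_FEED)
print(url)
for row in items:
    print(row["title"], row["link"])
```

The point of the sketch is only that the "query engine" is a URL and the "result set" is a feed, so any client that can fetch a URL and read XML can treat the site like a table.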

Bosworth goes on to explain that this model, though compelling, is not scalable if datasets are locked to individual servers.  Data needs to reside on a federation of servers, but changes need to propagate out.  This is a harder problem to solve, and, unfortunately, he didn't have enough time to explain his vision for the solution.

The implications of this model are pretty exciting.  It means that big centralized databases in a few locations will be displaced by more, smaller databases across the world wide web.  It means that more people will be able to build better online services with fewer technical hurdles.  HTTP gave the web a skeleton.  HTML and the browser gave it a skin.  RSS will give it muscles and create some very powerful connective tissue.

This will clearly help search engines return specific data points rather than just pages that have content on them.  People want to know the time a class starts, the price of an iPod or the ingredients of a Powerbar, but search engines only help us find pages that link to pages that may or may not contain the specific data point we want.  Tabbed browsing may help people compare search results and locate the best answer, but there has to be a way to avoid all the HTML clutter and use just the data we want.

He's a good speaker who makes a lot of sense.  Too bad he doesn't share more thoughts on his blog.

(I've been a podcast fan for a while now, but it has suddenly become a necessity now that my commute has increased from 5 minutes to over an hour each way.  As you can see from this post, I'm getting caught up on some great stuff that I probably would have missed otherwise.)


