He starts by discussing how database engines need to scale in a more distributed way, rather than centralizing functions into monolithic, isolated structures. The key to scaling databases is to simplify and standardize how they present data and then let them talk openly and freely. Databases should be flexible enough to serve up data however and whenever the wider world wants to use it, whether that is a machine making queries, a PHP programmer creating web pages that talk to databases, or an end user trying to find information.
Paraphrasing..."What's not open today is how you talk to the database. Databases are not something that anybody with any kind of query engine can talk to. The lesson of the last ten years is that you open up your data and amazing things start happening. If we make it easy to serve up any kind of information you can think of, if that information is effortlessly available, then you won't just be querying your own tables. You'll be querying any information that is out there. If this happens, just as we saw the web change computing, we will see an equally important change in computing happen again. But we need a simple grammar that everyone can understand, a simple, perhaps sloppy, scalable, and updateable format for data."
RSS and Atom are the data formats that will open up this new vision and make it possible. These feed formats are going to be to data what HTML was to content. Anyone who can code an HTML web page can build an RSS feed. Furthermore, URLs can carry the query that tells the site what a user wants to receive, and if the results are returned as RSS, then you suddenly have an open database model that anybody on the planet can use at any time, on whatever device, using any web-based software.
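To make the idea concrete, here is a minimal Python sketch of what "query in the URL, results as RSS" might look like from the client side. This is my own illustration, not anything Bosworth showed: the `example.com` endpoint, the `q` and `format` parameter names, and the sample feed with its prices are all invented placeholders, and the feed is embedded as a string so the sketch runs without a network connection.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_query_url(base_url, **params):
    """Encode the query as ordinary URL parameters (names are hypothetical)."""
    # Sort the parameters so the same query always yields the same URL.
    return f"{base_url}?{urlencode(sorted(params.items()))}"


# Invented sample of what such a site might return for the query above.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Results for: iPod</title>
    <link>http://example.com/products</link>
    <description>Hypothetical query results</description>
    <item>
      <title>iPod 20GB</title>
      <link>http://example.com/products/1</link>
      <description>299.00</description>
    </item>
    <item>
      <title>iPod mini</title>
      <link>http://example.com/products/2</link>
      <description>249.00</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Turn each RSS <item> into a plain dict, like a row in a result set."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "description": item.findtext("description"),
        }
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    print(build_query_url("http://example.com/products", q="iPod", format="rss"))
    for row in parse_items(SAMPLE_FEED):
        print(row["title"], row["description"])
```

The point of the sketch is that nothing here is database-specific: the "query language" is just URL parameters, and the "result set" is just a list of feed items that any XML parser can read.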
Bosworth goes on to explain that this model, though compelling, is not scalable if datasets are locked to individual servers. Data needs to reside on a federation of servers, with changes propagating out among them. This is a harder problem to solve, and, unfortunately, he didn't have enough time to explain his vision for the solution.
The implications of this model are pretty exciting. It means that big centralized databases in a few locations will be displaced by more, smaller databases across the World Wide Web. It means that more people will be able to build better online services with fewer technical hurdles. HTTP gave the web a skeleton. HTML and the browser gave it a skin. RSS will give it muscles and create some very powerful connective tissue.
This will clearly help search engines identify better results in terms of specific data points rather than just pages that have content on them. People want to know the time when a class starts or the price of an iPod or the ingredients of a Powerbar, but search engines only help us find pages that link to pages that may or may not contain the specific data point we want. Tabbed browsing may help people compare search results and locate the best answer to a search, but there has to be a way to avoid all the HTML clutter and use just the data we want.
He's a good speaker who makes a lot of sense. Too bad he doesn't share more thoughts on his blog.
(I've been a podcast fan for a while, but it has suddenly become a necessity now that my commute has increased from 5 minutes to over an hour each way. As you can see from this post, I'm getting caught up on some great stuff that I probably would have missed otherwise.)