How to hail a London cabbie using Twitter

This article titled “How to hail a London cabbie using Twitter” was written by Jemima Kiss, for theguardian.com on Wednesday 4th August 2010 07.52 UTC

Next time you’re in London and need a cab, you might like to try tweeting @tweetalondoncab for one. Richard Cudlip, Karl James and a small circle of tech-inclined cabbies have spent the last year building up a black cab service on Twitter, and while Cudlip says they don’t handle more jobs than in their street-hailing days, it’s the data the service generates that is the really interesting part.

There’s now 100 cabbies using tweetalondoncab and nearly 7,000 followers, which means they are nearing a critical mass where the service starts getting really useful with enough cabs to match the number of punters. The drivers are self employed and tweetalondoncab is a voluntary, cooperative project, but the founders want to build it into a business and are looking for funding. They’ve already met Channel 4’s 4ip.

So what’s the real advantage? The account acts as an aggregator for requests, and cabbies can also flag up their location. Interestingly, isn’t too far away from the courier update service idea started Twitter in thefirst place.

“We’re getting more and more bookings, and the quality of bookings is better, with longer trips,” said Cudlip, who says a few minor celebrities use the service because they find a direct message more discreet than flagging down cabs on the street. All the drivers are full licenced black cab drivers with ‘The Knowledge’ – and they now have a tweetalondoncab sticker in the window.

The surprise has been the real-time data, and the value of aggregating and sharing information about demand or surplus around the city – a tube line down for an hour, or too much of a queue at St Pancras. “We didn’t even think of that when we started,”said Cudlip. “In two years, I’d like us to rival the black cab circuits like ComCab and RadioTaxis. We want more information to come in so we can share it with more people, and that information might be useful to other people in the same way TFL’s data is shared.”

The data challenge is quite a temptation for developers – three have already approached the team and suggested a mobile app – but there’s a problem compiling data between a few hundred sole traders that has put developers off so far. Twitter has been the best solution to date, although a couple of developers are experimenting with Foursquare – setting themselves up as a virtual taxi rank and checking in when they are on duty.

That’s pretty smart, but with clued-up, GPS smartphone-enabled cabbies spread across the city, surely that’s just the start. It’s a classic business ripe for disruption. Is anyone up for helping with the challenge?

guardian.co.uk © Guardian News & Media Limited 2010

Published via the Guardian News Feed plugin for WordPress.

Republishing articles from ProPublica

Just testing out the ProPublica article republishing tool.Â Liking the idea of a simple button and copy/paste user experience, but not sure if I’m at risk should ProPublica make corrections on an article that I haven’t reflected in my copy.Â Maybe that’s more of a concern in the libel-happy UK.Â Regardless, I hope more publishers see why this is a good idea.

ProPublica Wins Innovation Award

by Mike Webb ProPublica, July 19, 11:55 a.m.

For the second year in a row, ProPublica has received a Special Distinction Award from the Knight-Batten Awards for Innovations in Journalism. ProPublica’s Distributed Reporting Project was honored for “systematizing the process of crowdsourcing, conducting experiments, polishing their process and tasking citizens with serious assignments.” The judges called it “a major step forward with how we understand crowdsourcing.”

Additionally, ProPublica2019s News Apps were recognized by the Awards as Notable Entries. The judges said apps like our Recovery Tracker, Unemployment Insurance Tracker and Leadership PAC database continue “to pave the way in inventive collaborative work, developing a number of news applications to make data accessible to many.”

The winner of the top prize this year was the Sunlight Foundation’s “Sunlight Live,” which blends data, video, blogging and social networking tools to cover live news events. A complete list of the honorees is available at the Knight-Batten J-Lab website.

Last year, ProPublica’s ChangeTracker received a Special Distinction Award.

Congratulations to all of those recognized for their innovations!

Behind the scenes of the Open Platform’s evolution

When I came to the Guardian two years ago, I brought with me some crazy California talk about open strategies and APIs and platforms. Little did I know the Guardian already understood openness. It’s part of its DNA. It just needed new ways of working in an open world.

Last week, The Guardian’s approach to openness and mutualisation took a giant step forward when we brought the Open Platform out of Beta.

It’s a whole new business model with a new technology infrastructure that is already accelerating our ambitions.

I’ll explain how we got to this point, but let me clarify what we just announced:

We’ve implemented a tiered access model that I think is a first in this space. We have a simple way to work with anyone who wants to work with us, from hobbyist to large-scale service provider and everything in between.
We’ve created a new type of ad network with 24/7 Real Media’s Open AdStream, one where the ads travel with the content that we make available for partners to reuse.
That ad network is going to benefit from another first which is Omniture analytics code that travels with the content, as well.
License terms that encourage people to add value are rare. Using many of the open license principles we developed T&Cs that will fuel new business, not stop it.
Hosted in the cloud on Amazon EC2 the service scales massively. There are no limits to the number of customers we can serve.
The API uses the open source search platform Solr which makes it incredibly fast, robust, and easy for us to iterate quickly.
We introduced a new service for building apps on our network called MicroApps. Partners can create pages and fully functional applications on guardian.co.uk.

We’re using all the tools in the Open Platform for many of our own products, including the Guardian iPad app, several digital products and more and more news apps that require quick turn-around times and high performance levels.

There’s lots of documentation on the Open Platform web site explaining all this and more, but I figured I could use this space to give a picture of what’s been happening behind the scenes to get to this point.

It’s worth noting that this is far from the full picture of all the amazing stuff that has been happening at the Guardian the past 12 months. These are the things that I’ve had the pleasure of being close to.

Beginning with Beta

First, we launched in Beta last year. We wanted to build some excitement around it via the people who would use it first. So, we unveiled it at a launch event in our building to some of the smartest and most influential London developers and tech press.

We were resolute in our strategy, but when you release something with unknown outcomes and a clear path to chaos people get uneasy. So, we created just large enough hurdles to keep it from exploding but a wide enough berth for those who used it to take it to its extreme and to demonstrate its value.

It worked. Developers got it right away and praised us for it. They immediately started building things using it (see the app gallery). All good signs.

Socializing the message

We ran a Guardian Hack Day and started hosting and sponsoring developer events, including BarCamp, Rewired State, FOWA, dConstruct, djugl, Music Hack Day, ScaleCamp, etc.

Next, we knew the message had to reach their bosses soon, and their bosses’ bosses. So, we aimed right for the top.

Industry events can be useful ways to build relationships, but Internet events have been really lacking meaning. People who care about how the Internet is changing the world and who are also actively making that change happen were the types of people we needed to build a long term dialog with.

So, we came up with a new kind of event: Activate Summit.

The quality of the speakers and attendees at Activate was incredible. Because of those people the event has now turned into something much more amazing than what we initially conceived.

Nick Bostrom’s darkly humorous analysis of the likelihood of human extinction as a result of technology haunts me frequently still, but the event also celebrates some brilliant ways technology is making life better. I think we successfully tapped into some kind of shared consciousness about why people invest their careers into the Internet movement…it’s about making a difference.

Developers, developers, developers!

Gordon Brown was wise in his decision to put Tim Berners-Lee and Nigel Shadbolt on the task of opening government data. But they knew enough to know that they didn’t know how to engage developers. Where did they turn for help? The Guardian!

We couldn’t have been more excited to help them get data.gov.uk out the door successfully. It was core to what we’re about. As Free Our Data champion Charles Arthur joked on the way to the launch presentation, “nice of them to throw a party for me.”

We gave them a platform to launch data.gov.uk in the form of developer outreach, advice, support, event logistics, a nice building, etc., but, again, the people involved made the whole effort much more impressive than any contribution we made to it.

Tom Taylor’s Postcode Paper, for example, was just brilliant on so many levels. The message for why open government data could not have been clearer.

Election data

Then when the UK election started to pick up some momentum, we opened up the Guardian’s deep politics database and gave it a free-to-use API. We knew we couldn’t possibly cover every angle of the election and hoped that others could use the Guardian’s resources to engage voters. We couldn’t have asked for a better example of that then Voter Power.

A range of revenue models

All along there were some interesting things happening more behind the scenes, too.

The commercial team was experimenting with some new types of deals. Our ad network business grew substantially, and we added a Food Ad Network and a Diversity Network to our already successful Green Ad network.

It was clear that there was also room for a new type of ad network, a broader content-targeted ad network. And better yet, if we could learn about what happens with content out across the web then we might have the beginnings of a very intelligent targeting engine, too.

24/7 Real Media’s Open Ad Stream and Omniture were ready to help us make this happen. So, we embedded ads and analytics code with article content in the Content API. We’ve launched with some house ads to test it out, but we’re very excited by the possibilities when the network grows.

The Guardian’s commercial teams, including Matt Gilbert, Steve Wing, Dan Hedley and Torsten de Reise, also worked out a range of different partnerships with several Beta customers including syndication, rev share on paid apps, and rev share on advertising. We’re scaling those models and working out some new ones, as well.

It became obvious to everyone that we were on to something with a ton of potential.

Rewriting the API for scale

Similarly, the technology team was busily rebuilding the Content API the moment we realized how big it needed to be.

In addition to supporting commercial partners, we wanted to use it for our own development. The new API had to scale massively, it had to be fast, it had to be reliable, it had to be easy to use, and it had to be cheap. We used the open source search platform Solr hosted on Amazon’s EC2. API service management was handled by Mashery.

The project has hit the desk of nearly every member of the development team at one point or another. Here are some of the key contributions. Mat Wall architected it. Graham Tackley made Mat’s ideas actually work. Graham and Stephen Wells led the development, while Francis Rhys-Jones and Daithi O’Crualaoich wrote most of the functions and features for it. Martyn Inglis and Grant Klopper handled the ad integration. The wonderful API Explorer was written by Francis, Thibault Sacreste and Ken Lim. Matthew O’Brien wrote the Politics API. The MicroApps framework included all these people plus basically the entire team.

Stephen Dunn and Graham Tackley provided more detail in a presentation to the open source community in Prague at Lucid Imagination’s Solr/Lucene EuroCon event.

The application platform we call MicroApps

Perhaps even more groundbreaking than all this is the MicroApp framework. A newspaper web site that can run 3rd party apps? Yes!

MicroApps makes the relationship between the Guardian and the Internet feel like a two-way, read-write, permeable membrane rather than a broadcast tower. It’s a very tangible technology answer to the openness vision.

You can learn more by reading 2 excellent blog posts about MicroApps. Dan Catt explains how he used MicroApps for Zeitgeist. Since most of the MicroApps that exist today are hosted on Google AppEngine, the Google Code team published Chris Thorpe’s insights about what we’re doing with MicroApps on their blog.

The MicroApps idea was born out of a requirement to release smaller chunks of more independent functionality without affecting the core platform….hence the name “MicroApps”. Like many technology breakthroughs, the thing it was intended to do becomes only a small part of the new world it opens up.

Bringing it all together

At the same time our lead software architect Mat Wall was formulating the MicroApp framework, the strategy for openness was forming our positioning and our approach to platforms:

…to weave the Guardian into the fabric of the Internet; to become ‘of‘ the Web, not just ‘on‘ the Web

The Content API is a great way to Open Out and make the Guardian meaningful in multiple environments. But we also knew that we had to find a way to Open In, or to allow relevant and interesting things going on across the Internet to be integrated sensibly within guardian.co.uk.

Similarly, the commercial team was looking to experiment with several media partners who are all thinking about engagement in new ways. What better way to engage 36M users than to offer fully functional apps directly on our domain?

The strategy, technology and business joined up perfectly. A tiered business model was born.

The model

Simon Willison was championing a lightweight keyless access level from the day we launched the Beta API. We tested keyless access with the Politics API, and we liked it a lot. So, that became the first access tier: Keyless.

We offered full content with embedded ads and analytics code in the next access level. We knew getting API keys was a pain. So, we approved keys automatically on signup. That defined the second tier: Approved.

Lastly, we combined unfettered access to all the content in our platform with the MicroApp framework for building apps on the Guardian network. We made this deep integration level available exclusively for people who will find ways to make money with us. That’s the 3rd tier: Bespoke. It’s essentially the same as working in the building with our dev team.

We weren’t precisely clear on how we’d join these things up when we conceived the model. Not surprisingly, as we’ve seen over and over with this whole effort, our partners are the ones who are turning the ideas into reality. Mashery was already working on API access levels, and suddenly the last of our problems went away.

The tiers gave some tangible structure to our partner strategy. The model felt like it just started to create itself.

Now we have lots of big triangle diagrams (see below) and grids and magic quadrants and things that we can put into presentation slides that help us understand and communicate how the ecosystem works.

Officially opening for business

Given the important commercial positioning now, we decided that the launch event had to focus first and foremost on our media partners. We invited media agencies and clients into our offices. Adam Freeman and Mike Bracken opened the presentation. Matt Gilbert then delivered the announcement and gave David Fisher a chance to walk through a deep dive case study on the Enjoy England campaign.

There was one very interesting twist on the usual launch event idea which was a ‘Developer Challenge’. Several members of the development team spent the next 24 hours answering briefs given to us by the media partners at the event. It was run very much like a typical hack day, but the hacks were inspired by the ideas our partners are thinking about. Developer advocate Michael Brunton-Spall wrote up the results if you want to see what people built.

Here is the presentation we gave at the launch event:

(Had we chosen a day to launch other than the same day that Google threw a press release party I think you’d already know all this.)

Do the right thing

Of all the things that make this initiative as successful as it is, the thing that strikes me most is how engaged and supportive the executive team is. Alan Rusbridger, Carolyn McCall, Tim Brooks, Derek Gannon, Emily Bell, Mike and Adam, to name a few, are enthusiastic sponsors because this is the right thing to do.

They created a healthy environment for this project to exist and let everyone work out what it meant and how to do it together.

Alan articulated what we’re trying do to in the Cudlipp lecture earlier this year. Among other things, Alan’s framework is an understanding that our abilities as a major media brand and those of the people formerly known as the audience are stronger when unified than they are when applied separately.

Most importantly, we can afford to venture into open models like this one because we are owned by the Scott Trust, not an individual or shareholders. The organization wants us to support journalism and a free press.

“The Trust was created in 1936 to safeguard the journalistic freedom and liberal values of the Guardian. Its core purpose is to preserve the financial and editorial independence of the Guardian in perpetuity, while its subsidiary aims are to champion its principles and to promote freedom of the press in the UK and abroad.”

The Open Platform launch was a big day for me and my colleagues. It was a big day for the future of the Guardian. I hope people also see that it was a major milestone toward a brighter future for journalism itself.

Socially linked data

The semantic web folks, including Sir Tim Berners-Lee, have been saying for years that the Internet could become significantly more compelling by cooking more intelligence into the way things link around the network.

The movement is getting some legs to it these days, but the solution doesn’t look quite like what the visionaries expected it to look like. It’s starting to look more human.

The more obvious journey toward a linked data world starts with releasing data publicly on the Internet.

Many startups have proven that opening data creates opportunity. And now the trend has turned into a movement within government in the US, the UK and many other countries.

Sir Tim Berners-Lee drove home this message at his 2009 TED talk where he got the audience to shout “Raw data now!”:

“Before you make a beautiful web site, first give us the unadulterated data. You have no idea the number excuses people come up with to hang on to their data and not give it to you even though you’ve paid for it as a taxpayer.”

Openness makes you more relevant. It creates opportunity. It’s a way into people’s hearts and minds. It’s empowering. It’s not hard to do. And once it starts happening it becomes apparent that it mustn’t and often can’t stop happening.

The forward-thinking investors and politicians even understand that openness is fuel for new economies in the future.

We held a sort of hack day type event at the Guardian for the Cabinet Office recently where the benefits to open data in government were catalyzed in the form of a postcode newspaper built together by Tom Taylor, Gavin Bell and Dan Catt:

“Itâ€™s a prototype of a service for people moving into a new area. It gathers information about your area, such as local services, environmental information and crime statistics.”

Opening data is making government matter more to people. That’s great, but it’s just the beginning.

After openness, the next step is to work on making data discoverable. The basic unit for creating discoverability for content on a network is the link.

Now, the hyperlink of today simply says, “there’s a thing called X which you can find over there at address Y.”

The linked data idea is basically to put more data in and around links to things in a specific structure that matches our language:

subject -> predicate -> object

Source: T.J. VanSlyke — Linked data by T.J. VanSlyke

This makes a lot of sense. Rather than derive meaning, explicit relationship data can eliminate vast amounts of noise around information that we care about.

However, there are other ways to add meaning into the network, too. We can also create and derive meaning across a network of linked data with short messages, as we’ve seen happening organically via Twitter.

What do we often write when we post to Twitter?

@friend said or saw or did this interesting thing over here http://website.com/blah

The subject is a link to a person. The predicate is the verb connecting the person and the object. And the object is a link to a document on the Internet.

Twitter is already a massive linked data cloud.

It’s not organized and structured like the links in HTML and the semantic triple format RDF. Rather it is verbose connectivity, a human-readable statement pointing to things and loosely defining what the links mean.

So, now it starts to look like we have some opposing philosophies around linked data. And neither is a good enough answer to Tim Berners-Lee’s vision.

Short messages lack standard ways of explicitly declaring meaning within links. They are often transient ideas that have no links at all. They create a ton of noise. Subjectivity rules. Short messages can’t identify or map to collections of specific data points within a data set. The variey of ways links are expressed is vast and unmanageable.

The semantic web vision seems like a far away place if its dependent on whether or not an individual happens to create a semantic link.

But a structural overhaul isn’t a much better answer. In many ways, RDF means we will have to rewrite the entire web to support the new standard. The standard is complicated. Trillions of links will have to obtain context that they don’t have today. Documents will compete for position within the linked data chain. We will forever be reidenitfying meaning in content as language changes and evolves. Big software will be required to create and manage links.

The issue isn’t about one model versus another. As people found with tags and taxonomies, the two are better when both exist together.

But there’s another approach to the linked data problem being pioneered by companies like MetaWeb who run an open data service called Freebase and Zemanta who analyze text and recommend related links.

The approach here sits comfortably in the middle and interoperates with the extremes. They focus on being completely clear about what a thing is and then helping to facilitate better links.

For example, Freebase has a single ID for everything. There is one ID and one URL that represents Abraham Lincoln:
http://www.freebase.com/view/en/abraham_lincoln

They know that Wikipedia, The New York Times and the Congressional Biography web sites who are all very authoritative on politicians have a single URL representing everything they each know about Abraham Lincoln, too.

So, Freebase maintains a database (in addition to the web site that users can see) that links the authoritative Abraham Lincoln pages on the Internet together.

This network of data resources on Abraham Lincoln becomes richer and more powerful than any single resource about Abraham Lincoln. There is some duplication between each, but each resource is also unique. We know facts about his life, books that are written about him, how people were and still are connected to him, etc.

Of course, explicit relationships become more critical when the context of a word with multiple meanings enters the ecosystem. For example, consider Apple which is a computing company, a record company, a town, and a fruit.

Once the links in a network are known, then the real magic starts to happen when you mix in the social capabilities of the network.

Because of the relationships inherent in the links, new apps can be built that tell more interesting and relevant stories because they can aggregate data together that is connected.

You can imagine a whole world of forensic historians begging for more linked data. Researchers spend years mapping together events, geographic locations, relationships between people and other facts to understand the past. For example, a company called Six to Start has been working on using Google Maps for interactive historical fiction:

“The Six to Start team decided to literally â€œmapâ€ Cummingâ€™s story, using the small annotation boxes for snippets of text and then illustrating movement of the main character with a blue line. As users click through bits of the story, the blue line traces the protagonistâ€™s trajectory, and the result is a story that is at once text-based but includes a temporal dimensionâ€”we watch in real time as movement takes placeâ€”as well as an information dimension as the Google tool is, in a sense, hacked for storytelling.”

Similarly, we will eventually have a bridge of links into the physical world. This will happen with devices who have sensors that broadcast and receive short messages. OpenStreetMap will get closer and closer to providing a data-driven representation of the physical world, built collectively by people with GPS devices carefully uploading details of their neighborhoods. You can then imagine that games developers will make the real world itself into a gaming platform based on linked data.

We’ve gotten a taste of this kind of thing with Foursquare. “Foursquare gives you and your friends new ways of exploring your city. Earn points and unlock badges for discovering new things.”

And there’s a fun photo sharing game called Noticin.gs. “Noticings are interesting things that you stumble across when out and about. You play Noticings by uploading your photos to Flickr, tagged with ‘noticings’ and geotagged with where they were taken.”

It’s conceivable that all these forces and some creative engineers will eventually shrink time and space into a massive network of connected things.

But long before some quasi-Matrix-like world exists there will be many dotcom casualties who have benefitted from the existence of friction in finding information. When those challenges go away, so will the business models.

Search, for example, is an amazingly powerful and efficient middleman linking documents off the back of the old school hyperlink, but its utility may fade when the source of a piece of information can hear and respond directly to social signals asking for it somewhere in the world.

It’s all pointing to a frictionlessness information network, sometimes organized, sometimes totally chaotic.

It wasn’t long ago I worried the semantic web had already failed, but I’ve begun to wonder if in fact Tim Berners-Lee’s larger vision is going to happen just in a slightly different way than most people thought it would.

Now that linked data is happening on a more grassroots level in addition to the standards-driven approach I’m starting to believe that a world of linked data is actually possible if not closer than it might appear.

Again, his TED talk has some simple but important ideas that perhaps need to be revisited:

Paraphrasing: “Data is about our lives – a relationship with a friend, the name of a person in a photograph, the hotel I want to stay in on my holiday. Scientists study problems and collect vast amounts of data. They are understanding economies, disease and how the world works.

A lot of the knowledge of the human race is in databases sitting on computers. Linking documents has been fun, but linking data is going to be much bigger.”

New York Times Iterates Linked Open Data (techstartups.com)
Linking Open Data: An Emerging Practice Area for the Semantic Web (phaneron.rickmurphy.org)
Linked Data and the Semantic Web: What Are They and Should I Care? (slideshare.net)

Positioning real-time web platforms

Like many people, I’ve been thinking a lot about the live nature of the web more and more recently.

The startup world has gone mad for it. And though I think Microsoft’s Chief Software Architect Ray Ozzie played down the depth of Microsoft’s commitment to it in his recent interview with Steve Gillmor, it’s apparent that it’s at the very least a top-of-mind subject for the people at the highest levels of the biggest companies in the Internet world. As it should be.

The live web started to feel more tangible in shape and clearer for me to see because of Google Wave. Two of the Guardian developers here, Lisa van Gelder and Martyn Inglis, recently shared the results of a DevLab they did on Wave.

My brain has been spinning on the idea ever since.

(A DevLab is an internal research project where an individual or team pull out of the development cycle for a week and study an idea or a technology. There’s a grant associated with the study. They then share their findings with the entire team, and they share the grant with the individual who writes the most insightful peer review of the research.)

Many before me have noted the ambition and tremendous scale of the Wave effort. But I also find it fascinating how Google is approaching the development of the platform as a service.

The tendency when designing a platform is to create the rules and restrictions that prevent worst-case scenario behavior from ruining everything for you and your key partners. You release capability gradually as you understand its impact.

You then have to manage the constant demand from customers to release more and more capability.

Google turned this upside down and enabled a wide breadth of capability with no apologies for the unknowns. Developers won’t complain about lack of functionality. Instead it will probably have the opposite effect and invite the developers to tell Google how to close down the risks so their work won’t get damaged by the lawlessness of the ecosystem.

That’s a very exciting proposition, as if new land has been found where gold might be discovered.

But on the other hand, is it also a bit lazy or even irresponsible to put the task of creating the rules of the world that your service defines on the customers of your service? And do those partners then get a false sense of security because of that, as if they could influence the evolution of the platform in their favor when really it’s all about Google?

Google takes no responsibility for the bad things that may happen in the world they’ve created, yet they have retained full authority on their own for decisions about the service.

They’ve mitigated much of their risk by releasing the code as “open source” and allowing Wave to run in your own hosted environment as you choose. It’s a good PR move, but it may not have the effect they want it to have if they aren’t also sharing the way contributions to the code are managed and sharing in the governance.

They list the principles for the project on the site:

Wave is an open network: anyone should be able to become a wave provider and interoperate with the public network
Wave is a distributed network model: traffic is routed peer-to-peer, not through a central server
Make rapid progress, together: a shared commitment to contribute to the evolution and timely deployment of protocol improvements
Community contributions are fundamental: everyone is invited to participate in the public development process
Decisions are made in public: all protocol specification discussions are recorded in a public archive

Those are definitions, not principles. Interestingly, there’s no commitment to opening decision-making itself, only sharing the results of decisions. Contrast that with Apache Foundation projects which have different layers of engagement and specific responsibilities for the different roles in a project. For example,

“a Project Management Committee member is a developer or a committer that was elected due to merit for the evolution of the project and demonstration of commitment. They have write access to the code repository, an apache.org mail address, the right to vote for the community-related decisions and the right to propose an active user for committership.”

That model may be too open for Google, but it would help a lot to have a team of self-interested supporters when things go wrong, particularly as there are so many security risks with Wave. If they are still the sole sponsor of the platform when the first damage appears then they will have to take responsibility for the problem. They can only use the “we don’t control the apps, only the platform” excuse for so long before it starts to look like a cop out.

Maybe they should’ve chosen a market they thought would run with it and offer it in preview exclusively for key partners in that market until Google understood how to position it. With a team of launch partners they would have seemed less autocratic and more trustworthy.

Shared ownership of the launch might also have resulted in a better first use-case app than the Wave client they invented for the platform. The Google Wave client may take a long time to catch on, if ever.

As Ray Ozzie noted,

“When you create something that people donâ€™t know what it is, when they canâ€™t describe it exactly, and you have to teach them, itâ€™s hard…all of the systems, as long as Iâ€™ve been working in this area, the picture that Iâ€™ve always had in my mind is kind of three overlapping circles of technology, social dynamics, and organizational dynamics. And any two of those is relatively straightforward and understandable.”

I might even argue that perhaps Google actually made a very bad decision to offer a client at all. This was likely the result of failing to have a home for OpenSocial when it launched. Plus, it’s never a good idea to launch a platform without a principle customer app that can drive the initial requirements.

In my opinion, open conference-style IM and email or live collaborative editing within docs is just not groundbreaking enough as an end-user offering.

But the live web is fractionally about the client app.

The live web that matters, in my mind, harnesses real-time message interplay via multiple open networks between people and machines.

There’s not one app that runs on top of it. I can imagine there could be millions of client apps.

The Wave idea, whether it’s most potent incarnation is Wave itself or some combination of a Twitter/RabbitMQ mesh or an open XML P2P server or some other new approach to sharing data, is going to blow open the Internet for people once again.

I remember trying very hard to convince people that RSS was going to change the Internet and how publishing works several years ago. But the killer RSS app never happened.

It’s obvious why it feels like RSS didn’t take off. RSS is fabric. Most people won’t get that, nor should they have to.

In hindsight, I think I overvalued RSS but undervalued the importance of the idea…lubricating the path for data to get wherever it is needed.

I suspect Wave will suffer from many of the same issues.

Wave is fabric, too.

When people and things create data on a network that machines can do stuff with, the world gets really interesting. It gets particularly interesting when those machines unlock connections between people.

And while the race is on to come up with the next Twitter-like service, I just hope that the frantic Silicon Valley Internet platform architects don’t forget that it’s about people in the end.

One of the things many technology innovators forget to do is to talk to people. More developers should ask people about their day and watch them work. You may be able to breakthrough by solving real problems that real people have.

That’s a much better place to start than by inventing strategic points of leverage in order to challenge your real and perceived competitors.

Google Wave: Do You Think It Will Succeed? (mashable.com)
On brand consistency and BHAGs (factoryjoe.com)
Google Wave and Social Software 2.0 : Finally the raise of horizontal CESA Platforms and vertical CESApps? (stephanecroisier.jahia.com)

The thinking behind the Activate Summit event

The premise that the Internet is changing everything is only more potent now than it was when many people first considered that it might be true. Today we’re seeing how its capabilities have found their way into the hands of those who are actively changing the world.

But the key questions haven’t yet been played out enough. What does the Internet mean? How far will the changes go? Which aspects of civilization itself will become something different, perhaps even unrecognizable to us today through the pervasive effect of the network?

This is what we want to surface with The Guardian’s Activate Summit. Activate is an event about the people who are uncovering the answers to those questions.

Who are the ‘Activators’?

Weâ€™ve designed the event to get into the heads of the people driving the most important changes in politics, society, technology and the economy. Here are a few examples of the types of people and the things they are doing that we’ll see at the event…

There are new ways to elect our government leaders demonstrated by people like Thomas Gensemer of Blue State Digital who orchestrated Obama’s digital campaign.
Adam Afriyie MP is leading innovation across the public sector for David Cameron. He said in an interview about Activate:

“I’ve started looking at cumbersome Whitehall IT and the way IT policy can be improved to strengthen society and kick-start the digital economy. [Dormant Whitehall data sets] can be re-used by the public, adding both commercial and social value to these public assets.”
Tom Steinberg of MySociety is forcing a new kind of transparency in our government, a perspective we now expect of the publicly funded institutions that serve us in a way that we could only hope for before the Internet existed. And William Heath of Mydex and the Open Rights Group, among other things, is surfacing some of the implications of these changes and how to protect the individual. As he stated in an interview:

“In UK public services it’s clear to me that feedback, transparency and a stronger voice for the individual are all healthy. So I’m very optimistic, but I think we’re only half way there. In e-commerce we’ve tooled up the big organisations. Now we need to get properly tooled up ourselves.”
Iqbal Qadir, Charles Leadbeater and Umair Haque are demonstrating new forms of capitalism and the shape of the new economy for an age of scarcity..
Sugatra Mitra’s Hole in the Wall research that inspired Vikas Swarup to write Slumdog Millionaire demonstrates that education can be refactored into more self organized learning environments. Similarly, Richard Baraniuk is developing new open educational resources to revolutionize knowledge sharing.
Nobel Peace Prize winner Dr. R.K. Pachauri is driving policy change and promoting sustainable development around the globe through his research on climate change at The Energy and Resources Institute.
Innovators like Arianna Huffington and Gerry Jackson are reinventing the news business. Huffington is developing a next generation distributed news organisation, and Jackson, who operates the only non-state run radio station for Zimbabweans, is finding ways to use technology as an invisible medium to bypass censors and tell the important stories on the global stage that would otherwise never be heard.
Researchers like Andy Baio and Jon Udell are uncovering brilliant ways people can use tools to connect with other communities near them both physically and intellectually.
Channel 4’s Matt Locke is empowering young people to deal with issues they face with projects like the International Digital Emmy winner Battlefront. Similarly, William Perrin of the Kings Cross Environment and Talk About Local is networking together community campaigners across the country to help people get things done more effectively.
There are some amazing data-driven projects that are changing the world such as Steve Coast’s OpenStreetMap, a sort of wikipedia of location information which grows richer every day by 10’s of thousands of active volunteers who are creating a collaborative view of the world. And there’s also Gavin Stark’s AMEE project which aims to measure the carbon footprint of everything on earth.
Dr Ian Lipkin is identifying, studying and tracking the trajectory of infectious diseases throughout the globe. And Jay Parkinson is revolutionizing healthcare by changing the way people communicate with their doctors:

“Technology will not solve healthcareâ€™s problems. New business models combined with todayâ€™s technology and transparent market forces will…Healthcare needs to be Amazoned, Zipcarred, Facebooked, Etsyed, Tumblred, Appled, and Zapposed.”
Forward thinking designers like Matt Webb are reintegrating the networked and physical worlds. And Ryan Carson is innovating on the concepts of the social web.
And while John Van Oudenaren is using the Internet to preserve the past, Nik Bostrom is challenging where we’re going at the Future of Humanity Institute and Oxford University.
Of course, the foundation services enabling these visionaries to do their work are in many cases is powered by the accomplishments of people like Werner Vogels at Amazon and Bradley Horowitz at Google who are opening the vast technological capabilities and resources of their organizations.

Crucially, though, the technology behind all these movements is a tool in a larger agenda than the technology itself.

And this is why the event matters now. We’re tying to focus heavily on the do-ers, the type of people who break things to see how they work, people who are committed to larger agendas in life, leaders with global perspectives and deep concerns for the future.

It’s about the people actively changing the world and showing us all how to do it, too, hence the name – ‘Activate’.

Why now?

Brian Eno painted the picture that I hope Activate will convey when he described his Sydney Opera House light display:

“To imply ‘Oh God, there’s a crisis, no time for imagining any more’ – it’s not true. This is the time for imagining…The human ability to imagine made people capable of surviving. By allowing ourselves to let go of the world that we have to be part of every day, and to surrender to another kind of world, we’re allowing imaginative processes to take place.”

But perhaps a more tangible answer to ‘why now’ was captured by John Heilmann who observed via twitter:

“Amazing how much important campaign 08 stuff happend in 06. More amazing how oblivious I was at the time – and I was paying attention!”

I suspect a lot of people feel the same way and wish to recalibrate their perspective of what this revolution is all about. Hopefully, Activate will be the platform for people to reset and point forward again.

The cost of transparency and accountability

The MP Expenses issue is a very interesting story. There are lots of reasons why it is spreading so aggressively…

in hard times people look for someone to blame for what’s wrong in the world
people are learning to expect more transparency from their government
the availability of the data compells the curious to dive into all the detail and look for trends and interesting nuggets
activities are surfacing that people want to understand
the prospect of uncovering abuses that result in the downfall of a politician is too exciting for people who crave gossip to resist
…and on and on

The Internet is optimized for this kind of story.

A big pile of personal data was posted publicly in a usable format. (This data has been available via parliament.co.uk as PDFs for years, but once it’s in a convenient spreadsheet format it suddenly becomes meaningful and very shareable.) People then started finding interesting trends with very little effort. And then we got a very public flame war.

Here’s a snapshot of some of the recent triggers around this issue:

Guardian Data Store announced the publication of raw data via twitter

Tony Hirst built a travel expenses heat map to analyze the anomalies in the data

The debate by the people raged on twitter

The Telegraph launched ‘The Expenses Files’ The Expenses Files — how politicians exploit the system of parliamentary allowances to subsidise their lifestyles and multiple homes.’

Now, despite the fact that it’s incredibly important for this kind of thing to be possible, I think the scale of the conversation about it is very much a distraction. Stephen Fry captured this sentiment in a quip for a BBC journalist:

“Let’s not confuse what politicians get really wrong with things like wars with the rather tedious obsessions about whether or not they charged for wisteria. It’s not that important. It really isn’t. It isn’t what we’re fighting for. It’s a journalistic made up frenzy.” (Only, I disagree that’s a journalistic made up frenzy. The MPs, the public and the mainstream media organizations are all contributing to the noise together.)

If twitter activity can be considered an insight into what MP’s are spending their valuable time thinking about, then this issue is definitely becoming too big. This Tweetminster chart shows that ‘expenses’ are much more relevant today than ‘banks’ amongst MP’s who use twitter, for example:
<br />

Again, I’m happy that we live in a world where our politicians who we pay to represent us are accountable in a very open and public way and that we have the ability to ask them hard questions directly from wherever we sit in the social hierarchy.

Interestingly, this case also provides a view into the cost of openness.

Response to the Open Platform launch

The Open Platform launch earlier this month was one of the more exciting days I’ve had in a long time. We’ve done a good thing at the Guardian, and it seems we’re not alone in thinking that.

Here are some of my favorite Twitter posts about the launch (more here):

@IanYorston “Guardian Open Platform may be the most interesting thing to happen to newspapers in ages”
@netspaze “When major newspapers are closing down, UKâ€™s Guardian is opening up.”
@lisov “Oh my! Take a look at what The Guardian’s done! An open platform!”
@newscred “The Guardian’s Open Platform is an awesome initiative, their support & developer engagement is impressive.”
@dreamingspires “The Guardian open platform is a genius idea. maybe the way forward for newspapers.”
@estragon “Is this the future of journalism?”
@evirtus “Seriously impressed by The Guardian Open Platform! The future of news?”
@kate_butler “THANK YOU http://www.guardian.co.uk/open-platform i love you more than ever.”
@matlock “Guardian Open Platform is a fantastic piece of work. not jealous at all, honestly. 😉 ”
@r1tz “Serious kudos to the Guardian for launching the open API”
@tomskitomski “Thinking that Guardian’s Open Platform is what BBC Backstage could and should have become.”
@adrianholovaty “Super impressed with the Guardian’s API.”
@SamShepherd “Guardian Open Platform – I am a) impressed and b) disheartened. the gulf between the likely-to-survive and the soon-to-be-bankrupt widens”

And here are some of the best quotes from the media and blogosphere (more here):

“The Guardian is showing some guts by embracing new business models instead of clinging on to old, defunct ones.” (Mashable)
“The media brand is less a destination and a magnet to draw people there than a label once youâ€™ve found the content, wherever and however you found it. So the more places you can find it, the better.â€ (Jeff Jarvis)
“a whole bunch of smart, enthusiastic developers are going to leap on the Open Platform API and use it to make products that use news content in interesting ways. And the insights won’t cost Guardian a cent (penny?) in testing or development costs.” (LiveMint)
“Guardian Open Platform is a chasmic leap into the future. It is a work of simplistic beauty that Iâ€™m sure will have a dramatic impact in the news market. The Guardian is already a market leader in the online space but Open Platform is revolutionary. It makes all of their major competitors look timid.” (Tom Watson, MP)

This is my personal favorite…

“If content is king, then this is service is a hundred of the kingâ€™s best horses, and thousands of his best messengers, sending the Guardian far and wide. A misstep online is unlikely to cost the Guardian much, and should only encourage competitors innovationâ€”the industry sure needs it.
With this move, the Guardian redraw of where the boundaries of the newspaper industry lie, using to technology to reach as far as possible. Itâ€™s enough to make Conrad Black spit his prison breakfast all over his email-inbox.” (Bad Idea Magazine)

The feedback we’ve heard while participating in various events this month has also been very very positive.

Phil Wills and Mat Wall were both part of QCon London.
Simon Willison was on a panel at SXSW with The New York Times, NPR, Wired and DayLife talking about “the technical hurdles, the internal arguments, the surprising ways in which people have discovered new ways of looking at the news.”
Simon was also part of Future of Web Apps in Dublin which was a few days before our launch.
I shared some of the thinking behind the Open Platform at Changing Media Summit in London last week in a talk called “The Open Strategy“. That presentation is here:

(Changing Media Summit is “for anyone concerned with creative and commercial success in the digital age. It is aimed at senior executives responsible for strategies in digital, online, new media, mobile, marketing, branding, finance, comms, content, audio and more.”)
We also hosted the Rewired State event earlier in the month, and BarCamp London is going to be in our Kings Place offices this weekend which we’re really looking forward to.

There’s a ton of work to do to move closer to our vision for the platform. If we are going to be successful in weaving the Guardian into the fabric of the Internet, then we need to grow the services we rolled out already and develop the additional services that will round out the offering.

But if the Open Platform launch was about creating the conditions for positive things to happen, then I think we’re off to a really good start.

A few interesting data projects

The contagious data bug must be sweeping through the office, as several very different but very interesting data-driven publishing projects rolled out almost simultaneously.

First, infographics editor Paddy Allen explains the global recession through a very elegant interactive piece “Where did all the money go?“. Paddy has quite a collection of brilliant work from his interactive infographics such as the Energy-hungry houses piece to his storytelling through interactive visualization like the map of Heathrow’s planned 3rd runway.

Second, a strong team led by editor David Leigh has begun posting their investigations into “The tax gap,” a study of tax avoidance by big business.

“It has taken a team of specialists more than three months and involved checking scores of trademark registers and sets of company accounts in Luxembourg, the Netherlands, Switzerland and Ireland.”

One of the many ouputs of the investigation is the raw data that is informing some of the work, such as the interactive guide to corporate tax. For example, you can see what British Airways has reported paying compared with what is notionally due against their stated profits. The information is available in XML format, such as this year-by-year feed.

british-airways-tax-gap-guardiancouk

Third, and this is my personal favorite, the Football guys have outdone themselves with a new feature called Chalkboards. The Guardian’s head of sport Ben Clissett explains:

“No football debate will ever be the same again â€“ it’s not about opinion any more, it’s about facts. And our chalkboards give you the ammunition to settle the argument. You can also compare two players side by side â€“ if you want to compare Robbie Keane and Steven Gerrard in the same position for Liverpool, or Michael Essien and Mikel John Obi for Chelsea.

And when you have built your chalkboard, you can save it and start a discussion with your mates simply by pressing the save button and explaining your point. You can also embed images you have created on your blog, and use the tool with social networking sites.”

For example, I can see clearly for myself that Aston Villa’s draw against Wigan on Saturday was not due to a lack of offense. They had 16 attempts on goal, in fact, 4 on target and 3 shots blocked. The level of detail is amazing. I can also see where the teams focused their passing during the game.

by Guardian Chalkboards

This is the kind of data that typically only team owners and managers have access to. And even though the super fans can keep much of this in their heads, they can’t watch every game.

Now, perhaps the best part of this is the embeddable Chalkboard image. Since much of the Premier League discussion is happening in places all over the Internet, it makes sense to share the Chalkboards that both editors and users are creating both on and off guardian.co.uk.

Simple but very clever.

I love that each of these is so different. But there can be no doubt that data is starting to drive a lot of very creative approaches to the journalism process.

Building communities from Twitter posts

I spent a little time over the last couple of weeks playing around with some Twitter data. I was noticing how several people, myself included, were sharing the funny things their kids say sometimes:

So then I wondered whether there was a way to capture, prioritize and then syndicate the best Twitter posts into a ‘kiddie quote of the day’ or something like that.

My experiment only sort of works, but there are some lessons here that may be useful for community builders out there. Here’s what I did:

Get the quotes: I ran some searches through Twitter Search and collected the RSS feeds from those results to create the pool of content to use for the project. In this case, I used ‘daughter said‘ and ‘son said‘. I put those feeds into Yahoo! Pipes and filtered out any posts with swear words. Then I had a basic RSS feed of quotes to work with.
Prioritize the quotes: I’m not sure the best way to prioritize a collection of sources and content, but the group voting method may do what you want. Jon Udell has another approach for capturing trusted sources using Del.icio.us. For voting, there’s an open source Digg clone called Pligg. I set it up on a domain at Dreamhost (I called it KidTwits…Dreamhost has a one-click Pligg installer that works great) and then pumped the RSS feed I just made into it. In no time I had a view into all the Twitter posts which were wrapped in all the typical social media features I needed (voting, comments, RSS, bookmarking, etc.).
Resyndicate the quotes to Twitter: While you might be able to draw people into the web site, it made more sense in this case to be present where interested people might be socializing already. First, I created a Twitter account called KidTwits. Then I took a feed from the web site and sent it through an auto-post utility called twitterfeed. Now the KidTwits Twitter account gets updated when new posts bubble up to the home page of kidtwits.com.
Link everywhere possible: When building the feed into Pligg I made sure that the twitter ID of each post was captured. This then made it possible to “retweet” with their IDs intact. Thus, the source of the quote would then see the KidTwit posts in their Twitter replies. It works really well. People were showing up at the web site and replying to me on Twitter the same day I began the project.
Again, I used Yahoo! Pipes to clean up and format the feed back out to Twitter to include the ‘RT’ and @userid prefix to each entry. I played around a bit before arriving at this format.

I also included a Creative Commons copyright on all the pages of the web site to make sure the rights ownership issues were clear.

Lastly, I added a search criteria for my feed collector that looks for references to KidTwits. This means people can post directly to the web site either by adding @kidtwits to their posts or #kt. There was already a New Zealand Twitter community forming who began using ‘kt’ to join their posts (short for kiwitweets), but they gave it up. I then had to filter out references to the kidtwits Twitter posts to avoid an infinite loop.
Improve post quality: Now, here’s where things have been failing for me. I can’t think of better search terms to capture the pool of quotes I want, but there are so many extraneous Twitter posts using those words that it seems like I’m getting between 5% and 10% accuracy. Not bad, but certainly not good enough. The good news is that it’s pretty easy to kill the posts you don’t want through the Pligg interfaces. I just don’t have the time or desire to maintain that.
Optimize the site: I then did a bunch of the little things that wrapped up the project. I added Google Analytics tracking, created a simple logo and favicon, customized the Twitter background, and configured Pligg to import the Twitter Search pipe automatically.

There are several things I like and a few I dislike about this little project.

I really like the fluidity of Twitter’s platform. It’s amazingly easy to capture and resyndicate Twitter posts.
I love the effects of the @reply mechanism. I can essentially notify anyone who gets their Twitter post listed on the home page of kidtwits.com without lifting a finger. And they get credit automatically for their post.
I already knew this, but Yahoo! Pipes is just brilliant. I can’t imagine I would have even considered this project without it.
Pligg is pretty good, too. It does everything I want it to do.
I would love to hand over the management of the voting and quality checks to someone else. Voting naturally invites gaming. At the end of the day, however, the quality control and community management function is what makes a community service interesting to people. You can’t automate everything.
I’m actually not a fan of voting approaches to prioritizing content. It will ultimately result in dumbing down the quality. That’s less of an issue for highly niched topics like this, though.
The rights issues are a little weird. This wouldn’t be a problem in forming a community whose purpose is noncommercial naturally. But I’m not sure the Twitterverse would respond well to aggregators that make money off their posts without their knowledge or consent. (To be clear, KidTwits is not and never will be a commercial project…it’s just a fun experiment.)
Auto-retweeting feels a bit wrong. I wouldn’t be surprised if the KidTwits account gets banned. But I have explicitly included the source and clearly labeled each Twitter post with ‘RT’ to be clear about what I’m doing. I’m not building traffic to my account, the web site, nor am I intentionally misrepresenting anything.
By adding “RT @userid” I’ve killed the first 10 or so characters of the post that I’m retweeting. This means the punchline is often dropped which kills the meaning of the retweeted post.
Some conversational Twitter posts get through which include @replies to another user. When the KidTwits retweet of that post goes out it’s very confusing.

The potential here, among other things, is in creating cohesive topical communities around what people are saying on Twitter. You can easily imagine thousands of communities forming in similar ways around highly focused interest areas.

In this method the community doesn’t necessarily have the typical collective or person-to-person dynamics to it, but the core Twitter account can act as a facilitator of connections. It can actually create some of the authority dynamics people have been wanting to see. It becomes a broker of contextually relevant connections.

In a very similar way the web site serves as a service broker or activity driver. It’s a functional tool for filtering and fine-tuning the community experience at the edge. The web site is not a destination but more of a dashboard or a control panel for the network.

The experiment feels very unfinished to me still. There’s much more that can be done to create better activity brokering dynamics across the network through the combination of a Twitter account and a web site, I’m sure.

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28

Share this:

ProPublica Wins Innovation Award

Share this:

Share this:

Related articles by Zemanta

Share this:

Related articles by Zemanta

Share this:

Share this:

Share this:

Share this:

Share this:

Share this: