10 October 2009
Web2.0, a definition
You may have heard the term Web2.0, which was popularised in 2004. If you ask experts what it means you'll probably get differing answers depending on who you ask, because there is no clear, agreed definition of it. So this is mine.
There are two main features of Web2.0 which distinguish it from sites that aren't Web2.0.
- Web2.0 is about people creating their own content for publishing online
- it is also about the supporting technology for this content
It is easier to explain Web2.0 if you set it in the context of what came before.
In the early days of the web, despite it originally being conceived as a document sharing and editing environment, the editing part rarely happened. Early sites were generally about a company, organisation or individual producing content, publishing it on their website and then people reading that content or transacting with it, e.g. reading the news on-line or buying a book.
However, following the emergence of blogs it became easier for a larger number of people to author their own content and have others comment on it, just as you can do here. Similarly, Amazon allowed others to post their own reviews. This activity, together with the very long-standing Internet tradition of news groups, forums, bulletin boards and so on going back to the 1970s, came together to form the early implementation of what we now call Web2.0.
Most people think of Web2.0 as twitter, facebook and other similar sites: a social platform which allows them to publish their own content easily and share it with their friends. However, this facility has been around on-line for almost 30 years. The invention of usenet groups in 1979 made it possible to share content online easily, and from my own personal experience I used to run a mailing list called Gaelic-L, founded in 1989, which allowed people with similar interests to share content with their online connections even way back then. In 1990 I also proposed an early browser with user generated content and personalised news, based on the fact that many people were by that time doing much of that anyway.
Web2.0 is therefore more than just being able to publish content and share it with your friends, which has been possible for decades. It's also about the types of technology that make it happen and how these combine. In the early days if I wrote an article in a newsgroup, people might reply to it. With Web2.0 you can not only reply to it but you might be able to vote on it and even edit the original. This is how wikipedia works - people collaborate using a wiki as a tool for sharing information, and the articles in a wiki are often authored by several people rather than just one.
Similarly it wasn't just that blogs made it easy for people to write their own content. The platforms they used to write their blogs held and published the content in a structured way, and this allowed the content to be easily reused in other contexts using a technology called RSS (Really Simple Syndication). This means that you don't have to go to the blog to read the post: you can pick up notifications of new posts via an RSS reader or another website entirely. Sites can also publish a programming interface called an API which can support the same functionality as RSS and more besides. RSS feeds are particularly useful for following new content - e.g. new news articles, new blog posts or more specialised searches such as new jobs matching your requirements on a job board. API calls are better for more generalised searches, e.g. "how many twitter users are based in Edinburgh", "Who posted the first tweet about Michael Jackson's death" or "give me the data to plot a graph of the number of times President Obama's Nobel prize was mentioned in the hours after the announcement was made".
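To make the RSS idea a little more concrete, here is a minimal sketch in Python of what an RSS reader does: it parses the structured feed and picks out the posts it hasn't seen yet. The feed content, the example.org addresses and the function names are all made up for illustration; a real reader would download the feed from the blog rather than use a built-in string.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed (assumption: a real reader would fetch this
# from the blog's feed address instead of hard-coding it).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <link>http://blog.example.org/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>Web2.0, a definition</title>
      <link>http://blog.example.org/web20-definition</link>
      <pubDate>Sat, 10 Oct 2009 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>An older post</title>
      <link>http://blog.example.org/older-post</link>
      <pubDate>Thu, 01 Oct 2009 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


def new_items(feed_xml, seen_links):
    """Return only the items whose link is not in the set of links already seen."""
    return [(title, link) for title, link in read_items(feed_xml)
            if link not in seen_links]


if __name__ == "__main__":
    # Pretend the older post was picked up on a previous visit.
    already_seen = {"http://blog.example.org/older-post"}
    for title, link in new_items(SAMPLE_FEED, already_seen):
        print(title, link)
```

Because the feed is structured, the same few lines work for any site that publishes RSS, which is exactly what makes the content so easy to reuse elsewhere.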
As an example of RSS in action, my posts here automatically feed out to twitter and friendfeed. My friendfeed is then published on my facebook pages. This sharing of data across many sites and applications and interpreting the content in different ways is one of the key distinguishing features of web2.0 over web1.0. This is quite a long post, too long for the 140 character limit for twitter, but the connection between my blog and twitter takes care of that. Similarly when I post something new to the photo sharing platform Flickr, it also appears via a link on Twitter even though twitter doesn't directly support photos - the sites all interact with the same content but in different ways.
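I don't know exactly how the connection between a blog and twitter is implemented, but the general idea can be sketched in a few lines of Python: take the post's title and link from the feed and shorten the title until the two fit within 140 characters. The function name and the example address are hypothetical.

```python
TWEET_LIMIT = 140  # the twitter character limit mentioned above


def to_tweet(title, link, limit=TWEET_LIMIT):
    """Fit 'title link' into the limit, shortening the title if needed."""
    room = limit - len(link) - 1  # one space separates the title from the link
    if room < 4:
        raise ValueError("link leaves no room for a title")
    if len(title) > room:
        # Cut the title and mark the cut with three dots.
        title = title[: room - 3].rstrip() + "..."
    return f"{title} {link}"


if __name__ == "__main__":
    print(to_tweet("Web2.0, a definition", "http://blog.example.org/p/1"))
```

The full post stays on the blog; twitter only ever sees the short summary and the link back to it.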
Taking this example of data sharing further, you can combine (mash) information from different sites to produce something new. This is called a mashup. An example might be pulling in data from Google maps, geotagged photos from Flickr, public rights of way information from the government or council, and accommodation information and reviews from a hotel booking site. Combining this publicly available data would allow you to show walks overlaid on a map, together with examples of the views you could expect to see along the way and recommended places to stay en-route.
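As a rough sketch of the walking mashup, the Python below joins three sets of records by location. The records are invented stand-ins for what each site's feed or API might return, the field names are assumptions, and a real mashup would fetch the data live and draw it on a map rather than return a list.

```python
from math import asin, cos, radians, sin, sqrt

# Made-up records standing in for data from three different sites.
WALKS = [{"name": "Riverside path", "lat": 55.95, "lon": -3.19}]
PHOTOS = [
    {"title": "View from the bridge", "lat": 55.951, "lon": -3.191},
    {"title": "Beach at sunset", "lat": 50.72, "lon": -1.88},
]
HOTELS = [
    {"name": "Example Guest House", "lat": 55.953, "lon": -3.188},
    {"name": "Faraway Inn", "lat": 57.48, "lon": -4.22},
]


def distance_km(a, b):
    """Great-circle distance in km between two records (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (a["lat"], a["lon"], b["lat"], b["lon"]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(h))  # 6371 km is the mean radius of the Earth


def mashup(walks, photos, hotels, radius_km=2.0):
    """For each walk, attach the photos and hotels within radius_km of it."""
    combined = []
    for walk in walks:
        combined.append({
            "walk": walk["name"],
            "views": [p["title"] for p in photos
                      if distance_km(walk, p) <= radius_km],
            "stay": [h["name"] for h in hotels
                     if distance_km(walk, h) <= radius_km],
        })
    return combined


if __name__ == "__main__":
    for entry in mashup(WALKS, PHOTOS, HOTELS):
        print(entry)
```

None of the three sources knows about the others; the mashup works only because each one publishes its data with a location attached.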
So Web2.0 is about people creating content (blogs, photos, statuses) together with the supporting technology (facebook, wikis, twitter) allowing this content to be shared, connected and reused in many different ways. It isn't really about endless "beta", rounded graphics, pastel shades and large fonts although these are incidental elements of the Web2.0 scene.
Just as there's no single definition of Web2.0, there is even less clarity about what might come next for Web3.0. The leading view is that it will be about the semantic web. This represents a bigger challenge than Web2.0 because it is about taking the largely unstructured and often ambiguous content on the web and tagging it in ways that allow it to be more clearly defined and reused. For instance if I type London Bridge into Google, there is no way at present to distinguish whether I meant the actual bridge itself, the railway station with the same name, the underground station with the same name, the hospital with the same name or the bridge that got shipped to Arizona. Another example is differentiating text with a particular meaning from the same text that occurs by coincidence - e.g. a Digital Will is a type of Will (a legal document for when someone dies) that covers digital assets such as your emails, photos, MP3s, on-line contacts, etc. However, if you search for this term in Google you get some references to the legal document but also the same phrase occurring in entirely different contexts such as "Digital will overtake print" and "Western Digital will move to Irvine". The semantic web will not only help to classify how words are used from a linguistic point of view but it will also allow content to be queried as data - for instance on a restaurant website you could mark up your opening hours and this would allow people to search using a semantic search engine for restaurants open at a particular time of day. The biggest challenges faced by Web3.0 are in agreeing the common vocabularies and then deploying them effectively across the billions of web pages that already exist.
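The restaurant example shows why marked-up data matters: once opening hours are data rather than free text, the search becomes a simple comparison. Here is a minimal sketch in Python, assuming a semantic search engine has already extracted the hours from the marked-up pages. The restaurants and field names are invented, and no particular vocabulary is implied.

```python
from datetime import time

# Hypothetical structured data, as it might look after being extracted from
# marked-up restaurant pages. The field names are invented for this sketch.
RESTAURANTS = [
    {"name": "Example Bistro", "opens": time(12, 0), "closes": time(22, 0)},
    {"name": "Early Bird Cafe", "opens": time(7, 0), "closes": time(15, 0)},
]


def open_at(restaurants, when):
    """Return the names of restaurants whose opening hours include `when`.

    Simplification: hours that run past midnight are not handled.
    """
    return [r["name"] for r in restaurants
            if r["opens"] <= when < r["closes"]]


if __name__ == "__main__":
    print(open_at(RESTAURANTS, time(8, 30)))
    print(open_at(RESTAURANTS, time(13, 0)))
```

The hard part is not the query but getting everyone to publish their hours in the same agreed form, which is the vocabulary problem described above.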
As you can see, although Google is quite good at finding pages containing certain terms, it is currently very poor at making sense of the data in a structured way. This is because without the data being marked up in a semantic way (either through the use of markup directly or by attempting to deduce the context), it is an exceptionally difficult task for a search engine to provide this functionality. Web3.0 will make this job a lot easier but the means by which Web3.0 will emerge is still unclear. What we do know, though, is that it should make searching for information a lot more powerful and specific. Google is also exceptionally poor at searching sites that already have structure - for instance if I wanted to find a hotel room for tonight I would use an accommodation search engine, and Google would find me the site which listed the accommodation rather than the accommodation itself. Google can't tell me what rooms are available tonight but it can point me towards sites that are likely to have this information. This will all change with Web3.0, and the use of intermediary sites will significantly decline as the information they hold begins to open up to more generalised search engines.
I hope this has been helpful. If anyone is looking for a Web2.0 or Web3.0 specialist, please get in touch via firstname.lastname@example.org, twitter, facebook or linkedin.
I do Internet things, manage large websites, play around with language, campaign for good causes, try to explain things and have fun singing along the way (not all at the same time!).