Wednesday, June 18, 2008

The next revolution in data

I came across a CNET article today discussing Google's challenges in determining what the user actually wants when searching. We have data. We have search engines that can go through vast amounts of data and return millions of possibilities. The problem, of course, is that we are drowning in data: it is getting progressively harder for users to find what they actually want, and at the same time harder for companies and individuals to promote their material, products and services within this morass of information.

One of my main interests on the theoretical side of software and computers for the past few years has been trying to figure out how to contextualize a user's search so that only relevant information is returned. Then the user can go through a manageable amount of data to find what is needed. This, I think, is the next frontier of data, and whoever can pull it off will become incredibly wealthy. Which is why I try to think about it...

My own thought is that it will take a number of factors, but primarily a combination of the browser that the user is working with, the search service that it calls upon and the hardware that the browser runs on:
  • The Browser - this will play a key role, I think. My guess is that the browser will have to learn what the user is looking for from common search behaviour. For example, in my work I'm often looking for sites and blogs that will help solve programming problems I'm having. Over time, the browser would figure out that if I enter something like "C#" or "SQL Server", I'm not looking for training or wanting to buy products, but for information specific to programming.
  • The Service - the search engine (Google, MSN, Yahoo, or whoever) would need to communicate with the browser to determine what parameters to search on. This would take cooperation between the search engines and the browser manufacturers to create a standard API that implementations would not deviate from (hello Microsoft - how close to W3C standards is IE?).
  • Hardware - I threw this in because location is important in searches, and mobile technology is developing at such a fast pace. Really, unless I specifically state otherwise, searches for things such as restaurants, shopping or tourist areas should be within the context of the device's current physical location. If I want to look up fish, whether to buy as pets or at a seafood market, the search should look reasonably close to where I am - say within 100km. This also ties back to the browser learning from past behaviour - if I have been in this location before looking for restaurants, it should default to seafood restaurants.
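To make the three pieces a bit more concrete, here is a rough Python sketch of how they might fit together. Everything in it is my own assumption for illustration: SearchContext, record, likely_intent and contextual_search are hypothetical names, not any real browser or search engine API, the "intent" label on each result is something a real service would have to work out for itself, and the coordinates are arbitrary. The only numbers taken from above are the 100km radius and the fish example.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


@dataclass
class SearchContext:
    """Hypothetical profile the browser would keep and pass to the service."""
    location: tuple                 # (lat, lon) reported by the hardware
    radius_km: float = 100.0        # the "say within 100km" default
    history: Counter = field(default_factory=Counter)

    def record(self, term, intent):
        # Remember which kind of result the user chose for this term.
        self.history[(term.lower(), intent)] += 1

    def likely_intent(self, term):
        # Most frequent past intent for the term, or None if never seen.
        seen = {i: n for (t, i), n in self.history.items() if t == term.lower()}
        return max(seen, key=seen.get) if seen else None


def contextual_search(term, results, ctx):
    """Keep results that match the learned intent and fall inside the radius."""
    intent = ctx.likely_intent(term)
    kept = []
    for r in results:
        if intent is not None and r["intent"] != intent:
            continue
        # Results with no location (e.g. programming articles) skip the distance check.
        if r.get("location") and distance_km(ctx.location, r["location"]) > ctx.radius_km:
            continue
        kept.append(r)
    return kept


# Made-up history and results; the coordinates are arbitrary.
ctx = SearchContext(location=(49.28, -123.12))
ctx.record("fish", "restaurant")
ctx.record("fish", "restaurant")
ctx.record("fish", "pets")
results = [
    {"title": "Seafood place nearby", "intent": "restaurant", "location": (49.30, -123.10)},
    {"title": "Seafood place far away", "intent": "restaurant", "location": (43.65, -79.38)},
    {"title": "Aquarium supplies", "intent": "pets", "location": (49.25, -123.00)},
]
print([r["title"] for r in contextual_search("fish", results, ctx)])
# prints ['Seafood place nearby']
```

The hard parts are hidden in the inputs, of course: deciding what "intent" a page serves and learning it reliably from behaviour is exactly the problem nobody has solved yet. The sketch only shows where the browser's history, the service's results and the device's location would meet.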
Finally, meta data will need to be much more strictly controlled, especially for websites. Because it is easy to overload or fake a website's meta tags, it is very difficult to organize data reliably that way (which is the reason behind Google's mysterious search algorithms).
