Keyword retrieval
Context
I had just got into topic maps in a major way and was thinking about possible applications for them. At this point Google hadn’t come into its own, and search directories such as Yahoo were beginning to show their shortcomings. Other indexers such as AltaVista couldn’t provide context to searches and were being polluted by fake submissions. Topic maps describe relationships in the subject domain, rather than keywords in the resource domain. This made them a good candidate for solving the problems search engines were facing.
Problem
There are too many documents in the world: too many resources. Search engines can only do so much, and it is not enough. Humans think in concepts and subjects, not keywords and phrases. Navigating the resource layer is a hit-and-miss, time-consuming process. Users need to be able to jump from the resource layer to the subject layer.
Solution
A central repository of subject keywords needs to be held to provide lookup services to clients. Clients need to be able to analyse documents and mark up the appropriate keywords. Links could then be made from resources to subjects. The repository would have to be quality controlled to stop malicious users from corrupting it with junk entries.
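A minimal sketch of what such a repository might look like, assuming a simple in-memory store. The class name `SubjectRepository`, the `trusted` list of contributors and the `subject:...` identifiers are all hypothetical, made up for illustration; the original idea never got as far as a concrete design.

```python
from typing import Dict, Optional, Set


class SubjectRepository:
    """Hypothetical central registry mapping keywords to subject identifiers."""

    def __init__(self, trusted: Set[str]) -> None:
        # Contributors allowed to add entries (the quality-control step).
        self.trusted = set(trusted)
        self._subjects: Dict[str, str] = {}

    def register(self, contributor: str, keyword: str, subject_id: str) -> None:
        # Reject entries from anyone who is not on the trusted list.
        if contributor not in self.trusted:
            raise PermissionError(f"{contributor!r} may not add entries")
        self._subjects[keyword.lower()] = subject_id

    def lookup(self, keyword: str) -> Optional[str]:
        # Case-insensitive lookup; None means the keyword is unknown.
        return self._subjects.get(keyword.lower())


repo = SubjectRepository(trusted={"editor"})
repo.register("editor", "Topic Maps", "subject:topic-maps")
print(repo.lookup("topic maps"))
```

A real service would sit behind a network API and need something far stronger than a name check, but the shape is the same: a gated write path and an open lookup.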
Inspiration
This idea came to me at the end of a particularly bad holiday to New York in 1998. I was staying with one friend, had a falling out with her for no good reason and found myself out on the street at 11pm. I then hooked up with another friend who said he could get me into his hotel room. Well, there were no spare rooms in the hotel and he just disappeared. I found myself on the streets at 2am. I didn’t really know what to do so I wandered around, drank some coffee, hung out in a park full of male prostitutes parading in front of me, visited the emergency room of a hospital for the toilet and then shivered my way through to daybreak. It wasn’t the best of nights. At daybreak I walked up to Central Station and then walked downtown and caught the Staten Island Ferry. I had the ideas on the return journey and feverishly wrote them up in Central Park. I recommend that ferry trip to anyone.
Implementation
You would need a big server somewhere registering the names and some client software to mark up the documents. Alternatively, the mark-up could be done on the server before being sent to the browser.
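The mark-up step could be sketched roughly as below, whether it runs in a client or on the server before the page is sent. This assumes plain-text input and a keyword-to-URL mapping already fetched from the repository; the function name `mark_up` and the `subjects.example` URLs are placeholders, not part of any real system.

```python
import html
import re
from typing import Dict


def mark_up(text: str, subjects: Dict[str, str]) -> str:
    """Wrap each known keyword in `text` with a link to its subject page."""
    if not subjects:
        return html.escape(text)
    # Normalise keys so matching can be case-insensitive.
    by_key = {keyword.lower(): url for keyword, url in subjects.items()}
    # Try longer keywords first, so "topic maps" wins over "topic".
    alternatives = sorted(by_key, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in alternatives) + r")\b",
        re.IGNORECASE,
    )
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untouched text between matches.
        parts.append(html.escape(text[last:match.start()]))
        url = by_key[match.group(1).lower()]
        parts.append(
            f'<a href="{html.escape(url, quote=True)}">'
            f"{html.escape(match.group(1))}</a>"
        )
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


subjects = {
    "topic maps": "https://subjects.example/topic-maps",
    "topic": "https://subjects.example/topic",
}
print(mark_up("Topic maps describe subjects.", subjects))
```

Plain keyword matching like this is exactly the weakness the idea was meant to get away from, so a real implementation would need some way of telling which subject a word refers to in context. The sketch only shows where the links would go.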
Strengths and Weaknesses
The main weakness was that the system was open to abuse: you only get out of such a system what you put into it. That is probably its strength as well. It means that only trusted parties should be able to enter data into the system.
Business Model
The failure of Real Names shows that even a well-funded venture in this field can fail. I never thought that there would be much money in extorting cash from businesses for name placement. I thought that money could be made by adding value to content, especially where there was a vertical market in data/information. For example, a publisher of primary materials could benefit greatly by providing links from that content to concepts, which in turn link to articles discussing those concepts at a secondary or tertiary level.
Outcome
Real Names had the “keywords” idea and had the money to implement it. Interestingly enough, it was announced in 2002 that Microsoft would no longer partner with Real Names, leading to the likely demise of the organisation and its system. Other companies have implemented rating systems for web pages that suggest related links. Yet more companies have made browser toolbars to mark up documents. So the field was pretty much covered. I don’t think that any of them made money.
http://www.realnames.com/
I was to have another crack at a similar idea in Metadata Producer/Consumer where the focus was on channels rather than on the whole Internet.
Tags
idea
metadata