Archive for May, 2006

What happens when you cross tagging with machine learning? Well, you get a tool that learns as you tag. Sounds interesting? Then read on.

Learns? But what?

That’s what I asked myself yesterday. What I visualised was a tool that learns from you both what to tag and how to tag it. It would learn from past experience which pages you would like to tag and which keywords (tags) you would likely use for them. And one day, when it has a large enough knowledge base, it could probably automate the entire task for you.


Imagine having your own personal crawler, spidering the web in search of pages that might interest you and even saving the most likely ones. Imagine coming to the office and seeing your toRead list already populated by the bot.

Sounds too optimistic? Well, I’ll give it a try. Until then you’ll have to do what humans do best, tagging, on your own.

The Evolution Of Tagging

May 18th, 2006

The Present And The Future

Tagging has been around for quite some time now, although it seems to be picking up momentum after the Web 2.0 meme. But the question one needs to answer is: “Has tagging evolved?” For the most part, it has yet to evolve out of its stone age.

Tagging is basically a way of organizing information for retrieval. Yet current systems don’t seem to apply any information retrieval (IR) optimizations to it. Tagging could prove more useful and relevant if it were treated as just another form of search rather than as a whole new concept (the very reason non-techie users are not lured into tagging). It would also be more accurate if IR preprocessing such as stemming and synonym matching were applied to tags. The knowledge acquired in the process is very valuable because of the human intelligence behind it, and it can be exploited in many useful ways.

But tagging has evolved to some extent. It has evolved from single-word tags to multi-word tags. It has also evolved in granularity, from the Document to the Content (recoja, Google Notebook).
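
As a rough illustration of the IR preprocessing mentioned above, here is a minimal sketch of tag normalization. Everything in it is an assumption made for the example: the SYNONYMS table, the function names, and the deliberately naive plural stripping that stands in for a real stemmer such as Porter’s.

```javascript
// Hypothetical synonym table: maps variant tags to one canonical tag.
var SYNONYMS = { js: 'javascript', ecmascript: 'javascript', weblog: 'blog' };

// Naive stand-in for a stemmer: strips a single trailing "s" from
// words longer than three letters that do not end in "ss".
function naiveStem(word) {
  if (word.length > 3 && /[^s]s$/.test(word)) {
    return word.slice(0, -1);
  }
  return word;
}

// Lowercase and trim the tag, then apply synonyms before and after stemming.
function normalizeTag(tag) {
  var t = tag.trim().toLowerCase();
  if (SYNONYMS.hasOwnProperty(t)) return SYNONYMS[t];
  var stemmed = naiveStem(t);
  return SYNONYMS.hasOwnProperty(stemmed) ? SYNONYMS[stemmed] : stemmed;
}

// Count how many raw tags collapse onto each normalized tag.
function groupTags(rawTags) {
  var counts = {};
  for (var i = 0; i < rawTags.length; i++) {
    var key = normalizeTag(rawTags[i]);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

console.log(groupTags(['Blogs', 'blog', 'Weblogs', ' JS ', 'javascript']));
```

With this sketch, “Blogs”, “blog” and “Weblogs” all end up under one tag, and so do “JS” and “javascript”, so a search for either spelling would find the same bookmarks.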

What we need to focus on is what more can be done with tagging, rather than just replicating what can already be done with it. What do you think the next evolutionary steps in tagging could be?

Although the ubiquitous $() function of prototype.js saves developers from retyping the DOM methods for getting an element, it should be used with care. As they say, ‘the devil’s in the details’, so let’s take a look at the Prototype definition of $():

    function $() {
        var elements = new Array();

        for (var i = 0; i < arguments.length; i++) {
            var element = arguments[i];
            if (typeof element == 'string')
                element = document.getElementById(element);
            if (arguments.length == 1)
                return element;
            elements.push(element);
        }
        return elements;
    }

As we see above, document.getElementById(element) is called for every string argument on every call to $(); nothing is cached. So all the $() function does is save typing and facilitate code reuse.

But I have seen developers use $() without regard to its implementation, just because it’s handy, short and sweet. People tend to write code like the following:

    for(var i=0; i< $('element').length; i++) {
        var value = $('element').value;
        .......
    }

Here, in effect, every iteration looks the element up in the DOM again, twice: once in the loop condition and once in the loop body. This is inefficient and can be avoided with object caching (storing an object that is referenced many times in a local variable). Here is the same code using object caching:

    var element = $('element');
    for(var i=0; i< element.length; i++) {
        var value = element.value;
        .......
    }
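
To make the cost visible, here is a small sketch that counts lookups for both versions. It is not Prototype itself: fakeDocument, lookupCount and the two count functions are hypothetical names invented for the example, fakeDocument stands in for the browser’s document so the code can run outside a browser, and $() is cut down to the single-argument case.

```javascript
// Stand-in for the browser DOM (assumption: a real page would use
// the global `document` instead). It counts every lookup.
var lookupCount = 0;
var fakeDocument = {
  elements: { element: { value: 'hello', length: 3 } },
  getElementById: function (id) {
    lookupCount++;
    return this.elements[id] || null;
  }
};

// Simplified, single-argument version of the $() shown in the post.
function $(id) {
  return typeof id == 'string' ? fakeDocument.getElementById(id) : id;
}

// Uncached: $() runs in the loop condition and again in the loop body.
function countUncachedLookups() {
  lookupCount = 0;
  for (var i = 0; i < $('element').length; i++) {
    var value = $('element').value;
  }
  return lookupCount;
}

// Cached: one lookup up front, then plain variable access.
function countCachedLookups() {
  lookupCount = 0;
  var element = $('element');
  for (var i = 0; i < element.length; i++) {
    var value = element.value;
  }
  return lookupCount;
}

console.log(countUncachedLookups()); // 7 (4 condition checks + 3 loop bodies)
console.log(countCachedLookups());   // 1
```

The uncached count grows with the number of iterations, while the cached version always does a single lookup.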

Sometimes these minute details can make a big difference.

This was bound to happen. After all, even Google has to rely on human intelligence to do some of its work. Google Coop lets people contribute their expertise (in other words, bookmark links) by adding labels (categories) and annotations (descriptions) to them. But it goes well beyond the norm by using this information to improve search.

In all the forums I visited, I came across this one point again and again: “Google Coop is susceptible to spammers”. I don’t agree. Like any other social app that is fuelled by people, at first glance it does seem susceptible. But it is the social factor that seems to ward off spammers. Quoting a FAQ from Google Coop:

Who will see my labels?

Users who subscribe to you will see your labels for relevant searches. As your labels become higher quality and more comprehensive, and as more users subscribe to you, your labels may start surfacing to more Google users than just those who explicitly subscribed. A number of factors help determine how broadly your labels appear — such as the number of subscribers you have, how many websites you’ve labeled, and, most importantly, how often your labels make it easier for users to find what they’re looking for.

Google seems to have realised that it can achieve a lot more by utilizing human intelligence, the intelligence at the core of Web 2.0. After all, its ubiquitous search is also based on human knowledge (the links people create between pages).