Quality of (re)search – it’s all about the Metadata | FirstRain

Your average search engine user has the patience of a toddler: they expect one or a few keywords to produce the answer they want, and they’ll try only once or twice before giving up. If the results are bad, they’re not likely to come back, and you’ve lost a user or a customer.

This is a hard problem to solve when the user is a senior professional looking for business information about a competitor – or the market trend behind a stock’s movement.

FirstRain focuses on delivering only high-quality, high-precision results for professional users, and a critical piece of the system behind those results is our rich Metadata library. To put it simply, Metadata is data about data; for us, it’s the information architecture that captures what the documents in our system are about, and they can be about any number of concepts drawn from the thousands we’ve modeled. A document’s tags can be literal (this is about Apple Computer) or they can combine any number of building blocks (this is about layoffs at Motorola in Illinois).
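To make the "building blocks" idea concrete, here is a minimal Python sketch of how combined tags could be represented and queried. It is an illustration only, not FirstRain's actual data model: the `Tag` and `Document` classes, the facet names ("company", "topic", "region") and the sample headlines are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    facet: str  # hypothetical facet name, e.g. "company", "topic", "region"
    value: str


@dataclass(frozen=True)
class Document:
    title: str
    tags: frozenset  # the set of Tag objects describing what the document is about


def matches(doc, *required):
    """Return True when the document carries every required tag."""
    return set(required) <= doc.tags


# Made-up sample documents, for illustration only.
docs = [
    Document("Apple unveils new laptop",
             frozenset({Tag("company", "Apple Computer")})),
    Document("Motorola cuts jobs at Illinois plant",
             frozenset({Tag("company", "Motorola"),
                        Tag("topic", "layoffs"),
                        Tag("region", "Illinois")})),
    Document("Motorola launches handset",
             frozenset({Tag("company", "Motorola"),
                        Tag("topic", "new products")})),
]

# "Layoffs at Motorola in Illinois" is three building blocks crossed together.
query = (Tag("company", "Motorola"),
         Tag("topic", "layoffs"),
         Tag("region", "Illinois"))
hits = [d.title for d in docs if matches(d, *query)]
```

A literal query uses a single tag, while a crossed query simply requires more tags at once, so the same structure covers both cases.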

The base layer of FirstRain is the categorization engine that identifies and tags what an article is about: a company, a market trend, an event. Since these tags determine what users see when doing research in our system (and remember, no patience with poor results), the rules that go into identifying them need to be spot on. For example, it’s necessary to ensure a new video game launch from a company like Electronic Arts is tagged appropriately, but a blog post about how to reach new levels in one of their games is ignored. A single web document may have any number of tags reflecting the content in it. The article about Electronic Arts could also mention other relevant information about competitors and partners, as well as trends like video game sales, and it’s important to identify and tag each appropriate facet.
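As a rough illustration of what such rules might look like, the Python sketch below tags a document when all of a rule's required phrases appear and none of its excluded phrases do. This is a toy keyword matcher under stated assumptions, not the actual categorization engine: the `Rule` class, the tag names and the phrase lists are hypothetical, and real rules would need to be far richer than this.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    tag: str
    require: tuple       # every phrase here must appear in the text
    exclude: tuple = ()  # any phrase here blocks the tag


def _found(phrase, text):
    """Case-insensitive whole-phrase match."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def tag_document(text, rules):
    """Return the set of tags whose rules fire for this text."""
    tags = set()
    for rule in rules:
        has_required = all(_found(p, text) for p in rule.require)
        is_blocked = any(_found(p, text) for p in rule.exclude)
        if has_required and not is_blocked:
            tags.add(rule.tag)
    return tags


# Hypothetical rules: tag Electronic Arts news, but skip gameplay guides.
RULES = [
    Rule("company:Electronic Arts", require=("Electronic Arts",),
         exclude=("walkthrough", "cheat")),
    Rule("topic:new products", require=("launch",), exclude=("walkthrough",)),
    Rule("topic:video game sales", require=("video game sales",)),
]

news = "Electronic Arts will launch a new title this fall as video game sales climb."
blog = "Walkthrough: how to reach level 12 in the latest Electronic Arts game."

news_tags = tag_document(news, RULES)  # several facets from one article
blog_tags = tag_document(blog, RULES)  # the gameplay guide gets no tags
```

The news sentence picks up a company tag and two topic tags, while the gameplay guide is left untagged even though it names the company.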

But just tagging the data to the right topics is only half the magic. Imagine what you can do if you can also analyze the frequencies of the Metadata. Keep in mind we have millions of documents in FirstRain, each containing numerous tags telling us what it’s about and where it’s from. Go back to the Electronic Arts example: FirstRain can count the number of times EA is mentioned in conjunction with any related topic, such as new products, or with any other company. We highlight spikes in these counts for our users, which surfaces emerging trends such as a company drawing more or fewer mentions than its competitors. These counts can give great insight into what a company is doing, or is about to do.
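The counting and spike-spotting described above can be sketched in a few lines of Python. This is only one simple way to do it, assuming tags arrive as plain strings grouped by day and that a "spike" means a count several standard deviations above its recent average; the function names, the sample numbers and the threshold of 3.0 are all assumptions for illustration, not a description of FirstRain's method.

```python
from collections import Counter
from statistics import mean, stdev


def cooccurrence_counts(tagged_docs, anchor):
    """Count, per day, how often each other tag appears alongside `anchor`.

    `tagged_docs` is an iterable of (day, set_of_tags) pairs.
    """
    counts = {}
    for day, tags in tagged_docs:
        if anchor in tags:
            counts.setdefault(day, Counter()).update(t for t in tags if t != anchor)
    return counts


def is_spike(history, today, threshold=3.0):
    """Flag `today` when it sits more than `threshold` sample standard
    deviations above the mean of `history` (a simple z-score test)."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today > mu
    return (today - mu) / sigma > threshold


# Made-up sample: documents tagged per day (labels are hypothetical).
sample = [
    ("day 1", {"company:EA", "topic:new products"}),
    ("day 1", {"company:EA", "company:Activision"}),
    ("day 2", {"company:EA", "topic:new products"}),
    ("day 2", {"company:Sony"}),
]
by_day = cooccurrence_counts(sample, "company:EA")

# Made-up daily counts of EA mentioned together with "new products".
history = [4, 5, 3, 4, 5, 4]
quiet_day = is_spike(history, 5)   # within normal variation
busy_day = is_spike(history, 19)   # far above the recent average
```

The z-score test is deliberately simple; comparing a company's counts against its competitors' counts over the same window would follow the same pattern with a different baseline.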

By treating the Metadata as a database in itself and analyzing it for spikes and patterns, we can identify emerging trends that users simply could not see from even the highest-quality individual search results.