Boston Analytics looks like a decent enough company. Its business is research and analytics and, among other things, it brings out research reports on private equity investments. In April, the firm released a report on private equity investments in India. But it turns out that the report lifted copyrighted data from Chennai-based Venture Intelligence’s private equity deal database.
What seems to have happened, going by Venture Intelligence’s press release, is that Boston Analytics bought Venture Intelligence’s database and then used that information to compile its own report and sell it in the market. That is juvenile conduct for a company whose management team hails from institutions such as MIT and the Indian Institute of Management. Now Venture Intelligence has won a High Court injunction against that copyright violation. So it looks like the matter has been laid to rest.


Hi Satsheel, thanks for writing and the link.
Here is one such exciting scraper, which I stumbled upon: http://www.monitor110.com/
Hi Subrata, I’m going to try this. But need some help. Let’s talk offline.
Hi Snigdha
Let me take a shot at explaining web-scrapers. Computer programs process data and/or information and can potentially do two things with it. One, they send it out for consumption by another computer for further processing. Two, they put the information out for human consumption, mostly as a visual display (like this text). Web scrapers are programs that wade through the second type of output, on the public internet, and extract information. The intelligence of the scraper depends on whether it can recognize and categorize tags within the page it scrapes and on how it picks which pages to scrape. The next step after scraping is to direct the output to a structured database.
If you use Firefox as your browser then you can get a sense of this by installing the Gnosis plug-in from ClearForest. It is quite good at auto-categorization. Do drop me an e-mail if you wish to discuss in a separate thread.
Sincerely
Subrata
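For readers who want something concrete, the steps Subrata describes (read a page meant for human eyes, recognize the tags that matter, send the result to a structured database) can be sketched roughly as below. This is only an illustration using Python's standard library, not the method of any tool mentioned in this thread. The names `TagScraper`, `scrape`, `store` and `SAMPLE_PAGE` are made up for the example, as is the sample page text, and a real scraper would also have to fetch pages over the network and decide which ones to visit.

```python
import sqlite3
from html.parser import HTMLParser


class TagScraper(HTMLParser):
    """Collects (tag, text) pairs for a chosen set of tags (hypothetical helper)."""

    def __init__(self, wanted=("h1", "h2", "p")):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None   # wanted tag we are currently inside, if any
        self.buffer = []      # text pieces seen inside that tag
        self.records = []     # extracted (tag, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.records.append((tag, text))
            self.current = None


def scrape(html):
    """Return the (tag, text) pairs found in an HTML string."""
    parser = TagScraper()
    parser.feed(html)
    parser.close()
    return parser.records


def store(records, conn):
    """Write the scraped pairs into a simple table."""
    conn.execute("CREATE TABLE IF NOT EXISTS scraped (tag TEXT, text TEXT)")
    conn.executemany("INSERT INTO scraped VALUES (?, ?)", records)
    conn.commit()


# Made-up page content, purely for illustration.
SAMPLE_PAGE = """
<html><body>
<h1>Deal news</h1>
<p>Fund A invests in Company B.</p>
<div>navigation clutter</div>
</body></html>
"""

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store(scrape(SAMPLE_PAGE), conn)
    for row in conn.execute("SELECT tag, text FROM scraped"):
        print(row)
```

The "intelligence" Subrata mentions would live in what replaces the fixed `wanted` tag list: a smarter scraper would categorize content rather than simply keep headings and paragraphs.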
Hi Subrata, when newspapers pick up VI’s data, or for that matter any other data, they attribute the source. I understand that was not the case in this particular instance. Even E&Y and PWC use VI’s data in their reports, but they attribute the source. And btw, what’s a web scraper? Interesting idea.
Snigdha
Don’t all content aggregators want their content to be attributed just so they become popular? I wonder what ticked VI off in this case (perhaps the way the contract was written, or maybe Boston Analytics didn’t properly attribute the source). I’ve seen Indian journals and newspapers print a fair amount of VI content and VI doesn’t seem to mind that!
Given that PE/VC information in India is at such a premium, perhaps a neat idea is to create an intelligent web-scraper that sweeps blogs and public resources to create a database.
Sincerely
Subrata
PS: I read your other blog – darjeelingblack – for the first time this morning. Nicely written.
What does any of this have to do with people’s backgrounds? There are at least as many examples of Harvard & Yale alumni implicated in securities fraud, insider trading and misinformation as there are US presidents! Bringing in a reputed school’s name neither proves nor disproves a point. Something ‘can’ be right or wrong on its own merits; bringing in inconsequential, irrelevant information makes a good blog read, but achieves little.