Harvard's Berkman Center Seeds the MediaCloud | Linux Today

Written By
Web Webster
Mar 13, 2009

[ Thanks to Tom Dunlap for this link. ]

“From there, the story text goes into a full-text
search engine to retrieve specific terms or phrases, gets dumped
into a database, and becomes source material for the three simple
tools currently on the site to let people start playing with the
service. Being able to throw text against Calais and get pretty
high-quality entities and terms out of it, Zuckerman says, was a
“big step forward.”

“The open source and open data project runs off the Amazon
cloud. The Berkman Center tried it on its own server first, but
with terabyte file systems and hundreds of gigabytes of relational
databases, it couldn’t keep up. “It’s pretty exciting that by
signing up with Amazon we were able to scale massively and very
quickly,” Zuckerman says. The service hopes ultimately to scale to
15,000 RSS sources.

“What’s currently live (showing the top ten most mentioned
terms for up to three media sources at a time, or the top ten most
mentioned terms for each media source that occur in stories along
with a term you specify, or a world map for each media source that
indicates which countries get more coverage) is meant as just a
taste of what you can do with the data.”
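
To make the first two tools described above more concrete, here is a minimal Python sketch of counting the most mentioned terms per media source, and the terms that appear in the same stories as a term you specify. It is not Media Cloud's actual code: the sample stories, the stop-word list, and the function names `tokenize`, `top_terms`, and `top_terms_with` are all hypothetical, and the real service reportedly relies on Calais and a full-text search engine rather than plain word counts.

```python
import re
from collections import Counter

# Hypothetical sample data: media source name -> list of story texts.
STORIES = {
    "Source A": [
        "Obama meets leaders in Europe to discuss the economy",
        "The economy slows as banks report losses",
    ],
    "Source B": [
        "Banks in Europe face new rules",
        "Obama signs the budget",
    ],
}

# Tiny illustrative stop-word list (an assumption, not from the article).
STOP_WORDS = {"the", "in", "to", "as", "a", "of", "and"}


def tokenize(text):
    """Lowercase the text and split it into words, dropping stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def top_terms(stories, n=10):
    """Return the n most mentioned terms across one source's stories."""
    counts = Counter()
    for story in stories:
        counts.update(tokenize(story))
    return counts.most_common(n)


def top_terms_with(stories, term, n=10):
    """Return the n most mentioned terms in stories that also mention `term`."""
    term = term.lower()
    counts = Counter()
    for story in stories:
        tokens = tokenize(story)
        if term in tokens:
            counts.update(t for t in tokens if t != term)
    return counts.most_common(n)


if __name__ == "__main__":
    for source, stories in STORIES.items():
        print(source, top_terms(stories, n=3))
        print(source, "with 'banks':", top_terms_with(stories, "banks", n=3))
```

A production system would more likely push these counts into the database or search index than recompute them in memory, but the counting logic is the same idea.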

Complete Story

Web Webster

Web Webster has more than 20 years of writing and editorial experience in the tech sector. He’s written and edited news, demand generation, user-focused, and thought leadership content for business software solutions, consumer tech, and Linux Today. He edits and writes for a portfolio of tech industry news and analysis websites, including webopedia.com and DatabaseJournal.com.
