---

Content mining with Apache Tika

Wazi: Apache Tika is a content-mining library that lets you extract both metadata and text from documents in many different formats.