Linux Today: Linux News On Internet Time.

Apache Nutch 2.0 indexes at web scale

Jul 11, 2012, 07:00

The Apache Nutch developers have announced that version 2.0 of the crawling and search indexing framework is now available. Built on top of other Apache projects, including Solr, Tika, Hadoop and Gora, Nutch is designed to crawl "at web scale" so that organisations can create searchable indexes of their web-published content. Nutch adds web-specific functionality to Solr with a link-graph database, and it uses Tika to parse web pages and a number of other document formats.
