---

Yahoo open sources Anthelion web crawler for parsing structured data on HTML pages

Yahoo today announced that it has released, under an open source license, the source code for Anthelion, its web crawler designed for parsing structured data from HTML pages.

Web crawling is at the very core of Yahoo, even though the company has many other applications, including Yahoo Mail, Yahoo Finance, Yahoo Messenger, Flickr, and Tumblr. That makes it significant for Yahoo to share code in an area as competitive as web search.