Linux Today: Linux News On Internet Time.

NewsForge: Extract Data from the Internet with Web Scraping

Mar 31, 2006, 08:30
By Rob Reilly

[ Thanks to drtorque for this link. ]

"Even if you don't know how to access databases using a Web browser or use an RSS reader, you can extract information from the Internet through Web page scraping. Here's how you can use some Linux-based tools to get data.

"First, you need to decide what data you want and what search strings you'll use to get it. Obviously, if you need only three lines of information, cutting and pasting is the best way to go. However, if a Web page has 640 lines of useful data that you need to download once a day for the next week, automating the process makes a lot of sense..."
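
The excerpt does not name the Linux-based tools the full article uses, so the following is only a minimal sketch of the idea in Python, using the standard library alone. It assumes the data you want sits on text lines that contain one of your search strings. The names `extract_matching_lines` and `scrape` are hypothetical helpers introduced for illustration, not anything from the article.

```python
import re
import urllib.request
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the visible text of a page, one chunk per text node."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_matching_lines(html, search_strings):
    """Return the text lines of `html` that contain any search string.

    Matching is case-insensitive, and runs of whitespace are collapsed
    to a single space.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    wanted = [s.lower() for s in search_strings]
    lines = []
    for raw in text.splitlines():
        line = re.sub(r"\s+", " ", raw).strip()
        if line and any(s in line.lower() for s in wanted):
            lines.append(line)
    return lines


def scrape(url, search_strings, timeout=30):
    """Download `url` and return its lines matching `search_strings`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_matching_lines(html, search_strings)
```

For the once-a-day case the excerpt describes, a script that calls `scrape` could be scheduled with cron. Matching on plain substrings is a deliberate simplification: it breaks less often than tag-by-tag parsing when a page's markup changes, but it also pulls in any unrelated line that happens to contain the same string.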

