Linux Today: Linux News On Internet Time.


NewsForge: Extract Data from the Internet with Web Scraping

Mar 31, 2006, 08:30
(Other stories by Rob Reilly)


[ Thanks to drtorque for this link. ]

"Even if you don't know how to access databases using a Web browser or use an RSS reader, you can extract information from the Internet through Web page scraping. Here's how you can use some Linux-based tools to get data.

"First, you need to decide what data you want and what search strings you'll use to get it. Obviously, if you need only three lines of information, cutting and pasting is the best way to go. However, if a Web page has 640 lines of useful data that you need to download once a day for the next week, automating the process makes a lot of sense..."
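The workflow described in the excerpt (pick a page, pick a search string, pull the matching lines) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the method used in the NewsForge article: the function names `fetch_page`, `extract_matching_lines`, and `scrape` are hypothetical, the tag stripping is a crude regular expression rather than a real HTML parser, and it assumes the useful data sits on individual lines of the page source.

```python
import re
import urllib.request
from html import unescape


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its body as text.

    Falls back to UTF-8 when the server does not declare a charset.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_matching_lines(page_text: str, search_string: str) -> list[str]:
    """Return the lines whose visible text contains search_string.

    Tags are removed with a simple regex and HTML entities are decoded
    before matching, so a match inside a tag attribute is ignored.
    """
    matches = []
    for line in page_text.splitlines():
        text = unescape(re.sub(r"<[^>]+>", "", line)).strip()
        if search_string in text:
            matches.append(text)
    return matches


def scrape(url: str, search_string: str) -> list[str]:
    """Fetch one page and return its matching lines (hypothetical helper)."""
    return extract_matching_lines(fetch_page(url), search_string)
```

Calling `scrape()` with the page address and search string of your choice returns the matching lines as a list. For the once-a-day case the excerpt mentions, a script that calls it could be run from a scheduler such as cron.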

