Linux Today: Linux News On Internet Time.

Linux Magazine: Getting a Handle on Traffic

Jan 02, 2003, 22:00
(Other stories by Jeremy Zawodny)

"When running a Web site of any size, it helps to learn about the visitors you're attracting. The traditional solution for monitoring Web traffic is a log file analysis tool such as analog (http://www.analog.cx). analog is very fast, but what if you'd like real-time or near-real-time statistics? You could run analog from cron every five minutes, but what if you also want to issue ad-hoc queries against your logs to answer very specific questions like, 'What's the average number of pages that each Internet Explorer user views?'

"Things begin to get difficult when you try to customize most Web log reporting tools. And that's a shame. There are a lot of interesting questions you might want to ask about your Web traffic: 'Where are users coming from? How do users find my site? Do users usually enter through the home page, or does a search engine send them to a more specific page? Which browsers are being used? What are my most popular '404-ing' (missing) URLs? Is another site using my images?'

"Unfortunately, most tools can't answer all those kinds of questions out of the box. To answer those questions, you typically need to extend an existing tool, write your own tool, or spend some quality time with awk, grep, and wc.

"Or, you can turn to LAMP. By combining Linux, Apache (and mod_log_sql), MySQL, and PHP, you can build a customized logging and reporting system without a lot of effort. This month, let's look at how to set up logging..."
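
To make the excerpt's idea concrete, here is a minimal sketch of what "ad-hoc queries against your logs" could look like once log lines live in a database. It is not the article's code: Python's built-in sqlite3 module stands in for MySQL, the access_log table and its column names are hypothetical (they are not claimed to match mod_log_sql's schema), the sample log lines are invented, a "user" is approximated by remote host, and a "page" is any request that returned status 200.

```python
import re
import sqlite3

# Apache "combined" log format:
# host ident user [time] "request" status bytes "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

IE = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
GECKO = "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021126"

# Invented sample data, for illustration only.
SAMPLE_LINES = [
    f'10.0.0.1 - - [02/Jan/2003:22:00:01 -0800] "GET / HTTP/1.0" 200 1043 "-" "{IE}"',
    f'10.0.0.1 - - [02/Jan/2003:22:00:05 -0800] "GET /about.html HTTP/1.0" 200 512 "-" "{IE}"',
    f'10.0.0.1 - - [02/Jan/2003:22:00:09 -0800] "GET /missing.html HTTP/1.0" 404 209 "-" "{IE}"',
    f'10.0.0.2 - - [02/Jan/2003:22:01:00 -0800] "GET / HTTP/1.0" 200 1043 "-" "{IE}"',
    f'10.0.0.3 - - [02/Jan/2003:22:02:00 -0800] "GET / HTTP/1.0" 200 1043 "-" "{GECKO}"',
    f'10.0.0.3 - - [02/Jan/2003:22:02:07 -0800] "GET /missing.html HTTP/1.0" 404 209 "-" "{GECKO}"',
]


def load_logs(lines):
    """Parse combined-format lines into an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE access_log ("
        "remote_host TEXT, request_uri TEXT, status INTEGER, "
        "referer TEXT, agent TEXT)"
    )
    rows = []
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m:  # silently skip lines that do not look like combined format
            rows.append(
                (m["host"], m["uri"], int(m["status"]), m["referer"], m["agent"])
            )
    conn.executemany("INSERT INTO access_log VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def avg_pages_per_ie_user(conn):
    """Average number of successful requests per remote host using MSIE."""
    row = conn.execute(
        """
        SELECT AVG(pages) FROM (
            SELECT COUNT(*) AS pages
            FROM access_log
            WHERE agent LIKE '%MSIE%' AND status = 200
            GROUP BY remote_host
        ) AS per_user
        """
    ).fetchone()
    return row[0]


def top_missing_urls(conn, limit=5):
    """Most frequently requested URLs that returned 404."""
    return conn.execute(
        """
        SELECT request_uri, COUNT(*) AS hits
        FROM access_log
        WHERE status = 404
        GROUP BY request_uri
        ORDER BY hits DESC, request_uri
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    conn = load_logs(SAMPLE_LINES)
    print("Average pages per IE user:", avg_pages_per_ie_user(conn))
    print("Top 404 URLs:", top_missing_urls(conn))
```

The point of the sketch is the design choice the excerpt hints at: once requests are rows rather than lines of text, each new question becomes one more SELECT instead of a new parsing script.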
