The marketing hype surrounding log analysis software and services could have you believing that tracking Web traffic accurately is as easy as installing some software and running your reports. Far too many people are under the impression that Web server log files are accurate, says Morgan Davis, director of operations for CTS Network Services, an independent Internet provider in San Diego.
CTS’ Davis tells clients not to treat log files as gospel.
If the log files aren’t consistently reliable, then reports produced from them can’t be either. “It’s not so much a problem with servers as it is with browsers and the inconsistent ways they query Web servers,” Davis says. “Different browsers can have completely different effects on a server logging an identical transfer.”
Still, different servers can come up with different log entries for the same events, he adds. But there are even bigger problems: The actions of caching proxy servers are largely immeasurable. Interrupted transfers are difficult to track reliably. And even different settings in a browser’s configuration options affect how servers log traffic. “It’s like voodoo, and you can hardly make sense of it all even when you understand all of these variables,” Davis says.
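As a rough illustration of why raw hit counts can mislead, the sketch below parses a few entries in the NCSA Common Log Format and separates full transfers from conditional requests answered with a 304 (Not Modified) status. The sample entries, the host names, and the function name summarize_log are invented for illustration; they do not come from CTS or from any real log.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)$'
)

# Hypothetical sample entries (not taken from any real server).
SAMPLE_LOG = """\
client1.example.com - - [10/Oct/1996:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
client1.example.com - - [10/Oct/1996:13:58:02 -0700] "GET /index.html HTTP/1.0" 304 -
proxy.example.net - - [10/Oct/1996:14:01:10 -0700] "GET /index.html HTTP/1.0" 200 2326
this line is not a valid log entry
"""


def summarize_log(text):
    """Count parsed entries by outcome and collect the distinct hosts seen."""
    counts = Counter()
    hosts = set()
    for line in text.splitlines():
        match = CLF_PATTERN.match(line.strip())
        if match is None:
            # Lines that do not fit the format are counted, not guessed at.
            counts["unparsed"] += 1
            continue
        counts["entries"] += 1
        hosts.add(match.group("host"))
        status = match.group("status")
        if status == "200":
            counts["full_transfers"] += 1
        elif status == "304":
            # The browser already held a copy; no document body was sent.
            counts["not_modified"] += 1
    return counts, hosts


if __name__ == "__main__":
    counts, hosts = summarize_log(SAMPLE_LOG)
    print(dict(counts))
    print(sorted(hosts))
```

Even in this tiny sample, three log entries do not translate into three readers. The 304 entry is a repeat visit by the same client, and the single request from the proxy host could stand for one person or for many served from the proxy's cache; nothing in the log can tell them apart.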
Davis is cautious about overcoming the shortcomings of server logging with new HTTP versions, Web servers, and technologies. “Are we asking for more demographic information as part of basic server-logging practices? Or are we talking about how the current data is being recorded and utilized?” he asks. “The former is a controversial subject.” In fact, some people would like to enlist the help of browser developers to require users to enter more information about themselves into their browsers’ configuration screens. But is this something users should be forced to do?
While it can be trying to work through the inconsistencies, there’s a wealth of untapped data in existing server log files, Davis says. Still, “You just can’t use your log files as an absolute measure of the activity at your site,” he notes.
So why don’t we have better tools for logging? According to Davis, the people designing Web servers are primarily concerned with performance and features. “It’s the age-old problem of the guy with the tie can’t understand the guy in the lab, and the guy in the lab doesn’t care,” Davis says.
— Rick Stout
PHOTO © JIM COIT/BLACK STAR