


Information

This document supplies additional information that may help you modify MKStats for your own needs or understand how it works.

What counts as a 'hit'? Are the numbers accurate?
There are a number of things to consider when reading in web server logs in an attempt to get accurate, meaningful statistics. Simply counting lines in the log file using 'grep' will not give you accurate information. Erik Dorfman has written a document on how not to use http log files that explains how these simple methods fail miserably.
To get accurate counts, you must first filter out any hits on images and other non-content files. Since a single HTML page may easily contain 10 images or more, a visitor loading that page can account for 11 or more lines in the log file. Counting every line gives you an accurate count of how many requests your server has answered, but that is not a measure of hits. So the first step is to filter out all files that aren't HTML files or other file types that you want to keep track of.
Next, you must look at the HTTP response code that the server gives back to the browser. Each request is answered in a certain way, and a number is given to each type of response. If a page isn't found, the code will be 404. Obviously these should not be counted as hits, because a page wasn't delivered. There are other codes that mean the server had a problem delivering what the user requested, and each of these should be filtered out.
An HTTP code of 302 means that the user was directed to a different URL from the one they requested, and this should not be counted as a hit either. For example, if you request "http://www.site.com/~user", the server responds by telling the browser to request "http://www.site.com/~user/" instead, and this is a 302 response code (all transparent to the user). If you didn't take this into account, this would look like two separate requests, when in fact it was the same person having their URL corrected and being forced to make another request.
All codes in the 200 range mean that the document was delivered okay and should count as a hit, but counting only those isn't enough. Response code 304 means that the browser asked whether a file has been updated since the last time it was loaded, and the server answered that the copy in the browser's cache is still current. Even though a page has not been delivered, this counts as a valid hit as well (if it's an HTML file) because the person was actually requesting a document.
Although that explanation may be confusing to someone who has never even looked at a log file, you luckily never have to worry about it. You can rest assured that MKStats will handle the logs correctly to give you accurate information.
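For readers who do want to see the rules above in action, here is a minimal sketch in Python. It is not MKStats code (MKStats is written in Perl) and does not claim to match its internals. The Common Log Format pattern, the list of content extensions, the decision to treat directory requests as pages, and the names is_hit and count_hits are all assumptions made for illustration.

```python
import re

# Assumed for illustration: which file extensions count as content pages.
CONTENT_EXTENSIONS = (".html", ".htm")

# Common Log Format: host ident user [time] "METHOD path PROTOCOL" status bytes
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*" (\d{3}) \S+'
)


def is_hit(line):
    """Return True if a log line should count as a page hit."""
    match = LOG_LINE.match(line)
    if match is None:
        return False  # malformed line, ignore it
    _host, _method, path, status = match.groups()
    status = int(status)
    # Drop any query string before checking the file type.
    path = path.split("?", 1)[0]
    # Assumption: a request ending in "/" is answered with an index page.
    is_page = path.endswith("/") or path.lower().endswith(CONTENT_EXTENSIONS)
    # 2xx means the document was delivered; 304 means the cached copy
    # was still valid. Redirects (302) and errors (404, ...) are excluded.
    delivered = 200 <= status < 300 or status == 304
    return is_page and delivered


def count_hits(lines):
    """Count the log lines that qualify as page hits."""
    return sum(1 for line in lines if is_hit(line))


# Made-up sample log lines, not taken from a real server.
sample = [
    'h1 - - [10/Oct/1999:13:55:36 -0700] "GET /~user HTTP/1.0" 302 215',
    'h1 - - [10/Oct/1999:13:55:36 -0700] "GET /~user/ HTTP/1.0" 200 2326',
    'h1 - - [10/Oct/1999:13:55:37 -0700] "GET /~user/logo.gif HTTP/1.0" 200 4512',
    'h2 - - [10/Oct/1999:13:56:01 -0700] "GET /~user/index.html HTTP/1.0" 304 -',
    'h2 - - [10/Oct/1999:13:56:09 -0700] "GET /~user/missing.html HTTP/1.0" 404 209',
]

print(count_hits(sample))  # 2
```

Of the five sample lines, only the delivered page (200) and the cache check (304) count as hits. The redirect, the image, and the missing page are filtered out, so a plain line count would have reported 5 where the filtered count is 2.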

How do I create plug-in reports?
If you know Perl well enough and understand how MKStats works, then you should have no problem creating custom plug-in reports to visualize the log information any way you wish. Take a look at the Plugin Documentation for more information.

Creating referer_log, agent_log with Netscape Servers
Chris Gullete wrote this document explaining how this can be done. You may also view the technical notices at Netscape's site for more information.
