Celebrazio Net

Regular Expression bug in AWStats - fixed now

July, 2014

How I fixed a bug or two in AWStats (Linux, Perl, virtual servers, custom logfile format).


AWStats bugs.


1. A bug relating to nested config files.  It appeared during my upgrade from 7.0 to 7.3, on
 an Ubuntu Linux system (Perl AWStats).

Old way:
   The site's config file had only 1 line:
      Include /home/myuser/runtime/stats/mysite.com/awstats.conf
   That included file contains 6 lines, one of which reads
      Include /home/myuser/runtime/stats/awstats.conf
   (which contains all the constant settings for my 8 or so Virt Hosts)

New way:
   The site's config file itself contains the 6 lines, including the line
      Include /home/myuser/runtime/stats/awstats.conf

For some reason, the includes done the old way started producing errors with my recent
 upgrade from 7.0 to 7.3.
The "New way" only has 1 file and one include, and it's working.
The "Old way" starts with one file, which includes a 2nd one, which in turn includes a 3rd one.
 That nested chain doesn't work right now.
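
To see how deep an Include chain really goes before flattening it, a small script can walk the files.  This is only a sketch: include_chain is a hypothetical helper of mine, not part of AWStats, and the pattern it uses to spot Include lines (with or without quotes around the path) is an assumption that may be looser than what AWStats itself accepts.

```perl
use strict;
use warnings;

# include_chain is a hypothetical helper, not part of AWStats.
# It returns one string per config file, indented by nesting depth.
sub include_chain {
    my ($file, $depth, $seen) = @_;
    $depth //= 0;
    $seen  //= {};
    my @lines = ( ( '  ' x $depth ) . $file );
    return @lines if $seen->{$file}++;    # guard against include loops
    open( my $fh, '<', $file ) or do {
        $lines[0] .= "  (cannot open: $!)";
        return @lines;
    };
    while ( my $row = <$fh> ) {
        # Assumed syntax: Include /path  or  Include "/path"
        next unless $row =~ /^\s*Include\s+"?([^"\s]+)"?/i;
        my $included = $1;
        push @lines, include_chain( $included, $depth + 1, $seen );
    }
    close $fh;
    return @lines;
}

# Usage: perl include_chain.pl /path/to/site/awstats.conf
if (@ARGV) {
    print "$_\n" for include_chain( $ARGV[0] );
}
```

Any file printed with two levels of indentation is an include inside an include, which is the layout that stopped working for me.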


2.  A different bug that has existed since around 2012.
  Because I do Virt Hosts on my server(s), I have a custom log file format.
   Until this week's fix, I was getting no results for "Operating System", "Search Phrase",
and "Search Keywords" in the staticlinks reports.

Old way:
  I tried to use the custom setting for LogFormat:

LogFormat = "%host %virtualname %other %time1 %methodurl %code %bytesd %refererquot %uaquot"

This was failing, probably because AWStats built a bad regex from it.

New way / workaround / hack:
First, hard-code the config file to a numbered format, like:

LogFormat = 3

Then, of course, hack the source code like:

   elsif ( $LogFormat eq '3' ) {
      # The stock format-3 pattern that followed this assignment is commented out.
      # The replacement below is very similar to the LogFormat 1 pattern, adjusted
      # to match my LogFormat with Virtual Hosts.
      $PerlParsingFormat =
   "([^ ]+) ([^ ]+) [^ ]+ \\[([^ ]+) [^ ]+\\] \\\"([^ ]+) ([^ ]+)(?: [^\\\"]+|)\\\" ([\\d|-]+) ([\\d|-]+) \\\"(.*?)\\\" \\\"([^\\\"]*)\\\"";

and also adjust the fieldlib:

      # One $pos variable per capture group, in capture order.
      $pos_host    = 0;
      $pos_logname = 1;
      $pos_date    = 2;
      $pos_method  = 3;
      $pos_url     = 4;
      $pos_code    = 5;
      $pos_size    = 6;
      $pos_referer = 7;
      $pos_agent   = 8;
      @fieldlib    = (
              'host', 'logname', 'date', 'method', 'url', 'code',
              'size', 'referer', 'ua'
      );

This fieldlib is taken from LogFormat 1 and is still usable for my log format.
Yours may vary.  The field positions map to the groups captured by the () parentheses
in the regular expression, counted from 0.  If you overwrite the LogFormat 3 (or any other)
matching string, also remember to modify the fieldlib and $pos variables accordingly.
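
Before patching awstats.pl, it can help to try the pattern on one log line outside of AWStats and look at what lands in each position.  The sketch below reuses the hard-coded string and the fieldlib from the excerpt.  The parse_line helper is hypothetical (it is not an AWStats function), the sample log line is made up, and anchoring the pattern with ^ is my own choice for the test.

```perl
use strict;
use warnings;

# Same double-quoted string as the hard-coded pattern in the excerpt.
my $PerlParsingFormat =
"([^ ]+) ([^ ]+) [^ ]+ \\[([^ ]+) [^ ]+\\] \\\"([^ ]+) ([^ ]+)(?: [^\\\"]+|)\\\" ([\\d|-]+) ([\\d|-]+) \\\"(.*?)\\\" \\\"([^\\\"]*)\\\"";

# Field names in capture order (position 0 is the first capture group).
my @fieldlib = ( 'host', 'logname', 'date', 'method', 'url', 'code',
                 'size', 'referer', 'ua' );

# parse_line is a hypothetical helper, not part of AWStats.
# Returns a hash ref of field name => captured text, or undef on no match.
sub parse_line {
    my ($line) = @_;
    my @captured = $line =~ /^$PerlParsingFormat/;
    return undef unless @captured;
    my %record;
    @record{@fieldlib} = @captured;
    return \%record;
}

# Made-up sample line in the "host virtualname other [time] ..." layout.
my $sample = '192.0.2.10 www.example.com - [10/Jul/2014:12:34:56 +0000] '
           . '"GET /index.html HTTP/1.1" 200 5120 '
           . '"http://www.example.org/search?q=awstats" "Mozilla/5.0 (X11; Linux x86_64)"';

my $record = parse_line($sample);
if ($record) {
    # Print position, field name, and captured value for each field.
    printf "%d %-8s %s\n", $_, $fieldlib[$_], $record->{ $fieldlib[$_] }
        for 0 .. $#fieldlib;
}
else {
    print "no match\n";
}
```

If a value shows up next to the wrong name, either the parentheses or the fieldlib order needs adjusting.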

----- / end of excerpt ---

Note:  don't try to copy this verbatim.  If you have the AWStats bug described above, you can hack around
 it by hard-coding your logfile regex, *but don't use mine* above.  You need to modify that regex to
exactly match your logfile format.  My logfile format was very close to the Common format, which is
LogFormat=1 in the AWStats Perl code.   You might need to study regular expressions, or
at least carefully review the LogFormat examples shown, to know how to modify the
matching string, the position "$pos" variables, and the fieldlib.
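
One way to check that a modified regex really matches your own format is to run it over a whole log file and count the misses.  This is a rough sketch, not anything shipped with AWStats: check_log is a hypothetical helper, and the script takes the log path and your candidate pattern as command-line arguments.

```perl
use strict;
use warnings;

# check_log is a hypothetical helper, not part of AWStats.
# Reads lines from $fh and returns (matched count, unmatched count,
# array ref holding up to $keep unmatched lines for inspection).
sub check_log {
    my ( $fh, $re, $keep ) = @_;
    $keep //= 5;
    my ( $ok, $bad, @samples ) = ( 0, 0 );
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ $re ) { $ok++; next }
        $bad++;
        push @samples, $line if @samples < $keep;
    }
    return ( $ok, $bad, \@samples );
}

# Usage: perl check_log.pl /path/to/access.log 'your pattern here'
if ( @ARGV == 2 ) {
    my ( $path, $pattern ) = @ARGV;
    open( my $fh, '<', $path ) or die "cannot open $path: $!";
    my ( $ok, $bad, $samples ) = check_log( $fh, qr/^$pattern/ );
    close $fh;
    print "matched: $ok, unmatched: $bad\n";
    print "  miss: $_\n" for @$samples;
}
```

A handful of unmatched lines is usually just junk requests, but if most lines miss, the pattern does not fit the log format yet.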

Mine: I fixed it by hard-coding the regex rather than allowing AWStats to create it
on the fly.  I'd guess something in the on-the-fly regular expression building is broken.

I just confirmed that both bugs are worked around and I'm getting the expected reporting output again.

-----  Note, Apr 2016 ----

I'm still using AWStats 7.3 with the hard-coded LogFormat hack above, which is
still effective.

"With a PC, I always felt limited by the software available. On Unix, I am limited only by my knowledge."
-- Peter J. Schoenster

1998-2024 Celebrazio.net