EFW Support

Development => Contribute Your Customisations & Modifications => Topic started by: sanchezjuanc on Thursday 04 February 2010, 01:25:01 pm



Title: LightSquid on Endian FW
Post by: sanchezjuanc on Thursday 04 February 2010, 01:25:01 pm
Hi everybody, this is my first contribution; I hope it helps somebody.

There is probably a better way to do this, but I put it together in a few hours and have almost no spare time. I am going to try to improve it.

OK, let's see. LightSquid is a web interface that generates reports from Squid's logs. I like how it presents the reports; you can check it out at
http://lightsquid.sourceforge.net/

The idea is to have LightSquid working on Endian FW. Endian's Squid log is not quite compatible with what LightSquid expects, so we have to modify the LightSquid parser to generate the reports.
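
To make the difference concrete: judging by the sample lines quoted in the parser patch in this post, Endian writes each Squid record through syslog, so every line starts with a date, a host name and a `squid[pid]:` tag before the native Squid fields. A minimal shell sketch that strips that prefix so you can see the part LightSquid expects (the name `strip_syslog_prefix` is made up for illustration, and the pattern only covers prefixes shaped like the sample):

```shell
# strip_syslog_prefix: reads log lines on stdin and drops the five syslog
# fields (month, day, time, host name, "squid[pid]:") from each one.
strip_syslog_prefix() {
    sed -E 's/^[A-Z][a-z]{2} +[0-9]+ +[0-9:]+ +[^ ]+ +squid\[[0-9]+\]: +//'
}

sample='Feb  3 15:52:43 efw-1265133329 squid[10435]: 1265241163.648    761 127.0.0.1 TCP_MISS/200 315 POST http://gest.ivelog.com/url.asp - DIRECT/74.55.113.194 text/html'
printf '%s\n' "$sample" | strip_syslog_prefix
```

What remains starts with the Unix timestamp, which is the first field of a native Squid log record.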

So let's start.
1. Download the latest LightSquid version (currently 1.8) from http://lightsquid.sourceforge.net/

2. Unpack the file and move the folder to /home/httpd/html/lsquid <- the last directory can have any name you want

3. Edit lightsquid.cfg as follows:

Code:
	# -------------------- GLOBAL VARIABLES  ---------------------------

#path to additional `cfg` files
$cfgpath             ="/home/httpd/html/lsquid";
#path to `tpl` folder
$tplpath             ="/home/httpd/html/lsquid/tpl";
#path to `lang` folder
$langpath            ="/home/httpd/html/lsquid/lang";
#path to `report` folder
$reportpath          ="/home/httpd/html/lsquid/report";
#path to access.log
$logpath             ="/var/log/squid";
#path to `ip2name` folder
$ip2namepath         ="/home/httpd/html/lsquid/ip2name";

and set $squidlogtype to 3.

Why 3? It is just the number I chose to select the new code in the parser script :D

Code:
	# -------------------- LightParser VARIABLES  ---------------------------
#squid log type
#if native squid format (default squid, see in doc) - must be 0
#if EmulateHttpdLog ON - set 1
#digit - for speed optimization
#try it set to 1 if parser generate warning
#
#see also month2dec below !!!!
#
$squidlogtype        = 3;


4. Open the file lightparser.pl and, around line 155, find this spot:

Code:
	$recoveredlines++;
   }

} else {
   #emulated httpd log

Add the following code between the `   }` line and the `} else {` line:

Code:
	}elsif (3 == $squidlogtype) {
#For endian Squid Log by jcsanchez sanchezjuanc@gmail.com
#Feb  3 15:52:43 efw-1265133329 squid[10435]: 1265241163.648    761 127.0.0.1 TCP_MISS/200 315 POST http://gest.ivelog.com/url.asp - DIRECT/74.55.113.194 text/html
#Feb  3 15:52:43 efw-1265133329 squid[10435]: 1265241163.650    767 10.10.10.245 TCP_MISS/200 443 POST http://gest.ivelog.com/url.asp - FIRST_UP_PARENT/content1 text/html
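#the first five fields (month, day, time, host, "squid[pid]:") are the syslog prefix and are discarded
($garbacemonth,$garbaceday,$garbacenameserver,$garbace__,$garbacesquid,$Ltimestamp,$Lelapsed,$Lhost,$Ltype,$Lsize,$Lmethod,$Lurl,$Luser,$Lhierarchy,$Lconttype,@Lrest)=split;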
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime($Ltimestamp);
$mon++; #fix, month start from 0
$date  =sprintf("%04d%02d%02d",$year+1900,$mon,$mday);

#check row with invalid record
if ( ($#Lrest >= 0) && ($#Lrest < 4) ) {
$str=$_;
#maybe two concatenated record (first - truncated)
if ($str =~ m/(\d+\.\d+\s+\d+\s+(\d{1,3}\.){3}\d{1,3}\s+\w+\/\d+\s+\d+\w+\s+\S+\s+\S+\s+\S+\s+\w+\/\S+\s(-|([a-zA-Z\-]+\/[a-zA-Z\-]+)))$/) {
$newstr=$1;
($Ltimestamp,$Lelapsed,$Lhost,$Ltype,$Lsize,$Lmethod,$Lurl,$Luser,$Lhierarchy,$Lconttype)=split /\s+/,$newstr;
} else {
# maybe source url contain SPACES, try concatenate ...
while ($#Lrest != -1) {
$Lurl.="_$Luser";$Luser=$Lhierarchy;$Lhierarchy=$Lconttype;$Lconttype=shift @Lrest;
}
#do some sanity check
unless (($Lhierarchy =~ m/\w+\/\S+/) and ($Lconttype =~ m/-|([a-zA-Z\-]+\/[a-zA-Z\-]+)/)) {
$notrecoveredlines++;
next;
}
}
  $recoveredlines++;
}

5. Save it, and that's all. Now run the script with
./lightparser.pl

Wait until it finishes, then open a browser and go to your Endian's IP, for example

http://10.10.10.1/lsquid/

That's all. If you have any questions, shoot; I still have a lot myself, but we can try to resolve them :D
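
Steps 1 to 3 can be sketched as a small shell helper. This is only a sketch under assumptions: the name `install_lightsquid` is made up, the archive is assumed to unpack into a single top-level directory whose name starts with "lightsquid", and the destination must not exist yet. It only switches $squidlogtype to 3; the paths in lightsquid.cfg still have to be edited by hand as shown in step 3.

```shell
# install_lightsquid ARCHIVE DEST
# Hypothetical helper: unpack the LightSquid archive, move it to DEST and
# select log type 3 in lightsquid.cfg (the original is kept as .orig).
install_lightsquid() {
    archive=$1
    dest=$2
    tmp=$(mktemp -d) || return 1
    # Steps 1-2: unpack, then move the unpacked folder to its final place.
    tar -xzf "$archive" -C "$tmp" || return 1
    mv "$tmp"/lightsquid* "$dest" || return 1
    rmdir "$tmp"
    # Step 3 (partly): set $squidlogtype to 3, leaving everything else alone.
    cfg=$dest/lightsquid.cfg
    cp "$cfg" "$cfg.orig" || return 1
    sed 's/^\(\$squidlogtype *= *\)[0-9][0-9]*/\13/' "$cfg.orig" > "$cfg"
}
```

Usage would look like `install_lightsquid lightsquid-1.8.tgz /home/httpd/html/lsquid` (the archive file name is a guess; use whatever you downloaded), followed by step 4 and `./lightparser.pl`.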





Title: Re: LightSquid on Endian FW
Post by: Steve on Thursday 04 February 2010, 04:45:04 pm
Excellent! works great.
Thanks for sharing :)


Title: Re: LightSquid on Endian FW
Post by: nmatese on Friday 05 February 2010, 06:43:47 am
Could you please post your lightparser.pl? I am having some compilation issues.

Thanks,
Nick


Title: Re: LightSquid on Endian FW
Post by: sanchezjuanc on Friday 05 February 2010, 07:01:20 am
lightparser.pl

Code:
#!/usr/bin/perl
#
# LightSquid Project (c) 2004-2005 Sergey Erokhin aka ESL
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# detail see in gnugpl.txt

#parse access.log
# make per user report in 'report' direcotry

#usage: lightparse.pl {param}
#if param omit - parse full access.log file
# today - only current day
# yesterday - yesterday
# data in format YYYYMMDD - parse day
# access.log.{\d}.{gz|bz2} - parse file (for process archived)

# function prototypes
sub MakeReport();
sub InitSkipUser();
sub getLPS($$);
sub LockLSQ();
sub UnLockLSQ();
sub LOCKREMOVER();

use File::Basename;
use Time::Local;

push (@INC,(fileparse($0))[1]);

require "lightsquid.cfg";
require "common.pl";

#include ip2name function
require "ip2name/ip2name.$ip2name";

$SIG{INT} = \&LOCKREMOVER; # traps keyboard interrupt
my $lockfilepath   ="$lockpath/lockfile";

my $skipurlcntr   = 0;
my $skip4xxcntr   = 0;
my $skipfilterdatecntr= 0;

my $firstrun = 1;
my $totallines = 0;
my $parsedlines = 0;
my $daylines = 0;

my $catname   ="cat";
my $filename  ="access.log";

undef $workday;

exit unless (LockLSQ()); #Lock LSQ (block multiple instance)

if ($skipurl eq "") {
   $skipurl = "skipurl MUST be defined!!!";
   print "WARNING !!! \$skipurl is empty\n";
}

($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime;
$month=sprintf("%02d",$mon+1);;
 
my $filterdatestart=0;
my $filterdatestop =timelocal(59,59,23,31,12-1,2020-1900)+1000;

$fToday=1 if ($ARGV[0] eq "today");
$fToday=1 if ($ARGV[0] eq "yesterday");

if ($fToday) {
   ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime;

   $filterdate=sprintf("%04d%02d%02d",$year+1900,$mon+1,$mday);;
   $filterdatestart=timelocal( 0, 0, 0,$mday,$mon,$year);
   $filterdatestop =timelocal(59,59,23,$mday,$mon,$year);
   print ">>> filter today: $filterdate\n" if ($debug);
}

if ($ARGV[0] eq "yesterday") {
   $filterdatestart=$filterdatestart-(24*60*60);
   $filterdatestop =$filterdatestop -(24*60*60);
   ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime($filterdatestart);
   $filterdate=sprintf("%04d%02d%02d",$year+1900,$mon+1,$mday);;
   print ">>> filter yesterday: $filterdate\n" if ($debug);
}

if ($ARGV[0] =~ m/^(\d\d\d\d)(\d\d)(\d\d)$/) {
   $filterdate=$ARGV[0];
   $filterdatestart=timelocal( 0, 0, 0,$3,$2-1,$1);
   $filterdatestop =timelocal(59,59,23,$3,$2-1,$1);
   print ">>> filter date: $filterdate\n" if ($debug);
}

if ($ARGV[0] =~ m/access\.log\.(\d)/) {
   $filename=$ARGV[0];
   $catname="zcat" if ($ARGV[0] =~ m/\.gz$/);
   $catname="bzcat" if ($ARGV[0] =~ m/\.bz2$/);
}

print ">>> use file :: $logpath/$filename\n" if ($debug);
#open FF, "$logpath\\$filename" || die "can't access log file\n";
open FF, "$catname $logpath/$filename|" or die "can't access log file\n";

InitSkipUser();

StartIp2Name();

undef %bigfile; $bigfilecnt=0;
while (<FF>) {
chomp;
$totallines++;

if (0 == $squidlogtype) {
   #squid native log
   #970313965.619 1249   denis.local TCP_MISS/200 2598 GET    http://www.emalecentral.com/tasha/thm_4374x013.jpg - DIRECT/www.emalecentral.com image/jpeg
   # timestamp   elapsed host   type    size method url user  hierarechy type

   #speed optimization for FILTERDATE mode
   $Ltimestamp=substr $_,0,11;
   if ($Ltimestamp<$filterdatestart or $Ltimestamp>$filterdatestop) {
  print ">>>> skipDafteFilter URL $Lurl\n$_" if ($debug2 >= 2 );
  $skipfilterdatecntr++;
  next;
   };

   ($Ltimestamp,$Lelapsed,$Lhost,$Ltype,$Lsize,$Lmethod,$Lurl,$Luser,$Lhierarchy,$Lconttype,@Lrest)=split;
   ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime($Ltimestamp);
   $mon++; #fix, month start from 0
   $date  =sprintf("%04d%02d%02d",$year+1900,$mon,$mday);

   #check row with invalid record
   if ( ($#Lrest >= 0) && ($#Lrest < 4) ) {
  $str=$_;
  #maybe two concatenated record (first - truncated)
  if ($str =~ m/(\d+\.\d+\s+\d+\s+(\d{1,3}\.){3}\d{1,3}\s+\w+\/\d+\s+\d+\w+\s+\S+\s+\S+\s+\S+\s+\w+\/\S+\s(-|([a-zA-Z\-]+\/[a-zA-Z\-]+)))$/) {
$newstr=$1;
($Ltimestamp,$Lelapsed,$Lhost,$Ltype,$Lsize,$Lmethod,$Lurl,$Luser,$Lhierarchy,$Lconttype)=split /\s+/,$newstr;
  } else {
# maybe source url contain SPACES, try concatenate ...
while ($#Lrest != -1) {
   $Lurl.="_$Luser";$Luser=$Lhierarchy;$Lhierarchy=$Lconttype;$Lconttype=shift @Lrest;
}
#do some sanity check
unless (($Lhierarchy =~ m/\w+\/\S+/) and ($Lconttype =~ m/-|([a-zA-Z\-]+\/[a-zA-Z\-]+)/)) {
   $notrecoveredlines++;
   next;
}
  }
  $recoveredlines++;
   }
}elsif (3 == $squidlogtype) {
#For endian Squid Log by jcsanchez sanchezjuanc@gmail.com
#Feb  3 15:52:43 efw-1265133329 squid[10435]: 1265241163.648    761 127.0.0.1 TCP_MISS/200 315 POST http://gest.ivelog.com/url.asp - DIRECT/74.55.113.194 text/html
#Feb  3 15:52:43 efw-1265133329 squid[10435]: 1265241163.650    767 10.10.10.245 TCP_MISS/200 443 POST http://gest.ivelog.com/url.asp - FIRST_UP_PARENT/content1 text/html
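#the first five fields (month, day, time, host, "squid[pid]:") are the syslog prefix and are discarded
($garbacemonth,$garbaceday,$garbacenameserver,$garbace__,$garbacesquid,$Ltimestamp,$Lelapsed,$Lhost,$Ltype,$Lsize,$Lmethod,$Lurl,$Luser,$Lhierarchy,$Lconttype,@Lrest)=split;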
#printf( "$gmonth,$gday,$gnameserver,$g__,$gsquid,$Ltimestamp,$Lelapsed,$Lhost,$Ltype,$Lsize,$Lmethod,$Lurl,,$Luser,$Lhierarchy,$Ltype\n\n");
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime($Ltimestamp);
$mon++; #fix, month start from 0
$date  =sprintf("%04d%02d%02d",$year+1900,$mon,$mday);

#check row with invalid record
if ( ($#Lrest >= 0) && ($#Lrest < 4) ) {
$str=$_;
#maybe two concatenated record (first - truncated)
if ($str =~ m/(\d+\.\d+\s+\d+\s+(\d{1,3}\.){3}\d{1,3}\s+\w+\/\d+\s+\d+\w+\s+\S+\s+\S+\s+\S+\s+\w+\/\S+\s(-|([a-zA-Z\-]+\/[a-zA-Z\-]+)))$/) {
$newstr=$1;
($Ltimestamp,$Lelapsed,$Lhost,$Ltype,$Lsize,$Lmethod,$Lurl,$Luser,$Lhierarchy,$Lconttype)=split /\s+/,$newstr;
} else {
# maybe source url contain SPACES, try concatenate ...
while ($#Lrest != -1) {
$Lurl.="_$Luser";$Luser=$Lhierarchy;$Lhierarchy=$Lconttype;$Lconttype=shift @Lrest;
}
#do some sanity check
unless (($Lhierarchy =~ m/\w+\/\S+/) and ($Lconttype =~ m/-|([a-zA-Z\-]+\/[a-zA-Z\-]+)/)) {
$notrecoveredlines++;
next;
}
}
  $recoveredlines++;
}

} else {
   #emulated httpd log
   #192.168.3.40 - - [15/Apr/2005:11:46:35 +0300] "GET http://mail.yandex.ru/mboxjscript? HTTP/1.0" 200 2262 TCP_MISS :DIRECT
   #192.168.3.40 - - [15/Apr/2005:11:46:35 +0300] "GET http://css.yandex.ru/css/mail/search.js HTTP/1.0" 200 4199 TCP_HIT:NONE
   #192.168.3.12 - - [15/Apr/2005:11:46:35 +0300] "CONNECT aero.lufthansa.com:443 HTTP/1.0" 200 35992 TCP_MISS:DIRE
   # ($Lhost,   $Luser,$Luser2,$Ldate,    $u2,   $Lmethod,$Lurl,   $u3,    $Ltype,$Lsize,$u4)=split
 
   ($Lhost,$Luser,$Luser2,$Ldate,$u2,$Lmethod,$Lurl,$u3,$Ltype,$Lsize,$u4)=split;
  
   $Ldate =~ m#^\[(\d\d)/(...)/(\d\d\d\d):(\d\d):(\d\d):(\d\d)#;
   $mday=$1;$mon=$month2dec{$2};$year=$3-1900;
   $hour=$4;$min=$5;$sec=$6;

   $date  =sprintf("%04d%02d%02d",$year+1900,$mon,$mday);
   if ($filterdate) {
  if ($date ne $filterdate) {
print ">>>> skipDafteFilter URL $Lurl\n$_" if ($debug2 >= 2 );
$skipfilterdatecntr++;
next;
  };
   }
  
   if (($Luser eq "-") && ($Luser2 ne "-")) {
$Luser = $Luser2;
   }
  
   $u4 =~ m/(.*?)\s?:(.*)/;
   $Ltype = "$1/$Ltype";
}  #if ($squidlogtype)

if ($year < 2000-1900) { ; #invalid record
   print ">>>> skipn Bad Year  $Lurl\n$_" if ($debug2 >= 1 );
   $skipbadyear++;
   next;
}

#skip intranet
if ($Lurl =~ m/$skipurl/o) {
  print ">>>> skipURL $Lurl\n$_" if ($debug2 >= 2 );
  $skipurlcntr++;
  next;
};

# skip Access denied records (TODO: report)
if ($Ltype =~ m#DENIED#io) {
  $skipDenied++;
  print ">>>> skipDenied $Ltype\n$_" if ($debug2 >= 2);
  next;
};

if ($Ltype =~ m/HIT/) {
  $CacheHIT+=$Lsize;
} else {
  $CacheMISS+=$Lsize;
}

$parsedlines++;

if ($date ne $workday) { # close prev day, prepare for new
  if ($firstrun) {
undef $firstrun;
$workday=$date;
  } else {
MakeReport();
undef %totalsize; undef %sitesize; undef %sitehit;undef %totalhit;undef %totalputpost;
undef %hashhost;undef %hashname;
undef %bigfile; $bigfilecnt=0;
undef %sitetime;undef %sitetimesize;
$daylines=0;
$workday=$date;
$sqlreq=0;
$CacheHIT=0;$CacheMISS=0;
  }
}
$daylines++;

$user=lc $Luser;
 
$user = Ip2Name($Lhost,$user,$Ltimestamp);

next if (defined $hSkipUser{$user});

#simplified some common banner system & counters
$url=$Lurl;
$url =~ s/([a-z]+:\/\/)??.*\.(spylog\.com)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(yimg\.com)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(adriver\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(bannerbank\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(mail\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(adnet\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(rapidshare\.de)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(rapidshare\.com)/$1www.$2/o;

$url =~ s/([a-z]+:\/\/)??.*\.(vkontakte\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(odnoklasniki\.ru)/$1www.$2/o;


#extract site name
if ($url =~ m/([a-z]+:\/\/)??([a-z0-9\-]+\.){1}(([a-z0-9\-]+\.){0,})([a-z0-9\-]+){1}(:[0-9]+)?\/(.*)/o) {
   $site=$2.$3.$5;
} else {
   $site=$Lurl;
}


$site=$Lurl if ($site eq "");

$totalsize   {$user} +=$Lsize;
$totalhit   {$user} ++;
$totalputpost {$user} +=$Lsize if (($Lmethod eq "PUT") or ($Lmethod eq "POST"));
$sitesize   {$user}{$site}+=$Lsize;
$sitehit   {$user}{$site}++;

$sitetime   {$user}{$site}[$hour]+=$Lelapsed;
$sitetimesize {$user}{$site}[$hour]+=$Lsize;

#.bigfile support
if ($Lsize > $bigfilelimit) {
$bigfile [$bigfilecnt]{date}=sprintf("%02d:%02d:%02d",$hour,$min,$sec);
$bigfile [$bigfilecnt]{link}=$Lurl;
$bigfile [$bigfilecnt]{size}=$Lsize;
$bigfile [$bigfilecnt]{user}=$user;
$bigfilecnt++;
}
}

MakeReport();
StopIp2Name();
UnLockLSQ();

if ($debug) {
$worktime = ( time() - $^T );
print "run TIME: $worktime sec\n";
print "LightSquid parser statistic report\n\n";
printf( "    %10u lines processed (average %.2f lines per second)\n",
$totallines, getLPS( $worktime, $totallines ) );
printf( "    %10u lines parsed\n",   $parsedlines );
printf( "    %10u lines recovered\n",   $recoveredlines );
printf( "    %10u lines notrecovered\n",   $notrecoveredlines );
printf( "    %10u lines skiped by bad year\n",   $skipbadyear );
printf( "    %10u lines skiped by date filter\n",   $skipfilterdatecntr );
printf( "    %10u lines skiped by Denied filter\n", $skipDenied );
printf( "    %10u lines skiped by skipURL filter\n", $skipurlcntr );

if ( $parsedlines == 0 ) {
print "\nWARNING !!!!, parsed 0 lines from total : $totallines\n";
print "please check confiuration !!!!\n";
print "may be wrong log format selected ?\n";
}

}



# The END ---------------------------------------------------------

##Subroutines
# return Line Per Second value (check 0 values and correct)
sub getLPS($$) {
  my $time=shift;
  my $lines=shift;
  $time||=1;
  $lines||=1;
  return ($lines/$time);
}

sub MakeReport() {
#generate report
#use global var

return if ($daylines < 2);

print ">>> Make Report $workday ($daylines - log line parsed)\n" if ($debug);

$reppath="$reportpath/$workday";

unless ( -d $reppath )
{
  mkdir $reppath, 0755 or die "Can't create dir '$reppath': $!";
}

open TOTALFILE,">$reppath/.total" || die "can't create file $reppath/.total - $!";

$tmp="";$tmpsize=0;$tmpuser=0;$tmpoveruser=0;

foreach $tuser (sort {$totalsize{$b} <=> $totalsize{$a}} keys %totalsize) {
#    $tmp.="$tuser\t$totalsize{$tuser}\t$totalhit{$tuser}\t$totalputpost{$tuser}\n";
  $totalputpost{$tuser}+=0; #prevent empty value
  $tmp.=sprintf("%-20s %15s %15s %15s\n",$tuser,$totalsize{$tuser},$totalhit{$tuser},$totalputpost{$tuser});
  $tmpuser++;
  $tmpsize+=$totalsize{$tuser};
  $tmpoveruser++ if ($totalsize{$tuser} >= $perusertrafficlimit);

  open REPFILE,">$reppath/$tuser" || die "can't create file $reppath/$tuser - $!";

  print REPFILE "total: $totalsize{$tuser}\n";

  foreach $tsite (sort {$sitesize{$tuser}{$b} <=> $sitesize{$tuser}{$a}} keys %{$sitesize{$tuser}} ) {
  printf REPFILE ("%-29s %12s %10s\t",$tsite,$sitesize{$tuser}{$tsite},$sitehit{$tuser}{$tsite});
  if ($timereport != 0) {
for ($hour=0;$hour<24;$hour++) {
printf REPFILE ("%d-%s ",int($sitetime{$tuser}{$tsite}[$hour]/3600),$sitetimesize{$tuser}{$tsite}[$hour]+0);
}
  }
  print REPFILE "\n";
  }
  close REPFILE;
}

$CacheMISS=1 if ($CacheMISS == 0);

print TOTALFILE "user: $tmpuser\n";
print TOTALFILE "size: $tmpsize\n";

print TOTALFILE "$tmp";
close TOTALFILE;

my ($sec_,$min_,$hour_,$mday_,$mon_,$year_,$wday_,$yday_,$isdst_) = localtime;$mon_++;$year_+=1900;
my $moddate=sprintf("%02d:%02d",$hour_,$min_)." :: $mday_ $MonthName[$mon_] $year_";

open FILE,">$reppath/.features" || die "can't create file  $reppath/.features - $!";
print FILE "overuser: $tmpoveruser\n";
print FILE "cachehit%: ".sprintf("%3.2f",($CacheHIT*100)/($CacheHIT+$CacheMISS))."\n";
print FILE "cachehit: $CacheHIT\n";
print FILE "cachemiss: $CacheMISS\n";
print FILE "cacheall: ".($CacheHIT+$CacheMISS)."\n";
print FILE "modification: $moddate\n";
close FILE;

unlink "$reppath/.bigfiles";
if ($bigfilecnt != 0) {
  open MAXFILE,">$reppath/.bigfiles" || die "can't create file $reppath/.bigfiles - $!";
  for ($i=0;$i<$bigfilecnt;$i++) {
print MAXFILE "$bigfile[$i]{user}\t$bigfile[$i]{date}\t$bigfile[$i]{size}\t$bigfile[$i]{link}\n";
  }
  close MAXFILE;
}

#create list of user that use more than $perusertrafficlimit bytes
unlink "$reppath/.overuser";
if ($tmpoveruser) {
open OVERFILE,">","$reppath/.overuser" || die "can't create file  $reppath/.overuser - $!";
foreach $tuser (sort {$totalsize{$b} <=> $totalsize{$a}} keys %totalsize) {
print OVERFILE "$tuser\t$totalsize{$tuser}\n" if ($totalsize{$tuser} >= $perusertrafficlimit);
}
close OVERFILE;
}

CreateGroupFile($reppath);
CreateRealnameFile($reppath);
}

sub InitSkipUser() {
 open F,"<$cfgpath/skipuser.cfg";
 while (<F>) {
   chomp;
   next if (/^#/);
   $hSkipUser{$_}=1;
 }
 close F;
}
# Lock support
sub LockLSQ() {
   if (-f "$lockfilepath") {
  #read data from `lockfile`
  print STDERR "Warning, `$lockfilepath` exist, maybe anoter process running !\n";
  open FF,"<","$lockfilepath" or die "can't read lock file `$lockfilepath`\n";
  $pid=<FF>;chomp $pid;$pid =~ s/PID: //;
  $ts =<FF>;chomp $ts ;$ts =~ s/Timestamp: //;
  close FF;
  #check timedelta
  $tsdelta=time - $ts;
  print STDERR "LockPID : $pid\n" ;
  print STDERR "tsdelta : $tsdelta second(s) (maxlocktime: $maxlocktime)\n";

  return 0 if ($tsdelta<$maxlocktime);

  print STDERR "OLD lock file ignored and removed!\n";
  UnLockLSQ();
   }

   open FF,">","$lockfilepath" or die "can't create lock file `$lockfilepath`\n";
   print FF "PID: $$\n";
   $ts=time;
   print FF "Timestamp: $ts\n";
   print FF "Creation time: ".localtime($ts)."\n";
   close FF;

   return 1;
}

sub UnLockLSQ() {
  unlink $lockfilepath or die "can't remove lock file `$lockfilepath`\n";
}

sub LOCKREMOVER() {
   print "INT happents, remove LOCK\n";
   UnLockLSQ();
   exit;
}

__END__
2004-04-23 : initial version
2004-09-01 FIX : error in parse invalid file
2004-09-09 ADD : add create .bigfile file contain links greater $bigfilelimit
2004-11-08 ADD : skip 4xx records (dirty :-() TODO: do error report
2004-11-09 ADD : use DB only if not define user name...
2005-04-13 ADD : LightSquid publication cleanup
2005-04-14 ADD : $debug and $debug2 variable for generate statistic
: if parsed lines = 0 print WARNING
2005-04-17 ADD : add support fot HTTPDlike log file
2005-04-19 ADD : add .bz2 support
: add cache hit calculationn (if Ltype contain HIT - hit else - MISS), wrong ??
: add oversized user calculation
2005-04-20 ADD : .features file added, with additional info   
2005-04-22 FIX : httpdlike parser bug;
      FIX : mkdir 655 -> mkdir 755
2005-04-30 ADD : Rewrite archive support, now support access.log.{D},access.log.{D}.gz,access.log.{D}.bz2
   ADD : time report
2005-05-03 FIX : fix wrong .features file output
2005-05-12 FIX : empty line report only if $debug
   FIX : date filter now ^\d\d\d\d\d\d\d\d$ ...   
2005-11-21 FIX : cosmetical changes
2006-07-02 ADD : try recovery some type of broken log record (url contain spaces, two concatenated record)
: fix negative number in user file (printf -%d <2g $u <4g), now use simple print
2006-07-05 ADD : Put & Post addet into .total file
2006-07-10 ADD : SkipUser support
: GetNameByIP -> IP2NAME (see doc)
: $cfgpath in config
: .features modification: parameter support
2006-07-29 ADD : add LOCKing, for prevent multiple LightSquid parser instance ...
   ADD : improve SKIP speed for native squid log format (more that 3 time !!!!)
   ADD : report line per second speed LPS in debug report
2006-11-23 FIX : Yet another printf trouble in time report fixed
2007-01-05 FIX : Wrong modification data writen in .features
2008-11-28 NEW : Odnoklasniki & Vkontakte agregator added
   FIX : Perl 5.10 fix. in several cases incorrect name was used, but size calculated correctly.
2009-06-30 NEW : .overuser support


Title: Re: LightSquid on Endian FW
Post by: nmatese on Friday 05 February 2010, 07:05:53 am
Thank you so much, this works great!! I appreciate your work.  One last question: is it possible to run this as a cron job or something?  Any suggestions would be appreciated.


Title: Re: LightSquid on Endian FW
Post by: Steve on Friday 05 February 2010, 10:06:36 am
To run the update as a cron job, create a file called lsquid containing the following code:

Code:
#!/bin/sh
/home/httpd/html/lsquid/lightparser.pl &>/dev/null
exit 0

Copy this file to the /etc/cron.cyclic directory (placing it here causes it to be executed every 5 minutes, e.g. 3:05, 3:10, 3:15 ...)
Make the file executable (chmod 700 or 755)
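
A slightly more defensive variant of that wrapper, as a sketch: it changes into the LightSquid directory first and uses the portable `>/dev/null 2>&1` redirection, because `&>` is a bash extension that a plain POSIX `sh` parses differently. The helper name `write_cron_wrapper` and its two arguments are made up for illustration.

```shell
# write_cron_wrapper CRON_DIR LSQUID_DIR
# Writes an executable wrapper named "lsquid" into CRON_DIR.
write_cron_wrapper() {
    cron_dir=$1
    lsquid_dir=$2
    cat > "$cron_dir/lsquid" <<EOF
#!/bin/sh
# Run the LightSquid parser from its own directory, discarding all output.
cd "$lsquid_dir" || exit 1
./lightparser.pl >/dev/null 2>&1
exit 0
EOF
    chmod 755 "$cron_dir/lsquid"
}
```

For the setup in this thread the call would be `write_cron_wrapper /etc/cron.cyclic /home/httpd/html/lsquid`.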


Title: Re: LightSquid on Endian FW
Post by: nmatese on Sunday 07 February 2010, 10:41:02 am
Not sure what the story is, but my reports seem to be duplicated, in a sense.  I am getting a report for each Active Directory user, but I am also getting a report for each IP address those users were connecting from.  And is a report for 127.0.0.1 normal?  Maybe I am just confused: should I be getting an IP report as well as a user-level report?


Title: Re: LightSquid on Endian FW
Post by: jimmyzshack on Tuesday 09 February 2010, 03:14:07 am
Can someone tell me how to get these requirements installed?

   1.  Perl
   2. http server (Apache, lighttpd, etc)


Title: Re: LightSquid on Endian FW
Post by: mrkroket on Tuesday 09 February 2010, 04:42:37 am
Very Very nice program!!!!!
Easy to install, easy to configure, and easy to get stats and graphs.
That's what exactly I was looking for months!!!!!
  :D :D :D :D :D :D :D :D :D :D

Just a tip, to remove 127.0.0.1 logs, you just need to configure a second cfg file, called skipuser.cfg. Just write these IPs to remove local logs:
127.0.0.1
localhost
localhost.localhost
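
As a sketch, the file can be created from the shell like this. Going by InitSkipUser in the lightparser.pl listing above, the parser reads `$cfgpath/skipuser.cfg` one entry per line and ignores lines starting with `#`; the helper name `write_skipuser` is hypothetical.

```shell
# write_skipuser CFG_DIR ENTRY...
# Writes CFG_DIR/skipuser.cfg with a comment header and one entry per line.
write_skipuser() {
    cfg_dir=$1
    shift
    {
        echo '# users/IPs excluded from LightSquid reports'
        printf '%s\n' "$@"
    } > "$cfg_dir/skipuser.cfg"
}
```

Example call: `write_skipuser /home/httpd/html/lsquid 127.0.0.1 localhost localhost.localhost`.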

There is one thing I still can't get to work. The parser is supposed to be able to map IPs to names from the DHCP file, but although I set $ip2name to 'dhcp', it doesn't work for me.

I'm working hard with old logs, to have a nice stat site. 700Mb of compressed squid logs, millions and millions of http access  8)

About cron jobs, a side note. install.txt file from lightsquid package says:
10. Setup crontab to run lightparser once per hour

      crontab -e
      This example will execute parser at 55 minutes after every hour:

      */55 * * * * /var/www/htdocs/lightsquid/lightparser.pl today
     
      if you have small log and fast machine, you may run lightparser with smaller delay
      !!!warning!!! not set interval less 10 min !!!!!



Title: Re: LightSquid on Endian FW
Post by: nmatese on Tuesday 09 February 2010, 09:02:19 am
I really love LightSquid, but is it possible (maybe not with LightSquid) to report how much time someone spends surfing the internet, either by site or as a total?


Title: Re: LightSquid on Endian FW
Post by: jimmyzshack on Wednesday 10 February 2010, 12:54:19 am
I also have another question. We are going to use NTLM authentication, and we run a terminal services farm. I see in the logs that names show up based on the authentication. Does LightSquid use this name, or does it try to map user names to IPs? I ask because we have 100 users sharing 4 IP addresses.
Thanks

I'm also still trying to find this out

Can someone tell me how to get these requirements installed on the endian server?

   1.  Perl
   2. http server (Apache, lighttpd, etc)



Title: Re: LightSquid on Endian FW
Post by: nmatese on Wednesday 10 February 2010, 01:24:49 am
It maps to the NTLM user name. I have a similar environment: 120 users on 4 Citrix servers.


Title: Re: LightSquid on Endian FW
Post by: mrkroket on Thursday 11 February 2010, 05:26:04 am
Quote from: jimmyzshack on Wednesday 10 February 2010, 12:54:19 am
Can someone tell me how to get these requirements installed on the endian server?

   1.  Perl
   2. http server (Apache, lighttpd, etc)


No extra reqs, just unpack, configure .cfg files and enjoy.


Title: Re: LightSquid on Endian FW
Post by: jimmyzshack on Thursday 11 February 2010, 08:23:45 am
Does anyone know what I can do to fix this error?


root@firewall:/home/httpd/html/lsquid # ./lightparser.pl
Too many arguments for main::getLPS at ./lightparser.pl line 321, near "$totallines ) "
Execution of ./lightparser.pl aborted due to compilation errors.
root@firewall:/home/httpd/html/lsquid #

I used the lightparser.pl that is posted above.
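
Perl prints "Too many arguments" when a sub is declared with a prototype that accepts fewer arguments than the call passes. getLPS is called with two values, so its prototype has to be `($$)`; a copy that declares `sub getLPS($)` fails exactly like this. It looks as if doubled dollar signs can get lost when the script is pasted through the forum (that is an assumption), which would also turn `"PID: $$\n"` into `"PID: $\n"`. A sketch for repairing a pasted copy; `fix_pasted_parser` is a made-up name:

```shell
# fix_pasted_parser FILE
# Restores "$$" in the two places where a pasted copy may show a single "$".
# The untouched file is kept as FILE.bak; FILE keeps its permissions.
fix_pasted_parser() {
    file=$1
    cp "$file" "$file.bak" || return 1
    sed -e 's/getLPS(\$)/getLPS($$)/' \
        -e 's/"PID: \$\\n"/"PID: $$\\n"/' \
        "$file.bak" > "$file"
}
```

Running `fix_pasted_parser ./lightparser.pl` on a copy that is already correct changes nothing, since neither pattern matches.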


Title: Re: LightSquid on Endian FW
Post by: jimmyzshack on Friday 12 February 2010, 03:23:24 am
I did a total reinstall and now have this working. Thanks a lot for the program!

I'm getting reports for users, but some entries show up as the IP of the terminal server, so you don't know who it is, unless it's just a duplicate of a user. Is anyone else seeing this with Citrix or terminal server?


Title: Re: LightSquid on Endian FW
Post by: nmatese on Friday 12 February 2010, 03:41:49 am
You will notice that the IP listing is a duplicate of all users who are logged onto that server.  So the user1 + user2 = IP traffic if that makes sense.  I am trying to get Squint working on my server.  Squint gives more in depth reports daily/weekly/monthly.  Only thing I am struggling with is getting the cron job to work properly.


Title: Re: LightSquid on Endian FW
Post by: jimmyzshack on Friday 12 February 2010, 04:34:03 am
Mine isn't running either, using

#!/bin/sh
/home/httpd/html/lsquid/lightparser.pl &>/dev/null
exit 0

Does it need to be run as root?


Title: Re: LightSquid on Endian FW
Post by: jimmyzshack on Friday 12 February 2010, 05:52:35 am
Quote from: nmatese on Friday 12 February 2010, 03:41:49 am
You will notice that the IP listing is a duplicate of all users who are logged onto that server.  So the user1 + user2 = IP traffic if that makes sense.  I am trying to get Squint working on my server.  Squint gives more in depth reports daily/weekly/monthly.  Only thing I am struggling with is getting the cron job to work properly.

I used this

#!/bin/sh
cd /home/httpd/html/lsquid
./lightparser.pl
exit 0

and put it in the cron.cyclic folder and ran chmod 700 "filename", and it works now.


Title: Re: LightSquid on Endian FW
Post by: nmatese on Friday 12 February 2010, 05:55:37 am
I am finding that none of the cron entries I add to the crontab file run.  Does anyone have any suggestions on how I could get this script to run nightly at a specific time?


Title: Re: LightSquid on Endian FW
Post by: jimmyzshack on Friday 12 February 2010, 06:24:18 am
Quote from: nmatese on Friday 12 February 2010, 05:55:37 am
I am finding that none of the cron entries I add to the crontab file run.  Does anyone have any suggestions on how I could get this script to run nightly at a specific time?

I didn't add anything to the crontab file. It looks to me like it is only set up to say when each folder (cron.hourly, cron.daily, etc.) is run. From what I can tell, you put the job in the folder that runs on the timetable you want.

To test, I put this in a text file with no extension:

#!/bin/sh
cd /home/httpd/html/lsquid
./lightparser.pl
exit 0

Then I ran chmod 700 "filename" from the terminal (this looks important, because it didn't work until I did it) and put the file in /etc/cron.minutely so that I didn't have to wait an hour to test it.

On skipuser.cfg: is there a way to skip all IP addresses, or do I have to add each address from all the subnets?
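
Going by InitSkipUser in the listing above, each line of skipuser.cfg is used as an exact hash key, so there seems to be no wildcard or subnet syntax; that is a reading of the code, not something tested. A sketch that generates one entry per host of a /24 network instead of typing them by hand (`subnet_hosts` is a hypothetical helper):

```shell
# subnet_hosts PREFIX
# Prints PREFIX.1 through PREFIX.254, one address per line.
subnet_hosts() {
    i=1
    while [ "$i" -le 254 ]; do
        echo "$1.$i"
        i=$((i + 1))
    done
}
```

For example, `subnet_hosts 10.10.10 >> /home/httpd/html/lsquid/skipuser.cfg` would append 10.10.10.1 to 10.10.10.254. Whether an IP entry actually hides a report depends on the name the ip2name module returns for that host.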


Title: Re: LightSquid on Endian FW
Post by: nmatese on Friday 12 February 2010, 06:25:24 am
I figured out that fcron, the daemon behind crontab, only reloads itself on a reboot, so anything added to crontab since then is not recognized yet.


Title: Re: LightSquid on Endian FW
Post by: pwizard on Tuesday 16 March 2010, 12:58:07 pm
Quote from: mrkroket on Tuesday 09 February 2010, 04:42:37 am
Very Very nice program!!!!!
Easy to install, easy to configure, and easy to get stats and graphs.
That's what exactly I was looking for months!!!!!
  :D :D :D :D :D :D :D :D :D :D

Just a tip, to remove 127.0.0.1 logs, you just need to configure a second cfg file, called skipuser.cfg. Just write these IPs to remove local logs:
127.0.0.1
localhost
localhost.localhost

There is one thing I still can't get to work. The parser is supposed to be able to map IPs to names from the DHCP file, but although I set $ip2name to 'dhcp', it doesn't work for me.

I'm working hard with old logs, to have a nice stat site. 700Mb of compressed squid logs, millions and millions of http access  8)

About cron jobs, a side note. install.txt file from lightsquid package says:
10. Setup crontab to run lightparser once per hour

      crontab -e
      This example will execute parser at 55 minutes after every hour:

      */55 * * * * /var/www/htdocs/lightsquid/lightparser.pl today
     
      if you have small log and fast machine, you may run lightparser with smaller delay
      !!!warning!!! not set interval less 10 min !!!!!


The crontab command is not available on Endian.
I created a script in /etc/cron.hourly to generate the report every hour.


Title: Re: LightSquid on Endian FW
Post by: mrkroket on Tuesday 16 March 2010, 05:11:09 pm
Quote from: pwizard on Tuesday 16 March 2010, 12:58:07 pm
The crontab command is not available on Endian.
I created a script in /etc/cron.hourly to generate the report every hour.

Well, I was referring to the 1 hour limit in the readme, but thanks for pointing out the crontab issue.


Title: Re: LightSquid on Endian FW
Post by: pwizard on Tuesday 16 March 2010, 05:35:38 pm
How do I show the details for each user?


Title: Re: LightSquid on Endian FW
Post by: pwizard on Tuesday 16 March 2010, 05:37:06 pm
Why are the details not shown?


Title: Re: LightSquid on Endian FW
Post by: turitopa on Tuesday 29 June 2010, 11:01:01 am
Hi people,

are my SMTP mail details logged here?

I would like to see how much bandwidth my mail server is using...


Title: Re: LightSquid on Endian FW
Post by: jayanthan on Wednesday 09 November 2011, 03:44:27 pm
Super explanation, and very easy to install and get working.

FYI,
I put that file in "/etc/cron.cyclic", and it generates the report every 5 minutes.

After that I moved the file to cron.hourly, so now it only runs hourly...

One more thing:
if an IP is bypassed, no report can be generated for that IP.


Title: Re: LightSquid on Endian FW
Post by: gsr1985 on Monday 12 December 2011, 07:06:03 pm
Hi,
I receive the error message below when compiling:


 bash ./lightparser.pl
./lightparser.pl: line 23: syntax error near unexpected token `('
./lightparser.pl: line 23: `sub MakeReport();'


Can you please help me rectify this error?
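
The message comes from bash, not from Perl: `bash ./lightparser.pl` makes bash read the Perl source as a shell script, and it stops at the first Perl construct it cannot parse (`sub MakeReport();` on line 23). The first line of a script names the interpreter it expects. A small sketch that prints it (`script_interpreter` is a hypothetical helper):

```shell
# script_interpreter FILE
# Prints the interpreter named on the "#!" line, or nothing if there is none.
script_interpreter() {
    head -n 1 "$1" | sed -n 's/^#! *//p'
}
```

For the listing posted in this thread, `script_interpreter ./lightparser.pl` should print /usr/bin/perl, so the script is meant to be started as `./lightparser.pl` (after making it executable) or as `perl ./lightparser.pl`.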


Title: Re: LightSquid on Endian FW
Post by: mgr9500 on Thursday 09 February 2012, 07:00:04 pm
I did all the steps (first post), but when I run:
Code:
./check-setup.pl
I get this warning:
Code:
LightSquid Config Checker, (c) 2005-9 Sergey Erokhin GNU GPL
 
WARNING: Log format Look like CUSTOM log, Lightsquid can't parse this format! Please check  documentation !
Invalid access.log format or can't check format type ...


Title: Re: LightSquid on Endian FW
Post by: alertamaxima on Friday 23 November 2012, 10:44:17 am
Hello, thank you very much for the article.
I get this error on the web page:
Access to 'home/httpd/html/lsquid/report' folder   NO !!!!!!!!!!!!

and this on the console:

root@endian:/home/httpd/html/lsquid # ./lightparser.pl
can't create lock file `home/httpd/html/lsquid/report/lockfile`
 ???
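
Both messages print the path as `home/httpd/...` without a leading slash, which suggests that `$reportpath` and `$lockpath` in lightsquid.cfg were entered as relative paths (an assumption based only on the output shown). A sketch that lists path settings whose value does not start with `/`; `find_relative_paths` is a made-up name:

```shell
# find_relative_paths CFG_FILE
# Prints every '$...path = "value"' line whose value does not start with "/".
find_relative_paths() {
    grep -E '^\$[a-z0-9]+path[[:space:]]*=[[:space:]]*"[^/]' "$1"
}
```

Any line printed by `find_relative_paths /home/httpd/html/lsquid/lightsquid.cfg` is a candidate for the missing slash.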


Title: Re: LightSquid on Endian FW
Post by: sree on Tuesday 04 June 2013, 09:47:25 pm
Dear sanchezjuanc,
I always get line errors from lightparser.pl. I tried copying the one you shared here and still get a syntax error. If you don't mind, kindly share the lightparser.pl file here or mail it to me at sksree at gmail.com.

Regards,
Sree