

Installing and Configuring Analog to Analyze and Aggregate Web Logs for Multiple Domains

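Analog is a powerful open-source web access log analyzer written in C. It supports multiple languages (including Chinese), runs on Linux and Windows, and handles logs from mainstream web servers such as Apache, nginx, and IIS. It is very fast: it can process 20 million log lines in under 10 minutes. Its statistics are centered on page views (PV). Compared with AWStats and Webalizer, its report pages are somewhat plain; for nicer charts, Report Magic 2.21 can be used.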

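The latest version is analog-6.0; the author has not updated it since 19-Dec-04. Installation is simple: download the appropriate version from http://www.analog.cx/download.html. This post uses the source package as an example: unpack the downloaded tarball into the installation directory, then run make in that directory.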

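    wget http://www.analog.cx/analog-6.0.tar.gz
    tar zxvf analog-6.0.tar.gz
    cp -ar analog-6.0 /usr/local/
    cd /usr/local/analog-6.0
    make
    # Create the /usr/local/analog symlink from the parent directory
    cd /usr/local
    ln -s analog-6.0 analog
    # Web directory that will hold the generated reports
    mkdir /opt/htdocs/www/analog
    chown www:website /opt/htdocs/www/analog
    cd /usr/local/analog
    # images is a directory, so it has to be copied recursively
    cp -r images /opt/htdocs/www/analog/
    mkdir conf
    cp analog.cfg conf/c1g.cfg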

Configuration

vi conf/c1g.cfg

    # Use Simplified Chinese for the report
    LANGUAGE SIMP-CHINESE
    # nginx log format
    LOGFORMAT (%s - %j [%d/%M/%Y:%h:%n:%j %j] "%j %r %j" %c %b "%f" "%B"\n)
    # Log files
    LOGFILE /opt/log/%Y.%M/*/*c1gstudio.com.log.gz
    # Output file
    OUTFILE /opt/htdocs/www/analog/c1gstudio%Y.%M/index.html
    # Host name
    HOSTNAME "c1gstudio.com"
    # Host URL
    HOSTURL http://www.c1gstudio.com/
    # Web path of the image directory
    IMAGEDIR ../images/
    # Only list URLs with at least 1000 page requests (a top-200 list would be REQFLOOR -200p)
    REQFLOOR 1000p
    # Count every forum.php URL as a single file
    FILEALIAS /forum.php* /forum.php
    # Report on sub-directories
    SUBDIR */*

LOGFORMAT reference

    %S    host (the client hostname, or address of the computer making the request)
    %s    numerical IP address of client (if recorded in a separate field; used when %S is empty)
    %r    file requested
    %q    query string (part of filename after ?, if recorded in a separate field)
    %B    browser
    %A    browser with +'s instead of spaces
    %f    referrer
    %u    user (tip: a cookie or session id can usefully be defined as %u too)
    %v    virtual host (the server hostname, also called the virtual domain)
    %d    day of the month
    %m    month in digits
    %M    month, three letter English abbreviation
    %y    year, last two digits
    %Y    year, four digits
    %Z    year, two or four digits (less efficient)
    %h    hour of the day
    %n    minute of the hour
    %a    a or A for am, or p or P for pm, if %h is in the 12-hour clock.
          (So to match "am" you need %am and to match "AM" you need %aM)
    %U    "Unix time" (seconds since beginning of 1970, GMT). If it includes decimals, use %U.%j
    %b    number of bytes transferred
    %t    processing time in seconds
    %T    processing time in milliseconds
    %D    processing time in microseconds
    %c    HTTP status code
    %C    code words used instead of HTTP status code in some servers (only used internally)
    %j    junk: ignore this field (field can be empty too)
    %w    white space: spaces or tabs
    %W    optional white space
    %%    % sign
    \n    new line
    \t    tab stop
    \\    single backslash

My nginx log format

    '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" $http_x_forwarded_for';

A sample log line:

    183.62.5.13 - - [06/Aug/2014:17:16:44 +0800] "GET /aboutc1g.html HTTP/1.1" 200 6642 "http://www.c1gstudio.com/web/hello.html" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36" 183.62.5.13

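My format has an extra $http_x_forwarded_for field at the end, so the LOGFORMAT needs a trailing %j to discard it; Analog will not handle a "-" in that position on its own.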

    LOGFORMAT (%s - %j [%d/%M/%Y:%h:%n:%j %j] "%j %r %j" %c %b "%f" "%B" %j\n)
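Before handing a large batch of logs to Analog, it can help to check that the lines really have the shape the LOGFORMAT above expects. The following is only a rough sketch: the regular expression is my own approximation of that LOGFORMAT, not something Analog uses, and the names nginx_line_re and count_unparsable are hypothetical.

```shell
#!/bin/sh
# Approximation (an assumption, not Analog's own parser) of:
#   %s - %j [%d/%M/%Y:%h:%n:%j %j] "%j %r %j" %c %b "%f" "%B" %j
nginx_line_re='^[0-9.]+ - [^ ]+ \[[0-9]{2}/[A-Za-z]{3}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} [+-][0-9]{4}\] "[A-Z]+ [^ ]+ [^"]*" [0-9]{3} [0-9]+ "[^"]*" "[^"]*" .+$'

# Read log lines on stdin and print how many do NOT match the pattern.
# grep exits non-zero when it selects no lines, hence the "|| true".
count_unparsable() {
    grep -Evc "$nginx_line_re" || true
}

# Shortened version of the sample line from this post
sample='183.62.5.13 - - [06/Aug/2014:17:16:44 +0800] "GET /aboutc1g.html HTTP/1.1" 200 6642 "http://www.c1gstudio.com/web/hello.html" "Mozilla/5.0" 183.62.5.13'
printf '%s\n' "$sample" | count_unparsable
```

For compressed logs the same function can be fed through a pipe, for example gzip -dc www.c1gstudio.com.log.gz | count_unparsable. A non-zero count only means those lines differ from this approximation; it does not prove that Analog would reject them.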

Further reference

LOGFILE and OUTFILE notes

    LOGFILE new1.log,old.log
    LOGFILE /opt/log/%Y.%M/%D/*.c1gstudio.com.log.gz

Wildcards, date codes, and gz-compressed files are supported. OUTFILE does not create missing directories automatically.

    %D    date of month
    %m    month name, in English
    %M    month number
    %y    two-digit year
    %Y    four-digit year
    %H    hour
    %n    minute
    %w    day of week, in English

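However, the date codes do not support arithmetic (there is no way to say "yesterday" or "last month"), which is a bit awkward; that has to be handled outside Analog with a shell script. Further reference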
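Since the date codes cannot do arithmetic, the date can be computed in the shell and pasted into the LOGFILE path. A minimal sketch, assuming GNU date (the -d option is not available in BSD or macOS date); the function names previous_day_month and logfile_glob are hypothetical, and the path layout is the /opt/log/YYYY.MM/DD/ layout used in this post.

```shell
#!/bin/sh
# Assumes GNU date.

# Print the "YYYY.MM" directory name of the day before the given date.
# With no argument, the reference date is today.
previous_day_month() {
    if [ $# -eq 0 ]; then
        date -d '1 day ago' +%Y.%m
    else
        date -d "$1 1 day ago" +%Y.%m
    fi
}

# Build the LOGFILE wildcard for one domain suffix.
#   $1 = domain suffix, $2 = optional reference date (YYYY-MM-DD)
# The wildcard is printed literally; it is Analog that expands it later.
logfile_glob() {
    echo "/opt/log/$(previous_day_month ${2:+"$2"})/*/*$1.log.gz"
}

# Run on the 1st of September, the report should cover August.
logfile_glob c1gstudio.com 2014-09-01
```

Working from "one day ago" rather than "one month ago" means a job that runs shortly after midnight on the 1st still picks up the month that has just ended.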

Update, 2014-08-26

The arguments to LOGFILE and CACHEFILE commands are checked for containing only certain allowed characters (specifically, letters, digits, /\.:_*? space, and - between two {letter, digit, underscore}'s). This is because they could match an UNCOMPRESS command and thus be passed to the shell when the uncompress command is popen()'ed.

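A month can be split into three parts to lighten the load, for example:

    LOGFILE /opt/log/%Y.%M/[2-3]?/*.c1gstudio.com.log.gz

Analog reads the logs into memory while it runs, so for speed it is best to have more memory than the total size of the logs. CACHEOUTFILE and CACHEFILE take up a lot of disk space and did not seem very useful to me.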
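To see which day directories each third of the month would cover, the patterns can be tried out with ordinary shell globbing first. This sketch assumes that Analog's wildcard matching treats these simple patterns the same way the shell does, which I have not verified; count_day_dirs is a hypothetical helper, and the patterns 0? and 1? are my guess at the other two parts.

```shell
#!/bin/sh
# Count the day sub-directories of a month directory that match a pattern.
#   $1 = month directory, $2 = wildcard pattern for the day part
count_day_dirs() {
    # $2 is deliberately unquoted so that the shell expands the wildcard.
    set -- "$1"/$2
    # With no match the pattern stays literal and names nothing that exists.
    [ -e "$1" ] || { echo 0; return; }
    echo "$#"
}

# Build a throw-away month directory with day directories 01 to 31.
month_dir=$(mktemp -d)
day=1
while [ "$day" -le 31 ]; do
    mkdir "$month_dir/$(printf '%02d' "$day")"
    day=$((day + 1))
done

for part in '0?' '1?' '[2-3]?'; do
    echo "$part: $(count_day_dirs "$month_dir" "$part") day directories"
done
rm -r "$month_dir"
```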

Report on/off switches in the configuration file

    MONTHLY ON        # one line for each month
    WEEKLY ON         # one line for each week
    DAILYREP ON       # one line for each day
    DAILYSUM ON       # one line for each day of the week
    HOURLYREP ON      # one line for each hour of the day
    GENERAL ON        # the General Summary at the top
    REQUEST ON        # which files were requested
    FAILURE ON        # which files were not found
    DIRECTORY ON      # Directory Report
    HOST ON           # which computers requested files
    ORGANISATION ON   # which organisations they were from
    DOMAIN ON         # which countries they were in
    REFERRER ON       # where people followed links from
    FAILREF ON        # where people followed broken links from
    SEARCHQUERY ON    # the phrases and words they used...
    SEARCHWORD ON     # ...to find you from search engines
    BROWSERSUM ON     # which browser types people were using
    OSREP ON          # and which operating systems
    FILETYPE ON       # types of file requested
    SIZE ON           # sizes of files requested
    STATUS ON         # number of each type of success and failure

Command-line options

    x    GENERAL          General Summary
    1    YEARLY           Yearly Report
    Q    QUARTERLY        Quarterly Report
    m    MONTHLY          Monthly Report
    W    WEEKLY           Weekly Report
    D    DAILYREP         Daily Report
    d    DAILYSUM         Daily Summary
    H    HOURLYREP        Hourly Report
    h    HOURLYSUM        Hourly Summary
    w    WEEKHOUR         Hour of the Week Summary
    4    QUARTERREP       Quarter-Hour Report
    6    QUARTERSUM       Quarter-Hour Summary
    5    FIVEREP          Five-Minute Report
    7    FIVESUM          Five-Minute Summary
    S    HOST             Host Report
    l    REDIRHOST        Host Redirection Report
    L    FAILHOST         Host Failure Report
    Z    ORGANISATION     Organisation Report
    o    DOMAIN           Domain Report
    r    REQUEST          Request Report
    i    DIRECTORY        Directory Report
    t    FILETYPE         File Type Report
    z    SIZE             File Size Report
    P    PROCTIME         Processing Time Report
    E    REDIR            Redirection Report
    I    FAILURE          Failure Report
    f    REFERRER         Referrer Report
    s    REFSITE          Referring Site Report
    N    SEARCHQUERY      Search Query Report
    n    SEARCHWORD       Search Word Report
    Y    INTSEARCHQUERY   Internal Search Query Report
    y    INTSEARCHWORD    Internal Search Word Report
    k    REDIRREF         Redirected Referrer Report
    K    FAILREF          Failed Referrer Report
    B    BROWSERREP       Browser Report
    b    BROWSERSUM       Browser Summary
    p    OSREP            Operating System Report
    v    VHOST            Virtual Host Report
    R    REDIRVHOST       Virtual Host Redirection Report
    M    FAILVHOST        Virtual Host Failure Report
    u    USER             User Report
    j    REDIRUSER        User Redirection Report
    J    FAILUSER         User Failure Report
    c    STATUS           Status Code Report

# +A turns on all reports (note the capital A). Further reference

# Print the current settings
    analog -settings > file

# Set LOGFILE and OUTFILE on the command line
    ./analog +O/opt/htdocs/www/analog/c1gstudio2014.html /opt/log/2014.08/02/*.c1gstudio.com.log.gz
Whenever I tried this, Analog kept reporting a log format error and produced no report.

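# The options I use
    /usr/local/analog/analog -G +g/usr/local/analog/conf/c1g.cfg +b +s +S -n -o -Z -r

A leading + turns a report on and a leading - turns it off:

    +b    Browser Summary
    -n    Search Word Report
    +s    Referring Site Report
    -o    Domain Report
    -Z    Organisation Report
    +S    Host Report
    -r    Request Report

    -G    do not read the default analog.cfg
    +g    read the given configuration file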

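I use awstats for the daily reports and Analog for the monthly reports, with one aggregated monthly report per domain. Logs are stored by day, for example under /opt/log/2014.08/07/:

    www.c1gstudio.com.log.gz
    blog.c1gstudio.com.log.gz
    www.c1g.com.log.gz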

Analog runs from crontab every day, after awstats has finished:

10 5 * * * /bin/sh /opt/shell/analog.sh > /dev/null 2>&1

vi /opt/shell/analog.sh

    #!/bin/sh
    # Rebuild the Analog reports for the month that yesterday belongs to.
    ana_dir=/usr/local/analog/
    web_dir=/opt/htdocs/www/analog/
    conf_dir="${ana_dir}conf/"

    today=$(date +%d)
    yesterday=$(date +%Y%m%d -d '1 day ago')
    # Year.month and day of yesterday, matching the /opt/log/YYYY.MM/DD/ layout
    lastday_month=$(date +%Y.%m -d '1 day ago')
    lastday_day=$(date +%d -d '1 day ago')

    # Wildcards stay literal here; Analog expands them when it reads the config
    c1g_LOGFILE="/opt/log/${lastday_month}/*/*c1gstudio.com.log.gz"
    c1g_OUTFILE="${web_dir}c1gstudio${lastday_month}/index.html"
    POST_LOGFILE="/opt/log/${lastday_month}/*/*c1g.com.log.gz"
    POST_OUTFILE="${web_dir}c1g${lastday_month}/index.html"

    #if [ "$today" = "02" ]; then
    # OUTFILE does not create directories, so create them first
    if [ ! -d "$(dirname "${c1g_OUTFILE}")" ]; then
        mkdir -p "$(dirname "${c1g_OUTFILE}")"
        chown www:website "$(dirname "${c1g_OUTFILE}")"
    fi
    if [ ! -d "$(dirname "${POST_OUTFILE}")" ]; then
        mkdir -p "$(dirname "${POST_OUTFILE}")"
        chown www:website "$(dirname "${POST_OUTFILE}")"
    fi
    # Point each config file at this month's logs and report directory
    sed -i "s;^LOGFILE.*;LOGFILE ${c1g_LOGFILE};" "${conf_dir}c1gstudio.cfg"
    sed -i "s;^OUTFILE.*;OUTFILE ${c1g_OUTFILE};" "${conf_dir}c1gstudio.cfg"
    sed -i "s;^LOGFILE.*;LOGFILE ${POST_LOGFILE};" "${conf_dir}c1g.cfg"
    sed -i "s;^OUTFILE.*;OUTFILE ${POST_OUTFILE};" "${conf_dir}c1g.cfg"
    #fi

    "${ana_dir}analog" -G +g"${conf_dir}c1gstudio.cfg" +b +D -d +s +S -n -o -Z -r
    "${ana_dir}analog" -G +g"${conf_dir}c1g.cfg" +b +D -d +s +S -n -o -Z +r
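The sed step is the part of this script most worth trying out before it touches a real configuration file. Below is a small sketch that runs the same kind of substitution on a throw-away file. It assumes GNU sed (for -i), and set_analog_paths is a hypothetical helper name, not part of Analog.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the LOGFILE and OUTFILE lines of a config file.
#   $1 = config file, $2 = new LOGFILE value, $3 = new OUTFILE value
# ";" is the sed delimiter so the slashes in the paths need no escaping.
set_analog_paths() {
    sed -i \
        -e "s;^LOGFILE .*;LOGFILE $2;" \
        -e "s;^OUTFILE .*;OUTFILE $3;" \
        "$1"
}

# Try it on a temporary file instead of the real conf/c1g.cfg.
tmp_cfg=$(mktemp)
printf '%s\n' \
    'LANGUAGE SIMP-CHINESE' \
    'LOGFILE /old/path.log' \
    'OUTFILE /old/index.html' > "$tmp_cfg"

set_analog_paths "$tmp_cfg" \
    '/opt/log/2014.08/*/*c1gstudio.com.log.gz' \
    '/opt/htdocs/www/analog/c1gstudio2014.08/index.html'

cat "$tmp_cfg"
rm -f "$tmp_cfg"
```

Anchoring the patterns with ^ keeps the substitution away from commented-out lines and from CACHEOUTFILE, which also contains the text OUTFILE.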
