Installing and configuring Analog to analyze and aggregate web logs for multiple domains

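Analog is a powerful open-source web access log analyzer written in C. It supports many languages (including Chinese), runs on Linux and Windows, and reads logs from mainstream web servers such as Apache, nginx, and IIS. It is very fast: it can process 20 million log entries within 10 minutes. Its statistics focus on page views (PV), and its report pages are plainer than those of AWStats or Webalizer; for nicer charts you can use Report Magic 2.21.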

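The latest version is analog-6.0; the author has not updated it since 19-Dec-04.
Installation is simple: download the appropriate version from http://www.analog.cx/download.html. This post uses the source package as the example: unpack it into the installation directory, change into that directory, and run make.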

    wget http://www.analog.cx/analog-6.0.tar.gz
    tar zxvf analog-6.0.tar.gz
    cp -ar analog-6.0 /usr/local/
    cd /usr/local/analog-6.0
    make
    # Create the symlink in /usr/local so that /usr/local/analog points at this version
    ln -s /usr/local/analog-6.0 /usr/local/analog
    mkdir -p /opt/htdocs/www/analog
    chown www:website /opt/htdocs/www/analog
    # images is a directory, so it has to be copied recursively
    cp -r images /opt/htdocs/www/analog/
    mkdir conf
    cp analog.cfg conf/c1g.cfg

Configuration

vi conf/c1g.cfg

    # Use Simplified Chinese for the report
    LANGUAGE SIMP-CHINESE
    # nginx log format
    LOGFORMAT   (%s - %j [%d/%M/%Y:%h:%n:%j %j] "%j %r %j" %c %b "%f" "%B"\n)
    # Log files
    LOGFILE /opt/log/%Y.%M/*/*c1gstudio.com.log.gz
    # Output file
    OUTFILE /opt/htdocs/www/analog/c1gstudio%Y.%M/index.html
    # Host name
    HOSTNAME "c1gstudio.com"
    # Host URL
    HOSTURL http://www.c1gstudio.com/
    # Web directory that holds the report images
    IMAGEDIR ../images/
    # List only files with at least 1000 page requests
    REQFLOOR 1000p
    # Count every forum.php URL as a single file
    FILEALIAS /forum.php* /forum.php
    # Report subdirectories
    SUBDIR */*

LOGFORMAT reference

    %S   host (the client hostname, or address of the computer making the request)
    %s   numerical IP address of client (if recorded in a separate field; used when %S is empty)
    %r   file requested
    %q   query string (part of filename after ?, if recorded in a separate field)
    %B   browser
    %A   browser with +'s instead of spaces
    %f   referrer
    %u   user (tip: a cookie or session id can usefully be defined as %u too)
    %v   virtual host (the server hostname, also called the virtual domain)
    %d   day of the month
    %m   month in digits
    %M   month, three letter English abbreviation
    %y   year, last two digits
    %Y   year, four digits
    %Z   year, two or four digits (less efficient)
    %h   hour of the day
    %n   minute of the hour
    %a   a or A for am, or p or P for pm, if %h is in the 12-hour clock. (So to match "am" you need %am and to match "AM" you need %aM)
    %U   "Unix time" (seconds since beginning of 1970, GMT). If it includes decimals, use %U.%j
    %b   number of bytes transferred
    %t   processing time in seconds
    %T   processing time in milliseconds
    %D   processing time in microseconds
    %c   HTTP status code
    %C   code words used instead of HTTP status code in some servers -- only used internally
    %j   junk: ignore this field (field can be empty too)
    %w   white space: spaces or tabs
    %W   optional white space
    %%   % sign
    \n   new line
    \t   tab stop
    \\   single backslash

My nginx log format

    '$remote_addr - $remote_user [$time_local] "$request" '
       '$status $body_bytes_sent "$http_referer" '
       '"$http_user_agent" $http_x_forwarded_for';

    183.62.5.13 - - [06/Aug/2014:17:16:44 +0800] "GET /aboutc1g.html HTTP/1.1" 200 6642 "http://www.c1gstudio.com/web/hello.html" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36" 183.62.5.13

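My format has an extra $http_x_forwarded_for field at the end, so LOGFORMAT also needs a trailing %j to discard it; Analog will not handle that field (for example a "-") on its own.

    LOGFORMAT   (%s - %j [%d/%M/%Y:%h:%n:%j %j] "%j %r %j" %c %b "%f" "%B" %j\n)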
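
To see how the LOGFORMAT fields line up with a real log line, the pattern can be approximated with a regular expression. This is only a rough sketch for checking a line by hand, not how Analog itself parses logs; the function name parse_line is hypothetical, and sed -E (extended regular expressions) is assumed to be available.

```shell
#!/bin/sh
# Rough regex counterpart of the LOGFORMAT above (illustration only).
# Captures: 1 = client IP (%s), 2 = requested file (%r),
#           3 = status code (%c), 4 = bytes (%b).
# The final " .*" plays the role of the trailing %j for $http_x_forwarded_for.
parse_line() {
    printf '%s\n' "$1" | sed -E -n \
      's/^([^ ]+) - [^ ]* \[[^]]+\] "[^ ]+ ([^ ]+) [^"]*" ([0-9]{3}) ([0-9]+|-) "[^"]*" "[^"]*" .*$/ip=\1 path=\2 status=\3 bytes=\4/p'
}

sample='183.62.5.13 - - [06/Aug/2014:17:16:44 +0800] "GET /aboutc1g.html HTTP/1.1" 200 6642 "http://www.c1gstudio.com/web/hello.html" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36" 183.62.5.13'

# Prints: ip=183.62.5.13 path=/aboutc1g.html status=200 bytes=6642
parse_line "$sample"
```

A line that does not fit the pattern prints nothing, which is a quick way to spot entries that Analog would probably reject as well.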


LOGFILE and OUTFILE notes

LOGFILE new1.log,old*.log
LOGFILE /opt/log/%Y.%M/%D/*.c1gstudio.com.log.gz
Wildcards, date codes, and gzip-compressed files are supported. OUTFILE does not create missing directories automatically.

    %D  date of month
    %m  month name, in English
    %M  month number
    %y  two-digit year
    %Y  four-digit year
    %H  hour
    %n  minute
    %w  day of week, in English

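However, the date codes do not support arithmetic (for example, "yesterday"), which is a bit inconvenient; that has to be handled outside Analog with a shell script.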
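
Because the date codes cannot express "yesterday", the paths can be computed in the shell first and then written into the configuration. A minimal sketch, assuming GNU date (its -d option); the function name log_paths and its optional reference-date argument are hypothetical and exist only to make the example easy to check, while the directory layout is the one used in this post.

```shell
#!/bin/sh
# Print LOGFILE/OUTFILE lines for the day before a reference date.
# $1 is an optional reference date (defaults to today); requires GNU date.
log_paths() {
    ref_date=${1:-today}
    month=$(date -d "$ref_date 1 day ago" +%Y.%m)
    day=$(date -d "$ref_date 1 day ago" +%d)
    # Double quotes keep the * from being expanded by the shell
    echo "LOGFILE /opt/log/${month}/${day}/*.c1gstudio.com.log.gz"
    echo "OUTFILE /opt/htdocs/www/analog/c1gstudio${month}/index.html"
}

# On the first day of a month, "yesterday" falls in the previous month:
log_paths 2014-09-01
```

Called with 2014-09-01, the sketch prints paths under 2014.08/31, which is exactly the case the built-in date codes cannot handle.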
==================================
Update, 2014-8-26

The arguments to LOGFILE and CACHEFILE commands are checked for containing only certain allowed characters (specifically, letters, digits, /\.:_*? space, and - between two {letter, digit, underscore}'s). This is because they could match an UNCOMPRESS command and thus be passed to the shell when the uncompress command is popen()'ed.

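To lighten the load, a month can be split into three parts:
LOGFILE /opt/log/%Y.%M/[2-3]?/*.c1gstudio.com.log.gz
Analog reads the logs into memory while it runs, so for it to run fast it is best to have more memory than the total size of the logs. CACHEOUTFILE and CACHEFILE take up a lot of disk space and did not seem very useful to me.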
==================================

Report on/off switches in the configuration file

    MONTHLY ON       # one line for each month
    WEEKLY ON        # one line for each week
    DAILYREP ON      # one line for each day
    DAILYSUM ON      # one line for each day of the week
    HOURLYREP ON     # one line for each hour of the day
    GENERAL ON       # the General Summary at the top
    REQUEST ON       # which files were requested
    FAILURE ON       # which files were not found
    DIRECTORY ON     # Directory Report
    HOST ON          # which computers requested files
    ORGANISATION ON  # which organisations they were from
    DOMAIN ON        # which countries they were in
    REFERRER ON      # where people followed links from
    FAILREF ON       # where people followed broken links from
    SEARCHQUERY ON   # the phrases and words they used...
    SEARCHWORD ON    # ...to find you from search engines
    BROWSERSUM ON    # which browser types people were using
    OSREP ON         # and which operating systems
    FILETYPE ON      # types of file requested
    SIZE ON          # sizes of files requested
    STATUS ON        # number of each type of success and failure

Command-line options

    x  GENERAL         General Summary
    1  YEARLY          Yearly Report
    Q  QUARTERLY       Quarterly Report
    m  MONTHLY         Monthly Report
    W  WEEKLY          Weekly Report
    D  DAILYREP        Daily Report
    d  DAILYSUM        Daily Summary
    H  HOURLYREP       Hourly Report
    h  HOURLYSUM       Hourly Summary
    w  WEEKHOUR        Hour of the Week Summary
    4  QUARTERREP      Quarter-Hour Report
    6  QUARTERSUM      Quarter-Hour Summary
    5  FIVEREP         Five-Minute Report
    7  FIVESUM         Five-Minute Summary
    S  HOST            Host Report
    l  REDIRHOST       Host Redirection Report
    L  FAILHOST        Host Failure Report
    Z  ORGANISATION    Organisation Report
    o  DOMAIN          Domain Report
    r  REQUEST         Request Report
    i  DIRECTORY       Directory Report
    t  FILETYPE        File Type Report
    z  SIZE            File Size Report
    P  PROCTIME        Processing Time Report
    E  REDIR           Redirection Report
    I  FAILURE         Failure Report
    f  REFERRER        Referrer Report
    s  REFSITE         Referring Site Report
    N  SEARCHQUERY     Search Query Report
    n  SEARCHWORD      Search Word Report
    Y  INTSEARCHQUERY  Internal Search Query Report
    y  INTSEARCHWORD   Internal Search Word Report
    k  REDIRREF        Redirected Referrer Report
    K  FAILREF         Failed Referrer Report
    B  BROWSERREP      Browser Report
    b  BROWSERSUM      Browser Summary
    p  OSREP           Operating System Report
    v  VHOST           Virtual Host Report
    R  REDIRVHOST      Virtual Host Redirection Report
    M  FAILVHOST       Virtual Host Failure Report
    u  USER            User Report
    j  REDIRUSER       User Redirection Report
    J  FAILUSER        User Failure Report
    c  STATUS          Status Code Report

# +A turns on all reports

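# Print the current settings
analog -settings > file

# Set LOGFILE and OUTFILE on the command line
./analog +O/opt/htdocs/www/analog/c1gstudio2014.html /opt/log/2014.08/02/*.c1gstudio.com.log.gz
When I ran it this way it always reported a log format error and produced no report.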

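# The options I use
/usr/local/analog/analog -G +g/usr/local/analog/conf/c1g.cfg +b +s +S -n -o -Z -r
A leading + turns a report on and a leading - turns it off:
+b Browser Summary
-n Search Word Report
+s Referring Site Report
-o Domain Report
-Z Organisation Report
+S Host Report
-r Request Report

-G do not read the default analog.cfg
+g read the given configuration file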
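
When the set of reports changes often, the +x / -x options can be generated from two lists of report codes instead of being typed by hand. This is only a sketch: report_flags is a hypothetical helper, and the command is printed as a dry run rather than executed.

```shell
#!/bin/sh
# report_flags ON_CODES OFF_CODES
# Turns space-separated one-letter report codes into +x / -x options.
report_flags() {
    flags=""
    for code in $1; do flags="$flags +$code"; done
    for code in $2; do flags="$flags -$code"; done
    # printf avoids echo treating a lone "-n" as one of its own options
    printf '%s\n' "${flags# }"
}

# Dry run: print the command line instead of running Analog.
printf '%s\n' "/usr/local/analog/analog -G +g/usr/local/analog/conf/c1g.cfg $(report_flags "b s S" "n o Z r")"
```

With the lists above, the generated options are the same as the ones shown in this post: +b +s +S -n -o -Z -r.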

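I use AWStats for the daily reports and Analog for the monthly reports, with one aggregated monthly report per domain.
Logs are stored by day, for example under /opt/log/2014.08/07/:
www.c1gstudio.com.log.gz
blog.c1gstudio.com.log.gz
www.c1g.com.log.gz

Analog runs every day after AWStats has finished.
crontab

    10 5 * * * /bin/sh /opt/shell/analog.sh > /dev/null 2>&1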

vi /opt/shell/analog.sh

    #!/bin/sh
    # Directory paths end with a slash so that names can be appended directly
    ana_dir=/usr/local/analog/
    web_dir=/opt/htdocs/www/analog/
    conf_dir="${ana_dir}conf/"

    today=$(date +%d)
    # Yesterday's date; the logs of the previous day are the newest complete ones
    # (date -d requires GNU date)
    yesterday=$(date +%Y%m%d -d '1 day ago')
    lastday_month=$(date +%Y.%m -d '1 day ago')
    lastday_day=$(date +%d -d '1 day ago')

    # All c1gstudio.com subdomains, every day of that month
    c1g_LOGFILE="/opt/log/${lastday_month}/*/*c1gstudio.com.log.gz"
    c1g_OUTFILE="${web_dir}c1gstudio${lastday_month}/index.html"

    # Leading * so that www.c1g.com.log.gz is matched
    POST_LOGFILE="/opt/log/${lastday_month}/*/*c1g.com.log.gz"
    POST_OUTFILE="${web_dir}c1g${lastday_month}/index.html"

    #if [ "$today" = "02" ]; then
    # OUTFILE does not create directories, so create them when missing
    if [ ! -d "$(dirname "${c1g_OUTFILE}")" ]; then
        mkdir -p "$(dirname "${c1g_OUTFILE}")"
        chown www:website "$(dirname "${c1g_OUTFILE}")"
    fi
    if [ ! -d "$(dirname "${POST_OUTFILE}")" ]; then
        mkdir -p "$(dirname "${POST_OUTFILE}")"
        chown www:website "$(dirname "${POST_OUTFILE}")"
    fi
    # Point each configuration at that month; the ^ anchor keeps comments
    # and CACHEOUTFILE lines from being rewritten
    sed -i "s;^LOGFILE .*;LOGFILE ${c1g_LOGFILE};" "${conf_dir}c1gstudio.cfg"
    sed -i "s;^OUTFILE .*;OUTFILE ${c1g_OUTFILE};" "${conf_dir}c1gstudio.cfg"
    sed -i "s;^LOGFILE .*;LOGFILE ${POST_LOGFILE};" "${conf_dir}c1g.cfg"
    sed -i "s;^OUTFILE .*;OUTFILE ${POST_OUTFILE};" "${conf_dir}c1g.cfg"
    #fi

    "${ana_dir}analog" -G +g"${conf_dir}c1gstudio.cfg" +b +D -d +s +S -n -o -Z -r
    "${ana_dir}analog" -G +g"${conf_dir}c1g.cfg" +b +D -d +s +S -n -o -Z +r
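
The repeated sed calls in the script can be folded into one small helper. The sketch below is an illustration, not part of the original script: set_directive is a hypothetical name, GNU sed (-i) is assumed as in the script, and the demonstration runs against a throwaway file from mktemp rather than a real Analog configuration.

```shell
#!/bin/sh
# set_directive FILE NAME VALUE
# Replaces the value of a directive that starts a line in FILE.
# The ^ anchor leaves comments and longer names (e.g. CACHEOUTFILE) alone.
set_directive() {
    sed -i "s;^$2 .*;$2 $3;" "$1"
}

# Demonstration on a temporary file (contents are made up for the example).
cfg=$(mktemp)
printf '%s\n' \
    '# LOGFILE is set by cron' \
    'LOGFILE /old/path' \
    'CACHEOUTFILE /tmp/cache' \
    'OUTFILE /old/index.html' > "$cfg"

set_directive "$cfg" LOGFILE '/opt/log/2014.08/*/*c1gstudio.com.log.gz'
set_directive "$cfg" OUTFILE '/opt/htdocs/www/analog/c1gstudio2014.08/index.html'
cat "$cfg"
```

Only the LOGFILE and OUTFILE lines change; the comment and the CACHEOUTFILE line are left untouched, which an unanchored pattern such as OUTFILE.* would not guarantee.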
