Log format:

211.136.115.45|-|[14/Dec/2012:00:00:02 +0800]|GET 3701.aac?type=0&uid=3450371&pos=6&key=%E5%8D%93%E4%BE%9D%E5%A9%B7&ps=34.2.300&version=DM1.2.0.95_S60V3_320X240&channel=TG60224 HTTP/1.0|206|204800|-|E63|10.166.96.105|bytes 204800-409599/504003|0.065|-|SRV85
awk -F'|' '{a[substr($4,match($4,/uid=[0-9]+/),RLENGTH)]+=$6;b[substr($4,match($4,/uid=[0-9]+/),RLENGTH)]++}END{for(i in a)print i,a[i]*8/(b[i]*1000)" K/uid"}'
Simplified, calling match only once per line:
awk -F'|' '{match($4,/uid=[0-9]+/);a[substr($4,RSTART,RLENGTH)]+=$6;b[substr($4,RSTART,RLENGTH)]++}END{for(i in a)print i,a[i]*8/(b[i]*1000)" K/uid"}'
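To make the calculation easier to follow, here is a minimal sketch of the same one-liner written out over several lines and fed three made-up log lines. The sample lines, IP addresses and the array names bytes and hits are assumptions for illustration only; the field positions ($4 for the request, $6 for what appears to be the response size in bytes) follow the log format above.

```shell
out=$(printf '%s\n' \
  '10.0.0.1|-|[t1]|GET a.aac?type=0&uid=42&pos=6 HTTP/1.0|206|1000|-' \
  '10.0.0.1|-|[t2]|GET b.aac?type=0&uid=42&pos=1 HTTP/1.0|206|3000|-' \
  '10.0.0.2|-|[t3]|GET c.aac?type=0&uid=7&pos=2 HTTP/1.0|200|500|-' |
awk -F'|' '
{
    # $4 is the request line; pull out the "uid=NNN" token and use it as the key
    if (match($4, /uid=[0-9]+/)) {
        key = substr($4, RSTART, RLENGTH)
        bytes[key] += $6      # running total of field 6 for this uid
        hits[key]++           # number of requests seen for this uid
    }
}
END {
    # average of field 6 per request, times 8 and divided by 1000
    for (k in bytes)
        print k, bytes[k] * 8 / (hits[k] * 1000) " K/uid"
}' | LC_ALL=C sort)
echo "$out"
```

With this input, uid=42 has two requests totalling 4000 bytes and uid=7 has one request of 500 bytes, so the script prints "uid=42 16 K/uid" and "uid=7 4 K/uid". The sort is only there because the order of for (k in bytes) is unspecified.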

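The match function returns the position (1-based index) at which the regular expression matches in the string, or 0 if the regular expression is not found. match also sets the built-in variable RSTART to the start position of the matched substring and RLENGTH to the length of the matched substring in characters (RSTART is 0 and RLENGTH is -1 when nothing matches). substr can use these variables to extract the matched text. The function has the following form: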
match( string, regular expression )
Examples:
$ awk 'BEGIN{start=match("this is a test",/[a-z]+$/); print start}'
11
$ awk 'BEGIN{start=match("this is a test",/[a-z]+$/); print start, RSTART, RLENGTH }'
11 11 4
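As a small sketch of how match and substr fit together, the following extracts the uid token from a shortened request string and then shows what match leaves behind when nothing matches. The string s and the pattern /nomatch/ are made up for illustration.

```shell
out=$(awk 'BEGIN {
    s = "GET 3701.aac?type=0&uid=3450371&pos=6"
    # on success match() returns the 1-based start and sets RSTART and RLENGTH
    if (match(s, /uid=[0-9]+/))
        print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
    # on failure match() returns 0 and sets RSTART to 0 and RLENGTH to -1
    r = match(s, /nomatch/)
    print r, RSTART, RLENGTH
}')
echo "$out"
```

This prints "21 11 uid=3450371" followed by "0 0 -1". Because a failed match resets RSTART and RLENGTH, it is safer to test the return value of match before calling substr than to rely on values left over from an earlier line.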

Example:

Compute the average startup latency (startup_time) and the average number of playback stalls (block_times).

awk '/version:DM4.3.0.00/{match($0,/startup_time:[0-9\.]+/);a+=substr($0,RSTART+13,RLENGTH-13);match($0,/block_times:[0-9\.]+/);b+=substr($0,RSTART+12,RLENGTH-12);i++}END{print "count:"i,"startup_time:"a/i,"block_times:"b/i}' duomi-2013-09-12.play

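The same statistics for a single version and a single hour, counting only records whose startup_time is below 300 (the +0 forces a numeric comparison, because substr returns a string and a string compared with 300 would be compared character by character):

zcat duomi-2013-09-27.play.gz | awk '/version:DM5.2.5.00/ && /2013-09-27 22:/{match($0,/startup_time:[0-9\.]+/);t=substr($0,RSTART+13,RLENGTH-13)+0;if(t<300){a+=t;match($0,/block_times:[0-9\.]+/);b+=substr($0,RSTART+12,RLENGTH-12);i++}}END{print "count:"i,"startup_time:"a/i,"block_times:"b/i}'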
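The hard-coded offsets 13 and 12 in the commands above are just the lengths of "startup_time:" and "block_times:". One possible way to avoid counting characters by hand is a small helper function, sketched below. The function name val and the four sample records are hypothetical; the real .play log layout is not shown in these notes, so the sample lines only assume that each record contains key:value tokens.

```shell
out=$(printf '%s\n' \
  '2013-09-27 22:01:00 version:DM5.2.5.00 startup_time:100 block_times:2' \
  '2013-09-27 22:02:00 version:DM5.2.5.00 startup_time:1000 block_times:9' \
  '2013-09-27 22:03:00 version:DM5.2.5.00 startup_time:50.5 block_times:0' \
  '2013-09-27 21:59:00 version:DM5.2.5.00 startup_time:10 block_times:1' |
awk '
# val(key): number that follows "key:" in the current record, or "" if absent
function val(key) {
    if (match($0, (key ":[0-9.]+")))
        return substr($0, RSTART + length(key) + 1, RLENGTH - length(key) - 1) + 0
    return ""
}
/version:DM5\.2\.5\.00/ && /2013-09-27 22:/ {
    t = val("startup_time")
    if (t != "" && t < 300) {     # t is numeric here, so < compares numbers
        a += t
        b += val("block_times")   # a missing value adds 0
        i++
    }
}
END {
    # skip the report when nothing matched, to avoid dividing by zero
    if (i) print "count:" i, "startup_time:" a / i, "block_times:" b / i
}')
echo "$out"
```

Of the four sample records, only the first and third pass all filters (the second has startup_time 1000, the fourth is outside the 22:00 hour), so the output is "count:2 startup_time:75.25 block_times:1".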

How to get the number of elements in an awk array, that is, the array length:

[root@COP05-PCS-DM01 ~]# awk 'BEGIN{split("3-3-4",a,"-");print length(a)}'

3
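Calling length on an array is accepted by gawk and several other current implementations, but it has long been documented as an extension rather than something every awk provides. Two alternatives that do not depend on it are sketched here: keeping the return value of split, and counting keys with a for-in loop. The variable names n, c and k are arbitrary.

```shell
out=$(awk 'BEGIN {
    n = split("3-3-4", a, "-")   # split() returns the number of pieces
    c = 0
    for (k in a) c++             # counting keys works for any array
    print n, c
}')
echo "$out"
```

Both counts are 3 for this input, so the command prints "3 3".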