CountFilter

Description

Countfilter counts input tokens. It is implemented as a filter for tokenfilter so that it can handle any kind of input text (lines, words, a whole file, and so on). The filter increments a property value each time an input token is counted. Nested elements perform additional actions on each matching input.
Countfilter may also be used directly within a filterchain, in which case a tokenfilter is created implicitly. For this usage, countfilter has an extra attribute, "byline", that specifies whether to use a linetokenizer (byline="true") or a filetokenizer (byline="false"). The default is "true".
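
As a rough illustration of the direct usage, the fragment below places countfilter straight inside a filterchain, first counting lines and then treating the whole input as a single token. The property names nb.lines and nb.inputs are hypothetical, and the fragment assumes countfilter has already been declared with a typedef.

```xml
<filterchain>
  <!-- Implicit tokenfilter with a linetokenizer (byline defaults to "true") -->
  <countfilter property="nb.lines" />
</filterchain>

<filterchain>
  <!-- Implicit tokenfilter with a filetokenizer: the whole input is one token -->
  <countfilter property="nb.inputs" byline="false" />
</filterchain>
```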

Definition

<typedef name="countfilter"
         classname="net.sf.antcount.filters.CountFilter"
         classpath="antcount.jar" />

Attributes

| Name | Description | Required | Default |
|---|---|---|---|
| property | The name of the Ant property to set with the total input count. | No | The property is not set |
| override | Whether an existing Ant property may be overridden with the total count. Caution: setting this to true breaks Ant's property immutability rule. | No | false |
| init | The initial value of the total counter. | No | 0 |
| step | The increment step to use for the total counter. I don't know if this can be useful to anyone... | No | 1 |
| contains | The input is counted only if it contains this string. | No | Count each input |
| match | The input is counted only if it matches this regular expression. | No | Count each input |
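
The attributes can be combined as in the sketch below, which is only an illustration: the property names are hypothetical, and the init and step values are arbitrary choices made to show the effect on the counter.

```xml
<tokenfilter>
  <!-- Count only lines that contain the literal string "ERROR" -->
  <countfilter property="nb.errors" contains="ERROR" />
  <!-- Count lines starting with a yyyy-mm-dd date;
       the counter starts at 100 and grows by 10 per counted line -->
  <countfilter property="nb.dated" match="^(....-..-..).*" init="100" step="10" />
</tokenfilter>
```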

Nested Elements

Countfilter accepts nested elements that perform further processing on each matching input.

counteach

The counteach element takes the match attribute of its parent countfilter and applies a select substitution pattern to it. See ReplaceRegexp for more information about this replacement mechanism. It then counts the number of occurrences of each resulting substitution string, and each count is stored in its own property.
You can retrieve the properties created this way with propertyset, echoproperties, or ant-contrib's propertyselector.
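
For instance, assuming counteach was configured with a hypothetical propertyprefix of "count.", the generated properties could be gathered with a standard Ant propertyset and printed with echoproperties. The id counted.properties is an arbitrary name chosen for this sketch.

```xml
<!-- Collect every property whose name starts with "count." -->
<propertyset id="counted.properties">
  <propertyref prefix="count." />
</propertyset>

<!-- Print the collected properties -->
<echoproperties>
  <propertyset refid="counted.properties" />
</echoproperties>
```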

When counteach tries to create a property that already exists, the outcome depends on the override and reinit attributes described below:

| Name | Description | Required | Default |
|---|---|---|---|
| select | The substitution pattern that selects what is counted and for which properties are created. | No | The whole input is used |
| flags | The regexp flags to use. See ReplaceRegexp for more information. | No | None |
| propertyprefix | The prefix to use when creating properties. | No | No prefix. Each property has the same name as the select result |
| override | Whether existing properties may be overridden. Caution: setting this to true breaks Ant's property immutability rule. | No | False |
| reinit | Whether to reinitialize an already set property to 0 before counting. This cannot be used unless override="true" is set. | No | False. By default, if a property is already set and its value is a number, counting starts from that number. |
| verbose | Whether to print a message when creating, reusing or overwriting a property. Overrides of existing, non-numeric properties are always logged. | No | False |
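
A minimal sketch of these attributes used together, reusing the date pattern from the Examples section: each per-date counter is restarted from 0 even if its property already exists, and every property that is created, reused or overwritten is logged. The prefix "count." is only an illustrative choice.

```xml
<countfilter match="^(....-..-..).*">
  <!-- reinit requires override="true"; verbose logs each property change -->
  <counteach propertyprefix="count." select="\1"
             override="true" reinit="true" verbose="true" />
</countfilter>
```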

max

Finds the maximum value of a selected float. Selection of the float value is done by using a select regular expression pattern against the parent countfilter match attribute. See ReplaceRegexp for more information about this replacement mechanism.

| Name | Description | Required | Default |
|---|---|---|---|
| property | The property that will store the max value. | No | Do not calculate a global value (for all input). |
| propertyselect | Calculate the max value for each matching value. For example, the following gets the max of \2 for each value of \1; generated properties are named max.of.\1: <max propertyselect="max.of.\1" select="\2" /> | No | Do not calculate a value for specific input. |
| propertyprefix | The prefix to use when creating properties set with propertyselect. | No | No prefix |
| override | Whether an existing property may be overridden. Caution: setting this to true breaks Ant's property immutability rule. | No | False |
| select | The substitution pattern that selects the float value. | No | The whole input is used |
| flags | The regexp flags to use. See ReplaceRegexp for more information. | No | None |
| failonerror | Whether to stop when an error occurs while parsing a float. | No | False |
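
As a sketch of how these attributes might fit together, the fragment below assumes input lines of the form name;value. It keeps a global maximum and one maximum per name, and stops if a value cannot be parsed as a float. The property names and the "stats." prefix are hypothetical; the resulting per-name property names are assumed to be the prefix followed by the propertyselect result.

```xml
<countfilter match="(.*);([0-9\.]+)">
  <!-- Global maximum goes to max.value; per-name maxima go to properties
       built from "max.of.\1", which the prefix is assumed to precede -->
  <max property="max.value"
       propertyprefix="stats."
       propertyselect="max.of.\1"
       select="\2"
       failonerror="true" />
</countfilter>
```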

min

Same as max, but finds the minimum value.

| Name | Description | Required | Default |
|---|---|---|---|
| property | The property that will store the min value. | No | Do not calculate a global value (for all input). |
| propertyselect | Calculate the min value for each matching value. For example, the following gets the min of \2 for each value of \1; generated properties are named min.of.\1: <min propertyselect="min.of.\1" select="\2" /> | No | Do not calculate a value for specific input. |
| propertyprefix | The prefix to use when creating properties set with propertyselect. | No | No prefix |
| override | Whether an existing property may be overridden. Caution: setting this to true breaks Ant's property immutability rule. | No | False |
| select | The substitution pattern that selects the float value. | No | The whole input is used |
| flags | The regexp flags to use. See ReplaceRegexp for more information. | No | None |
| failonerror | Whether to stop when an error occurs while parsing a float. | No | False |

sum

Same as max, but adds all parsed floats.

| Name | Description | Required | Default |
|---|---|---|---|
| property | The property that will store the sum result. | No | Do not calculate a global value (for all input). |
| propertyselect | Add up the values for each matching value. For example, the following gets the sum of \2 for each value of \1; generated properties are named sum.of.\1: <sum propertyselect="sum.of.\1" select="\2" /> | No | Do not calculate a value for specific input. |
| propertyprefix | The prefix to use when creating properties set with propertyselect. | No | No prefix |
| override | Whether an existing property may be overridden. Caution: setting this to true breaks Ant's property immutability rule. | No | False |
| select | The substitution pattern that selects the float value. | No | The whole input is used |
| flags | The regexp flags to use. See ReplaceRegexp for more information. | No | None |
| failonerror | Whether to stop when an error occurs while parsing a float. | No | False |
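
A hedged sketch of a sum, assuming the nested element is named sum as the section heading indicates, and assuming hypothetical input lines of the form user;bytes. It accumulates a global total and one total per user; all property names are illustrative.

```xml
<countfilter match="(.*);([0-9\.]+)">
  <!-- total.bytes holds the sum over all matching lines;
       bytes.for.\1 holds the sum for each distinct value of group 1 -->
  <sum property="total.bytes"
       propertyselect="bytes.for.\1"
       select="\2" />
</countfilter>
```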

avg

Same as max, but calculates the average of all parsed floats.

| Name | Description | Required | Default |
|---|---|---|---|
| property | The property that will store the average value. | No | Do not calculate a global value (for all input). |
| propertyselect | Calculate the avg value for each matching value. For example, the following gets the avg of \2 for each value of \1; generated properties are named avg.of.\1: <avg propertyselect="avg.of.\1" select="\2" /> | No | Do not calculate a value for specific input. |
| propertyprefix | The prefix to use when creating properties set with propertyselect. | No | No prefix |
| override | Whether an existing property may be overridden. Caution: setting this to true breaks Ant's property immutability rule. | No | False |
| select | The substitution pattern that selects each float value. | No | The whole input is used |
| flags | The regexp flags to use. See ReplaceRegexp for more information. | No | None |
| failonerror | Whether to stop when an error occurs while parsing a float. | No | False |

Examples

Count input lines in property ${nb.lines}:

<tokenfilter>
  <countfilter property="nb.lines" />
</tokenfilter>


Count input words in property ${nb.words} and words containing 'xxx' in ${nb.words.xxx}:

<tokenfilter>
  <stringtokenizer />
  <countfilter property="nb.words" />
  <countfilter property="nb.words.xxx" contains="xxx" />
</tokenfilter>


With Ant 1.7, the following parses the content of a set of zip files and counts the total number of lines (this example uses the ant-contrib for task):

<property name="total" value="0" />
<for param="zip.file">
  <fileset dir="${zip.dir}" includes="*.zip" />
  <sequential>
    <concat>
      <zipfileset src="@{zip.file}" includes="**" />
      <filterchain>
        <tokenfilter>
          <countfilter property="total" init="${total}" override="true" />
        </tokenfilter>
        <stopfilter />
      </filterchain>
    </concat>
  </sequential>
</for>


Each line of the log files in ${logs.dir} starts with a date in the 'yyyy-mm-dd' format. The following scans all these files, gets the number of valid lines (those starting with a date), and sets for each date a property giving the occurrence count for that date.

<concat>
  <fileset dir="${logs.dir}" includes="**" />
  <filterchain>
    <tokenfilter>
      <countfilter property="nb.matching.lines" match="^(....-..-..).*">
        <counteach propertyprefix="count." select="\1" />
      </countfilter>
    </tokenfilter>
    <stopfilter />
  </filterchain>
</concat>
<echo>
${nb.matching.lines} matching lines.
Count per day:
</echo>
<echoproperties prefix="count." />


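Count the number of occurrences of each word in a text:

<concat>
  <fileset dir="${logs.dir}" includes="**" />
  <filterchain>
    <tokenfilter>
      <!-- Split the input into words; without a tokenizer, tokenfilter works line by line -->
      <stringtokenizer />
      <countfilter>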
        <counteach propertyprefix="count." />
      </countfilter>
    </tokenfilter>
    <stopfilter />
  </filterchain>
</concat>
<echoproperties prefix="count." />


Find the min, max and average values of all float values in the text:

<tokenfilter>
  <stringtokenizer />
  <countfilter match="[0-9\.]+">
    <min property="min" />
    <max property="max" />
    <avg property="avg" />
  </countfilter>
</tokenfilter>


Suppose a log file contains lines with a servlet name and a response time, like this:
MyServlet;0.345
The following finds the average response time for each servlet and the global average response time:

<countfilter match="(.*);([0-9\.]+)">
  <avg select="\2" property="average.response.time"
       propertyselect="average.response.time.for.\1" />
</countfilter>