# TopK Processor Plugin

The TopK processor plugin is a filter that returns the top series over a
period of time. It can be tuned to calculate the top metrics using different
aggregation functions.

This processor goes through these steps when processing a batch of metrics:

1. Group measurements into buckets based on their tags and name.
2. Every N seconds, for each bucket and each selected field, aggregate all the
   measurements in the bucket over that field using the configured aggregation
   function (sum, mean, min, or max).
3. For each computed aggregation, order the buckets by the aggregated value,
   then return all measurements in the top `K` buckets.

Notes:

* The processor deduplicates metrics
* The name of the measurement is always used when grouping it
* Depending on the number of metrics in each bucket, more than `K` series may be returned
* If a measurement does not have one of the selected fields, it is dropped from the aggregation
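
The steps and notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plugin's actual implementation: the `(name, tags, fields)` tuple shape, the `top_k` function, and its parameters are hypothetical, all metrics passed in are assumed to belong to one period, and grouping uses every tag (the default `group_by`).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical metric shape for this sketch: (name, tags, fields).
AGGREGATORS = {"sum": sum, "mean": mean, "min": min, "max": max}


def top_k(metrics, k=10, fields=("value",), aggregation="mean", bottomk=False):
    """Return the metrics that belong to the top k buckets of each field."""
    # Step 1: bucket metrics by measurement name plus all of their tags.
    buckets = defaultdict(list)
    for name, tags, values in metrics:
        key = (name, tuple(sorted(tags.items())))
        buckets[key].append((name, tags, values))

    aggregate = AGGREGATORS[aggregation]
    selected = set()
    for field in fields:
        # Step 2: aggregate each bucket over this field, skipping metrics
        # that do not carry the field.
        scores = {}
        for key, members in buckets.items():
            samples = [v[field] for _, _, v in members if field in v]
            if samples:
                scores[key] = aggregate(samples)
        # Step 3: order the buckets and keep the k best for this field.
        ranked = sorted(scores, key=scores.get, reverse=not bottomk)
        selected.update(ranked[:k])

    # Assumption: a bucket selected by several fields is emitted only once.
    return [m for key, members in buckets.items() if key in selected for m in members]


metrics = [
    ("cpu", {"host": "a"}, {"value": 1.0}),
    ("cpu", {"host": "a"}, {"value": 3.0}),
    ("cpu", {"host": "b"}, {"value": 5.0}),
    ("cpu", {"host": "c"}, {"value": 0.5}),
]
# Buckets a (mean 2.0) and b (mean 5.0) win, so three metrics come back for k=2.
print(len(top_k(metrics, k=2)))  # 3
```

The usage at the end shows why more than `K` series can be returned: `K` limits the number of buckets, not the number of metrics inside them.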
## Global configuration options <!-- @/docs/includes/plugin_config.md -->

In addition to the plugin-specific configuration settings, plugins support
additional global and plugin configuration settings. These settings are used to
modify metrics, tags, and fields, create aliases, and configure ordering.
See [CONFIGURATION.md][CONFIGURATION.md] for more details.

[CONFIGURATION.md]: ../../../docs/CONFIGURATION.md#plugins

## Configuration
```toml @sample.conf
# Return the top k series over a period of time.
[[processors.topk]]
## How many seconds between aggregations
# period = 10

## How many top buckets to return per field
## Every field specified to aggregate over will return k buckets.
## For example, 1 field with k of 10 will return 10 buckets, while 2 fields
## with k of 3 will return 6 buckets.
# k = 10

## Which tags to group the aggregation by. Globs can be specified, in
## which case any tag matching the glob will be aggregated over. If set to
## an empty list, no aggregation over tags is done.
# group_by = ['*']

## The field(s) to aggregate
## Each field defined is used to create an independent aggregation. Each
## aggregation will return k buckets. If a metric does not have a defined
## field, the metric will be dropped from the aggregation. Consider using
## the defaults processor plugin to ensure fields are set if required.
# fields = ["value"]

## What aggregation function to use. Options: sum, mean, min, max
# aggregation = "mean"

## Instead of the top k largest metrics, return the bottom k lowest metrics
# bottomk = false

## The plugin assigns each metric a GroupBy tag generated from its name and
## tags. If this setting is anything other than "", the plugin will add a
## tag (whose name is the value of this setting) to each metric with
## the value of the calculated GroupBy tag. Useful for debugging.
# add_groupby_tag = ""

## These settings provide a way to know the position of each metric in
## the top k. The 'add_rank_fields' setting specifies for which
## fields the position is required. If the list is non-empty, then a field
## will be added to each and every metric for each string present in this
## setting. This field will contain the ranking of the group that
## the metric belonged to when aggregated over that field.
## The name of the field will be set to the name of the aggregation field,
## suffixed with the string '_topk_rank'
# add_rank_fields = []

## These settings provide a way to know what values the plugin is generating
## when aggregating metrics. The 'add_aggregate_fields' setting specifies
## for which fields the final aggregation value is required. If the
## list is non-empty, then a field will be added to each and every metric
## for each field present in this setting. This field will contain
## the computed aggregation for the group that the metric belonged to when
## aggregated over that field.
## The name of the field will be set to the name of the aggregation field,
## suffixed with the string '_topk_aggregate'
# add_aggregate_fields = []
```
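
To make the `group_by` globs more concrete, here is a rough Python sketch of how a bucket key could be derived from a metric. The `group_key` helper and the `&`-joined key format are assumptions made for illustration; the plugin's real GroupBy tag value may be formatted differently. Glob matching here uses Python's standard `fnmatch` module, which may not match the plugin's glob rules in every corner case.

```python
from fnmatch import fnmatchcase


def group_key(name, tags, group_by=("*",)):
    """Build a bucket key from the metric name and the tags matching group_by."""
    # The measurement name is always part of the key.
    parts = [name]
    for tag in sorted(tags):
        # Keep a tag only if at least one glob in group_by matches its name.
        if any(fnmatchcase(tag, pattern) for pattern in group_by):
            parts.append(f"{tag}={tags[tag]}")
    # Hypothetical key format: name and tag pairs joined by "&".
    return "&".join(parts)


tags = {"pid": "2088", "process_name": "Xorg"}
print(group_key("procstat", tags, ["pid"]))  # procstat&pid=2088
print(group_key("procstat", tags, []))       # procstat
```

With an empty `group_by` list, only the measurement name remains in the key, so all metrics with the same name fall into one bucket.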
### Tags

This processor does not add tags by default. Setting `add_groupby_tag` to
anything other than `""` adds a tag with that name to each metric.

### Fields

This processor does not add fields by default. Setting `add_rank_fields` or
`add_aggregate_fields` to a non-empty list adds one or more fields to each
metric.
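
As a small illustration of the field naming described in the configuration comments, the following Python sketch adds the optional fields to a metric's field set. The `annotate` helper is hypothetical, and the sketch assumes ranks start at 1, which the text above does not state.

```python
def annotate(fields, field, rank, aggregate,
             add_rank_fields=(), add_aggregate_fields=()):
    """Return a copy of a metric's fields with the optional TopK fields added."""
    out = dict(fields)
    if field in add_rank_fields:
        # Name of the aggregation field suffixed with '_topk_rank'.
        out[f"{field}_topk_rank"] = rank
    if field in add_aggregate_fields:
        # Name of the aggregation field suffixed with '_topk_aggregate'.
        out[f"{field}_topk_aggregate"] = aggregate
    return out


result = annotate({"cpu_usage": 7.3}, "cpu_usage", rank=1, aggregate=7.3,
                  add_rank_fields=["cpu_usage"])
print(result)  # {'cpu_usage': 7.3, 'cpu_usage_topk_rank': 1}
```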
### Example

Below is an example configuration:

```toml
[[processors.topk]]
period = 20
k = 3
group_by = ["pid"]
fields = ["cpu_usage"]
```

Output difference with topk:

```diff
< procstat,pid=2088,process_name=Xorg cpu_usage=7.296576662282613 1546473820000000000
< procstat,pid=2780,process_name=ibus-engine-simple cpu_usage=0 1546473820000000000
< procstat,pid=2554,process_name=gsd-sound cpu_usage=0 1546473820000000000
< procstat,pid=3484,process_name=chrome cpu_usage=4.274300361942799 1546473820000000000
< procstat,pid=2467,process_name=gnome-shell-calendar-server cpu_usage=0 1546473820000000000
< procstat,pid=2525,process_name=gvfs-goa-volume-monitor cpu_usage=0 1546473820000000000
< procstat,pid=2888,process_name=gnome-terminal-server cpu_usage=1.0224991500287577 1546473820000000000
< procstat,pid=2454,process_name=ibus-x11 cpu_usage=0 1546473820000000000
< procstat,pid=2564,process_name=gsd-xsettings cpu_usage=0 1546473820000000000
< procstat,pid=12184,process_name=docker cpu_usage=0 1546473820000000000
< procstat,pid=2432,process_name=pulseaudio cpu_usage=9.892858669796528 1546473820000000000
---
> procstat,pid=2432,process_name=pulseaudio cpu_usage=11.486933087507786 1546474120000000000
> procstat,pid=2432,process_name=pulseaudio cpu_usage=10.056503212060552 1546474130000000000
> procstat,pid=23620,process_name=chrome cpu_usage=2.098690278123081 1546474120000000000
> procstat,pid=23620,process_name=chrome cpu_usage=17.52514619948493 1546474130000000000
> procstat,pid=2088,process_name=Xorg cpu_usage=1.6016732172309973 1546474120000000000
> procstat,pid=2088,process_name=Xorg cpu_usage=8.481040931533833 1546474130000000000
```
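
As a quick sanity check on the output above, the following Python sketch recomputes the default `mean` aggregation per `pid` from the six emitted lines. It only shows how those three groups rank against each other; the buckets that were dropped in that period are not visible in the output, so they are not part of the calculation.

```python
from statistics import mean

# (pid, cpu_usage) pairs copied from the example output above.
samples = [
    ("2432", 11.486933087507786), ("2432", 10.056503212060552),
    ("23620", 2.098690278123081), ("23620", 17.52514619948493),
    ("2088", 1.6016732172309973), ("2088", 8.481040931533833),
]

# Group the samples by pid, mirroring group_by = ["pid"].
by_pid = {}
for pid, cpu_usage in samples:
    by_pid.setdefault(pid, []).append(cpu_usage)

# Aggregate each group with the mean and order from highest to lowest.
means = {pid: mean(values) for pid, values in by_pid.items()}
ranking = sorted(means, key=means.get, reverse=True)
print(ranking)  # ['2432', '23620', '2088']
```

Each `pid` contributes two lines because two samples fell inside the 20 second period, which is another case of `k = 3` buckets yielding more than three metrics.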