# Intel RDT Input Plugin

The `intel_rdt` plugin collects information provided by the monitoring features of
Intel Resource Director Technology (Intel(R) RDT). Cache Monitoring Technology (CMT),
Memory Bandwidth Monitoring (MBM), Cache Allocation Technology (CAT), and Code
and Data Prioritization (CDP) provide the hardware framework to monitor
and control the utilization of shared resources such as last level cache and memory bandwidth.
Together, these technologies comprise Intel’s Resource Director Technology (RDT).
As multithreaded and multicore platform architectures emerge,
the last level cache and memory bandwidth become key resources to manage for workloads
running in single-threaded, multithreaded, or complex virtual machine environments. Intel introduced CMT,
MBM, CAT, and CDP to manage these workloads across shared resources.

To gather Intel RDT metrics, the plugin uses the _pqos_ CLI tool, which is part of the [Intel(R) RDT Software Package](https://github.com/intel/intel-cmt-cat).
Before using this plugin, make sure _pqos_ is properly installed and configured, because the plugin
runs _pqos_ in `OS Interface` mode. This plugin supports _pqos_ version 4.0.0 and above.
Be aware that the _pqos_ tool needs root privileges to work properly.

Metrics are reported continuously from the following `pqos` commands at the given interval:

#### In case of cores monitoring:
```
pqos -r --iface-os --mon-file-type=csv --mon-interval=INTERVAL --mon-core=all:[CORES]\;mbt:[CORES]
```
where `CORES` is a group of cores provided in the config. The user can provide many groups.

#### In case of process monitoring:
```
pqos -r --iface-os --mon-file-type=csv --mon-interval=INTERVAL --mon-pid=all:[PIDS]\;mbt:[PIDS]
```
where `PIDS` is the group of process IDs whose process names match a process name provided in the config.
The user can provide many process names, which results in many process groups.

In both cases, `INTERVAL` is equal to `sampling_interval` from the config.

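As an illustration, the two command lines above can be assembled programmatically. The Python sketch below is not the plugin's implementation: `build_pqos_args` is a hypothetical helper, and the way several groups are joined (one `all:[...]` and one `mbt:[...]` entry per group, separated by `;`) is an assumption extrapolated from the single-group commands shown above.

```python
def build_pqos_args(interval, core_groups=None, pid_groups=None):
    """Return a pqos argument list for either core or PID monitoring.

    Hypothetical helper. Each group is a string such as "0-3" or "4,5,6"
    (cores) or "1201,1202" (PIDs).
    """
    if bool(core_groups) == bool(pid_groups):
        raise ValueError("provide either core groups or PID groups, not both")

    if core_groups:
        flag, groups = "--mon-core", core_groups
    else:
        flag, groups = "--mon-pid", pid_groups

    # Assumption: every group contributes an "all:" and an "mbt:" entry,
    # and entries are separated by ";".
    spec = ";".join(f"all:[{group}];mbt:[{group}]" for group in groups)

    return [
        "-r",
        "--iface-os",
        "--mon-file-type=csv",
        f"--mon-interval={interval}",
        f"{flag}={spec}",
    ]


if __name__ == "__main__":
    print(" ".join(build_pqos_args(10, core_groups=["0-3"])))
```

When the list is passed directly to a process launcher instead of a shell, the `;` separator does not need the backslash escape used in the shell commands above.
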
Because the PIDs associated with a process can change at any moment, the Intel RDT plugin
checks on every interval whether the monitored processes have changed their PID association.
If a change is detected, the plugin restarts the _pqos_ tool with new arguments. If a process name
provided by the user does not match any running process, it is omitted and the plugin keeps
checking for the process to become available.

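The restart decision described above can be sketched as a comparison of two snapshots. This Python example is a simplified model, not the plugin's code: `active_groups` and `needs_restart` are hypothetical names, and each snapshot is assumed to be a mapping from process name to the set of PIDs found for it (for example, collected with `pidof`).

```python
def active_groups(snapshot):
    """Drop process names that currently match no running process."""
    return {name: frozenset(pids) for name, pids in snapshot.items() if pids}


def needs_restart(previous, current):
    """Return True when the PID association of any monitored process changed."""
    return active_groups(previous) != active_groups(current)


if __name__ == "__main__":
    before = {"qemu": {1201, 1202}, "pmd": set()}
    after = {"qemu": {1201, 1310}, "pmd": set()}
    # "pmd" has no PIDs in either snapshot, so it is omitted from both sides;
    # the change in the "qemu" PIDs is what triggers the restart.
    print(needs_restart(before, after))
```
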
### Useful links

- Pqos installation process: https://github.com/intel/intel-cmt-cat/blob/master/INSTALL
- Enabling OS interface: https://github.com/intel/intel-cmt-cat/wiki, https://github.com/intel/intel-cmt-cat/wiki/resctrl
- More about Intel RDT: https://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html

### Configuration
```toml
# Read Intel RDT metrics
[[inputs.intel_rdt]]
  ## Optionally set sampling interval to Nx100ms.
  ## This value is propagated to pqos tool. Interval format is defined by pqos itself.
  ## If not provided or provided 0, will be set to 10 = 10x100ms = 1s.
  # sampling_interval = "10"

  ## Optionally specify the path to pqos executable.
  ## If not provided, auto discovery will be performed.
  # pqos_path = "/usr/local/bin/pqos"

  ## Optionally specify if IPC and LLC_Misses metrics shouldn't be propagated.
  ## If not provided, default value is false.
  # shortened_metrics = false

  ## Specify the list of groups of CPU core(s) to be provided as pqos input.
  ## Mandatory if processes aren't set and forbidden if processes are specified.
  ## e.g. ["0-3", "4,5,6"] or ["1-3,4"]
  # cores = ["0-3"]

  ## Specify the list of processes for which Metrics will be collected.
  ## Mandatory if cores aren't set and forbidden if cores are specified.
  ## e.g. ["qemu", "pmd"]
  # processes = ["process"]
```

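The rules stated in the configuration comments (exactly one of `cores` or `processes` must be set, and a missing or zero `sampling_interval` falls back to 10, which is 10 x 100ms = 1s) can be expressed as a small check. The Python sketch below only illustrates those rules; `validate_config` is a hypothetical function, and rejecting negative intervals is an added assumption.

```python
def validate_config(cores=None, processes=None, sampling_interval=0):
    """Check the documented option rules.

    Returns the effective interval in pqos units (N x 100ms) and in seconds.
    """
    if cores and processes:
        raise ValueError("cores and processes cannot both be specified")
    if not cores and not processes:
        raise ValueError("either cores or processes must be specified")

    # A missing or zero interval falls back to the documented default of 10.
    interval = int(sampling_interval) or 10
    if interval < 0:
        # Assumption: negative values are treated as invalid.
        raise ValueError("sampling_interval must not be negative")

    return interval, interval / 10


if __name__ == "__main__":
    print(validate_config(cores=["0-3"], sampling_interval="10"))
```
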
### Exposed metrics
| Name          | Full name                                     | Description |
|---------------|-----------------------------------------------|-------------|
| MBL           | Memory Bandwidth on Local NUMA Node  |     Memory bandwidth utilization by the relevant CPU core/process on the local NUMA memory channel    |
| MBR           | Memory Bandwidth on Remote NUMA Node |     Memory bandwidth utilization by the relevant CPU core/process on the remote NUMA memory channel    |
| MBT           | Total Memory Bandwidth               |     Total memory bandwidth utilized by a CPU core/process on local and remote NUMA memory channels    |
| LLC           | L3 Cache Occupancy                   |     Total Last Level Cache occupancy by a CPU core/process     |
| *LLC_Misses    | L3 Cache Misses                      |    Total Last Level Cache misses by a CPU core/process       |
| *IPC           | Instructions Per Cycle               |     Total instructions per cycle executed by a CPU core/process    |

*optional: reported unless `shortened_metrics` is set to `true`

### Troubleshooting
Pointing to a non-existent core causes _pqos_ to throw an error, and the plugin will not work properly.
Be sure to check that the provided core numbers exist on the target system.

Be aware that Intel RDT metrics cannot be read by _pqos_ simultaneously on the same resource.
Do not run any other _pqos_ instance that monitors the same cores or PIDs on the same system.
It is also not possible to monitor the same cores or PIDs in different groups.

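Both core-related pitfalls above (non-existent cores and the same core appearing in more than one group) can be caught before starting the plugin. The Python sketch below is a hypothetical pre-flight check, not part of the plugin. It assumes group strings in the format shown in the configuration (`"0-3"`, `"4,5,6"`) and that cores are numbered from 0 to `cpu_count - 1`, which may not hold on systems with offline cores.

```python
import os


def parse_core_group(group):
    """Expand a group string such as "0-3" or "1-3,4" into a set of core IDs."""
    cores = set()
    for part in group.split(","):
        if "-" in part:
            first, last = part.split("-")
            cores.update(range(int(first), int(last) + 1))
        else:
            cores.add(int(part))
    return cores


def check_core_groups(groups, cpu_count=None):
    """Return a list of human-readable problems found in the core groups."""
    # Assumption: valid core IDs are 0 .. cpu_count - 1.
    cpu_count = cpu_count or os.cpu_count() or 1
    seen = set()
    problems = []
    for group in groups:
        cores = parse_core_group(group)
        missing = sorted(core for core in cores if core >= cpu_count)
        if missing:
            problems.append(f"group {group!r}: cores {missing} do not exist")
        overlap = sorted(cores & seen)
        if overlap:
            problems.append(f"group {group!r}: cores {overlap} are already in another group")
        seen |= cores
    return problems


if __name__ == "__main__":
    for problem in check_core_groups(["0-3", "3,4"]):
        print(problem)
```
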
The PID association for a given process can be checked manually with the `pidof` command, e.g.:
```
pidof PROCESS
```
where `PROCESS` is the process name.

### Example Output
```
> rdt_metric,cores=12\,19,host=r2-compute-20,name=IPC,process=top value=0 1598962030000000000
> rdt_metric,cores=12\,19,host=r2-compute-20,name=LLC_Misses,process=top value=0 1598962030000000000
> rdt_metric,cores=12\,19,host=r2-compute-20,name=LLC,process=top value=0 1598962030000000000
> rdt_metric,cores=12\,19,host=r2-compute-20,name=MBL,process=top value=0 1598962030000000000
> rdt_metric,cores=12\,19,host=r2-compute-20,name=MBR,process=top value=0 1598962030000000000
> rdt_metric,cores=12\,19,host=r2-compute-20,name=MBT,process=top value=0 1598962030000000000
```