PQOS(8) | System Manager's Manual | PQOS(8) |
NAME
pqos, pqos-msr, pqos-os - Intel(R) Resource Director Technology/AMD PQoS monitoring and control tool
SYNOPSIS
pqos [OPTIONS]...
DESCRIPTION
Intel(R) Resource Director Technology/AMD PQoS is designed to monitor and manage CPU resources and improve the performance of applications and virtual machines.
Intel(R) Resource Director Technology/AMD PQoS includes monitoring and control technologies. Monitoring technologies include CMT (Cache Monitoring Technology), which monitors occupancy of last level cache, and MBM (Memory Bandwidth Monitoring). Control technologies include CAT (Cache Allocation Technology), CDP (Code Data Prioritization), MBA (Memory Bandwidth Allocation) and SMBA (Slow Memory Bandwidth Allocation).
pqos supports CMT and MBM on a per-core or per-hardware-thread basis.
MBM supports two types of events, reporting local and remote memory
bandwidth.
pqos-msr and pqos-os are simple pqos wrapper scripts that automatically select
the MSR or OS/Kernel library interface to program the technologies.
Please see the -I option below for more information.
For hardware information please refer to the README located on: https://github.com/intel/intel-cmt-cat/blob/master/README
OPTIONS
pqos options are as follows:
- -h, --help
- show help
- -v, --verbose
- verbose mode
- -V, --super-verbose
- super-verbose mode
- -l FILE, --log-file=FILE
- log messages into selected log FILE
- -s, --show
- show the current allocation and monitoring configuration
- -d, --display
- display supported Intel(R) Resource Director Technology/AMD PQoS capabilities
- -D, --display-verbose
- display supported Intel(R) Resource Director Technology/AMD PQoS capabilities in verbose mode
- -f FILE, --config-file=FILE
- load commands from selected configuration FILE
- -e CLASSDEF, --alloc-class=CLASSDEF
- define the allocation classes. To apply a class on all CPU sockets, CLASSDEF format is
"TYPE:ID=DEFINITION;...".
To define classes for selected CPU resources, CLASSDEF format is "TYPE[@RESOURCE_ID]:ID=DEFINITION;...".
For CAT, TYPE is "llc" for the last level cache (aka l3) and "l2" for level 2 cache, ID is a CLOS number and DEFINITION is a bitmask.
For MBA, TYPE is "mba", ID is a CLOS number and DEFINITION is a value between 1 and 100 representing the percentage of available memory bandwidth.
For MBA CTRL, TYPE is "mba_max", ID is a CLOS number and DEFINITION is a value representing the requested memory bandwidth specified in MBps.
For SMBA, TYPE is "smba", ID is a CLOS number and DEFINITION is a value between 1 and 100 representing the percentage of available slow memory bandwidth.
RESOURCE_ID is a unique number that can represent a socket or an L2/L3 cache identifier. The RESOURCE_ID for each logical CPU can be found using "pqos -s".
Note: When L2/L3 CDP is on, ID can be postfixed with 'D' for data or 'C' for code.
Note: L2/L3 CDP is available on selected CPUs only.
Note: MBA CTRL is supported only by the OS interface and requires Linux and kernel version 4.18 or newer.
Some examples:
"-e llc:0=0xffff;llc:1=0x00ff"
"-e llc@0-1:2=0xff00;l2:2=0x3f;l2@2:1=0xf"
"-e llc:0d=0xfff;llc:0c=0xfff00"
"-e l2:0d=0xf;l2:0c=0xc"
"-e mba:1=30;mba@1:3=80"
"-e smba:1=64;mba@1:3=128"
"-e mba_max:1=6000;mba_max@1:3=10000"
"-e l2:2=0x3f" means that COS2 for all L2 cache clusters is changed to 0x3f.
"-e l2@2:1=0xf" means that COS1 for L2 cache cluster 2 is changed to 0xf.
"-e mba:1=30" means that COS1, on all sockets, can utilize up to 30% of available memory bandwidth.
"-e smba:1=64" means that COS1, on all sockets, can utilize up to 64 units of available slow memory bandwidth.
"-e mba_max:1=6000" means that COS1, on all sockets, can utilize up to 6000 MBps of memory bandwidth.
- -a CLASS2ID, --alloc-assoc=CLASS2ID
- associate allocation classes with cores or processes. CLASS2ID format is
"TYPE:ID=CORE_LIST;..." or "TYPE:ID=TASK_LIST;...".
For COS association required for CAT or MBA, TYPE is "cos", "llc", "core" (for COS-core association) or "pid" (for COS-process association) and ID is a class number. CORE_LIST is a comma- or dash-separated list of cores. TASK_LIST is a comma- or dash-separated list of process/task IDs.
For example:
"-a cos:0=0,2,4,6-10;cos:1=1;" associates cores 0, 2, 4, 6, 7, 8, 9, 10 with CAT class 0 and core 1 with class 1.
"-a llc:0=0,2,4,6-10;llc:1=1;" associates cores 0, 2, 4, 6, 7, 8, 9, 10 with CAT class 0 and core 1 with class 1.
"-a core:0=0,2,4,6-10;core:1=1;" associates cores 0, 2, 4, 6, 7, 8, 9, 10 with CAT class 0 and core 1 with class 1.
"-I -a pid:0=3543,7643,4556;pid:1=7644;" associates process ID 3543, 7643, 4556 with CAT class 0 and process ID 7644 with class 1.
Notes:
"llc" TYPE label is considered deprecated, please use "cos" or "core" instead.
The -I option must be used for PID association.
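The way a CORE_LIST or TASK_LIST such as "0,2,4,6-10" expands into individual IDs can be sketched as follows. The helper names expand_id_list and parse_assoc are hypothetical and exist only for this illustration; rejecting a reversed range is a choice made for the sketch, not documented pqos behavior.

```python
from typing import Dict, List, Tuple


def expand_id_list(text: str) -> List[int]:
    """Expand a comma/dash separated list, e.g. "0,2,4,6-10"."""
    ids = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            first, last = (int(x) for x in item.split("-", 1))
            if first > last:
                # assumption of this sketch: treat reversed ranges as errors
                raise ValueError(f"reversed range: {item!r}")
            ids.extend(range(first, last + 1))
        else:
            ids.append(int(item))
    return ids


def parse_assoc(class2id: str) -> Dict[Tuple[str, int], List[int]]:
    """Map (TYPE, class ID) to the expanded core or task IDs."""
    result: Dict[Tuple[str, int], List[int]] = {}
    for part in class2id.split(";"):
        if not part.strip():
            continue  # tolerate the trailing ';' used in the examples
        label, id_list = part.split("=", 1)
        kind, cos = label.split(":", 1)
        result.setdefault((kind.strip(), int(cos)), []).extend(
            expand_id_list(id_list)
        )
    return result
```

With the first example above, parse_assoc("cos:0=0,2,4,6-10;cos:1=1;") maps class 0 to cores 0, 2, 4, 6, 7, 8, 9, 10 and class 1 to core 1.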
- -R [CONFIG[,CONFIG]], --alloc-reset[=CONFIG[,CONFIG]]
- reset allocation settings (L3 CAT, L2 CAT, MBA) and reconfigure allocation. CONFIG is one of the following options:
l3cdp-on sets L3 CDP on
l3cdp-off sets L3 CDP off
l3cdp-any keeps current L3 CDP setting (default)
l3iordt-on sets L3 I/O RDT on
l3iordt-off sets L3 I/O RDT off
l3iordt-any keeps current L3 I/O RDT setting (default)
l2cdp-on sets L2 CDP on
l2cdp-off sets L2 CDP off
l2cdp-any keeps current L2 CDP setting (default)
mbaCtrl-on sets MBA CTRL on
mbaCtrl-off sets MBA CTRL off
mbaCtrl-any keeps current MBA CTRL setting (default)
mba40-on enables MBA 4.0 extensions for all sockets
mba40-off disables MBA 4.0 extensions for all sockets
mba40-any keeps current MBA 4.0 setting (default)
- -m EVTCORES, --mon-core=EVTCORES
- select the cores and events for monitoring, EVTCORES format is
"EVENT:CORE_LIST". Valid EVENT settings are:
- "llc" for CMT (LLC occupancy)
- "mbr" for MBR (remote memory bandwidth)
- "mbl" for MBL (local memory bandwidth)
- "mbt" for MBT (total memory bandwidth)
- "all" or "" for all detected event types (except MBT)
CORE_LIST is a comma- or dash-separated list of cores.
Example "-m all:0,2,4-10;llc:1,3;mbr:11-12".
Core statistics can be grouped by enclosing the core list in square brackets.
Example "-m llc:[0-3];all:[4,5,6];mbr:[0-3],7,8".
- -p [EVTPIDS], --mon-pid[=EVTPIDS]
- select the top 10 most active (CPU utilizing) process IDs to monitor, or select
the process IDs and events to monitor. EVTPIDS format is
"EVENT:PID_LIST".
See -m option for valid EVENT settings. PID_LIST is a comma-separated list of process IDs.
Examples:
"-p all:892,4588-4592"
Process IDs can be grouped by enclosing them in square brackets.
Examples:
"-p all:892,[4588-4592]"
Note:
The -I option must be used for PID monitoring.
It is not possible to track both processes and cores at the same time.
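The "EVENT:LIST" syntax shared by -m and -p, including square-bracket grouping, can be illustrated with the sketch below. The function parse_mon_spec is a hypothetical helper written for this page; it models each bracketed group as one list of IDs and each ungrouped ID as a group of its own, which is an interpretation of the text above rather than pqos source code.

```python
import re
from typing import Dict, List

# Either a bracketed group "[...]" or a bare item such as "7" or "4-10".
_TOKEN = re.compile(r"\[([^\[\]]*)\]|([^,\[\]]+)")


def _expand(text: str) -> List[int]:
    """Expand "0-3" or "4,5,6" into a flat list of integers."""
    ids = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        first, sep, last = item.partition("-")
        if sep:
            ids.extend(range(int(first), int(last) + 1))
        else:
            ids.append(int(first))
    return ids


def parse_mon_spec(spec: str) -> Dict[str, List[List[int]]]:
    """Return {event: [group, ...]} where each group is a list of IDs."""
    groups: Dict[str, List[List[int]]] = {}
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        event, _, id_list = part.partition(":")
        event = event or "all"  # "" selects all detected event types
        out = groups.setdefault(event, [])
        for match in _TOKEN.finditer(id_list):
            if match.group(1) is not None:
                out.append(_expand(match.group(1)))   # one combined group
            else:
                out.extend([i] for i in _expand(match.group(2)))
    return groups
```

Applied to "mbr:[0-3],7,8", the sketch returns one group covering cores 0 to 3 followed by two single-core groups for cores 7 and 8.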
- --mon-uncore[=EVTUNCORE]
- select sockets and uncore events for monitoring, EVTUNCORE format is
"EVENT:SOCKET_LIST". Socket IDs can be grouped by enclosing them in
square brackets.
Examples:
"--mon-uncore=all:0"
Note: It is not possible to track both sockets and cores at the same time.
- -T, --mon-top
- enable top like monitoring output sorted by highest LLC occupancy
- --mon-dev=EVTDEVICES
- select I/O RDT devices and events to monitor, EVTDEVICES format is
"EVENT:DEVICE_LIST".
See -m option for valid EVENT settings. DEVICE_LIST is a comma-separated list of I/O RDT devices.
Examples:
"--mon-dev llc:0000:0010:05.0"
- --mon-channel=EVTCHANNELS
- select I/O RDT channels and events to monitor, EVTCHANNELS format is
"EVENT:CHANNEL_LIST".
See -m option for valid EVENT settings. CHANNEL_LIST is a comma-separated list of I/O RDT channels.
Channels can be grouped by enclosing them in square brackets.
- -o FILE, --mon-file FILE
- select output FILE to store monitored data in, the default is 'stdout'
- -u TYPE, --mon-file-type=TYPE
- select the output format TYPE for monitored data. Supported TYPE settings are: "text" (default), "xml" and "csv".
- -i INTERVAL, --mon-interval=INTERVAL
- define monitoring sampling INTERVAL in 100ms units, 1=100ms, default 10=10x100ms=1s
- -t SECONDS, --mon-time=SECONDS
- define monitoring time in seconds, use 'inf' or 'infinite' for infinite monitoring. Use CTRL+C to stop monitoring at any time.
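The arithmetic behind -i and -t can be sketched as follows. The function names are hypothetical; the sample-count estimate assumes one sample is reported per interval, which is an assumption of this sketch and not something this page guarantees.

```python
def interval_to_seconds(interval: int = 10) -> float:
    """Convert an -i INTERVAL value (100 ms units) to seconds."""
    if interval < 1:
        raise ValueError("INTERVAL must be at least 1 (100 ms)")
    return interval * 100 / 1000


def seconds_to_interval(seconds: float) -> int:
    """Convert a sampling period in seconds to the nearest INTERVAL value."""
    return max(1, round(seconds * 10))


def estimated_samples(mon_time_s: int, interval: int = 10) -> int:
    """Rough number of samples for -t SECONDS at a given -i INTERVAL.

    Assumes one sample per interval (an assumption of this sketch).
    """
    if interval < 1:
        raise ValueError("INTERVAL must be at least 1 (100 ms)")
    return (mon_time_s * 10) // interval
```

Under that assumption, "-i 5 -t 60" would sample every 0.5 s and produce about 120 samples.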
- -r, --mon-reset[=CONFIG[,CONFIG]]
- reset monitoring and use all RMIDs in the system and reconfigure allocation. CONFIG is one of the following options:
l3iordt-on sets L3 I/O RDT on
l3iordt-off sets L3 I/O RDT off
l3iordt-any keeps current L3 I/O RDT setting (default)
- --disable-mon-ipc
- Disable IPC monitoring
- --disable-mon-llc_miss
- Disable LLC misses monitoring
- -H, --profile-list
- list supported allocation profiles
- -c PROFILE, --profile-set=PROFILE
- select a PROFILE from predefined allocation classes, use -H to list available profiles
- -I, --iface-os
- set the library interface to use the kernel implementation. If not set, the default implementation is to program the MSRs directly.
- --iface=INTERFACE
- set the library interface. INTERFACE can be set to 'auto' (default, automatic
detection), 'msr' (MSR interface) or 'os' (kernel interface).
If automatic detection is selected ('auto'), it:
1) Takes RDT_IFACE environment variable into account if this variable is set
2) Selects OS interface if the kernel interface is supported
3) Selects MSR interface otherwise
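The three-step automatic detection order above can be sketched as a small function. This is an illustration only: select_interface is a hypothetical name, the kernel_supported flag stands in for whatever check the library performs, and the accepted RDT_IFACE values ("msr" or "os", case-insensitive) are an assumption of this sketch.

```python
import os
from typing import Mapping, Optional


def select_interface(
    requested: str = "auto",
    kernel_supported: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return "msr" or "os" following the documented 'auto' order."""
    if env is None:
        env = os.environ
    if requested in ("msr", "os"):
        return requested  # explicit choice, no detection needed
    if requested != "auto":
        raise ValueError(f"unknown INTERFACE: {requested!r}")
    # 1) honour RDT_IFACE when it is set (value format is assumed here)
    forced = env.get("RDT_IFACE", "").strip().lower()
    if forced in ("msr", "os"):
        return forced
    # 2) prefer the OS interface when the kernel supports it
    # 3) otherwise fall back to the MSR interface
    return "os" if kernel_supported else "msr"
```

Passing the environment as a parameter keeps the sketch easy to exercise without changing the real process environment.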
NOTES
CMT, MBM and CAT are configured using Model Specific Registers (MSRs). The pqos software executes in user space, and access to the MSRs is obtained through a standard Linux* interface. The msr file interface is protected and requires root privileges. The msr driver might not be auto-loaded and on some modular kernels the driver may need to be loaded manually:
For Linux:
sudo modprobe msr
For FreeBSD:
sudo kldload cpuctl
Interface enforcement:
If you require system-wide interface enforcement, you can do so by setting the
"RDT_IFACE" environment variable.
SEE ALSO
AUTHOR
pqos was written by Tomasz Kantecki <tomasz.kantecki@intel.com>, Marcel Cornu <marcel.d.cornu@intel.com>, Aaron Hetherington <aaron.hetherington@intel.com>, Michal Aleksinski <michalx.aleksinski@intel.com>, Wojciech Andralojc <wojciechx.andralojc@intel.com>, Adrian Boczkowski <adrianx.boczkowski@intel.com>
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
April 19, 2022 |