summaryrefslogtreecommitdiffstats
path: root/src/plugins/perfmon/intel
AgeCommit message (Collapse)AuthorFilesLines
2022-07-12perfmon: enable perfmon plugin for ArmZachary Leaf2-0/+2
This patch enables statistics from the Arm PMUv3 through the perfmon plugin. In comparison to using the Linux "perf" tool, it allows obtaining direct, per node level statistics (rather than per thread). By accessing the PMU counter registers directly from userspace, we can avoid the overhead of using a read() system call and get more accurate and fine grained statistics about the running of individual nodes. A demo of perfmon on Arm can be found at: https://asciinema.org/a/egVNN1OF7JEKHYmfl5bpDYxfF *Important Note* Perfmon on Arm is dependent on and works only on Linux kernel versions of v5.17+ as this is when userspace access to Arm perf counters was included. On most Arm systems, a maximum of 7 PMU events can be configured at once - (6x PMU events + 1x CPU_CYCLE counter). If some perf counters are in use elsewhere by other applications, and there are insufficient counters remaining to open the bundle, the perf_event_open call will fail (provided the events are grouped with the group_fd param, which perfmon currently utilises). See arm/events.h for a list of PMUv3 events available, although it is implementation defined whether most events are implemented or not. Only a small set of 7 events is required to be implemented in Armv8.0, with some additional events required in later versions. As such, depending on the implementation, some statistics may not be available. See Arm Architecture Reference Manual for Armv8-A, D7.10.2 "The PMU event number space and common events" for more information. arm/events.c:arm_init() gets information from the sysfs about what events are implemented on a particular CPU at runtime. Arm's implementation of the perfmon source callback .bundle_support uses this information to disable unsupported events in a bundle, or in the case no events are supported, disable the entire bundle. Where a particular event in a bundle is not implemented, the statistic for that event is shown as '-' in the 'show perfmon statistics' cli output, by disabling the column. There is additional code in perfmon.c to only open events which are marked as implemented. Since we're only opening and reading events that are implemented, some extra logic is required in cli.c to re-align either perfmon_node_stats_t or perfmon_reading_t with the column headings configured in each bundle, taking into account disabled columns. Userspace access to perf counters is disabled by default, and needs to be enabled with 'sudo sysctl kernel/perf_user_access=1'. There is a check built into the Arm event source init function (arm/events.c:arm_init) to check that userspace reading of perf counters is enabled in the /proc/sys/kernel/perf_user_access file. If the above file does not exist, it means the kernel version is unsupported. Users without a supported kernel will see a warning message, and no Arm bundles will be registered to use in perfmon. Enabling/using plugin: - include the following in startup.conf: - plugins { plugin perfmon_plugin.so { enable } - 'show perfmon bundle [verbose]' - show available statistics bundles - 'perfmon start bundle <bundle-name>' - enable and start logging - 'perfmon stop' - stop logging - 'show perfmon statistics' - show output For a general guide on using and understanding Arm PMUv3 events, see https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/arm-neoverse-n1-performance-analysis-methodology Type: feature Signed-off-by: Zachary Leaf <zachary.leaf@arm.com> Tested-by: Jieqiang Wang <jieqiang.wang@arm.com> Change-Id: I0620fe5b1bbe78842dfb1d0b6a060bb99e777651
2022-07-12perfmon: make less arch dependentZachary Leaf5-0/+231
In preparation for enabling perfmon on Arm platforms, move some Intel /arch specific logic into the /intel directory and update the CMake to split the common code from arch specific files. Since the dispatch_wrapper code is very different on Arm/Intel, each arch can provide their own implementation + conduct any additional arch specific config e.g. on Intel, all indexes from the mmap pages are cached. The new method intel_config_dispatch_wrapper conducts this config and returns a pointer to the dispatch wrapper to use. Similarly, is_bundle_supported() looks very different on Arm/Intel, so each implementation is to provide their own arch specific checks. Two new callbacks/function ptrs are added in PERFMON_REGISTER_SOURCE to support this - .bundle_support and .config_dispatch_wrapper. Type: refactor Signed-off-by: Zachary Leaf <zachary.leaf@arm.com> Change-Id: Idd121ddcfd1cc80a57c949cecd64eb2db0ac8be3
2022-03-29perfmon: fix non-NULL terminated C-stringBenoît Ganne1-1/+1
format() expects a NULL-terminated C-string as format string. Type: fix Change-Id: Ib428cf2debbf98850eed512907175f8ae8ba3c04 Signed-off-by: Benoît Ganne <bganne@cisco.com>
2022-03-23perfmon: null-terminate stringDamjan Marion1-1/+1
Type: fix Change-Id: I43ebb2c2922f3b8b8eddf26ccdf044f31d7b7a10 Signed-off-by: Damjan Marion <damarion@cisco.com>
2022-02-18perfmon: show distribution of uops delivered to frontendRay Kinsella3-7/+107
Breakdown the distribution of uops delivered to the frontend. Collerates directly with the source of the uops. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I93a57dbe56dfa0f378527844aa4e63f45a548e55
2022-01-30perfmon: topdown level 1 and 2 for icxRay Kinsella3-63/+182
Topdown level 1 and 2 for Intel Ice Lake (ICX). Limiting topdown support to THREAD for the moment on Ice Lake, as NODE support is still unreliable. Also removing Topdown Level 1 from Sapphire Rapids onwards, as Topdown LeveL 2 also shows Level 1 on Sapphire, and it reduces the overall number of bundles. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Iaa68b711dc8b6fb1090880b411debadb3c37f8bc
2022-01-30perfmon: fix init of bundles with pseudo eventsRay Kinsella1-5/+14
Previously Linux pseudo events were being counted as multiple fixed events, such that a bundle with pseudo events could exceed the number of available fixed counters. Reworked to ignore pseudo events in the accounting for the moment. Type: fix Fixes: 0024e53ad Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Ic938f8266fd04d7731afbd02e261c61ef22a8522
2022-01-30perfmon: topdown backend bound core bundleRay Kinsella2-0/+117
Add a bundle to measure topdown backend bound core cycles, will indicate if any given execution port has contention. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I37d1b38c101ac42d51c10fa4452b822d34b729c9
2022-01-27perfmon: frontend and backend boundness bundlesRay Kinsella5-67/+331
Renamed memory stalls to topdown backend-bound-mem, added topdown frontend-bound-latency and frontend-bound-bandwidth. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I70f42b6b63fe2502635cad4aed4271e2bbdda5f1
2022-01-27perfmon: prune bundles by available pmu countersRay Kinsella1-0/+10
Prune perfmon bundles that exceed the number of available pmu counters. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I70fec26bb8ca915f4b980963e06c2e43dfde5a23
2022-01-27perfmon: add cli to show perf configRay Kinsella1-0/+4
Added a cli to show Linux perf config for a give perfmon bundle. This makes it easier to format Linux perf commands for next level analysis. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I9adafa7d441b72120390d186e3c8f884b1bc9828
2021-12-02perfmon: refactor perf metric supportRay Kinsella1-8/+2
Refactoring perf metric support to remove branching on bundle type in the dispatch wrapper. This change includes caching the rdpmc index at perfmon_start(), so that the mmap_page.index doesn't need to be looked up each time. It also exclude the effects of mmap_page.index. This patch prepares the path for bundles that support general, fixed and metrics counters simulataneously. Type: refactor Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I9c5b4917bd02fea960e546e8558452c4362eabc4
2021-11-16perfmon: fix coverity warningKlement Sekera1-2/+9
Check for possible hash lookup failure to avoid NULL dereference. Type: fix Fixes: e15c999c30 Signed-off-by: Klement Sekera <ksekera@cisco.com> Change-Id: Ib806b4d124be26fbccf36fe9d19af1aec63f487b
2021-11-15perfmon: rename bundle to memory stallsRay Kinsella1-8/+8
Rename the memory bandwidth bundle to memory stalls, to differentiate it from the bundle that measures memory controller bandwidth boundedness. Type: refactor Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I828c73b6f769046e1ab592712bdf81ceefcd7911
2021-11-08perfmon: fix iio-bw coverity issuesRay Kinsella1-3/+1
Fixes an number of coverity issues associated with the iio-bw feature. Type: fix Fixes: e15c999c3 Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I9ad2b336694132545d90a3483200a510226e9198
2021-11-07perfmon: numa node list probing should use '/online' instead of '/has_cpu'Xiaoming Jiang1-1/+1
Type: fix Signed-off-by: Xiaoming Jiang <jiangxiaoming@outlook.com> Change-Id: I85e41d58884af71afba960d20604bb1b01876d26
2021-11-02perfmon: added bundle to measure pci bandwidthRay Kinsella1-0/+258
Added an Intel Ice Lake specific bundles to measure pci bandwidth through the Intel IO PMU. The "PCI" bundle measures read/writes from pci devices. The "CPU" bundle measure read/writes from cpus to pci devices. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Id48cef5988113e8dc4690b97d22243311bfa7961
2021-11-02perfmon: added intel internal io pmu supportRay Kinsella2-9/+82
Added support for the Intel Internal IO Uncore PMU, along with the ability to format PMU Unit specific names. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I2939f8ade5e5ed63ccf7f3ccd0279d7c72e95a6e
2021-10-28perfmon: fix coverity warningKlement Sekera1-0/+8
Check that cpumask is initialised properly to avoid possible NULL pointer dereference. Type: fix Signed-off-by: Klement Sekera <ksekera@cisco.com> Change-Id: I8df5a718104fe703d6baf3f1294b4a6d2ca01619
2021-10-16perfmon: topdown lvl 2 support on sapphire rapidsRay Kinsella2-11/+167
Added topdown level 2 support on sapphire rapids, including ability to indentify a sapphire rapids cpu. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I9f99a92fa0886b98bb5185cff32bebd5a094f329
2021-10-07perfmon: Topdown Level 1 support on SnowridgeRay Kinsella3-1/+101
Enable Topdown Level 1 support on Snowridge, enabled with standard CPU events on small core. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I58ad09383de7464265ac1b69e683f253591e3b5e
2021-10-07perfmon: fix peusdo eventsRay Kinsella1-1/+1
Fix peusdo events, missed populating "core" events with peusdo events. Type: fix Fixes: bf37bf6f7 Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I569fa876f1b58540adac0b095be0ff4ade664dec
2021-10-05perfmon: bundles with multiple typesRay Kinsella1-3/+1
Allow perfmon bundles to support more than one bundle type, either node or thread. Only used for topdown bundle for the moment. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Iba3653a4deb39b0a8ee8ad448a7e8f954283ccd8
2021-10-04perfmon: topdown events as peusdo eventsRay Kinsella1-9/+13
Topdown events are peusdo events exposed by linux, and are only present on Intel platforms. Change to clarifies this. Type: fix Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I6a3dcea5f43f53dbb96475329baf5e596a24d54f
2021-09-08perfmon: add membw-bound bundleRay Kinsella2-0/+78
Added memory bandwidth boundedness bundle, closely related to cache-hierarchy. This bundle works on ICX only, due to an ICX specific counter. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Id385bd5f4e645ac020774e311c623afb64b79b1e
2021-09-08perfmon: adding support for papi TMAMRay Kinsella2-47/+73
Adding support for Linux papi TMAM on Intel Snowridge. Adds the ability to indicate that a bundle should be thread or node bundle type based on available cpu features (rdpmc support). Type: feature Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Ib871b2644fdb2410fbb580e0d21c3a8e2be13aba
2021-05-26perfmon: revert raw column supportRay Kinsella1-10/+0
Revert raw column from the perfmon plugin. Type: refactor Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: If127f57ee2022cc1c0ea5177f1655a792f195f1d
2021-04-27perfmon: top down level 1 supportmdr783-6/+125
Adding perfmon node TMAM support on ICX. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I48a9a9ff6a72efc28eaf0cb11ef39fb62cebb126
2021-04-01perfmon: % power level per nodeRay Kinsella1-0/+57
Show % time spent per graph node in power level 0, 1 and 2. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I678ee812fa993af39568e9f9dfbf2396fc13ad42
2021-03-31perfmon: add branch mispredictionsRay Kinsella2-0/+75
Add branches, branches taken (a meteric for branchy code), and branch misses. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: If92d4aaf9d0a6e3b99b8c19e6311cc08ca470590
2021-03-16perfmon: fixes for cache hierarchyRay Kinsella1-8/+12
Account for occasional instances with the misses rates between caches are inconsistent. Type: fix Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Idfb8bb7543401405cfe04291ad201c28be030cc9
2021-01-21perfmon: added cache hits and missesRay Kinsella1-0/+69
Added basic support for counting cache hits and misses per node. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Ic566611fd3d4246ccaa2117d8f74a569a6862e80
2020-12-18perfmon: new perfmon pluginDamjan Marion7-0/+673
Type: feature Change-Id: I2c14f82393d11fc05c6d229f5c58603ab5c0f14d Signed-off-by: Damjan Marion <damarion@cisco.com>