Age | Commit message | Author | Files | Lines |
|
Added topdown level 2 support on Sapphire Rapids,
including the ability to identify a Sapphire Rapids CPU.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I9f99a92fa0886b98bb5185cff32bebd5a094f329
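A minimal sketch of how a CPU model can be identified from the CPUID leaf 1 EAX value. The helper names are hypothetical and this is not the plugin's code; it assumes Sapphire Rapids server parts report family 6, model 0x8F.

```c
#include <stdint.h>

/* Display family from a CPUID leaf 1 EAX value. The extended family
   field (bits 27:20) is only added when the base family is 0xF. */
static inline uint32_t
cpuid_family (uint32_t eax)
{
  uint32_t family = (eax >> 8) & 0xf;
  if (family == 0xf)
    family += (eax >> 20) & 0xff;
  return family;
}

/* Display model: for base family 6 or 15 the extended model field
   (bits 19:16) supplies the upper four bits. */
static inline uint32_t
cpuid_model (uint32_t eax)
{
  uint32_t family = (eax >> 8) & 0xf;
  uint32_t model = (eax >> 4) & 0xf;
  if (family == 0x6 || family == 0xf)
    model |= ((eax >> 16) & 0xf) << 4;
  return model;
}

/* Assumption: Sapphire Rapids is family 6, model 0x8F. */
static inline int
is_sapphire_rapids (uint32_t eax)
{
  return cpuid_family (eax) == 6 && cpuid_model (eax) == 0x8f;
}
```

On x86 the EAX value would come from executing CPUID with leaf 1; the decoding above is kept separate so it can be checked against known signatures.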
|
|
The Intel Icelake uArch supports measuring up to 12 counters,
comprising 4 fixed and 8 general counters.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I68369ea55a0c95d6a4a280a464e69502bbf5474f
|
|
Enable Topdown Level 1 support on Snowridge,
implemented with standard CPU events on the small core.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I58ad09383de7464265ac1b69e683f253591e3b5e
|
|
Add a check that a bundle is supported before further activation.
Allow different bundles with the same name, each supported on a different platform.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I73e8bbd1e07c05ebccd9146d48a234eb598a2388
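The lookup described above might be sketched as follows. The `bundle_t` layout, the `is_supported` callback, and the sample entries are all hypothetical, chosen only to illustrate picking the first same-named bundle that the current platform supports.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bundle descriptor; field names are illustrative only. */
typedef struct
{
  const char *name;
  const char *platform;
  int (*is_supported) (void); /* NULL is treated as "always supported" */
} bundle_t;

/* Return the first bundle called `name` whose support check passes,
   or NULL when no such bundle can be activated on this platform. */
static const bundle_t *
find_supported_bundle (const bundle_t *bundles, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    {
      const bundle_t *b = &bundles[i];
      if (strcmp (b->name, name) != 0)
        continue;
      if (b->is_supported == NULL || b->is_supported ())
        return b;
    }
  return NULL;
}

/* Stand-in support checks and a sample table with two same-named bundles. */
static int never_supported (void) { return 0; }
static int always_supported (void) { return 1; }

static const bundle_t sample_bundles[] = {
  { "topdown", "platform-a", never_supported },
  { "topdown", "platform-b", always_supported },
};
```

Because the name alone no longer identifies a bundle, the support check has to run as part of the lookup rather than after it.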
|
|
Fix pseudo events: the "core" events were not being populated with pseudo events.
Type: fix
Fixes: bf37bf6f7
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I569fa876f1b58540adac0b095be0ff4ade664dec
|
|
Allow perfmon bundles to support more than one bundle type, either node
or thread. Only used for topdown bundle for the moment.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Iba3653a4deb39b0a8ee8ad448a7e8f954283ccd8
|
|
Topdown events are pseudo events exposed by Linux,
and are only present on Intel platforms.
This change clarifies that.
Type: fix
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I6a3dcea5f43f53dbb96475329baf5e596a24d54f
|
|
This code seems really useful for reuse in
other plugins, for pretty table formatting.
Type: feature
Change-Id: Ib5784a0dfc81b7d5a5d1f5ccdd02072e460a50fb
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
|
|
Type: make
Change-Id: I2958e9eddadee6434766ecd3cdb3b9cea742ed64
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This patch sorts 'show perfmon bundle' output in alphabetical order.
Type: improvement
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Change-Id: I26b379b5d6766b9f87f9a3a5013ea92b207fb5d4
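Alphabetical ordering of this kind can be sketched with the standard `qsort` and `strcmp`. The function names below are hypothetical; the actual patch may sort a different structure.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: each element is a pointer to a C string. */
static int
cmp_names (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcmp (*sa, *sb);
}

/* Sort an array of bundle names in place, in ascending byte order. */
static void
sort_bundle_names (const char **names, size_t n)
{
  qsort (names, n, sizeof (names[0]), cmp_names);
}
```

Note that `strcmp` orders by byte value, so uppercase names would sort before lowercase ones.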
|
|
Added memory bandwidth boundedness bundle, closely related to cache-hierarchy.
This bundle works on ICX only, due to an ICX-specific counter.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Id385bd5f4e645ac020774e311c623afb64b79b1e
|
|
Adding support for Linux papi TMAM on Intel Snowridge. Adds the ability to
indicate that a bundle should be thread or node bundle type based on available
cpu features (rdpmc support).
Type: feature
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ib871b2644fdb2410fbb580e0d21c3a8e2be13aba
|
|
When mmap()-ing perf event in userspace, we must adhere to the kernel
update protocol to read consistent values.
Also, 'offset' is an offset to add to the counter value, not to apply
to the PMC index.
Type: fix
Change-Id: I59106bb3a48185ff3fcb0d2f09097269a67bb6d6
Signed-off-by: Benoît Ganne <bganne@cisco.com>
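A minimal sketch of the read loop this fix describes, following the sequence-lock protocol documented for the perf mmap page. To keep it self-contained, `mmap_page_view_t` is a hypothetical stand-in for the `lock`, `index` and `offset` fields of `struct perf_event_mmap_page`, and the counter read is passed in as a function pointer instead of executing `rdpmc`. Sign extension to the counter width is omitted.

```c
#include <stdint.h>

/* Compiler barrier (GCC/Clang syntax), as in the kernel's documented loop. */
#define compiler_barrier() __asm__ volatile ("" ::: "memory")

/* Stand-in for the relevant fields of struct perf_event_mmap_page. */
typedef struct
{
  volatile uint32_t lock;  /* sequence counter, changed on every update */
  volatile uint32_t index; /* hardware counter index + 1, 0 if unavailable */
  volatile int64_t offset; /* value to add to the raw counter reading */
} mmap_page_view_t;

typedef uint64_t (*read_pmc_fn) (uint32_t pmc_index);

static uint64_t
read_counter (const mmap_page_view_t *pc, read_pmc_fn read_pmc)
{
  uint32_t seq, index;
  uint64_t count;

  do
    {
      seq = pc->lock;
      compiler_barrier ();
      index = pc->index;
      /* offset is added to the counter value, not to the PMC index */
      count = (uint64_t) pc->offset;
      if (index)
	count += read_pmc (index - 1);
      compiler_barrier ();
    }
  while (pc->lock != seq); /* retry if the page changed mid-read */

  return count;
}

/* Fake counter source for demonstration only. */
static uint64_t
fake_pmc (uint32_t pmc_index)
{
  return 1000 + pmc_index;
}
```

In real use `read_pmc` would wrap the `rdpmc` instruction and `pc` would point at the mmap()-ed page of the event file descriptor.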
|
|
Revert raw column from the perfmon plugin.
Type: refactor
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: If127f57ee2022cc1c0ea5177f1655a792f195f1d
|
|
Adding perfmon node TMAM support on ICX.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I48a9a9ff6a72efc28eaf0cb11ef39fb62cebb126
|
|
The original set, start, stop, reset, show, etc. interface was somewhat cumbersome;
we can improve it slightly by combining set and start.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I7b865b2c29d2ab32adbd24d7f8a580da6990bb76
|
|
Show % time spent per graph node in power level 0, 1 and 2.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I678ee812fa993af39568e9f9dfbf2396fc13ad42
|
|
Add branches, branches taken (a metric for branchy code), and branch
misses.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: If92d4aaf9d0a6e3b99b8c19e6311cc08ca470590
|
|
Type: improvement
Change-Id: If3da7d4338470912f37ff1794620418d928fb77f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Account for occasional instances where the miss rates between caches
are inconsistent.
Type: fix
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Idfb8bb7543401405cfe04291ad201c28be030cc9
|
|
Add perfmon plugin support for outputting raw counters and timestamps; both
are useful for debugging.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ia5a73d1f05e3464c18991c2346f0ed8b7ef63099
|
|
Added basic support for counting cache hits and misses per node.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ic566611fd3d4246ccaa2117d8f74a569a6862e80
|
|
Type: feature
Change-Id: I2c14f82393d11fc05c6d229f5c58603ab5c0f14d
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: refactor
Change-Id: I1303219f9f2a25d821737665903b0264edd3de32
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: refactor
Change-Id: Ie67dc579e88132ddb1ee4a34cb69f96920101772
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Callbacks for monitoring and performance measurement:
- Add new callback list type, with context
- Add callbacks for API, CLI, and barrier sync
- Modify node dispatch callback to pass plugin-specific context
- Modify perfmon plugin to keep PMC samples local to the plugin
- Include process nodes in dispatch callback
- Pass dispatch function return value to callback
Type: refactor
Signed-off-by: Tom Seidenberg <tseidenb@cisco.com>
Change-Id: I28b06c58490611e08d76ff5b01b2347ba2109b22
|
|
It is a bad idea to poison memory after munmap because the address space
can be reused (e.g. for global data of a dlopen()ed object) and the ASan model
allows access by default.
Moreover, access to a stale address space will fault.
Type: fix
Change-Id: I356de422f255447d9d50a3a71fb0c2eaa790d731
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
When perfmon_init is called at initialization time, worker threads have
not been created yet and vec_len(vlib_mains) returns 1.
Instead, initialize per-worker data when data collection is enabled, by
which point the number of workers is known.
Type: fix
Change-Id: I36887cc7b2a3e88d9728d3cd7262d9b1c968dd3c
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Change-Id: Iddeb3a1b0e20706e72ec8f74dabc60b342f003ba
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
- Make plugin descriptions more consistent
so the output of "show plugin" can be
used in the wiki.
Change-Id: I4c6feb11e7dcc5a4cf0848eed37f1d3b035c7dda
Signed-off-by: Dave Wallace <dwallacelf@gmail.com>
|
|
Change-Id: Iaa5cd89791b0dfdb56a75009c564581d10696d83
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
When adding two or more events using a single "set pmc",
the PMC hardware indices might be outdated because the kernel
reschedules the perf_event hardware counters.
E.g. set pmc cpu-cycles cache-misses
Solution:
Open and enable all the events first, then acquire the
indices from the kernel.
Change-Id: I6913a871ab169e3b2855ac6159f527a1fca343e9
Signed-off-by: Su Wang <su.z.wang@ericsson.com>
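The ordering in the solution above can be illustrated with a toy model. Everything here is hypothetical: `fake_kernel_t` merely imitates a kernel that reshuffles hardware slots whenever another event is opened, and no real perf_event calls are made.

```c
#include <stddef.h>

#define MAX_EVENTS 4

/* Toy stand-in for the kernel side: opening an event may move the
   hardware slots of events opened earlier. Purely illustrative. */
typedef struct
{
  int n_open;
  int hw_index[MAX_EVENTS];
} fake_kernel_t;

static int
fake_open_event (fake_kernel_t *k)
{
  int fd = k->n_open++;
  /* the newest event takes slot 0, older events shift up by one */
  for (int i = 0; i < k->n_open; i++)
    k->hw_index[i] = k->n_open - 1 - i;
  return fd;
}

static int
fake_query_index (const fake_kernel_t *k, int fd)
{
  return k->hw_index[fd];
}

/* Phase 1: open every event. Phase 2: only then read the indices,
   so none of them can be invalidated by a later open. */
static void
open_then_query (fake_kernel_t *k, size_t n, int *fds, int *indices)
{
  for (size_t i = 0; i < n; i++)
    fds[i] = fake_open_event (k);
  for (size_t i = 0; i < n; i++)
    indices[i] = fake_query_index (k, fds[i]);
}
```

In this model, querying the first event immediately after opening it would have returned slot 0, which is stale once the second event is opened.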
|
|
EXAMPLE:
src/plugins/perfmon/intel_json_to_c.py \
-i skylakex_core_v1.12.json \
-o src/plugins/perfmon/perfmon_intel_skx.c \
-m 0x55,0 \
-m 0x55,1 \
-m 0x55,2 \
-m 0x55,3
Change-Id: I16ce059e231d340ecfcb6f6638e29c5b46304683
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I79b213b34c6071d14acf1922f89037a4a5a36c45
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Add missing pre-input node runtime fork and refork code.
unix-epoll-input runs on all threads; each instance needs its own
runtime stats.
Change-Id: I16b02e42d0c95f863161176c4bb9f9917bef809d
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I9b0a101e5d78c10257e3c5d8f5573c3eb29bfdef
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
As a FUD reduction measure, this patch implements 2-way parallel
counter collection. Synthetic stat component counter pairs run at the
same time. Running two counters (of any kind) at the same time
naturally reduces the aggregate time required by an approximate
factor-of-2, depending on whether an even or odd number of stats have
been requested.
I don't completely buy the argument that computing synthetic stats
such as instructions-per-clock will be inaccurate if component counter
values are collected sequentially. Given a uniform traffic pattern, it
must make no difference.
As the collection interval increases, the difference between serial
and parallel component counter collection will approach zero; see also
the Central Limit Theorem.
Change-Id: I36ebdcf125e8882cca8a1929ec58f17fba1ad8f1
Signed-off-by: Dave Barach <dave@barachs.net>
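The "approximate factor-of-2" can be made concrete with a small calculation, assuming each collection pass takes one fixed interval. `collection_passes` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>

/* Number of collection passes needed for n_stats counters when
   `width` counters run at the same time (ceiling division). */
static unsigned
collection_passes (unsigned n_stats, unsigned width)
{
  assert (width > 0);
  return (n_stats + width - 1) / width;
}
```

With 2-way collection, 4 stats need 2 passes instead of 4, exactly half. With 5 stats the result is 3 passes instead of 5, since the odd counter still needs a pass of its own.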
|
|
Built a tool to chew up https://download.01.org/perfmon/mapfile.csv,
and output a table in this format:
    typedef struct {
      u8 model;
      u8 stepping;
      u8 has_stepping;
      char *filename;
    } file_by_model_and_stepping_t;

    static const file_by_model_and_stepping_t fms_table [] =
    {
      /* model, stepping, stepping valid, file */
      { 0x2E, 0x0, 0, "NehalemEX_core_V2.json" },
      { 0x1E, 0x0, 0, "NehalemEP_core_V2.json" },
      <snip>
      { 0x55, 0x5, 1, "cascadelakex_core_v1.00.json" },
      { 0x55, 0x6, 1, "cascadelakex_core_v1.00.json" },
      { 0x55, 0x7, 1, "cascadelakex_core_v1.00.json" },
      <snip>
Change-Id: Ie0e8a7e851799e9d060b966047745039c066ec7b
Signed-off-by: Dave Barach <dave@barachs.net>
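A table in this format could be searched along the following lines. The `lookup_file` helper is hypothetical, and it assumes that an entry with `has_stepping` set to 0 matches every stepping of its model. Only the rows shown in the commit message are reproduced.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

typedef struct
{
  u8 model;
  u8 stepping;
  u8 has_stepping;
  char *filename;
} file_by_model_and_stepping_t;

/* Sample rows copied from the commit message; the real table is longer. */
static const file_by_model_and_stepping_t fms_sample[] = {
  /* model, stepping, stepping valid, file */
  { 0x2E, 0x0, 0, "NehalemEX_core_V2.json" },
  { 0x1E, 0x0, 0, "NehalemEP_core_V2.json" },
  { 0x55, 0x5, 1, "cascadelakex_core_v1.00.json" },
  { 0x55, 0x6, 1, "cascadelakex_core_v1.00.json" },
  { 0x55, 0x7, 1, "cascadelakex_core_v1.00.json" },
};

#define FMS_SAMPLE_LEN (sizeof (fms_sample) / sizeof (fms_sample[0]))

/* Return the event file for a CPU, or NULL when nothing matches.
   Assumption: rows without a valid stepping match any stepping. */
static const char *
lookup_file (const file_by_model_and_stepping_t *t, size_t n, u8 model,
	     u8 stepping)
{
  for (size_t i = 0; i < n; i++)
    {
      if (t[i].model != model)
	continue;
      if (t[i].has_stepping && t[i].stepping != stepping)
	continue;
      return t[i].filename;
    }
  return NULL;
}
```

For example, model 0x55 with stepping 0x6 resolves to the Cascade Lake file, while model 0x55 with stepping 0x1 finds nothing in this reduced sample.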
|
|
Change-Id: Id4f37f5d4a03160572954a416efa1ef9b3d79ad1
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
The license issue is resolved, so we can package the .json
files. Added to the vpp-dev package in .tar.xz form, which saves a lot
of space.
Updated the perfmon error log entry: tell folks where to find the
compressed tarball, and how to extract it.
Change-Id: I3ed351fbf154cc3ba22d5f9c666acff77a2a14cf
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I20f2fb14e00f3e7e96774959a4bf1a159ab9030f
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Added/tested additional cpuids from our testbed.
Change-Id: Ifd3ea9e8e8231a8901966903bf5eceb635b82482
Signed-off-by: Paul Vinciguerra <pvinci@vinciconsulting.com>
|
|
Change-Id: Ie5a00c15ee9536cc61afab57f6cadc1aa1972f3c
Signed-off-by: Dave Barach <dave@barachs.net>
|