Age | Commit message (Collapse) | Author | Files | Lines |
|
A bit ugly, but generates faster and less noisy code which
should be important for this particular use case.
Type: improvement
Change-Id: If2bba947dac33ffedb4236a5b3fb50fc783668e1
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Refactoring perf metric support to remove branching on bundle type in
the dispatch wrapper. This change includes caching the rdpmc index at
perfmon_start(), so that the mmap_page.index doesn't need to be looked
up each time. It also exclude the effects of mmap_page.index.
This patch prepares the path for bundles that support general, fixed and
metrics counters simulataneously.
Type: refactor
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I9c5b4917bd02fea960e546e8558452c4362eabc4
|
|
The Intel Icelake uArch supports measuring up to 12 counters,
comprised of 4 fixed and 8 general counters.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I68369ea55a0c95d6a4a280a464e69502bbf5474f
|
|
Allow perfmon bundles to support more than one bundle type, either node
or thread. Only used for topdown bundle for the moment.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Iba3653a4deb39b0a8ee8ad448a7e8f954283ccd8
|
|
Added memory bandwidth boundedness bundle, closely related to cache-hierarchy.
This bundle works on ICX only, due to an ICX specific counter.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Id385bd5f4e645ac020774e311c623afb64b79b1e
|
|
Adding support for Linux papi TMAM on Intel Snowridge. Adds the ability to
indicate that a bundle should be thread or node bundle type based on available
cpu features (rdpmc support).
Type: feature
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ib871b2644fdb2410fbb580e0d21c3a8e2be13aba
|
|
Revert raw column from the perfmon plugin.
Type: refactor
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: If127f57ee2022cc1c0ea5177f1655a792f195f1d
|
|
Adding perfmon node TMAM support on ICX.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I48a9a9ff6a72efc28eaf0cb11ef39fb62cebb126
|
|
Original set, start, stop, reset, show etc interface was somewhat cumbersome, we
can improve slightly by combining set and start.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: I7b865b2c29d2ab32adbd24d7f8a580da6990bb76
|
|
Add perfmon plugin support to output raw counter and timestamps, both
are useful for debug.
Type: improvement
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Change-Id: Ia5a73d1f05e3464c18991c2346f0ed8b7ef63099
|
|
Type: feature
Change-Id: I2c14f82393d11fc05c6d229f5c58603ab5c0f14d
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: refactor
Change-Id: I1303219f9f2a25d821737665903b0264edd3de32
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Callbacks for monitoring and performance measurement:
- Add new callback list type, with context
- Add callbacks for API, CLI, and barrier sync
- Modify node dispatch callback to pass plugin-specific context
- Modify perfmon plugin to keep PMC samples local to the plugin
- Include process nodes in dispatch callback
- Pass dispatch function return value to callback
Type: refactor
Signed-off-by: Tom Seidenberg <tseidenb@cisco.com>
Change-Id: I28b06c58490611e08d76ff5b01b2347ba2109b22
|
|
EXAMPLE:
src/plugins/perfmon/intel_json_to_c.py \
-i skylakex_core_v1.12.json \
-o src/plugins/perfmon/perfmon_intel_skx.c \
-m 0x55,0 \
-m 0x55,1 \
-m 0x55,2 \
-m 0x55,3
Change-Id: I16ce059e231d340ecfcb6f6638e29c5b46304683
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Add missing pre-input node runtime fork and refork code.
unix-epoll-input runs on all threads; each instance needs its own
runtime stats.
Change-Id: I16b02e42d0c95f863161176c4bb9f9917bef809d
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
As a FUD reduction measure, this patch implements 2-way parallel
counter collection. Synthetic stat component counter pairs run at the
same time. Running two counters (of any kind) at the same time
naturally reduces the aggregate time required by an approximate
factor-of-2, depending on whether an even or odd number of stats have
been requested.
I don't completely buy the argument that computing synthetic stats
such as instructions-per-clock will be inaccurate if component counter
values are collected sequentially. Given uniform traffic pattern, it
must make no difference.
As the collection interval increases, the difference between serial
and parallel component counter collection will approach zero, see also
the Central Limit theorem.
Change-Id: I36ebdcf125e8882cca8a1929ec58f17fba1ad8f1
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: Ie5a00c15ee9536cc61afab57f6cadc1aa1972f3c
Signed-off-by: Dave Barach <dave@barachs.net>
|