summaryrefslogtreecommitdiffstats
path: root/src/vppinfra/atomics.h
AgeCommit message (Collapse)AuthorFilesLines
2019-06-05Switch atomic release API from __sync to __atomic builtin.Sirshak Das1-1/+1
__sync_lock_release switched to __atomic_store for code consitency, although both generate same instructions with current compilers. Change-Id: I37d320509e43a4c2b8a49af6346dc4a43ca2f535 Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
2019-06-05Switch atomic test and set API from __sync to __atomic builtinSirshak Das1-1/+1
__sync_test_and_set uses full memory barriers for AArch64, __atomic_exchange(ACQUIRE) would use load acquire. Change-Id: Ifdf2481db3b9dde6c5842d75671402862adb6d81 Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
2019-04-16svm_fifo rework to avoid contention on cursizeSirshak Das1-0/+3
Problems Addressed: - Contention of cursize by producer and consumer. - Reduce the no of modulo operations. Changes: - Synchronization between producer and consumer changed from cursize to head and tail indexes Implications: reduces the usable size of fifo by 1. - Using weaker memory ordering C++11 atomics to access head and tail based on producer and consumer role. - Head and tail indexes are unsigned 32 bit integers. Additions and subtraction on them are implicit 32 bit Modulo operation. - Adding weaker memory ordering variants of max_enq, max_deq, is_empty and is_full Using them appropriately in all places. Perfomance improvement (iperf3 via Hoststack): iperf3 Server: Marvell ThunderX2(AArch64) - iperf3 Client: Skylake(x86) ~6%(256 rxd/txd) - ~11%(2048 rxd/txd) Change-Id: I1d484e000e437430fdd5a819657d1c6b62443018 Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-03-22svm/atomics: add clib_atomic_swap_rel_nFlorin Coras1-0/+1
Change-Id: Iea2c173000570043beafef58ca923463ce76d872 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-28Use acquire/release ordering when accessing svm_fifo shared variable cursizeSirshak Das1-0/+4
Improves TCP iperf3 performance by ~3% on AArch64. Change-Id: I1e51bd8403ba45ec6af4c2f96b95e884c1ae0d67 Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
2018-11-05Enable atomic swap and store macro with acquire and release orderingSirshak Das1-2/+3
Add atomic swap and store macro with acquire and release ordering respectively. Variable in question is interupt_pending variable which is used as guard variable by input nodes to process the device queue. Atomic Swap is used with Acquire ordering as writes or reads following this in program order should not be reordered before the swap. Atomic Store is used with Release ordering, as post store the node is added to pending list. Change-Id: I1be49e91a15c58d0bf21ff5ba1bd37d5d7d12f7a Original-patch-by: Damjan Marion <damarion@cisco.com> Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
2018-10-19vppinfra: add atomic macros for __sync builtinsSirshak Das1-0/+45
This is first part of addition of atomic macros with only macros for __sync builtins. - Based on earlier patch by Damjan (https://gerrit.fd.io/r/#/c/10729/) Additionally - clib_atomic_release macro added and used in the absence of any memory barrier. - clib_atomic_bool_cmp_and_swap added Change-Id: Ie4e48c1e184a652018d1d0d87c4be80ddd180a3b Original-patch-by: Damjan Marion <damarion@cisco.com> Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com>