path: root/src/vcl/vcl_locked.c
author: Lijian.Zhang <Lijian.Zhang@arm.com>, 2019-07-08 10:33:34 +0800
committer: Damjan Marion <dmarion@me.com>, 2019-09-11 19:20:27 +0000
commit: 86b1871ba212064ceb985be4a6b655ebfe2e32f9
tree: 71d0e9bb6e98a76f79628fdd72f91312b470e30d /src/vcl/vcl_locked.c
parent: 840f64b4b2d6063adebb8c7b31c9357aaaf8dd5e
ip: apply dual loop unrolling in ip4_input
Too many prefetches within unrolled loops create a bottleneck and degrade performance on some CPUs that have fewer cache line fill buffers, e.g., Arm Cortex-A72. Apply dual loop unrolling and tune prefetches manually to resolve the hot spot at the prefetch instructions. This saves about 11.5% of cycles in the ip4_input node on Cortex-A72 CPUs.

Type: feature

Change-Id: I1ac9eb21061a804af2a414b420217fbcda3689c9
Signed-off-by: Lijian Zhang <Lijian.Zhang@arm.com>
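The general shape of a dual loop with hand-placed prefetches might look like the sketch below. It is not the code from this commit (the diff is not shown for this path): `pkt_t`, `check_one`, and `check_dual` are hypothetical names, the TTL check is a stand-in for real header validation, and `__builtin_prefetch` is the GCC/Clang builtin. The point it illustrates is that each iteration handles two packets and issues only two prefetches, so fewer fills are in flight at once than with a wider unroll.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet record for illustration; not a VPP type. */
typedef struct {
    uint8_t ttl;
    uint8_t error;
} pkt_t;

/* Hypothetical per-packet check: flag packets whose TTL is zero. */
static inline void check_one(pkt_t *p)
{
    p->error = (p->ttl == 0);
}

/* Dual loop: process two packets per iteration and prefetch only the
 * next two, keeping the number of outstanding prefetches small.
 * Returns the number of packets flagged with an error. */
static size_t check_dual(pkt_t **pkts, size_t n)
{
    size_t i = 0, n_err = 0;

    /* Require four packets so pkts[i + 2] and pkts[i + 3] are valid. */
    while (i + 4 <= n) {
        __builtin_prefetch(pkts[i + 2], 1 /* for write */, 3 /* keep in cache */);
        __builtin_prefetch(pkts[i + 3], 1, 3);

        check_one(pkts[i]);
        check_one(pkts[i + 1]);
        n_err += pkts[i]->error + pkts[i + 1]->error;
        i += 2;
    }

    /* Single loop for the remaining packets, without prefetching. */
    while (i < n) {
        check_one(pkts[i]);
        n_err += pkts[i]->error;
        i++;
    }
    return n_err;
}
```

Whether two packets per iteration is the right width depends on the target core; the 11.5% figure above was measured for ip4_input on Cortex-A72 and should not be assumed for this sketch.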
Diffstat (limited to 'src/vcl/vcl_locked.c')
0 files changed, 0 insertions, 0 deletions