Age | Commit message | Author | Files | Lines |
|
Missing an increment in the while loop. Hashes not stored in the array.
Type: fix
Signed-off-by: Steven Luong <sluong@cisco.com>
Change-Id: I603027f5a7305478f48a102ac8035ffde9102c53
(cherry picked from commit 0471cdbd3fe04a88a8b70b5f0eff0c378e19abf7)
|
|
Type: fix
Change-Id: Ibb7ba878b049b8b18e890c43fdd6324cb88d63b8
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: feature
Change-Id: I913f08383ee1c24d610c3d2aac07cef402570e2c
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Use consistent API types.
Type: fix
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
Change-Id: Idbba4ab6a412b75338e3149e51476693f0862f16
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
|
|
Not all interfaces have the same characteristics within the bonding group.
For active-backup mode, we should do our best to select the slave that
performs the best as the primary slave. We already do that by preferring
the slave that is on the local numa node. Sometimes, this is not enough. For
example, when all slaves are on the local numa node, the selection is
arbitrary. Some slave interfaces may have higher speed or better qos than
the others, but this is hard to infer.
One rule does not fit all. So we let the operator optionally specify the
weight for each slave interface. Our primary slave selection rule is now
1. biggest weight
2. is local numa
3. current primary slave (to avoid churn)
4. lowest sw_if_index (for deterministic behavior)
This selection rule only applies to active-backup mode, in which only one slave
is used for forwarding traffic until it becomes unreachable. At that time,
the next "best" slave candidate is automatically promoted. The slaves are
sorted according to the preference rule when they are up, so there is no need
to search for the next best candidate when the primary slave goes down.
Another good thing about this rule is that when the down slave comes back up,
it is selected as the primary slave again unless a "better" slave was added
while it was down.
To set the weight for the slave interface, do this after the interface is
enslaved:
set interface bond <interface-name> weight <value>
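The four-step selection rule above can be sketched as a sort comparator. This is a minimal illustration, not the code from the patch: the slave_pref_t record, its field names, and the helper names are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-slave record; field names are illustrative only. */
typedef struct
{
  uint32_t sw_if_index;
  uint32_t weight;		/* operator-configured weight */
  int is_local_numa;		/* non-zero if on the local numa node */
  int is_primary;		/* non-zero if currently the primary slave */
} slave_pref_t;

/* qsort comparator: returns < 0 when a is preferred over b. */
static int
slave_pref_cmp (const void *pa, const void *pb)
{
  const slave_pref_t *a = pa, *b = pb;

  /* 1. biggest weight */
  if (a->weight != b->weight)
    return a->weight > b->weight ? -1 : 1;
  /* 2. is local numa */
  if (!!a->is_local_numa != !!b->is_local_numa)
    return a->is_local_numa ? -1 : 1;
  /* 3. current primary slave (to avoid churn) */
  if (!!a->is_primary != !!b->is_primary)
    return a->is_primary ? -1 : 1;
  /* 4. lowest sw_if_index (for deterministic behavior) */
  if (a->sw_if_index != b->sw_if_index)
    return a->sw_if_index < b->sw_if_index ? -1 : 1;
  return 0;
}

/* Sort so that slaves[0] is the preferred primary and slaves[1] is the
   next candidate to promote. */
static void
slave_pref_sort (slave_pref_t *slaves, size_t n)
{
  qsort (slaves, n, sizeof (slaves[0]), slave_pref_cmp);
}
```

Keeping the list sorted means promotion after a failure is just "take the next entry", which matches the note that no search is needed when the primary goes down.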
Type: feature
Signed-off-by: Steven Luong <sluong@cisco.com>
Change-Id: I59ced6d20ce1dec532e667dbe1afd1b4243e04f9
|
|
Add /if/lacp/<bond-sw_if_index>/<slave-sw_if_index>/state
<bond-sw_if_index> is a vector of the bond sw_if_index
<slave-sw_if_index> is a vector of the slave sw_if_index
Content is the integer value of the lacp actor state. The state is actually
a bitfield as described in the lacp protocol spec.
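As a reading aid for the integer value, the sketch below decodes the actor state bits using the bit positions defined in IEEE 802.1AX. The enum and function names are hypothetical and are not the identifiers used in the VPP source.

```c
#include <assert.h>
#include <stdint.h>

/* Actor/partner state bits as laid out in IEEE 802.1AX (LACP).
   Names are illustrative, not VPP's. */
enum
{
  LACP_STATE_ACTIVITY = 1 << 0,		/* 1 = active, 0 = passive */
  LACP_STATE_TIMEOUT = 1 << 1,		/* 1 = short timeout, 0 = long */
  LACP_STATE_AGGREGATION = 1 << 2,
  LACP_STATE_SYNCHRONIZATION = 1 << 3,
  LACP_STATE_COLLECTING = 1 << 4,
  LACP_STATE_DISTRIBUTING = 1 << 5,
  LACP_STATE_DEFAULTED = 1 << 6,
  LACP_STATE_EXPIRED = 1 << 7,
};

/* Returns 1 when the port is in sync and both collecting and
   distributing, i.e. it is carrying traffic for the aggregate. */
static int
lacp_state_is_forwarding (uint8_t state)
{
  const uint8_t need = LACP_STATE_SYNCHRONIZATION
    | LACP_STATE_COLLECTING | LACP_STATE_DISTRIBUTING;

  return (state & need) == need;
}
```

For example, a value of 0x3d read from the stat segment has the activity, aggregation, synchronization, collecting and distributing bits set.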
Type: feature
Signed-off-by: Steven Luong <sluong@cisco.com>
Change-Id: Ic6eca8ce2a1acd2d858e4e50b7eac1d000ea08e5
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Virtual interfaces may be part of the bonding like physical interfaces. The
difference is virtual interfaces may disappear dynamically. As an example,
the following CLI sequence may crash the debug image
create vhost-user socket /tmp/sock1
create bond mode lacp
bond add BondEthernet0 VirtualEthernet0/0/0
delete vhost-user VirtualEthernet0/0/0
Notice the virtual interface is deleted without first doing bond delete.
The proper order is to first remove the slave interface from the bond prior
to deleting the virtual interface as shown below. But we should handle it
anyway.
create vhost-user socket /tmp/sock1
create bond mode lacp
bond add BondEthernet0 VirtualEthernet0/0/0
bond del VirtualEthernet0/0/0 <-----
delete vhost-user VirtualEthernet0/0/0
The fix is to register for VNET_SW_INTERFACE_ADD_DEL_FUNCTION and remove
the slave interface from the bond if the to-be-deleted interface is part of
the bond. We check that the interface is actually up before we send
the lacp pdu. Up means both hw and sw admin up.
Type: fix
Signed-off-by: Steven Luong <sluong@cisco.com>
Change-Id: If4d2da074338b16aab0df54e00d719e55c45221a
|
|
show interface does not display the RX counters for the bond
interfaces. It displays rx-no-buf instead.
The problem is VNET_INTERFACE_COUNTER_RX is a combined counter,
not a simple counter. Change the code to use
vlib_increment_combined_counter passing it with n_rx_packets and
n_rx_bytes.
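The difference between the two counter kinds can be modeled as follows. This is a standalone sketch with hypothetical types; it does not reproduce the actual signature of vlib_increment_combined_counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a simple counter holds one value, a combined
   counter tracks packets and bytes together. */
typedef struct
{
  uint64_t packets;
  uint64_t bytes;
} combined_counter_t;

/* Update both halves in one call, as a combined counter requires. */
static void
combined_counter_add (combined_counter_t *c, uint64_t n_packets,
		      uint64_t n_bytes)
{
  c->packets += n_packets;
  c->bytes += n_bytes;
}
```

Treating a combined counter index as a simple counter index updates a different counter altogether, which is consistent with the rx-no-buf symptom described above.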
Type: fix
Change-Id: I8121ad7e546447049fa13da62481b6c8f5575bec
Signed-off-by: Steven Luong <sluong@cisco.com>
|
|
Type: feature
Change-Id: Icd718c98ba2fa900cafaf1a59dfb100ee9914ec9
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
1. "numa-only" is optional and is disabled by default for lacp mode.
2. update lacp doc.
Type: fix
Change-Id: I6a3a8423ef31ad9980353a796957693cd6205d73
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
If numa-only is set, only slaves on the local numa node
transmit packets, provided at least one such slave exists; otherwise
the bond interface works as usual.
CLI change:
create bond mode lacp [load-balance { l2 | l23 | l34 } {numa-only}]
[hw-addr <mac-address>] [id <if-id>]
The new member "u8 numa_only;" is also added to bond_create_if_args_t.
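The numa-only behavior can be sketched as a filter with a fallback. The numa_slave_t type and numa_only_select function are hypothetical names used for illustration; the patch itself may structure this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slave record for illustration. */
typedef struct
{
  uint32_t sw_if_index;
  int is_local_numa;
} numa_slave_t;

/* Fill out[] with the slaves eligible to transmit and return the count.
   out[] must have room for n entries. */
static size_t
numa_only_select (const numa_slave_t *slaves, size_t n, int numa_only,
		  uint32_t *out)
{
  size_t i, n_out = 0;

  if (numa_only)
    for (i = 0; i < n; i++)
      if (slaves[i].is_local_numa)
	out[n_out++] = slaves[i].sw_if_index;

  /* numa-only disabled, or no local slave found: use all slaves */
  if (n_out == 0)
    for (i = 0; i < n; i++)
      out[n_out++] = slaves[i].sw_if_index;

  return n_out;
}
```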
Type: feature
Change-Id: Icdccedafb0738d8c9d4a5acce909ce562428c071
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
Type: style
Change-Id: I28908756019f8ca54c50334c470d8eded5621ade
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
This patch enables bonding numa awareness on multi-socket
servers working in active-backup mode.
VPP adds the capability to automatically prefer a slave
on the local numa node in order to reduce the load on the
QPI bus and improve overall system performance in multi-socket
use cases. Users do not need to perform any extra operation.
Change-Id: Iec267375fc399a9a0c0a7dca649fadb994d36671
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
1. remove unnecessary cast for void * pointer.
2. remove the unused input parameter.
Change-Id: Ic0324364fc0c772200d30fb18a0ba959ed4f7ea4
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
This reverts commit 5d0d5494db58422eb528c0f8b39a86ea966505e9.
The csit crash was actually due to the test image missing the patch
https://gerrit.fd.io/r/#/c/17731/
It was a mistake to revert the original patch
https://gerrit.fd.io/r/#/c/15577/
Change-Id: I7fc563981aa13d308d55b25194fee21475ebc57d
Signed-off-by: Steven Luong <sluong@cisco.com>
|
|
Some API action handlers called vl_msg_api_send_shmem()
directly. That breaks Unix domain socket API transport.
A couple (bond / vhost) also tried to send a sw_interface_event
directly, but did not send the message to all that had
registered interest. That scheme never worked correctly.
Refactored and improved the interface event code.
Change-Id: Idb90edfd8703c6ae593b36b4eeb4d3ed7da5c808
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
By definition, passive mode means the node does not start sending lacp pdu until
it first hears from the partner or remote.
- Rename ptx machine's BEGIN state to NO_PERIODIC state.
- Put periodic machine in NO_PERIODIC state when the interface is enabled for
lacp. ptx machine will transition out of NO_PERIODIC state when the local node
hears from the remote or when the local node is configured for active mode.
- Also add send and receive statistics for debugging.
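The passive-mode rule can be sketched as a transition function. This is a deliberately simplified model: the two-state ptx_state_t enum and the ptx_next_state helper are hypothetical, and the real periodic machine has more states than shown here.

```c
#include <assert.h>

/* Simplified, hypothetical view of the periodic transmission machine. */
typedef enum
{
  PTX_NO_PERIODIC,	/* no periodic lacp pdu is sent */
  PTX_PERIODIC,		/* periodic lacp pdus are sent */
} ptx_state_t;

/* Leave NO_PERIODIC only when the local node is configured for active
   mode or has heard from the remote. */
static ptx_state_t
ptx_next_state (ptx_state_t cur, int is_active_mode, int heard_from_partner)
{
  if (cur == PTX_NO_PERIODIC && (is_active_mode || heard_from_partner))
    return PTX_PERIODIC;
  return cur;
}
```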
Change-Id: I747953b9595ed31328b2f4f3e7a8d15d01e04d7f
Signed-off-by: Steven Luong <sluong@cisco.com>
|
|
Change-Id: I53ab8d17914e6563110354e4052109ac02bf8f3b
Signed-off-by: Paul Vinciguerra <pvinci@vinciconsulting.com>
|
|
This reverts commit e63325e3ca03c847963863446345e6c80a2c0cfd.
Allow time for CSIT to accommodate.
Change-Id: I59435e4ab5e05e36a2796c3bf44889b5d4823cc2
Signed-off-by: ot@cisco.com
|
|
Use of consistent API types for interface.api
Change-Id: Ieb54cebb4ac96b432a3f0b41596718aa2f34885b
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
|
|
During CSIT testing we discovered that LACP tests were failing and
producing coredumps. Reverting this patch fixes the problem with VPP
crashing.
This reverts commit f23890138e02d4218c828c427f687f8ecdb0e165.
Change-Id: Icf97053ce1473350add885cbebe591f7f3efcbea
Signed-off-by: Peter Mikus <pmikus@cisco.com>
|
|
-fno-common makes sure we do not have multiple declarations of the same
global symbol across compilation units. It helps debug nasty linkage
bugs by guaranteeing that all references to a global symbol use the same
underlying object.
It also helps avoid benign mistakes such as declaring enums as global
objects instead of types in headers (hence the minor fixes scattered
across the source).
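The enum mistake mentioned above looks roughly like this. The identifiers are hypothetical and chosen only for illustration; they do not come from the VPP headers touched by the patch.

```c
#include <assert.h>

/* Mistake: placed in a header, the declaration below defines an OBJECT
   named example_mode in every compilation unit that includes it.
   Without -fno-common the linker may silently merge the copies; with
   -fno-common it reports a duplicate definition.

   enum { EXAMPLE_MODE_A, EXAMPLE_MODE_B } example_mode;
*/

/* Intended: declare a TYPE. The header allocates no storage. */
typedef enum
{
  EXAMPLE_MODE_A,
  EXAMPLE_MODE_B,
} example_mode_t;

/* An object of that type is then defined in exactly one .c file. */
example_mode_t example_current_mode = EXAMPLE_MODE_B;
```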
Change-Id: I55c16406dc54ff8a6860238b90ca990fa6b179f1
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
We register callbacks for VNET_HW_INTERFACE_LINK_UP_DOWN_FUNCTION and
VNET_SW_INTERFACE_ADMIN_UP_DOWN_FUNCTION to add and remove the slave
interface from the bond interface accordingly. For static bonding without
lacp, one would think that it is good enough to put the slave interface into
the active slave set as soon as it is configured. Wrong. Sometimes the slave
interface is configured to be part of the bonding without ever bringing up the
hardware carrier or setting the admin state to up. In that case, we send
traffic to the "dead" slave interface.
The fix is to make sure both the carrier and admin state are up before we put
the slave into the active set for forwarding traffic.
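A minimal model of that rule is shown below. The bond_model_t type and bond_slave_state_change helper are hypothetical; in VPP the equivalent logic would run from the two registered callbacks rather than from a single function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per slave in the active set. */
typedef struct
{
  uint32_t active_mask;
} bond_model_t;

/* Called on any link or admin state change for a slave.
   Assumes slave_bit < 32. */
static void
bond_slave_state_change (bond_model_t *b, unsigned slave_bit, int carrier_up,
			 int admin_up)
{
  if (carrier_up && admin_up)
    b->active_mask |= 1u << slave_bit;	/* eligible to forward */
  else
    b->active_mask &= ~(1u << slave_bit);	/* "dead": remove it */
}
```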
Change-Id: I93b1c36d5481ca76cc8b87e8ca1b375ca3bd453b
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I78fe58144fa3ba2e1c7135897a13a2541f235c91
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
|
|
Change-Id: Iba750a41262cc028ad0363fff78cc219e4a33538
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Id4f37f5d4a03160572954a416efa1ef9b3d79ad1
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Typically we have scalar_size == 0, so it doesn't matter,
but vlib_frame_args was providing a pointer to the scalar frame
data, not the vector data. To avoid future confusion, the function
is renamed to vlib_frame_scalar_args(...)
Change-Id: I48b75523b46d487feea24f3f3cb10c528dde516f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
When the last interface is removed from l2 in the bonding group, we should
invoke ethernet_set_rx_direct to allow ip packets to go directly to
ip4-input.
Change-Id: I43b3cd64e2c119762edd0c295bb9348732adab45
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: Ied34720ca5a6e6e717eea4e86003e854031b6eab
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Break up bond tx function into multiple small workloads:
1. parse the packet header and hash it based on the configured algorithm
2. optionally, trace the packet
3. convert the hash value from (1) to the slave port
4. update the buffers with the slave sw_if_index
5. Add the buffers to the queues
6. Create and send the frames
old numbers
-----------
Time 5.3, average vectors/node 223.74, last 128 main loops 40.00 per node 222.61
vector rates in 3.3627e6, out 6.6574e6, drop 3.3964e4, punt 0.0000e0
Name State Calls Vectors Suspends Clocks Vectors/Call
BondEthernet0-output active 68998 17662979 0 1.89e1 255.99
BondEthernet0-tx active 68998 17662979 0 2.60e1 255.99
TenGigabitEthernet3/0/1-output active 68998 8797416 0 1.03e1 127.50
TenGigabitEthernet3/0/1-tx active 68998 8797416 0 7.85e1 127.50
TenGigabitEthernet7/0/1-output active 68996 8865563 0 1.02e1 128.49
TenGigabitEthernet7/0/1-tx active 68996 8865563 0 7.65e1 128.49
new numbers
-----------
BondEthernet0-output active 304064 77840384 0 2.29e1 256.00
BondEthernet0-tx active 304064 77840384 0 2.47e1 256.00
TenGigabitEthernet3/0/1-output active 304064 38765525 0 1.03e1 127.49
TenGigabitEthernet3/0/1-tx active 304064 38765525 0 7.66e1 127.49
TenGigabitEthernet7/0/1-output active 304064 39074859 0 1.01e1 128.51
Change-Id: I3ef9a52bfe235559dae09d055c03c5612c08a0f7
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
active-backup mode is using l2 load balance algo. It should be using
active-backup. Also notice that the output is missing a character.
vpp# create bond mode active-backup
create bond mode active-backup
vpp# sh bond
sh bond
interface name   sw_if_index   mode         load balance  active slaves  slaves
BondEthernet0    6             xor          l34           2              2
BondEthernet1    9             xor          l34           1              1
BondEthernet2    10            active-backu l2            0              0
vpp#
Change-Id: If5ed0cc6c25f6c2ddabec15ff6188b34923d38e3
Signed-off-by: Steven <sluong@cisco.com>
|
|
Introduce bond_tx_inline, which takes lb as a constant so that gcc can optimize for it.
The numbers appear a tad better for 256-byte frames.
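The general pattern is an always-inlined worker that receives the algorithm as a compile-time constant from thin wrappers. The sketch below shows the idea only: the names are hypothetical and the two "hashes" are placeholders, not the ones bond_tx_inline uses.

```c
#include <assert.h>
#include <stdint.h>

typedef enum
{
  EXAMPLE_LB_L2,
  EXAMPLE_LB_L34,
} example_lb_t;

/* Worker: lb is a constant at every call site below, so after inlining
   the compiler can drop the branch that is not taken. */
static inline __attribute__ ((always_inline)) uint32_t
example_hash_inline (const uint8_t *hdr, example_lb_t lb)
{
  if (lb == EXAMPLE_LB_L2)
    return hdr[0] ^ hdr[1];	/* placeholder hash */
  return hdr[2] ^ hdr[3];	/* placeholder hash */
}

/* One specialized entry point per algorithm. */
static uint32_t
example_hash_l2 (const uint8_t *hdr)
{
  return example_hash_inline (hdr, EXAMPLE_LB_L2);
}

static uint32_t
example_hash_l34 (const uint8_t *hdr)
{
  return example_hash_inline (hdr, EXAMPLE_LB_L34);
}
```

The caller picks the specialized entry point once per frame, so the per-packet loop carries no algorithm check.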
with the patch
--------------
Thread 2 vpp_wk_1 (lcore 3)
Time 4.3, average vectors/node 224.00, last 128 main loops 40.00 per node 222.61
vector rates in 8.4836e6, out 1.6967e7, drop 0.0000e0, punt 0.0000e0
Name State Calls Vectors Suspends Clocks Vectors/Call
BondEthernet0-output active 141054 36109824 0 2.51e1 256.00
BondEthernet0-tx active 141054 36109824 0 2.55e1 256.00
TenGigabitEthernet6/0/0-output active 141054 18055469 0 9.43e0 128.00
TenGigabitEthernet6/0/0-tx active 141054 18055469 0 6.97e1 128.00
TenGigabitEthernet6/0/1-output active 141054 18054355 0 9.54e0 127.99
TenGigabitEthernet6/0/1-tx active 141054 18054355 0 7.05e1 127.99
bond-input active 141054 36109824 0 1.76e1 256.00
dpdk-input polling 70527 36109824 0 5.03e1 512.00
ethernet-input active 141054 36109824 0 6.12e1 256.00
ip4-input active 141054 36109824 0 3.26e1 256.00
ip4-lookup active 141054 36109824 0 2.94e1 256.00
ip4-rewrite active 141054 36109824 0 3.27e1 256.00
without the patch
-----------------
Thread 2 vpp_wk_1 (lcore 3)
Time 4.3, average vectors/node 224.00, last 128 main loops 40.00 per node 222.61
vector rates in 8.4443e6, out 1.6889e7, drop 0.0000e0, punt 0.0000e0
Name State Calls Vectors Suspends Clocks Vectors/Call
BondEthernet0-output active 142744 36542464 0 2.51e1 256.00
BondEthernet0-tx active 142744 36542464 0 2.67e1 256.00
TenGigabitEthernet6/0/0-output active 142744 18270813 0 9.19e0 127.99
TenGigabitEthernet6/0/0-tx active 142744 18270813 0 6.98e1 127.99
TenGigabitEthernet6/0/1-output active 142744 18271651 0 9.43e0 128.00
TenGigabitEthernet6/0/1-tx active 142744 18271651 0 7.02e1 128.00
bond-input active 142744 36542464 0 1.76e1 256.00
dpdk-input polling 71372 36542464 0 5.08e1 512.00
ethernet-input active 142744 36542464 0 6.15e1 256.00
ip4-input active 142744 36542464 0 3.23e1 256.00
ip4-lookup active 142744 36542464 0 2.96e1 256.00
ip4-rewrite active 142744 36542464 0 3.28e1 256.00
Change-Id: I9fd43eda3c735cbff680ac6d2f01ecdae81f0eda
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- Reduce per packet cost by buffering the output packet buffer indexes in the queue and
process the queue outside the packet processing loop.
- Move unnecessary variable initialization outside of the while loop.
- There is no need to save the old interface if tracing is not enabled.
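The buffering idea in the first item can be sketched as two phases: the per-packet loop only records buffer indexes, and the frames are sent afterwards. All names here are hypothetical, and send_frame_fn stands in for whatever hands a frame to the slave's output node.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_FRAME_SIZE 256

typedef struct
{
  uint32_t buffers[MODEL_FRAME_SIZE];
  uint32_t n_buffers;
} port_queue_t;

/* Hypothetical callback that transmits one frame on a port. */
typedef void (*send_frame_fn) (uint32_t port, const uint32_t *bi, uint32_t n);

/* Phase 1, inside the packet loop: record the buffer index in the queue
   of the chosen port. Assumes n <= MODEL_FRAME_SIZE and that every
   port[i] is a valid queue index. */
static void
queue_buffers (port_queue_t *queues, const uint32_t *bi,
	       const uint32_t *port, uint32_t n)
{
  for (uint32_t i = 0; i < n; i++)
    {
      port_queue_t *q = &queues[port[i]];
      q->buffers[q->n_buffers++] = bi[i];
    }
}

/* Phase 2, outside the loop: send each non-empty queue as one frame and
   reset it. Returns the number of frames sent. */
static uint32_t
flush_queues (port_queue_t *queues, uint32_t n_ports, send_frame_fn send)
{
  uint32_t n_frames = 0;

  for (uint32_t p = 0; p < n_ports; p++)
    {
      port_queue_t *q = &queues[p];
      if (q->n_buffers == 0)
	continue;
      if (send)
	send (p, q->buffers, q->n_buffers);
      q->n_buffers = 0;
      n_frames++;
    }
  return n_frames;
}
```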
Test results for 256-byte packets are compared below. Other packet sizes show similar improvement.
With the patch
--------------
BondEthernet0-output active 52836 13526016 0 1.71e1 256.00
BondEthernet0-tx active 52836 13526016 0 2.68e1 256.00
TenGigabitEthernet6/0/0-output active 52836 6762896 0 9.17e0 127.99
TenGigabitEthernet6/0/0-tx active 52836 6762896 0 6.97e1 127.99
TenGigabitEthernet6/0/1-output active 52836 6763120 0 9.40e0 128.00
TenGigabitEthernet6/0/1-tx active 52836 6763120 0 7.00e1 128.00
bond-input active 52836 13526016 0 1.76e1 256.00
Without the patch
-----------------
BondEthernet0-output active 60858 15579648 0 1.73e1 256.00
BondEthernet0-tx active 60858 15579648 0 2.94e1 256.00
TenGigabitEthernet6/0/0-output active 60858 7789626 0 9.29e0 127.99
TenGigabitEthernet6/0/0-tx active 60858 7789626 0 7.01e1 127.99
TenGigabitEthernet6/0/1-output active 60858 7790022 0 9.31e0 128.00
TenGigabitEthernet6/0/1-tx active 60858 7790022 0 7.10e1 128.00
bond-input active 60858 15579648 0 1.77e1 256.00
Change-Id: Ib6d73a63ceeaa2f1397ceaf4c5391c57fd865b04
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I0c3f2add35ad9fc11308b7a2a2c69ffd8472dd2e
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
After the slave interface is removed from bond, bond input node still receives traffic for
the slave interface.
We have to disable feature arc for the corresponding slave interface.
Change-Id: I44e7001e6685e290b032c48147d02911a55d547b
Signed-off-by: Steven <sluong@cisco.com>
|
|
This significantly reduces the need for
...
in multiarch code. Constructor macros will simply create a static unused
entry if CLIB_MARCH_VARIANT is defined, and that will be optimized out by the
compiler.
Change-Id: I17d1c4ac0c903adcfadaa4a07de1b854c7ab14ac
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ieb8b53977fc8484c19780941e232ee072b667de3
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
It is cheaper to get thread index from vlib_main_t if available...
Change-Id: I4582e160d06d9d7fccdc54271912f0635da79b50
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- Modify the API send_ip6_na and send_ip4_garp to take sw_if_index instead
of vnet_hw_interface_t and add call to build_ethernet_rewrite to support
subinterface/vlan
- Add code to bonding driver to send an event to bond_process when the first
interface becomes active or when the active interface is down
- Create a bond_process to walk the interface and the corresponding
subinterfaces to send garp/ip6_na when an event is received.
- Minor cleanup in bonding/node.c
Note: dpdk bonding driver does not send garp/ip6_na for subinterfaces. There is
no attempt to fix it here. But the infra is now done and should be easy to
add the support.
Change-Id: If3ecc4cd0fb3051330f7fa11ca0dab3e18557ce1
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I00fc4a4553dabed7ef099227b8253ed4916ea5e4
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ibab5e27277f618ceb2d543b9d6a1a5f191e7d1db
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Old code ~25 clocks/packet, new ~10.
Change-Id: I202cd6cbafb1ab2296939634d674f7ffd28253fc
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- hash is great. But it is a bit too slow for the DP. Use direct array indexing
to quickly retrieve the slave interface.
- the algorithm used by flow hash is great. But it is a bit too slow for the DP.
Use l2_hash_hash() extracted from lb_hash.h which ECMP is using. It makes use
of intrinsic crc32 instruction set.
- shortcut modulo arithmetic when the operand is 2**x (where x is up to 4) to
avoid division instruction.
- special case for link count == 1 in bond_tx_fn()
- use clib_mem_unaligned to access data for the packet to avoid alignment error
- Fix some typos for packet tracing.
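The modulo shortcut from the list above can be sketched as follows. This version uses a generic power-of-two test rather than special-casing only 2**x for x up to 4, so it is an illustration of the idea and not the patch's code; hash_to_port is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Map a hash to a port index in [0, n_links). n_links must be non-zero. */
static inline uint32_t
hash_to_port (uint32_t hash, uint32_t n_links)
{
  /* For a power of two, masking gives the same result as modulo
     without a division instruction. */
  if ((n_links & (n_links - 1)) == 0)
    return hash & (n_links - 1);
  return hash % n_links;
}
```

With n_links == 1 the mask is 0, so every packet maps to port 0, which lines up with the special case for a single link.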
Change-Id: I8eae3ad497061c5473aa675ba894ee0211120d25
Signed-off-by: Steven <sluong@cisco.com>
|
|
[VPP-1251]
Problem:
When the bond subinterface is removed, it was observed that we lost the lacp
partner. Show hardware shows rx counter goes up, but show interface does not
for the slave interfaces.
Cause:
We reset the interface promiscuous mode when the bond subinterface is deleted.
This causes dpdk not to accept any packet. Leaving the interface in promiscuous
mode fixes the problem.
Other fixes:
There are a few places where we use hw_if_index as if it were sw_if_index. But they
don't necessarily have the same value. As soon as a subinterface is created,
they start to diverge. The fix is to use the correct API for the hw_if_index
and sw_if_index.
Change-Id: I1e6b8bca0a4aae396d217a141271cbf968500c91
Signed-off-by: Steven <sluong@cisco.com>
(cherry picked from commit 42c6599bf3057a7e8f4f00f5b6a9dd72af48d283)
|
|
In dpdk based bonding, when the bond interface is configured for l2,
it automatically sets the bond interface to promiscuous mode and sets rx
redirect to ethernet-input. This allows traffic to be bridged to
non compute node facing interface when it is received from the compute
node interface.
For native vpp bonding, we need to do similar things. When the bond interface
is configured for l2, we set the slave interfaces to promiscuous mode
and set rx redirect to ethernet-input because dpdk does not know anything
about the bond interface. Likewise, when a new interface is enslaved, we also
need to do the same thing if the bond interface has already been configured
for l2.
Change-Id: I7e168008e8a4221be74929b2a20e6db0ce8f3110
Signed-off-by: Steven <sluong@cisco.com>
|
|
While https://gerrit.fd.io/r/#/c/11316/ took care of 1 packet/frame for
most of the bonding modes, it missed the broadcast mode. This patch is
to fix the 1 packet/frame for the broadcast mode.
Change-Id: Iac48a2977c7f702f341479cc712a6448090dbc60
Signed-off-by: Steven <sluong@cisco.com>
|
|
For the debug image, if the interface is removed and the trace was
collected prior to the interface delete, show trace may cause a crash.
This is because vnet_get_sw_interface_name and vnet_get_sup_hw_interface
are not safe if the interface is deleted.
The fix is to use format_vnet_sw_if_index_name if all we need is to
get the interface name in the trace to display. It would show "DELETED"
which is better than a crash.
Change-Id: I912402d3e71592ece9f49d36c8a6b7af97f3b69e
Signed-off-by: Steven <sluong@cisco.com>
|
|
Rename "enslave interface <slave> to <BondEthernetx>" to
"bond add <BondEthernetx> <slave>" and
"detach interface <slave>" to
"bond del <slave>".
Change-Id: I1bf8f017517b1f8a823127c7efedd3766e45cd5b
Signed-off-by: Steven <sluong@cisco.com>
|
|
coverity complains about statements in function A
function A
{
x % vec_len (y)
}
because vec_len (y) is a macro and may return 0 if the pointer y is null.
But coverity fails to realize that the same vec_len (y) was already
evaluated and checked in the caller of function A, which punts if vec_len (y) is 0.
We can fix the coverity warning and shave off a few cpu cycles by caching
the result of vec_len (y) and passing it around to avoid calling vec_len (y)
again in multiple places.
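The caching approach can be sketched with a plain array standing in for a VPP vector. The model_vec_t type and both function names are hypothetical, and the length field plays the role of vec_len (y).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a vector with a length. */
typedef struct
{
  const uint32_t *items;
  uint32_t len;
} model_vec_t;

/* Callee ("function A"): takes the already-validated length, so the
   modulo can never see a zero divisor here. */
static uint32_t
pick_item (const uint32_t *items, uint32_t hash, uint32_t n_items)
{
  return items[hash % n_items];
}

/* Caller: reads the length once, punts on zero, then passes it down.
   Returns 0 on success, -1 when the vector is empty or null. */
static int
pick_item_checked (const model_vec_t *v, uint32_t hash, uint32_t *out)
{
  uint32_t n = v->items ? v->len : 0;

  if (n == 0)
    return -1;
  *out = pick_item (v->items, hash, n);
  return 0;
}
```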
Change-Id: I095166373abd3af3859646f860ee97c52f12fb50
Signed-off-by: Steven <sluong@cisco.com>
|
|
We were only putting one packet per frame to the output node. Change to
buffer multiple packets per frame. Performance is now on top of dpdk-based
bonding.
Put a spinlock in the tx thread in case the rug is pulled out from under us.
Change-Id: Ifda5af086a984a7301972cd6c8e428217f676a95
Signed-off-by: Steven <sluong@cisco.com>
|