Bond CNI plugin

Build

It is recommended that Bond CNI be built with Go 1.12+, with dependencies managed using Go modules.

make build-bin
cp ./bin/bond /opt/cni/bin/

The binary should be placed at /opt/cni/bin on all nodes on which bonding will take place - that is, all nodes to which a container with a bonded interface can be deployed.
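
For example - node names here are placeholders for your own - the binary could be copied out with a loop such as:

for node in node1 node2 node3; do
    scp ./bin/bond ${node}:/opt/cni/bin/
done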

Network configuration reference
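
The examples in this document use the following bond plugin parameters; the descriptions below summarise standard Linux bonding semantics and are not a substitute for the full upstream reference:

  name (string): the name of the network.
  type (string): "bond".
  mode (string): bonding mode, e.g. "active-backup".
  miimon (string): MII link monitoring interval in milliseconds.
  mtu (integer): MTU of the bonded interface.
  failOverMac (integer): fail_over_mac policy passed to the bonding driver.
  links (list): the interfaces to enslave to the bond.
  linksInContainer (boolean): look for the link interfaces inside the container rather than on the host (default false).
  ipam (dict): IP address management configuration; may be supplied implicitly by a delegating plugin such as flannel.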

Usage

Standalone operation

Given the following network configuration:

# cat > /etc/cni/net.d/00-flannel-bonding.conf <<EOF
{
    "name": "mynet",
    "type": "flannel",
    "delegate": {
        "type": "bond",
        "mode": "active-backup",
        "miimon": "100",
        "mtu": 1500,
        "failOverMac": 1,
        "links": [
            {
                "name": "ens3f2"
            },
            {
                "name": "ens3f2d1"
            }
        ]
    }
}
EOF

Note: in the example configuration above, the required "ipam" section is provided implicitly by the flannel plugin.
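
When the bond plugin runs standalone without a delegating plugin, "ipam" must be given explicitly. A minimal sketch, reusing the host-local settings from the Kubernetes example later in this document (subnet, gateway and link names are placeholders):

# cat > /etc/cni/net.d/10-mybondnet.conf <<EOF
{
    "cniVersion": "0.3.1",
    "name": "mybondnet",
    "type": "bond",
    "mode": "active-backup",
    "failOverMac": 1,
    "miimon": "100",
    "links": [
        {"name": "ens3f2"},
        {"name": "ens3f2d1"}
    ],
    "ipam": {
        "type": "host-local",
        "subnet": "10.56.217.0/24",
        "routes": [{"dst": "0.0.0.0/0"}],
        "gateway": "10.56.217.1"
    }
}
EOF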

Integration with Multus, SRIOV CNI and SRIOV Device Plugin

Users can take advantage of Multus to enable adding multiple interfaces to a K8s Pod. The SRIOV CNI plugin allows an SRIOV VF (Virtual Function) to be added to a container, and the SRIOV Device Plugin allows Kubelet to manage SRIOV virtual functions. This example shows how Bond CNI can be used in conjunction with these plugins to handle more advanced use cases, e.g. a high-performance container networking solution for NFV environments. Specifically, the steps below show how to set up failover for SR-IOV interfaces in Kubernetes. This configuration is only applicable to SRIOV VFs using the kernel driver; userspace driver VFs - such as those used in DPDK workloads - cannot be bonded with the Bond CNI.

Configuration is based on the Multus CRD Network Attachment Definition. Please follow the configuration details in the linked guide: Usage with Kubernetes CRD based Network Objects

For more information and advanced use, refer to the Network Custom Resource standard.

Bonded failover for SRIOV Workloads

Prerequisites:

The SRIOV Device Plugin will need to be configured to ensure the VFs in the pod come from different network cards. This is important because failover requires that the bonded interface keep connectivity even if one of the slave interfaces goes down. If both virtual functions are from the same physical card, any connection issue on that physical interface or card will be reflected in both VFs at the same time.
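
One way to confirm that two VFs come from different cards is to inspect the virtfn links the kernel exposes in sysfs (PF names are placeholders):

ls -l /sys/class/net/<PF_NAME_1>/device/virtfn*
ls -l /sys/class/net/<PF_NAME_2>/device/virtfn*

Each virtfn link points at the PCI address of a VF belonging to that PF.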

An example SRIOV config - which selects VFs on the basis of physical interface names - is:

apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
        "resourceList": [{
                "resourceName": "intel_sriov_PF_1",
                "selectors": {
                    "vendors": ["8086"],
                    "devices": ["154c", "10ed"],
                    "drivers": ["i40evf", "ixgbevf"],
                    "pfNames": ["<PF_NAME_1>"]
                }
            },
            {
                "resourceName": "intel_sriov_PF_2",
                "selectors": {
                    "vendors": ["8086"],
                    "devices": ["154c", "10ed"],
                    "drivers": ["i40evf", "ixgbevf"],
                    "pfNames": ["<PF_NAME_2>"]
                }
            }
        ]
    }

In the above, specific PF names will have to be entered - based on the PFs available in the cluster - in order to make the selectors pick up the correct VFs; the other selectors in the two entries are identical. Note that the SRIOV device plugin only picks up new configuration at startup, so if the daemonset was already running, its pods will have to be killed and redeployed before the new resources are advertised.
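
A sketch of that restart and a check of the advertised resources - assuming the daemonset pods carry the upstream app=sriovdp label and that jq is available, both of which may differ in your deployment:

kubectl -n kube-system delete pods -l app=sriovdp
kubectl get node <NODE_NAME> -o json | jq '.status.allocatable'

After the plugin restarts, intel.com/intel_sriov_PF_1 and intel.com/intel_sriov_PF_2 should appear among the allocatable resources.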

Steps for deployment

  1. Deploy a Network Attachment Definition for SRIOV:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net1
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/intel_sriov_PF_1
spec:
  config: '{
  "type": "sriov",
  "name": "sriov-network",
  "spoofchk":"off"
}'

We will create a second - equivalent except for its name - SRIOV network attachment definition. This allows us to keep the definitions separate for our two Physical Function pools as defined above.

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net2
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/intel_sriov_PF_2
spec:
  config: '{
  "type": "sriov",
  "name": "sriov-network",
  "spoofchk":"off"
}'

  2. Deploy a Network Attachment Definition for the Bond CNI:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: bond-net1
spec:
  config: '{
  "type": "bond",
  "cniVersion": "0.3.1",
  "name": "bond-net1",
  "mode": "active-backup",
  "failOverMac": 1,
  "linksInContainer": true,
  "miimon": "100",
  "mtu": 1500,
  "links": [
     {"name": "net1"},
     {"name": "net2"}
  ],
  "ipam": {
    "type": "host-local",
    "subnet": "10.56.217.0/24",
    "routes": [{
      "dst": "0.0.0.0/0"
    }],
    "gateway": "10.56.217.1"
  }
}'

Note the "linksInContainer": true flag above. This tells the Bond CNI that the interfaces it is looking for are to be found inside the container. By default it looks for these interfaces on the host, which does not work for integration with SRIOV and Multus.
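
Assuming the three definitions above were saved under these hypothetical file names, they can be applied with:

kubectl apply -f sriov-net1.yaml
kubectl apply -f sriov-net2.yaml
kubectl apply -f bond-net1.yaml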

  3. Deploy a pod which requests two SRIOV networks, one from each PF, and one bonded network.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {"name": "sriov-net1",
       "interface": "net1"
      },
      {"name": "sriov-net2",
       "interface": "net2"
      },
      {"name": "bond-net1",
       "interface": "bond0"
      }]'
spec:
  restartPolicy: Never
  containers:
  - name: bond-test
    image: alpine:latest
    command:
      - /bin/sh
      - "-c"
      - "sleep 60m"
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        intel.com/intel_sriov_PF_1: '1'
        intel.com/intel_sriov_PF_2: '1'
      limits:
        intel.com/intel_sriov_PF_1: '1'
        intel.com/intel_sriov_PF_2: '1'

The order in the request annotation k8s.v1.cni.cncf.io/networks: sriov-net1, sriov-net2, bond-net1 is important, as it is the order in which the networks will be added. In the above spec we add one SRIOV network, then another identically configured SRIOV network from our second SRIOV VF pool. Multus will give these networks the names net1 and net2 respectively.

Next the bond-net1 network is created, using interfaces net1 and net2. If the bond is created before the SRIOV networks, the CNI will not be able to find the interfaces in the container.

The name of each interface can be set manually in the annotation according to the CRD Spec. Changing the names applied in the annotation configuration requires matching changes to be made in the bond network attachment definition.
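
For example, if the annotation requested the hypothetical interface names sriov1 and sriov2 instead of net1 and net2, the links list in bond-net1 would have to change to match:

  "links": [
     {"name": "sriov1"},
     {"name": "sriov2"}
  ]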

After deploying the above pod spec on Kubernetes, running the following command:

kubectl exec -it test-pod -- ip a

will result in output like:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if150: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP 
    link/ether 62:b1:b5:c8:fb:7a brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.122/24 brd 10.244.1.255 scope global eth0
       valid_lft forever preferred_lft forever
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 9e:23:69:42:fb:8a brd ff:ff:ff:ff:ff:ff
    inet 10.56.217.66/24 scope global bond0
       valid_lft forever preferred_lft forever
43: net1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 9e:23:69:42:fb:8a brd ff:ff:ff:ff:ff:ff
44: net2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 9e:23:69:42:fb:8a brd ff:ff:ff:ff:ff:ff

We have three new interfaces added to our pod: net1 and net2 are SRIOV interfaces, while bond0 is the bond over the two of them. net1 and net2 don't require IP addresses, and this can be changed in their network attachment definitions.
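
The bonding driver's view of the bond - active slave, link monitoring settings and per-slave status - can also be inspected from inside the pod via the standard bonding proc interface:

kubectl exec -it test-pod -- cat /proc/net/bonding/bond0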