Configure BGP peering
Big picture
Configure BGP (Border Gateway Protocol) between Calico nodes, or peer with your network infrastructure, to distribute routing information.
Value
Calico nodes can exchange routing information over BGP to enable reachability for Calico networked workloads (Kubernetes pods or OpenStack VMs). In an on-premises deployment this allows you to make your workloads first-class citizens across the rest of your network. In public cloud deployments, it provides an efficient way of distributing routing information within your cluster, and is often used in conjunction with IPIP overlay or cross-subnet modes.
Concepts
BGP
BGP is a standard protocol for exchanging routing information between routers in a network. Each router running BGP has one or more BGP peers: other routers with which it communicates over BGP. You can think of Calico networking as providing a virtual router on each of your nodes. You can configure Calico nodes to peer with each other, with route reflectors, or with top-of-rack (ToR) routers.
Common BGP topologies
There are many ways to configure a BGP network depending on your environment. Here are some common ways it is done with Calico.
Full-mesh
When BGP is enabled, Calico's default behavior is to create a full mesh of internal BGP (iBGP) connections in which every node peers with every other node. This allows Calico to operate over any L2 network, whether public cloud or private cloud, or, if IPIP is configured, to operate as an overlay over any network that does not block IPIP traffic. Calico does not use BGP for VXLAN overlays.
Most public clouds support IPIP. The notable exception is Azure, which blocks IPIP traffic. So if you want to run Calico as an overlay network in Azure, you must configure Calico to use VXLAN.
Full-mesh works great for small and medium-size deployments of, say, 100 nodes or fewer, but at significantly larger scales full-mesh becomes less efficient, and we recommend using route reflectors.
Route reflectors
To build large clusters of internal BGP (iBGP), BGP route reflectors can be used to reduce the number of BGP peerings used on each node. In this model, some nodes act as route reflectors and are configured to establish a full mesh amongst themselves. Other nodes are then configured to peer with a subset of those route reflectors (typically 2 for redundancy), reducing the total number of BGP peering connections compared to full-mesh.
Top of Rack (ToR)
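To make the scaling argument concrete, here is a rough count of BGP sessions. The symbols are introduced here for illustration only and are not Calico settings: n is the number of nodes, r the number of route reflectors, and k the number of route reflectors each remaining node peers with.

```latex
\begin{align*}
\text{full mesh:} \quad & \frac{n(n-1)}{2} \text{ sessions, } n-1 \text{ per node} \\
\text{route reflectors:} \quad & \frac{r(r-1)}{2} + k\,(n-r) \text{ sessions, } k \text{ per non-reflector node} \\
n = 200,\ r = 3,\ k = 2: \quad & \frac{200 \cdot 199}{2} = 19900 \quad \text{versus} \quad 3 + 2 \cdot 197 = 397
\end{align*}
```

The full-mesh count grows quadratically with the number of nodes, while the route reflector count grows roughly linearly.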
In on-premises deployments, you can configure Calico to peer directly with your physical network infrastructure. Typically, this involves disabling Calico's default full-mesh behavior and instead peering Calico with your L3 ToR routers. There are many ways to build an on-premises BGP network. How you configure your BGP is up to you: Calico works well with both iBGP and eBGP configurations, and you can effectively treat Calico like any other router in your network design.
Depending on your topology, you may also consider using BGP route reflectors within each rack. However, this is typically needed only if the number of nodes in each L2 domain is large (> 100).
For a deeper look at common on-premises deployment models, see Calico over IP Fabrics.
Before you begin...
calicoctl must be installed and configured.
How to
Significantly changing Calico's BGP topology, such as changing from full-mesh to peering with ToRs, may result in temporary loss of pod network connectivity during the reconfiguration process. It is recommended to only make such changes during a maintenance window.
- Configure a global BGP peer
- Configure a per-node BGP peer
- Configure a node to act as a route reflector
- Disable the default BGP node-to-node mesh
- Change from node-to-node mesh to route reflectors without any traffic disruption
- View BGP peering status for a node
- Change the default global AS number
- Change AS number for a particular node
- Configure a BGP filter
- Configure a BGP peer with a BGP filter
Configure a global BGP peer
Global BGP peers apply to all nodes in your cluster. This is useful if your network topology includes BGP speakers that will be peered with every Calico node in your deployment.
The following example creates a global BGP peer that configures every Calico node to peer with 192.20.30.40 in AS 64567.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: my-global-peer
spec:
  peerIP: 192.20.30.40
  asNumber: 64567
Configure a per-node BGP peer
Per-node BGP peers apply to one or more nodes in the cluster. You can choose which nodes by specifying the node's name exactly, or by using a label selector.
The following example creates a BGPPeer that configures every Calico node with the label rack: rack-1 to peer with 192.20.30.40 in AS 64567.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: rack1-tor
spec:
  peerIP: 192.20.30.40
  asNumber: 64567
  nodeSelector: rack == 'rack-1'
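The other option mentioned above, selecting a single node by its exact name, might look like the following sketch. It assumes a Calico node named node-1, and the resource name node-1-tor is made up for illustration; the node field takes the place of nodeSelector.

```yaml
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: node-1-tor        # hypothetical resource name
spec:
  node: node-1            # exact Calico node name (assumed for this example)
  peerIP: 192.20.30.40
  asNumber: 64567
```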
Configure a node to act as a route reflector
Calico nodes can be configured to act as route reflectors. To do this, each node that you want to act as a route reflector must have a cluster ID - typically an unused IPv4 address.
To configure a node to be a route reflector with cluster ID 244.0.0.1, run the following command.
Kubernetes API datastore:
kubectl annotate node my-node projectcalico.org/RouteReflectorClusterID=244.0.0.1
etcd datastore:
calicoctl patch node my-node -p '{"spec": {"bgp": {"routeReflectorClusterID": "244.0.0.1"}}}'
Typically, you will want to label this node to indicate that it is a route reflector, allowing it to be easily selected by a BGPPeer resource. You can do this with kubectl. For example:
kubectl label node my-node route-reflector=true
Now it is easy to configure route reflector nodes to peer with each other and other non-route-reflector nodes using label selectors. For example:
kind: BGPPeer
apiVersion: projectcalico.org/v3
metadata:
  name: peer-with-route-reflectors
spec:
  nodeSelector: all()
  peerSelector: route-reflector == 'true'
Adding routeReflectorClusterID to a node spec will remove it from the node-to-node mesh immediately, tearing down the existing BGP sessions. Adding the BGP peering will bring up new BGP sessions. This will cause a short (about 2 seconds) disruption to dataplane traffic of workloads running on the nodes where this happens. To avoid this, make sure no workloads are running on the nodes, either by provisioning new nodes or by running kubectl drain on the node (which may itself cause a disruption as workloads are drained).
Disable the default BGP node-to-node mesh
The default node-to-node BGP mesh may be turned off to enable other BGP topologies. To do this, modify the default BGP configuration resource.
Run the following command to disable the BGP full-mesh:
calicoctl patch bgpconfiguration default -p '{"spec": {"nodeToNodeMeshEnabled": false}}'
If the default BGP configuration resource does not exist, you need to create it first. See BGP configuration for more information.
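As a minimal sketch of what creating that resource could look like, the manifest below spells out the defaults described on this page (mesh enabled, AS 64512). Treat the values as assumptions to adjust for your cluster, and check the BGP configuration reference for the full list of fields.

```yaml
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default                 # the cluster-wide resource must be named "default"
spec:
  nodeToNodeMeshEnabled: true   # set to false to disable the full mesh
  asNumber: 64512               # global default AS number
```

You could save this to a file and apply it with calicoctl create -f <file>, after which the patch command above has a resource to modify.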
Disabling the node-to-node mesh will break pod networking until/unless you configure replacement BGP peerings using BGPPeer resources. You may configure the BGPPeer resources before disabling the node-to-node mesh to avoid pod networking breakage.
Change from node-to-node mesh to route reflectors without any traffic disruption
Switching from node-to-node BGP mesh to BGP route reflectors involves tearing down BGP sessions and bringing up new ones. This causes a short dataplane network disruption (of about 2 seconds) for workloads running on the nodes in the cluster. To avoid this, you may provision route reflector nodes and bring their BGP sessions up before tearing down the node-to-node mesh sessions.
Follow these steps to do so:
1. Provision new nodes to be route reflectors. The nodes should not be schedulable and they should have routeReflectorClusterID in their spec. These won't be part of the existing node-to-node BGP mesh, and will be the route reflectors when the mesh is disabled. These nodes should also have a label like route-reflector to select them for the BGP peerings. Alternatively, you can drain workloads from existing nodes in your cluster by running kubectl drain <NODE> to configure them to be route reflectors, but this will cause a disruption on the workloads on those nodes as they are drained.
2. Also set up a BGPPeer spec to configure route reflector nodes to peer with each other and other non-route-reflector nodes using label selectors.
3. Wait for these peerings to be established. This can be verified by running sudo calicoctl node status on the nodes. Alternatively, you can create a CalicoNodeStatus resource to get BGP session status for the node.
4. Disable the BGP node-to-node mesh for the cluster, as described in Disable the default BGP node-to-node mesh.
5. If you did drain workloads from the nodes or created them as unschedulable, mark the nodes as schedulable again (e.g. by running kubectl uncordon <NODE>).
View BGP peering status for a node
Create a CalicoNodeStatus resource to monitor BGP session status for the node.
Alternatively, you can run the calicoctl node status command on a given node to learn more about its BGP status.
This command communicates with the local Calico agent, so you must execute it on the node whose status you are attempting to view.
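A CalicoNodeStatus resource for this purpose could look like the sketch below. The resource name my-node-status and the node name my-node are placeholders chosen for illustration, and the 10-second update period is an arbitrary choice.

```yaml
apiVersion: projectcalico.org/v3
kind: CalicoNodeStatus
metadata:
  name: my-node-status      # hypothetical resource name
spec:
  node: my-node             # node whose status should be reported
  classes:
    - BGP                   # collect BGP session status only
  updatePeriodSeconds: 10   # how often the status is refreshed
```

Once created, the reported status should appear in the resource's status field, which you can inspect with, for example, kubectl get caliconodestatus my-node-status -o yaml. Unlike calicoctl node status, this does not require running a command on the node itself.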
Change the default global AS number
By default, all Calico nodes use the 64512 autonomous system, unless a per-node AS has been specified for the node. You can change the global default for all nodes by modifying the default BGPConfiguration resource. The following example command sets the global default AS number to 64513.
calicoctl patch bgpconfiguration default -p '{"spec": {"asNumber": "64513"}}'
If the default BGP configuration resource does not exist, you need to create it first. See BGP configuration for more information.
Change AS number for a particular node
You can configure an AS for a particular node by modifying the node object using calicoctl. For example, the following command changes the node named node-1 to belong to AS 64514.
calicoctl patch node node-1 -p '{"spec": {"bgp": {"asNumber": "64514"}}}'
Configure a BGP filter
BGP filters control which routes are imported and exported between BGP peers.
The BGP filter rules (importVX, exportVX) are applied sequentially, taking the action of the first matching rule. When no rules are matched, the default action is Accept.
In order for a BGPFilter to be used in a BGP peering, its name must be added to filters of the corresponding BGPPeer resource.
The following example creates a BGPFilter:
apiVersion: projectcalico.org/v3
kind: BGPFilter
metadata:
  name: my-filter
spec:
  exportV4:
    - action: Accept
      matchOperator: In
      cidr: 77.0.0.0/16
    - action: Reject
      source: RemotePeers
    - action: Reject
      interface: '*.calico'
  importV4:
    - action: Reject
      matchOperator: NotIn
      cidr: 44.0.0.0/16
  exportV6:
    - action: Reject
      source: RemotePeers
    - action: Reject
      interface: '*.calico'
  importV6:
    - action: Accept
      matchOperator: Equal
      cidr: 5000::0/64
    - action: Reject
Configure a BGP peer with a BGP filter
BGP peers can use BGP filters to control which routes are imported or exported between them.
The following example creates two BGPFilters and associates them with a BGPPeer. BGPFilters are applied in the order listed on a BGPPeer.
kind: BGPFilter
apiVersion: projectcalico.org/v3
metadata:
  name: first-bgp-filter
spec:
  exportV4:
    - action: Accept
      matchOperator: In
      cidr: 77.0.0.0/16
      source: RemotePeers
  importV4:
    - action: Reject
      matchOperator: NotIn
      cidr: 44.0.0.0/16
  exportV6:
    - action: Reject
      interface: '*.calico'
  importV6:
    - action: Accept
      matchOperator: Equal
      cidr: 5000::0/64
---
kind: BGPFilter
apiVersion: projectcalico.org/v3
metadata:
  name: second-bgp-filter
spec:
  exportV4:
    - action: Accept
      matchOperator: In
      cidr: 77.0.0.0/16
      interface: '*.calico'
  importV4:
    - action: Reject
      matchOperator: NotIn
      cidr: 44.0.0.0/16
  exportV6:
    - action: Reject
      source: RemotePeers
  importV6:
    - action: Reject
---
kind: BGPPeer
apiVersion: projectcalico.org/v3
metadata:
  name: peer-with-filter
spec:
  peerSelector: has(filter-bgp)
  filters:
    - first-bgp-filter
    - second-bgp-filter