Configure federated endpoint identity and multi-cluster networking
Big picture
Configure a cluster to federate endpoint identities and establish cross-cluster connectivity.
Value
Secure cross-cluster traffic with identity-aware policy, and leverage Calico Enterprise to establish the required cross-cluster networking.
Concepts
Local and remote clusters
Each cluster in the cluster mesh can act as both a local and remote cluster.
- Local clusters are configured to retrieve endpoint and routing data from remote clusters (via RemoteClusterConfiguration)
- Remote clusters authorize local clusters to retrieve endpoint and routing data
Remote endpoint identity and policy
Typically, policy can only reference the endpoint identity (e.g. pod labels) of local endpoints. Federated endpoint identity enables local policy rules to reference remote endpoint identities.
RemoteClusterConfiguration
RemoteClusterConfiguration is the resource that configures a local cluster to sync resources from a remote cluster. It primarily describes how a local cluster establishes that connection to the remote cluster through which resources are synced.
The resources synced through this connection enable the local cluster to reference remote endpoint identity and establish cross-cluster overlay routes.
RemoteClusterConfiguration creates this connection in one direction. If you want identity-aware policy on both sides (i.e. both clusters) of a connection, or you want Calico Enterprise to establish cross-cluster overlay networking, you need to create a RemoteClusterConfiguration for both directions.
kubeconfig files
Each cluster in the cluster mesh should have a dedicated kubeconfig file used by other clusters in the mesh to connect and authenticate.
Before you begin
Required
How to
- Create kubeconfig files
- Create RemoteClusterConfiguration
- Validate federation and multi-cluster networking
- Create remote-identity-aware network policy
- Troubleshoot
- Configure IP pool resources
Ensure pod IP routability
Federation of workload endpoint identities requires Pod IP routability between clusters. If your clusters are using a supported overlay networking mode, Calico Enterprise can automatically meet this requirement when clusters are connected.
Calico Enterprise multi-cluster networking
Calico Enterprise can automatically extend the overlay networking in your clusters to establish pod IP routes across clusters and thus meet the requirement for Pod IP routability. Only VXLAN overlay is supported at this time.
Ensure the following requirements are met if utilizing Calico Enterprise multi-cluster networking to achieve pod IP routability:
- All nodes in the cluster mesh must be able to establish connections to each other via their private IP, and must have unique node names.
- VXLAN must be enabled on participating IP pools in all clusters, and these IP pool CIDRs must not overlap.
- routeSource and vxlan* FelixConfiguration values must be aligned across clusters, and traffic on the vxlanPort must be allowed between nodes in the cluster mesh.
- RemoteClusterConfigurations must be established in both directions for cluster pairs in the cluster mesh.
- CNI must be Calico.
With these requirements met, multi-cluster networking will be automatically established when RemoteClusterConfigurations are created.
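The non-overlap requirement for IP pool CIDRs can be checked before connecting clusters. The following is a minimal sketch, assuming IPv4 pools only; the function names (ip_to_int, cidr_range, cidrs_overlap) are hypothetical helpers and not part of Calico Enterprise.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Calico Enterprise); IPv4 CIDRs only.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<<"$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print "start end" (inclusive integer range) for a CIDR such as 10.0.0.0/16.
cidr_range() {
  local ip=${1%/*} prefix=${1#*/}
  local base mask
  base=$(ip_to_int "$ip")
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  echo "$(( base & mask )) $(( (base & mask) | (~mask & 0xFFFFFFFF) ))"
}

# Exit status 0 if the two CIDRs share at least one address, 1 otherwise.
cidrs_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<<"$(cidr_range "$1")"
  read -r s2 e2 <<<"$(cidr_range "$2")"
  (( s1 <= e2 && s2 <= e1 ))
}

# Example usage with pool CIDRs collected from two clusters:
#   if cidrs_overlap 10.48.0.0/16 10.49.0.0/16; then echo "overlap"; fi
```

The CIDRs to compare would come from the IP pools of each cluster in the mesh; every pool that participates in multi-cluster networking should be compared against every participating pool in the other clusters.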
Other networking configurations
Alternatively, you can meet the requirement for Pod IP routability by configuring Calico Enterprise with BGP or with VPC routing to establish unencapsulated Pod IP routes in your environment.
If you have already configured federated endpoint identity without multi-cluster networking, and you wish to switch to using multi-cluster networking, you should note that the steps below are intended for establishing new RemoteClusterConfigurations. You may wish to consult the switch to multi-cluster networking section.
Create kubeconfig files
Create a kubeconfig file, for each cluster, that will be used by other clusters to connect and authenticate themselves.
For each cluster in the cluster mesh, utilizing an existing kubeconfig with administrative privileges, follow these steps:
-
Create the ServiceAccount used by remote clusters for authentication:
kubectl apply -f https://downloads.tigera.io/ee/v3.19.4/manifests/federation-remote-sa.yaml
-
If RBAC is enabled, create the ClusterRole and ClusterRoleBinding used by remote clusters for authorization:
kubectl apply -f https://downloads.tigera.io/ee/v3.19.4/manifests/federation-rem-rbac-kdd.yaml
-
Create the kubeconfig file:
Open a file in your favorite editor. Consider establishing a naming scheme unique to each cluster, e.g. kubeconfig-app-a.
Paste the following into the file. The templated values will be replaced with data retrieved in the following steps.
apiVersion: v1
kind: Config
users:
- name: tigera-federation-remote-cluster
  user:
    token: <YOUR-SERVICE-ACCOUNT-TOKEN>
clusters:
- name: tigera-federation-remote-cluster
  cluster:
    certificate-authority-data: <YOUR-CERTIFICATE-AUTHORITY-DATA>
    server: <YOUR-SERVER-ADDRESS>
contexts:
- name: tigera-federation-remote-cluster-ctx
  context:
    cluster: tigera-federation-remote-cluster
    user: tigera-federation-remote-cluster
current-context: tigera-federation-remote-cluster-ctx
-
Retrieve the ServiceAccount token:
If using Kubernetes ≥ 1.24
- Create the ServiceAccount token:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: tigera-federation-remote-cluster
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: "tigera-federation-remote-cluster"
EOF
- Retrieve the ServiceAccount token value and replace <YOUR-SERVICE-ACCOUNT-TOKEN> with its value:
kubectl describe secret tigera-federation-remote-cluster -n kube-system
If using Kubernetes < 1.24
- Retrieve the ServiceAccount token value and replace <YOUR-SERVICE-ACCOUNT-TOKEN> with its value:
kubectl describe secret -n kube-system $(kubectl get serviceaccounts tigera-federation-remote-cluster -n kube-system -o jsonpath='{.secrets[0].name}')
-
Retrieve and save the certificate authority and server data:
Run the following command:
kubectl config view --flatten --minify
Replace <YOUR-CERTIFICATE-AUTHORITY-DATA> and <YOUR-SERVER-ADDRESS> with the certificate-authority-data and server values, respectively.
-
Verify that the kubeconfig file works:
Issue a command like the following to validate that the kubeconfig file can be used to connect to the current cluster and access resources:
kubectl --kubeconfig=kubeconfig-app-a get nodes
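The manual template editing above can also be scripted. The following is a minimal sketch, assuming the token, certificate authority data, and server address have already been retrieved as described in the steps above; render_kubeconfig is a hypothetical helper name, and the commented kubectl invocations are one possible way to collect the three values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a kubeconfig built from the template in this section.
# Arguments: <service-account-token> <certificate-authority-data> <server-address>
render_kubeconfig() {
  local token=$1 ca_data=$2 server=$3
  cat <<EOF
apiVersion: v1
kind: Config
users:
- name: tigera-federation-remote-cluster
  user:
    token: ${token}
clusters:
- name: tigera-federation-remote-cluster
  cluster:
    certificate-authority-data: ${ca_data}
    server: ${server}
contexts:
- name: tigera-federation-remote-cluster-ctx
  context:
    cluster: tigera-federation-remote-cluster
    user: tigera-federation-remote-cluster
current-context: tigera-federation-remote-cluster-ctx
EOF
}

# Example usage (assumes the ServiceAccount token Secret already exists):
#   TOKEN=$(kubectl get secret tigera-federation-remote-cluster -n kube-system \
#     -o jsonpath='{.data.token}' | base64 -d)
#   CA=$(kubectl config view --flatten --minify \
#     -o jsonpath='{.clusters[0].cluster.certificate-authority-data}')
#   SERVER=$(kubectl config view --flatten --minify \
#     -o jsonpath='{.clusters[0].cluster.server}')
#   render_kubeconfig "$TOKEN" "$CA" "$SERVER" > kubeconfig-app-a
```

The resulting file can then be verified with the kubectl get nodes command shown above.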
Create RemoteClusterConfigurations
We'll now create the RemoteClusterConfigurations that establish synchronization between clusters. This synchronization enables remote-identity-aware policy and federated services, and can also establish multi-cluster networking.
- Overlay Routing
- Unencapsulated Routing
In this setup, the cluster mesh will be configured to meet the pod IP routability requirement by establishing routes between clusters using Calico Enterprise multi-cluster networking.
For each pair of clusters in the cluster mesh (e.g. cluster A and cluster B):
-
In cluster A, create a secret that contains the kubeconfig for cluster B:
Determine the namespace (<secret-namespace>) for the secret to replace in all steps. The simplest method to create a secret for a remote cluster is to use the kubectl command because it correctly encodes the data and formats the file.
kubectl create secret generic remote-cluster-secret-name -n <secret-namespace> \
    --from-literal=datastoreType=kubernetes \
    --from-file=kubeconfig=<kubeconfig file>
-
If RBAC is enabled in cluster A, create a Role and RoleBinding for Calico Enterprise to use to access the secret that contains the kubeconfig for cluster B:
kubectl create -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: remote-cluster-secret-access
  namespace: <secret-namespace>
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["watch", "list", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: remote-cluster-secret-access
  namespace: <secret-namespace>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: remote-cluster-secret-access
subjects:
- kind: ServiceAccount
  name: calico-typha
  namespace: calico-system
EOF
-
Create the RemoteClusterConfiguration in cluster A:
Within the RemoteClusterConfiguration, we specify the secret used to access cluster B, and the overlay routing mode which toggles the establishment of cross-cluster overlay routes.
kubectl create -f - <<EOF
apiVersion: projectcalico.org/v3
kind: RemoteClusterConfiguration
metadata:
  name: cluster-b
spec:
  clusterAccessSecret:
    name: remote-cluster-secret-name
    namespace: <secret-namespace>
    kind: Secret
  syncOptions:
    overlayRoutingMode: Enabled
EOF
-
Validate that the remote cluster connection can be established.
-
Repeat the above steps, switching cluster A and cluster B.
After completing the above steps for all cluster pairs in the cluster mesh, your clusters should now be ready to utilize remote-identity-aware policy and federated services, along with multi-cluster networking if requirements were met.
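Because the same RemoteClusterConfiguration has to be created once per direction, it can help to generate the manifest from parameters. The following is a minimal sketch; render_rcc is a hypothetical helper name, and the kubectl context names in the usage comments are assumptions about how your kubeconfig is organized.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a RemoteClusterConfiguration manifest.
# Arguments: <remote-cluster-name> <secret-namespace> [Enabled|Disabled]
render_rcc() {
  local name=$1 secret_ns=$2 mode=${3:-Enabled}
  cat <<EOF
apiVersion: projectcalico.org/v3
kind: RemoteClusterConfiguration
metadata:
  name: ${name}
spec:
  clusterAccessSecret:
    name: remote-cluster-secret-name
    namespace: ${secret_ns}
    kind: Secret
  syncOptions:
    overlayRoutingMode: ${mode}
EOF
}

# Example usage, one command per direction (context names are assumptions):
#   render_rcc cluster-b <secret-namespace> Enabled | kubectl --context cluster-a create -f -
#   render_rcc cluster-a <secret-namespace> Enabled | kubectl --context cluster-b create -f -
```

The secret and RBAC steps above still need to be completed in each cluster before the generated manifest is applied.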
In this setup, the cluster mesh will rely on the underlying network to meet the pod IP routability requirement.
For each pair of clusters in the cluster mesh (e.g. cluster A and cluster B):
-
In cluster A, create a secret that contains the kubeconfig for cluster B:
Determine the namespace (<secret-namespace>) for the secret to replace in all steps. The simplest method to create a secret for a remote cluster is to use the kubectl command because it correctly encodes the data and formats the file.
kubectl create secret generic remote-cluster-secret-name -n <secret-namespace> \
    --from-literal=datastoreType=kubernetes \
    --from-file=kubeconfig=<kubeconfig file>
-
If RBAC is enabled in cluster A, create a Role and RoleBinding for Calico Enterprise to use to access the secret that contains the kubeconfig for cluster B:
kubectl create -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: remote-cluster-secret-access
  namespace: <secret-namespace>
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["watch", "list", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: remote-cluster-secret-access
  namespace: <secret-namespace>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: remote-cluster-secret-access
subjects:
- kind: ServiceAccount
  name: calico-typha
  namespace: calico-system
EOF
-
Create the RemoteClusterConfiguration in cluster A:
Within the RemoteClusterConfiguration, we specify the secret used to access cluster B, and the overlay routing mode which toggles the establishment of cross-cluster overlay routes.
kubectl create -f - <<EOF
apiVersion: projectcalico.org/v3
kind: RemoteClusterConfiguration
metadata:
  name: cluster-b
spec:
  clusterAccessSecret:
    name: remote-cluster-secret-name
    namespace: <secret-namespace>
    kind: Secret
  syncOptions:
    overlayRoutingMode: Disabled
EOF
-
If you have no IP pools in cluster A with NAT-outgoing enabled, skip this step.
Otherwise, if you have IP pools in cluster A with NAT-outgoing enabled, and workloads in that pool will egress to workloads in cluster B, you need to instruct Calico Enterprise to not perform NAT on traffic destined for IP pools in cluster B.
You can achieve this by creating a disabled IP pool in cluster A for each CIDR in cluster B. This IP pool should have NAT-outgoing disabled. For example:
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: clusterB-main-pool
spec:
  cidr: <Cluster B CIDR>
  disabled: true
-
Validate that the remote cluster connection can be established.
-
Repeat the above steps, switching cluster A and cluster B.
After completing the above steps for all cluster pairs in the cluster mesh, your clusters should now be ready to utilize remote-identity-aware policy and federated services.
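When cluster B has several pool CIDRs, the disabled IP pools described in the NAT-outgoing step can be generated rather than written by hand. The following is a minimal sketch; render_disabled_pool is a hypothetical helper name, the pool names in the usage comments are illustrative, and natOutgoing is set explicitly to false to make the intent visible.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a disabled IPPool manifest for one remote CIDR.
# Arguments: <pool-name> <remote-cidr>
render_disabled_pool() {
  local name=$1 cidr=$2
  cat <<EOF
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: ${name}
spec:
  cidr: ${cidr}
  disabled: true
  natOutgoing: false
EOF
}

# Example usage in cluster A, one pool per cluster B CIDR (values are illustrative):
#   render_disabled_pool cluster-b-pool-1 <Cluster B CIDR> | kubectl create -f -
```

Because the pool is disabled, Calico Enterprise does not allocate addresses from it in cluster A; it only marks the CIDR as a destination that should not be NATed.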
This tutorial sets up RemoteClusterConfigurations in both directions. This is required for Calico Enterprise to manage multi-cluster networking, and also ensures you can write identity-aware policy on both sides of a cross-cluster connection. Unidirectional connections can be made at your own discretion.
Switch to multi-cluster networking
The steps above assume that you are configuring both federated endpoint identity and multi-cluster networking for the first time. If you already have federated endpoint identity, and want to use multi-cluster networking, follow these steps:
- Validate that all requirements for multi-cluster networking have been met.
- Update the ClusterRole in each cluster in the cluster mesh using the RBAC manifest found in Create kubeconfig files.
- In all RemoteClusterConfigurations, set spec.syncOptions.overlayRoutingMode to Enabled.
- Verify that all RemoteClusterConfigurations are bidirectional (in both directions for each cluster pair) using these instructions.
- If you had previously created disabled IP pools to prevent NAT outgoing from applying to remote cluster destinations, those disabled IP pools are no longer needed when using multi-cluster networking and must be deleted.
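The overlay routing mode change in the steps above can be applied to every RemoteClusterConfiguration in a cluster with a short loop. The following is a minimal sketch, assuming the resources are reachable through kubectl under the plural name remoteclusterconfigurations and accept a merge patch; enable_overlay_routing is a hypothetical helper name. Review the list of resources before patching in a production cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set overlayRoutingMode to Enabled on every
# RemoteClusterConfiguration in the cluster that kubectl currently targets.
enable_overlay_routing() {
  local rcc
  for rcc in $(kubectl get remoteclusterconfigurations -o name); do
    kubectl patch "$rcc" --type merge \
      -p '{"spec":{"syncOptions":{"overlayRoutingMode":"Enabled"}}}'
  done
}

# Example usage (run once per cluster in the mesh):
#   enable_overlay_routing
```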
Validate federated endpoint identity & multi-cluster networking
Validate RemoteClusterConfiguration and federated endpoint identity
Check remote cluster connection
You can validate in a local cluster that Typha has synced to the remote cluster through the Prometheus metrics for Typha.
Alternatively, you can check the Typha logs for remote cluster connection status. Run the following command:
kubectl logs deployment/calico-typha -n calico-system | grep "Sending in-sync update"
You should see an entry for each RemoteClusterConfiguration in the local cluster.
If either output contains unexpected results, proceed to the troubleshooting section.
Validate multi-cluster networking
If all requirements were met for Calico Enterprise to establish multi-cluster networking, you can test the functionality by establishing a connection from a pod in a local cluster to the IP of a pod in a remote cluster. Ensure that there is no policy in either cluster that may block this connection.
If the connection fails, proceed to the troubleshooting section.
Create remote-identity-aware network policy
With federated endpoint identity and routing between clusters established, you can now use labels to reference endpoints on a remote cluster in local policy rules, rather than referencing them by IP address.
The main policy selector still refers only to local endpoints, and it determines which local endpoints the policy applies to. However, rule selectors can now refer to both local and remote endpoints.
In the following example, cluster A (an application cluster) has a network policy that governs outbound connections to cluster B (a database cluster).
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: default.app-to-db
  namespace: myapp
spec:
  # The main policy selector selects endpoints from the local cluster only.
  selector: app == 'backend-app'
  tier: default
  egress:
    - destination:
        # Rule selectors can select endpoints from local AND remote clusters.
        selector: app == 'postgres'
        ports: [5432]
      protocol: TCP
      action: Allow
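If cluster B also has a RemoteClusterConfiguration pointing at cluster A, the database side can enforce the matching ingress rule using the identity of the remote application pods. The following is a minimal sketch of such a counterpart policy; the policy name default.db-from-app, the helper name render_db_ingress_policy, and the namespace argument are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an ingress policy for cluster B (the database cluster)
# that admits traffic from 'backend-app' endpoints, local or remote.
# Argument: <namespace of the postgres pods in cluster B>
render_db_ingress_policy() {
  local namespace=$1
  cat <<EOF
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: default.db-from-app
  namespace: ${namespace}
spec:
  # Main selector: local (cluster B) endpoints only.
  selector: app == 'postgres'
  tier: default
  ingress:
    - action: Allow
      protocol: TCP
      source:
        # Rule selector: may match endpoints in cluster A.
        selector: app == 'backend-app'
      destination:
        ports: [5432]
EOF
}

# Example usage in cluster B (namespace is an assumption):
#   render_db_ingress_policy mydb | kubectl create -f -
```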
Troubleshoot
Troubleshoot RemoteClusterConfiguration and federated endpoint identity
Verify configuration
For each impacted remote cluster pair (between cluster A and cluster B):
-
Retrieve the kubeconfig from the secret stored in cluster A. Manually verify that it can be used to connect to cluster B.
kubectl get secret -n <secret-namespace> remote-cluster-secret-name -o=jsonpath="{.data.kubeconfig}" | base64 -d > verify_kubeconfig_b
kubectl --kubeconfig=verify_kubeconfig_b get nodes
This validates that the credentials used by Typha to connect to cluster B's API server are stored in the correct location and provide sufficient access.
The command above should yield a result like the following:
NAME                STATUS   ROLES    AGE   VERSION
clusterB-master     Ready    master   7d    v1.27.0
clusterB-worker-1   Ready    worker   7d    v1.27.0
clusterB-worker-2   Ready    worker   7d    v1.27.0
If you do not see the nodes of cluster B listed in response to the command above, verify that you created the kubeconfig for cluster B correctly, and that you stored it in cluster A correctly.
If you do see the nodes of cluster B listed in response to the command above, you can run this test (or a similar test) on a node in cluster A to verify that cluster A nodes can connect to the API server of cluster B.
-
Validate that the Typha service account in cluster A is authorized to retrieve the kubeconfig secret for cluster B.
kubectl auth can-i list secrets --namespace <secret-namespace> --as=system:serviceaccount:calico-system:calico-typha
This command should yield the following output:
yes
If the command does not return this output, verify that you correctly configured RBAC in cluster A.
-
Repeat the above steps, switching cluster A and cluster B.
Check logs
Validate that querying the Typha logs yields the expected result outlined in the validation section.
If the Typha logs do not yield the expected result, review the warning or error logs in typha or calico-node for insights.
calicoq
calicoq can be used to emulate the connection that Typha will make to remote clusters. Use the following command:
calicoq eval "all()"
If all remote clusters are accessible, calicoq returns something like the following. Note the remote cluster prefixes: there should be endpoints prefixed with the name of each RemoteClusterConfiguration in the local cluster.
Endpoints matching selector all():
Workload endpoint remote-cluster-1/host-1/k8s/kube-system.kube-dns-5fbcb4d67b-h6686/eth0
Workload endpoint remote-cluster-1/host-2/k8s/kube-system.cnx-manager-66c4dbc5b7-6d9xv/eth0
Workload endpoint host-a/k8s/kube-system.kube-dns-5fbcb4d67b-7wbhv/eth0
Workload endpoint host-b/k8s/kube-system.cnx-manager-66c4dbc5b7-6ghsm/eth0
If this command fails, the error messages returned by the command may provide insight into where issues are occurring.
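To compare the calicoq output against the RemoteClusterConfigurations you created, the remote cluster prefixes can be extracted from it. The following is a minimal sketch, assuming the output format matches the sample above, where remote endpoints have one more slash-separated component than local ones; remote_cluster_prefixes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read calicoq output on stdin and print the distinct
# remote cluster prefixes found on "Workload endpoint" lines.
# Assumes remote endpoints look like <cluster>/<host>/<orchestrator>/<workload>/<iface>
# and local endpoints look like <host>/<orchestrator>/<workload>/<iface>.
remote_cluster_prefixes() {
  awk '$1 == "Workload" && $2 == "endpoint" {
         n = split($3, parts, "/")
         if (n == 5) print parts[1]
       }' | sort -u
}

# Example usage:
#   calicoq eval "all()" | remote_cluster_prefixes
```

Each name printed should correspond to a RemoteClusterConfiguration in the local cluster; a missing name suggests that the corresponding remote cluster is not being synced.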
Troubleshoot multi-cluster networking
Basic validation
- Ensure that RemoteClusterConfiguration and federated endpoint identity are functioning correctly
- Verify that you have met the prerequisites for multi-cluster networking
- If you had previously set up RemoteClusterConfigurations without multi-cluster networking, and are upgrading to use the feature, review the switching considerations
- Verify that traffic between clusters is not being denied by network policy
Check overlayRoutingMode
Ensure that overlayRoutingMode is set to "Enabled" on all RemoteClusterConfigurations.
If overlay routing is successfully enabled, you can view the logs of a Typha instance using:
kubectl logs deployment/calico-typha -n calico-system
You should see an output for each connected remote cluster that looks like this:
18:49:35.394 [INFO][14] wrappedcallbacks.go 443: Creating syncer for RemoteClusterConfiguration(my-cluster)
18:49:35.394 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/ipam/v2/assignment/"
18:49:35.395 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/workloadendpoints"
18:49:35.396 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/hostendpoints"
18:49:35.396 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/profiles"
18:49:35.396 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/nodes"
18:49:35.397 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/ippools"
If you do not see each of the resource types above, overlay routing was not successfully enabled in your cluster. Verify that you followed the setup correctly for overlay routing, and that the cluster is using a version of Calico Enterprise that supports multi-cluster networking.
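Scanning the Typha logs by eye for all six resource types is error-prone, so the check can be scripted. The following is a minimal sketch, assuming the log lines use the ListRoot="..." format shown in the sample output above; missing_list_roots is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Typha logs on stdin and print every expected
# ListRoot path that does NOT appear. Empty output means all were found.
missing_list_roots() {
  local logs root
  logs=$(cat)
  for root in \
    "/calico/ipam/v2/assignment/" \
    "/calico/resources/v3/projectcalico.org/workloadendpoints" \
    "/calico/resources/v3/projectcalico.org/hostendpoints" \
    "/calico/resources/v3/projectcalico.org/profiles" \
    "/calico/resources/v3/projectcalico.org/nodes" \
    "/calico/resources/v3/projectcalico.org/ippools"; do
    # Fixed-string match including the closing quote, so prefixes do not match.
    grep -qF "ListRoot=\"${root}\"" <<<"$logs" || echo "$root"
  done
}

# Example usage:
#   kubectl logs deployment/calico-typha -n calico-system | missing_list_roots
```

Note that this sketch checks the log stream as a whole rather than per remote cluster, so with several RemoteClusterConfigurations it only confirms that each resource type was synced at least once.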
Check logs
Warning or error logs in typha or calico-node may provide insight into where issues are occurring.