Skip to main content
Version: 3.18 (latest)

Configure federated endpoint identity and multi-cluster networking

Big picture​

Configure a cluster to federate endpoint identities and establish cross-cluster connectivity.

Value​

Secure cross-cluster traffic with identity-aware policy, and leverage Calico Enterprise to establish the required cross-cluster networking.

Concepts​

Local and remote clusters​

Each cluster in the cluster mesh can act as both a local and remote cluster.

  • Local clusters are configured to retrieve endpoint and routing data from remote clusters (via RemoteClusterConfiguration)
  • Remote clusters authorize local clusters to retrieve endpoint and routing data

Remote endpoint identity and policy​

Typically, policy can only reference the endpoint identity (e.g. pod labels) of local endpoints. Federated endpoint identity enables local policy rules to reference remote endpoint identities.

RemoteClusterConfiguration​

RemoteClusterConfiguration is the resource that configures a local cluster to sync resources from a remote cluster. It primarily describes how a local cluster establishes that connection to the remote cluster through which resources are synced.

The resources synced through this connection enable the local cluster to reference remote endpoint identity and establish cross-cluster overlay routes.

RemoteClusterConfiguration creates this connection in one direction. If you want identity-aware policy on both sides (i.e. both clusters) of a connection, or you want Calico Enterprise to establish cross-cluster overlay networking, you need to create a RemoteClusterConfiguration for both directions.

kubeconfig files​

Each cluster in the cluster mesh should have a dedicated kubeconfig file used by other clusters in the mesh to connect and authenticate.

Before you begin​

Required

How to​

Ensure pod IP routability​

Federation of workload endpoint identities requires Pod IP routability between clusters. If your clusters are using a supported overlay networking mode, Calico Enterprise can automatically meet this requirement when clusters are connected.

Calico Enterprise multi-cluster networking​

Calico Enterprise can automatically extend the overlay networking in your clusters to establish pod IP routes across clusters and thus meet the requirement for Pod IP routability. Only VXLAN overlay is supported at this time.

Ensure the following requirements are met if utilizing Calico Enterprise multi-cluster networking to achieve pod IP routability:

  • All nodes in the cluster mesh must be able to establish connections to each other via their private IP, and must have unique node names.
  • VXLAN must be enabled on participating IP pools in all clusters, and these IP pool CIDRs must not overlap.
  • routeSource and vxlan* FelixConfiguration values must be aligned across clusters, and traffic on the vxlanPort must be allowed between nodes in the cluster mesh.
  • RemoteClusterConfigurations must be established in both directions for cluster pairs in the cluster mesh.
  • CNI must be Calico.

With these requirements met, multi-cluster networking will be automatically established when RemoteClusterConfigurations are created.

Other networking configurations​

Alternatively, you can meet the requirement for Pod IP routability by configuring Calico Enterprise with BGP or with VPC routing to establish unencapsulated Pod IP routes in your environment.

caution

If you have already configured federated endpoint identity without multi-cluster networking, and you wish to switch to using multi-cluster networking, you should note that the steps below are intended for establishing new RemoteClusterConfigurations. You may wish to consult the switch to multi-cluster networking section.

Create kubeconfig files​

Create a kubeconfig file, for each cluster, that will be used by other clusters to connect and authenticate themselves.

For each cluster in the cluster mesh, utilizing an existing kubeconfig with administrative privileges, follow these steps:

  1. Create the ServiceAccount used by remote clusters for authentication:

    kubectl apply -f https://downloads.tigera.io/ee/v3.18.2/manifests/federation-remote-sa.yaml
  2. If RBAC is enabled, create the ClusterRole and ClusterRoleBinding used by remote clusters for authorization:

    kubectl apply -f https://downloads.tigera.io/ee/v3.18.2/manifests/federation-rem-rbac-kdd.yaml
  3. Create the kubeconfig file:

    Open a file in your favorite editor. Consider establishing a naming scheme unique to each cluster, e.g. kubeconfig-app-a.

    Paste the following into the file - we will replace the templated values with data retrieved in following steps.

    apiVersion: v1
    kind: Config
    users:
    - name: tigera-federation-remote-cluster
    user:
    token: <YOUR-SERVICE-ACCOUNT-TOKEN>
    clusters:
    - name: tigera-federation-remote-cluster
    cluster:
    certificate-authority-data: <YOUR-CERTIFICATE-AUTHORITY-DATA>
    server: <YOUR-SERVER-ADDRESS>
    contexts:
    - name: tigera-federation-remote-cluster-ctx
    context:
    cluster: tigera-federation-remote-cluster
    user: tigera-federation-remote-cluster
    current-context: tigera-federation-remote-cluster-ctx
  4. Retrieve the ServiceAccount token:

    If using Kubernetes ≥ 1.24​

    • Create the ServiceAccount token:
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Secret
    type: kubernetes.io/service-account-token
    metadata:
    name: tigera-federation-remote-cluster
    namespace: kube-system
    annotations:
    kubernetes.io/service-account.name: "tigera-federation-remote-cluster"
    EOF
    • Retrieve the ServiceAccount token value and replace <YOUR-SERVICE-ACCOUNT-TOKEN> with it's value:
    kubectl describe secret tigera-federation-remote-cluster -n kube-system

    If using Kubernetes < 1.24​

    • Retrieve the ServiceAccount token value and replace <YOUR-SERVICE-ACCOUNT-TOKEN> with it's value:
    kubectl describe secret -n kube-system $(kubectl get serviceaccounts tigera-federation-remote-cluster -n kube-system -o jsonpath='{.secrets[0].name}')
  5. Retrieve and save the certificate authority and server data:

    Run the following command:

    kubectl config view --flatten --minify

    Replace <YOUR-CERTIFICATE-AUTHORITY-DATA> and <YOUR-SERVER-ADDRESS> with certificate-authority-data and server values respectively.

  6. Verify that the kubeconfig file works:

    Issue a command like the following to validate the kubeconfig file can be used to connect to the current cluster and access resources:

    kubectl --kubeconfig=kubeconfig-app-a get nodes

Create RemoteClusterConfigurations​

We'll now create the RemoteClusterConfigurations that establish synchronization between clusters. This enables remote-identity aware policy, federated services, and can establish multi-cluster networking.

In this setup, the cluster mesh will be configured to meet the pod IP routability requirement by establishing routes between clusters using Calico Enterprise multi-cluster networking.

For each pair of clusters in the cluster mesh (e.g. cluster A and cluster B):

  1. In cluster A, create a secret that contains the kubeconfig for cluster B:

    Determine the namespace (<secret-namespace>) for the secret to replace in all steps. The simplest method to create a secret for a remote cluster is to use the kubectl command because it correctly encodes the data and formats the file.

    kubectl create secret generic remote-cluster-secret-name -n <secret-namespace> \
    --from-literal=datastoreType=kubernetes \
    --from-file=kubeconfig=<kubeconfig file>
  2. If RBAC is enabled in cluster A, create a Role and RoleBinding for Calico Enterprise to use to access the secret that contains the kubeconfig for cluster B:

    kubectl create -f - <<EOF
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
    name: remote-cluster-secret-access
    namespace: <secret-namespace>
    rules:
    - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["watch", "list", "get"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
    name: remote-cluster-secret-access
    namespace: <secret-namespace>
    roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: Role
    name: remote-cluster-secret-access
    subjects:
    - kind: ServiceAccount
    name: calico-typha
    namespace: calico-system
    EOF
  3. Create the RemoteClusterConfiguration in cluster A:

    Within the RemoteClusterConfiguration, we specify the secret used to access cluster B, and the overlay routing mode which toggles the establishment of cross-cluster overlay routes.

    kubectl create -f - <<EOF
    apiVersion: projectcalico.org/v3
    kind: RemoteClusterConfiguration
    metadata:
    name: cluster-b
    spec:
    clusterAccessSecret:
    name: remote-cluster-secret-name
    namespace: <secret-namespace>
    kind: Secret
    syncOptions:
    overlayRoutingMode: Enabled
    EOF
  4. Validate the that the remote cluster connection can be established.

  5. Repeat the above steps, switching cluster A and cluster B.

After completing the above steps for all cluster pairs in the cluster mesh, your clusters should now be ready to utilize remote-identity-aware policy and federated services, along with multi-cluster networking if requirements were met.

caution

This tutorial sets up RemoteClusterConfigurations in both directions. This is required for Calico Enterprise to manage multi-cluster networking, and also ensures you can write identity-aware policy on both sides of a cross-cluster connection. Unidirectional connections can be made at your own discretion.

Switch to multi-cluster networking​

The steps above assume that you are configuring both federated endpoint identity and multi-cluster networking for the first time. If you already have federated endpoint identity, and want to use multi-cluster networking, follow these steps:

  1. Validate that all requirements for multi-cluster networking have been met.
  2. Update the ClusterRole in each cluster in the cluster mesh using the RBAC manifest found in Create kubeconfig files
  3. In all RemoteClusterConfigurations, set Spec.OverlayRoutingMode to Enabled.
  4. Verify that all RemoteClusterConfigurations are bidirectional (in both directions for each cluster pair) using these instructions.
  5. If you had previously created disabled IP pools to prevent NAT outgoing from applying to remote cluster destinations, those disabled IP pools are no longer needed when using multi-cluster networking and must be deleted.

Validate federated endpoint identity & multi-cluster networking​

Validate RemoteClusterConfiguration and federated endpoint identity​

Check remote cluster connection​

You can validate in a local cluster that Typha has synced to the remote cluster through the Prometheus metrics for Typha.

Alternatively, you can check the Typha logs for remote cluster connection status. Run the following command:

kubectl logs deployment/calico-typha -n calico-system | grep "Sending in-sync update"

You should see an entry for each RemoteClusterConfiguration in the local cluster.

If either output contains unexpected results, proceed to the troubleshooting section.

Validate multi-cluster networking​

If all requirements were met for Calico Enterprise to establish multi-cluster networking, you can test the functionality by establishing a connection from a pod in a local cluster to the IP of a pod in a remote cluster. Ensure that there is no policy in either cluster that may block this connection.

If the connection fails, proceed to the troubleshooting section.

Create remote-identity-aware network policy​

With federated endpoint identity and routing between clusters established, you can now use labels to reference endpoints on a remote cluster in local policy rules, rather than referencing them by IP address.

The main policy selector still refers only to local endpoints; and that selector chooses which local endpoints to apply the policy. However, rule selectors can now refer to both local and remote endpoints.

In the following example, cluster A (an application cluster) has a network policy that governs outbound connections to cluster B (a database cluster).

apiversion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
name: default.app-to-db
namespace: myapp
spec:
# The main policy selector selects endpoints from the local cluster only.
selector: app == 'backend-app'
tier: default
egress:
- destination:
# Rule selectors can select endpoints from local AND remote clusters.
selector: app == 'postgres'
protocol: TCP
ports: [5432]
action: Allow

Troubleshoot​

Troubleshoot RemoteClusterConfiguration and federated endpoint identity​

Verify configuration​

For each impacted remote cluster pair (between cluster A and cluster B):

  1. Retrieve the kubeconfig from the secret stored in cluster A. Manually verify that it can be used to connect to cluster B.

    kubectl get secret -n <secret-namespace> remote-cluster-secret-name -o=jsonpath="{.data.kubeconfig}" | base64 -d > verify_kubeconfig_b
    kubectl --kubeconfig=verify_kubeconfig_b get nodes

    This validates that the credentials used by Typha to connect to cluster B's API server are stored in the correct location and provide sufficient access.

    The command above should yield a result like the following:

     NAME              STATUS   ROLES    AGE   VERSION
    clusterB-master Ready master 7d v1.27.0
    clusterB-worker-1 Ready worker 7d v1.27.0
    clusterB-worker-2 Ready worker 7d v1.27.0

    If you do not see the nodes of cluster B listed in response to the command above, verify that you created the kubeconfig for cluster B correctly, and that you stored it in cluster A correctly.

    If you do see the nodes of cluster B listed in response to the command above, you can run this test (or a similar test) on a node in cluster A to verify that cluster A nodes can connect to the API server of cluster B.

  2. Validate that the Typha service account in Cluster A is authorized to retrieve the kubeconfig secret for cluster B.

     kubectl auth can-i list secrets --namespace <secret-namespace> --as=system:serviceaccount:calico-system:calico-typha

    This command should yield the following output:

    yes

    If the command does not return this output, verify that you correctly configured RBAC in cluster A.

  3. Repeat the above, switching cluster A to cluster B.

Check logs​

Validate that querying Typha logs yield the expected result outlined in the validation section.

If the Typha logs do not yield the expected result, review the warning or error-related logs in typha or calico-node for insights.

calicoq​

calicoq can be used to emulate the connection that Typha will make to remote clusters. Use the following command:

calicoq eval "all()"

If all remote clusters are accessible, calicoq returns something like the following. Note the remote cluster prefixes: there should be endpoints prefixed with the name of each RemoteClusterConfiguration in the local cluster.

Endpoints matching selector all():
Workload endpoint remote-cluster-1/host-1/k8s/kube-system.kube-dns-5fbcb4d67b-h6686/eth0
Workload endpoint remote-cluster-1/host-2/k8s/kube-system.cnx-manager-66c4dbc5b7-6d9xv/eth0
Workload endpoint host-a/k8s/kube-system.kube-dns-5fbcb4d67b-7wbhv/eth0
Workload endpoint host-b/k8s/kube-system.cnx-manager-66c4dbc5b7-6ghsm/eth0

If this command fails, the error messages returned by the command may provide insight into where issues are occurring.

Troubleshoot multi-cluster networking​

Basic validation​
  • Ensure that RemoteClusterConfiguration and federated endpoint identity are functioning correctly
  • Verify that you have met the prerequisites for multi-cluster networking
  • If you had previously set up RemoteClusterConfigurations without multi-cluster networking, and are upgrading to use the feature, review the switching considerations
  • Verify that traffic between clusters is not being denied by network policy
Check overlayRoutingMode​

Ensure that overlayRoutingMode is set to "Enabled" on all RemoteClusterConfigurations.

If overlay routing is successfully enabled, you can view the logs of a Typha instance using:

kubectl logs deployment/calico-typha -n calico-system

You should see an output for each connected remote cluster that looks like this:

18:49:35.394 [INFO][14] wrappedcallbacks.go 443: Creating syncer for RemoteClusterConfiguration(my-cluster)
18:49:35.394 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/ipam/v2/assignment/"
18:49:35.395 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/workloadendpoints"
18:49:35.396 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/hostendpoints"
18:49:35.396 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/profiles"
18:49:35.396 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/nodes"
18:49:35.397 [INFO][14] watchercache.go 186: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/ippools"

If you do not see the each of the resource types above, overlay routing was not successfully enabled in your cluster. Verify that you followed the setup correctly for overlay routing, and that the cluster is using a version of Calico Enterprise that supports multi-cluster networking.

Check logs​

Warning or error logs in typha or calico-node may provide insight into where issues are occurring.

Next steps​

Configure federated services