Skip to main content
Calico Cloud documentation

Limitations and known issues for Windows nodes

Calico Cloud feature limitations

FeatureUnsupported in this release
Platforms- GKE
Install and upgrade- Typha component for scaling (Linux-based feature)
Networking- Overlay mode with BGP peering
- IP in IP overlay with BGP routing
- Cross-subnet support and MTU setting for VXLAN
- IPv6 and dual stack
- Dual-ToR
- Service advertisement
- Multiple networks to pods
Policy- Staged network-policy
- Firewall integrations
- Policy for hosts (host endpoints, including automatic host endpoints)
- Tiered policy: TKG, GKE, AKS
- WAF integration
- AWS firewall integration
- Fortinet integration
Visibility and troubleshooting- Packet capture
- DNS logs
- iptables logs
- L7 logs
Threat defense- No threat defense features are supported.
Image Assurance- No Image Assurance features are supported.
Multi-cluster management- Multi-cluster management federated identity endpoints and services
- Federated endpoint identity and services
Compliance and security- CIS benchmark and other reports
- Wireguard encryption for pod-to-pod traffic and host-to-host traffic
Dataplane- eBPF is a Linux-based feature

Calico Cloud BGP networking limitations

If you are using Calico Cloud with BGP, note these current limitations with Windows.

FeatureLimitation
IP mobility/ borrowingCalico Cloud IPAM allocates IPs to host in blocks for aggregation purposes.
If the IP pool is full, nodes can also "borrow" IPs from another node's block. In BGP terms, the borrower then advertises a more specific "/32" route for the borrowed IP and traffic for that IP is only routed to the borrowing host.

Windows nodes do not support this borrowing mechanism; they will not borrow IPs even if the IP pool is full and they mark their blocks so that Linux nodes will not borrow from them.
IPs reserved for WindowsCalico Cloud IPAM allocates IPs in CIDR blocks. Due to networking requirements on Windows, four IPs per Windows node-owned block must be reserved for internal purposes.

For example, with the default block size of /26, each block contains 64 IP addresses, 4 are reserved for Windows, leaving 60 for pod networking.

To reduce the impact of these reservations, a larger block size can be configured at the IP pool scope (before any pods are created).
Single IP block per hostCalico Cloud IPAM is designed to allocate blocks of IPs (default size /26) to hosts on demand. While the Calico Cloud CNI plugin was written to do the same, kube-proxy for Windows currently only supports a single IP block per host.

To work around the default limit of one /26 per host there some options:

- Use Calico Cloud BGP networking with the kubernetes datastore. In that mode, Calico Cloud IPAM is not used and the CNI host-local IPAM plugin is used with the node's Pod CIDR.

To allow multiple IPAM blocks per host (at the expense of kube-proxy compatibility), set the windows_use_single_network flag to false in the cni.conf.template before installing Calico Cloud. Changing that setting after pods are networked is not recommended because it may leak HNS endpoints.
IP-in-IP overlayCalico Cloud's IPIP overlay mode cannot be used in clusters that contain Windows nodes because Windows does not support IP-in-IP.
NAT-outgoingCalico Cloud IP pools support a "NAT outgoing" setting with the following behaviour:

- Traffic between Calico Cloud workloads (in any IP pools) is not NATted.
- Traffic leaving the configured IP pools is NATted if the workload has an IP within an IP pool that has NAT outgoing enabled. Calico Cloud honors the above setting but it is only applied at pod creation time. If the IP pool configuration is updated after a pod is created, the pod's traffic will continue to be NATted (or not) as before. NAT policy for newly-networked pods will honor the new configuration. Calico Cloud automatically adds the host itself and its subnet to the NAT exclusion list. This behaviour can be disabled by setting flag windows_disable_host_subnet_nat_exclusion to true in cni.conf.template before running the install script.
Service IP advertisementThis Calico Cloud feature is not supported on Windows.

Check your network configuration

If you are using a networking type that requires layer 2 reachability (such as Calico Cloud with a BGP mesh and no peering to your fabric), you can check that your network has layer 2 reachability as follows:

On each of your nodes, check the IP network of the network adapter that you plan to use for pod networking. For example, on Linux, assuming your network adapter is eth0, you can run:

$ ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

link/ether 00:0c:29:cb:c8:19 brd ff:ff:ff:ff:ff:ff
inet 192.168.171.136/24 brd 192.168.171.255 scope

global eth0
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fecb:c819/64 scope
link

valid_lft forever preferred_lft
forever

In this case, the IPv4 is 192.168.171.136/24; which, after applying the /24 mask gives 192.168.171.0/24 for the IP network.

Similarly, on Windows, you can run

PS C:\> ipconfig

Windows IP Configuration

Ethernet adapter vEthernet (Ethernet 2):

Connection-specific DNS Suffix . :
us-west-2.compute.internal Link-local IPv6 Address . . . .
. : fe80::6d10:ccdd:bfbe:bce2%15 IPv4 Address. . . . . . .
. . . . : 172.20.41.103 Subnet Mask . . . . . . . . . . .
: 255.255.224.0 Default Gateway . . . . . . . . . :
172.20.32.1

In this case, the IPv4 address is 172.20.41.103 and the mask is represented as bytes 255.255.224.0 rather than CIDR notation. Applying the mask, we get a network address 172.20.32.0/19.

Because the Linux node has network 192.168.171.136/24 and the Windows node has a different network, 172.20.32.0/19, they are unlikely to be on the same layer 2 network.

VXLAN networking limitations

Because of differences between the Linux and Windows dataplane feature sets, the following Calico Cloud features are not supported on Windows.

FeatureLimitation
IPs reserved for WindowsCalico Cloud IPAM allocates IPs in CIDR blocks. Due to networking requirements on Windows, four IPs per Windows node-owned block must be reserved for internal purposes.

For example, with the default block size of /26, each block contains 64 IP addresses, 4 are reserved for Windows, leaving 60 for pod networking.

To reduce the impact of these reservations, a larger block size can be configured at the IP pool scope (before any pods are created).
Single IP block per hostCalico Cloud IPAM is designed to allocate blocks of IPs (default size /26) to hosts on demand. While the Calico Cloud CNI plugin was written to do the same, kube-proxy currently only supports a single IP block per host.
To allow multiple IPAM blocks per host (at the expense of kube-proxy compatibility), set the windows_use_single_network flag to false in the cni.conf.template before installing Calico Cloud. Changing that setting after pods are networked is not recommended because it may leak HNS endpoints.

Routes are lost in cloud providers

If you create a Windows host with a cloud provider (AWS for example), the creation of the vSwitch at Calico Cloud install time can remove the cloud provider's metadata route. If your application relies on the metadata service, you may need to examine the routing table before and after installing Calico Cloud to reinstate any lost routes.

VXLAN limitations

VXLAN support

  • Windows 1903 build 18317 and above
  • Windows 1809 build 17763 and above

Configuration updates

Certain configuration changes will not be honored after the first pod is networked. This is because Windows does not currently support updating the VXLAN subnet parameters after the network is created so updating those parameters requires the node to be drained:

One example is the VXLAN VNI setting. To change such parameters:

  • Drain the node of all pods

  • Delete the Calico Cloud HNS network:

    Import-Module -DisableNameChecking C:\TigeraCalico\libs\hns\hns.psm1
    Get-HNSNetwork | ? Name -EQ "Calico Cloud" | Remove-HNSNetwork
  • Update the configuration in config.ps1, run uninstall-calico.ps1 and then install-calico.ps1 to regenerate the CNI configuration.

Pod-to-pod connections are dropped with TCP reset packets

Restarting Felix or changes to policy (including changes to endpoints referred to in policy) can cause pod-to-pod connections to be dropped with TCP reset packets when one of the following occurs:

  • The policy that applies to a pod is updated
  • Some ingress or egress policy that applies to a pod contains selectors and the set of endpoints that those selectors match changes

Felix must reprogram the HNS ACL policy attached to the pod. This reprogramming can cause TCP resets. Microsoft has confirmed this is a HNS issue, and they are investigating.

Service ClusterIPs incompatible with selectors on pod IPs in network policy

Windows 1809 prior to build 17763.1432

On Windows nodes, kube-proxy unconditionally applies source NAT to traffic from local pods to service ClusterIPs. This means that, at the destination pod, where policy is applied, the traffic appears to come from the source host rather than the source pod. In turn, this means that a network policy with a source selector matching the source pod will not match the expected traffic.

Network policy and using selectors

Under certain conditions, relatively simple Calico Cloud policies can require significant Windows dataplane resources, that can cause significant CPU and memory usage, and large policy programming latency.

We recommend avoiding policies that contain rules with both a source and destination selector. The following is an example of a policy that would be inefficient. The policy applies to all workloads, and it only allows traffic from workloads labeled as clients to workloads labeled as servers:

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
name: calico-dest-selector
spec:
selector: all()
order: 500
ingress:
- action: Allow
destination:
selector: role == "webserver"
source:
selector: role == "client"

Because the policy applies to all workloads, it will be rendered once per workload (even if the workload is not labeled as a server), and then the selectors will be expanded into many individual dataplane rules to capture the allowed connectivity.

Here is a much more efficient policy that still allows the same traffic:

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
name: calico-dest-selector
spec:
selector: role == "webserver"
order: 500
ingress:
- action: Allow
source:
selector: role == "client"

The destination selector is moved into the policy selector, so this policy is only rendered for workloads that have the role: webserver label. In addition, the rule is simplified so that it only matches on the source of the traffic. Depending on the number of webserver pods, this change can reduce the dataplane resource usage by several orders of magnitude.

Network policy with tiers

Because of the way the Windows dataplane handles rules, the following limitations are required to avoid performance issues:

  • Tiers: maximum of 5
  • pass rules: maximum of 10 per tier
  • If each tier contains a large number of rules, and has pass rules, you may need to reduce the number of tiers further.

Flow log limitations

Calico Cloud supports flow logs with these limitations:

  • No packet/bytes stats for denied traffics
  • Inaccurate num_flows_started and num_flows_completed stats with VXLAN networking
  • No DNS stats
  • No Http stats
  • No RuleTrace for tiers
  • No BGP logs

DNS Policy limitations

note

DNS Policy is a tech preview feature. Tech preview features may be subject to significant changes before they become GA.

Calico Cloud supports DNS policy on Windows with these limitations:

  • It could take up to 5 seconds for the first TCP SYN packet to go through, for a connection to a DNS domain name. This is because DNS policies are dynamically programmed. The first TCP packet could be dropped since there is no policy to allow it until Calico Cloud detects domain IPs from DNS response and programs DNS policy rules. The Windows TCPIP stack will send SYN again after TCP Retransmission timeout (RTO) if previous SYN has been dropped.
  • Some runtime libraries do not honour DNS TTL. Instead, they manage their own DNS cache which has a different TTL value for DNS entries. On .NET Framework, the value to control DNS TTL is ServicePointManager.DnsRefreshTimeout which has default value of 120 seconds - DNS refresh timeout. It is important that Calico Cloud uses a longer TTL value than the one used by the application, so that DNS policy will be in place when the application is making outbound connections. The configuration item “WindowsDNSExtraTTL” should have a value bigger than the maximum value of DNS TTL used by the runtime libraries for your applications.
  • Due to the limitations of Windows container networking, a policy update could have an impact on performance. Programming DNS policy may result in more policy updates. Setting “WindowsDNSExtraTTL” to a bigger number will reduce the performance impact.