Deep Calico troubleshooting on AWS EC2

Published by The Daily Scout

What happened

A practical post walks through Calico networking failures on Kubernetes running on AWS EC2, diagnosing packet drops tied to VXLAN CrossSubnet mode and the EC2 Source/Destination Check, and showing fixes such as disabling the check. The post is a concise, hands-on reference for infra engineers debugging Pod-to-Pod networking on cloud VMs. (x.com)

Why it matters

Calico’s FelixConfiguration supports setting awsSrcDstCheck: Disable, which lets Felix turn off the AWS source/destination check on nodes automatically. (github.com) The check can also be disabled per instance with the AWS CLI, for example aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --no-source-dest-check. (awscli.amazonaws.com) Calico’s AWS guidance explicitly warns that the check must be disabled when Calico assigns pod IPs outside the EC2 instance IP range, so that traffic can be routed natively within a VPC subnet. (docs.tigera.io)

When an IPPool sets vxlanMode or ipipMode to CrossSubnet, Calico leaves intra-subnet traffic unencapsulated and applies VXLAN or IP-in-IP only across VPC subnet boundaries. (bookstack.cn) VXLAN encapsulation in Calico uses UDP port 4789 and adds 50 bytes of overhead per packet (outer Ethernet, IP, UDP, and VXLAN headers); node security groups must allow UDP/4789 in both directions, or encapsulated packets are dropped. (oneuptime.com)

For diagnosis, the Calico troubleshooting guide documents commands such as sudo calicoctl node diags and kubectl logs -n calico-system <pod_name> for collecting node and component logs during pod-to-pod connectivity failures. (docs.tigera.io) Published incident writeups of Calico on self-managed AWS highlight three recurring root causes: security groups blocking VXLAN/BGP, VPC src/dst checks left enabled, and MTU mismatches. Each needs a targeted fix (open UDP/4789, disable src/dst checks, or align MTU). (github.com)
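Aligning MTU comes down to simple subtraction. The following is a minimal sketch, assuming the 50-byte VXLAN overhead cited above and a 20-byte overhead for IP-in-IP (one extra IPv4 header). The function name pod_mtu and the example interface MTUs of 1500 and 9001 are illustrative assumptions, not values taken from the original post.

```python
# Per-packet encapsulation overhead in bytes.
# 50 for VXLAN is the figure cited in the text; 20 for IP-in-IP is the
# size of one extra IPv4 header (an assumption added for this sketch).
ENCAP_OVERHEAD = {"none": 0, "ipip": 20, "vxlan": 50}


def pod_mtu(interface_mtu: int, encap: str) -> int:
    """Return the largest pod MTU that still fits after encapsulation."""
    if encap not in ENCAP_OVERHEAD:
        raise ValueError(f"unknown encapsulation: {encap!r}")
    mtu = interface_mtu - ENCAP_OVERHEAD[encap]
    if mtu <= 0:
        raise ValueError("interface MTU is smaller than the overhead")
    return mtu


if __name__ == "__main__":
    # Hypothetical interface MTUs: 1500 (standard) and 9001 (jumbo frames).
    for nic_mtu in (1500, 9001):
        print(nic_mtu, "->", pod_mtu(nic_mtu, "vxlan"))
```

If the pod MTU configured in Calico is larger than the value computed this way for the node interface, encapsulated packets may be fragmented or dropped, which is one way an MTU mismatch can show up.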

Key numbers

  • A practical post walks through Calico networking failures on Kubernetes running on AWS EC2, diagnosing packet drops tied to VXLAN CrossSubnet mode and the EC2 Source/Destination Check, and showing fixes such as disabling the check. (x.com)
  • The source/destination check can be disabled per EC2 instance with the AWS CLI's modify-instance-attribute command (example flags: --instance-id i-0123456789abcdef0 --no-source-dest-check). (awscli.amazonaws.com)
  • VXLAN encapsulation in Calico uses UDP port 4789 and adds 50 bytes of overhead per packet (outer Ethernet, IP, UDP, and VXLAN headers); node security groups must allow UDP/4789 in both directions, or encapsulated packets are dropped. (oneuptime.com)
  • Calico’s AWS guidance explicitly warns that AWS src/dst checks must be disabled when Calico assigns pod IPs outside the EC2 instance IP range, so that traffic can be routed natively within a VPC subnet. (docs.tigera.io)
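The two ways of disabling the source/destination check mentioned in this briefing can be sketched as a small helper. This is a hedged illustration only: it builds the AWS CLI command line and a JSON merge patch for the awsSrcDstCheck field without running anything. The function names, the placeholder instance ID, and the assumption that the FelixConfiguration resource is named default are all hypothetical choices made for this sketch.

```python
import json
import shlex


def src_dst_check_command(instance_id: str) -> list[str]:
    """Build the AWS CLI argv that disables the check on one instance."""
    return [
        "aws", "ec2", "modify-instance-attribute",
        "--instance-id", instance_id,
        "--no-source-dest-check",
    ]


def felix_patch() -> str:
    """Return a JSON merge patch that sets awsSrcDstCheck to Disable."""
    return json.dumps({"spec": {"awsSrcDstCheck": "Disable"}})


if __name__ == "__main__":
    # Placeholder instance ID; replace with the cluster's real node IDs.
    for node in ["i-0123456789abcdef0"]:
        print(shlex.join(src_dst_check_command(node)))
    # Assumption: the patch is applied to a FelixConfiguration named
    # "default", for example with
    #   kubectl patch felixconfiguration default --type merge -p '<patch>'
    print(felix_patch())
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the Felix setting is usually the less error-prone option because it also covers nodes added later.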

What happens next

  • When an IPPool is configured with vxlanMode or ipipMode set to CrossSubnet, Calico leaves intra-subnet traffic unencapsulated and applies VXLAN or IP-in-IP only across VPC subnet boundaries. (bookstack.cn)
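The CrossSubnet rule can be modelled as a single subnet-membership test. The sketch below is a simplified model of that decision, not Calico's actual implementation; the function name and the example node addresses are hypothetical.

```python
import ipaddress


def needs_encapsulation(local_node: str, remote_node_ip: str) -> bool:
    """Return True when the remote node is outside the local node's subnet.

    local_node is the local node's address with prefix length, for
    example "10.0.1.10/24"; remote_node_ip is a bare address.
    """
    local_subnet = ipaddress.ip_interface(local_node).network
    return ipaddress.ip_address(remote_node_ip) not in local_subnet


if __name__ == "__main__":
    # Hypothetical node addresses in two /24 VPC subnets.
    print(needs_encapsulation("10.0.1.10/24", "10.0.1.20"))  # same subnet
    print(needs_encapsulation("10.0.1.10/24", "10.0.2.20"))  # other subnet
```

In this model, traffic between nodes in the same subnet travels unencapsulated, which is the path where the AWS source/destination check matters; traffic to a node in another subnet is encapsulated, which is the path where UDP/4789 must be open for VXLAN.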
