Kubernetes Pod-to-Pod Networking: The Complete Guide
1. Introduction - Why Networking in Kubernetes is Different
The Challenge
In traditional infrastructure, networking is relatively straightforward—servers have static IPs, and you configure firewalls and routes manually. Kubernetes turns this upside down:
- Pods are ephemeral: They come and go, getting new IPs each time
- Dynamic scaling: The number of pods changes constantly
- Multi-host: Pods on different nodes must communicate seamlessly
- Flat network: Every pod should reach every other pod without NAT
The Kubernetes Networking Requirements
Kubernetes imposes four fundamental networking requirements:
The Four Pillars of Kubernetes Networking:
- Pod-to-Pod: All pods can communicate with all other pods without NAT
- Node-to-Pod: All nodes can communicate with all pods without NAT
- Pod-to-Self: A pod sees itself with the same IP others see it with
- Service Abstraction: Services provide stable endpoints for pod groups
What We'll Cover
| Communication Type | Scenario | Complexity |
| Same-Node | Pod A → Pod B (same host) | Medium |
| Cross-Node | Pod A (Node 1) → Pod B (Node 2) | High |
| Pod-to-Service | Pod → ClusterIP → Backend Pods | Medium |
| External-to-Pod | Internet → Ingress → Service → Pod | High |
2. Linux Networking Fundamentals
Before diving into Kubernetes, you need to understand the Linux networking primitives it builds upon.
Network Namespaces
A network namespace is an isolated network stack with its own:
- Network interfaces
- Routing tables
- Firewall rules (iptables)
- Sockets
# Create a network namespace
sudo ip netns add my-namespace
# List all network namespaces
ip netns list
# Execute command in namespace
sudo ip netns exec my-namespace ip addr
# Delete namespace
sudo ip netns delete my-namespace
Why It Matters:
Each pod in Kubernetes gets its own network namespace. This provides network isolation: each pod has its own eth0, its own IP, and its own routing table.
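You can inspect these namespaces directly on a node. A minimal sketch (replace <pid> with the pause process PID that lsns reports for the pod you care about):
# List network namespaces on the node; each pod's namespace is held by a pause process
sudo lsns -t net
# Enter a pod's namespace via that PID and look around
sudo nsenter -t <pid> -n ip addr
sudo nsenter -t <pid> -n ip route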
Virtual Ethernet (veth) Pairs
A veth pair is like a virtual network cable with two ends. Traffic entering one end exits the other.
# Create a veth pair
sudo ip link add veth0 type veth peer name veth1
# Move one end to a namespace
sudo ip link set veth1 netns my-namespace
# Assign IPs
sudo ip addr add 10.0.0.1/24 dev veth0
sudo ip netns exec my-namespace ip addr add 10.0.0.2/24 dev veth1
# Bring up interfaces
sudo ip link set veth0 up
sudo ip netns exec my-namespace ip link set veth1 up
# Now they can communicate!
ping 10.0.0.2
veth Pair Visualization:
[Host Network] ←-- veth0 --||-- veth1 --→ [Container Namespace]
One end stays in the host; the other goes into the container's namespace.
Linux Bridge
A bridge is a virtual Layer 2 switch. It connects multiple network interfaces and forwards packets between them based on MAC addresses.
# Create a bridge
sudo ip link add br0 type bridge
sudo ip link set br0 up
# Connect veth to bridge
sudo ip link set veth0 master br0
# Move the gateway address from veth0 to the bridge
sudo ip addr del 10.0.0.1/24 dev veth0
# Assign IP to bridge (becomes gateway)
sudo ip addr add 10.0.0.1/24 dev br0
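To see the bridge actually switching packets, here is a minimal sketch that attaches a second namespace (ns2, an illustrative name) alongside my-namespace from the earlier examples:
# Create a second namespace and veth pair, and attach the host end to br0
sudo ip netns add ns2
sudo ip link add veth2 type veth peer name veth3
sudo ip link set veth3 netns ns2
sudo ip link set veth2 master br0
sudo ip link set veth2 up
sudo ip netns exec ns2 ip addr add 10.0.0.3/24 dev veth3
sudo ip netns exec ns2 ip link set veth3 up
# The bridge now forwards frames between the two namespaces at Layer 2
sudo ip netns exec ns2 ping -c 3 10.0.0.2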
iptables & NAT
iptables is the Linux firewall and packet manipulation tool. Kubernetes uses it heavily for:
- Service load balancing (via kube-proxy)
- Network policies
- NAT for external traffic
# View NAT rules
sudo iptables -t nat -L -n -v
# View filter rules
sudo iptables -L -n -v
Key iptables chains used by Kubernetes:
- PREROUTING: Modify packets before the routing decision (DNAT)
- POSTROUTING: Modify packets after routing (SNAT)
- FORWARD: Packets passing through the node (not destined for the local host)
- KUBE-SERVICES: Kubernetes Service rules
- KUBE-NODEPORTS: NodePort Service rules
Routing Tables
Routing determines where packets go. Each network namespace has its own routing table.
# View routing table
ip route
# Example output:
# default via 192.168.1.1 dev eth0
# 10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
# 10.244.1.0/24 via 192.168.1.101 dev eth0
3. Container Networking Basics
How Docker Does It (Without Kubernetes)
Docker's default networking uses a bridge called docker0:
Limitations of Docker's default networking:
- Containers on different hosts can't communicate directly
- Requires port mapping for external access
- NAT between containers and external world
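A quick illustration of that last point (container name and ports are illustrative):
# Containers get an IP on docker0 (172.17.0.0/16 by default) that other hosts can't route to,
# so external access requires publishing a port, i.e. NAT on the host
docker run -d --name web -p 8080:80 nginx
curl http://localhost:8080                               # reached via the published port
docker inspect -f '{{.NetworkSettings.IPAddress}}' web    # bridge IP, not reachable from other hosts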
Why Kubernetes Needs More
Kubernetes requirements break Docker's default model:
- Pods need routable IPs (no NAT)
- Cross-node communication must be seamless
- Need for network policies
- Service discovery and load balancing
This is why Kubernetes uses CNI (Container Network Interface) instead of Docker's built-in networking.
4. Kubernetes Networking Model
The Flat Network Model
Kubernetes implements a flat network where:
Kubernetes Network Principles:
- Every Pod gets a unique IP address
- Pods on any node can communicate with pods on any other node using their IP
- No NAT between pods (the IP a pod sees for itself is the same IP others see)
- Agents on a node (kubelet, system daemons) can communicate with all pods on that node
IP Address Allocation
| Component | IP Range (Example) | Assigned By |
| Nodes | 192.168.1.0/24 | Infrastructure/DHCP |
| Pods | 10.244.0.0/16 | CNI Plugin |
| Services | 10.96.0.0/12 | kube-apiserver |
# Cluster-wide pod CIDR: kube-controller-manager flag --cluster-cidr=10.244.0.0/16
# Each node is then allocated a per-node podCIDR (e.g. 10.244.1.0/24)
# Service CIDR is configured in kube-apiserver
# --service-cluster-ip-range=10.96.0.0/12
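You can read both ranges back from a running cluster (the second command assumes a kubeadm-style static-pod API server):
# Per-node pod CIDRs allocated by the controller manager
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# Service CIDR passed to the API server (kubeadm clusters)
kubectl -n kube-system get pod -l component=kube-apiserver -o yaml | grep service-cluster-ip-range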
Network Components Overview
Kubernetes Network Stack:
| Layer | Component | Purpose |
| L7 | Ingress Controller | HTTP routing, TLS termination |
| L4 | Services | Stable endpoints, load balancing |
| L3 | CNI Plugin | Pod networking, routing |
| L2 | Linux Bridge/veth | Container connectivity |
5. Pod Network Namespace
What Happens When a Pod Starts
When Kubernetes creates a pod:
- kubelet calls the CRI (Container Runtime Interface)
- CRI creates the pod sandbox (pause container)
- CNI plugin is invoked to set up networking
- Network namespace is created for the pod
- veth pair connects the namespace to the node's network
- IP address is assigned from the pod CIDR
Pod Creation Flow:
kubelet → CRI (containerd) → CNI Plugin → Network Setup Complete
The Pause Container
Every pod has a hidden pause container (also called infrastructure container):
# List pod sandboxes with crictl (each sandbox is backed by the pause image)
crictl pods
# With Docker as the container runtime, the pause containers show up directly
docker ps | grep pause
Why pause container exists:
- Holds the network namespace
- All other containers in the pod join this namespace
- If app container crashes, network namespace persists
- Extremely lightweight (~700KB)
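A quick way to see the consequence of this design, sketched with a hypothetical pod named my-pod running nginx: restart the app container and watch the pod IP survive.
# Note the current pod IP
kubectl get pod my-pod -o jsonpath='{.status.podIP}'
# Terminate the app container's main process; kubelet restarts it in the same sandbox
kubectl exec my-pod -- sh -c 'kill 1'
# RESTARTS increments, but the IP is unchanged because the pause container kept the namespace
kubectl get pod my-pod -o wide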
Containers Within a Pod
All containers in a pod:
- Share the same network namespace
- Share the same IP address
- Can communicate via localhost
- Must coordinate port usage (can't bind to the same port)
apiVersion: v1
kind: Pod
metadata:
name: multi-container-pod
spec:
containers:
- name: web
image: nginx
ports:
- containerPort: 80 # Container 1 uses port 80
- name: sidecar
image: busybox
command: ['sh', '-c', 'while true; do wget -qO- localhost:80; sleep 5; done'] # Accesses web via localhost!
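Assuming the manifest above has been applied, a quick check (a sketch; busybox provides an ip applet) shows both containers reporting the same pod address:
# The pod IP recorded in /etc/hosts is visible identically from both containers
kubectl exec multi-container-pod -c web -- cat /etc/hosts
kubectl exec multi-container-pod -c sidecar -- ip addr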
6. Same-Node Pod Communication
This is the simpler case—two pods on the same node need to communicate.
Architecture
Step-by-Step Packet Flow
Scenario: Pod A (10.244.1.2) sends packet to Pod B (10.244.1.3)
| Step | Location | Action |
| 1 | Pod A | Application creates packet destined for 10.244.1.3 |
| 2 | Pod A eth0 | Packet leaves via pod's eth0 interface |
| 3 | vethA | Packet exits pod namespace via veth pair |
| 4 | Bridge (cni0) | Bridge receives packet on vethA port |
| 5 | Bridge (cni0) | Bridge looks up MAC address for 10.244.1.3 |
| 6 | Bridge (cni0) | Bridge forwards packet to vethB port |
| 7 | vethB | Packet enters Pod B's namespace |
| 8 | Pod B eth0 | Packet arrives at Pod B's eth0 |
| 9 | Pod B | Application receives packet |
Verification Commands
# See the bridge and connected interfaces
ip link show type bridge
bridge link show
# Example output:
# 5: veth1234@if4: <BROADCAST,MULTICAST,UP> master cni0
# 7: veth5678@if6: <BROADCAST,MULTICAST,UP> master cni0
# See pod IP addresses (from node)
kubectl get pods -o wide
# Check bridge forwarding table (MAC addresses)
bridge fdb show br cni0
# Trace route between pods (from inside a pod)
kubectl exec -it pod-a -- traceroute 10.244.1.3
# Output: 1 hop (directly via bridge)
Why It's Fast
Same-node communication is efficient because:
- No encapsulation needed
- No routing between nodes
- Pure Layer 2 switching via Linux bridge
- Stays entirely within kernel (no user-space hops)
7. Cross-Node Pod Communication
This is where it gets interesting—pods on different nodes need to communicate.
The Challenge
How does 10.244.1.2 know how to reach 10.244.2.3? The underlying network only knows about 192.168.1.x!
Solution Approaches
There are three main approaches to solve cross-node communication:
| Approach | How It Works | Examples |
| Layer 3 Routing | Configure routes on hosts/routers | Calico (BGP), host routes |
| Overlay Network | Encapsulate pod traffic in node traffic | Flannel (VXLAN), Weave |
| Underlay/Native | Cloud provider integration | AWS VPC CNI, Azure CNI, GKE |
Approach 1: Layer 3 Routing
The simplest conceptually—just add routes!
Route on Node 1:
ip route add 10.244.2.0/24 via 192.168.1.101
Route on Node 2:
ip route add 10.244.1.0/24 via 192.168.1.100
Pros:
- No encapsulation overhead
- Best performance
- Easy to debug (standard routing)
Cons:
- Requires network infrastructure support
- May need BGP for large clusters
- Doesn't work across L2 boundaries without additional configuration
Approach 2: Overlay Network (VXLAN)
Encapsulate pod packets inside node packets.
VXLAN Details:
- VNI (VXLAN Network Identifier): Like a VLAN tag for overlay
- VTEP (VXLAN Tunnel Endpoint): The encap/decap points on each node
- UDP Port 4789: Standard VXLAN port
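To demystify the VTEP, here is a rough sketch of creating one by hand; the interface name, VNI, and addresses are illustrative, and CNI plugins do the equivalent for you (Flannel's flannel.1 is a VXLAN device with VNI 1):
# Create a VXLAN device bound to the node's physical NIC
sudo ip link add vxlan100 type vxlan id 100 dstport 4789 dev eth0 local 192.168.1.100
sudo ip link set vxlan100 up
# Inspect it: the 'vxlan id ... dstport 4789' details confirm the encapsulation settings
ip -d link show vxlan100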
Pros:
- Works over any IP network
- No special infrastructure needed
- Supports very large clusters
Cons:
- Encapsulation overhead (~50 bytes)
- Slightly higher latency
- MTU considerations (inner packet must be smaller)
Approach 3: Cloud Provider Native
Cloud providers can configure their SDN to understand pod IPs directly.
AWS VPC CNI:
- Each pod gets a real VPC IP (from ENI secondary IPs)
- No overlay needed
- Limited by ENI/IP limits per instance
Azure CNI:
- Pods get IPs from Azure VNet
- Direct routing within VNet
GKE VPC-native:
- Uses alias IP ranges
- Pods routable within VPC
Cross-Node Packet Flow (Overlay)
Scenario: Pod A (10.244.1.2 on Node 1) → Pod B (10.244.2.3 on Node 2)
| Step | Location | Action |
| 1 | Pod A | Creates packet: src=10.244.1.2, dst=10.244.2.3 |
| 2 | Pod A eth0 → veth → Bridge | Packet reaches node network stack |
| 3 | Node 1 Routing | Looks up 10.244.2.0/24 → via VXLAN/Flannel interface |
| 4 | VTEP (Node 1) | Encapsulates: outer src=192.168.1.100, dst=192.168.1.101 |
| 5 | Physical NIC | Sends encapsulated packet over physical network |
| 6 | Physical Network | Delivers to Node 2 (just sees 192.168.1.x traffic) |
| 7 | VTEP (Node 2) | Decapsulates, reveals inner packet |
| 8 | Node 2 Routing | Routes to local bridge (10.244.2.0/24 is local) |
| 9 | Bridge → veth → Pod B eth0 | Packet delivered to Pod B |
# See VXLAN interface (Flannel example)
ip -d link show flannel.1
# Check VXLAN FDB (forwarding database)
bridge fdb show dev flannel.1
8. Container Network Interface (CNI)
What is CNI?
CNI (Container Network Interface) is a specification and set of libraries for configuring network interfaces in Linux containers.
CNI is NOT:
- A daemon
- A specific implementation
- Kubernetes-specific (used by Mesos, CloudFoundry, Podman too)
CNI IS:
- A specification (how to call plugins)
- A set of reference plugins
- An interface between runtime and network plugin
How CNI Works
CNI Execution Flow:
- kubelet needs networking for a new pod
- kubelet calls the CRI (containerd/CRI-O)
- CRI creates the network namespace
- CRI calls the CNI plugin with:
  - ADD command
  - Namespace path
  - Container ID
  - Network config
- CNI plugin:
  - Creates veth pair
  - Assigns IP (via IPAM)
  - Sets up routes
  - Returns IP to kubelet
- kubelet updates the pod status with the IP
CNI Configuration
CNI config is typically at /etc/cni/net.d/:
{
"cniVersion": "0.4.0",
"name": "my-network",
"type": "bridge",
"bridge": "cni0",
"isGateway": true,
"ipMasq": true,
"ipam": {
"type": "host-local",
"subnet": "10.244.1.0/24",
"routes": [
{ "dst": "0.0.0.0/0" }
]
}
}
Key fields:
- type: Which CNI plugin binary to execute
- ipam: IP Address Management configuration
- bridge: Name of the bridge to create/use
CNI Plugin Location
Plugins are binaries at /opt/cni/bin/:
ls /opt/cni/bin/
# Output: bridge, host-local, loopback, portmap, flannel, calico, etc.
CNI Operations
| Operation | When Called | Purpose |
| ADD | Pod created | Set up networking for container |
| DEL | Pod deleted | Clean up networking |
| CHECK | Health check | Verify network is working |
| VERSION | Discovery | Report supported CNI versions |
# Manual CNI invocation example (for learning)
export CNI_COMMAND=ADD
export CNI_CONTAINERID=abc123
export CNI_NETNS=/var/run/netns/test
export CNI_IFNAME=eth0
export CNI_PATH=/opt/cni/bin
cat /etc/cni/net.d/10-bridge.conf | /opt/cni/bin/bridge
9. CNI Plugins Deep Dive
Popular CNI Plugins Comparison
| Plugin | Approach | Best For | Key Feature |
| Flannel | Overlay (VXLAN) | Simple clusters | Easy setup |
| Calico | L3 Routing (BGP) | Performance, Policy | Network policies |
| Cilium | eBPF | Advanced features | L7 visibility |
| Weave | Overlay (custom) | Multi-cloud | Encryption built-in |
| AWS VPC CNI | Native VPC | AWS EKS | No overlay overhead |
| Azure CNI | Native VNet | AKS | VNet integration |
Flannel
Architecture:
- Simple overlay network
- Uses VXLAN (or host-gw, UDP)
- flanneld daemon on each node
- Watches the Kubernetes API for node/pod CIDRs
Installation:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Calico
Architecture:
- Uses BGP for route distribution
- No overlay by default (but supports VXLAN)
- calico-node (Felix + BIRD) on each node
- Powerful network policies
Key features:
- No encapsulation = better performance
- eBPF dataplane option
- Robust network policies
- Supports huge clusters
Installation:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
Cilium
Architecture:
- Uses eBPF (extended Berkeley Packet Filter)
- Operates at kernel level
- L7 (HTTP, gRPC, Kafka) visibility
- Advanced security features
Key features:
- Replaces kube-proxy with eBPF
- L7-aware network policies
- Transparent encryption
- Service mesh integration
Installation:
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --namespace kube-system
10. Overlay Networks Explained
What is an Overlay Network?
An overlay network is a virtual network built on top of another network (the underlay). Pod traffic is encapsulated inside packets that the underlay network understands.
Overlay vs Underlay:
| Aspect | Underlay (Physical) | Overlay (Virtual) |
| IPs | Node IPs (192.168.x.x) | Pod IPs (10.244.x.x) |
| Scope | Data center | Kubernetes cluster |
| Managed by | Infrastructure team | Kubernetes/CNI |
| Visibility | Routers see it | Routers don't see it |
VXLAN Deep Dive
VXLAN (Virtual Extensible LAN) is the most common overlay technology.
VXLAN Header:
Outer Ethernet Header (14 bytes)
├── Destination MAC
├── Source MAC
└── EtherType: 0x0800 (IPv4)
Outer IP Header (20 bytes)
├── Source IP: 192.168.1.100 (Node 1)
├── Destination IP: 192.168.1.101 (Node 2)
└── Protocol: UDP
Outer UDP Header (8 bytes)
├── Source Port: <hash-based>
├── Destination Port: 4789 (VXLAN)
└── Length, Checksum
VXLAN Header (8 bytes)
├── Flags: 0x08 (VNI valid)
├── Reserved
└── VNI: 1 (24 bits = 16 million possible)
Inner Ethernet Header (14 bytes)
├── Inner Destination MAC
├── Inner Source MAC
└── EtherType
Inner IP Header + Payload
├── Source IP: 10.244.1.2 (Pod A)
├── Destination IP: 10.244.2.3 (Pod B)
└── Actual Data
MTU Considerations:
Standard MTU: 1500 bytes
VXLAN overhead: - 50 bytes
Inner packet max: 1450 bytes
# Check MTU on pod interface
kubectl exec -it my-pod -- cat /sys/class/net/eth0/mtu
Geneve (Alternative to VXLAN)
Geneve (Generic Network Virtualization Encapsulation) is a newer overlay protocol:
- More extensible than VXLAN
- Supports variable-length options
- Used by OVN (Open Virtual Network)
- UDP port 6081
IPinIP Encapsulation
Calico's IPinIP mode:
- Simpler than VXLAN (just IP header, no UDP)
- Less overhead (20 bytes vs 50)
- Doesn't work across NAT
Outer IP Header
├── Source: 192.168.1.100
├── Destination: 192.168.1.101
└── Protocol: 4 (IP-in-IP)
Inner IP Header + Payload
├── Source: 10.244.1.2
├── Destination: 10.244.2.3
└── Data
11. Kubernetes Services & kube-proxy
Why Services?
The Problem Services Solve:
- Pods are ephemeral (IPs change)
- Need stable endpoint for clients
- Need load balancing across pod replicas
- Need service discovery
Service Types
| Type | Scope | Use Case |
| ClusterIP | Internal only | Inter-service communication |
| NodePort | External via node | Development, simple external access |
| LoadBalancer | External via cloud LB | Production external access |
| ExternalName | DNS CNAME | Access external services |
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
type: ClusterIP # Default
selector:
app: my-app
ports:
- port: 80 # Service port
targetPort: 8080 # Pod port
How kube-proxy Works
kube-proxy runs on every node and implements service routing.
Three Modes:
| Mode | Mechanism | Performance |
| iptables | iptables rules | Good (default) |
| ipvs | IPVS (kernel LB) | Better (large clusters) |
| userspace | User-space proxy | Poor (legacy) |
iptables Mode (Default)
View iptables rules:
sudo iptables -t nat -L KUBE-SERVICES -n
sudo iptables -t nat -L KUBE-SVC-XXXXX -n
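For a ClusterIP service with two endpoints, the generated rules follow a consistent pattern. The sketch below uses iptables-save notation and is abridged; the KUBE-SVC/KUBE-SEP hashes and IPs are illustrative and will differ in a real cluster:
# KUBE-SERVICES: match the ClusterIP/port and jump to the per-service chain
-A KUBE-SERVICES -d 10.96.0.100/32 -p tcp -m tcp --dport 80 -j KUBE-SVC-EXAMPLEHASH1
# KUBE-SVC-*: pick a backend at random (probability 1/2, remainder falls through)
-A KUBE-SVC-EXAMPLEHASH1 -m statistic --mode random --probability 0.5 -j KUBE-SEP-EXAMPLEHASH2
-A KUBE-SVC-EXAMPLEHASH1 -j KUBE-SEP-EXAMPLEHASH3
# KUBE-SEP-*: DNAT to the chosen pod
-A KUBE-SEP-EXAMPLEHASH2 -p tcp -m tcp -j DNAT --to-destination 10.244.1.5:8080
-A KUBE-SEP-EXAMPLEHASH3 -p tcp -m tcp -j DNAT --to-destination 10.244.2.5:8080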
IPVS Mode
IPVS (IP Virtual Server) is a kernel-level load balancer:
- Uses hash tables (O(1) lookup vs O(n) for iptables)
- Multiple load balancing algorithms
- Better for large clusters (10,000+ services)
# Enable IPVS mode: edit the kube-proxy ConfigMap and set mode: "ipvs"
# (requires the ip_vs kernel modules on the nodes)
kubectl edit configmap kube-proxy -n kube-system
# Restart kube-proxy so it picks up the change
kubectl rollout restart daemonset kube-proxy -n kube-system
# View IPVS rules
sudo ipvsadm -Ln
IPVS Load Balancing Algorithms:
- rr (Round Robin)
- lc (Least Connection)
- sh (Source Hashing)
- dh (Destination Hashing)
- sed (Shortest Expected Delay)
- nq (Never Queue)
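The mode and scheduler live in the KubeProxyConfiguration inside the kube-proxy ConfigMap. A minimal sketch selecting least-connection:
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "lc"   # one of the algorithms listed above; defaults to rr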
Service Packet Flow
Complete flow: Pod A → Service → Pod B
| Step | Component | Action |
| 1 | Pod A | Sends to Service IP (10.96.0.100:80) |
| 2 | Node network stack | Packet leaves Pod A via its veth and reaches the node's PREROUTING hook |
| 3 | iptables/IPVS | DNAT: 10.96.0.100 → 10.244.2.5 (Pod B) |
| 4 | Routing | Determines path to 10.244.2.5 |
| 5 | CNI network | Delivers to Pod B (same or cross node) |
| 6 | Pod B | Receives packet, responds |
| 7 | iptables/IPVS | Reverse NAT on response |
| 8 | Pod A | Receives response |
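You can watch step 3 happen in the node's connection-tracking table (requires conntrack-tools; addresses here are illustrative):
# Entries for the service IP show the original destination and the DNAT'd pod address
sudo conntrack -L -d 10.96.0.100
# e.g. tcp ... src=10.244.1.2 dst=10.96.0.100 dport=80 ... src=10.244.2.5 dst=10.244.1.2 ...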
12. DNS in Kubernetes
CoreDNS
CoreDNS is the default DNS server in Kubernetes. It provides:
- Service discovery via DNS
- Pod DNS records
- External DNS resolution
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get svc -n kube-system kube-dns
DNS Records
Service DNS:
<service>.<namespace>.svc.cluster.local
Example:
my-service.default.svc.cluster.local → 10.96.0.100
Pod DNS:
<pod-ip-dashed>.<namespace>.pod.cluster.local
Example:
10-244-1-5.default.pod.cluster.local → 10.244.1.5
Headless Service DNS:
# Headless service (clusterIP: None)
<service>.<namespace>.svc.cluster.local → Returns all pod IPs
# Individual pods
<pod-name>.<service>.<namespace>.svc.cluster.local
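A minimal headless Service sketch (name and selector are illustrative). With clusterIP: None no virtual IP is allocated, so DNS returns one A record per ready pod:
apiVersion: v1
kind: Service
metadata:
  name: my-headless
spec:
  clusterIP: None      # headless: DNS returns the pod IPs directly
  selector:
    app: my-app
  ports:
  - port: 80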
Pod DNS Configuration
apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
containers:
- name: app
image: nginx
dnsPolicy: ClusterFirst # Default
dnsConfig: # Optional customization
nameservers:
- 8.8.8.8
searches:
- my-namespace.svc.cluster.local
DNS Policies:
| Policy | Behavior |
|--------|----------|
| ClusterFirst | Use cluster DNS, fall back to node DNS |
| Default | Inherit node's DNS config |
| ClusterFirstWithHostNet | For hostNetwork pods |
| None | Only use dnsConfig settings |
Resolv.conf in Pods
kubectl exec -it my-pod -- cat /etc/resolv.conf
# Output:
# nameserver 10.96.0.10
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5
The ndots:5 setting:
- If hostname has fewer than 5 dots, try search domains first
- my-service → tries my-service.default.svc.cluster.local first
- Reduces DNS lookups for internal services
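If a pod mostly talks to external FQDNs, you can lower ndots per pod via dnsConfig. A sketch (the pod name is hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: low-ndots-pod
spec:
  containers:
  - name: app
    image: nginx
  dnsConfig:
    options:
    - name: ndots      # overrides the cluster default of 5 for this pod
      value: "2"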
DNS Debugging
# Run a debug pod
kubectl run dnsutils --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 --command -- sleep 3600
# Test DNS resolution
kubectl exec -it dnsutils -- nslookup kubernetes.default
kubectl exec -it dnsutils -- nslookup my-service.my-namespace
# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns
13. Network Policies
What are Network Policies?
Network Policies are Kubernetes resources that control traffic flow at the IP/port level (L3/L4).
Default Behavior:
By default, pods are non-isolated:
- Accept traffic from any source
- Can send traffic to any destination
Network Policies change this to default-deny for selected pods.
Network Policy Structure
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: my-policy
namespace: default
spec:
podSelector: # Which pods this applies to
matchLabels:
app: backend
policyTypes:
- Ingress # Control incoming traffic
- Egress # Control outgoing traffic
ingress: # Ingress rules
- from:
- podSelector: # Allow from pods with label
matchLabels:
app: frontend
- namespaceSelector: # Allow from namespace
matchLabels:
env: prod
- ipBlock: # Allow from IP range
cidr: 10.0.0.0/8
ports:
- protocol: TCP
port: 8080
egress: # Egress rules
- to:
- podSelector:
matchLabels:
app: database
ports:
- protocol: TCP
port: 5432
Common Patterns
1. Default Deny All Ingress:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-ingress
spec:
podSelector: {} # Applies to all pods
policyTypes:
- Ingress
# No ingress rules = deny all
2. Allow Only from Same Namespace:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-same-namespace
spec:
podSelector: {}
policyTypes:
- Ingress
ingress:
- from:
- podSelector: {} # Any pod in same namespace
3. Allow Specific App to Database:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-backend-to-db
spec:
podSelector:
matchLabels:
app: database
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: backend
ports:
- protocol: TCP
port: 5432
CNI Support
Important: Network Policies require CNI support!
| CNI | Network Policy Support |
| Flannel | ❌ No |
| Calico | ✅ Yes |
| Cilium | ✅ Yes (L3/L4/L7) |
| Weave | ✅ Yes |
| AWS VPC CNI | ✅ With Calico |
14. Debugging Pod Networking
Essential Tools
# Network tools pod
kubectl run netshoot --image=nicolaka/netshoot --command -- sleep 3600
kubectl exec -it netshoot -- bash
Inside the Pod
# Check IP and interfaces
ip addr
ip route
# DNS resolution
nslookup kubernetes.default
dig my-service.my-namespace.svc.cluster.local
# Test connectivity
ping 10.244.2.5
curl http://my-service:80
nc -zv my-service 80
# Trace route
traceroute 10.244.2.5
mtr 10.244.2.5
On the Node
# Check CNI configuration
ls -la /etc/cni/net.d/
cat /etc/cni/net.d/10-flannel.conflist
# Check CNI plugin logs
journalctl -u kubelet | grep -i cni
# View bridge and veth pairs
ip link show type bridge
ip link show type veth
bridge link show
# View routing table
ip route
ip route get 10.244.2.5
# Check iptables rules
sudo iptables -t nat -L -n -v | grep -i kube
sudo iptables -L -n -v | grep -i cali # Calico
# Check IPVS (if enabled)
sudo ipvsadm -Ln
Common Issues & Solutions
| Symptom | Possible Cause | Debug Steps |
| Pod stuck in ContainerCreating | CNI plugin issue | Check kubelet logs, CNI config |
| Can't reach other pods | Routing issue | Check ip route, CNI status |
| Can't resolve DNS | CoreDNS issue | Check CoreDNS pods, test with IP |
| Service not working | kube-proxy issue | Check iptables rules, endpoints |
| Intermittent connectivity | MTU mismatch | Check MTU on all interfaces |
| Cross-node fails | Overlay issue | Check VXLAN/tunnel interface |
Debug Checklist
# 1. Is the pod running?
kubectl get pod my-pod -o wide
# 2. Does it have an IP?
kubectl get pod my-pod -o jsonpath='{.status.podIP}'
# 3. Are endpoints registered?
kubectl get endpoints my-service
# 4. Is DNS working?
kubectl exec -it my-pod -- nslookup kubernetes.default
# 5. Can it reach the service IP?
kubectl exec -it my-pod -- curl -v http://10.96.0.100
# 6. Can it reach pod IP directly?
kubectl exec -it my-pod -- curl -v http://10.244.2.5:8080
# 7. Check network policies
kubectl get networkpolicies
# 8. Check node connectivity
kubectl get nodes -o wide
# SSH to nodes and ping each other
15. Best Practices & Common Issues
Best Practices
1. Choose the Right CNI:
| Requirement | Recommended CNI |
|-------------|-----------------|
| Simple setup | Flannel |
| Network policies | Calico, Cilium |
| Maximum performance | Calico (no overlay) |
| L7 visibility | Cilium |
| Cloud native | AWS/Azure/GCP CNI |
2. Plan Your IP Addressing:
# Ensure no overlap between:
# - Node network (e.g., 192.168.0.0/16)
# - Pod network (e.g., 10.244.0.0/16)
# - Service network (e.g., 10.96.0.0/12)
# Leave room for growth
# /16 for pods = 65,536 IPs
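With kubeadm, for example, both ranges are fixed at cluster creation:
# Example kubeadm invocation pinning the pod and service CIDRs
sudo kubeadm init \
  --pod-network-cidr=10.244.0.0/16 \
  --service-cidr=10.96.0.0/12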
3. Consider MTU:
# Base MTU: 1500
# VXLAN overhead: -50
# IPinIP overhead: -20
# Set MTU in CNI config or via environment
# Flannel: FLANNEL_MTU=1450
4. Implement Network Policies:
- Start with default-deny
- Allow only required traffic
- Use namespace isolation
- Monitor with tools like Hubble (Cilium)
5. Monitor Network Performance:
# Tools
- Prometheus + Grafana
- Cilium Hubble
- Weave Scope
- kubectl top
Common Issues
Issue 1: Pod can't reach external internet
# Check NAT/masquerade rules
sudo iptables -t nat -L POSTROUTING -n -v
# Ensure IP forwarding is enabled
cat /proc/sys/net/ipv4/ip_forward
# Should be 1
Issue 2: DNS resolution slow
# Check ndots setting
kubectl exec -it my-pod -- cat /etc/resolv.conf
# For external domains, use FQDN with trailing dot
curl http://google.com. # Note the trailing dot
Issue 3: Connection timeouts on services
# Check if endpoints exist
kubectl get endpoints my-service
# Check if pods are ready
kubectl get pods -l app=my-app
# Check kube-proxy
kubectl logs -n kube-system -l k8s-app=kube-proxy
Issue 4: Cross-node communication fails
# Check if nodes can reach each other
ping <other-node-ip>
# Check VXLAN interface (Flannel)
ip -d link show flannel.1
# Check for firewall rules blocking UDP 4789 (VXLAN)
sudo iptables -L INPUT -n -v | grep 4789
16. Hands-On Examples
Example 1: Observe Same-Node Communication
# Create two pods on the same node
kubectl run pod1 --image=nicolaka/netshoot --command -- sleep 3600
kubectl run pod2 --image=nicolaka/netshoot --command -- sleep 3600
# Check which node each pod landed on (to force the same node, re-create one
# with spec.nodeName set, as in Example 2)
kubectl get pods -o wide
# Get pod IPs
POD1_IP=$(kubectl get pod pod1 -o jsonpath='{.status.podIP}')
POD2_IP=$(kubectl get pod pod2 -o jsonpath='{.status.podIP}')
# Test connectivity
kubectl exec -it pod1 -- ping -c 3 $POD2_IP
# Trace route (should be 1 hop via bridge)
kubectl exec -it pod1 -- traceroute $POD2_IP
Example 2: Observe Cross-Node Communication
# Create pods on different nodes
kubectl run pod-node1 --image=nicolaka/netshoot \
--overrides='{"spec":{"nodeName":"node1"}}' \
--command -- sleep 3600
kubectl run pod-node2 --image=nicolaka/netshoot \
--overrides='{"spec":{"nodeName":"node2"}}' \
--command -- sleep 3600
# Get IPs
NODE1_POD_IP=$(kubectl get pod pod-node1 -o jsonpath='{.status.podIP}')
NODE2_POD_IP=$(kubectl get pod pod-node2 -o jsonpath='{.status.podIP}')
# Test connectivity
kubectl exec -it pod-node1 -- ping -c 3 $NODE2_POD_IP
# Trace route (should show node hop)
kubectl exec -it pod-node1 -- traceroute $NODE2_POD_IP
# On the node, capture VXLAN traffic
sudo tcpdump -i eth0 -n udp port 4789
Example 3: Service Load Balancing
# Create deployment with 3 replicas
kubectl create deployment web --image=nginx --replicas=3
# Expose as service
kubectl expose deployment web --port=80
# Get service IP
SVC_IP=$(kubectl get svc web -o jsonpath='{.spec.clusterIP}')
# Test load balancing
kubectl run test --image=busybox --rm -it --restart=Never -- \
sh -c "for i in 1 2 3 4 5; do wget -qO- $SVC_IP | head -1; done"
# Watch endpoints
kubectl get endpoints web -w
Example 4: Implement Network Policy
# Create namespaces
kubectl create namespace frontend
kubectl create namespace backend
# Deploy apps
kubectl run web --image=nginx -n frontend
kubectl run api --image=nginx -n backend
kubectl run db --image=nginx -n backend --labels="app=database"
# Expose db as a Service so the DNS name db.backend resolves
kubectl expose pod db -n backend --port=80
# Apply network policy (only api can reach db)
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: db-policy
namespace: backend
spec:
podSelector:
matchLabels:
app: database
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
run: api
ports:
- protocol: TCP
port: 80
EOF
# Test: api can reach db
kubectl exec -it api -n backend -- curl -s --max-time 3 db.backend
# Test: web cannot reach db (should timeout)
kubectl exec -it web -n frontend -- curl -s --max-time 3 db.backend
# Should fail!
Example 5: Debug DNS Issues
# Create debug pod
kubectl run dnstest --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 \
--command -- sleep 3600
# Check resolv.conf
kubectl exec dnstest -- cat /etc/resolv.conf
# Test internal resolution
kubectl exec dnstest -- nslookup kubernetes.default.svc.cluster.local
# Test external resolution
kubectl exec dnstest -- nslookup google.com
# Check CoreDNS
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system -l k8s-app=kube-dns
# Test with specific DNS server
kubectl exec dnstest -- nslookup kubernetes.default 10.96.0.10
Conclusion
You now understand the complete picture of Kubernetes pod networking:
✅ Linux Fundamentals: Network namespaces, veth pairs, bridges, iptables
✅ Same-Node Communication: Bridge-based L2 switching between pods
✅ Cross-Node Communication: Overlay networks (VXLAN) or L3 routing (BGP)
✅ CNI Plugins: Flannel, Calico, Cilium - when to use each
✅ Services & kube-proxy: How ClusterIP/NodePort work with iptables/IPVS
✅ DNS: CoreDNS, service discovery, pod DNS
✅ Network Policies: Implementing zero-trust networking
✅ Debugging: Tools and techniques for troubleshooting
Quick Reference
| Communication | Path |
| Same-node | Pod → veth → Bridge → veth → Pod |
| Cross-node (overlay) | Pod → veth → Bridge → VTEP → Node Network → VTEP → Bridge → veth → Pod |
| Cross-node (L3) | Pod → veth → Bridge → Node routing → Node Network → Node routing → Bridge → veth → Pod |
| Pod-to-Service | Pod → iptables/IPVS DNAT → (same/cross-node path) → Backend Pod |
Next Steps
- Hands-on: Set up a cluster with different CNIs and compare
- Deep dive: Explore eBPF with Cilium
- Security: Implement comprehensive network policies
- Observability: Deploy Hubble or Weave Scope
- Service Mesh: Explore Istio or Linkerd for advanced traffic management

