Carrier-grade NAT demo on Linux
Stefan Bühler 4847314db5 fix graph (again) 2021-04-25 20:02:32 +02:00

Carrier-grade NAT demo (work in progress)

Current state: cross-VRF routing is working, but NAT breaks it.

The conntrack log shows that the state entry is destroyed immediately after it is created, and the packet is "lost" between up and muplink.
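One way to observe this is sketched below, using the `conntrack` and `nft` tools listed under Dependencies. The table name `trace_demo` and chain name `pre` are made up for this sketch and are not taken from nft.conf. Commands are only printed unless `DRY_RUN=0` is set; to actually run them, use a root shell inside the demo's network namespace (for example one of the tmux windows).

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# Flag every incoming packet for tracing before conntrack sees it.
# "trace_demo" and "pre" are hypothetical names chosen for this sketch.
enable_trace() {
    run nft add table inet trace_demo
    run nft add chain inet trace_demo pre '{ type filter hook prerouting priority -350; }'
    run nft add rule inet trace_demo pre meta nftrace set 1
}

enable_trace
# Then, in two separate shells:
#   conntrack -E        # stream conntrack events (NEW, UPDATE, DESTROY)
#   nft monitor trace   # show which hooks and rules each traced packet passes
```

Note that the dry-run output drops the quoting around the chain definition, so it is meant for reading rather than for pasting back into a shell.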

The basic idea of 100.64.0.0/10 seems to be that a CGN router should be able to handle multiple interfaces using 100.64.0.0/10 (including an uplink) while keeping them separated.

In theory this should work by moving each interface (apart from the uplink) into its own network namespace, connecting every such namespace to the main one with a veth pair (using some other IP addresses...), applying SNAT when forwarding packets into the main namespace, and applying SNAT again when forwarding to the uplink.

This demo tries to use VRFs instead; hopefully this means packets have to be NATed only once (and no additional local IP addresses are needed).

To test it yourself, run ./cgnat-demo.sh as root (it doesn't need network access, so feel free to use some isolated container/VM/...):

  • spawns tmux with multiple windows after setup is done (ip vrf/netns exec ... and others)
  • tmux is configured to use ctrl-a prefix (like screen)
  • tmux shouldn't be detached; the default detach key binding (ctrl-a d) is replaced with a prompt to destroy the session

Dependencies:

  • nftables for NAT / trace
  • conntrack to show conntrack events
  • tmux to open shells in various contexts
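A quick pre-flight check for these tools could look like the sketch below. The helper name `missing_deps` is made up for this example, and `ip` (from iproute2) is added on the assumption that the demo script relies on it for the VRF and namespace setup.

```shell
#!/bin/sh
# Print each command from the argument list that is not found in PATH.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

missing=$(missing_deps ip nft conntrack tmux)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names end up on one line.
    echo "missing tools:" $missing >&2
else
    echo "all dependencies found"
fi
```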

Example pings

  • Working in blue_c2:
    • ping -I 192.0.2.2 192.0.2.1 - ping uplink "public" IP
    • ping 100.64.0.1 - ping blue_c1
    • ping 2001:db8:b:10::1 - ping blue_c1
    • ping 100.127.255.254 - ping gateway
    • ping 2001:db8:b:10::ffff - ping gateway
    • ping 2001:db8:b:20::1 - ping red_c1
    • ping 2001:db8:a::ffff - ping uplink
    • ping 2001:db8:a::1 - ping main (i.e. up:muplink)
  • Broken everywhere but uplink:
    • ping 192.0.2.1
  • Broken in up:
    • ping 100.127.255.254 (works as soon as NAT gets disabled)
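Outside the prepared tmux windows, the same pings can be started by entering the namespace or VRF explicitly. The sketch below only prints the commands unless `DRY_RUN=0` is set; the helper name `ping_from` and the `-c 3` packet count are choices made for this example, and it assumes a root shell inside the demo's namespace.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# usage: ping_from NAMESPACE [PING_OPTIONS...] TARGET
ping_from() {
    ns=$1
    shift
    run ip netns exec "$ns" ping -c 3 "$@"
}

ping_from blue_c2 -I 192.0.2.2 192.0.2.1   # uplink "public" IP, un-NATed source
ping_from blue_c2 100.64.0.1               # blue_c1
ping_from blue_c2 2001:db8:b:20::1         # red_c1
ping_from blue_c2 192.0.2.1                # listed as broken above

# VRF up lives in the main namespace, so it is entered with "ip vrf exec".
run ip vrf exec up ping -c 3 100.127.255.254
```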

Basic design

  • Run everything in a separate network+mount+UTS namespace
  • Explicit VRFs for everything, including the uplink
    • Uplink VRF (up) with muplink interface
    • Two client VRFs (blue and red), each with a bridge to connect clients to
  • Simulate an uplink with one client (in namespace uplink)
  • Simulate two clients in VRF blue (namespaces blue_c1 and blue_c2)
  • Simulate one client in VRF red (namespace red_c1)
  • IPv4: NAT from client VRFs (blue and red) to uplink up
  • IPv6: no NAT, proper routing
  • Route 192.0.2.2 from uplink all the way through to blue_c2 (test IPv4 cross-VRF connectivity without NAT)
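The IPv4 NAT item can be illustrated with a minimal nftables ruleset. This is a sketch only: it is not the content of nft.conf, the table name `cgnat_sketch` is made up, and given the current state described at the top it should not be expected to work as-is. The script just prints the ruleset; nothing is loaded.

```shell
#!/bin/sh
# Emit an nftables ruleset on stdout; this script does not apply anything.
# To try it (as root, inside the demo namespace): sh this-script.sh | nft -f -
nat_ruleset() {
    cat <<'EOF'
table ip cgnat_sketch {
    chain postrouting {
        type nat hook postrouting priority srcnat; policy accept;
        # Client traffic leaving through the uplink gets the uplink-side address.
        # 192.0.2.2 is outside 100.64.0.0/10 and therefore stays un-NATed.
        oifname "muplink" ip saddr 100.64.0.0/10 masquerade
    }
}
EOF
}

nat_ruleset
```

Matching on the source prefix rather than on the input interface is a deliberate simplification here, since it is not obvious which device name a forwarded packet carries after crossing VRFs.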

Topology:

+--------------------+ +-----------------------+     +--------------------+
| uplink:            | | main:                 |     | blue_c1:           |
|   lo               | |  lo                   |     |   lo               |
|                    | |  up (vrf)             |  +--=-> cuplink (veth)   |
|   client1 (veth) <-=-=--> muplink (veth)     |  |  +--------------------+
+--------------------+ |  blue (vrf)           |  |
                       |    br-blue (bridge)   |  |  +--------------------+
                       |      blue_c1 (veth) <-=--+  | blue_c2:           |
                       |      blue_c2 (veth) <-=--+  |   lo               |
                       |  red (vrf)            |  +--=-> cuplink (veth)   |
                       |    br-red (bridge)    |     +--------------------+
                       |      red_c1 (veth) <--=--+
                       +-----------------------+  |  +--------------------+
                                                  |  | red_c1:            |
                                                  |  |   lo               |
                                                  +--=-> cuplink (veth)   |
                                                     +--------------------+
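The links in the diagram could be created roughly as follows. This is a sketch under assumptions, not an excerpt from cgnat-demo.sh: the routing table numbers (10, 20, 30) and the helper names are invented, and commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# usage: add_vrf NAME TABLE   (table numbers are assumptions)
add_vrf() {
    run ip link add "$1" type vrf table "$2"
    run ip link set dev "$1" up
}

# usage: add_client NAMESPACE BRIDGE
# Creates the namespace and a veth pair: NAMESPACE (main side) <-> cuplink.
add_client() {
    run ip netns add "$1"
    run ip link add "$1" type veth peer name cuplink netns "$1"
    run ip link set dev "$1" master "$2" up
    run ip -n "$1" link set dev cuplink up
}

build_topology() {
    add_vrf up 10
    add_vrf blue 20
    add_vrf red 30

    # Uplink side: muplink joins VRF "up", its peer client1 lives in "uplink".
    run ip netns add uplink
    run ip link add muplink type veth peer name client1 netns uplink
    # "master up" names the VRF device; the trailing "up" sets the link state.
    run ip link set dev muplink master up up
    run ip -n uplink link set dev client1 up

    # One bridge per client VRF.
    for vrf in blue red; do
        run ip link add "br-$vrf" type bridge
        run ip link set dev "br-$vrf" master "$vrf" up
    done

    add_client blue_c1 br-blue
    add_client blue_c2 br-blue
    add_client red_c1 br-red
}

build_topology
```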

Basic VRF setup

Proper VRF ip rule setup, with an unreachable rule that applies if the lookup in the VRF table didn't succeed:

1000:   from all lookup [l3mdev-table]
2000:   from all lookup [l3mdev-table] unreachable
32765:  from all lookup local
32766:  from all lookup main

(+ lookup default in IPv4)
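A rule set like this could be produced with the commands sketched below; whether fix-vrf-rules.sh does exactly this is an assumption. The sketch relies on the kernel adding the l3mdev rule at preference 1000 by itself when the first VRF device is created, so only the remaining rules are handled. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

vrf_rules() {
    for fam in -4 -6; do
        # Stop the lookup with "unreachable" when the VRF table had no match,
        # instead of falling through to the local and main tables.
        run ip "$fam" rule add pref 2000 l3mdev unreachable
        # Move "lookup local" from preference 0 to behind the VRF rules.
        run ip "$fam" rule add pref 32765 table local
        run ip "$fam" rule del pref 0
    done
}

vrf_rules
```

The new local rule is added before the old one is deleted so that there is never a moment without a lookup in the local table.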

uplink configuration

  • Address 192.0.2.1/32 on lo
  • Addresses 100.127.255.254/10 and 2001:db8:a::ffff/64 on client1
  • Route 2001:db8:b::/48 via 2001:db8:a::1 dev client1
  • Route 192.0.2.2 via 100.64.0.1 dev client1
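Assuming this list describes the `uplink` namespace (which is where `client1` sits in the topology), the settings translate to iproute2 commands roughly as follows. Bringing up `lo` is an addition of this sketch; commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

uplink_config() {
    run ip -n uplink link set dev lo up
    run ip -n uplink addr add 192.0.2.1/32 dev lo
    run ip -n uplink addr add 100.127.255.254/10 dev client1
    run ip -n uplink addr add 2001:db8:a::ffff/64 dev client1
    # Everything behind the CGN router is reached via its muplink addresses.
    run ip -n uplink route add 2001:db8:b::/48 via 2001:db8:a::1 dev client1
    run ip -n uplink route add 192.0.2.2/32 via 100.64.0.1 dev client1
}

uplink_config
```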

main:up configuration

  • Addresses 100.64.0.1/10 and 2001:db8:a::1/64 on muplink
  • Route default via 100.127.255.254 dev muplink and default via 2001:db8:a::ffff dev muplink
  • Route 2001:db8:b:10::/64 dev blue (forward to VRF blue)
  • Route 2001:db8:b:20::/64 dev red (forward to VRF red)
  • Route 192.0.2.2 dev blue (forward to VRF blue)
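Expressed as iproute2 commands, the VRF up settings could look like the sketch below. It uses the `vrf NAME` selector of `ip route` so that no table numbers have to be assumed; commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

up_config() {
    run ip addr add 100.64.0.1/10 dev muplink
    run ip addr add 2001:db8:a::1/64 dev muplink
    run ip -4 route add default via 100.127.255.254 dev muplink vrf up
    run ip -6 route add default via 2001:db8:a::ffff dev muplink vrf up
    # Cross-VRF routes: the output device is the target VRF device itself.
    run ip route add 2001:db8:b:10::/64 dev blue vrf up
    run ip route add 2001:db8:b:20::/64 dev red vrf up
    run ip route add 192.0.2.2/32 dev blue vrf up
}

up_config
```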

main:blue configuration

  • Addresses 100.127.255.254/10 and 2001:db8:b:10::ffff/64 on br-blue
  • Route default dev up (IPv4 + IPv6) - forward to VRF up
  • Route 192.0.2.2 dev br-blue (connected in blue_c2)

main:red configuration

  • Addresses 100.127.255.254/10 and 2001:db8:b:20::ffff/64 on br-red
  • Route default dev up (IPv4 + IPv6) - forward to VRF up
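Since blue and red differ only in the IPv6 subnet and the extra 192.0.2.2 route, both client VRFs can be covered by one helper. The function name `client_vrf_config` is made up for this sketch; commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# usage: client_vrf_config VRF V6SUBNET   (e.g. "blue 10" or "red 20")
client_vrf_config() {
    br="br-$1"
    # The same IPv4 gateway address is reused in every client VRF.
    run ip addr add 100.127.255.254/10 dev "$br"
    run ip addr add "2001:db8:b:$2::ffff/64" dev "$br"
    # Default routes point at the VRF device "up", i.e. forward to VRF up.
    run ip -4 route add default dev up vrf "$1"
    run ip -6 route add default dev up vrf "$1"
}

client_vrf_config blue 10
# Only blue carries the un-NATed test address, which is connected in blue_c2.
run ip route add 192.0.2.2/32 dev br-blue vrf blue
client_vrf_config red 20
```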

client configuration

  • Addresses on cuplink:
    • blue_c1: 100.64.0.1/10 and 2001:db8:b:10::1/64
    • blue_c2: 100.64.0.2/10 and 2001:db8:b:10::2/64, also 192.0.2.2/32
    • red_c1: 100.64.0.1/10 and 2001:db8:b:20::1/64
  • Route default via 100.127.255.254 dev cuplink
  • Route default via 2001:db8:b:$$$$::ffff dev cuplink (depending on blue/red)
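The per-client settings could be applied as sketched below, with the `$$$$` placeholder filled in as 10 for blue and 20 for red, matching the addresses listed above. The helper name `client_config` is made up, bringing up `lo` is an addition of this sketch, and commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# usage: client_config NAMESPACE V4ADDR V6SUBNET V6HOST
client_config() {
    ns=$1
    run ip -n "$ns" link set dev lo up
    run ip -n "$ns" addr add "$2/10" dev cuplink
    run ip -n "$ns" addr add "2001:db8:b:$3::$4/64" dev cuplink
    run ip -n "$ns" -4 route add default via 100.127.255.254 dev cuplink
    run ip -n "$ns" -6 route add default via "2001:db8:b:$3::ffff" dev cuplink
}

client_config blue_c1 100.64.0.1 10 1
client_config blue_c2 100.64.0.2 10 2
# Extra address used to test cross-VRF IPv4 routing without NAT.
run ip -n blue_c2 addr add 192.0.2.2/32 dev cuplink
# red_c1 reuses 100.64.0.1 on purpose: the VRFs keep the overlap apart.
client_config red_c1 100.64.0.1 20 1
```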

TODO

  • get NAT working
  • test whether one can route to lo instead of VRF up (and drop VRF up), or whether there are other ways to do cross-VRF routing