Interface Management

Ze manages Linux network interfaces via pure netlink (no iproute2 shell-outs). It uses a JunOS-style two-layer model: physical interfaces with named logical units.

Capability Table

| Category | Feature | Status | Priority / Notes |
|---|---|---|---|
| Interface Types | Ethernet (physical) | have | |
| | Dummy (virtual) | have | |
| | Veth pairs | have | |
| | Bridge (basic) | have | |
| | VLAN 802.1Q | have | |
| | VLAN 802.1p QoS maps (ingress/egress PCP-priority) | have | |
| | Class-of-service named profiles (802.1p, interface-level inheritance) | have | |
| | Loopback | have | |
| | Bonding / LACP | missing | high |
| | VXLAN | missing | medium |
| | GRE / GRETAP / IPIP / SIT tunnels | have | |
| | IP6GRE / IP6GRETAP / IP6TNL / IPIP6 tunnels | have | |
| | ERSPAN | missing | lower |
| | WireGuard (declarative peers, $9$-encoded keys) | have | |
| | MACsec | missing | lower |
| | MACVLAN | missing | lower |
| | Geneve | missing | lower |
| | PPPoE | missing | lower |
| | L2TPv3 | missing | lower |
| | WiFi | missing | lower |
| | XFRM (route-based IPsec, if_id) | have | |
| | VTI (legacy IPsec tunnel) | missing | lower |
| | QinQ (802.1ad) | missing | lower |
| Logical Model | Two-layer physical + unit | have | |
| | Unit 0 implicit | have | |
| | VLAN units create subinterfaces | have | |
| Lifecycle | Create (dummy, veth, bridge, VLAN, tunnel) | have | |
| | Delete | have | |
| | Auto-up on create | have | |
| | Admin state control (explicit up/down) | have | |
| | Interface rename | missing | lower |
| Address Management | Add/remove IPv4/IPv6 CIDR | have | |
| | DAD-aware monitoring | have | |
| | MAC address set/get | have | |
| | Gratuitous ARP on add | missing | medium |
| | Neighbor table (ARP/NDP) | missing | lower |
| DHCP | DHCPv4 (RFC 2131, full DORA) | have | |
| | DHCPv6 (SARR, IA_NA, IA_PD) | have | |
| | Concurrent v4+v6 | have | |
| | Direct netlink install | have | |
| | Bus events (acquired/renewed/expired) | have | |
| | Config-driven (dhcp { enabled true }) | have | |
| | Default route from DHCP Router option | have | |
| | Route priority per unit (route-priority) | have | |
| | Link-down route deprioritization (metric + 1024) | have | |
| | IPv6 default route from RA with configurable metric | have | |
| | DNS from DHCP to /tmp/resolv.conf | have | |
| | Hostname in DHCPv4 (option 12) | have | |
| | Client-ID in DHCPv4 (option 61) | have | |
| | NTP servers from DHCP (option 42) | have | |
| | DHCPv6 proper Renew (not re-solicit) | missing | medium |
| | DHCP relay | missing | lower |
| | DHCP server | missing | lower |
| Monitoring | Netlink multicast (link + addr + neigh) | have | |
| | Virtual iface state detection | have | |
| | 11 bus topics, JSON payloads | have | |
| | Interface statistics/counters | have | |
| | Persistent counter tracking | missing | medium |
| Per-Interface Tuning | IPv4 forwarding | have | |
| | ARP filter / ARP accept | have | |
| | IPv6 autoconf (SLAAC) | have | |
| | IPv6 accept-ra (0/1/2) | have | |
| | IPv6 forwarding | have | |
| | Proxy ARP | have | |
| | ARP announce / ARP ignore | have | |
| | RPF / source validation | have | |
| | TCP MSS clamping (v4+v6) | missing | medium |
| | Directed broadcast | missing | lower |
| IPv6 Extended | EUI-64 address generation | missing | medium |
| | DAD configuration (messages, accept) | missing | medium |
| | Custom interface identifiers | missing | lower |
| Traffic Mirroring | Ingress/egress via tc mirred | have | |
| | Idempotent setup/cleanup | have | |
| | Traffic redirect (vs mirror) | missing | lower |
| Traffic Control | QoS / shaping | missing | lower |
| | Policing | missing | lower |
| | Queuing disciplines | missing | lower |
| Migration | Make-before-break 5-phase | have | |
| | BGP readiness signaling | have | |
| | Per-phase rollback | have | |
| BGP Integration | Reactor subscribes to addr events | have | |
| | Listener start/stop on addr change | have | |
| | bgp/listener/ready publish | have | |
| Bridge Features | Create and bring up | have | |
| | STP | have | |
| | VLAN filtering | missing | medium |
| | Add/remove member ports | have | |
| | Multicast snooping | missing | lower |
| | Port isolation | missing | lower |
| | Ageing/forward delay/hello/max age | missing | lower |
| Bonding | Mode selection | missing | high |
| | Hash policy | missing | high |
| | LACP rate | missing | high |
| | MII monitoring | missing | high |
| | Min active links | missing | high |
| | Member management | missing | high |
| VRF | YANG leaf exists | partial | high |
| | Route table isolation | missing | high |
| | Per-VRF address assignment | missing | high |
| | VRF-aware DHCP | missing | medium |
| Gateway Redundancy | VRRP / keepalived | missing | medium |
| | Virtual MAC | missing | medium |
| | State monitoring/failover | missing | medium |
| Physical Layer | Speed / duplex / autoneg | missing | medium |
| | Hardware offload (GRO/GSO/TSO/LRO) | missing | medium |
| | Ring buffer sizing | missing | lower |
| | RPS / RFS | missing | lower |
| | ethtool integration | missing | medium |
| Security | 802.1X / EAPoL | missing | lower |
| | Storm control | missing | lower |
| Configuration | YANG model (all types + units) | have | |
| | Input validation (name, VLAN, MTU) | have | |
| Platform | Pluggable backend interface | have | |
| | Linux netlink backend (default) | have | |
| | YANG backend leaf (config-driven selection) | have | |
| | VPP backend (ifacevpp, via GoVPP) | have | ResetCounters via sw_interface_clear_stats; ListKernelRoutes via ip_route_v2_dump; RouteLookup via IPRouteLookupV2 (VPP FIB is authoritative); ListNeighbors via ip_neighbor_dump; VXLAN/GRE/IPIP/LCP/stats-socket/mirror/STP pending third-party vendoring |
| | macOS / Darwin | missing | lower |
| | FreeBSD / OpenBSD | missing | lower |
| | systemd-networkd | missing | lower |
| Quality | Context-wrapped errors | have | |
| | Panic recovery | have | |
| | 14 test files (unit + integration) | have | |

Architecture

Config (YANG: ze-iface-conf.yang, "backend" leaf selects backend)
  |
  v
iface component (register.go) -- OnConfigure() loads backend, starts monitor
  |
  v
Backend interface (backend.go) -- 34 methods: lifecycle, address, sysctl, mirror, monitor
  |
  v
+------------------+--------------------+
|                  |                    |
netlink backend    DHCP plugin          (future: networkd, FreeBSD)
(ifacenetlink/)    (ifacedhcp/)
|                  |
netlink calls      lease negotiation
|                  |
v                  v
Bus topics -- interface/{created,deleted,up,down,addr/*,dhcp/*}
  |
  v
Subscribers -- BGP reactor (starts/stops listeners on address changes)

Interface Discovery

During ze init, Ze discovers OS network interfaces and generates initial configuration entries. The DiscoverInterfaces function enumerates interfaces via ListInterfaces (netlink on Linux, stdlib on other platforms) and classifies each by Ze type: ethernet, bridge, veth, dummy, or loopback. On Linux, the netlink link type "device" maps to ethernet (except lo, which maps to loopback). Results are sorted by type, then by name.
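The classification and ordering described above can be sketched as follows. This is an illustration only, not the DiscoverInterfaces implementation: the discovered struct, classify, and sortDiscovered are hypothetical names, the link type strings are assumed to follow the values reported by vishvananda/netlink, and lexical ordering is an assumption about what "sorted by type, then by name" means.

```go
package main

import (
	"fmt"
	"sort"
)

// discovered is a hypothetical record for one OS interface.
type discovered struct {
	Name string
	Type string // Ze type: ethernet, bridge, veth, dummy, or loopback
}

// classify maps a netlink link type to a Ze type. An empty result means
// the link is not classified by this sketch.
func classify(name, linkType string) string {
	switch linkType {
	case "bridge", "veth", "dummy":
		return linkType
	case "device":
		// Plain devices are ethernet, except the loopback device.
		if name == "lo" {
			return "loopback"
		}
		return "ethernet"
	}
	return ""
}

// sortDiscovered orders results by type, then by name (lexical order assumed).
func sortDiscovered(list []discovered) {
	sort.Slice(list, func(i, j int) bool {
		if list[i].Type != list[j].Type {
			return list[i].Type < list[j].Type
		}
		return list[i].Name < list[j].Name
	})
}

func main() {
	links := [][2]string{{"eth1", "device"}, {"lo", "device"}, {"br0", "bridge"}, {"eth0", "device"}}
	var out []discovered
	for _, l := range links {
		if t := classify(l[0], l[1]); t != "" {
			out = append(out, discovered{Name: l[0], Type: t})
		}
	}
	sortDiscovered(out)
	for _, d := range out {
		fmt.Println(d.Type, d.Name)
	}
}
```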

The generated config uses descriptive names as YANG list keys (the OS interface name at discovery time). The MAC address serves as the physical binding between configuration and hardware. For ethernet, veth, and bridge interfaces, the MAC address (mac { address }) is optional and, when set, must be unique within each list. Omit it to keep the hardware-assigned MAC, or set it to override the address and pin the named config entry to a specific physical device.

Each discovered interface also records an os-name hidden leaf that preserves the original OS interface name. This field is auto-populated during discovery and remains available for debugging and internal binding after the user renames the config entry.

Operator-facing interface operations resolve the logical name to its kernel device through the shared resolver before touching the kernel, so a configured name that differs from the OS device (via the os-name or mac/match selector) is honored uniformly: the iface CLI ops (set MTU, add/remove address, admin up/down, bridge, mirror, ...), the DHCP client socket binding, and the routing/protocol consumers all act on the bound kernel device. The dispatch layer performs this translation for the by-name backend ops, leaving GetInterface/ListInterfaces raw because the resolver is built on them. A checks gate (make ze-iface-resolution-check) keeps new consumers from resolving the kernel directly instead of through the resolver.

A MAC address validator (ze:validate "mac-address") provides format checking (colon-separated hex octets) and live OS autocomplete. The CompleteFn calls DiscoverInterfaces on each tab press, returning MAC addresses from currently active OS interfaces.
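A format check of the kind described (colon-separated hex octets) might look like the sketch below. It assumes exactly six octets of two hex digits each; validMAC and macPattern are hypothetical names, and the real validator may accept or reject other forms.

```go
package main

import (
	"fmt"
	"regexp"
)

// macPattern accepts exactly six colon-separated octets of two hex digits.
// The six-octet requirement is an assumption of this sketch.
var macPattern = regexp.MustCompile(`^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$`)

// validMAC reports whether s is a colon-separated MAC address.
func validMAC(s string) bool {
	return macPattern.MatchString(s)
}

func main() {
	for _, s := range []string{"02:00:5e:10:00:01", "02-00-5e-10-00-01", "02:00:5e:10:00"} {
		fmt.Printf("%-20s %v\n", s, validMAC(s))
	}
}
```

A strict pattern is used here instead of net.ParseMAC because the standard library function also accepts hyphen- and dot-separated forms, which the description above does not mention.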

Key Design Decisions

Tunnel Configuration

interface {
    tunnel gre0 {
        encapsulation {
            gre {
                local  { ip 192.0.2.1; }
                remote { ip 198.51.100.1; }
                key 42
            }
        }
        unit default {
            ipv4 {
                address [ 10.0.0.1/30 ]
            }
        }
    }

    tunnel sixin4 {
        encapsulation {
            sit {
                local  { ip 192.0.2.1; }
                remote { ip 198.51.100.1; }
            }
        }
        unit default {
            ipv6 {
                address [ 2001:db8::1/64 ]
            }
        }
    }

    tunnel v6ov6 {
        encapsulation {
            ip6tnl {
                local  { ip 2001:db8::1; }
                remote { ip 2001:db8::2; }
                hoplimit 64
                encaplimit 4
            }
        }
    }
}

The eight supported encapsulation kinds map to Linux netlink kinds:

| Kind | Linux netlink | Underlay | Layer | Notes |
|---|---|---|---|---|
| gre | gre | IPv4 | L3 | RFC 2784, RFC 2890 key |
| gretap | gretap | IPv4 | L2 (bridgeable) | Ethernet over GRE |
| ip6gre | ip6gre | IPv6 | L3 | hoplimit/tclass per RFC 2473 |
| ip6gretap | ip6gretap | IPv6 | L2 (bridgeable) | |
| ipip | ipip | IPv4 | L3 | RFC 2003, no key |
| sit | sit | IPv4 | L3 | 6in4 per RFC 4213 |
| ip6tnl | ip6tnl | IPv6 | L3 | RFC 2473 with Proto=IPV6 |
| ipip6 | ip6tnl | IPv6 | L3 | RFC 2473 with Proto=IPIP |

ipip6 shares the kernel ip6tnl netdev with a different inner protocol byte (4 vs 41). Both surface as distinct YANG cases so the schema and config are unambiguous.
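The kind-to-netdev mapping can be expressed as a lookup table. The sketch below is illustrative: kindInfo, tunnelKinds, and lookupKind are hypothetical names, and InnerProto is filled in only for the two kinds that share the ip6tnl netdev (0 means this sketch does not model it). The error text follows the "unsupported tunnel kind <k>" message that CreateTunnel is documented to return.

```go
package main

import "fmt"

// kindInfo describes how one Ze encapsulation kind maps onto Linux.
type kindInfo struct {
	NetlinkKind string // Linux netlink kind
	Underlay    string // "IPv4" or "IPv6"
	Layer       string // "L2" or "L3"
	InnerProto  int    // inner IP protocol number for ip6tnl-based kinds, else 0
}

var tunnelKinds = map[string]kindInfo{
	"gre":       {"gre", "IPv4", "L3", 0},
	"gretap":    {"gretap", "IPv4", "L2", 0},
	"ip6gre":    {"ip6gre", "IPv6", "L3", 0},
	"ip6gretap": {"ip6gretap", "IPv6", "L2", 0},
	"ipip":      {"ipip", "IPv4", "L3", 0},
	"sit":       {"sit", "IPv4", "L3", 0},
	"ip6tnl":    {"ip6tnl", "IPv6", "L3", 41}, // IPv6 in IPv6
	"ipip6":     {"ip6tnl", "IPv6", "L3", 4},  // IPv4 in IPv6, same netdev
}

// lookupKind resolves a configured kind or rejects it.
func lookupKind(k string) (kindInfo, error) {
	info, ok := tunnelKinds[k]
	if !ok {
		return kindInfo{}, fmt.Errorf("unsupported tunnel kind %s", k)
	}
	return info, nil
}

func main() {
	for _, k := range []string{"ip6tnl", "ipip6", "vxlan"} {
		info, err := lookupKind(k)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> netlink %s, inner proto %d\n", k, info.NetlinkKind, info.InnerProto)
	}
}
```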

L2 tunnel kinds (gretap, ip6gretap) support an optional mac container (with an address leaf) inside the case container. L3 kinds do not carry a MAC address (the kernel does not assign one).

ERSPAN, GRE keepalives, VRF underlay/overlay leaves, and ignore-df on gretap are out of scope for v1; see plan/deferrals.md.

Tunnel Reload Behaviour

On config reload (SIGHUP or transaction commit), applyTunnels compares each tunnel's spec against the previously applied config, indexed by name. Tunnels whose spec is unchanged are left alone; MTU, MAC, and addresses still reconcile through later phases, so non-spec changes still take effect. Tunnels whose spec changed (encapsulation kind, local, remote, key, ttl, hoplimit, and the rest of the per-case leaves) are deleted and recreated, because Linux does not support in-place modification of most tunnel kinds. The recreate briefly drops traffic on the changed tunnel only; unrelated tunnels are not disturbed.
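The keep-or-recreate decision can be sketched as a diff between two name-indexed maps. This is not the applyTunnels code: tunnelSpec carries only a few illustrative fields, and reloadPlan and planTunnels are hypothetical names. Removal of tunnels that disappeared from the config is left out because it is handled by a later reconciliation phase.

```go
package main

import (
	"fmt"
	"sort"
)

// tunnelSpec holds the fields whose change forces a recreate.
// Only a subset of the per-case leaves is modelled here.
type tunnelSpec struct {
	Kind, Local, Remote string
	Key                 uint32
}

// reloadPlan lists tunnel names by the action a reload would take.
type reloadPlan struct {
	Keep, Recreate, Create []string
}

// planTunnels compares the new config against the previously applied one.
func planTunnels(prev, next map[string]tunnelSpec) reloadPlan {
	var p reloadPlan
	for name, spec := range next {
		old, existed := prev[name]
		switch {
		case !existed:
			p.Create = append(p.Create, name)
		case old == spec:
			p.Keep = append(p.Keep, name) // spec unchanged: leave the netdev alone
		default:
			p.Recreate = append(p.Recreate, name) // delete, then create again
		}
	}
	sort.Strings(p.Keep)
	sort.Strings(p.Recreate)
	sort.Strings(p.Create)
	return p
}

func main() {
	prev := map[string]tunnelSpec{
		"gre0":   {"gre", "192.0.2.1", "198.51.100.1", 42},
		"sixin4": {"sit", "192.0.2.1", "198.51.100.1", 0},
	}
	next := map[string]tunnelSpec{
		"gre0":   {"gre", "192.0.2.1", "198.51.100.1", 43}, // key changed
		"sixin4": {"sit", "192.0.2.1", "198.51.100.1", 0},
		"v6ov6":  {"ip6tnl", "2001:db8::1", "2001:db8::2", 0},
	}
	fmt.Printf("%+v\n", planTunnels(prev, next))
}
```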

Tunnel Validation Scope

ze config validate, API pre-save validation, and CLI commit validation run YANG schema checks plus registered side-effect-free in-process plugin verifiers. They do not call live external plugin OnConfigVerify callbacks because those callbacks are runtime transaction participants. Live external plugin verification runs when the daemon loads, reloads, or commits config; failed API commits roll the saved file back to the previous content before returning the reload error.

Interface validation that has an in-process verifier, such as tunnel case consistency and backend feature gates, is visible in static validation. Any third-party external plugin that only implements a live OnConfigVerify callback is verified at daemon transaction time, not by ze config validate.

WireGuard Configuration

WireGuard interfaces are a top-level wireguard list under interface, alongside ethernet, tunnel, and the other iface kinds. Each entry carries interface-level parameters plus a nested peer list; unit-level addresses ride the same interface-unit grouping used everywhere else.

interface {
    wireguard wg0 {
        listen-port 51820
        fwmark 0
        private-key "$9$ABCabc..."        # $9$-encoded; see below
        peer site2 {
            public-key "YYYY..."           # base64, plaintext
            preshared-key "$9$DEF..."      # optional, also $9$-encoded
            endpoint {
                ip 198.51.100.2
                port 51820
            }
            allowed-ips [ 10.0.0.2/32 192.168.10.0/24 ]
            persistent-keepalive 25
        }
        unit default {
            ipv4 {
                address [ 10.0.0.1/24 ]
            }
        }
    }
}

Key material and $9$ encoding

private-key and peer preshared-key are marked ze:sensitive in YANG. The config parser auto-decodes $9$-prefixed values on load (internal/component/config/parser.go:127); ze config show and ze config dump always re-encode them on output (internal/component/config/cli/cmd_dump.go:132), so the plaintext base64 form never reaches the config file on disk. Public keys are not secret and are stored in plaintext.

$9$ is JunOS-compatible obfuscation, not encryption. Anyone with read access to the config file (or the zefs blob, depending on storage backend) can trivially recover the plaintext key via secret.Decode. The protection is on the filesystem layer: chmod 600 /etc/ze/ze.conf (or the equivalent on the .zefs blob). This is the same posture ze uses for BGP MD5 passwords, SSH secrets, MCP tokens, and API tokens.

Reconciliation

On reload, applyWireguards compares the new spec to the previously applied spec via wireguardSpecEqual. Unchanged entries are a no-op; the kernel is not touched and peer handshake state is preserved. Changed entries trigger a single ConfigureWireguardDevice call with wgtypes.Config{ReplacePeers: true} -- the kernel matches unchanged peers by public-key and preserves their handshake state, so "apply entire spec on every change" is functionally equivalent to a per-peer diff at a tiny fraction of the code. New wireguard entries get a CreateWireguardDevice (netlink) before the Configure call. Wireguard list entries removed from config are deleted by Phase 4 reconciliation, same as tunnels.

Peer names in config (peer site2 { ... }) are operator-chosen labels. The kernel tracks peers only by public key, so the label can change freely without affecting the handshake. ze init emits discovered peers with synthetic names (peer0, peer1, ...) which operators typically rename via ze config edit.

Port conflict detection

listen-port participates in the same conflict-detection mechanism as TCP services (web, ssh, mcp, etc.) with one Phase-5 extension: ListenerEndpoint gained a Protocol field so wireguard's UDP ports never clash with a TCP service on the same port. Two wireguards with the same listen-port are rejected at reload time.
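A minimal sketch of protocol-aware conflict detection follows. ListenerEndpoint and its Protocol field are named in the text above; the Owner and Port fields, the endpointKey type, and findConflicts are assumptions made for illustration, and bind addresses are ignored for brevity.

```go
package main

import "fmt"

// ListenerEndpoint describes one listening socket claimed by a service.
// Owner and Port are hypothetical fields for this sketch.
type ListenerEndpoint struct {
	Owner    string
	Protocol string // "tcp" or "udp"
	Port     int
}

// endpointKey is the identity used for conflict detection.
type endpointKey struct {
	protocol string
	port     int
}

// findConflicts reports every endpoint whose protocol and port were
// already claimed by an earlier endpoint in the list.
func findConflicts(eps []ListenerEndpoint) []string {
	seen := map[endpointKey]string{}
	var conflicts []string
	for _, ep := range eps {
		k := endpointKey{ep.Protocol, ep.Port}
		if owner, ok := seen[k]; ok {
			conflicts = append(conflicts,
				fmt.Sprintf("%s and %s both listen on %s/%d", owner, ep.Owner, ep.Protocol, ep.Port))
			continue
		}
		seen[k] = ep.Owner
	}
	return conflicts
}

func main() {
	eps := []ListenerEndpoint{
		{"web", "tcp", 51820}, // same port as wg0, different protocol: no clash
		{"wg0", "udp", 51820},
		{"wg1", "udp", 51820}, // clashes with wg0
	}
	for _, c := range findConflicts(eps) {
		fmt.Println(c)
	}
}
```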

Dependencies

WireGuard peer and key configuration uses golang.zx2c4.com/wireguard/wgctrl, the reference Go client maintained by the WireGuard authors (Donenfeld, Layher). It is vendored under vendor/golang.zx2c4.com/wireguard/wgctrl along with its transitive dependencies github.com/mdlayher/genetlink and github.com/mdlayher/netlink. WireGuard has no RFC; reference material is the original whitepaper (https://www.wireguard.com/papers/wireguard.pdf), the Linux kernel genetlink spec (https://www.kernel.org/doc/html/latest/userspace-api/netlink/specs/wireguard.html), and wg(8).

Bus Topics

| Topic | Trigger | Payload |
|---|---|---|
| interface/created | First RTM_NEWLINK for an index | name, type, index, mtu, managed |
| interface/deleted | RTM_DELLINK | name, type, index, mtu, managed |
| interface/up | OperUp or OperUnknown+IFF_UP | name, index |
| interface/down | Other oper states | name, index |
| interface/addr/added | RTM_NEWADDR (DAD complete) | name, unit, index, address, prefix-length, family, managed |
| interface/addr/removed | RTM_DELADDR | name, unit, index, address, prefix-length, family, managed |
| interface/dhcp/lease-acquired | DHCPv4 ACK | name, unit, address, prefix-length, router, dns, lease-time |
| interface/dhcp/lease-renewed | Renewal success | name, unit, address, prefix-length, router, dns, lease-time |
| interface/dhcp/lease-expired | Lease timeout | name, unit, address, prefix-length, router, dns, lease-time |
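As an illustration of what a JSON payload on interface/addr/added could look like, the sketch below marshals a struct whose JSON keys are the payload field names listed for that topic. The Go types (for example unit as an integer and family as a string such as "ipv4") and the names addrEvent and encode are assumptions; the real payload may encode these differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// addrEvent mirrors the payload fields of interface/addr/added.
// Field types are assumptions made for this sketch.
type addrEvent struct {
	Name         string `json:"name"`
	Unit         int    `json:"unit"`
	Index        int    `json:"index"`
	Address      string `json:"address"`
	PrefixLength int    `json:"prefix-length"`
	Family       string `json:"family"`
	Managed      bool   `json:"managed"`
}

// encode renders the event as a compact JSON document.
func encode(e addrEvent) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

func main() {
	s, err := encode(addrEvent{
		Name: "eth0", Unit: 0, Index: 2,
		Address: "192.0.2.1", PrefixLength: 24,
		Family: "ipv4", Managed: true,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```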

Compound Commands (Auto-Ensure Parent)

When a command creates a sub-resource (VLAN unit, address) on an interface that may not exist yet, the compound form auto-creates the parent:

| Command | Behavior |
|---|---|
| create interface dummy name <name> unit <vid> | Creates dummy <name> if missing, then creates VLAN <name>.<vid> |
| create interface dummy name <name> address <prefix> | Creates dummy <name> if missing, then adds address |
| create interface bridge name <name> unit <vid> | Creates bridge <name> if missing, then creates VLAN |
| create interface bridge name <name> address <prefix> | Creates bridge <name> if missing, then adds address |
| create interface <name> unit <vid> | Direct form: parent must already exist |
| create interface <name> address <prefix> | Direct form: interface must already exist |

The type keyword (dummy, bridge) is required when the parent does not exist, because the system needs to know what kind of interface to create. If the parent already exists with a different type, the command fails (e.g., create interface dummy name br0 unit 100 is rejected if br0 is a bridge).

Rollback: if the sub-resource creation fails after the parent was auto-created, the parent is deleted. Pre-existing parents are never deleted on failure.

The mechanism is driven by the ze:ensure-exists YANG extension on the type containers (dummy, bridge). The dispatch system builds an ensure chain at registration time and wraps the leaf handler automatically.

Backend Implementations

The Backend interface declares 34 methods. Three implementations ship in tree; coverage varies per platform and dataplane. The table below lists each method against the backend and whether it is wired or returns an error. err means the method is implemented as a stub that rejects every call with a descriptive error. real means the method drives the underlying mechanism. Cells with a footnote carry a caveat.

| Category | Method | netlink (Linux) | VPP | stub (non-Linux) |
|---|---|---|---|---|
| Lifecycle | CreateDummy | real | real (CreateLoopback) | err |
| | CreateVeth | real | err (VPP uses memif/TAP) | err |
| | CreateBridge | real | real (BridgeDomainAddDelV2) | err |
| | CreateVLAN | real | real (CreateVlanSubif) | err |
| | CreateTunnel | real [1] | err (pending GoVPP tunnel API) | err |
| | CreateWireguardDevice | real (rtnetlink) | err (requires VPP wg plugin) | err |
| | ConfigureWireguardDevice | real (wgctrl) | err (requires VPP wg plugin) | err |
| | GetWireguardDevice | real (wgctrl) | err (requires VPP wg plugin) | err |
| | CreateXFRM | real (rtnetlink) | err (XFRM is Linux netlink only) | err |
| | GetXFRMInfo | real (rtnetlink+xfrm) | err (XFRM is Linux netlink only) | err |
| | DeleteInterface | real | real (DeleteLoopback/DeleteSubif) | err |
| Address | AddAddress | real | real (SwInterfaceAddDelAddress) | err |
| | RemoveAddress | real | real (SwInterfaceAddDelAddress) | err |
| | ReplaceAddressWithLifetime | real | real [2] | err |
| | AddAddressP2P | real | err (PPP NCP not supported yet) | err |
| Route | AddRoute | real | err (use fib-vpp plugin) | err |
| | RemoveRoute | real | err (use fib-vpp plugin) | err |
| | ListRoutes | real | err (use fib-vpp plugin) | err |
| | RouteLookup | real (netlink RouteGet) | real (IPRouteLookupV2 LPM) | err |
| | ListKernelRoutes | real | err [3] | err |
| Link state | SetAdminUp | real | real (SwInterfaceSetFlags) | err |
| | SetAdminDown | real | real (SwInterfaceSetFlags) | err |
| Properties | SetMTU | real | real (SwInterfaceSetMtu) | err |
| | SetMACAddress | real | real (SwInterfaceSetMacAddress) | err |
| | GetMACAddress | real | real (via SwInterfaceDump) | err |
| | GetStats | real | err (pending GoVPP stats API) | err |
| | ResetCounters | real [4] | err (pending sw_interface_clear_stats) | err |
| Query | ListInterfaces | real | real (SwInterfaceDump) | err |
| | GetInterface | real | real (SwInterfaceDump) | err |
| | ListNeighbors | real | err (pending ip_neighbor_dump) | err |
| Bridge | BridgeAddPort | real | real (SwInterfaceSetL2Bridge) | err |
| | BridgeDelPort | real | real (SwInterfaceSetL2Bridge) | err |
| | BridgeSetSTP | real (sysfs) | err (VPP STP varies by version) | err |
| Mirror | SetupMirror | real (tc mirred) | err (pending SpanEnableDisableL2) | err |
| | RemoveMirror | real | err (pending SpanEnableDisableL2) | err |
| Monitor | StartMonitor | real (netlink multicast) | real (WantInterfaceEvents) | err |
| | StopMonitor | real | real | no-op |
| | Close | real | real | no-op |

[1] CreateTunnel rejects an unknown kind with unsupported tunnel kind <k>. Valid kinds are gre, gretap, ip6gre, ip6gretap, ipip, sit, ip6tnl, ipip6.

[2] VPP has no kernel-style address lifetimes. The VPP backend ignores the validLft/preferredLft arguments and installs the address without an expiry, matching the exact-or-reject rule only when the operator does not actually need DHCP lease-aware behaviour on VPP. DHCP runs against the netlink backend today.

[3] On VPP the kernel FIB is not authoritative; the VPP backend rejects ListKernelRoutes rather than return misleading data. A VPP FIB dump via ip_route_v2_dump is the correct replacement and is not yet wired.

[4] Linux netlink has no generic counter-reset syscall. The netlink backend returns iface.ErrCountersNotResettable and the dispatch layer falls back to a per-interface baseline-delta model: the current values become a baseline and GetStats / ListInterfaces / GetInterface subtract the baseline before returning, so the operator sees "since last clear" values.
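The baseline-delta model from footnote [4] can be sketched as below. The counters struct, baselineStore, and the clamp applied when a raw counter falls below its baseline (for example after an interface is recreated) are assumptions of this sketch, not the dispatch layer's actual code.

```go
package main

import "fmt"

// counters is a reduced set of interface statistics.
type counters struct {
	RxBytes, TxBytes uint64
}

// baselineStore remembers the raw counters seen at the last clear.
type baselineStore struct {
	base map[string]counters
}

func newBaselineStore() *baselineStore {
	return &baselineStore{base: map[string]counters{}}
}

// clear records the current raw values as the new baseline.
func (s *baselineStore) clear(name string, cur counters) {
	s.base[name] = cur
}

// adjust returns "since last clear" values for the given raw counters.
func (s *baselineStore) adjust(name string, cur counters) counters {
	b := s.base[name] // zero value when never cleared
	return counters{
		RxBytes: since(cur.RxBytes, b.RxBytes),
		TxBytes: since(cur.TxBytes, b.TxBytes),
	}
}

// since subtracts the baseline; if the raw counter restarted below the
// baseline, the raw value is returned instead (assumption).
func since(cur, base uint64) uint64 {
	if cur < base {
		return cur
	}
	return cur - base
}

func main() {
	s := newBaselineStore()
	s.clear("eth0", counters{RxBytes: 1000, TxBytes: 400})
	fmt.Printf("%+v\n", s.adjust("eth0", counters{RxBytes: 1500, TxBytes: 450}))
}
```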

Netlink (Linux, default)

Pure netlink via vishvananda/netlink, with WireGuard peer/key operations via golang.zx2c4.com/wireguard/wgctrl. Every method is implemented; the only caveats are CreateTunnel rejecting unknown kinds and ResetCounters using baseline-delta (both noted above). No iproute2 shell-outs. Non-Linux builds of this package compile into the stub backend described below.

VPP (opt-in, via GoVPP)

Selected by the backend vpp leaf. 17 methods drive the VPP binary API through GoVPP; 16 return errNotSupported with a string naming the missing GoVPP call or the plugin gap. The channel to VPP is acquired lazily on first method call; before the first successful acquire, every method returns iface.ErrBackendNotReady so the reconciliation phase can retry.

Use the VPP backend when VPP owns the dataplane. Use netlink elsewhere. Routes, stats, counter reset, neighbour table, mirrors, and STP are the main gaps against feature parity.

Stub (non-Linux)

On macOS (darwin), BSD, and Windows the netlink backend package compiles to a stub whose constructor succeeds but whose every method returns "interface management not supported on <GOOS>". StopMonitor and Close are no-ops. The stub exists so the rest of the daemon can load and the binary remains testable on developer machines; real interface management requires Linux.

The stub never installs itself as the default silently. A macOS daemon that actually tries ze interface show or any config-driven reconciliation sees the explicit error and rejects under exact-or-reject.