Monitoring

Ze provides real-time BGP event monitoring and a live peer dashboard through the CLI. Commands follow verb-first syntax: monitor <module>.

Live Peer Dashboard

ze cli -c "monitor bgp"

An auto-refreshing dashboard that shows the router identity and a sortable, color-coded peer table with update rates. Navigate with j/k, sort with s/S, press Enter for peer detail, and press Esc to exit. The display refreshes every 2 seconds.

Event Streaming

ze cli -c "monitor event"

Filters

Filter     Example               Description
peer       peer upstream1        Show events for one peer
include    include update,state  Filter by event type (comma-separated)
exclude    exclude keepalive     Exclude event types
direction  direction received    Only received or sent events

Combine filters:

ze cli -c "monitor event peer upstream1 include update direction received"
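When the filter set varies between scripts, the command string can be assembled programmatically. The following is a minimal Python sketch: `build_monitor_command` is a hypothetical helper, not part of Ze, and it emits the filters in the same order as the combined example above on the assumption that this order is accepted.

```python
def build_monitor_command(peer=None, include=None, exclude=None, direction=None):
    """Assemble a `monitor event` command string from optional filters.

    Hypothetical helper: the filter keywords come from the table above,
    the ordering simply mirrors the documented example.
    """
    parts = ["monitor", "event"]
    if peer:
        parts += ["peer", peer]
    if include:
        # Event types are passed as one comma-separated token.
        parts += ["include", ",".join(include)]
    if exclude:
        parts += ["exclude", ",".join(exclude)]
    if direction:
        if direction not in ("received", "sent"):
            raise ValueError("direction must be 'received' or 'sent'")
        parts += ["direction", direction]
    return " ".join(parts)


command = build_monitor_command(
    peer="upstream1", include=["update"], direction="received"
)
print(command)
```

The resulting string is what would be passed to `ze cli -c`.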

Event Types

Event         Has Direction  Description
update        ✓              Route announcements and withdrawals
open          ✓              OPEN message exchange
notification  ✓              Session error notifications
keepalive     ✓              Keepalive exchanges
refresh       ✓              Route refresh requests
state         ✕              Peer state changes (up/down)
negotiated    ✕              Capability negotiation results
eor           ✓              End-of-RIB markers
rpki          ✓              RPKI validation results

Output Formats

Pipe the output through format operators:

ze cli -c "monitor event | json"      # Full JSON envelope
ze cli -c "monitor event | table"     # Tabular format
ze cli -c "monitor event | match rx"  # Regex filter on output
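For consuming the JSON output from a script, a small reader can wrap the CLI. This is a sketch only: it assumes that `| json` emits one complete JSON object per line, which this guide does not state, and both `iter_events` and `stream_events` are hypothetical names.

```python
import json
import subprocess


def iter_events(lines):
    """Yield one decoded event per non-empty line, skipping non-JSON lines.

    Assumption: each event arrives as a single line of JSON.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # Ignore banners or partial output rather than aborting the stream.
            continue


def stream_events(command="monitor event | json"):
    """Run `ze cli -c <command>` and yield events as they are printed."""
    proc = subprocess.Popen(
        ["ze", "cli", "-c", command], stdout=subprocess.PIPE, text=True
    )
    try:
        yield from iter_events(proc.stdout)
    finally:
        # Stop the CLI when the consumer stops iterating.
        proc.terminate()
        proc.wait()
```

Keeping the parsing in `iter_events` separate from the subprocess handling makes it possible to test the parser against captured output without a running Ze instance.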

JSON Event Format

All events follow the ze-bgp JSON envelope:

{
  "type": "bgp",
  "bgp": {
    "peer": {
      "address": "10.0.0.1",
      "local": {"address": "10.0.0.2", "as": 65000},
      "remote": {"address": "10.0.0.1", "as": 65001}
    },
    "message": {
      "id": 42,
      "direction": "received",
      "type": "update"
    }
  }
}
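As an illustration of how a consumer might read the envelope, the Python sketch below pulls out the fields shown above. `summarize` is a hypothetical helper, and it tolerates missing keys because the source only shows a single sample.

```python
import json

SAMPLE = """
{
  "type": "bgp",
  "bgp": {
    "peer": {
      "address": "10.0.0.1",
      "local": {"address": "10.0.0.2", "as": 65000},
      "remote": {"address": "10.0.0.1", "as": 65001}
    },
    "message": {"id": 42, "direction": "received", "type": "update"}
  }
}
"""


def summarize(event):
    """Return the peer address, remote AS, message type, and direction."""
    bgp = event.get("bgp", {})
    peer = bgp.get("peer", {})
    message = bgp.get("message", {})
    return {
        "peer": peer.get("address"),
        "remote_as": peer.get("remote", {}).get("as"),
        "type": message.get("type"),
        # Absent for events without a direction, such as state.
        "direction": message.get("direction"),
    }


print(summarize(json.loads(SAMPLE)))
```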

UPDATE Event

{
  "type": "bgp",
  "bgp": {
    "peer": {"address": "10.0.0.1", "local": {"address": "10.0.0.2", "as": 65000}, "remote": {"address": "10.0.0.1", "as": 65001}},
    "message": {"id": 1, "direction": "received", "type": "update"},
    "update": {
      "ipv4/unicast": [
        {
          "next-hop": "10.0.0.1",
          "action": "add",
          "nlri": ["10.0.0.0/24", "10.0.1.0/24"]
        }
      ]
    },
    "origin": "igp",
    "as-path": [65001, 65002],
    "local-preference": 100
  }
}
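One way to walk the sample above is to flatten it into one record per prefix. The sketch assumes, based only on this sample, that every key under `update` is an address family holding a list of entries with `action`, `next-hop`, and `nlri`; `iter_routes` is a hypothetical helper.

```python
def iter_routes(event):
    """Yield (family, action, prefix, next_hop) for each NLRI in an UPDATE event."""
    update = event.get("bgp", {}).get("update", {})
    for family, entries in update.items():
        for entry in entries:
            for prefix in entry.get("nlri", []):
                yield family, entry.get("action"), prefix, entry.get("next-hop")


# Trimmed copy of the UPDATE sample, keeping only the fields used here.
SAMPLE_UPDATE = {
    "type": "bgp",
    "bgp": {
        "message": {"id": 1, "direction": "received", "type": "update"},
        "update": {
            "ipv4/unicast": [
                {
                    "next-hop": "10.0.0.1",
                    "action": "add",
                    "nlri": ["10.0.0.0/24", "10.0.1.0/24"],
                }
            ]
        },
    },
}

for route in iter_routes(SAMPLE_UPDATE):
    print(route)
```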

State Event

{
  "type": "bgp",
  "bgp": {
    "peer": {"address": "10.0.0.1", "local": {"address": "10.0.0.2", "as": 65000}, "remote": {"address": "10.0.0.1", "as": 65001}},
    "message": {"type": "state"},
    "state": "up"
  }
}
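A consumer that only cares about session status could keep a per-peer map updated from state events. The following is a minimal sketch; `apply_state_event` is a hypothetical helper, and it stores whatever string the event carries without assuming the full set of state values.

```python
def apply_state_event(states, event):
    """Record the peer's latest state in `states`; return True if it was a state event."""
    bgp = event.get("bgp", {})
    if bgp.get("message", {}).get("type") != "state":
        return False
    address = bgp.get("peer", {}).get("address")
    if address is None or "state" not in bgp:
        return False
    states[address] = bgp["state"]
    return True


states = {}
apply_state_event(
    states,
    {
        "type": "bgp",
        "bgp": {
            "peer": {"address": "10.0.0.1"},
            "message": {"type": "state"},
            "state": "up",
        },
    },
)
print(states)
```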

Programmatic Access

Plugins can subscribe to events via the SDK:

process my-plugin {
    receive [ update state ]
}

The plugin receives events through its OnEvent callback. See the Plugins guide for details.

Prometheus Metrics

Ze exposes Prometheus metrics when telemetry { prometheus { ... } } is configured. BGP metrics are refreshed every 10 seconds. By default the HTTP listener binds to 127.0.0.1:9273; configure an explicit server address to expose it to remote scrapers.

The netdata block only controls Netdata-compatible OS collector metrics. It does not rename Ze-native metrics such as ze_bgp_*, ze_bfd_*, or ze_l2tp_*.

telemetry {
    prometheus {
        enabled true;
        server main {
            ip 0.0.0.0;
            port 9273;
        }
        path /metrics;
        basic-auth {
            enabled true;
            username prometheus;
            plaintext-password "secret";
        }
        netdata {
            enabled true;
            prefix netdata;
            interval 1;
            collector diskspace {
                enabled false;
            }
            collector snmp6 {
                interval 10;
            }
        }
    }
}

Path                           Default         Description
enabled                        false           Enable Prometheus HTTP endpoint
server                         127.0.0.1:9273  Listener list. Explicit 0.0.0.0 binds all interfaces
path                           /metrics        HTTP metrics path
basic-auth/enabled             false           Require HTTP Basic Authentication for metrics and health endpoints
basic-auth/realm               ze prometheus   Basic Auth realm
basic-auth/username            unset           Basic Auth username
basic-auth/password            unset           Bcrypt-hashed Basic Auth password
basic-auth/plaintext-password  unset           Write-only password input, hashed on commit
netdata/enabled                true            Enable Netdata-compatible OS collectors
netdata/prefix                 netdata         Prefix for Netdata-compatible OS collector metrics only
netdata/interval               1               Netdata-compatible OS collector sampling interval (1-60s)
netdata/collector              --              Per-Netdata-collector enable and interval overrides

Deprecated compatibility aliases remain accepted: prefix, interval, and collector directly under prometheus. Prefer netdata/prefix, netdata/interval, and netdata/collector in new config.

HTTP Basic Authentication

When basic-auth/enabled is true, Ze requires HTTP Basic Authentication for every handler on the Prometheus service, including both /metrics and /health. The password is stored as a bcrypt hash in the persisted config. When editing the config by hand, set plaintext-password; the commit hook hashes the value and replaces it with password. If automation already has a hash from ze passwd, set password directly.

Prometheus scrape configuration:

scrape_configs:
  - job_name: ze
    static_configs:
      - targets: ["router.example.net:9273"]
    basic_auth:
      username: prometheus
      password: secret

Basic Auth does not provide transport encryption. Keep the listener on loopback, use a trusted management network, or put TLS in front of the service if the scrape crosses an untrusted network.
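For ad-hoc checks outside Prometheus, the endpoint can be fetched with the Python standard library. This is a sketch under assumptions: `basic_auth_header` and `fetch_metrics` are hypothetical helpers, and the URL and credentials in the usage comment simply mirror the examples in this section.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP Basic Authentication."""
    # Credentials are only base64-encoded, not encrypted, hence the
    # transport warning above.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def fetch_metrics(url, username, password, timeout=5.0):
    """Return the metrics page body as text."""
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (requires a reachable listener):
# text = fetch_metrics("http://127.0.0.1:9273/metrics", "prometheus", "secret")
```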

Per-collector overrides:

netdata {
    collector diskspace { enabled false; }
    collector snmp6 { interval 10; }
}

OS Metrics (Netdata-compatible)

Ze exports 138 OS metrics matching Netdata's Prometheus format exactly (same names, labels, values), acting as a drop-in replacement for Netdata's /api/v1/allmetrics?format=prometheus endpoint. Existing Grafana dashboards built against Netdata continue to work unchanged.

Metric name format: {prefix}_{context}_{units}_average{chart="...",dimension="...",family="..."}, where {prefix} is telemetry.prometheus.netdata.prefix.
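To work with these series outside Prometheus, a line in this format can be split into name, labels, and value. The sketch below is a simplified parser, not a full implementation of the Prometheus text format: it leaves escape sequences in label values untouched and ignores any trailing timestamp. The sample line is illustrative only and is not taken from a real Ze instance.

```python
import re

# name, optional {labels}, then the sample value
_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Return (name, labels, value), or None for comments and blank lines."""
    match = _LINE.match(line)
    if match is None:
        return None
    name, raw_labels, value = match.groups()
    labels = dict(_LABEL.findall(raw_labels or ""))
    return name, labels, float(value)


# Illustrative line following the documented name format.
line = 'netdata_system_cpu_percentage_average{chart="system.cpu",dimension="user",family="cpu"} 1.5'
print(parse_sample(line))
```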

Collector /proc or /sys source Charts exposed
CPU /proc/stat system.cpu, cpu.cpu<N>
cpufreq /sys/devices/system/cpu/cpu*/cpufreq cpufreq.cpufreq, cpu.core_throttling
cpuidle /sys/devices/system/cpu/cpu*/cpuidle cpuidle.cpu<N>_cpuidle
Memory /proc/meminfo system.ram, system.swap, mem.available, mem.committed, mem.kernel, mem.slab, mem.thp, mem.writeback, mem.hugepages, mem.reclaiming, mem.swap_cached, mem.cma, mem.directmaps, mem.hwcorrupt, mem.zswap
Load /proc/loadavg system.load
Processes /proc/stat system.processes, system.forks, system.ctxt, system.intr
Interrupts /proc/softirqs system.softirqs
Pressure (PSI) /proc/pressure/* system.{cpu,memory,io}_{some,full}_pressure
Network (per-iface) /proc/net/dev, /sys/class/net net.net, net.packets, net.errors, net.drops, net.fifo, net.compressed, net.events, net.speed, net.duplex, net.operstate, net.carrier, net.mtu
Network (aggregate) /proc/net/dev, snmp, snmp6 system.net, system.ipv4, system.ipv6
IPv4 /proc/net/snmp ipv4.packets, ipv4.errors, ipv4.tcppackets, ipv4.tcperrors, ipv4.tcphandshake, ipv4.tcpsock, ipv4.udppackets, ipv4.udperrors, ipv4.icmp, ipv4.icmpmsg, ipv4.fragsout, ipv4.fragsin
IPv4 netstat /proc/net/netstat ipv4.mcast, ipv4.mcastpkts, ipv4.bcast, ipv4.bcastpkts, ipv4.ecnpkts, ip.tcpconnaborts, ip.tcpmemorypressures, ip.tcpreorders, ip.tcpofo
IPv6 /proc/net/snmp6 ipv6.packets, ipv6.errors, ipv6.udppackets, ipv6.udperrors, ipv6.mcast, ipv6.fragsout, ipv6.fragsin
Sockets /proc/net/sockstat, sockstat6 ip.sockstat_sockets, ipv4.sockstat_tcp_sockets, ipv4.sockstat_tcp_mem, ipv4.sockstat_udp_sockets, ipv4.sockstat_udp_mem, ipv6.sockstat6_*
Conntrack /proc/net/stat/nf_conntrack netfilter.conntrack_sockets, _new, _changes, _errors, _search, _expect
Softnet /proc/net/softnet_stat system.softnet_stat, cpu.cpu<N>_softnet_stat
Disk I/O /proc/diskstats disk.io, disk.ops, disk.mops, disk.iotime, disk.busy, disk.backlog, disk.await, disk.svctm, disk.avgsz, disk.qops, system.io
Disk space /proc/mounts + statfs disk_space.<mount>
mdstat /proc/mdstat md.health, md.disks, md.mismatch_cnt
ZFS /proc/spl/kstat/zfs/arcstats zfs.arc_size, zfs.reads, zfs.hits, zfs.hits_rate, zfs.l2_size, zfs.l2_hits_rate, zfs.memory_ops
btrfs /sys/fs/btrfs/*/allocation btrfs.disk, btrfs.data, btrfs.metadata, btrfs.system
VMstat /proc/vmstat mem.pgfaults, system.pgpgio, mem.swapio, mem.oom_kill, mem.numa, mem.balloon, mem.zswapio, mem.ksm_cow, mem.thp_faults, mem.thp_collapse
SCTP /proc/net/sctp/snmp sctp.snmp
IPVS /proc/net/ip_vs_stats ipvs.net
Wireless /proc/net/wireless net_wireless.*
Other /proc/uptime, /proc/sys/kernel/random/entropy_avail, /proc/sys/fs/file-nr system.uptime, system.entropy, system.file_nr_used

Collectors whose data sources are absent (no ZFS loaded, no btrfs mounts, no wireless NICs, etc.) are skipped silently.

Side-by-side validation against Netdata:

# Netdata on its default port 19999, Ze on its default port 9273; adjust to your setup.
curl -s "http://localhost:19999/api/v1/allmetrics?format=prometheus" | grep "^netdata_" | sed 's/ .*//' | sort -u > nd.txt
curl -s "http://localhost:9273/metrics" | grep "^netdata_" | sed 's/ .*//' | sort -u > ze.txt
diff nd.txt ze.txt
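The same comparison can be scripted when the two outputs have already been captured as text. The sketch below mirrors the grep and sed steps: it keeps lines that start with the prefix and truncates each at the first space. `metric_names` and `compare` are hypothetical helpers.

```python
def metric_names(exposition_text, prefix="netdata_"):
    """Collect the series identifiers (name plus labels) that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        if line.startswith(prefix):
            # Equivalent to sed 's/ .*//': drop everything from the first space.
            names.add(line.split(" ", 1)[0])
    return names


def compare(netdata_text, ze_text, prefix="netdata_"):
    """Return (only_in_netdata, only_in_ze) as sorted lists."""
    netdata = metric_names(netdata_text, prefix)
    ze = metric_names(ze_text, prefix)
    return sorted(netdata - ze), sorted(ze - netdata)
```

Two empty lists indicate that both endpoints expose the same set of series for the prefix.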

Host Inventory Metrics

Metric Type Labels Description
ze_host_memory_total_bytes gauge - Total physical memory in bytes
ze_host_memory_available_bytes gauge - Available physical memory in bytes
ze_host_cpu_logical_count gauge - Number of logical CPUs
ze_host_cpu_physical_cores gauge - Number of physical CPU cores
ze_host_uptime_seconds gauge - Host uptime in seconds
ze_host_ecc_correctable_errors_total gauge - ECC correctable error count
ze_host_ecc_uncorrectable_errors_total gauge - ECC uncorrectable error count
ze_host_nic_link_speed_mbps gauge name NIC link speed in Mbps
ze_host_nic_carrier gauge name NIC carrier state (1=up, 0=down)
ze_host_storage_size_bytes gauge name Block device size in bytes
ze_host_thermal_temp_mc gauge name, device Thermal sensor reading in millicelsius

Host metrics are refreshed on a configurable interval (default 60 seconds). Linux only; on other platforms no host metrics are registered.

BGP Metrics

Instance

Metric Type Description
ze_info gauge Instance info (labels: version, router_id, local_as)
ze_uptime_seconds gauge Seconds since reactor started
ze_peers_configured gauge Number of configured peers
ze_cache_entries gauge UPDATE cache entry count

Per-Peer

Metric Type Labels Description
ze_peer_state gauge peer FSM state (0=stopped, 1=connecting, 2=active, 3=established)
ze_peer_messages_received_total counter peer, type Messages received (type: update, keepalive, open, notification, refresh, eor)
ze_peer_messages_sent_total counter peer, type Messages sent (type: update, keepalive, open, notification, refresh, eor)
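Because ze_peer_state is numeric, dashboards and scripts usually need to map it back to a name. The sketch below uses the value mapping from the table above; the `samples` dictionary (peer label to gauge value) is a hypothetical input shape for illustration.

```python
# Mapping taken from the ze_peer_state description above.
PEER_STATES = {0: "stopped", 1: "connecting", 2: "active", 3: "established"}


def state_name(value):
    """Translate a ze_peer_state gauge value into its name."""
    return PEER_STATES.get(int(value), "unknown")


def peers_not_established(samples):
    """Return the peers whose gauge is anything other than 3 (established)."""
    return sorted(peer for peer, value in samples.items() if int(value) != 3)


samples = {"10.0.0.1": 3.0, "10.0.0.2": 1.0}
for peer, value in samples.items():
    print(peer, state_name(value))
print(peers_not_established(samples))
```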

Startup and Connection Timing

Metric Type Labels Description
ze_plugin_startup_seconds histogram - WaitForPluginStartupComplete duration
ze_api_ready_seconds histogram - WaitForAPIReady duration
ze_peer_dial_seconds histogram peer, result TCP dial duration (result: ok, fail)
ze_peer_connect_attempt_seconds histogram peer Full connection attempt (runOnce) duration
ze_peer_connect_attempts_total counter peer Connection attempts
ze_peer_backoff_seconds histogram peer Backoff wait duration before retry

Forward Pool / Congestion

Metric Type Labels Description
ze_forward_workers_active gauge - Active forward pool workers
ze_bgp_pool_used_ratio gauge - Global overflow pool utilization (0.0 = empty, 1.0 = full)
ze_bgp_overflow_items gauge peer Items in per-destination overflow buffer
ze_bgp_overflow_ratio gauge source Per-source overflow ratio: overflowed / (forwarded + overflowed)
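The per-source ratio in the last row is plain arithmetic, sketched here for clarity. Returning 0.0 when nothing has been forwarded or overflowed is an assumption made for this example; the source does not say how Ze reports that case.

```python
def overflow_ratio(forwarded, overflowed):
    """Compute overflowed / (forwarded + overflowed)."""
    total = forwarded + overflowed
    if total == 0:
        # Assumption: report 0.0 when there has been no traffic at all.
        return 0.0
    return overflowed / total


print(overflow_ratio(forwarded=90, overflowed=10))
```

A value near 1.0 means almost every item from that source went through the overflow path.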

Prefix Limits (RFC 4486)

Metric Type Labels Description
ze_bgp_prefix_count gauge peer, family Current prefix count
ze_bgp_prefix_maximum gauge peer, family Configured hard maximum
ze_bgp_prefix_warning gauge peer, family Warning threshold
ze_bgp_prefix_warning_exceeded gauge peer, family 1 if count >= warning
ze_bgp_prefix_ratio gauge peer, family count / maximum (0.0 to 1.0+)
ze_bgp_prefix_maximum_exceeded_total counter peer, family Times maximum exceeded
ze_bgp_prefix_teardown_total counter peer Sessions torn down for prefix limit
ze_bgp_prefix_stale gauge peer 1 if prefix data older than 6 months
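The derived gauges follow directly from the count, maximum, and warning values. The sketch below reproduces the two relations stated in the table (ratio = count / maximum, warning exceeded when count >= warning); `prefix_status` is a hypothetical helper, and the handling of a zero maximum is an assumption.

```python
def prefix_status(count, maximum, warning):
    """Derive the ratio and warning flag for one (peer, family) pair."""
    # Assumption: treat a zero maximum as "no ratio" rather than dividing by zero.
    ratio = count / maximum if maximum else 0.0
    return {
        "ratio": ratio,                                   # may exceed 1.0
        "warning_exceeded": 1 if count >= warning else 0,
    }


print(prefix_status(count=900, maximum=1000, warning=800))
```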

Single Command

For scripting, use -c to execute a single command and exit:

ze cli -c "show bgp summary"
ze cli -c "show bgp rib received"
ze cli -c "show bgp rpki status"
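From a script, the same invocation can be wrapped in a small function. This is a sketch under assumptions: `ze_argv` and `run_ze` are hypothetical helpers, and the use of `check=True` assumes that `ze cli` exits with a non-zero status when a command fails, which this guide does not confirm.

```python
import subprocess


def ze_argv(command):
    """Build the argument list for a single-command invocation."""
    # Passing a list avoids shell quoting issues with spaces and pipes.
    return ["ze", "cli", "-c", command]


def run_ze(command, timeout=10.0):
    """Run one command and return its standard output as text."""
    result = subprocess.run(
        ze_argv(command),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # assumption: failures produce a non-zero exit status
    )
    return result.stdout


# Example call (requires a running Ze instance):
# print(run_ze("show bgp summary"))
```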