P2P Network Overview
VIZ Ledger uses a custom DLT P2P protocol that replaced the legacy graphene synopsis-based networking layer. The new design is optimized for DLT mode (snapshot-based nodes with rolling block logs) and removes the complex graphene ancestry synopsis in favour of a simpler range-based block exchange.
Architecture
┌─────────────────────────────────────────────────────────┐
│ p2p_plugin (AppBase plugin) │
│ └─ dlt_delegate (implements dlt_p2p_delegate) │
│ └─ bridges chain state: db(), fork_db, block_log │
├─────────────────────────────────────────────────────────┤
│ dlt_p2p_node │
│ ├─ accept loop (incoming TCP connections) │
│ ├─ periodic task (5s tick: reconnect, stats, gaps) │
│ └─ dlt_peer_state × N (one per connected peer) │
├─────────────────────────────────────────────────────────┤
│ Wire format: raw TCP, header (type + length) + data │
│ Fiber model: all I/O on one fc::thread, cooperative │
└─────────────────────────────────────────────────────────┘Design decisions
| Decision | Rationale |
|---|---|
| Delegate pattern | dlt_p2p_node links only fc + graphene_protocol. Direct chain access is exposed via dlt_p2p_delegate to avoid circular dependency. |
| Raw TCP (no STCP encryption) | DLT emergency mode flips all validators simultaneously — no backward-compatible encryption needed. Simpler wire protocol. |
| Cooperative fibers (fc::thread) | All I/O uses readsome()/writesome() which yield the fiber. Multiple peers on one thread without mutexes. |
| Separate P2P mempool | Chain's _pending_tx only takes effect after acceptance. The P2P mempool filters by expiry, TaPoS, and size before pushing to chain, reducing wasted evaluation. |
| In-place plugin replacement | Plugin name is still "p2p", port still 2001/4243, public API unchanged. Old and new protocols are incompatible; dual-mode creates isolated sub-networks. |
Peer Lifecycle
Each peer connection goes through the following states:
CONNECTING ──(TCP established)──► HANDSHAKING
5s timeout ↓ ↓ 10s timeout
DISCONNECTED hello/hello_reply
▲ ↓
│ SYNCING ──(caught up)──► ACTIVE
│ │
└──(disconnect/error)──────────────────┘
│
BANNED ◄──(spam_strikes ≥ 10)──────────┘Timeout values:
- Connecting → DISCONNECTED: 5 seconds
- Handshaking → DISCONNECTED: 10 seconds
Reconnection backoff: 30 s → 60 s → … → 3600 s with ±25% jitter. Backoff resets after a stable connection of >5 minutes. Peers with no response for 8 hours are permanently removed.
Emergency peer reset: If all peers are isolated (zero active connections) for 60 seconds, emergency_peer_reset() clears all soft bans and resets all backoffs to their initial value with immediate reconnect attempts.
Hello Handshake
On connection the initiating peer sends dlt_hello_message containing:
head_block_num/head_block_idlib_block_num/lib_block_iddlt_earliest_block_num— oldest block available in the peer's rolling DLT block lognode_status— SYNC or FORWARD
The receiving peer responds with dlt_hello_reply_message containing:
fork_alignment— whether the blocks overlap on the same forkexchange_enabled— whether the responding peer considers the sender caught up
Fork alignment check (DLT-range-aware)
Because DLT nodes prune old blocks, naive head-ID comparison would falsely flag same-chain peers as "different fork". The check is multi-tiered:
| Case | Check |
|---|---|
Peer has no blocks (head_num == 0) | Aligned |
| Peer head is in our DLT range | is_block_known(peer.head_id) |
| Peer head + 1 == our earliest block | Read our earliest block, verify previous == peer.head_id |
| Fallback | is_block_known(peer.lib_id) |
Sync Modes
Each node is in one of two modes at any time:
SYNC mode (pull-based)
Used when the node is behind the network. The node requests blocks in ranges of up to 200 blocks from a peer:
us peer
│──dlt_get_block_range──►│
│◄──dlt_block_range_reply─│
│ (up to 200 blocks) │
│──apply each block──►chain│
│ │
│ (when is_last=true) │
│──transition_to_forward │Gap detection: If our_head + 1 < peer.dlt_earliest (the missing blocks are no longer in the peer's rolling log), the node searches for another peer that can bridge the gap. If no peer can serve the gap, a snapshot import is recommended.
Stagnation protection: If no block is received for 30 seconds, the node retries up to 3 times, then transitions to FORWARD mode with a warning.
FORWARD mode (push-based)
Used when the node is caught up. Blocks are gossiped via dlt_block_message. Each block is broadcast to all exchange-enabled peers that share the same fork.
FORWARD → SYNC fallback: If the node's head does not advance for 30 seconds (check_forward_stagnation) and at least one peer is ahead, the node re-enters SYNC mode.
SYNC ↔ FORWARD transitions
| Transition | Trigger |
|---|---|
| SYNC → FORWARD | Block range reply with is_last=true |
| SYNC → FORWARD | check_sync_catchup(): our head ≥ all peers |
| SYNC → FORWARD | Stagnation after 3 retries |
| FORWARD → SYNC | check_forward_stagnation(): head stuck for 30s and a peer is ahead |
| FORWARD → SYNC | check_forward_behind(): peer ahead by >2 blocks (15 s grace after entering FORWARD) |
| FORWARD → SYNC | Gap fill fails and no peer is available |
On SYNC → FORWARD, the node broadcasts a dlt_fork_status_message with node_status=FORWARD to all connected peers, enabling them to re-evaluate exchange_enabled for this node.
Gap Fill
Gap fill is a lightweight mechanism to fetch a small number of specific blocks without entering full SYNC mode. It uses two dedicated message types (dlt_gap_fill_request / dlt_gap_fill_reply) and triggers in three places:
- When an out-of-order block arrives (
on_dlt_block_reply) - Every 5 seconds from
periodic_task() - After snapshot pause completes (
resume_block_processing())
Rules:
- Maximum 100 blocks per request (
GAP_FILL_MAX_BLOCKS); larger gaps use chunked requests. - 5-second cooldown between gap fill requests.
- The requesting peer selects the active peer with the highest head block number.
- The serving peer reads blocks from its DLT block log; requests outside the log range are rejected.
- SYNCING lifecycle peers are eligible candidates (not just ACTIVE).
- If no suitable peer is found, the node transitions immediately to SYNC mode.
Mempool
The DLT P2P layer maintains its own mempool separate from the chain's _pending_tx. This allows early filtering before pushing transactions to the chain evaluator.
Admission checks:
- Duplicate by
tx_id— deduplicated on receipt - Expiry — reject if already expired
- TaPoS (
tapos_block_num) — reject if reference block is unknown - Size — reject if
tx.size > dlt-mempool-max-tx-size(default 64 KB) - Expiration horizon — reject if expiry is more than
dlt-mempool-max-expiration-hours(default 24 h) in the future
Eviction: When the mempool exceeds dlt-mempool-max-tx (default 10 000) or dlt-mempool-max-bytes (default 100 MB), the entry with the nearest expiration is evicted first.
Lifecycle:
- Transactions received during SYNC are tagged provisional and revalidated on transition to FORWARD (TaPoS blocks may now be known).
- On block application, included transactions are pruned (
remove_transactions_in_block). - On fork switch, TaPoS-invalid entries are pruned (
prune_mempool_on_fork_switch). periodic_mempool_cleanup()removes expired and TaPoS-invalid entries every cycle.
Fork Resolution
The DLT P2P layer tracks fork state with a 42-block threshold (2 full validator rounds = CHAIN_MAX_VALIDATORS × 2).
track_fork_state() is called after each block application. When a competing fork is detected and sustained for ≥ 42 blocks, resolve_fork() computes the heaviest branch by total vote weight. A candidate branch must accumulate 6 consecutive confirmation blocks (dlt_fork_resolution_state::CONFIRMATION_BLOCKS) before the node switches to it (hysteresis).
The current fork status is exposed via is_on_majority_fork(), which the Validator Plugin uses to decide whether to produce blocks.
Anti-Spam
Each peer has a single spam_strikes counter:
- Incremented on: invalid block, invalid transaction, protocol violation
- Reset on: any valid packet
- Soft-ban threshold: 10 strikes
A soft-banned peer receives dlt_soft_ban_message (containing ban_duration_sec and a human-readable reason) before the connection is closed. The banned peer enters BANNED state for the specified duration and will not reconnect until it expires.
Per-IP connection dedup prevents multiple connections from the same node:
accept_loop()rejects incoming connections from IPs with an existing active entry.connect_to_peer()skips outbound connections if the target IP already has an active entry.- Broadcast (
send_to_all_our_fork_peers) tracks aset<ip::address>and skips IPs already sent to in that broadcast.
Duplicate / out-of-order block tolerance:
- Already-applied blocks are silently skipped (not counted as spam).
- Out-of-order blocks in range replies fall through to
fork_dbinstead of triggering a soft ban. - Deserialization errors do not increment spam strikes.
- Oversized messages from old-protocol peers trigger a disconnect without increasing backoff.
Peer Exchange
Nodes share peer addresses to assist discovery.
Rate limit: 3 requests per 5-minute window per peer.
Filters applied before sharing a peer address:
- Minimum uptime: 600 seconds
- Subnet diversity: maximum 2 peers per /24 subnet
- Ephemeral-port exclusion:
is_incomingpeers are never shared (their port is temporary)
Limits per reply: dlt-peer-exchange-max-per-reply (default 10).
Block Processing Pause/Resume
The snapshot plugin (and other plugins requiring exclusive access) can halt P2P block intake via pause_block_processing(). While paused:
periodic_task()skips operations that need database read locks:sync_stagnation_check(),periodic_peer_exchange(),log_peer_stats().- Stale sync and forward stagnation timers are reset so the node does not enter unnecessary mode transitions.
- Non-DB housekeeping continues: reconnect, lifecycle management, mempool cleanup, banned-peer unban.
On resume_block_processing(), the node attempts gap fill before falling back to SYNC mode.
Configuration
| Option | Default | Description |
|---|---|---|
p2p-endpoint | 0.0.0.0:2001 | Listen address and port |
seed-node | — | Static seed peer address(es) |
p2p-max-connections | — | Maximum simultaneous peer connections |
dlt-block-log-max-blocks | 100000 | Rolling DLT block log capacity |
dlt-peer-max-disconnect-hours | 8 | Remove non-responding peer after N hours |
dlt-mempool-max-tx | 10000 | Hard cap on mempool entry count |
dlt-mempool-max-bytes | 104857600 | Hard cap on total mempool memory (100 MB) |
dlt-mempool-max-tx-size | 65536 | Reject transactions larger than this (64 KB) |
dlt-mempool-max-expiration-hours | 24 | Reject transactions expiring more than N hours in future |
dlt-peer-exchange-max-per-reply | 10 | Max addresses returned per peer-exchange reply |
dlt-peer-exchange-max-per-subnet | 2 | Max peers shared per /24 subnet |
dlt-peer-exchange-min-uptime-sec | 600 | Min peer uptime before sharing address |
dlt-stats-interval-sec | 300 | Interval between peer stats log output (min 30 s) |
Peer Statistics Log
Every dlt-stats-interval-sec (default 5 minutes) the node logs a peer statistics summary:
[DLT-P2P] node=FORWARD head=#79274318 lib=#79274297 fork=MAJORITY
peer 192.168.1.10:2001 ACTIVE head=#79274318 exch=YES dlt=[79174319..79274318] strikes=0
peer 192.168.1.11:2001 SYNCING head=#79274100 exch=no dlt=[79174319..79274100] strikes=0
peer 192.168.1.12:2001 BANNED ban_remaining=3540sFields:
exch=YES/no— whether block/transaction exchange is enabled with this peerdlt=[min..max]— DLT block log range the peer can servestrikes— current spam strike count (resets on any valid packet)ban_remaining— seconds until soft ban expires
The stats interval can be updated at runtime via set_stats_log_interval().
Diagnostic Summary
| Symptom | Likely cause |
|---|---|
| Node stuck in SYNC, head not advancing | Gap between our head and peer's DLT range — peer cannot bridge; consider snapshot import |
| Rapid SYNC ↔ FORWARD oscillation | No peer is ahead, or all peers isolated — check emergency_peer_reset log entries |
All peers show exch=no | FORWARD transition did not notify peers; should self-resolve on next broadcast_chain_status cycle |
spam_strikes growing on all peers | Likely fork divergence — check fork alignment via hello logs |
unlinked_size growing in fork_db | Parent blocks are not arriving; gap fill should recover within 5s |
peer_head_num appears stale in stats | Expected — peer_head_num is a snapshot from the last hello/fork_status exchange, not real-time |
See also: Messages, Sync Scenarios, Forward Mode, Stats Reference, Snapshot, Fork Resolution.