
Trinity Storage Network v1.6 Report

20-Node Scale, Shard Scrubbing, Node Reputation, Graceful Shutdown, Network Stats

Builds on v1.5 (Proof-of-Storage, Shard Rebalancing, Bandwidth Aggregation)


Key Metrics

| Metric | Value | Status |
|---|---|---|
| Total Tests | ~1,415+ (140 integration alone) | PASS |
| v1.6 Integration Tests | 5 new (20-node scale, scrubbing, reputation, shutdown, stats) | PASS |
| v1.6 Unit Tests | ~27 new across 4 modules + extensions | PASS |
| New Zig Modules | 4 (shard_scrubber, node_reputation, graceful_shutdown, network_stats) | Complete |
| Modified Modules | 7 (protocol, shard_rebalancer, storage_discovery, network, main, integration_test, build.zig) | Complete |
| Protocol Messages | 4 new (0x2C-0x2F) | Backward-compatible |
| 20-Node Scale Test | 5 files, 4-node churn, PoS + bandwidth + RS | PASS |
| Shard Scrubbing | 20 nodes, 2 corruptions detected | PASS |
| Node Reputation | 20 nodes ranked, top-5 selection with exclusion | PASS |
| Graceful Shutdown | 10 nodes, 1 departs, 0 data loss | PASS |
| Network Stats | 20 nodes, text + JSON report generation | PASS |

What This Means

For Users

  • Bit-rot protection: the Shard Scrubber periodically re-verifies the SHA256 hashes of all locally stored shards, catching silent data corruption before a Proof-of-Storage challenge would expose it.
  • Smarter peer selection: Node Reputation scores combine PoS pass rate (40%), uptime (30%), and bandwidth contribution (30%) into a composite 0.0-1.0 score. Data is preferentially placed on higher-reputation nodes.
  • Zero-downtime departures: Graceful Shutdown proactively redistributes a departing node's shards before it leaves, avoiding reactive rebalancing delays.

For Operators

  • CLI flags: --scrub enables periodic shard scrubbing, --reputation enables reputation scoring, --report generates a one-shot network health report (text output).
  • Health monitoring: Network Stats Reporter aggregates data from all subsystems (storage, replication, PoS, bandwidth, scrubber, reputation) into a single report.
  • 20-node proven: The integration test suite now validates at 20-node scale with multi-file RS storage, 4-node churn, and all v1.5+v1.6 subsystems running concurrently.

For the Network

  • Production hardening: v1.6 adds the defensive layers needed for long-running deployments: data integrity checking, reputation-based trust, graceful node lifecycle management, and observability.
  • Backward compatible: new message types 0x2C-0x2F fall through to the else wildcard on older nodes, which simply ignore them. Reputation scores are transmitted as score * 1,000,000 (u64) to avoid float serialization issues.
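The fixed-point trick above can be sketched as follows. This is a minimal illustration of the scheme (scores travel as millionths in a u64 so no floats cross the wire); encodeScore/decodeScore are illustrative names, not the real protocol.zig API.

```zig
const std = @import("std");

// Scores are clamped to [0.0, 1.0] and scaled by 1,000,000, so the
// largest wire value is 1_000_000 and precision is one millionth.
const SCALE: f64 = 1_000_000.0;

fn encodeScore(score: f64) u64 {
    const clamped = std.math.clamp(score, 0.0, 1.0);
    return @intFromFloat(@round(clamped * SCALE));
}

fn decodeScore(wire: u64) f64 {
    return @as(f64, @floatFromInt(wire)) / SCALE;
}

pub fn main() void {
    const wire = encodeScore(0.875);
    std.debug.print("wire={d} decoded={d:.3}\n", .{ wire, decodeScore(wire) });
}
```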

Technical Details

Architecture

v1.6 Module Dependency Graph:

┌─ protocol.zig ─────────────────────────────────────────┐
│  ShardScrubReportMsg   (0x2C, 48B)                     │
│  ReputationQueryMsg    (0x2D, 64B)                     │
│  ReputationResponseMsg (0x2E, 64B)                     │
│  GracefulShutdownMsg   (0x2F, 44B)                     │
└──────┬─────────────┬──────────────┬─────────────┬──────┘
       │             │              │             │
┌──────▼─────┐ ┌─────▼───────┐ ┌────▼──────┐ ┌────▼─────────┐
│ shard_     │ │ node_       │ │ graceful_ │ │ network_     │
│ scrubber   │ │ reputation  │ │ shutdown  │ │ stats        │
│ .zig       │ │ .zig        │ │ .zig      │ │ .zig         │
│            │ │             │ │           │ │              │
│ Scrubber   │ │ Reputation  │ │ Shutdown  │ │ Stats        │
│ ScrubRes   │ │ System      │ │ Manager   │ │ Reporter     │
│ ScrubStat  │ │ Weights     │ │ Plan      │ │ HealthReport │
└──────┬─────┘ │ Score       │ └────┬──────┘ └────┬─────────┘
       │       └─────┬───────┘      │             │
┌──────▼─────────────▼──────────────▼─────────────▼──────┐
│                      network.zig                       │
│  Wires all 4 modules: poll() triggers scrub/reputation │
│  handleConnection dispatches 0x2C-0x2F messages        │
└───────────────────────────┬────────────────────────────┘
                            │
                    ┌───────▼────────┐
                    │ main.zig       │
                    │  --scrub       │
                    │  --reputation  │
                    │  --report      │
                    └────────────────┘

New Modules

1. Shard Scrubber (shard_scrubber.zig)

Background SHA256 re-verification of locally-held shards. Detects bit-rot, disk corruption, and tampering between PoS challenge rounds.

  • scrubNode(provider) — iterates all shards in a StorageProvider, re-computes SHA256, flags mismatches
  • isCorrupted(hash) / getCorruptedShards() — query corruption state
  • shouldScrub() — respects configurable interval (default 600s)
  • getStats() — returns ScrubStats { shards_checked, corruptions_found, last_scrub_time }

2. Node Reputation (node_reputation.zig)

Composite reputation scoring for smarter peer selection.

  • Weights: PoS pass rate (40%) + uptime (30%) + bandwidth contribution (30%)
  • recordPosResult(node_id, passed) — increments PoS pass/total counters
  • recordUptime(node_id, uptime_secs, window_secs) — stores uptime fraction
  • recordBandwidth(node_id, bytes) — tracks cumulative bandwidth; normalized against max_bandwidth_bytes
  • getScore(node_id) — returns weighted composite 0.0-1.0
  • rankNodes() — returns all nodes sorted by score descending
  • selectBestPeers(count, exclude, alloc) — picks top-N peers excluding one (for rebalancing/replication)
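The 40/30/30 weighting behind getScore() reduces to a single weighted sum. The sketch below assumes each component is already normalized to 0.0-1.0 (bandwidth divided by max_bandwidth_bytes, as described above); compositeScore is an illustrative name.

```zig
const std = @import("std");

// Weighted composite: PoS pass rate 40%, uptime 30%, bandwidth 30%.
fn compositeScore(pos_pass_rate: f64, uptime: f64, bandwidth: f64) f64 {
    return 0.40 * pos_pass_rate + 0.30 * uptime + 0.30 * bandwidth;
}

pub fn main() void {
    // A node that passes every PoS challenge, is up 90% of the window,
    // and contributes half the reference bandwidth:
    const score = compositeScore(1.0, 0.9, 0.5);
    std.debug.print("score = {d:.2}\n", .{score});
}
```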

3. Graceful Shutdown (graceful_shutdown.zig)

Pre-departure shard redistribution — proactive vs. reactive rebalancing.

  • initiateShutdown(node_id, rebalancer) — scans rebalancer for all shards held by node, creates plan
  • executeShutdown(node_id, rebalancer, peers, peer_ids) — calls rebalancer.removeNode() then rebalancer.rebalance()
  • isShuttingDown(node_id) — check if shutdown in progress
  • Tracks completed_plans and total_shards_moved statistics
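The planning half of the two-phase departure can be sketched as a scan over shard locations. ShardLocation and planShutdown below are stand-ins for the real rebalancer bookkeeping, purely to illustrate the initiate-then-execute split.

```zig
const std = @import("std");

// (shard_id, holder_id) pairs, as a rebalancer might track them.
const ShardLocation = struct { shard_id: u64, holder_id: u64 };

// Phase 1: collect every shard the departing node holds.
// Phase 2 (executeShutdown) would re-place each of these before exit.
fn planShutdown(alloc: std.mem.Allocator, departing: u64, locations: []const ShardLocation) !std.ArrayList(u64) {
    var to_move = std.ArrayList(u64).init(alloc);
    errdefer to_move.deinit();
    for (locations) |loc| {
        if (loc.holder_id == departing) try to_move.append(loc.shard_id);
    }
    return to_move;
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const locations = [_]ShardLocation{
        .{ .shard_id = 1, .holder_id = 7 },
        .{ .shard_id = 2, .holder_id = 3 },
        .{ .shard_id = 3, .holder_id = 7 },
    };
    var plan = try planShutdown(gpa.allocator(), 7, &locations);
    defer plan.deinit();
    std.debug.print("shards to move: {d}\n", .{plan.items.len});
}
```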

4. Network Stats (network_stats.zig)

Aggregated health report from all subsystems.

  • generateReport(peers, rebalancer, pos, bw_agg, registry, scrubber?, reputation?) — collects metrics from all subsystems
  • formatText(report) — human-readable text output
  • formatJson(report) — machine-readable JSON output
  • NetworkHealthReport fields: node_count, total_shards, storage bytes, replication stats, PoS stats, bandwidth, scrub results, reputation avg/min/max
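formatJson() can be approximated with std.json.stringify over a plain struct. The Report struct below is a hypothetical subset of NetworkHealthReport's fields, shown only to illustrate the machine-readable output path.

```zig
const std = @import("std");

// Hypothetical slice of the health report; field names are guesses
// based on the list above, not the real network_stats.zig layout.
const Report = struct {
    node_count: u32,
    total_shards: u64,
    corruptions_found: u64,
    reputation_avg: f64,
};

pub fn main() !void {
    const report = Report{
        .node_count = 20,
        .total_shards = 85,
        .corruptions_found = 0,
        .reputation_avg = 0.91,
    };
    var buf: [256]u8 = undefined;
    var fbs = std.io.fixedBufferStream(&buf);
    // Struct fields become JSON keys automatically.
    try std.json.stringify(report, .{}, fbs.writer());
    std.debug.print("{s}\n", .{fbs.getWritten()});
}
```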

Protocol Extensions

| Message | Opcode | Size | Purpose |
|---|---|---|---|
| ShardScrubReportMsg | 0x2C | 48B | Report scrub results to network |
| ReputationQueryMsg | 0x2D | 64B | Request reputation score for a node |
| ReputationResponseMsg | 0x2E | 64B | Reply with reputation score (millionths) |
| GracefulShutdownMsg | 0x2F | 44B | Announce impending departure |
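Fixed-width serialize/deserialize in the style these messages imply can be sketched as below. The real field layout of GracefulShutdownMsg is not shown in this report; the 1-byte type + 32-byte node id + 4-byte shard count + padding split here is purely illustrative of the pattern.

```zig
const std = @import("std");

const GracefulShutdownMsg = struct {
    node_id: [32]u8,
    shard_count: u32,

    const TYPE: u8 = 0x2F; // opcode from the table above
    const SIZE: usize = 44; // fixed wire size from the table above

    fn serialize(self: GracefulShutdownMsg, out: *[SIZE]u8) void {
        @memset(out, 0); // zero padding bytes
        out[0] = TYPE;
        @memcpy(out[1..33], &self.node_id);
        std.mem.writeInt(u32, out[33..37], self.shard_count, .little);
    }

    fn deserialize(buf: *const [SIZE]u8) GracefulShutdownMsg {
        var msg = GracefulShutdownMsg{ .node_id = undefined, .shard_count = 0 };
        @memcpy(&msg.node_id, buf[1..33]);
        msg.shard_count = std.mem.readInt(u32, buf[33..37], .little);
        return msg;
    }
};

pub fn main() void {
    var out: [GracefulShutdownMsg.SIZE]u8 = undefined;
    const msg = GracefulShutdownMsg{ .node_id = [_]u8{0xAB} ** 32, .shard_count = 5 };
    msg.serialize(&out);
    const back = GracefulShutdownMsg.deserialize(&out);
    std.debug.print("type=0x{X} shard_count={d}\n", .{ out[0], back.shard_count });
}
```

Fixed layouts like this keep every message a known size, which is what lets older nodes skip unknown 0x2C-0x2F frames cleanly.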

Modified Modules

| Module | Changes |
|---|---|
| shard_rebalancer.zig | Added getShardLocationsForNode() for shutdown planning |
| storage_discovery.zig | Added reputation_score field, updateReputation()/getReputation() methods |
| network.zig | 4 imports, 4 nullable module fields, 3 message handlers, poll() scrub/reputation |
| main.zig | 4 imports, --scrub/--reputation/--report flags, initialization, report handler |
| protocol.zig | 4 new MessageType values, 4 message structs with serialize/deserialize |
| build.zig | 4 new test targets (shard_scrubber, node_reputation, graceful_shutdown, network_stats) |

Integration Tests

| Test | Nodes | What It Validates |
|---|---|---|
| 20-node multi-file RS with churn + PoS + bandwidth | 20 | 5 files stored, 4 nodes killed, rebalance, PoS challenge, bandwidth aggregate |
| Shard scrubbing — detect corruption | 20 | 20 shards stored, 2 tampered, scrubber finds both |
| Node reputation ranking | 20 | Proportional PoS/uptime/bandwidth, sorted ranking, top-5 selection |
| Graceful shutdown redistribution | 10 | 5 shards, 1 node departs gracefully, 0 data loss |
| Network stats report | 20 | Full report generation, text format, JSON format |

Cumulative Version History

| Version | Features | Test Count |
|---|---|---|
| v1.0 | Storage, Sharding, Encryption | ~200 |
| v1.1 | Disk Persistence, LRU Eviction | ~400 |
| v1.2 | Protocol, Networking, Discovery | ~600 |
| v1.3 | Rewards, Bandwidth Metering | ~800 |
| v1.4 | Reed-Solomon, Connection Pool, DHT, 12-node test | ~1,100 |
| v1.5 | Proof-of-Storage, Rebalancer, Bandwidth Aggregation | ~1,382 |
| v1.6 | Shard Scrubbing, Reputation, Graceful Shutdown, Network Stats, 20-node test | ~1,415+ |

Conclusion

Trinity Storage Network v1.6 completes the production-hardening layer:

  1. Data integrity — Shard Scrubber provides a second line of defense against data corruption, complementing PoS challenges with periodic local re-verification.
  2. Trust management — Node Reputation enables data-driven peer selection, incentivizing reliability and contribution.
  3. Lifecycle management — Graceful Shutdown eliminates the "pull the plug" failure mode, allowing orderly node departures.
  4. Observability — Network Stats Reporter provides a unified view of network health across all subsystems.
  5. Scale validation — 20-node integration tests with multi-file RS, churn, PoS, bandwidth, scrubbing, and reputation running concurrently.

Next Steps (v1.7 Candidates)

  • Erasure-coded repair from scrub results: When scrubber detects corruption, automatically trigger RS recovery from healthy replicas
  • Reputation decay: Time-weighted scoring so stale reputation data fades
  • Multi-region topology awareness: Prefer cross-region replication for geographic redundancy
  • Incentive slashing: Reduce rewards for nodes with low reputation scores
  • Prometheus/Grafana export: Machine-consumable metrics endpoint for external monitoring