# Storage System

Content-addressed blob store with RocksDB indexing, Reed-Solomon erasure coding, Merkle proof verification, and on-chain challenge/response auditing.

## Architecture

### Module Overview

The `aleph-storage` crate exports:
| Module | Key Types | Purpose |
|---|---|---|
| `engine` | `StorageEngine` | Core blob store: put, get, delete, exists |
| `index` | `StorageIndex`, `BlobMetadata` | RocksDB index for content-addressed lookup |
| `cache` | `ContentCache`, `EvictionPolicy` | In-memory LRU/LFU cache layer |
| `cached_engine` | `CachedStorageEngine` | Engine + cache composition |
| `chunking` | `ChunkingEngine`, `ChunkManifest` | Fixed-size chunking with manifests |
| `merkle` | `MerkleTree`, `MerkleProof` | Merkle tree construction and proof generation |
| `proofs` | `StorageProofGenerator`, `ChallengeResponder` | On-chain storage proof generation |
| `replication` | `ErasureEncoder`, `ReplicationManager` | Reed-Solomon encoding and shard placement |
| `ipfs` | `CidV0`, `CidV1`, `IpfsGateway` | IPFS CID compatibility and gateway |
| `gc` | — | Garbage collection: cleanup of unreferenced blobs |
## Content Addressing

All data is stored under its SHA-256 content hash. The `StorageEngine` trait provides the core interface:
```rust
pub trait StorageEngine {
    async fn put(&self, data: &[u8]) -> Result<ContentHash>;
    async fn get(&self, hash: &ContentHash) -> Result<Vec<u8>>;
    async fn exists(&self, hash: &ContentHash) -> bool;
    async fn delete(&self, hash: &ContentHash) -> Result<()>;
    async fn size(&self, hash: &ContentHash) -> Result<u64>;
}
```
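The contract is easiest to see in a toy form. The sketch below is a std-only stand-in (a `HashMap` in place of the real backend, and Rust's 64-bit `DefaultHasher` standing in for SHA-256; `ToyEngine` and `content_hash` are illustrative names, not part of the crate) showing the defining property: identical content yields the same key, so writes deduplicate automatically.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Toy content hash: a 64-bit DefaultHasher stands in for SHA-256.
fn content_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

// In-memory stand-in for the engine's put/get/exists operations.
struct ToyEngine {
    blobs: HashMap<u64, Vec<u8>>,
}

impl ToyEngine {
    fn new() -> Self {
        ToyEngine { blobs: HashMap::new() }
    }

    fn put(&mut self, data: &[u8]) -> u64 {
        let hash = content_hash(data);
        // Deduplication falls out of content addressing:
        // identical bytes map to an identical key.
        self.blobs.entry(hash).or_insert_with(|| data.to_vec());
        hash
    }

    fn get(&self, hash: &u64) -> Option<&Vec<u8>> {
        self.blobs.get(hash)
    }

    fn exists(&self, hash: &u64) -> bool {
        self.blobs.contains_key(hash)
    }
}
```

Because keys are derived from content, `put` is idempotent: storing the same bytes twice returns the same hash and consumes no extra space.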
## Chunking

Large files are split into fixed-size chunks (default 256 KiB, configurable between `MIN_CHUNK_SIZE` and `MAX_CHUNK_SIZE`). Each chunk is stored as an independent blob and tracked by a `ChunkManifest`:
```rust
pub struct ChunkManifest {
    pub content_hash: ContentHash, // hash of the original file
    pub chunks: Vec<ChunkInfo>,
    pub total_size: u64,
}

pub struct ChunkInfo {
    pub hash: ContentHash,
    pub offset: u64,
    pub size: u32,
}
```
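The offset/size bookkeeping in `ChunkInfo` can be sketched with a toy chunker (std-only; `CHUNK_SIZE` is shrunk to 4 bytes for readability, whereas the real default is 256 KiB, and `chunk_layout` is an illustrative helper, not crate API):

```rust
// Toy chunk size for illustration; the real default is 256 KiB.
const CHUNK_SIZE: usize = 4;

// Split `data` into fixed-size pieces, recording the (offset, size)
// layout that a ChunkManifest would track per chunk.
fn chunk_layout(data: &[u8]) -> Vec<(u64, u32)> {
    data.chunks(CHUNK_SIZE)
        .scan(0u64, |offset, chunk| {
            let entry = (*offset, chunk.len() as u32);
            *offset += chunk.len() as u64;
            Some(entry)
        })
        .collect()
}
```

Only the final chunk may be short, so chunk sizes always sum to the manifest's `total_size`.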
## Merkle Trees &amp; Proofs

Each storage commitment builds a Merkle tree over its chunk hashes. The root is committed on-chain in the `StorageRegistry` contract.
```rust
// Build a Merkle tree from chunk hashes
let tree = MerkleTree::from_leaves(&chunk_hashes);
let root = tree.root();

// Generate a proof for a specific chunk
let proof = tree.proof(chunk_index);

// Verify the proof (done on-chain via StorageRegistry)
assert!(proof.verify(root, leaf_hash));
```
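To make the verification step concrete, here is a self-contained sketch of root construction, proof generation, and verification. It uses a 64-bit `DefaultHasher` as a stand-in for SHA-256 and duplicates the last node on odd levels, a common padding convention (the real `MerkleTree` may pad differently; all function names here are illustrative):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy parent hash; the real tree hashes with SHA-256.
fn h2(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

// Fold the leaf level upward, duplicating the last node on odd levels.
fn merkle_root(leaves: &[u64]) -> u64 {
    let mut level = leaves.to_vec();
    while level.len() > 1 {
        if level.len() % 2 == 1 {
            level.push(*level.last().unwrap());
        }
        level = level.chunks(2).map(|p| h2(p[0], p[1])).collect();
    }
    level[0]
}

// A proof is the sibling hash at each level, tagged with whether
// that sibling sits on the left of the running hash.
fn merkle_proof(leaves: &[u64], mut index: usize) -> Vec<(u64, bool)> {
    let mut level = leaves.to_vec();
    let mut proof = Vec::new();
    while level.len() > 1 {
        if level.len() % 2 == 1 {
            level.push(*level.last().unwrap());
        }
        let sibling = index ^ 1;
        proof.push((level[sibling], sibling < index));
        index /= 2;
        level = level.chunks(2).map(|p| h2(p[0], p[1])).collect();
    }
    proof
}

// Re-hash from the leaf to the root and compare against the committed
// root; this mirrors what the on-chain verifier does.
fn verify(root: u64, leaf: u64, proof: &[(u64, bool)]) -> bool {
    let acc = proof.iter().fold(leaf, |acc, &(sib, sib_is_left)| {
        if sib_is_left { h2(sib, acc) } else { h2(acc, sib) }
    });
    acc == root
}
```

Note that verification needs only the leaf, the root, and log2(n) sibling hashes, which is what makes on-chain checking affordable.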
## Erasure Coding

Reed-Solomon encoding provides redundancy: a blob is split into data shards and extended with parity shards, and the resulting shards are distributed across nodes for fault tolerance.
```rust
// Encode with Reed-Solomon
let encoder = ErasureEncoder::new(
    data_shards,   // e.g., 4
    parity_shards, // e.g., 2 (tolerates 2 lost shards)
);
let shards: Vec<Shard> = encoder.encode(&data)?;

// Place shards across nodes
let placements = ReplicationManager::place_shards(
    &shards,
    &available_nodes,
    replication_factor,
)?;
```
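The redundancy trade-off is simple arithmetic: with k data shards and m parity shards, any k of the k + m shards suffice to reconstruct the blob, so the layout tolerates m lost shards at a storage overhead of (k + m)/k. A sketch, assuming the blob is split evenly across data shards (`shard_layout` is an illustrative helper, not crate API):

```rust
// Storage cost of a (data_shards, parity_shards) Reed-Solomon layout:
// returns (bytes per shard, total bytes stored across all shards).
fn shard_layout(blob_bytes: u64, data_shards: u64, parity_shards: u64) -> (u64, u64) {
    // Each shard holds an equal slice of the blob (rounded up).
    let shard_size = (blob_bytes + data_shards - 1) / data_shards;
    let total_stored = shard_size * (data_shards + parity_shards);
    (shard_size, total_stored)
}
```

For the 4 + 2 example above, a 1 KiB blob costs 1.5 KiB of storage while tolerating two lost shards; plain 3-way replication would need 3 KiB for the same fault tolerance.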
## Challenge/Response

Storage providers must prove data possession via on-chain challenges:
- Challenger issues a challenge with a random seed via `StorageRegistry.issueChallenge()`
- Node computes the challenge response, using the seed to select which chunks to prove
- Node submits a Merkle proof via `StorageRegistry.respondToChallenge()`
- Contract verifies the proof on-chain using `MerkleProof.verify()`
- Failed challenges trigger slashing via `StakingManager.slash()`
```rust
// Challenge response flow (node side)
let responder = ChallengeResponder::new(&storage_engine);
let response: ChallengeResponse = responder
    .respond(challenge_seed, commitment_id)
    .await?;

// Submit the proof on-chain
storage_registry
    .respondToChallenge(challenge_id, response.proof, response.leaf)
    .send()
    .await?;
```
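The seed's role is to make chunk selection unpredictable yet reproducible: the responder and the contract expand the same seed into the same chunk indices, so a prover cannot answer with a chunk it happens to still hold. A sketch, assuming splitmix64-style seed expansion (the actual derivation in `aleph-storage` may differ; both function names are illustrative):

```rust
// Splitmix64-style step: expand one seed into a stream of values.
fn next(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

// Derive which chunk indices a challenge targets. Any party holding
// the seed computes the same list, so selection is verifiable.
fn challenged_chunks(seed: u64, chunk_count: u64, samples: usize) -> Vec<u64> {
    let mut state = seed;
    (0..samples).map(|_| next(&mut state) % chunk_count).collect()
}
```

Sampling several chunks per challenge makes partial storage risky: a node holding only half the chunks fails at least one of five samples with probability about 97%.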
## Caching

The `CachedStorageEngine` wraps the base engine with an in-memory cache supporting LRU and LFU eviction policies:
```rust
let cache = ContentCache::new(CacheConfig {
    max_size_bytes: 512 * 1024 * 1024, // 512 MiB
    eviction_policy: EvictionPolicy::LRU,
});
let engine = CachedStorageEngine::new(base_engine, cache);
```
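The LRU policy can be sketched std-only: a byte-bounded cache that evicts from the least-recently-used end whenever the byte budget is exceeded (illustrative only; `LruCache` is not the crate's type, and the real `ContentCache` also implements LFU and uses a more efficient recency structure than `retain`):

```rust
use std::collections::{HashMap, VecDeque};

// Minimal byte-bounded LRU cache sketch.
struct LruCache {
    max_bytes: usize,
    used_bytes: usize,
    map: HashMap<u64, Vec<u8>>,
    order: VecDeque<u64>, // front = least recently used
}

impl LruCache {
    fn new(max_bytes: usize) -> Self {
        LruCache {
            max_bytes,
            used_bytes: 0,
            map: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    fn get(&mut self, key: u64) -> Option<&Vec<u8>> {
        if self.map.contains_key(&key) {
            // A hit moves the key to the back: most recently used.
            self.order.retain(|&k| k != key);
            self.order.push_back(key);
        }
        self.map.get(&key)
    }

    fn put(&mut self, key: u64, value: Vec<u8>) {
        if let Some(old) = self.map.remove(&key) {
            self.used_bytes -= old.len();
            self.order.retain(|&k| k != key);
        }
        self.used_bytes += value.len();
        self.map.insert(key, value);
        self.order.push_back(key);
        // Evict from the LRU end until back under budget.
        while self.used_bytes > self.max_bytes {
            let victim = self.order.pop_front().expect("order tracks map");
            self.used_bytes -= self.map.remove(&victim).unwrap().len();
        }
    }
}
```

Accounting in bytes rather than entry count matters for a blob store, since a single 256 KiB chunk can displace many small blobs.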