# Compute System

Three execution modes are supported: Firecracker microVMs for lightweight workloads, QEMU full VMs for GPU passthrough and custom kernels, and serverless functions for event-driven compute.
## Executor Architecture

The `aleph-executor` crate manages the full VM lifecycle. Key modules:
| Module | Purpose |
|---|---|
| `executor` | High-level job execution orchestration |
| `lifecycle` | VM state machine (create, start, stop, destroy) |
| `hypervisor` | Hypervisor trait abstraction |
| `firecracker` | Firecracker microVM driver |
| `qemu` | QEMU/KVM full VM driver |
| `volumes` | Persistent volume management |
| `cloud_init` | Cloud-init configuration generation |
| `metering` | Resource usage tracking and reporting |
| `functions` | Serverless function runtime |
| `migration` | Live VM migration between nodes |
| `model_serving` | AI/ML model serving infrastructure |
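The `hypervisor` trait abstraction is what lets `lifecycle` drive both the Firecracker and QEMU drivers through one interface. A minimal sketch of the idea (trait and method names here are assumptions, not the actual API, and the real trait is likely async and richer):

```rust
use std::io;

/// Common interface implemented by both VM drivers. (Sketch only.)
trait Hypervisor {
    fn name(&self) -> &'static str;
    fn start(&mut self, vm_id: &str) -> io::Result<()>;
    fn stop(&mut self, vm_id: &str) -> io::Result<()>;
}

struct FirecrackerDriver {
    running: Vec<String>,
}

impl Hypervisor for FirecrackerDriver {
    fn name(&self) -> &'static str {
        "firecracker"
    }
    fn start(&mut self, vm_id: &str) -> io::Result<()> {
        // Real driver: spawn the firecracker process and configure it
        // over its API socket before marking the VM as running.
        self.running.push(vm_id.to_string());
        Ok(())
    }
    fn stop(&mut self, vm_id: &str) -> io::Result<()> {
        self.running.retain(|id| id != vm_id);
        Ok(())
    }
}
```

With this shape, the state machine in `lifecycle` can hold a `Box<dyn Hypervisor>` and stay agnostic about which driver backs a given job.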
## Firecracker MicroVMs

Primary execution mode for lightweight workloads. Provides strong isolation with minimal overhead.

### Sub-200ms Cold Start

MicroVMs boot in under 200 ms, enabling true serverless-style compute.

### Minimal Footprint

~5 MB of memory overhead per VM, so hundreds of VMs can run on a single host.

### Strong Isolation

Hardware-level isolation via KVM. Each VM gets its own kernel and network stack.
### Configuration

```rust
// Firecracker VM configuration
struct FirecrackerConfig {
    kernel_image_path: PathBuf,
    rootfs_path: PathBuf,
    vcpu_count: u16,
    mem_size_mib: u32,
    network_interfaces: Vec<NetworkInterface>,
    drives: Vec<Drive>,
    boot_args: String,
}
```
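Firecracker itself is configured over an HTTP API on a Unix socket, so a config like the one above ultimately becomes a series of `PUT` requests (for example, `PUT /machine-config` carries the vCPU and memory sizing). A sketch of rendering that one payload (manual formatting here for illustration; a real driver would use serde):

```rust
/// Render the payload for Firecracker's `PUT /machine-config` call
/// from the vcpu/memory fields of the config struct above.
fn machine_config_json(vcpu_count: u16, mem_size_mib: u32) -> String {
    format!(
        "{{\"vcpu_count\":{},\"mem_size_mib\":{}}}",
        vcpu_count, mem_size_mib
    )
}
```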
## QEMU Full VMs

For workloads requiring GPU passthrough, custom kernels, or full OS features.

### GPU Passthrough
```toml
# QEMU with NVIDIA GPU passthrough (IOMMU/VFIO)
[executor.qemu]
enable_gpu = true
gpu_devices = ["0000:01:00.0"]  # PCI address
iommu_group = 1
vfio_driver = "vfio-pci"
```
The scheduler matches GPU job requirements (`gpuType`, `gpuVramMiB`) against registered node capabilities. GPU types are tracked per node in the `NodeRegistry` via `_nodeGpuTypes`.
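The matching rule can be sketched as follows (struct and field names are assumptions for illustration; the real requirement/capability types live in the scheduler and registry):

```rust
/// One GPU advertised by a node.
struct GpuCapability {
    gpu_type: String,
    vram_mib: u32,
}

/// A job's GPU requirement (gpuType / gpuVramMiB from the job spec).
struct GpuRequirement {
    gpu_type: String,
    min_vram_mib: u32,
}

/// A node satisfies the requirement if any of its GPUs matches the
/// requested type and has at least the requested VRAM.
fn node_satisfies(req: &GpuRequirement, node_gpus: &[GpuCapability]) -> bool {
    node_gpus
        .iter()
        .any(|g| g.gpu_type == req.gpu_type && g.vram_mib >= req.min_vram_mib)
}
```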
## Serverless Functions

Event-driven compute using pre-warmed Firecracker pools. The `functions` module maintains a pool of idle microVMs to achieve near-instant invocation.
```rust
// Function invocation flow
// 1. Request arrives at the API server
// 2. Scheduler finds an available function slot
// 3. If a warm pool VM is available: <5 ms invocation
// 4. If a cold start is needed: ~150 ms boot + invoke
// 5. Response returned, VM returned to the pool
pub struct FunctionRuntime {
    pool: VmPool,
    max_concurrent: usize,
    idle_timeout: Duration,
    max_execution_time: Duration,
}
```
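The warm-pool decision in steps 3–4 can be sketched with a simple queue (the real `VmPool` is concurrent and enforces `max_concurrent` and `idle_timeout`; the names below are illustrative):

```rust
use std::collections::VecDeque;

/// Outcome of acquiring a VM for one function invocation.
#[derive(Debug, PartialEq)]
enum Acquired {
    Warm(String), // pre-warmed VM, <5 ms path
    ColdStart,    // no idle VM, ~150 ms boot path
}

/// Pop a pre-warmed VM if one is idle; otherwise signal a cold boot.
fn acquire(pool: &mut VecDeque<String>) -> Acquired {
    match pool.pop_front() {
        Some(vm_id) => Acquired::Warm(vm_id),
        None => Acquired::ColdStart,
    }
}

/// Return the VM to the pool once the response has been sent (step 5).
fn release(pool: &mut VecDeque<String>, vm_id: String) {
    pool.push_back(vm_id);
}
```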
## Cloud-Init

VMs are configured via cloud-init. The `cloud_init` module generates user-data and meta-data documents from job specifications.
```yaml
#cloud-config (auto-generated)
hostname: job-0x1234abcd
users:
  - name: aleph
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... user@host
write_files:
  - path: /etc/aleph/job.env
    content: |
      JOB_ID=0x1234abcd
      NODE_ID=0x5678efgh
runcmd:
  - systemctl start application
```
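Generating a document like the one above is mostly string templating over the job spec. A minimal sketch (the function name and parameters are assumptions; the real `cloud_init` module covers many more fields):

```rust
/// Render a cloud-init user-data document from a job spec.
fn render_user_data(job_id: &str, node_id: &str, ssh_key: &str) -> String {
    format!(
        "#cloud-config\n\
         hostname: job-{job_id}\n\
         users:\n\
         \x20 - name: aleph\n\
         \x20   ssh_authorized_keys:\n\
         \x20     - {ssh_key}\n\
         write_files:\n\
         \x20 - path: /etc/aleph/job.env\n\
         \x20   content: |\n\
         \x20     JOB_ID={job_id}\n\
         \x20     NODE_ID={node_id}\n"
    )
}
```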
## VM Networking

Each VM gets a TAP interface managed by `aleph-vm-networking`. Traffic is routed through nftables rules for isolation and port forwarding.
```rust
// Network setup per VM
// 1. Create TAP interface (tap-{vm_id})
// 2. Assign IP from subnet pool (10.0.x.x/30)
// 3. Configure nftables for NAT + port forwarding
// 4. Attach TAP to Firecracker/QEMU
pub struct VmNetwork {
    tap_name: String,
    vm_ip: Ipv4Addr,
    host_ip: Ipv4Addr,
    forwarded_ports: Vec<PortForward>,
}
```
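Step 2's subnet pool follows from the /30 sizing: each slot holds exactly four addresses (network, host side, VM side, broadcast), so the n-th slot's addresses can be computed directly. A sketch, assuming a 10.0.0.0/16 base prefix (the exact base is an assumption):

```rust
use std::net::Ipv4Addr;

/// Carve the n-th /30 out of 10.0.0.0/16 and return (host_ip, vm_ip).
/// Each /30 spans 4 addresses: network, host side, VM side, broadcast.
fn nth_slot(n: u32) -> (Ipv4Addr, Ipv4Addr) {
    let base = u32::from(Ipv4Addr::new(10, 0, 0, 0));
    let net = base + n * 4; // network address of the n-th /30
    (Ipv4Addr::from(net + 1), Ipv4Addr::from(net + 2))
}
```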
## Resource Metering

The `metering` module tracks per-VM CPU, memory, network, and disk usage via cgroups v2. Metrics are reported in heartbeats and used for payment calculation.
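cgroups v2 exposes these counters as plain-text files (e.g. `cpu.stat` under the VM's cgroup directory), so metering reduces to reading and parsing them on each heartbeat. A sketch of extracting the total CPU time (parsing only; the file path and surrounding plumbing are omitted):

```rust
/// Parse `usage_usec` out of a cgroups v2 `cpu.stat` file body.
/// The real metering module would read this from
/// /sys/fs/cgroup/<vm-cgroup>/cpu.stat on every heartbeat.
fn cpu_usage_usec(cpu_stat: &str) -> Option<u64> {
    cpu_stat.lines().find_map(|line| {
        line.strip_prefix("usage_usec ")
            .and_then(|v| v.trim().parse().ok())
    })
}
```

Memory follows the same pattern via `memory.current`; network and disk counters come from per-interface and io.stat accounting.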