v1 · verifiable inference · current · stabilising
Every rollout proves the right checkpoint
Training happens — the reference checkpoint rotates on schedule. v1 carries the robustness layer (every rollout has a GRAIL proof, miners serving stale weights get caught) and proves a second thesis: verified rollouts measurably accelerate downstream training. Verification is the product.
- ▲Free-market slot settlement with advantage-based scoring over the full accepted set
- ▲Strategic targeting via /window/{n}/state histogram
- ▲Checkpoint rotation — GRAIL flags any miner still serving the previous weights
- ◇Benchmark training-efficiency delta — random prompts vs. Reliquary verified rollouts
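The checkpoint-rotation check above can be sketched client-side; a minimal sketch, assuming each GRAIL proof commits to the hash of the checkpoint it was generated against (the `Rollout` shape and `proof_checkpoint` field are hypothetical, not the actual GRAIL wire format):

```python
from dataclasses import dataclass

@dataclass
class Rollout:
    miner: str             # miner hotkey
    proof_checkpoint: str  # checkpoint hash the GRAIL proof binds to (assumed field)

def flag_stale_miners(rollouts, current_ckpt, previous_ckpt):
    """Split offending miners into 'stale' (still serving the rotated-out
    weights) and 'unknown' (proof bound to no recognised checkpoint)."""
    stale, unknown = set(), set()
    for r in rollouts:
        if r.proof_checkpoint == current_ckpt:
            continue  # proof binds to the live checkpoint: accepted
        (stale if r.proof_checkpoint == previous_ckpt else unknown).add(r.miner)
    return stale, unknown
```

Because the proof binds tokens to weights, a miner cannot claim the new checkpoint while sampling from the old one; the flag falls out of ordinary verification rather than a separate audit.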
v2 · open inference API · next · design in flight
Training-ready rollouts on demand
Open the subnet to external workloads. Researchers, RL labs and agent builders submit prompts and a reward function; Reliquary miners ship verified, training-optimised rollouts — selected by advantage, dedup'd, with a GRAIL proof on every token. The same lift v1 proves on the canonical checkpoint, delivered straight into your trainer. Pay-per-rollout, no infra to run.
- ?REST + Bittensor dendrite endpoints — submit a batch, stream verified completions
- ?Rollouts pre-optimised for training — advantage-weighted, dedup'd, ready to feed your trainer
- ?Sandboxed reward functions — deterministic exec, resource caps, no network
- ?Pricing — per-rollout TAO with credit packs for sustained workloads
- ?GRAIL proofs returned per token — re-verify offline before feeding your trainer
- ?Job isolation — external workloads never collide with v1 canonical rollouts
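A submitter's side of this flow can be sketched as two pure helpers: building the job payload and sanity-checking returned rollouts before they reach a trainer. This is a sketch under assumptions; the endpoint schema, field names, and the per-token `proof` key are hypothetical since the v2 API is still in design:

```python
import json

def build_job(prompts, reward_fn_src, rollouts_per_prompt=8):
    """Build the JSON body for a v2 job submission (hypothetical schema).

    reward_fn_src is shipped as source text because reward functions run
    in the sandbox server-side, not in the client.
    """
    return json.dumps({
        "prompts": prompts,
        "reward_fn": reward_fn_src,
        "rollouts_per_prompt": rollouts_per_prompt,
    })

def accept_rollout(rollout):
    """Client-side check: non-empty completion with a GRAIL proof on every
    token, so it can be re-verified offline before training."""
    toks = rollout.get("tokens", [])
    return bool(toks) and all("proof" in t for t in toks)
```

The offline re-verification step is the point of returning proofs per token: the submitter never has to trust the API relay, only the proof.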
v3 · on-policy training · future · open design
Rollouts in, smarter checkpoint out
Miners turn the verified rollouts they produce — both subnet-canonical traffic and v2 external jobs — into training signal. The reference checkpoint is updated on-chain on a fixed cadence, closing the GRPO loop and rewarding miners that generate learnable trajectories.
- ?Who runs the GRPO step — does every miner train locally, with validators picking a consensus update?
- ?Cadence — one training step per window or k-window batches?
- ?Model announcement — on-chain commit hash or signed endpoint?
- ?Use of v2 external rollouts — opt-in as training signal by the submitter
- ?Economics — training emission slice on top of rollout rewards
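Whoever ends up running the GRPO step, the core of it is group-relative advantage weighting over a set of rollouts for the same prompt. A minimal sketch of that computation (standard GRPO normalisation, not Reliquary-specific code):

```python
def group_advantages(rewards):
    """GRPO-style advantages: normalise each rollout's reward against its
    group's mean and standard deviation, so only relative quality among
    rollouts for the same prompt carries training signal."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    if std == 0:
        std = 1.0  # degenerate group: all rewards equal, zero signal
    return [(r - mean) / std for r in rewards]
```

Advantages sum to zero within each group, which is why a group where every rollout scores the same contributes nothing — a concrete sense in which only "learnable trajectories" earn training reward.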
every token carries its proof · every checkpoint is earned · every rollout is a receipt.