v1 · verifiable inference · current · stabilising
Every rollout proves the right checkpoint
Training happens — the reference checkpoint rotates on schedule. v1 carries the robustness layer (every rollout has a GRAIL proof, miners serving stale weights get caught) and proves a second thesis: verified rollouts measurably accelerate downstream training. Verification is the product.
- ▲Free-market slot settlement with advantage-based scoring over the full accepted set
- ▲Strategic targeting via /window/{n}/state histogram
- ▲Checkpoint rotation — GRAIL flags any miner still serving the previous weights
- ◇Benchmark training-efficiency delta — random prompts vs. Reliquary verified rollouts
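The checkpoint-rotation check above can be sketched client-side; a minimal sketch, assuming each GRAIL proof commits to the hash of the checkpoint it was generated against (the `Rollout` shape and `proof_checkpoint` field are hypothetical, not the actual GRAIL wire format):

```python
from dataclasses import dataclass

@dataclass
class Rollout:
    miner: str             # miner hotkey
    proof_checkpoint: str  # checkpoint hash the GRAIL proof binds to (assumed field)

def flag_stale_miners(rollouts, current_ckpt, previous_ckpt):
    """Split offending miners into 'stale' (still serving the rotated-out
    weights) and 'unknown' (proof bound to no recognised checkpoint)."""
    stale, unknown = set(), set()
    for r in rollouts:
        if r.proof_checkpoint == current_ckpt:
            continue  # proof binds to the live checkpoint: accepted
        (stale if r.proof_checkpoint == previous_ckpt else unknown).add(r.miner)
    return stale, unknown
```

Because the proof binds tokens to weights, a miner cannot claim the new checkpoint while sampling from the old one; the flag falls out of ordinary verification rather than a separate audit.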
v2 · open inference API · next · design in flight
Training-ready rollouts on demand
Open the subnet to external workloads. Researchers, RL labs and agent builders submit prompts and a reward function; Reliquary miners ship verified, training-optimised rollouts — selected by advantage, dedup'd, with a GRAIL proof on every token. The same lift v1 proves on the canonical checkpoint, delivered straight into your trainer. Pay-per-rollout, no infra to run.
- ?REST + Bittensor dendrite endpoints — submit a batch, stream verified completions
- ?Rollouts pre-optimised for training — advantage-weighted, dedup'd, ready to feed your trainer
- ?Sandboxed reward functions — deterministic exec, resource caps, no network
- ?Pricing — per-rollout TAO with credit packs for sustained workloads
- ?GRAIL proofs returned per token — re-verify offline before feeding your trainer
- ?Job isolation — external workloads never collide with v1 canonical rollouts
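A submitter's side of this flow can be sketched as two pure helpers: building the job payload and sanity-checking returned rollouts before they reach a trainer. This is a sketch under assumptions; the endpoint schema, field names, and the per-token `proof` key are hypothetical since the v2 API is still in design:

```python
import json

def build_job(prompts, reward_fn_src, rollouts_per_prompt=8):
    """Build the JSON body for a v2 job submission (hypothetical schema).

    reward_fn_src is shipped as source text because reward functions run
    in the sandbox server-side, not in the client.
    """
    return json.dumps({
        "prompts": prompts,
        "reward_fn": reward_fn_src,
        "rollouts_per_prompt": rollouts_per_prompt,
    })

def accept_rollout(rollout):
    """Client-side check: non-empty completion with a GRAIL proof on every
    token, so it can be re-verified offline before training."""
    toks = rollout.get("tokens", [])
    return bool(toks) and all("proof" in t for t in toks)
```

The offline re-verification step is the point of returning proofs per token: the submitter never has to trust the API relay, only the proof.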
v3 · on-policy training · future · open design
Rollouts in, smarter checkpoint out
Miners turn the verified rollouts they produce — both subnet-canonical traffic and v2 external jobs — into training signal. The reference checkpoint is updated on-chain on a fixed cadence, closing the GRPO loop and rewarding miners that generate learnable trajectories.
- ?Who runs the GRPO step — does every miner train locally, with validators picking a consensus update?
- ?Cadence — one training step per window or k-window batches?
- ?Model announcement — on-chain commit hash or signed endpoint?
- ?Use of v2 external rollouts — opt-in as training signal by the submitter
- ?Economics — training emission slice on top of rollout rewards
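Whoever ends up running the GRPO step, the core of it is group-relative advantage weighting over a set of rollouts for the same prompt. A minimal sketch of that computation (standard GRPO normalisation, not Reliquary-specific code):

```python
def group_advantages(rewards):
    """GRPO-style advantages: normalise each rollout's reward against its
    group's mean and standard deviation, so only relative quality among
    rollouts for the same prompt carries training signal."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    if std == 0:
        std = 1.0  # degenerate group: all rewards equal, zero signal
    return [(r - mean) / std for r in rewards]
```

Advantages sum to zero within each group, which is why a group where every rollout scores the same contributes nothing — a concrete sense in which only "learnable trajectories" earn training reward.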
every token carries its proof · every checkpoint is earned · every rollout is a receipt.