⬡ Prospector Labs luckrig_
on-prem LLM inference api · sharing platform

Try the LLM rig running on someone’s machine — right now.

A contribution-based, real-time map of local LLM inference. If Arena maps models, luckrig maps the rigs.

Concept + working POC · Apache-2.0 · no public nodes yet
access · earned by contribution, not money · OpenAI-compatible
tracker · /api/nodes
3 listed · 1 available
first-5090-qwen3 · RTX 5090 · Q4_K_XL · ctx 65536 · scarcity 0.62
weekend-m3max · Apple M3 Max · Q5_K_M · ctx 32768 · — · showcase · scarcity 0.88
shed-pi5 · Raspberry Pi 5 · Q4_K_M · ctx 8192 · 2.3 tok/s · scarcity 1.00
267.4 tok/s p50 · TTFT 842 ms · n=14 · client-measured
fingerprint sha256:aZuK…aWc
01 the gap

HF Spaces and Arena let you try models.
Neither lets you try a specific rig.

Exact GPU, exact quantization, exact context length, with someone’s actual tuning notes — and your own prompt, right now. That’s the gap. luckrig is what fills it.

| Service | What you taste | How you pay | What you see about the hardware |
| --- | --- | --- | --- |
| HF Spaces (hosted demos) | An author’s wrapper around a model | Free / quota | Whatever the author chose to print |
| LMSys Arena (model evaluation) | A blind A/B between two models | Free | Model name. Nothing about the rig. |
| Vast.ai (GPU rental marketplace) | A bare GPU you rent and provision yourself | Money, per hour | Hardware specs, but no one’s tuning |
| AI Horde (distributed mutual aid) | A response from any worker that fits | Kudos | Hardware is abstracted away |
| luckrig (infra-first) | A specific rig someone is running, with their prompt-tuning | Contribution, not money | GPU · quant · ctx · LoRA · tuning notes · fingerprint |

“If Arena is a map of model evaluation, luckrig is a real-time map of infrastructure evaluation.”

02 how it works

Three moves. No sign-up.

A tracker keeps a public list of nodes. Each node runs a thin proxy in front of ollama or llama.cpp. You earn access by contributing — a node, notes, or upstream patches.

STEP · 01

Browse rigs that are online

A real, public list. Liveness, rarity score, exact spec, fingerprint. Sorted by what’s rare, not what’s loudest.

GET /api/nodes → 200 OK
{ 3 listed, 1 available }
order: rarity-first (not leaderboard)
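A rarity-first ordering can be sketched in a few lines. This is an illustration, not the tracker's code: the field names (`name`, `status`, `scarcity`) are assumptions modeled on the listing on this page, the sketch assumes a higher scarcity value means a rarer rig, and the real `/api/nodes` response and ordering rules may differ.

```javascript
// Hypothetical node records; field names are assumed, values come from the seed listing.
const nodes = [
  { name: "first-5090-qwen3", status: "available", scarcity: 0.62 },
  { name: "weekend-m3max", status: "showcase", scarcity: 0.88 },
  { name: "shed-pi5", status: "showcase", scarcity: 1.0 },
];

// Sort a copy so the highest scarcity comes first; ties fall back to the name.
function orderByRarity(list) {
  return [...list].sort(
    (a, b) => b.scarcity - a.scarcity || a.name.localeCompare(b.name)
  );
}

// Count listed and available nodes, like the "3 listed · 1 available" header.
function summarize(list) {
  const available = list.filter((n) => n.status === "available").length;
  return { listed: list.length, available };
}

const ordered = orderByRarity(nodes);
console.log(ordered.map((n) => n.name).join(" > "));
console.log(summarize(nodes));
```

Sorting a copy keeps the original response untouched, so the same data can still be re-sorted by another key for comparison.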
STEP · 02

Taste one with your own prompt

Queue → response → local replay file you keep on disk. Real SSE in plain mode; OpenAI-compatible — your vanilla client works.

$ curl rig.local/v1/chat/completions \
    --header "Authorization: Bearer $TOK" \
    --data @my-prompt.json
data: { "choices": [...] }
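For readers without an OpenAI client at hand, here is a rough sketch of what reading that stream involves. It assumes the OpenAI-style streaming shape (`data: {json}` lines carrying `choices[0].delta.content`, ending with `data: [DONE]`); the function name `extractDeltas` is made up for this example, and a real client would also have to buffer events that are split across network chunks.

```javascript
// Pull the text deltas out of a complete SSE body.
// Assumes OpenAI-style streaming events; anything else is skipped.
function extractDeltas(sseText) {
  const deltas = [];
  for (const line of sseText.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // blank lines, comments, other fields
    const payload = line.slice(5).trim();
    if (payload === "" || payload === "[DONE]") continue; // end-of-stream marker
    const event = JSON.parse(payload);
    const content = event.choices?.[0]?.delta?.content;
    if (typeof content === "string") deltas.push(content);
  }
  return deltas;
}

// A hand-written sample stream, not captured from a real rig.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

console.log(extractDeltas(sample).join(""));
```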
STEP · 03

Earn access by contributing

Register a node, write a tuning note, upload a timing measurement. Access scales with contribution score — a homage to Hotline Connect.

contribution: 5.40
  exist 0.3 · rare 5.0 · use 0 · discover 0 · note 0.1
token issued (short-lived)
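The components in that readout add up to the total shown, so one plausible reading is a plain sum. The sketch below assumes exactly that; the actual weighting, caps, and tier thresholds are not stated on this page, and `contributionScore` is a name invented for the example.

```javascript
// Component values copied from the example readout; the plain-sum rule is an assumption.
const components = { exist: 0.3, rare: 5.0, use: 0, discover: 0, note: 0.1 };

function contributionScore(parts) {
  return Object.values(parts).reduce((sum, value) => sum + value, 0);
}

// Two decimals, as in the readout; this also hides floating-point rounding noise.
console.log(`contribution: ${contributionScore(components).toFixed(2)}`);
```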
stack · Tracker · Node Proxy · ollama / llama.cpp · OpenAI-compatible · contribution tiers
03 infra map

The hardware is the star.

Other contribution networks abstract the worker away. luckrig does the opposite — environment metadata, tuning notes, a specific rig name, and rarity-based ordering. Showcase, not leaderboard.

● available · scarcity 0.62

first-5090-qwen3

model: Qwen3-35B-A3B
quant: Q4_K_XL
GPU: RTX 5090
ctx: 65536
VRAM: 32 GB
LoRA:

MTP n_max=2 is fastest. For long-context RAG, prefer SPEC=none. ctx=131072 is the VRAM ceiling here.

tags: initial-node · high-end · llama.cpp
community: 267.4 tok/s p50 · TTFT 842 ms
node fingerprint: sha256:aZuK8JLDYROmCqe8y7_gianXYuAgdIUabuGpFa72aWc

● showcase · scarcity 0.88

weekend-m3max

model: Qwen2.5-14B-Instruct
quant: Q5_K_M
GPU: Apple M3 Max
ctx: 32768
VRAM: 48 GB unified
LoRA:

Reference frame for unified-memory builds. Evaluated with the memory config, not just the GPU label.

tags: unified-memory · apple-silicon · showcase
community: opt-in timing
node fingerprint: sha256:f9KrL2nT8mYqVH3jPxsZb6Cek1WoARyEvDgUcMt0pN4

● showcase · scarcity 1.00

shed-pi5

model: llama3.2-1B
quant: Q4_K_M
GPU: Raspberry Pi 5
ctx: 8192
VRAM: shared / CPU
LoRA:

Lowest-spec slot. The value here is not speed. It’s “it runs on this”. 2.3 tok/s p50.

tags: lowest-spec · low-power · cpu
community: 2.3 tok/s p50 · TTFT 1480 ms
node fingerprint: sha256:Q3pXmK8wT2YjVdLn7BeFgRz4hCsAU5oEMxqIvNyOcbW

differentiation

AI Horde abstracts the worker away. luckrig makes the hardware the star.

Rarity-first

Rare configurations surface before popular ones. The default order is a showcase of breadth, not a leaderboard of speed.

Client-measured tok/s

All performance numbers come from client-side measurement. A node never self-reports its throughput.
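As a rough illustration of client-side measurement: TTFT and tok/s can be derived from timestamps the client records itself, and p50 is the median over runs. The names and the exact definitions below (for example, excluding the first token from the throughput window) are assumptions, not the POC's documented method.

```javascript
// Derive TTFT and decode throughput from client-recorded timestamps (milliseconds).
// In a browser or Node.js these would typically come from performance.now().
function measure({ requestSentAt, firstTokenAt, lastTokenAt, tokenCount }) {
  const ttftMs = firstTokenAt - requestSentAt;
  const decodeSeconds = (lastTokenAt - firstTokenAt) / 1000;
  // Tokens after the first one, divided by the time spent generating them.
  const tokPerSec = decodeSeconds > 0 ? (tokenCount - 1) / decodeSeconds : 0;
  return { ttftMs, tokPerSec };
}

// p50 is the median: the middle sample, or the mean of the two middle samples.
function p50(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Made-up timings for one run, plus a made-up set of three runs.
const run = measure({ requestSentAt: 0, firstTokenAt: 800, lastTokenAt: 2800, tokenCount: 501 });
console.log(run);
console.log(p50([240, 250, 270]));
```

Reporting the median rather than the mean keeps one unusually slow or fast run from dominating the published figure.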

Reproducible at home

Same GPU + quant? The config is yours to copy. Taste, then build.
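Finding a rig that matches your own hardware is a simple filter over the public list. The sketch below uses hypothetical records that mirror the three seed rigs; the field names and the `matchRigs` helper are assumptions, not the tracker's schema or API.

```javascript
// Hypothetical records mirroring the three seed rigs shown above.
const rigs = [
  { name: "first-5090-qwen3", gpu: "RTX 5090", quant: "Q4_K_XL", ctx: 65536 },
  { name: "weekend-m3max", gpu: "Apple M3 Max", quant: "Q5_K_M", ctx: 32768 },
  { name: "shed-pi5", gpu: "Raspberry Pi 5", quant: "Q4_K_M", ctx: 8192 },
];

// Keep rigs whose fields equal every key in `wanted` (strings compare case-insensitively).
function matchRigs(list, wanted) {
  const norm = (v) => (typeof v === "string" ? v.toLowerCase() : v);
  return list.filter((rig) =>
    Object.entries(wanted).every(([key, value]) => norm(rig[key]) === norm(value))
  );
}

console.log(matchRigs(rigs, { gpu: "rtx 5090", quant: "Q4_K_XL" }).map((r) => r.name));
```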

04 prototype

A dependency-free Node.js POC
you can run end-to-end today.

Tracker, tokens, node proxy, subtext (optional), pseudo SSE, replay save — all wired together using only the Node.js standard library. No npm tree.

[Screenshot · 127.0.0.1:8787/ · luckrig tracker prototype (npm start): public list UI loaded with three seed nodes, ordered by rarity]
~/luckrig — bash
# clone & run, no external deps
$ git clone https://github.com/prospectorlabs/luckrig
$ cd luckrig
$ npm start
http://127.0.0.1:8787
# run the suites
$ npm run check
$ npm run test:smoke
$ npm run test:e2e   ✓
$ npm test           ✓ all passing

What’s wired up in POC v1

  • public list UI
  • liveness monitoring
  • SQLite + JSONL persistence
  • durable token quota
  • node proxy
  • prompt filter
  • public-key subtext
  • browser tasting POC
  • queue visualization
  • env filter / compare
  • pseudo SSE
  • replay save
  • IP token rate-limit
  • abuse reporting
05 honesty

What we won’t pretend.

This is the part most landing pages skip. We’d rather lose the reader who was here for marketing copy than mislead the reader who wasn’t.

No SLA. No warranty. If you’re lucky, you land on a fast node.
trust model

Plain TLS + a node-operator promise — that’s the baseline.

Plain mode is real SSE over TLS, plus the node proxy’s “don’t write plaintext to logs” convention. Nothing more, nothing dressed up as more.

privacy, said honestly

Processed so it won’t show up in ordinary logs.

A node operator who instruments the process internals can still see plaintext. Don’t send secrets, personal data, or anything you’re obligated to protect.

moderation

Three stages. Fails closed where it has to.

1) Local regex in the node proxy. 2) External moderation hook (LUCKRIG_MODERATION_ENDPOINT), fails closed on input. 3) Tracker-side Notice-and-Takedown — no auto-bans.
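The first two stages, and what "fails closed" means for them, can be sketched as follows. This is an assumption-heavy illustration: the blocklist pattern is a placeholder, `externalCheck` stands in for whatever sits behind LUCKRIG_MODERATION_ENDPOINT, and the sketch is synchronous for brevity where a real hook would be an asynchronous HTTP call. Stage 3 is a human process on the tracker side and is not modeled.

```javascript
// Stage 1: local patterns checked inside the node proxy (placeholder pattern only).
const LOCAL_BLOCKLIST = [/\bexample-blocked-term\b/i];

// Stage 2 is injected as `externalCheck`: it returns { allowed: boolean } or throws.
function moderateInput(prompt, externalCheck) {
  if (LOCAL_BLOCKLIST.some((re) => re.test(prompt))) {
    return { allowed: false, stage: "local-regex" };
  }
  try {
    const verdict = externalCheck(prompt);
    // Only an explicit allow passes; a missing or malformed verdict is a rejection.
    const allowed = Boolean(verdict) && verdict.allowed === true;
    return { allowed, stage: "external" };
  } catch (err) {
    // Fail closed: if the hook errors or is unreachable, the input is rejected.
    return { allowed: false, stage: "external-error" };
  }
}

const ok = moderateInput("hello", () => ({ allowed: true }));
const down = moderateInput("hello", () => {
  throw new Error("hook timeout");
});
console.log(ok.allowed, down.allowed);
```

The important property is the `catch` branch: an outage in the moderation hook blocks input instead of silently letting it through.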

zero tolerance

Illegal input/output is prohibited. Reporting path is built in.

CSAM, terror / mass-violence support, and other illegal content are out. POST /api/abuse/report is rate-limited; operator review precedes any ban.
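One simple way to rate-limit a reporting endpoint is a fixed window per client key. The sketch below is a generic technique, not the POC's limiter: the limit, the window length, and the `createRateLimiter` name are all made up for illustration, and the clock is injectable so the demo is deterministic.

```javascript
// Fixed-window limiter: at most `limit` calls per `windowMs` for each key (e.g. a client IP).
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const windows = new Map(); // key -> { start, count }
  return function allow(key) {
    const t = now();
    const current = windows.get(key);
    if (!current || t - current.start >= windowMs) {
      windows.set(key, { start: t, count: 1 }); // open a fresh window
      return true;
    }
    if (current.count < limit) {
      current.count += 1;
      return true;
    }
    return false; // over the limit inside the current window
  };
}

// Demo with a fake clock and a documentation-range IP address.
let clock = 0;
const allow = createRateLimiter({ limit: 2, windowMs: 60_000, now: () => clock });
const results = [allow("203.0.113.7"), allow("203.0.113.7"), allow("203.0.113.7")];
clock = 60_000; // a full window later
results.push(allow("203.0.113.7"));
console.log(results);
```

A long-running service would also need to prune old keys from the map, which this sketch leaves out.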

end of the page

Explore the code on GitHub.

luckrig is a concept and a working POC, in the open. Read the spec, run the POC locally, decide for yourself.