
Submit your workload type and model size. Jungle Grid classifies it, scores every eligible node, and dispatches to the right GPU automatically.
Right now, you pick GPUs by hand, guess at VRAM, and discover mismatches at runtime. That is the job Jungle Grid replaces.
Every job moves through four stages. You control the first. The system handles the rest.
Submit workload_type and model_size. No GPU names. The API accepts inference, training, fine-tuning, or batch — and resolves the rest.
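A submission might look like the sketch below. Only workload_type and model_size come from the docs; the payload shape around them is an assumption for illustration.

```python
import json

# Hypothetical job payload -- only workload_type and model_size are
# documented fields; everything else here is illustrative.
job = {
    "workload_type": "inference",  # inference | training | fine-tuning | batch
    "model_size": "5GB",           # the scheduler, not the user, picks the GPU
}

payload = json.dumps(job)
print(payload)
```

Note what is absent: no GPU name, no tier, no node ID. Intent in, placement out.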
Classifier maps workload intent to one of 7 GPU tiers. Matcher resolves each tier to eligible hardware. A 5 GB inference job hits T4/RTX3090/RTX4090. A training job hits A100/H100. No config.
Consumer GPUs and data center cards in the same cluster. Tiers define eligibility. The scheduler routes a fine-tuning job to an L4, A10G, A100, or H100 — whichever scores best.
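The matching step can be sketched as a plain eligibility map. The tier names are hypothetical; the GPU groupings follow the examples above.

```python
# Sketch of a tier -> eligible-hardware map. Tier names are made up;
# the groupings mirror the routing examples in the text.
TIER_HARDWARE = {
    "small-inference": ["T4", "RTX3090", "RTX4090"],
    "training": ["A100", "H100"],
    "fine-tuning": ["L4", "A10G", "A100", "H100"],
}

def eligible_nodes(tier: str) -> list[str]:
    """Resolve a tier to the hardware allowed to run it."""
    return TIER_HARDWARE.get(tier, [])

# Consumer cards and data center cards sit in the same map;
# eligibility, not hardware class, decides who can run the job.
print(eligible_nodes("fine-tuning"))
```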
Every eligible node is scored on price, reliability, latency, and performance with configurable weights. Ties are broken deterministically. Best score wins. VRAM-fit and queue-depth signals ship next.
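A minimal scoring sketch, assuming the four signals named above; the weight values, signal scales, and name-based tie-break are illustrative, not the shipped formula.

```python
from typing import NamedTuple

class Node(NamedTuple):
    name: str
    price: float        # $/hr, lower is better
    reliability: float  # 0..1, higher is better
    latency: float      # ms, lower is better
    performance: float  # 0..1, higher is better

# Configurable weights; these defaults are illustrative.
WEIGHTS = {"price": 0.3, "reliability": 0.3, "latency": 0.2, "performance": 0.2}

def score(node: Node, w=WEIGHTS) -> float:
    # Invert cost-like signals so a higher score is always better.
    return (w["price"] * (1.0 / node.price)
            + w["reliability"] * node.reliability
            + w["latency"] * (1.0 / node.latency)
            + w["performance"] * node.performance)

def pick(nodes: list[Node]) -> Node:
    # Sort by (-score, name): best score wins, and equal scores
    # fall back to name order, so the choice is deterministic.
    return sorted(nodes, key=lambda n: (-score(n), n.name))[0]
```

Sorting on a `(-score, name)` key is one simple way to make tie-breaking deterministic: two nodes with identical signals always resolve the same way.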
Classifier, matcher, scorer, dispatch. Every stage logged. Nothing hidden.
Users never see GPU names. The matcher resolves tiers to hardware behind the scenes. Here is what the routing map looks like.
Unknown workload types route to the full 7-GPU pool. The scorer decides placement — no job is ever dropped.
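The fallback rule above can be sketched in a few lines. The route table entries are assumptions drawn from the examples earlier; the invariant is the point: an unrecognized workload type gets the full pool, never an empty list.

```python
# The full 7-GPU pool named in the text.
FULL_POOL = ["T4", "RTX3090", "RTX4090", "L4", "A10G", "A100", "H100"]

# Known-intent routes (illustrative); anything else falls through.
ROUTES = {
    "inference": ["T4", "RTX3090", "RTX4090"],
    "training": ["A100", "H100"],
}

def route(workload_type: str) -> list[str]:
    """Never drop a job: unknown intents are eligible for every GPU,
    and the scorer decides final placement."""
    return ROUTES.get(workload_type, FULL_POOL)
```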
From job submission to GPU execution — every stage is observable, scorable, and fault-tolerant.
Real milestones from the engineering roadmap. No vapor. Every item maps to code.
Classifier + Matcher shipped. 6 tiers, 40+ tests passing. Users submit intent, not hardware.
VRAM-fit pre-filter ensures no model is dispatched to a GPU with insufficient memory.
Wire QueueDepth into scheduler scoring so jobs avoid overloaded nodes automatically.
GPUProvider interface so adding new GPU hardware is a single registration, not a code change.
optimize_for: "cost" | "speed" | "balanced" — per-job weight profiles for the scorer.
Node agents report thermal state. Scheduler excludes or deprioritizes throttled hardware.
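The optimize_for item in the roadmap might resolve to scorer weight profiles along these lines; the specific weight values and the fallback-to-balanced behavior are assumptions, since the feature has not shipped.

```python
# Hypothetical per-job weight profiles for the roadmap's optimize_for
# field. The numbers are illustrative, not the shipped defaults.
PROFILES = {
    "cost":     {"price": 0.6, "reliability": 0.1, "latency": 0.1, "performance": 0.2},
    "speed":    {"price": 0.1, "reliability": 0.1, "latency": 0.4, "performance": 0.4},
    "balanced": {"price": 0.25, "reliability": 0.25, "latency": 0.25, "performance": 0.25},
}

def weights_for(optimize_for: str = "balanced") -> dict[str, float]:
    """Resolve a job's optimize_for to scorer weights, defaulting
    to the balanced profile for unrecognized values."""
    return PROFILES.get(optimize_for, PROFILES["balanced"])
```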
Jungle Grid is in private beta and under active development. Follow along as we ship the routing engine, or request early access.