Tokenized GPU cloud
The cloud for AI infrastructure, made liquid.
Buy and trade compute tokens backed by AMD Instinct accelerators. Redeem for GPU sessions, API credits, or batch jobs — with the clarity of a modern AI platform.
AMD
Super Micro Computer
Micron Technology
Region
Lubbock, Texas
Low-latency GPU fabric
How it works
From tokens to production GPUs
Four steps — the same clarity you expect from a modern AI cloud.
Acquire Tokens
Purchase compute tokens backed by specific GPU hardware. Each token represents one GPU-hour of compute on that accelerator.
Buy Tokens →
Redeem or Trade
Redeem tokens for live GPU sessions, API credits, or batch jobs. Or trade unused tokens on the open marketplace at market price.
Start Trading →
Lock in Vaults
Lock tokens for 3, 6, or 12 months to earn yield, discounted compute rates, and priority queue access.
Explore Vaults →
Deploy at Scale
Lease enterprise clusters with dedicated hardware, InfiniBand networking, and white-glove support for production workloads.
View Clusters →
Infrastructure
We build AI-optimized, power-aware data centers
Lubbock Cloud couples West Texas grid dynamics with dense GPU footprints: liquid cooling paths, high-uptime networking, and interruptible workloads when markets demand it.
Tokens map to real racks — not synthetic credits. When you redeem, you land on hardware we operate, instrument, and support end to end.
Case study
Tested with GenAI workloads
Teams run inference, fine-tuning, and batch pipelines on the same tokenized footprint — with metrics that match what you see in the marketplace.
LLM inference where tokens meet telemetry
Goal: Give builders a single surface to buy capacity, deploy models, and trace cost per token in real time.
Approach: ROCm-native images, managed endpoints, and a job fabric that reports queue depth back to the dashboard you already use for balances.
- Inference
- ROCm
- Tokens
- Observability
Sub-100ms
target TTFT on tuned vLLM routes for 70B-class models on MI300X.
1 token
= 1 GPU-hour on the printed accelerator — no hidden conversion tables.
Full stack
API, Terraform-style provisioning hooks, and live job telemetry in one place.
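Because one token equals one GPU-hour on the printed accelerator, sizing a redemption is plain arithmetic. A minimal sketch of that math (the function name is illustrative, not a platform API):

```python
def tokens_required(gpus: int, hours: float) -> float:
    """One token redeems one GPU-hour, so a job's token cost
    is simply GPU count multiplied by wall-clock hours."""
    return gpus * hours

# An 8-GPU fine-tuning run for 6 hours consumes 48 tokens.
print(tokens_required(gpus=8, hours=6))  # 48
```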
We provide every essential resource for your AI journey
Hardware-backed tokens, a trading surface, and managed services — composed the way modern AI teams expect.
Latest AMD Instinct accelerators
MI210 through MI355X — matched 1:1 to liquid compute tokens.
Console
Cloud-native experience
LUB-MI300X 2.84 USD · 934 GPUs live
job submit --gpu MI325X --hours 8
Ready-to-run stack
Ship training and inference without assembling drivers by hand.
Fully managed services
Data and control planes you do not babysit.
- PostgreSQL & Redis
- Metrics & tracing
- Secrets & API keys
- Backup & snapshots
Architects & expert support
Multi-node training, InfiniBand, and ROCm — covered by engineers who run the metal.
Enterprise
Dedicated clusters for inference & training
Pre-built topologies with reserved capacity, InfiniBand, and white-glove onboarding.
Marketplace
Live token prices
Each token maps to one GPU-hour on dedicated hardware.
| Accelerator | Memory | Token price | Change |
|---|---|---|---|
| AMD MI300X | 192 GB HBM3 | $2.84 | +3.42% |
| AMD MI325X | 256 GB HBM3E | $3.56 | +1.87% |
| AMD MI355X | 288 GB HBM3E | $5.12 | +7.23% |
| AMD MI250X | 128 GB HBM2e | $2.18 | +1.12% |
| AMD MI210 | 64 GB HBM2e | $1.42 | -0.28% |
| AMD MI300A | 128 GB HBM3 | $2.95 | +2.04% |
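At the listed spot prices, a redemption's dollar cost is just tokens times the token's market price. A hedged sketch using those figures (prices float with the market, so treat the numbers as a snapshot, and the function as illustrative rather than a platform API):

```python
SPOT_USD = {  # spot prices as listed in the marketplace snapshot above
    "MI300X": 2.84,
    "MI325X": 3.56,
    "MI355X": 5.12,
    "MI250X": 2.18,
    "MI210": 1.42,
    "MI300A": 2.95,
}

def redemption_cost(accelerator: str, gpu_hours: float) -> float:
    """USD cost of redeeming gpu_hours tokens at the current spot price."""
    return round(SPOT_USD[accelerator] * gpu_hours, 2)

# Eight GPU-hours on MI300X at $2.84/token:
print(redemption_cost("MI300X", 8))  # 22.72
```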
Fleet
Accelerator availability
Live view of capacity across our Lubbock, Texas region.
| Accelerator | Architecture | HBM | FP16 TFLOPS | TDP | Availability |
|---|---|---|---|---|---|
| AMD MI300X | CDNA 3 | 192 GB HBM3 | 1,307 | 750W | 342/512 |
| AMD MI325X | CDNA 3 | 256 GB HBM3E | 1,307 | 750W | 128/256 |
| AMD MI355X | CDNA 4 | 288 GB HBM3E | 2,300 | 800W | 64/128 |
| AMD MI250X | CDNA 2 | 128 GB HBM2e | 383 | 500W | 198/320 |
| AMD MI210 | CDNA 2 | 64 GB HBM2e | 181 | 300W | 112/192 |
| AMD MI300A | CDNA 3 | 128 GB HBM3 | 1,307 | 760W | 88/160 |
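The Availability column reads available/total GPUs. A small sketch that turns one of those cells into a percentage free (treating the first figure as available capacity; the helper is illustrative, not a platform API):

```python
def availability(cell: str) -> float:
    """Parse an 'available/total' cell like '342/512' into percent free."""
    free, total = (int(x) for x in cell.split("/"))
    return round(100 * free / total, 1)

# MI300X row: 342 of 512 GPUs free.
print(availability("342/512"))  # 66.8
```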
Managed ROCm services
Purpose-built AI services running natively on AMD Instinct hardware. No ROCm complexity — just APIs.
LLM Inference Endpoint
Deploy any open-weight LLM with vLLM on MI300X. Sub-100ms TTFT, automatic batching, OpenAI-compatible API.
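An OpenAI-compatible endpoint accepts the standard chat-completions request shape, so any OpenAI-style client can point at it. A stdlib sketch of building (not sending) such a request; the base URL, API key, and model name are placeholders, not real platform values:

```python
import json
import urllib.request

BASE_URL = "https://example.invalid/v1"  # placeholder endpoint
API_KEY = "your-api-key"                 # placeholder credential

payload = {
    "model": "llama-3-70b-instruct",     # illustrative open-weight model
    "messages": [{"role": "user", "content": "Summarize ROCm in one line."}],
    "max_tokens": 64,
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; omitted here.
print(req.full_url)
```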
Embeddings API
High-throughput text embeddings with BGE, E5, or custom models. Batch processing up to 10K docs/min.
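The quoted 10K docs/min ceiling makes batch sizing easy to reason about. A worked sketch of the arithmetic (an upper bound, assuming the job sustains the quoted rate):

```python
DOCS_PER_MIN = 10_000  # quoted batch throughput ceiling

def batch_minutes(n_docs: int, rate: int = DOCS_PER_MIN) -> float:
    """Minutes to embed n_docs, assuming the quoted rate is sustained."""
    return n_docs / rate

# Embedding a million documents at 10K docs/min:
print(batch_minutes(1_000_000))  # 100.0
```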
Fine-Tuning Pipeline
Managed fine-tuning with LoRA, QLoRA, or full-parameter training. Weights stored in your vault.
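LoRA trains two small low-rank matrices whose scaled product is added to the frozen base weight, which is why adapter checkpoints stay tiny. A numpy sketch of the merge step, using the standard LoRA convention rather than any platform-specific code:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16             # hidden size, LoRA rank, scaling alpha

W = rng.standard_normal((d, d))    # frozen base weight
A = rng.standard_normal((r, d))    # trained down-projection
B = np.zeros((d, r))               # up-projection, zero-initialized

# Merged weight: W' = W + (alpha / r) * B @ A
W_merged = W + (alpha / r) * B @ A

# With B still at its zero init, merging is a no-op -- the standard
# LoRA property that training starts from the unmodified base model.
print(np.allclose(W_merged, W))  # True
```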
Vector Database
GPU-accelerated vector search powered by FAISS on AMD hardware. Millisecond queries at billion-scale.
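Under the hood, a vector query is a nearest-neighbor scan over stored embeddings; the service runs it GPU-accelerated via FAISS, but the semantics match this brute-force numpy sketch using the same L2 metric as a FAISS flat index:

```python
import numpy as np

rng = np.random.default_rng(1)
corpus = rng.standard_normal((1000, 64)).astype(np.float32)  # stored vectors
# Query: a lightly perturbed copy of document 42.
query = corpus[42] + 0.01 * rng.standard_normal(64).astype(np.float32)

# Exact L2 nearest-neighbor scan -- the same result an IndexFlatL2 returns.
dists = np.linalg.norm(corpus - query, axis=1)
top3 = np.argsort(dists)[:3]
print(top3[0])  # 42
```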
Image Generation
SDXL and Flux inference on ROCm. API and batch modes. Custom model uploads supported.
Video Generation
Coming soon: text-to-video and image-to-video pipelines on MI355X. Up to 4K resolution, 30 fps output.
Get started
Own, trade, and deploy GPU infrastructure
Stop renting by the hour. Build on tokenized compute with a product experience designed for AI teams.