// QML v4.2 · Production ready

Computing, elevated.

The quantum-classical ML platform trusted by research labs and frontier AI teams. Hybrid algorithms, distributed training, and a developer surface that doesn't feel like 1998.

ENV production · UPTIME 99.997% · LATENCY 12ms p99
qml/inference.py
# Quantum-classical inference
from qml import Pipeline, Hybrid

pipe = Pipeline(
    model="qlab/transformer-v4",
    qpu="ibm-127q",
    gpu="h100-x8",
)

result = pipe.run(prompt)
# → 12.4ms · $0.0011
RUNNING · 1,240 active jobs · 18 QPU nodes online
QPU_NODES_ONLINE 18
GPU_HOURS_DAILY 94,200
P99_LATENCY 12ms
REQUESTS_MONTH 1.8B
STACK / 01

Six primitives. One runtime.

Every quantum-classical building block you need to train, serve and observe models — in a single, type-safe SDK.

01 / QPU_FABRIC

Quantum Fabric

Unified access to superconducting, trapped-ion and photonic processors. Automatic backend selection per job.

IBM · IonQ · Xanadu
02 / GPU_MESH

GPU Mesh

Distributed training across H100 clusters. Gradient checkpointing and offload handled by the runtime.

H100 · TPU-v5 · MI300
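At the heart of distributed training is a collective reduction: each worker computes gradients on its shard, and the mesh averages them before the optimiser step. The pure-Python stand-in below shows only that averaging step — the real runtime's collective ops (and checkpointing/offload) are not shown here.

```python
# Illustrative data-parallel all-reduce: average gradients
# element-wise across workers before the optimiser step.
def allreduce_mean(worker_grads):
    """Average per-worker gradient vectors element-wise."""
    n = len(worker_grads)
    return [sum(g) / n for g in zip(*worker_grads)]

grads = [
    [0.25, -0.5, 1.0],   # worker 0's gradients
    [0.75, -0.5, 0.0],   # worker 1's gradients
]
# allreduce_mean(grads) → [0.5, -0.5, 0.5]
```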
03 / HYBRID

Hybrid Runner

Schedule quantum + classical workloads in the same job graph. Failover, retry and cost-budgeting built in.

DAG · Retry · Budget
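"Failover, retry and cost-budgeting built in" amounts to retry semantics bounded by a spend cap. The sketch below shows one plausible shape for that contract — the function name and signature are hypothetical, not the Hybrid Runner's actual API.

```python
# Hypothetical retry-with-budget: re-run a flaky task until it
# succeeds, retries run out, or the cost cap would be exceeded.
def run_with_budget(task, cost_per_attempt, budget, max_retries=3):
    """Run a zero-arg callable under a cost cap.
    Returns (result, spent); raises RuntimeError otherwise."""
    spent = 0.0
    for _ in range(max_retries + 1):
        if spent + cost_per_attempt > budget:
            raise RuntimeError("budget exhausted")
        spent += cost_per_attempt
        try:
            return task(), spent
        except Exception:
            continue  # transient failure: retry
    raise RuntimeError("max retries exceeded")

# Demo: a task that fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient QPU error")
    return "ok"

result, spent = run_with_budget(flaky, cost_per_attempt=0.5, budget=2.0)
# → result == "ok", spent == 1.5 (three attempts at $0.50 each)
```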
04 / DATA

Tensor Store

Columnar storage optimised for ML — petabyte datasets streamed with sub-millisecond latency.

Parquet · Zarr · Arrow
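Streaming a petabyte dataset means never materialising it: the reader yields fixed-size batches as row groups arrive. The generator below is a toy stand-in for that access pattern, not the Tensor Store client itself.

```python
# Illustrative chunked streaming: yield fixed-size batches from a
# large source without holding the whole dataset in memory.
def stream_batches(source, batch_size):
    """Yield lists of up to `batch_size` rows from an iterable source."""
    batch = []
    for row in source:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# list(stream_batches(range(10), 4)) → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```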
05 / OBS

Observability

End-to-end traces for hybrid jobs — from classical preprocessing through the quantum circuit and back.

OTLP · Grafana
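An end-to-end hybrid trace is just nested timed spans: preprocessing, the circuit, and postprocessing all close before the enclosing job span does. This toy recorder shows that shape; a real exporter would speak OTLP rather than append to a list.

```python
import contextlib
import time

SPANS = []  # (name, duration_seconds), appended as each span closes

@contextlib.contextmanager
def span(name):
    """Record a timed span; nested spans close before their parent."""
    start = time.perf_counter()
    try:
        yield
    finally:
        SPANS.append((name, time.perf_counter() - start))

# One hybrid job: classical preprocess → quantum circuit → postprocess.
with span("job"):
    with span("preprocess"):
        pass
    with span("quantum_circuit"):
        pass
    with span("postprocess"):
        pass
```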
06 / GOV

Governance

Policy-as-code for data access, model deployment, and cost caps. SOC 2 and ISO 27001 certified.

SOC2 · ISO · HIPAA
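Policy-as-code means deploy requests are evaluated against declarative rules before anything runs. The evaluator below is a minimal sketch with an invented rule schema (cost cap plus an environment allow-list) — real policies would be versioned files, not an inline dict.

```python
# Hypothetical policy-as-code check: return violations for a
# deploy request; an empty list means the deploy is allowed.
POLICY = {
    "max_hourly_cost": 50.0,
    "allowed_envs": {"staging", "production"},
}

def check_deploy(request, policy=POLICY):
    """Evaluate a deploy request against the policy's rules."""
    violations = []
    if request["hourly_cost"] > policy["max_hourly_cost"]:
        violations.append("cost cap exceeded")
    if request["env"] not in policy["allowed_envs"]:
        violations.append("environment not allowed")
    return violations

# check_deploy({"hourly_cost": 20.0, "env": "production"}) → []
```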
PIPELINE / 02

A four-stage hybrid runtime.

Quantum-classical pipelines are easy to start, hard to operate. QML's runtime handles the hard parts so your researchers can focus on the model — not the scheduler.

01

Compile

Circuit transpilation targeted to the specific backend, with error-correction strategies baked in.

< 200ms
02

Schedule

Jobs routed to the cheapest QPU meeting your SLA. Queue-aware. Budget-enforcing.

Policy-as-code
03

Execute

Hybrid execution with lock-step classical preprocessing and postprocessing on adjacent GPUs.

Low-latency
04

Observe

Live traces with fidelity, duration and cost per shot. Exported to your stack via OTLP.

OpenTelemetry
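The Schedule stage above — "routed to the cheapest QPU meeting your SLA" — can be sketched as a filter-then-minimise over the fleet. QPU prices and wait estimates here are invented for illustration.

```python
# Hypothetical SLA-aware routing: keep QPUs whose estimated wait
# meets the job's SLA, then pick the cheapest per shot.
def route(qpus, sla_ms):
    """Return the name of the cheapest QPU meeting the latency SLA."""
    eligible = [q for q in qpus if q["est_wait_ms"] <= sla_ms]
    if not eligible:
        raise RuntimeError("no QPU meets the SLA")
    return min(eligible, key=lambda q: q["cost_per_shot"])["name"]

# Illustrative fleet state (numbers are placeholders).
qpus = [
    {"name": "ibm-127q",  "cost_per_shot": 0.0011, "est_wait_ms": 420},
    {"name": "ionq-36q",  "cost_per_shot": 0.0009, "est_wait_ms": 900},
    {"name": "xanadu-x8", "cost_per_shot": 0.0004, "est_wait_ms": 1500},
]

# route(qpus, sla_ms=500) → "ibm-127q" (only QPU inside the SLA)
```

Relaxing the SLA shifts the choice toward cheaper hardware, which is how a queue-aware, budget-enforcing scheduler trades latency for cost.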
CLI / 03

Twelve seconds from commit to production.

bash — qlab deploy · ssh qlab@prod
$ qlab deploy model.yaml
→ compiling quantum circuit · 180ms
→ allocating QPU (ibm-127q) · 420ms
→ warming GPU pool (h100 x8) · 2.1s
→ health-check · 600ms
→ traffic shifted 0% → 100% · 6.8s

deploy complete · 12.4s
endpoint: https://inference.qlab.dev/v1/run
"We used to wait forty minutes for a single hybrid job to schedule. On QML, it's twelve seconds — and the tracing tells us exactly where the time went."
Dr. Yuki Tanaka · Head of Quantum ML · Anthropic Research

Build on the next runtime.

Request production access, or jump straight into a free research tier with 8 QPU-hours per month.

Request Access ▸ Free Research Tier