Performance Benchmarks
Compare throughput, latency, and time-to-first-token across different hardware configurations.
| Model | Hardware | Concurrency | Throughput (tok/s) | Latency (s) | TTFT (s) |
|---|---|---|---|---|---|
Performance metrics are averages over recent observations. Throughput is in tokens per second per query (higher is better). Latency and TTFT (time to first token) are in seconds (lower is better).
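
As a rough illustration of how the three metrics relate, the sketch below averages them over a handful of timed queries. The page does not say how its figures are computed, so everything here is an assumption: the `Observation` record, the `summarize` helper, and the choice to take per-query throughput as output tokens divided by total latency are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Observation:
    """One completed query (hypothetical record layout, times in seconds)."""
    start: float        # when the request was sent
    first_token: float  # when the first output token arrived
    end: float          # when the last output token arrived
    tokens: int         # number of output tokens generated


def summarize(observations):
    """Average per-query throughput, latency, and TTFT over the observations."""
    if not observations:
        raise ValueError("need at least one observation")
    latencies = [o.end - o.start for o in observations]
    ttfts = [o.first_token - o.start for o in observations]
    # Assumed definition: tokens per second for each query, then averaged.
    throughputs = [o.tokens / lat for o, lat in zip(observations, latencies)]
    return {
        "throughput_tok_s": mean(throughputs),
        "latency_s": mean(latencies),
        "ttft_s": mean(ttfts),
    }


# Made-up sample data, for illustration only.
sample = [
    Observation(start=0.0, first_token=0.5, end=2.0, tokens=100),
    Observation(start=0.0, first_token=0.25, end=4.0, tokens=100),
]
summary = summarize(sample)
print(summary)
```

Averaging per-query throughput is not the same as dividing total tokens by total time, so the two conventions can give different numbers for the same observations.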