// GPU Benchmarks

REAL-WORLD PERFORMANCE.

Benchmark results across SScoreCompute's full GPU lineup — tested on real AI workloads. Prices listed in CAD, with approximate USD equivalents.

| Specification    | H100         | H200         | B300         | GB300 NVL72          |
|------------------|--------------|--------------|--------------|----------------------|
| Architecture     | Hopper       | Hopper       | Blackwell    | Blackwell Ultra      |
| VRAM             | 80GB HBM3    | 141GB HBM3e  | 192GB HBM3e  | 13.8TB pooled        |
| Memory Bandwidth | 3.35 TB/s    | 4.8 TB/s     | 8.0 TB/s     | 130 TB/s NVLink      |
| FP8 Performance  | 3,958 TFLOPS | 3,958 TFLOPS | 9,000 TFLOPS | 648,000 TFLOPS       |
| NVLink           | NVLink 4.0   | NVLink 4.0   | NVLink 5.0   | NVLink 5.0 · 72-GPU  |
| Power (TDP)      | 700W         | 700W         | 1,000W       | 120kW rack           |
| Price (CAD/hr)   | $3.14        | $4.10        | $6.71        | $409                 |
| Price (USD/hr)   | ~$2.29       | ~$2.99       | ~$4.90      | ~$299                |
| Availability     | On-demand    | On-demand    | On-demand    | Reserve              |
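One way to read the table is price-performance: dividing each card's FP8 throughput by its hourly CAD price gives TFLOPS per dollar-hour. The sketch below uses only the figures from the table; the ratio itself is an illustrative calculation, not an official SScoreCompute metric.

```python
# FP8 TFLOPS per CAD dollar-hour, derived from the spec table above.
# Prices and FP8 figures are copied verbatim from the table.
gpus = {
    "H100":        {"fp8_tflops": 3_958,   "cad_per_hr": 3.14},
    "H200":        {"fp8_tflops": 3_958,   "cad_per_hr": 4.10},
    "B300":        {"fp8_tflops": 9_000,   "cad_per_hr": 6.71},
    "GB300 NVL72": {"fp8_tflops": 648_000, "cad_per_hr": 409.0},
}

for name, g in gpus.items():
    ratio = g["fp8_tflops"] / g["cad_per_hr"]
    print(f"{name:12s} {ratio:8.0f} FP8 TFLOPS per CAD$/hr")
```

Raw compute per dollar favors the Blackwell parts, but peak TFLOPS is only one axis — memory capacity and bandwidth often dominate for inference, which is why the workload benchmarks below matter more than this single ratio.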
// AI Workload Benchmarks

TESTED ON REAL WORKLOADS.

All benchmarks run on SScoreCompute infrastructure. Results may vary by model configuration and batch size.

LLM INFERENCE
Tokens/sec · LLaMA 3 70B · FP16 · Batch 8
GB300 NVL72: 12,400 tok/s
B300: 4,800 tok/s
H200: 3,200 tok/s
H100: 2,100 tok/s
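Throughput and hourly price together determine what a generated token actually costs. The sketch below combines the LLaMA 3 70B figures above with the hourly CAD prices from the spec table, assuming sustained throughput at full utilization — an idealized estimate, not a quoted rate.

```python
# Rough CAD cost per million generated tokens: hourly price divided by
# tokens generated per hour. Figures are taken from the tables on this page;
# assumes 100% utilization at the benchmarked throughput.
benchmarks = {          # (tokens/sec, CAD/hr)
    "GB300 NVL72": (12_400, 409.0),
    "B300":        (4_800,  6.71),
    "H200":        (3_200,  4.10),
    "H100":        (2_100,  3.14),
}

for name, (tok_per_s, cad_per_hr) in benchmarks.items():
    tokens_per_hr = tok_per_s * 3600
    cad_per_million = cad_per_hr / tokens_per_hr * 1_000_000
    print(f"{name:12s} ~${cad_per_million:.2f} CAD per 1M tokens")
```

At these figures the H100 comes out cheapest per token on this workload, consistent with its billing as the most cost-efficient inference GPU in the recommendations below; the NVL72's value lies in raw throughput and pooled memory, not per-token cost.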
FINE-TUNING SPEED
Samples/sec · LLaMA 3 8B · LoRA · FP16
GB300 NVL72: 9,800 samples/s
B300: 3,900 samples/s
H200: 2,600 samples/s
H100: 1,800 samples/s
EMBEDDING GENERATION
Vectors/sec · text-embedding-3 · Batch 512
GB300 NVL72: 840K vectors/s
B300: 520K vectors/s
H200: 360K vectors/s
H100: 240K vectors/s
IMAGE GENERATION
Images/min · Stable Diffusion XL · 1024px
GB300 NVL72: 1,200 img/min
B300: 480 img/min
H200: 310 img/min
H100: 210 img/min
// GPU Recommendations

PICK THE RIGHT GPU.

H100
BEST FOR
LLM inference, chatbots, embeddings, RAG pipelines, and API serving. The most cost-efficient GPU for production AI inference.
$3.14 CAD/hr
H200
BEST FOR
Large model inference (70B+), multi-modal AI, and workloads that need more memory headroom than the H100 provides.
$4.10 CAD/hr
B300
BEST FOR
Agentic AI, fine-tuning, high-throughput inference, and any workload that benefits from Blackwell's memory bandwidth — 8.0 TB/s, more than double the H100's 3.35 TB/s.
$6.71 CAD/hr
GB300 NVL72
BEST FOR
Frontier model training, massive parallel inference, and rack-scale AI factories. 72 GPUs sharing 13.8TB of pooled memory.
$409 CAD/hr
Deploy a GPU Now →