// GPU Benchmarks

REAL-WORLD PERFORMANCE.

Benchmark results across SScoreCompute's full GPU lineup — tested on real AI workloads. Prices listed in CAD, with approximate USD equivalents.

| Specification    | H100         | H200         | B300         | GB300 NVL72          |
|------------------|--------------|--------------|--------------|----------------------|
| Architecture     | Hopper       | Hopper       | Blackwell    | Blackwell Ultra      |
| VRAM             | 80GB HBM3    | 141GB HBM3e  | 192GB HBM3e  | 13.8TB pooled        |
| Memory Bandwidth | 3.35 TB/s    | 4.8 TB/s     | 8.0 TB/s     | 130 TB/s NVLink      |
| FP8 Performance  | 3,958 TFLOPS | 3,958 TFLOPS | 9,000 TFLOPS | 648,000 TFLOPS       |
| NVLink           | NVLink 4.0   | NVLink 4.0   | NVLink 5.0   | NVLink 5.0 · 72-GPU  |
| Power (TDP)      | 700W         | 700W         | 1,000W       | 120kW rack           |
| Price (CAD/hr)   | $3.14        | $4.10        | $6.71        | $409                 |
| Price (USD/hr)   | ~$2.29       | ~$2.99       | ~$4.90      | ~$299                |
| Availability     | On-demand    | On-demand    | On-demand    | Reserve              |
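One way to read the table is price-performance: dividing each card's FP8 throughput by its hourly CAD price gives TFLOPS per dollar-hour. The sketch below uses only the figures from the table; the ratio itself is an illustrative calculation, not an official SScoreCompute metric.

```python
# FP8 TFLOPS per CAD dollar-hour, derived from the spec table above.
# Prices and FP8 figures are copied verbatim from the table.
gpus = {
    "H100":        {"fp8_tflops": 3_958,   "cad_per_hr": 3.14},
    "H200":        {"fp8_tflops": 3_958,   "cad_per_hr": 4.10},
    "B300":        {"fp8_tflops": 9_000,   "cad_per_hr": 6.71},
    "GB300 NVL72": {"fp8_tflops": 648_000, "cad_per_hr": 409.0},
}

for name, g in gpus.items():
    ratio = g["fp8_tflops"] / g["cad_per_hr"]
    print(f"{name:12s} {ratio:8.0f} FP8 TFLOPS per CAD$/hr")
```

Raw compute per dollar favors the Blackwell parts, but peak TFLOPS is only one axis — memory capacity and bandwidth often dominate for inference, which is why the workload benchmarks below matter more than this single ratio.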
// AI Workload Benchmarks

TESTED ON REAL WORKLOADS.

All benchmarks run on SScoreCompute infrastructure. Results may vary by model configuration and batch size.

LLM INFERENCE
Tokens/sec · LLaMA 3 70B · FP16 · Batch 8
GB300 NVL72: 12,400 tok/s
B300: 4,800 tok/s
H200: 3,200 tok/s
H100: 2,100 tok/s
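Throughput and hourly price together determine what a generated token actually costs. The sketch below combines the LLaMA 3 70B figures above with the hourly CAD prices from the spec table, assuming sustained throughput at full utilization — an idealized estimate, not a quoted rate.

```python
# Rough CAD cost per million generated tokens: hourly price divided by
# tokens generated per hour. Figures are taken from the tables on this page;
# assumes 100% utilization at the benchmarked throughput.
benchmarks = {          # (tokens/sec, CAD/hr)
    "GB300 NVL72": (12_400, 409.0),
    "B300":        (4_800,  6.71),
    "H200":        (3_200,  4.10),
    "H100":        (2_100,  3.14),
}

for name, (tok_per_s, cad_per_hr) in benchmarks.items():
    tokens_per_hr = tok_per_s * 3600
    cad_per_million = cad_per_hr / tokens_per_hr * 1_000_000
    print(f"{name:12s} ~${cad_per_million:.2f} CAD per 1M tokens")
```

At these figures the H100 comes out cheapest per token on this workload, consistent with its billing as the most cost-efficient inference GPU in the recommendations below; the NVL72's value lies in raw throughput and pooled memory, not per-token cost.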
FINE-TUNING SPEED
Samples/sec · LLaMA 3 8B · LoRA · FP16
GB300 NVL72: 9,800 samples/s
B300: 3,900 samples/s
H200: 2,600 samples/s
H100: 1,800 samples/s
EMBEDDING GENERATION
Vectors/sec · text-embedding-3 · Batch 512
GB300 NVL72: 840K vectors/s
B300: 520K vectors/s
H200: 360K vectors/s
H100: 240K vectors/s
IMAGE GENERATION
Images/min · Stable Diffusion XL · 1024px
GB300 NVL72: 1,200 img/min
B300: 480 img/min
H200: 310 img/min
H100: 210 img/min
// GPU Recommendations

PICK THE RIGHT GPU.

H100
BEST FOR
LLM inference, chatbots, embeddings, RAG pipelines, and API serving. The most cost-efficient GPU for production AI inference.
$3.14 CAD/hr
H200
BEST FOR
Large model inference (70B+), multi-modal AI, and workloads that need more memory headroom than the H100 provides.
$4.10 CAD/hr
B300
BEST FOR
Agentic AI, fine-tuning, high-throughput inference, and any workload that benefits from Blackwell's memory bandwidth — 8.0 TB/s, more than double the H100's 3.35 TB/s.
$6.71 CAD/hr
GB300 NVL72
BEST FOR
Frontier model training, massive parallel inference, and rack-scale AI factories. 72 GPUs sharing 13.8TB of pooled memory.
$409 CAD/hr
Deploy a GPU Now →