AVAILABLE NOW · CAD PRICING

NVIDIA
H100 SXM
GPU CLOUD

The world's most deployed AI GPU. 80GB HBM3, 989 TFLOPS FP16, 3.35 TB/s memory bandwidth — on AWS and Azure, billed per hour in Canadian dollars.

Starting From
$3.14
CAD / HR / GPU
≈ $2.29 USD at current rates
No contracts or minimums
Hourly billing in CAD
Deploy in under 60 seconds
99.9% uptime SLA
AWS & Azure infrastructure
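Under hourly billing, total cost is simply GPUs × hours × rate. A minimal sketch of the arithmetic, assuming the CAD-to-USD factor implied by the quoted $3.14 CAD / $2.29 USD pair (the job size below is illustrative):

```python
# Estimate a job's cost from the quoted hourly rate.
# The CAD->USD factor is an assumption inferred from the
# quoted conversion ($3.14 CAD ~ $2.29 USD).

CAD_PER_GPU_HOUR = 3.14
CAD_TO_USD = 2.29 / 3.14  # implied exchange rate (assumption)

def job_cost_cad(gpus: int, hours: float) -> float:
    """Hourly billing: GPUs x hours x CAD rate, no minimums."""
    return gpus * hours * CAD_PER_GPU_HOUR

cost = job_cost_cad(gpus=8, hours=6)  # illustrative: 8x H100 for 6 hours
print(f"${cost:.2f} CAD (~${cost * CAD_TO_USD:.2f} USD)")
# -> $150.72 CAD (~$109.92 USD)
```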
// Technical Specifications

H100 SXM
FULL SPECS

SPECIFICATION             H100 SXM
Architecture              NVIDIA Hopper
GPU Memory                80 GB HBM3
Memory Bandwidth          3.35 TB/s
FP16 / BF16 Performance   989 TFLOPS
FP8 Performance           1,979 TFLOPS
TF32 Performance          494 TFLOPS
FP64 Performance          33.5 TFLOPS
NVLink Bandwidth          900 GB/s
PCIe Bandwidth            128 GB/s
TDP (Power)               700 W
Transformer Engine        Yes (2nd Gen)
CUDA Cores                16,896
SScoreCompute Price       $3.14 CAD/hr
Cloud Platform            AWS + Azure
// Ideal Workloads

WHAT THE H100
IS BUILT FOR

🤖
LLM INFERENCE
Serve LLaMA, Mistral, GPT-class models at scale with vLLM. H100's Transformer Engine and HBM3 memory deliver industry-leading token throughput.
🎯
MODEL FINE-TUNING
Fine-tune 7B–70B models with LoRA or full-parameter training. 80GB VRAM fits most open-source models without quantization.
🔍
RAG PIPELINES
Power high-throughput retrieval-augmented generation. Embed, retrieve, and generate at production scale with sub-200ms latency.
🤝
MULTI-AGENT SYSTEMS
Run parallel agent inference loops — planning, tool use, and reasoning — across multiple H100s with NVLink for fast inter-GPU communication.
🖼️
IMAGE & VIDEO AI
Stable Diffusion, FLUX, video generation, and multimodal models run efficiently in 80GB of HBM3 with 3.35 TB/s of memory bandwidth.
📊
ML TRAINING
Train models up to ~13B parameters on a single H100. Scale to multi-GPU with NVLink. Ideal for research and production training runs.
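As a rough guide to which of the workloads above fit on a single card, a weights-only VRAM estimate (1B parameters ≈ 1 GB per byte of precision) gives a lower bound; activations, KV cache, and optimizer state add on top. A sketch with illustrative model sizes:

```python
# Weights-only VRAM rule of thumb for a single 80 GB H100.
# This is a lower bound: activations, KV cache, and optimizer
# state (for training) consume additional memory.

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1, "int4": 0.5}
H100_VRAM_GB = 80

def weight_footprint_gb(params_billion: float, dtype: str = "fp16") -> float:
    """1B params ~ 1 GB per byte of precision (approximate)."""
    return params_billion * BYTES_PER_PARAM[dtype]

for model, b in [("Mistral 7B", 7), ("LLaMA-3 70B", 70)]:
    for dtype in ("fp16", "fp8"):
        gb = weight_footprint_gb(b, dtype)
        fits = "fits" if gb < H100_VRAM_GB else "needs multi-GPU or quantization"
        print(f"{model} {dtype}: ~{gb:.0f} GB -> {fits}")
# e.g. LLaMA-3 70B fp16: ~140 GB -> needs multi-GPU or quantization
```

This is why 7B-class models train comfortably on one card while 70B-class full-precision runs lean on NVLink-connected multi-GPU setups or quantization.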
// Performance

H100 BENCHMARK
RESULTS

Real-world throughput on SScoreCompute H100 SXM instances. LLaMA-3 70B, batch size 8.

LLM Inference — Output Tokens / Second
LLaMA-3 70B (FP16)        1,680 tok/s
LLaMA-3 70B (FP8)         2,240 tok/s
Mistral 7B (FP16)         8,400 tok/s

Training Throughput
GPT-3-style 6.7B (BF16)   14,200 tok/s

Time to First Token (TTFT)   42 ms
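The throughput figures above translate directly into cost per token at the quoted rate; a quick sketch of the arithmetic, using only the page's own numbers:

```python
# Tokens per Canadian dollar at the quoted $3.14 CAD/hr rate:
# tok/s x 3600 s/hr / price. Throughput figures are the
# benchmark numbers quoted above.

CAD_PER_HOUR = 3.14

def tokens_per_cad(tok_per_s: float) -> float:
    return tok_per_s * 3600 / CAD_PER_HOUR

for name, tps in [("LLaMA-3 70B FP16", 1680),
                  ("LLaMA-3 70B FP8", 2240),
                  ("Mistral 7B FP16", 8400)]:
    print(f"{name}: ~{tokens_per_cad(tps) / 1e6:.2f}M output tokens per CAD $1")
# e.g. LLaMA-3 70B FP8: ~2.57M output tokens per CAD $1
```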

READY TO DEPLOY
H100 GPUS?

On-demand, by the hour, in CAD. No contracts. No waitlists. Live in 60 seconds.

Deploy H100 Now →
Get Enterprise Pricing