// Documentation

GETTING
STARTED.

Welcome to SScoreCompute. Deploy NVIDIA H100, H200, B300 and GB300 NVL72 GPUs on AWS and Azure in under 60 seconds. Billed per hour in CAD.

QUICK START

01

Create Your Account

Sign up at sscorecompute.com/signup. No credit card required. Free to create an account.

02

Choose Your GPU

Select H100, H200, B300 or GB300 NVL72. Pick your cloud (AWS or Azure) and region. All priced in CAD per hour.

03

Deploy in 60 Seconds

Click Deploy. Your GPU instance is ready in under 60 seconds. SSH in or connect via our web terminal.

04

Run Your Workload

Install your framework, upload your model, and start training or inference. Stop the instance when done — you only pay for what you use.

AVAILABLE GPUS

GPUARCHVRAMBANDWIDTHCAD/HRBEST FOR
H100Hopper80GB HBM33.35 TB/s$3.14Inference, RAG
H200Hopper141GB HBM3e4.8 TB/s$4.10Large models
B300Blackwell192GB HBM3e8.0 TB/s$6.71Agentic AI
GB300 NVL72Blackwell Ultra13.8TB pooled130 TB/s$409Training

PYTORCH QUICKSTART

All SScoreCompute GPU instances come with CUDA 12.4, cuDNN 9, and Python 3.11 pre-installed.
# Install PyTorch with CUDA 12.4
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# Verify GPU is available
python -c "import torch; print(torch.cuda.get_device_name(0))"
# Output: NVIDIA H100 80GB HBM3

VLLM — LLM INFERENCE

Deploy LLaMA, Mistral, Qwen or any HuggingFace model with vLLM for high-throughput inference on SScoreCompute GPUs.

# Install vLLM
pip install vllm

# Serve LLaMA 3 70B on H200
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Meta-Llama-3-70B-Instruct \
  --tensor-parallel-size 1 \
  --max-model-len 8192 \
  --port 8000

HUGGING FACE TRANSFORMERS

pip install transformers accelerate bitsandbytes

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
  model_id, torch_dtype=torch.bfloat16, device_map="cuda"
)

FINE-TUNING WITH LORA

pip install peft trl datasets

from peft import LoraConfig, get_peft_model
from trl import SFTTrainer

lora_config = LoraConfig(
  r=16, lora_alpha=32,
  target_modules=["q_proj", "v_proj"],
  lora_dropout=0.05, bias="none"
)

BILLING IN CAD

All SScoreCompute instances are billed per hour in Canadian dollars. Billing starts when an instance is provisioned and stops when it is terminated. There is no minimum spend and no contracts.

GPUCAD/HRCAD/DAY (8HRS)CAD/MONTH (8HRS/DAY)
H100$3.14$25.12$753.60
H200$4.10$32.80$984.00
B300$6.71$53.68$1,610.40
GB300 NVL72$409$3,272$98,160
NEED HELP?

Our team responds within 2 hours on business days.

Contact Support →