
GPU VPS

NVIDIA L4, L40S, and H100 GPU VPS instances with the full CUDA stack pre-installed. Hourly billing from €0.40/hour. Perfect for LLM inference, fine-tuning, and image generation. Hosted in France.

From €0.40/hour

What is a GPU VPS?

A GPU VPS is a virtual server with dedicated, passed-through GPU access — meaning the GPU is not shared with any other tenant. We offer three NVIDIA tiers: L4 (for cost-efficient inference), L40S (for mid-scale training and high-throughput inference), and H100 (for serious training, fine-tuning, and FP8 acceleration).

Every GPU VPS ships with the full CUDA stack pre-installed: NVIDIA driver, CUDA toolkit (12.x), cuDNN, NCCL, and a choice of pre-baked images for PyTorch, TensorFlow, vLLM, TGI, and ComfyUI. You can be running torch.cuda.is_available() == True within 90 seconds of clicking deploy.
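The check mentioned above can be wrapped in a small script that is safe to run on any image. This is a minimal sketch, not a tool shipped with the platform: the function name cuda_report is hypothetical, and it assumes only that the PyTorch image exposes the standard torch.cuda API. On a machine without PyTorch it reports that instead of failing.

```python
import importlib.util


def cuda_report() -> dict:
    """Describe whether PyTorch is installed and can see a CUDA GPU."""
    report = {"torch_installed": False, "cuda_available": False, "devices": []}

    # Avoid an ImportError on images that do not bundle PyTorch.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        # List the name of every GPU visible to this process.
        report["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    print(cuda_report())
```

On a correctly provisioned instance, cuda_available should be True and devices should list the GPU tier you deployed.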

Pricing is hourly with no minimum commitment. An L4 starts at €0.40/hour; an H100 is €3.20/hour. Spin one up to fine-tune for an afternoon, tear it down when you're done, and pay only for active GPU time. For sustained workloads, monthly reserved pricing is available with up to 60% off hourly rates.
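To decide between hourly and reserved pricing, it helps to work out the break-even point. The sketch below uses the hourly rates quoted on this page and makes two labeled assumptions: a month is counted as 730 hours, and the full 60% discount applies (the page only promises "up to" 60%). The function names are illustrative, not part of any billing API.

```python
# Hourly rates quoted on this page, in EUR per hour.
HOURLY_RATE_EUR = {"L4": 0.40, "L40S": 1.40, "H100": 3.20}

HOURS_PER_MONTH = 730      # assumption: average month length in hours
RESERVED_DISCOUNT = 0.60   # assumption: the maximum "up to 60%" discount


def on_demand_cost(gpu: str, hours: float) -> float:
    """Cost of running one instance on hourly billing."""
    return HOURLY_RATE_EUR[gpu] * hours


def reserved_monthly_cost(gpu: str, discount: float = RESERVED_DISCOUNT) -> float:
    """Cost of one reserved month, modeled as a discounted full month."""
    return HOURLY_RATE_EUR[gpu] * HOURS_PER_MONTH * (1 - discount)


def break_even_hours(discount: float = RESERVED_DISCOUNT) -> float:
    """Monthly usage above which the reserved plan becomes cheaper."""
    return HOURS_PER_MONTH * (1 - discount)


if __name__ == "__main__":
    print(f"H100 for a 4-hour afternoon: EUR {on_demand_cost('H100', 4):.2f}")
    print(f"L4 reserved month: EUR {reserved_monthly_cost('L4'):.2f}")
    print(f"Break-even usage: {break_even_hours():.0f} hours per month")
```

Under these assumptions the break-even is about 292 hours per month, regardless of tier, because the discount is modeled as a flat percentage of the hourly rate. Below that usage, hourly billing is cheaper.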

GPU tiers explained

NVIDIA L4 — €0.40/hour

The L4 is our entry-level inference GPU. With 24 GB of GDDR6 memory and 30 TFLOPS FP32, it's optimized for serving small-to-medium LLMs (up to 13B parameters) and SDXL image generation. Perfect for production inference of fine-tuned models, embedding generation, and ComfyUI workflows.

NVIDIA L40S — €1.40/hour

The L40S brings 48 GB of memory and 91 TFLOPS FP32, enough headroom to serve quantized 70B models, run mid-scale fine-tuning jobs, or batch-process large image generation pipelines. The L40S is also our most cost-effective training option for LoRA fine-tuning of 7B-13B models.

NVIDIA H100 — €3.20/hour

The H100 is the apex predator: 80 GB of HBM3 memory, 1979 TFLOPS FP8, and the Transformer Engine, which accelerates transformer workloads by up to 6x compared to the A100. Use it for training serious models, for full-precision 70B+ inference, or for any workload where the GPU is the bottleneck.
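A rough way to match a model to a tier is to estimate how much memory its weights need at a given precision. The sketch below is a back-of-the-envelope estimate only: it counts weights and ignores the KV cache, activations, and framework overhead, so real requirements are higher. The memory sizes come from the tiers above; the function names are illustrative.

```python
# GPU memory per tier in GB, as listed on this page.
GPU_MEMORY_GB = {"L4": 24, "L40S": 48, "H100": 80}


def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory needed for model weights alone, in GB (1 GB = 1e9 bytes).

    One billion parameters at 8 bits each occupy 1 GB, so the result is
    simply params_billion * bits_per_param / 8.
    """
    return params_billion * bits_per_param / 8


def weights_fit(gpu: str, params_billion: float, bits_per_param: int) -> bool:
    """True if the weights alone fit in the tier's GPU memory."""
    return weight_memory_gb(params_billion, bits_per_param) <= GPU_MEMORY_GB[gpu]


if __name__ == "__main__":
    # 13B model quantized to 8 bits on an L4: 13 GB of weights.
    print(weight_memory_gb(13, 8), weights_fit("L4", 13, 8))
    # 70B model quantized to 4 bits on an L40S: 35 GB of weights.
    print(weight_memory_gb(70, 4), weights_fit("L40S", 70, 4))
    # 70B model at 8 bits on an H100: 70 GB of weights.
    print(weight_memory_gb(70, 8), weights_fit("H100", 70, 8))
```

Leave a margin on top of the weight estimate for the KV cache and batch size before settling on a tier.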

Specifications

GPUs: NVIDIA L4 (24 GB), L40S (48 GB), H100 (80 GB)
vCPU: 8 to 32 cores · AMD EPYC
RAM: 32 GB to 256 GB DDR5
Storage: 200 GB to 4 TB NVMe scratch
Network: 40 Gbps
CUDA: 12.x pre-installed
ML stacks: PyTorch, TensorFlow, vLLM, TGI, ComfyUI
Datacenters: Paris (all tiers), Marseille (L4)
Billing: Hourly, no minimum
Reserved discount: Up to −60% on monthly/yearly

Ready to deploy?

14-day money-back guarantee, cancel anytime, support in French and English.