Name: GPU VPS
Brand: FranceVPS
Price: 0.40 EUR
Availability: InStock
Rating: 4.7 (412 reviews)

What is a GPU VPS?

A GPU VPS is a virtual server with dedicated, passed-through GPU access — meaning the GPU is not shared with any other tenant. We offer three NVIDIA tiers: L4 (for cost-efficient inference), L40S (for mid-scale training and high-throughput inference), and H100 (for serious training, fine-tuning, and FP8 acceleration).

Every GPU VPS ships with the full CUDA stack pre-installed: NVIDIA driver, CUDA toolkit (12.x), cuDNN, and NCCL. You also get a choice of pre-baked images for PyTorch, TensorFlow, vLLM, TGI, and ComfyUI. On a PyTorch image, torch.cuda.is_available() returns True within 90 seconds of clicking deploy.
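A quick check along the following lines confirms that PyTorch can see the GPU once the instance is up. This is a minimal sketch that assumes one of the PyTorch images; the helper name gpu_summary is ours for illustration and is not part of any FranceVPS tooling. It reports a negative result instead of failing if torch is not installed.

```python
def gpu_summary():
    """Report whether PyTorch can see a CUDA device on this machine."""
    try:
        import torch  # present on the PyTorch images; may be absent elsewhere
    except ImportError:
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    available = torch.cuda.is_available()
    devices = []
    if available:
        # List each visible GPU with its name and total memory.
        for index in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(index)
            devices.append({
                "name": props.name,
                "memory_gb": round(props.total_memory / 1024**3, 1),
            })
    return {"torch_installed": True, "cuda_available": available, "devices": devices}


if __name__ == "__main__":
    print(gpu_summary())
```

On a working instance, cuda_available should be True and devices should list the GPU tier you deployed.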

Pricing is hourly with no minimum commitment. An L4 starts at €0.40/hour. An H100 is €3.20/hour. Spin one up to fine-tune for an afternoon, tear it down when you're done, and pay only for active GPU time. For sustained workloads, monthly reserved pricing is available with up to 60% off hourly rates.
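To see where hourly billing stops being the cheaper option, the arithmetic can be sketched as below. The hourly rates come from this page. The 730-hour month and the idea that a reserved plan costs a full month at the maximum 60% discount are our assumptions, so treat the output as a rough guide and not a quote.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours

# Hourly rates in EUR, taken from this page.
HOURLY_RATE_EUR = {"L4": 0.40, "L40S": 1.40, "H100": 3.20}


def on_demand_cost(tier, hours):
    """Cost in EUR of running one GPU of the given tier for `hours` hours."""
    return round(HOURLY_RATE_EUR[tier] * hours, 2)


def reserved_monthly_cost(tier, discount=0.60):
    """Assumed reserved price: a full month at the hourly rate, minus the discount."""
    return round(HOURLY_RATE_EUR[tier] * HOURS_PER_MONTH * (1 - discount), 2)


def break_even_hours(discount=0.60):
    """Hours per month above which the assumed reserved price beats hourly billing."""
    return round(HOURS_PER_MONTH * (1 - discount), 1)


if __name__ == "__main__":
    print(on_demand_cost("H100", 6))      # a six-hour fine-tuning session
    print(reserved_monthly_cost("H100"))  # assumed monthly reserved price
    print(break_even_hours())             # usage level where the two options meet
```

Under these assumptions, a six-hour H100 session costs €19.20, and reserved pricing pays off only above roughly 292 GPU-hours per month, or about 40% utilization.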

GPU tiers explained

NVIDIA L4 — €0.40/hour

The L4 is our entry-level inference GPU. With 24 GB of GDDR6 memory and about 30 TFLOPS of FP32 compute, it's optimized for serving small-to-medium LLMs (up to 13B parameters, quantized at the larger end) and SDXL image generation. It's a good fit for production inference of fine-tuned models, embedding generation, and ComfyUI workflows.

NVIDIA L40S — €1.40/hour

The L40S brings 48 GB of memory and about 91 TFLOPS of FP32 compute. That is enough headroom to serve quantized 70B models, run mid-scale fine-tuning jobs, or batch-process large image generation pipelines. The L40S is also our most cost-effective training option for LoRA fine-tuning of 7B-13B models.

NVIDIA H100 — €3.20/hour

The H100 is the apex predator: 80 GB of HBM3 memory, 1,979 TFLOPS of FP8 compute, and a Transformer Engine that speeds up transformer workloads by up to 6x compared to the A100. Use it for training serious models, for full-precision 70B+ inference, or for any workload where the GPU is the bottleneck.
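A rough way to choose between the tiers is to estimate how much memory the model weights need and compare that with each card. The sketch below does only that. Memory sizes and prices come from this page; the function names and the 80% headroom factor (leaving room for the KV cache, activations, and the CUDA context) are our assumptions, and real requirements vary with batch size and context length.

```python
# (tier, memory in GB, EUR per hour), cheapest first; figures from this page.
TIERS = [("L4", 24, 0.40), ("L40S", 48, 1.40), ("H100", 80, 3.20)]


def weight_memory_gb(params_billion, bits_per_param):
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def cheapest_tier(params_billion, bits_per_param, headroom=0.8):
    """Cheapest single GPU whose memory, scaled by `headroom`, holds the weights.

    `headroom` is an assumed fraction of GPU memory usable for weights.
    Returns None if the weights do not fit on any single card.
    """
    needed = weight_memory_gb(params_billion, bits_per_param)
    for name, memory_gb, _price in TIERS:
        if needed <= memory_gb * headroom:
            return name
    return None


if __name__ == "__main__":
    print(cheapest_tier(13, 8))  # 13B model quantized to 8 bits
    print(cheapest_tier(70, 4))  # 70B model quantized to 4 bits
```

With these assumptions, a 13B model at 8 bits (13 GB of weights) fits the L4, while a 70B model at 4 bits (35 GB of weights) lands on the L40S.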

Use cases

LLM inference: Serve Llama 3, Mistral, Mixtral, Qwen via vLLM or TGI.
Fine-tuning: LoRA and full fine-tuning of 7B-13B models on a single H100.
Image generation: Stable Diffusion XL, Flux, ComfyUI workflows.
Computer vision: YOLOv8/v9, Detectron2, MMDetection workloads.
Scientific compute: CUDA-accelerated simulations, computational chemistry.
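
For the LLM inference case, vLLM can expose an OpenAI-compatible HTTP API. The sketch below builds and sends a completion request using only the Python standard library. The address, port 8000, and the model name "my-model" are placeholders for illustration; replace them with whatever your instance actually serves.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=64):
    """Build (without sending) a POST request for an OpenAI-style completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, max_tokens=64, timeout=30):
    """Send the request and return the text of the first generated completion."""
    request = build_completion_request(base_url, model, prompt, max_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


if __name__ == "__main__":
    # Hypothetical address and model name; nothing is sent until complete() is called.
    req = build_completion_request("http://localhost:8000", "my-model", "Hello")
    print(req.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect before pointing it at a live server.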

Specifications

GPUs	NVIDIA L4 (24 GB), L40S (48 GB), H100 (80 GB)
vCPU	8 to 32 cores · AMD EPYC
RAM	32 GB to 256 GB DDR5
Storage	200 GB to 4 TB NVMe scratch
Network	40 Gbps
CUDA	12.x pre-installed
ML stacks	PyTorch, TensorFlow, vLLM, TGI, ComfyUI
Datacenters	Paris (all tiers), Marseille (L4)
Billing	Hourly, no minimum
Reserved discount	Up to 60% off on monthly/yearly plans
