Overview

VESSL AI — Control plane for machine learning and computing

VESSL AI provides a unified interface for training and deploying AI models on the cloud. Simply define your GPU resource and point to your code and dataset. VESSL AI handles the orchestration and heavy lifting for you:

Creates a GPU-accelerated container with the right Docker image.
Mounts your code and dataset from GitHub, Hugging Face, Amazon S3, and more.
Launches the workload on our fully managed GPU cloud.

On any cloud, at any scale

Instantly scale workloads across multiple clouds.

Streamlined interface

Launch any AI workload with a unified YAML definition.

End-to-end coverage

A single platform from fine-tuning to deployment.

A centralized compute platform

Optimize GPU usage and save up to 80% in cloud costs.

What can you do?

Run compute-intensive AI workloads remotely within seconds.
Fine-tune LLMs with distributed training and auto-failover with zero-to-minimum setup.
Scale training and inference workloads horizontally.
Deploy an interactive web application for your model.
Serve your AI models as web endpoints.

How to get started

Head over to VESSL AI and sign up for a free account. No docker build or kubectl get required.

Create your account at VESSL AI.
Install our Python package — pip install vessl.
Follow our Quickstart guide or try out our example models at VESSL Hub.

How does it work?

VESSL AI abstracts the obscure infrastructure and complex backends inherent to launching AI workloads into a simple YAML file, so you don’t have to mess with AWS, Kubernetes, Docker, and more. Here’s an example that launches a chatbot app for Llama 3.2.

name: huggingface-chatbot
description: Chatbot using HuggingFace OSS models
tags:
  - chatbot
  - LLM
import:
  /code/: git://github.com/vessl-ai/examples.git
resources:
  cluster: vessl-oci-sanjose
  preset: gpu-a10-small
image: quay.io/vessl-ai/vllm:0.6.4
run:
  - command: |
      pip install -r requirements.txt
      python app.py --model-id $MODEL_ID
    workdir: /code/runs/hf-chatbot-vllm
ports:
  - name: gradio
    type: http
    port: 7860
env:
  HF_HUB_ENABLE_HF_TRANSFER: "1"
  MODEL_ID: unsloth/Llama-3.2-3B-Instruct

With every YAML file, you are creating a VESSL Run. A VESSL Run is the atomic unit of VESSL AI: a single Kubernetes-backed AI workload. You can use the same YAML definition as you progress through the AI lifecycle, from checkpointing models for fine-tuning to exposing ports for inference.
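
To make the relationship between the sections of the example definition more concrete, here is a minimal Python sketch that transcribes it as a plain dictionary and runs two sanity checks: that each workdir sits under a path mounted by import, and that each $VARIABLE used in a command is defined under env. The function check_run_spec and both checks are hypothetical illustrations, not part of the VESSL package or its validation logic. The second check is only a heuristic, since a container image may define variables of its own.

```python
import re
from pathlib import PurePosixPath

# The example Run definition, transcribed by hand as a plain dict.
run_spec = {
    "name": "huggingface-chatbot",
    "import": {"/code/": "git://github.com/vessl-ai/examples.git"},
    "resources": {"cluster": "vessl-oci-sanjose", "preset": "gpu-a10-small"},
    "image": "quay.io/vessl-ai/vllm:0.6.4",
    "run": [
        {
            "command": (
                "pip install -r requirements.txt\n"
                "python app.py --model-id $MODEL_ID\n"
            ),
            "workdir": "/code/runs/hf-chatbot-vllm",
        }
    ],
    "ports": [{"name": "gradio", "type": "http", "port": 7860}],
    "env": {
        "HF_HUB_ENABLE_HF_TRANSFER": "1",
        "MODEL_ID": "unsloth/Llama-3.2-3B-Instruct",
    },
}


def check_run_spec(spec):
    """Return a list of problems found by two hypothetical consistency checks."""
    problems = []
    mounts = [PurePosixPath(path) for path in spec.get("import", {})]
    env = spec.get("env", {})
    for step in spec.get("run", []):
        # Check 1: the working directory should live under a mounted path.
        workdir = step.get("workdir")
        if workdir is not None:
            wd = PurePosixPath(workdir)
            if not any(wd == m or m in wd.parents for m in mounts):
                problems.append(f"workdir {workdir} is not under any import path")
        # Check 2: every $VAR or ${VAR} in the command should appear in env.
        names = re.findall(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?", step.get("command", ""))
        for var in names:
            if var not in env:
                problems.append(f"command uses ${var} but env does not define it")
    return problems


if __name__ == "__main__":
    print(check_run_spec(run_spec) or "no problems found")
```

In this sketch the example definition passes both checks, because /code/runs/hf-chatbot-vllm lies under the /code/ mount and MODEL_ID is set under env. Removing MODEL_ID from env, or pointing workdir at a path outside /code/, would make the function report a problem.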

What’s next?

See VESSL AI in action with our example Runs and pre-configured open-source models.

Quickstart – Hello, world!

Launch a barebones GPU-accelerated workload on VESSL

GPU-accelerated notebook

Launch a Jupyter Notebook server with an SSH connection

Stable Diffusion Playground

Interactive playground of Stable Diffusion

Llama 3.1 fine-tuning

Fine-tune Llama 3.1-8B with an instruction dataset
