Launch an interactive web app for Stable Diffusion
This example deploys a simple web app for Stable Diffusion. You will learn how to set up an interactive workload for inference: mounting a model from Hugging Face and opening a port for user input. For a more in-depth guide, refer to our blog post.
Try out the Quickstart example with a single click on VESSL Hub.
See the completed YAML file and final code for this example.
Let’s fill in the `stable-diffusion.yaml` file.
We already learned how to launch an interactive workload in our previous guide. Let’s copy and paste the YAML we wrote for `notebook.yaml`.
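As a refresher, a minimal interactive workload spec looks something like the sketch below. The cluster, preset, and container image are placeholder values; substitute the ones available in your organization.

```yaml
# Minimal interactive workload carried over from notebook.yaml.
# Cluster, preset, and image are example values, not requirements.
name: stable-diffusion
description: Interactive Stable Diffusion web app
resources:
  cluster: vessl-gcp-oregon          # example cluster
  preset: gpu-l4-small               # example GPU preset
image: quay.io/vessl-ai/torch:2.1.0-cuda12.2   # example PyTorch image
interactive:
  max_runtime: 24h                   # keep the workload alive for a day
  jupyter:
    idle_timeout: 120m               # shut down after 2 hours of inactivity
```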
Let’s mount a GitHub repo and import a model checkpoint from Hugging Face. We already learned how to mount a codebase in our Quickstart guide.
VESSL AI comes with a native integration with Hugging Face, so you can import models and datasets simply by referencing the link to the Hugging Face repository. Under `import`, let’s create a working directory `/model/` and import the model.
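In YAML, the two mounts might look like the sketch below. The GitHub URL and the SSD-1B checkpoint are assumptions inferred from this example's app file; point them at your own repositories as needed.

```yaml
import:
  /code/:                      # mount the codebase into /code/
    git:
      url: https://github.com/vessl-ai/examples.git   # example repo
      ref: main
  /model/: hf://huggingface.co/segmind/SSD-1B   # assumed checkpoint for ssd_1b_streamlit.py
```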
The `ports` key exposes the workload ports where VESSL AI listens for HTTP requests. This means you will be able to interact with the remote workload, sending an input query and receiving a generated image, through port `80` in this case.
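A sketch of the corresponding `ports` entry; the name `streamlit` is what will show up under ENDPOINTS later:

```yaml
ports:
  - name: streamlit   # appears as the link name under ENDPOINTS
    type: http
    port: 80          # the app will listen on this port
```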
Let’s install additional Python dependencies with `requirements.txt` and finally run our app, `ssd_1b_streamlit.py`.
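The `run` section could look like the following sketch; the `workdir` path is an assumption about where the app lives inside the mounted repo.

```yaml
run:
  - command: |-
      pip install -r requirements.txt
      streamlit run ssd_1b_streamlit.py --server.port=80
    workdir: /code/ssd-1b   # assumed location of the app in the repo
```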
Here, we see how our Streamlit app uses the port we created previously, via the `--server.port=80` flag. Through this port, the app receives user input and generates an image with the Hugging Face model we mounted at `/model/`.
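Putting the pieces together, the complete `stable-diffusion.yaml` might look like this sketch. The resource names, image tag, repo URL, and paths are the same placeholder values used above; refer to the completed YAML file linked earlier for the exact version.

```yaml
# Sketch of the assembled stable-diffusion.yaml with placeholder values.
name: stable-diffusion
description: Interactive Stable Diffusion web app
resources:
  cluster: vessl-gcp-oregon
  preset: gpu-l4-small
image: quay.io/vessl-ai/torch:2.1.0-cuda12.2
import:
  /code/:
    git:
      url: https://github.com/vessl-ai/examples.git
      ref: main
  /model/: hf://huggingface.co/segmind/SSD-1B
run:
  - command: |-
      pip install -r requirements.txt
      streamlit run ssd_1b_streamlit.py --server.port=80
    workdir: /code/ssd-1b
interactive:
  max_runtime: 24h
  jupyter:
    idle_timeout: 120m
ports:
  - name: streamlit
    type: http
    port: 80
```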
Once again, running the workload takes you to the workload Summary page. Under ENDPOINTS, click the `streamlit` link to launch the app.
You can repeat the same process on the web. Head over to your Organization, select a project, and create a New run.
See how VESSL AI takes care of the infrastructural challenges of fine-tuning a large language model with a custom dataset.
Fine-tune Llama 3.2-3B with instruction datasets
Serve & deploy vLLM-accelerated Llama 3.1-8B
Deploy Serverless mode using Text Generation Inference (TGI)