Creating a Run with Web Console
Constructing a Run
With VESSL’s user-friendly Web Console, setting up a new machine learning run is easier than ever. There are two primary ways to create a run in the Web Console.
Create a Run from scratch
Metadata
Metadata configuration annotates a run with additional contextual information: name, description, and tags. Note that name is a required field and that tags are unique within the project.
Resources
You can create a run on either VESSL’s managed cluster or your custom cluster. Start by selecting a cluster.
Cluster & Resource
VESSL managed cluster
Custom cluster
Once you have selected VESSL’s managed cluster, you can view the list of available resources in the dropdown menu.
You also have the option to use spot instances.
Run on spot instance
Handling spot interruption and checkpointing to preserve your work.
Check out the full list of resource types and corresponding prices:
Billing information
Calculating fees according to the time and type of computational resources consumed.
Container image
The container image specifies the image, typically a Docker image, to be used for the run. The image contains all the dependencies and the environment needed to execute your machine learning model. You can use either a VESSL-managed image or your own custom image.
VESSL managed image
Custom image
Managed images serve as wrapper images built on top of NVIDIA GPU Cloud (NGC) images, providing an optimized and streamlined environment for GPU-accelerated applications and workflows.
Task
Volumes
The volumes configuration plays a crucial role in managing data flows to and from the run container. Three primary volume operations (import, mount, and export) determine how data is accessed and transferred.
Import
Mount
Export
During an import operation, the specified data is downloaded into the run container. This is particularly useful when the container requires local access to certain data before or during execution.
- Code: Source code required for the run.
- Dataset: The dataset registered in VESSL Dataset.
- Model: Pre-trained ML checkpoints registered in VESSL Model Registry.
- VESSL Artifact: Storage managed within VESSL. You can use it as a backup volume.
- Object Storage: Data stored in a generic object storage.
- Files: Uploaded local files.
Backup and Restore Data
Run, Backup, Repeat: GPU-powered JupyterLab with VESSL Artifact
By understanding and correctly configuring these volume options, users can create a flexible and efficient data flow strategy in their VESSL Runs.
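As a rough illustration of how the three operations might be modelled, here is a minimal Python sketch. It is not VESSL's API: the `VolumeEntry` class, the `source_type` labels, and the `container_path` field are all hypothetical names chosen for this example. Only the operation names and the list of import sources come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List


class VolumeOperation(Enum):
    IMPORT = "import"  # data is downloaded into the run container
    MOUNT = "mount"
    EXPORT = "export"


# Hypothetical labels for the import sources listed above.
IMPORT_SOURCES = {"code", "dataset", "model", "artifact", "object-storage", "files"}


@dataclass(frozen=True)
class VolumeEntry:
    operation: VolumeOperation
    source_type: str      # hypothetical label, e.g. "dataset"
    container_path: str   # hypothetical: path inside the run container


def validate(entry: VolumeEntry) -> None:
    """Reject entries that this sketch considers malformed."""
    if not entry.container_path.startswith("/"):
        raise ValueError(f"container path must be absolute: {entry.container_path!r}")
    if (entry.operation is VolumeOperation.IMPORT
            and entry.source_type not in IMPORT_SOURCES):
        raise ValueError(f"unknown import source: {entry.source_type!r}")


def group_by_operation(entries: Iterable[VolumeEntry]) -> Dict[VolumeOperation, List[str]]:
    """Validate each entry and collect container paths per operation."""
    grouped: Dict[VolumeOperation, List[str]] = {op: [] for op in VolumeOperation}
    for entry in entries:
        validate(entry)
        grouped[entry.operation].append(entry.container_path)
    return grouped
```

Grouping the entries by operation makes it easy to see at a glance which paths receive downloaded data and which are used by the other two operations.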
Start commands
Start commands are a collection of commands that specify how a container should begin execution after it is initialized. These commands can be grouped into two categories.
- Commands, each given as a pair of a working directory and the command to run in the container.
- A wait command to introduce a delay before or between command execution.
The start command can be empty to signify an interactive run where the user is expected to manually execute commands within the container.
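The two categories of start commands can be sketched in Python as follows. This is an illustrative model only, not how VESSL executes runs: the `Command` and `Wait` classes and the `run_steps` function are hypothetical, and stopping at the first failing command is an assumption of this sketch.

```python
import subprocess
import time
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union


@dataclass
class Command:
    workdir: str   # working directory for the command
    command: str   # shell command to run in that directory


@dataclass
class Wait:
    seconds: float  # delay before the next step


Step = Union[Command, Wait]


def is_interactive(steps: Sequence[Step]) -> bool:
    """An empty list of start commands signifies an interactive run."""
    return len(steps) == 0


def run_steps(steps: Sequence[Step],
              runner: Callable = subprocess.run,
              sleeper: Callable[[float], None] = time.sleep) -> List[int]:
    """Run steps in order and return the exit code of each command."""
    exit_codes: List[int] = []
    for step in steps:
        if isinstance(step, Wait):
            sleeper(step.seconds)
            continue
        result = runner(step.command, shell=True, cwd=step.workdir)
        exit_codes.append(result.returncode)
        if result.returncode != 0:
            break  # assumption: stop at the first failing command
    return exit_codes
```

For example, `run_steps([Command("/tmp", "echo start"), Wait(5), Command("/tmp", "echo done")])` would run the first command, pause for five seconds, and then run the second.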
Interactive
The Interactive option specifies whether the container allows interactive communication with the user.
It is particularly useful for debugging, data analysis, or running services that require user interaction. By default, an interactive run supports JupyterLab and SSH. Both Max runtime and Jupyter idle timeout are useful for managing resource usage and costs. You can also expose multiple types of custom services via specified ports.
Port
Port configuration is a list of maps, each describing a port that an application or service should expose. Each map within the list defines specific attributes of a port, such as its number, name, and type.
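To make the "list of maps" shape concrete, here is a minimal Python sketch that validates such a list. The key names (`name`, `number`, `type`) and the set of allowed types are assumptions made for this example, not VESSL's actual schema.

```python
from typing import Dict, List

REQUIRED_KEYS = {"name", "number", "type"}  # assumed key names
ALLOWED_TYPES = {"http", "tcp"}             # assumed port types


def validate_ports(ports: List[Dict]) -> List[Dict]:
    """Check each port map and reject duplicate names or numbers."""
    seen_numbers = set()
    seen_names = set()
    for port in ports:
        missing = REQUIRED_KEYS - port.keys()
        if missing:
            raise ValueError(f"port entry is missing keys: {sorted(missing)}")
        number = port["number"]
        if not isinstance(number, int) or not 1 <= number <= 65535:
            raise ValueError(f"invalid port number: {number!r}")
        if port["type"] not in ALLOWED_TYPES:
            raise ValueError(f"unsupported port type: {port['type']!r}")
        if number in seen_numbers or port["name"] in seen_names:
            raise ValueError(f"duplicate port entry: {port['name']!r}")
        seen_numbers.add(number)
        seen_names.add(port["name"])
    return ports


# Example: two services exposed by the same container.
example_ports = [
    {"name": "tensorboard", "number": 6006, "type": "http"},
    {"name": "api", "number": 8000, "type": "http"},
]
```

Calling `validate_ports(example_ports)` returns the list unchanged, while a repeated port number or a missing key raises `ValueError`.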
Variables
Environment variables
You can set environment variables as key-value pairs.
A typical machine learning run will include hyperparameters such as learning_rate and optimizer. You can also use them at runtime by appending them to the start command as follows.
python main.py \
  --learning_rate $learning_rate \
  --optimizer $optimizer
If you have sensitive information such as API keys or passwords that you need to include in your environment, you can mark these variables as secrets. Their values are never shown in the UI, which adds an extra layer of security.
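On the script side, a `main.py` that accepts the `--learning_rate` and `--optimizer` flags could look roughly like the sketch below. The fallback to same-named environment variables, the default values, and the `describe_secret` helper are assumptions for illustration. The helper reports only whether a secret is set, so its value never reaches the logs.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None,
               env: Mapping[str, str] = os.environ) -> argparse.Namespace:
    """Read hyperparameters from flags, falling back to environment variables."""
    parser = argparse.ArgumentParser(description="Training entry point (sketch)")
    parser.add_argument(
        "--learning_rate",
        type=float,
        default=float(env.get("learning_rate", "0.001")),  # assumed default
    )
    parser.add_argument(
        "--optimizer",
        default=env.get("optimizer", "adam"),  # assumed default
    )
    return parser.parse_args(argv)


def describe_secret(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Report whether a secret variable is present without revealing it."""
    return f"{name}=<set>" if env.get(name) else f"{name}=<unset>"
```

With this in place, `parse_args()` picks up the values passed on the command line, and `describe_secret("API_KEY")` (a hypothetical variable name) can be logged safely at startup.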
Advanced Settings
Service account name
A service account is a type of non-human account that, in Kubernetes, provides a distinct identity in a cluster. It is useful for implementing identity-based security policies. Create a service account in your Kubernetes cluster and specify its name here.
Termination protection
Checking the termination protection option puts the run in an idle state once it finishes running, so you can still access the container of the finished run.
Create a Run from template
Initiate a new run using a pre-configured template as a baseline. Instead of setting up each parameter and configuration from scratch, you can use a template that already has essential settings and parameters defined. This can significantly accelerate the deployment and testing phases of your projects by reusing configurations that are known to work well for specific use-cases.
The template typically comes in a YAML format. You can further customize these templates to better fit your specific requirements, making it a versatile tool for repetitive or complex tasks.
For more advanced configurations and examples, you can visit VESSL Hub. The hub offers a variety of YAML examples that you can use as references.