vessl experiment
Overview
Run vessl experiment --help to view the list of commands, and vessl experiment [COMMAND] --help to view instructions for an individual command.
Create an experiment
Option | Description |
---|---|
-c , --cluster | Cluster name (must be specified before other options) |
-x , --command | Start command to execute in experiment container |
-r , --resource | Resource type to run an experiment (for managed cluster only) |
--processor-type | CPU or GPU (for custom cluster only) |
--cpu-limit | Number of vCPUs (for custom cluster only) |
--memory-limit | Memory limit in GiB (for custom cluster only) |
--gpu-type | GPU type (for custom cluster only) ex. |
--gpu-limit | Number of GPU cores (for custom cluster only) |
--upload-local-file (multiple) | Upload local file. Format: [local_path] or [local_path]:[remote_path]. ex. |
--upload-local-git-diff | Upload local git commit hash and diff (only works in project repositories) |
-i , --image-url | Kernel docker image URL ex. |
-m , --message | Message |
--termination-protection | Enable termination protection |
-h , --hyperparameter (multiple) | Hyperparameters in the form of ex. |
--dataset (multiple) | Dataset mounts in the form of ex. |
--root-volume-size | Root volume size (defaults to 20Gi ) |
--working-dir | Working directory path (defaults to /root/ ) |
--output-dir | Output directory path (defaults to /output ) |
--local-project | Local project file URL |
--worker-count | Number of workers (for distributed experiment only) |
--framework-type | Specify pytorch or tensorflow (for distributed experiment only) |
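
The options above can be combined in a script. The following is a minimal sketch, not an official VESSL helper: it assumes the subcommand is named `create` (check `vessl experiment --help`), and `my-cluster`, `my-resource`, and the file paths are hypothetical placeholders. It only assembles and prints the command, placing the cluster option first as the table requires.

```python
import shlex


def build_create_command(cluster, command, resource=None, message=None, uploads=()):
    """Assemble argv for creating an experiment.

    The cluster option is emitted first because the option table says it
    must be specified before other options.
    """
    # Assumption: the subcommand is called "create".
    argv = ["vessl", "experiment", "create", "--cluster", cluster]
    argv += ["--command", command]
    if resource is not None:
        # Managed clusters only, per the option table.
        argv += ["--resource", resource]
    if message is not None:
        argv += ["--message", message]
    for spec in uploads:
        # Each spec is [local_path] or [local_path]:[remote_path].
        argv += ["--upload-local-file", spec]
    return argv


# Hypothetical values for illustration only.
argv = build_create_command(
    cluster="my-cluster",
    command="python train.py",
    resource="my-resource",
    message="baseline run",
    uploads=["./train.py:/root/train.py"],
)
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv)` instead of printing it would launch the experiment, provided the CLI is installed and authenticated.
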
Download experiment output files
Each user can define experiment output files. You can save validation results, trained checkpoints, best-performing models, and other artifacts.
Argument | Description |
---|---|
NUMBER | Experiment number |
Option | Description |
---|---|
-p , --path | Local download path (defaults to ./output ) |
--worker-number | Worker number (for distributed experiment only) |
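
As a rough illustration of how the argument and options fit together, the sketch below builds the download command for a single-node and a distributed experiment. It assumes the subcommand is named `download-output`; the experiment numbers, local paths, and worker index are placeholders, and the worker numbering scheme is not stated in this page.

```python
import shlex


def build_download_command(number, path="./output", worker_number=None):
    """Assemble argv for downloading one experiment's output files."""
    # Assumption: the subcommand is called "download-output".
    argv = ["vessl", "experiment", "download-output", "--path", path]
    if worker_number is not None:
        # Only meaningful for distributed experiments.
        argv += ["--worker-number", str(worker_number)]
    # NUMBER is the positional experiment number.
    argv.append(str(number))
    return argv


# Placeholder experiment numbers, paths, and worker index.
single = build_download_command(7, path="./artifacts/exp-7")
distributed = build_download_command(8, path="./artifacts/exp-8", worker_number=0)
print(shlex.join(single))
print(shlex.join(distributed))
```
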
List all experiments
List experiment output files
Each user can define experiment output files. You can save validation results, trained checkpoints, best models, and other artifacts.
Argument | Description |
---|---|
NUMBER | Experiment number |
Option | Description |
---|---|
-r , --recursive | List files recursively |
--worker-number | Worker number (for distributed experiment only) |
View logs of the experiment container
Argument | Description |
---|---|
NUMBER | Experiment number |
Option | Description |
---|---|
--tail | Number of lines to display from the end (defaults to 200) |
--worker-number | Worker number (for distributed experiment only) |
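
The sketch below shows one way to fetch the last lines of an experiment's logs from a script. It assumes the subcommand is named `logs`, and the experiment number is a placeholder. By default it performs a dry run and only returns the command; with `dry_run=False` it invokes the CLI if `vessl` is found on `PATH`.

```python
import shutil
import subprocess


def show_logs(number, tail=200, worker_number=None, dry_run=True):
    """Build the logs command and optionally run it; always returns argv."""
    # Assumption: the subcommand is called "logs".
    argv = ["vessl", "experiment", "logs", "--tail", str(tail)]
    if worker_number is not None:
        # Only meaningful for distributed experiments.
        argv += ["--worker-number", str(worker_number)]
    argv.append(str(number))
    if not dry_run and shutil.which("vessl") is not None:
        # check=True raises CalledProcessError if the CLI exits non-zero.
        subprocess.run(argv, check=True)
    return argv


# Dry run: print the command that would display the last 50 log lines.
print(" ".join(show_logs(12, tail=50)))
```
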
View information on the experiment
Argument | Description |
---|---|
NUMBER | Experiment number |
Terminate an experiment
Argument | Description |
---|---|
NUMBER | Experiment number |
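
When cleaning up several experiments, it can help to inspect each one before terminating it. The following sketch plans that sequence without executing anything. It assumes the subcommands for viewing information and terminating are named `read` and `terminate`; the experiment numbers are placeholders.

```python
def review_then_terminate(numbers):
    """Return a list of argv lists: inspect each experiment, then terminate it.

    Duplicate numbers are dropped and experiments are handled in ascending order.
    """
    plan = []
    for n in sorted(set(numbers)):
        # Assumption: "read" shows experiment information.
        plan.append(["vessl", "experiment", "read", str(n)])
        # Assumption: "terminate" stops the experiment.
        plan.append(["vessl", "experiment", "terminate", str(n)])
    return plan


# Placeholder experiment numbers; 14 is listed twice on purpose.
plan = review_then_terminate([14, 12, 14])
for argv in plan:
    print(" ".join(argv))
```

Experiments created with `--termination-protection` may need separate handling; this page does not describe how termination interacts with that option.
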