Before deploying a service, ensure that VESSL Service is enabled in the cluster settings. Refer to the Endpoint Configuration Guide for details on how to enable VESSL Service and configure endpoints properly.
Deploy a new service using CLI
Using YAML file
To deploy a service using the CLI, first define your service configuration in a YAML file. This YAML-based configuration allows you to deploy services programmatically. If you used this feature during its beta and want to migrate to the new version, refer to the migration guide.
Example YAML configuration:
Steps to deploy using CLI:
- Create or edit your YAML file to define the service configuration.
- Deploy the service by running the deployment command in your terminal. Replace [your-yaml-file].yaml with the path to your YAML file.
To deploy in serverless mode, append the --serverless flag.
When deploying to a NodePort cluster, specify the --port option to expose the service on a specific port.
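When deployments are scripted, the flags described above can be assembled programmatically. The following is a minimal Python sketch, not an official client. build_deploy_command is a hypothetical helper, the -f flag for passing the YAML file is an assumption (confirm the exact option with vessl service create --help), and the sketch assumes --port takes a port number as its value.

```python
import shlex


def build_deploy_command(yaml_path, serverless=False, port=None, file_flag="-f"):
    """Assemble the argument list for deploying a service from a YAML file.

    file_flag is an assumption about how the CLI accepts the YAML path;
    adjust it to match the output of `vessl service create --help`.
    """
    argv = ["vessl", "service", "create", file_flag, yaml_path]
    if serverless:
        # Serverless mode is requested with the --serverless flag.
        argv.append("--serverless")
    if port is not None:
        # For NodePort clusters, expose the service on a specific port.
        argv += ["--port", str(port)]
    return argv


# Print the command as a shell-safe string instead of running it.
print(shlex.join(build_deploy_command("service.yaml", serverless=True)))
```

Returning a list rather than a single string lets the result be passed directly to subprocess.run without shell quoting issues.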
Using VESSL Hub templates
VESSL Hub provides service templates for rapid service deployment.
Steps to deploy using VESSL Hub templates:
- Get the key of the template you want to use from the VESSL Hub
- Start your service by running vessl service create --from-hub=[template-key].
Deploy a new service using web console
Deploying through the web console is user-friendly and suits those who prefer a graphical interface over command-line operations. The interactive demo below guides you through the process.
Provisioned Mode: Steps to create a new service in the web console
The explanations of each field are as follows:
- Initialize this revision with: Select the initialization method.
- Template from VESSL hub: Use a template from the VESSL Hub.
- Recent revision configuration: Select the configuration of the recent revisions.
- YAML file: Upload a YAML file or paste its content to initialize the revision.
- Message: Enter a message for the revision.
- Resources: Select the compute resources and container image you want to use for the Service.
- Resource: Select the compute resources you want to use for the Service.
- Container image: The Docker image to use for the Revision.
- Task:
- Volumes: Import or mount code and data.
- Command: The command to run inside the container. This is similar to running a command in the terminal on your computer.
- Port: The port to expose from the container. For example, if you’re using a BentoML model server, you’ll want to expose port 3000 and use the HTTP protocol to access the service endpoint.
- Monitoring: Enable monitoring to track default system metrics from service workers.
- Healthcheck: Check API health using the specified port and path.
- Autoscaling: Set autoscaling strategy for the revision.
- Target Metric: The metric to use for autoscaling: cpu, memory, nvidia.com/GPU, or requests.
- Target Value: The target value for the metric.
- Min value: The minimum number of replicas.
- Max value: The maximum number of replicas.
- Variables: Environment variables or secret variables to inject into the container.
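To make the interaction between Target Metric, Target Value, Min value, and Max value concrete, here is a minimal Python sketch. It assumes the autoscaler follows the replica formula used by the Kubernetes Horizontal Pod Autoscaler; this page does not confirm which algorithm VESSL applies, and desired_replicas is a hypothetical helper. The sketch also leaves out tolerance bands and stabilization windows that real autoscalers typically use.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas):
    """Return the replica count an HPA-style autoscaler would aim for.

    Assumption: the rule is ceil(current_replicas * current / target),
    clamped to the configured minimum and maximum.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Scale the replica count by how far the metric is from its target.
    raw = math.ceil(current_replicas * current_value / target_value)
    # Clamp to the configured Min value and Max value.
    return max(min_replicas, min(max_replicas, raw))


# Two replicas at 90% CPU against a 60% target: scale out to three.
print(desired_replicas(2, 90, 60, min_replicas=1, max_replicas=5))
```

Under this rule, the Max value caps the result even when the metric is far above target, and the Min value keeps replicas running when load is low.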
Serverless Mode: Steps to create a new service in the web console
The explanations of each field are as follows:
- Resources: Select the compute resources and container image you want to use for the Service.
- Resource: Select the compute resources you want to use for the Service. Custom resource specs cannot be set in Serverless mode. If you need to use a custom resource spec, please contact our sales team.
- Container image: The Docker image to use for the Revision.
- Task:
- Command: The command to run inside the container. This is similar to running a command in the terminal on your computer.
- Port: The port to expose from the container. For example, if you’re using a BentoML model server, you’ll want to expose port 3000 and use the HTTP protocol to access the service endpoint. You can open only one port in Serverless mode.
- Advanced Settings: Set additional configurations.
- Variables: Environment variables or secret variables to inject into the container.
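The two Serverless mode restrictions noted above (exactly one open port, no custom resource specs) can be checked before a revision is submitted. The following Python sketch illustrates such a pre-flight check. check_serverless_config and its parameters are hypothetical names chosen for illustration and are not part of any VESSL API.

```python
def check_serverless_config(ports, uses_custom_resource_spec):
    """Return a list of problems that would conflict with Serverless mode.

    ports: list of container ports the revision wants to expose.
    uses_custom_resource_spec: True if a custom resource spec is requested.
    """
    problems = []
    if len(ports) != 1:
        # Serverless mode allows only one open port.
        problems.append(f"expected exactly 1 port, got {len(ports)}")
    if uses_custom_resource_spec:
        # Custom resource specs cannot be set in Serverless mode.
        problems.append("custom resource specs are not available")
    return problems


# A BentoML-style server exposing only port 3000 passes both checks.
print(check_serverless_config([3000], uses_custom_resource_spec=False))
```

Returning a list of messages rather than raising on the first failure lets a caller report every conflict at once.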