You can create an EKS cluster in your own AWS account and connect it to VESSL.
To integrate with VESSL, the following resources are created in your account:
S3 Bucket: A bucket for storing configuration, state, and data.
EKS Cluster: An AWS-managed Kubernetes cluster for running ML workloads.
EKS Node Groups: Autoscaling groups for selected resource types.
1. Install Terraform and AWS CLI
VESSL uses Terraform to create an EKS cluster and EKS node groups and to install the required Kubernetes components. Install Terraform and the AWS CLI before continuing.
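As a quick sanity check before moving on, the sketch below tests whether both tools are on your `PATH`. The helper names `have_tool` and `check_tools` are made up for this example and are not part of VESSL, Terraform, or the AWS CLI; the script only looks for the `terraform` and `aws` executables and does not check versions or credentials.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Return 0 if the named command is on PATH, non-zero otherwise.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print one line per missing tool; return 1 if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! have_tool "$tool"; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the two tools this guide relies on.
check_tools terraform aws || echo "Install the missing tools before continuing."
```

If nothing is printed, both executables were found.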
2. Configure the cluster
First, clone VESSL’s cloud integration Terraform code from GitHub.
git clone https://github.com/vessl-ai/vessl-cloud-integration
cd vessl-cloud-integration/examples/aws-eks-existing-vpc
Using the VESSL CLI, you can configure the Terraform variables and the Terraform backend.
pip install vessl
vessl cluster create-config aws
Two config files and a node group definition file are generated in your working directory and in the S3 bucket:
terraform.tfbackend: This file configures Terraform’s backend storage.
terraform.tfvars: This file specifies the variables for your cluster configuration.
nodes.tf: This file defines the node groups for your resource types.
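Before running Terraform, it may help to confirm that all three files are present in the working directory. The following is a minimal sketch; `missing_config_files` is a hypothetical helper written for this example, and it only checks that the files exist, not that their contents are valid.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.
# Print the name of each expected file that is absent from the given
# directory (defaults to the current directory).
missing_config_files() {
    dir="${1:-.}"
    for f in terraform.tfbackend terraform.tfvars nodes.tf; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Report on the current directory.
absent=$(missing_config_files .)
if [ -n "$absent" ]; then
    echo "Not found in $(pwd):"
    echo "$absent"
else
    echo "All generated files are present."
fi
```

If any file is reported as not found, rerunning the config step in the example directory is one way to regenerate it.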
3. Apply Terraform
Initialize Terraform with the generated backend configuration:
terraform init -backend-config="terraform.tfbackend"
The actual resources are created when you apply the Terraform configuration:
terraform apply -var-file="terraform.tfvars"
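If you prefer to review the changes before anything is created, one optional variation is to save a plan to a file and apply exactly that plan. This is a sketch rather than part of the VESSL workflow: the function name `plan_and_apply` and the plan file name `tfplan` are arbitrary choices for this example. Note that `terraform apply` takes no `-var-file` when given a saved plan, because the variables are already recorded in the plan.

```shell
#!/bin/sh
# Hypothetical wrapper for illustration only.
# Save a plan, ask for confirmation, then apply that exact plan.
plan_and_apply() {
    # "tfplan" is an arbitrary file name chosen for this sketch.
    terraform plan -var-file="terraform.tfvars" -out="tfplan" || return 1

    printf 'Apply this plan? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
        terraform apply "tfplan"
    else
        echo "Skipped apply."
    fi
}

# Run from the example directory after `terraform init`:
#   plan_and_apply
```

Plain `terraform apply -var-file="terraform.tfvars"` also prints a plan and asks for confirmation, so this variation mainly matters when you want the reviewed plan and the applied plan to be identical.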
The installation takes about 20 to 30 minutes. Keep your internet connection active while it runs.
Once the cluster is installed, you can find it on the cluster page.
Destroy and delete the cluster
To destroy all resources created by VESSL, including the cluster, run:
terraform destroy -var-file="terraform.tfvars"
If the config files are missing locally, you can download them and start from scratch.
git clone https://github.com/vessl-ai/vessl-cloud-integration
cd vessl-cloud-integration/examples/aws-eks-existing-vpc
vessl cluster get-config [cluster_name]
terraform init -backend-config="terraform.tfbackend"
terraform destroy -var-file="terraform.tfvars"
After destroying a cluster, you can delete it from the cluster page.