Example 1: Data Science Project Setup
This example shows how to set up a workspace with multiple data sources for a typical data science project.
Web Console Setup
When creating a new workspace, configure the following volumes:
- Import code repository
  - Type: Import
  - Source: git://github.com/myorg/customer-analytics
  - Target path: /workspace/code
- Import training dataset
  - Type: Import
  - Source: VESSL Dataset myorg/customer-data-train
  - Target path: /data/train
- Mount shared results storage
  - Type: Mount
  - Source: VESSL Storage shared-experiments
  - Target path: /results
CLI Configuration
The same three volumes can be attached when creating the workspace from the vessl CLI. The exact flags vary by CLI version, so refer to the VESSL CLI reference for the volume options on the workspace-creation command.
Workspace Structure
After startup, your workspace will have your code at /workspace/code, the training data at /data/train, and shared results storage at /results.
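The exact contents depend on your repository and dataset, but a quick notebook-cell check (a minimal sketch; the paths come from the volume configuration above) confirms that all three volumes are attached:

```python
from pathlib import Path

# Target paths from the volume configuration above.
expected = ["/workspace/code", "/data/train", "/results"]

for target in expected:
    status = "mounted" if Path(target).is_dir() else "missing"
    print(f"{target}: {status}")
```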
Benefits
- Automatic setup: All data is ready when the workspace starts
- Team collaboration: Shared results storage enables team access to experiments
- Version control: Code automatically synced from Git
- Organized structure: Clear separation of code, data, and results
Example 2: Large Dataset Workflow
This example demonstrates working with large datasets that exceed workspace disk capacity.
Problem
Your workspace has 50GB of disk space, but you need to work with a 100GB dataset.
Solution: Mount the Dataset
Attach the dataset as a Mount volume rather than an Import. A mounted volume reads data from the underlying storage on demand instead of copying it onto the workspace disk, so the dataset's size is not limited by local capacity.
Working with Mounted Data
In your Jupyter notebook:
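A minimal sketch of chunked processing, assuming the dataset is mounted at /data/large-dataset and contains a CSV file (both names are illustrative):

```python
import pandas as pd

# Illustrative path: the mount target chosen for the 100GB dataset.
CSV_PATH = "/data/large-dataset/events.csv"

# Stream the file in 1M-row chunks; only one chunk is held in memory
# at a time, and nothing is copied onto the 50GB workspace disk.
total_rows = 0
for chunk in pd.read_csv(CSV_PATH, chunksize=1_000_000):
    total_rows += len(chunk)

print(f"Processed {total_rows:,} rows without a local copy")
```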
Example 3: Multi-Modal AI Development
This example shows how to set up a workspace for multi-modal AI development with various data types.
CLI Setup
As in Example 1, attach the volumes when creating the workspace: mount the large media datasets and the shared experiment and checkpoint storage, and import the pre-trained models you need at startup.
Development Workflow
- Data Exploration: Use mounted datasets for exploration without consuming local disk
- Model Development: Access pre-trained models immediately
- Experiment Tracking: Save experiments to shared storage
- Checkpoint Management: Keep checkpoints in persistent storage shared across the team (see the sketch below)
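A sketch of how the model development and checkpoint steps look in code, assuming pre-trained weights were imported to /models and shared checkpoint storage is mounted at /checkpoints (all paths are illustrative):

```python
from pathlib import Path

import torch
from transformers import AutoModel, AutoTokenizer

# Model development: load pre-trained weights from the imported volume.
MODEL_DIR = "/models/pretrained-encoder"  # illustrative path
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModel.from_pretrained(MODEL_DIR)

# Checkpoint management: write checkpoints to the shared mounted volume,
# where teammates' workspaces can read them immediately.
ckpt_dir = Path("/checkpoints/multimodal-baseline")  # illustrative path
ckpt_dir.mkdir(parents=True, exist_ok=True)
torch.save(model.state_dict(), ckpt_dir / "step_00000.pt")
```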
Example 4: External Cloud Storage Integration
This example demonstrates integrating external cloud storage for seamless data access.
Prerequisites
- Set up AWS credentials in organization settings
- Ensure S3 bucket has appropriate permissions
Setup with External Storage
When creating the workspace, add a Mount volume whose source is your S3 bucket and pick a target path such as /data/s3 (the bucket and path names here are illustrative).
Working with Cloud Data
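Once mounted, the bucket behaves like a local directory. Assuming the /data/s3 target from the setup above (an illustrative path), a notebook cell can read objects with ordinary file I/O, with no manual downloads or boto3 calls:

```python
from pathlib import Path

import pandas as pd

S3_MOUNT = Path("/data/s3")  # illustrative mount target from the setup above

# List a few objects exposed through the mount.
for p in sorted(S3_MOUNT.glob("*.parquet"))[:5]:
    print(p.name)

# Read one object directly; the mount translates file reads into S3 access.
df = pd.read_parquet(S3_MOUNT / "transactions.parquet")  # illustrative file
print(df.shape)
```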
Example 5: Hugging Face Integration
This example shows how to work efficiently with Hugging Face datasets and models.
Setup
If you use gated models or datasets, make your Hugging Face access token available in the workspace, and consider pointing the Hugging Face cache at a persistent volume so downloads survive workspace restarts.
Development
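A minimal sketch of the development loop; redirecting the cache via HF_HOME and the specific dataset and model names are assumptions for illustration:

```python
import os

# Point the Hugging Face cache at a persistent volume (illustrative path)
# before importing the libraries, so downloads are reused across sessions.
os.environ["HF_HOME"] = "/data/hf-cache"

from datasets import load_dataset
from transformers import AutoTokenizer

# First use downloads into the cache above; later sessions load instantly.
dataset = load_dataset("imdb", split="train[:1%]")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

sample = dataset[0]
tokens = tokenizer(sample["text"], truncation=True, max_length=32)
print(tokens["input_ids"][:10])
```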
Best Practices for Volume Usage
1. Choose the Right Volume Type
Use Import for:
- Small to medium datasets (< 10GB)
- Code repositories
- Pre-trained models
- Data that doesn’t change during development

Use Mount for:
- Large datasets (> 10GB)
- Frequently updated data
- Shared storage across workspaces
- Real-time data pipelines
2. Organize Your Data
Keep code, data, and outputs on separate volumes with predictable target paths, as in Example 1 (/workspace/code, /data/train, /results), so notebooks and scripts can rely on stable locations across workspace restarts.
3. Performance Optimization
- Co-locate storage and compute: Use storage in the same region as your cluster
- Cache frequently accessed data: Copy small, frequently used files to /root (see the sketch below)
- Use appropriate storage types: SSDs for random access, object storage for large sequential reads
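For example, a small lookup file that lives on a mounted volume can be copied to local disk once at startup (both paths are illustrative):

```python
import shutil
from pathlib import Path

src = Path("/data/train/vocab.json")  # illustrative file on a mounted volume
dst = Path("/root/cache/vocab.json")  # local disk: fast repeated reads

dst.parent.mkdir(parents=True, exist_ok=True)
if not dst.exists():
    shutil.copy2(src, dst)  # one network read, local access from then on
```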
4. Cost Optimization
- Import small data: For datasets under 1GB, import is usually more cost-effective
- Mount large data: Avoid duplicating large datasets across workspaces
- Clean up exports: Regularly clean up exported data to avoid storage costs (a pruning sketch follows below)
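As one way to automate that cleanup, a short script can prune exported files older than a retention window (the directory and window are illustrative):

```python
import time
from pathlib import Path

EXPORT_DIR = Path("/results/exports")  # illustrative export location
RETENTION_DAYS = 30                    # illustrative retention window

cutoff = time.time() - RETENTION_DAYS * 86_400
for f in EXPORT_DIR.rglob("*"):
    if f.is_file() and f.stat().st_mtime < cutoff:
        f.unlink()  # delete files older than the retention window
        print(f"removed {f}")
```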
Common Troubleshooting
Volume Mount Failures
If a volume fails to attach, verify that the source still exists, that your account has access to it, and that the target path does not collide with another volume.
Performance Issues
If reads from a mounted volume are slow, check whether the storage lives in a different region from your cluster, and cache hot files locally as described under Performance Optimization above.
Access Permission Issues
If access is denied, confirm the credentials configured in your organization settings (for external storage such as S3) and that your account’s role grants access to the dataset or storage volume.
Learn more about workspace volumes
Explore the complete guide to workspace volume configuration and advanced usage patterns.

