Developer Guide
Storage
Overview
Komodo supports reading from and writing to cloud object storage buckets from your machines, jobs, and services.
The following object storage providers are currently supported:
- Amazon S3
- Google Cloud Storage (GCS)
- Azure Blob Storage
To use these providers, you will first need to connect your cloud account.
There are two modes for using storage buckets:
- COPY - copies the contents of the bucket to a specific location on the instance. This is helpful for datasets where you need fast read access. Any files written here will NOT be uploaded back to the bucket.
- MOUNT - mounts the bucket as a file system on the instance. This is helpful for having a persistent storage location, where you can store your model checkpoints and access them even after your tasks finish. Any files written here will be uploaded back to the bucket.
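The difference between the two modes can be illustrated with a toy in-memory model. This is a sketch only, not Komodo's implementation: `Bucket`, `copy_bucket`, and `MountedBucket` are hypothetical names invented for illustration, and a plain dictionary stands in for both the bucket and the instance's local disk.

```python
class Bucket:
    """Hypothetical stand-in for a cloud bucket: object key -> contents."""

    def __init__(self, objects):
        self.objects = dict(objects)


def copy_bucket(bucket):
    """COPY mode (toy): take a local snapshot of the bucket's contents."""
    return dict(bucket.objects)


class MountedBucket:
    """MOUNT mode (toy): reads and writes go through to the bucket."""

    def __init__(self, bucket):
        self._bucket = bucket

    def read(self, key):
        return self._bucket.objects[key]

    def write(self, key, data):
        self._bucket.objects[key] = data


bucket = Bucket({"train.csv": "a,b\n1,2\n"})

# COPY: local writes stay on the instance and never reach the bucket.
local = copy_bucket(bucket)
local["scratch.txt"] = "temporary"
print("scratch.txt" in bucket.objects)  # False

# MOUNT: writes are persisted to the bucket.
mounted = MountedBucket(bucket)
mounted.write("checkpoint.pt", "weights")
print("checkpoint.pt" in bucket.objects)  # True
```

In this model, anything written to the COPY snapshot is lost once the instance goes away, while the checkpoint written through the mount remains in the bucket.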
To use a cloud storage bucket from any of these providers, include it in your task config.
Picking a storage mode
| | MOUNT | COPY |
|---|---|---|
| Best for | Writing task outputs (e.g. checkpoints, logs), reading very large data that won’t fit on disk. | High-performance read-only access to datasets that fit on disk. |
| Performance | Slow to read/write files. Fast to provision. | Fast file access. Slow at initial provisioning. |
| Writing to buckets | Most write operations are supported. | Not supported, read-only. |
| Disk size | No disk size requirements. | Disk size must be greater than the size of the bucket. |
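
The comparison above can be condensed into a small decision helper. This is a minimal sketch under stated assumptions, not part of Komodo: `StorageMode` and `pick_storage_mode` are hypothetical names, and the rule encodes only the "Writing to buckets" and "Disk size" rows of the table.

```python
from enum import Enum


class StorageMode(Enum):
    MOUNT = "MOUNT"
    COPY = "COPY"


def pick_storage_mode(needs_write: bool,
                      bucket_size_gb: float,
                      disk_size_gb: float) -> StorageMode:
    """Suggest a storage mode from the table's rules (hypothetical helper)."""
    # COPY is read-only, so anything that writes back needs MOUNT.
    if needs_write:
        return StorageMode.MOUNT
    # COPY requires the disk to be larger than the bucket.
    if bucket_size_gb >= disk_size_gb:
        return StorageMode.MOUNT
    # Read-only data that fits on disk gets fast local access via COPY.
    return StorageMode.COPY


# Read-only dataset that fits on disk.
print(pick_storage_mode(False, bucket_size_gb=50, disk_size_gb=200).value)
# Checkpoint directory that must be written back to the bucket.
print(pick_storage_mode(True, bucket_size_gb=5, disk_size_gb=200).value)
```

The helper ignores the performance trade-off: a workload that reads a small bucket only once may still prefer MOUNT to avoid the slower initial provisioning of COPY.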