Cloud Object Storage
Komodo supports reading from and writing to cloud object storage buckets on your machines, jobs, and services.
Currently, only S3 buckets are supported; support for GCS and R2 is coming soon!
There are two modes for using storage buckets:
- COPY - copies the contents of the bucket to a specific location on the instance. This is helpful for datasets where you need fast read access. Any files written here will NOT be uploaded back to the bucket.
- MOUNT - mounts the bucket as a file system on the instance. This is helpful for having a persistent storage location, where you can store your model checkpoints and access them even after your tasks finish. Any files written here will be uploaded back to the bucket.
Here is an example of what you’d need to include in your task config to effectively use cloud storage buckets:
Picking a storage mode
| | MOUNT | COPY |
|---|---|---|
| Best For | Writing task outputs (e.g. checkpoints, logs), reading very large data that won’t fit on disk. | High performance read-only access to datasets that fit on disk. |
| Performance | Slow to read/write files. Fast to provision. | Fast file access. Slow at initial provisioning. |
| Writing to buckets | Most write operations are supported. | Not supported, read-only. |
| Disk Size | No disk size requirements. | Disk size must be greater than the size of the bucket. |
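
The comparison above can be read as a small decision rule. The following Python sketch is only an illustration of that rule, not part of Komodo: the `Workload` class, its fields, and the `pick_storage_mode` function are hypothetical names introduced here, and the sizes are assumed to be known in advance.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of how a task uses a bucket (not a Komodo type)."""
    needs_write: bool      # task writes outputs (e.g. checkpoints, logs) to the bucket
    bucket_size_gb: float  # total size of the bucket contents
    disk_size_gb: float    # disk size of the instance


def pick_storage_mode(w: Workload) -> str:
    """Return "MOUNT" or "COPY" by applying the rows of the comparison table."""
    # COPY is read-only, so any need to write back to the bucket means MOUNT.
    if w.needs_write:
        return "MOUNT"
    # COPY needs a disk strictly larger than the bucket; otherwise use MOUNT.
    if w.disk_size_gb <= w.bucket_size_gb:
        return "MOUNT"
    # Read-only data that fits on disk: COPY gives fast file access.
    return "COPY"


if __name__ == "__main__":
    print(pick_storage_mode(Workload(True, 10, 100)))    # MOUNT
    print(pick_storage_mode(Workload(False, 500, 100)))  # MOUNT
    print(pick_storage_mode(Workload(False, 10, 100)))   # COPY
```

This sketch ignores the provisioning trade-off: COPY is slow at initial provisioning, so for short-lived tasks MOUNT may still be the better choice even when the data fits on disk. If the bucket size is not known, one way to estimate it for an S3 bucket is `aws s3 ls s3://<bucket> --recursive --summarize`.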