AbstractDataStore
AbstractDataStore()
Protocol for data storage backends (S3, local disk, PDS blobs).
Separates index (metadata) from data store (shard files), enabling flexible deployment combinations.
Examples
>>> store = S3DataStore(credentials, bucket="my-bucket")
>>> urls = store.write_shards(dataset, prefix="training/v1")
Methods
| Name | Description |
|---|---|
| read_url | Resolve a storage URL for reading (e.g., sign S3 URLs). |
| write_shards | Write dataset shards to storage. |
read_url
AbstractDataStore.read_url(url)
Resolve a storage URL for reading (e.g., sign S3 URLs).
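As a rough illustration of what "resolving" a URL can mean for a backend that does no signing, here is a minimal sketch. The `LocalDiskStore` class, its `root` argument, and the `local://` scheme are all hypothetical and not part of atdata. The sketch assumes `read_url` takes a string and returns a string, which the reference above does not state explicitly.

```python
from urllib.parse import urlparse


class LocalDiskStore:
    """Hypothetical backend (not part of atdata); illustrates read_url only."""

    def __init__(self, root: str) -> None:
        # Directory under which shards are assumed to live.
        self.root = root.rstrip("/")

    def read_url(self, url: str) -> str:
        # Assumption: "local://" is a made-up scheme that this sketch
        # maps to a filesystem path under `root`.
        parsed = urlparse(url)
        if parsed.scheme == "local":
            return f"{self.root}/{parsed.netloc}{parsed.path}"
        # Any other URL is passed through unchanged.
        return url


store = LocalDiskStore("/data")
resolved = store.read_url("local://datasets/mnist/shard-000000.tar")
```

A backend such as S3 would presumably return a signed URL at this step instead of a local path.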
write_shards
AbstractDataStore.write_shards(ds, *, prefix, **kwargs)
Write dataset shards to storage.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | The Dataset to write. | required |
| prefix | str | Path prefix (e.g., 'datasets/mnist/v1'). | required |
| **kwargs | | Backend-specific options (maxcount, maxsize, etc.). | {} |
Returns
| Name | Type | Description |
|---|---|---|
| | list[str] | List of shard URLs suitable for atdata.Dataset(). |
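To make the contract concrete, the following sketch shows one way a backend might satisfy both documented methods. Everything here is illustrative: `DataStoreLike` is a stand-in that only restates the two method signatures, `InMemoryStore` and the `mem://` scheme are hypothetical, a plain iterable of `bytes` stands in for a real `Dataset`, and `maxcount` is assumed to mean "samples per shard".

```python
from __future__ import annotations

from typing import Any, Iterable, Protocol, runtime_checkable


@runtime_checkable
class DataStoreLike(Protocol):
    """Stand-in for AbstractDataStore; restates only the documented methods."""

    def read_url(self, url: str) -> str: ...

    def write_shards(self, ds: Any, *, prefix: str, **kwargs: Any) -> list[str]: ...


class InMemoryStore:
    """Hypothetical backend that keeps shards in a dict instead of real storage."""

    def __init__(self) -> None:
        self.blobs: dict[str, list[bytes]] = {}

    def read_url(self, url: str) -> str:
        # Nothing to sign or rewrite for an in-memory store.
        return url

    def write_shards(
        self, ds: Iterable[bytes], *, prefix: str, **kwargs: Any
    ) -> list[str]:
        # Assumption: maxcount caps the number of samples per shard.
        maxcount = int(kwargs.get("maxcount", 2))
        urls: list[str] = []
        shard: list[bytes] = []
        for sample in ds:
            shard.append(sample)
            if len(shard) == maxcount:
                urls.append(self._flush(prefix, len(urls), shard))
                shard = []
        if shard:  # write the final, partially filled shard
            urls.append(self._flush(prefix, len(urls), shard))
        return urls

    def _flush(self, prefix: str, index: int, shard: list[bytes]) -> str:
        url = f"mem://{prefix}/shard-{index:06d}"
        self.blobs[url] = shard
        return url


store = InMemoryStore()
urls = store.write_shards([b"a", b"b", b"c"], prefix="datasets/mnist/v1", maxcount=2)
```

Because the stand-in is a structural protocol, `InMemoryStore` conforms without inheriting from anything, which is the kind of flexibility the index/data-store separation is meant to allow.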