AbstractDataStore

AbstractDataStore()

Protocol for data storage backends (S3, local disk, PDS blobs).

Separates the index (metadata) from the data store (shard files), so the two can be deployed in different combinations.

Examples

>>> store = S3DataStore(credentials, bucket="my-bucket")
>>> urls = store.write_shards(dataset, prefix="training/v1")
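
As a rough illustration of what conforming to the protocol could look like, the sketch below defines a hypothetical `DataStoreLike` protocol that mirrors the two documented methods, plus a toy `InMemoryStore` backend. Both names, the `mem://` URL scheme, and the shard naming are assumptions made for this example; it also assumes the protocol is structural in the style of `typing.Protocol`, which this page does not confirm.

```python
from typing import Any, Iterable, Protocol, runtime_checkable


@runtime_checkable
class DataStoreLike(Protocol):
    """Hypothetical stand-in mirroring the two documented methods."""

    def read_url(self, url: str) -> str: ...

    def write_shards(self, ds: Any, *, prefix: str, **kwargs: Any) -> list[str]: ...


class InMemoryStore:
    """Toy backend: keeps the written data in a dict instead of real storage."""

    def __init__(self) -> None:
        self.blobs: dict[str, list[Any]] = {}

    def read_url(self, url: str) -> str:
        # Nothing to sign or rewrite for in-memory data.
        return url

    def write_shards(self, ds: Iterable[Any], *, prefix: str, **kwargs: Any) -> list[str]:
        # A single "shard" holding every sample; the URL scheme is made up.
        url = f"mem://{prefix}/shard-000000"
        self.blobs[url] = list(ds)
        return [url]


store = InMemoryStore()
urls = store.write_shards([1, 2, 3], prefix="training/v1")

# No inheritance is needed: the runtime check only looks for the two methods.
conforms = isinstance(store, DataStoreLike)
```

Because the check is structural, any class that provides both methods would pass it, which is one way the index and the data store can be swapped independently.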

Methods

| Name | Description |
| --- | --- |
| read_url | Resolve a storage URL for reading (e.g., sign S3 URLs). |
| write_shards | Write dataset shards to storage. |

read_url

AbstractDataStore.read_url(url)

Resolve a storage URL for reading (e.g., sign S3 URLs).
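
To make "resolving" a URL more concrete, here is a minimal sketch of a backend whose `read_url` attaches an expiry time and a signature to the URL. `ToySigningStore`, its HMAC scheme, the `expires` and `sig` query parameters, and the extra `now` keyword (added only so the output is reproducible) are all assumptions for illustration. This is not how S3 presigned URLs are computed; a real S3 backend would delegate signing to its SDK.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlencode


class ToySigningStore:
    """Hypothetical backend whose read_url appends an expiring signature."""

    def __init__(self, secret: bytes, ttl_seconds: int = 3600) -> None:
        self._secret = secret
        self._ttl = ttl_seconds

    def read_url(self, url: str, *, now: Optional[float] = None) -> str:
        # `now` is an illustrative extra; the documented signature takes only `url`.
        current = time.time() if now is None else now
        expires = int(current + self._ttl)

        # Sign the URL together with its expiry so neither can be altered.
        message = f"{url}\n{expires}".encode()
        sig = hmac.new(self._secret, message, hashlib.sha256).hexdigest()

        # Append to an existing query string if the URL already has one.
        sep = "&" if "?" in url else "?"
        return f"{url}{sep}{urlencode({'expires': expires, 'sig': sig})}"


store = ToySigningStore(b"not-a-real-secret", ttl_seconds=60)
signed = store.read_url("https://example.com/training/v1/shard-000000.tar", now=1000.0)
```

A backend that needs no signing, such as local disk, could simply return `url` unchanged.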

write_shards

AbstractDataStore.write_shards(ds, *, prefix, **kwargs)

Write dataset shards to storage.

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| ds | Dataset | The Dataset to write. | required |
| prefix | str | Path prefix (e.g., 'datasets/mnist/v1'). | required |
| `**kwargs` | | Backend-specific options (maxcount, maxsize, etc.). | `{}` |

Returns

| Name | Type | Description |
| --- | --- | --- |
| | list[str] | List of shard URLs suitable for atdata.Dataset(). |
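
The sketch below shows one way a backend might honor `prefix` and a `maxcount` option when writing shards. `LocalDiskStore`, the `.bin` file naming, the default of 1000, and the treatment of a dataset as a plain iterable of `bytes` are assumptions made to keep the example self-contained. Concatenating raw bytes is a placeholder, not the shard format any real backend uses.

```python
import tempfile
from pathlib import Path
from typing import Any, Iterable


class LocalDiskStore:
    """Hypothetical local-disk backend that groups samples into shard files."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def read_url(self, url: str) -> str:
        # Local paths need no signing.
        return url

    def write_shards(self, ds: Iterable[bytes], *, prefix: str, **kwargs: Any) -> list[str]:
        # `maxcount` caps the number of samples per shard (default assumed here).
        maxcount = int(kwargs.get("maxcount", 1000))
        if maxcount < 1:
            raise ValueError("maxcount must be at least 1")

        out_dir = self.root / prefix
        out_dir.mkdir(parents=True, exist_ok=True)

        urls: list[str] = []
        batch: list[bytes] = []

        def flush() -> None:
            # Write the current batch as the next numbered shard file.
            path = out_dir / f"shard-{len(urls):06d}.bin"
            path.write_bytes(b"".join(batch))
            urls.append(str(path))
            batch.clear()

        for sample in ds:
            batch.append(sample)
            if len(batch) >= maxcount:
                flush()
        if batch:
            flush()  # final, possibly smaller, shard
        return urls


samples = [b"a", b"b", b"c", b"d", b"e"]
with tempfile.TemporaryDirectory() as tmp:
    store = LocalDiskStore(Path(tmp))
    urls = store.write_shards(samples, prefix="datasets/demo/v1", maxcount=2)
    shard_names = [Path(u).name for u in urls]
    shard_sizes = [Path(u).stat().st_size for u in urls]
```

With five one-byte samples and `maxcount=2`, this produces three shard files, the last holding the single leftover sample.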