AbstractIndex
Protocol for index operations, implemented by Index and AtmosphereIndex.
An index manages dataset metadata: it publishes and retrieves schemas, and it inserts and lists datasets. A single index can hold datasets of many sample types; each dataset entry records its sample type through a schema reference.
Examples
>>> def publish_and_list(index: AbstractIndex) -> None:
...     index.publish_schema(ImageSample, version="1.0.0")
...     index.insert_dataset(image_ds, name="images")
...     for entry in index.list_datasets():
...         print(f"{entry.name} -> {entry.schema_ref}")
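Because AbstractIndex is a protocol, any object that provides the listed methods should be usable wherever an index is expected. The following is a minimal sketch of that idea, not the library's implementation: `IndexLike`, `ToyIndex`, `ToyEntry`, the placeholder `ImageSample` class, and the `toy://` reference format are all hypothetical stand-ins, and the return type of `insert_dataset` is an assumption.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Protocol


@dataclass
class ToyEntry:
    # Hypothetical entry; this page only shows that entries expose
    # name and schema_ref.
    name: str
    schema_ref: str


class IndexLike(Protocol):
    # Hypothetical, trimmed-down mirror of three protocol methods.
    def publish_schema(self, sample_type: type, *, version: str = "1.0.0") -> str: ...
    def insert_dataset(self, ds: object, *, name: str,
                       schema_ref: Optional[str] = None) -> None: ...
    def list_datasets(self) -> Iterator[ToyEntry]: ...


class ToyIndex:
    """In-memory stand-in. Note that it does not inherit from IndexLike."""

    def __init__(self) -> None:
        self._entries: dict[str, ToyEntry] = {}

    def publish_schema(self, sample_type: type, *, version: str = "1.0.0") -> str:
        # "toy://" is a made-up scheme, not one the library uses.
        return f"toy://{sample_type.__name__}@{version}"

    def insert_dataset(self, ds: object, *, name: str,
                       schema_ref: Optional[str] = None) -> None:
        if schema_ref is None:
            raise ValueError("this toy index needs an explicit schema_ref")
        self._entries[name] = ToyEntry(name=name, schema_ref=schema_ref)

    def list_datasets(self) -> Iterator[ToyEntry]:
        return iter(self._entries.values())


class ImageSample:
    """Placeholder sample type for the demonstration."""


def publish_and_list(index: IndexLike) -> list[str]:
    # Written against the protocol only, so any conforming index works.
    ref = index.publish_schema(ImageSample, version="1.0.0")
    index.insert_dataset(["shard-0"], name="images", schema_ref=ref)
    return [f"{entry.name} -> {entry.schema_ref}" for entry in index.list_datasets()]


lines = publish_and_list(ToyIndex())
print(lines)
```

Code written against the protocol, like `publish_and_list` here, does not need to know which backend it was handed.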
Attributes
data_store
Optional data store for reading/writing shards.
Methods
decode_schema
AbstractIndex.decode_schema(ref)
Reconstruct a Packable type from a stored schema.
Examples
>>> SampleType = index.decode_schema(entry.schema_ref)
>>> ds = Dataset[SampleType](entry.data_urls[0])
get_dataset
AbstractIndex.get_dataset(ref)
Get a dataset entry by name or reference.
get_schema
AbstractIndex.get_schema(ref)
Get a schema record by reference.
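The two lookups chain naturally: fetch a dataset entry, then resolve the schema it references. The sketch below illustrates that flow with a hypothetical dict-backed `ToyIndex`. The `ToyEntry` type, the dict-shaped schema record, the `toy://` reference, and the KeyError on unknown names are assumptions made for the example, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class ToyEntry:
    # Hypothetical entry; only name and schema_ref are shown on this page.
    name: str
    schema_ref: str


class ToyIndex:
    """Hypothetical in-memory index with dict-backed lookups."""

    def __init__(self) -> None:
        self._datasets: dict[str, ToyEntry] = {}
        self._schemas: dict[str, dict] = {}

    def get_dataset(self, ref: str) -> ToyEntry:
        # Raising KeyError for unknown names is a choice made for this sketch.
        return self._datasets[ref]

    def get_schema(self, ref: str) -> dict:
        return self._schemas[ref]


def schema_for_dataset(index: ToyIndex, name: str) -> dict:
    """Look up a dataset entry, then the schema record it points to."""
    entry = index.get_dataset(name)
    return index.get_schema(entry.schema_ref)


index = ToyIndex()
index._schemas["toy://ImageSample@1.0.0"] = {"name": "ImageSample", "version": "1.0.0"}
index._datasets["images"] = ToyEntry("images", "toy://ImageSample@1.0.0")

schema = schema_for_dataset(index, "images")
print(schema["name"], schema["version"])
```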
insert_dataset
AbstractIndex.insert_dataset(ds, *, name, schema_ref=None, **kwargs)
Register an existing dataset in the index.
Parameters
ds : Dataset, required
    The Dataset to register.
name : str, required
    Human-readable name.
schema_ref : Optional[str], default None
    Explicit schema ref; auto-published if None.
**kwargs : default {}
    Backend-specific options.
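The `schema_ref=None` fallback can be pictured as "publish first, then register". The following is a sketch of that behavior under stated assumptions rather than the library's code: `ToyDataset` (with its `sample_type` attribute), `ToyIndex`, the placeholder `ImageSample`, the returned reference, and the `toy://` reference format are hypothetical.

```python
from typing import Optional


class ToyDataset:
    # Hypothetical dataset that remembers its sample type.
    def __init__(self, sample_type: type, urls: list[str]) -> None:
        self.sample_type = sample_type
        self.urls = urls


class ToyIndex:
    """Hypothetical in-memory index illustrating the auto-publish fallback."""

    def __init__(self) -> None:
        self.schemas: dict[str, type] = {}
        self.datasets: dict[str, str] = {}  # dataset name -> schema ref

    def publish_schema(self, sample_type: type, *, version: str = "1.0.0") -> str:
        ref = f"toy://{sample_type.__name__}@{version}"  # made-up ref format
        self.schemas[ref] = sample_type
        return ref

    def insert_dataset(self, ds: ToyDataset, *, name: str,
                       schema_ref: Optional[str] = None, **kwargs) -> str:
        if schema_ref is None:
            # No explicit ref: publish the dataset's sample type first.
            schema_ref = self.publish_schema(ds.sample_type)
        self.datasets[name] = schema_ref
        return schema_ref


class ImageSample:
    """Placeholder sample type for the demonstration."""


index = ToyIndex()
ds = ToyDataset(ImageSample, ["images-000000.tar"])

auto_ref = index.insert_dataset(ds, name="images")  # schema published implicitly
pinned_ref = index.insert_dataset(
    ds, name="images-v2", schema_ref="toy://ImageSample@2.0.0"  # used as given
)
print(auto_ref, pinned_ref)
```

Passing an explicit reference skips the publish step in this sketch, so only the auto-published schema ends up in `index.schemas`.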
publish_schema
AbstractIndex.publish_schema(sample_type, *, version='1.0.0', **kwargs)
Publish a schema for a sample type.
Parameters
sample_type : type, required
    A Packable type (@packable-decorated or subclass).
version : str, default '1.0.0'
    Semantic version string.
**kwargs : default {}
    Backend-specific options.
Returns
str
Schema reference string (local://... or at://...).
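Since the returned reference begins with either `local://` or `at://`, calling code can branch on the scheme. The helper below is a small sketch of that check. `ref_scheme` is a hypothetical function, and the reference strings in the usage lines are placeholders, since this page does not specify what follows the scheme.

```python
KNOWN_SCHEMES = ("local", "at")


def ref_scheme(ref: str) -> str:
    """Return the scheme of a schema reference, or raise ValueError."""
    scheme, sep, rest = ref.partition("://")
    if not sep or not rest or scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unrecognized schema reference: {ref!r}")
    return scheme


# Placeholder references; real ones come from publish_schema.
print(ref_scheme("local://placeholder"))
print(ref_scheme("at://placeholder"))
```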
write
AbstractIndex.write(samples, *, name, schema_ref=None, **kwargs)
Write samples and create an index entry in one step.
Serializes samples to WebDataset tar files, stores them via the appropriate backend, and creates an index entry.
Parameters
samples : Iterable, required
    Iterable of Packable samples. Must be non-empty.
name : str, required
    Dataset name, optionally prefixed with target backend.
schema_ref : Optional[str], default None
    Optional schema reference.
**kwargs : default {}
    Backend-specific options (maxcount, description, etc.).
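To make the serialization step more concrete, here is a rough sketch of splitting samples into tar shards using only the standard library. It is not the library's writer and makes several assumptions: samples are plain dicts encoded as JSON (real Packable samples likely use a different encoding), `maxcount` is taken to mean the maximum number of samples per shard, and `chunked` and `write_shards` are hypothetical helper names.

```python
import io
import json
import tarfile
from itertools import islice
from typing import Iterable, Iterator


def chunked(samples: Iterable[dict], maxcount: int) -> Iterator[list[dict]]:
    """Yield consecutive batches of at most maxcount samples."""
    it = iter(samples)
    while batch := list(islice(it, maxcount)):
        yield batch


def write_shards(samples: Iterable[dict], *, maxcount: int = 2) -> list[bytes]:
    """Serialize samples into in-memory tar archives, one per shard."""
    shards: list[bytes] = []
    index = 0
    for batch in chunked(samples, maxcount):
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode="w") as tar:
            for sample in batch:
                payload = json.dumps(sample).encode("utf-8")
                # One tar member per sample, keyed by a running counter.
                info = tarfile.TarInfo(name=f"{index:06d}.json")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
                index += 1
        shards.append(buf.getvalue())
    if not shards:
        # Mirrors the documented requirement that samples be non-empty.
        raise ValueError("samples must be non-empty")
    return shards


shards = write_shards([{"id": i} for i in range(5)], maxcount=2)
print(len(shards))
```

In a real backend the shard bytes would be handed to the data store and the resulting URLs recorded in the index entry; this sketch stops at producing the archives.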