atdata benchmark report

Generated from pytest-benchmark JSON output

runnervmwffz4 — AMD EPYC 7763 64-Core Processor (4 cores) · Python 3.12.3 · Linux 6.11.0-1018-azure · main@19819302
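
The tables in this report list a median, an interquartile range (IQR), and an operations-per-second figure for each test. As a rough illustration of how such rows could be derived from pytest-benchmark JSON, the sketch below assumes the usual layout: a top-level `benchmarks` list whose entries carry a `name` and a `stats` mapping with `median`, `iqr`, and `ops`, with times in seconds. The helpers `format_duration` and `summarize` are hypothetical and are not the generator that produced this report.

```python
def format_duration(seconds: float) -> str:
    """Render a duration using the largest unit that keeps the value >= 1."""
    for unit, scale in (("s", 1.0), ("ms", 1e-3), ("μs", 1e-6)):
        if seconds >= scale:
            return f"{seconds / scale:.2f} {unit}"
    return f"{seconds / 1e-9:.1f} ns"


def summarize(report: dict) -> list[tuple[str, str, str, str]]:
    """Turn a parsed pytest-benchmark JSON document into (test, median, IQR, OPS) rows."""
    rows = []
    for bench in report["benchmarks"]:
        stats = bench["stats"]  # assumed keys: median, iqr, ops
        rows.append(
            (
                bench["name"],
                format_duration(stats["median"]),
                format_duration(stats["iqr"]),
                f"{stats['ops']:.1f} ops/s",
            )
        )
    return rows


# Hand-written stand-in for json.load(open(...)); the values are illustrative only.
example_report = {
    "benchmarks": [
        {"name": "test_example", "stats": {"median": 0.00016314, "iqr": 0.00019998, "ops": 3950.0}}
    ]
}
example_rows = summarize(example_report)
```

In practice the dictionary would come from `json.load` on the file that pytest-benchmark writes. The exact rounding and unit choices used by this report may differ from the sketch.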

Index Providers (μs–ms scale)

Write and read operation benchmarks across all providers, plus benchmarks through the full Index API.

Parameters: n = entries to store, any_provider = storage backend

| Test | Description | Median | IQR | OPS |
|---|---|---|---|---|
| test_store_single_entry[sqlite] | Store single entry [sqlite storage backend] | 163.14 μs | 199.98 μs | 3.95 Kops/s |
| test_store_single_entry[redis] | Store single entry [redis storage backend] | 150.43 μs | 15.27 μs | 6.45 Kops/s |
| test_store_entries_bulk[sqlite-10] | Store entries bulk [sqlite storage backend, 10 entries to store] | 1.50 ms | 385.33 μs | 539.2 ops/s |
| test_store_entries_bulk[sqlite-100] | Store entries bulk [sqlite storage backend, 100 entries to store] | 14.71 ms | 1.33 ms | 59.9 ops/s |
| test_store_entries_bulk[sqlite-1k] | Store entries bulk [sqlite storage backend, 1000 entries to store] | 155.72 ms | 8.38 ms | 6.4 ops/s |
| test_store_entries_bulk[redis-10] | Store entries bulk [redis storage backend, 10 entries to store] | 1.53 ms | 219.41 μs | 630.3 ops/s |
| test_store_entries_bulk[redis-100] | Store entries bulk [redis storage backend, 100 entries to store] | 15.11 ms | 414.22 μs | 65.3 ops/s |
| test_store_entries_bulk[redis-1k] | Store entries bulk [redis storage backend, 1000 entries to store] | 151.44 ms | 4.30 ms | 6.5 ops/s |
| test_store_schema[sqlite] | Store schema [sqlite storage backend] | 147.86 μs | 201.15 μs | 4.03 Kops/s |
| test_store_schema[redis] | Store schema [redis storage backend] | 140.71 μs | 17.60 μs | 6.85 Kops/s |
| test_store_schema_versions[sqlite-10v] | Store schema versions [sqlite storage backend, 10 entries to store] | 1.41 ms | 1.46 ms | 472.0 ops/s |
| test_store_schema_versions[sqlite-50v] | Store schema versions [sqlite storage backend, 50 entries to store] | 7.17 ms | 740.14 μs | 106.6 ops/s |
| test_store_schema_versions[redis-10v] | Store schema versions [redis storage backend, 10 entries to store] | 1.43 ms | 209.66 μs | 678.7 ops/s |
| test_store_schema_versions[redis-50v] | Store schema versions [redis storage backend, 50 entries to store] | 7.05 ms | 172.62 μs | 140.1 ops/s |
| test_get_entry_by_name[sqlite] | Get entry by name [sqlite storage backend] | 8.50 μs | 161.0 ns | 108.76 Kops/s |
| test_get_entry_by_name[redis] | Get entry by name [redis storage backend] | 29.52 ms | 4.29 ms | 32.6 ops/s |
| test_get_entry_by_cid[sqlite] | Get entry by cid [sqlite storage backend] | 8.60 μs | 120.0 ns | 113.69 Kops/s |
| test_get_entry_by_cid[redis] | Get entry by cid [redis storage backend] | 156.18 μs | 24.96 μs | 6.14 Kops/s |
| test_iter_entries[sqlite-10] | Iter entries [sqlite storage backend, 10 entries to store] | 37.69 μs | 340.0 ns | 26.12 Kops/s |
| test_iter_entries[sqlite-100] | Iter entries [sqlite storage backend, 100 entries to store] | 333.96 μs | 11.58 μs | 2.96 Kops/s |
| test_iter_entries[sqlite-1k] | Iter entries [sqlite storage backend, 1000 entries to store] | 3.41 ms | 32.49 μs | 261.6 ops/s |
| test_iter_entries[redis-10] | Iter entries [redis storage backend, 10 entries to store] | 176.57 ms | 11.40 ms | 5.6 ops/s |
| test_iter_entries[redis-100] | Iter entries [redis storage backend, 100 entries to store] | 175.90 ms | 6.80 ms | 5.6 ops/s |
| test_iter_entries[redis-1k] | Iter entries [redis storage backend, 1000 entries to store] | 172.96 ms | 12.10 ms | 5.7 ops/s |
| test_get_schema_json[sqlite] | Get schema json [sqlite storage backend] | 4.63 μs | 51.0 ns | 211.78 Kops/s |
| test_get_schema_json[redis] | Get schema json [redis storage backend] | 139.83 μs | 20.90 μs | 6.89 Kops/s |
| test_find_latest_version[sqlite-5v] | Find latest version [sqlite storage backend, 5 entries to store] | 8.67 μs | 89.0 ns | 113.40 Kops/s |
| test_find_latest_version[sqlite-20v] | Find latest version [sqlite storage backend, 20 entries to store] | 22.58 μs | 261.0 ns | 43.66 Kops/s |
| test_find_latest_version[sqlite-50v] | Find latest version [sqlite storage backend, 50 entries to store] | 49.94 μs | 341.0 ns | 19.77 Kops/s |
| test_find_latest_version[redis-5v] | Find latest version [redis storage backend, 5 entries to store] | 15.00 ms | 1.27 ms | 64.5 ops/s |
| test_find_latest_version[redis-20v] | Find latest version [redis storage backend, 20 entries to store] | 17.41 ms | 2.62 ms | 55.1 ops/s |
| test_find_latest_version[redis-50v] | Find latest version [redis storage backend, 50 entries to store] | 22.11 ms | 663.95 μs | 44.4 ops/s |
| test_iter_schemas[sqlite] | Iter schemas [sqlite storage backend] | 16.59 μs | 101.0 ns | 59.45 Kops/s |
| test_iter_schemas[redis] | Iter schemas [redis storage backend] | 17.53 ms | 1.59 ms | 55.9 ops/s |
| test_index_insert_dataset | Index insert dataset | 333.09 μs | 295.55 μs | 2.10 Kops/s |
| test_index_get_dataset | Index get dataset | 17.15 μs | 230.0 ns | 56.98 Kops/s |
| test_index_list_datasets | Index list datasets | 7.98 μs | 110.0 ns | 122.38 Kops/s |
| test_index_publish_schema | Index publish schema | 253.78 μs | 113.77 μs | 3.08 Kops/s |
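
The Index Providers results mix ns, μs, and ms, so backends are easier to compare after converting to one unit. The sketch below parses the duration strings as they are printed in this report and computes a redis-to-sqlite ratio for three read operations, using medians copied from the results above. `parse_duration` and `redis_over_sqlite` are hypothetical helper names, not part of atdata or pytest-benchmark.

```python
UNIT_SECONDS = {"ns": 1e-9, "μs": 1e-6, "ms": 1e-3, "s": 1.0}


def parse_duration(text: str) -> float:
    """Convert a string such as '156.18 μs' into seconds."""
    value, unit = text.split()
    return float(value) * UNIT_SECONDS[unit]


# Medians copied from the read-operation results in this section.
read_medians = {
    "get_entry_by_cid": {"sqlite": "8.60 μs", "redis": "156.18 μs"},
    "get_schema_json": {"sqlite": "4.63 μs", "redis": "139.83 μs"},
    "iter_schemas": {"sqlite": "16.59 μs", "redis": "17.53 ms"},
}


def redis_over_sqlite(medians: dict[str, dict[str, str]]) -> dict[str, float]:
    """Return how many times slower the redis median is than the sqlite median."""
    return {
        name: parse_duration(by_backend["redis"]) / parse_duration(by_backend["sqlite"])
        for name, by_backend in medians.items()
    }


ratios = redis_over_sqlite(read_medians)
```

With these inputs the ratios come out at roughly 18 for `get_entry_by_cid`, 30 for `get_schema_json`, and over 1,000 for `iter_schemas`. These are ratios of medians from a single CI run and say nothing about the cause of the gap.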

Dataset I/O (ms scale)

Shard writing throughput benchmarks, shard reading and iteration benchmarks, and full write-then-read round-trip benchmarks.

Parameters: n = samples per shard, batch_size = samples per batch

| Test | Description | Median | IQR | OPS | Med/sample | Samples/s |
|---|---|---|---|---|---|---|
| test_write_basic_shard[100] | Write basic shard [100 samples per shard] | 7.87 ms | 129.76 μs | 125.7 ops/s | 78.69 μs | 12.71 Kops/s |
| test_write_basic_shard[1k] | Write basic shard [1000 samples per shard] | 75.25 ms | 732.85 μs | 13.2 ops/s | 75.25 μs | 13.29 Kops/s |
| test_write_basic_shard[10k] | Write basic shard [10000 samples per shard] | 744.36 ms | 37.30 ms | 1.3 ops/s | 74.44 μs | 13.43 Kops/s |
| test_write_numpy_shard[100] | Write numpy shard [100 samples per shard] | 31.66 ms | 622.29 μs | 31.1 ops/s | 316.61 μs | 3.16 Kops/s |
| test_write_numpy_shard[1k] | Write numpy shard [1000 samples per shard] | 352.50 ms | 25.87 ms | 2.8 ops/s | 352.50 μs | 2.84 Kops/s |
| test_write_large_numpy_shard | Write large numpy shard | 8.508 s | 133.33 ms | 0.1 ops/s | 850.83 ms | 1.2 ops/s |
| test_write_with_manifest[100] | Write with manifest [100 samples per shard] | 13.99 ms | 254.27 μs | 71.3 ops/s | 139.89 μs | 7.15 Kops/s |
| test_write_with_manifest[1k] | Write with manifest [1000 samples per shard] | 97.29 ms | 492.26 μs | 10.2 ops/s | 97.29 μs | 10.28 Kops/s |
| test_write_multi_shard | Write multi shard | 745.62 ms | 7.93 ms | 1.3 ops/s | 74.56 μs | 13.41 Kops/s |
| test_read_ordered[100] | Read ordered [100 samples per shard] | 7.29 ms | 149.10 μs | 137.6 ops/s | 72.93 μs | 13.71 Kops/s |
| test_read_ordered[1k] | Read ordered [1000 samples per shard] | 69.60 ms | 1.60 ms | 14.4 ops/s | 69.60 μs | 14.37 Kops/s |
| test_read_ordered[10k] | Read ordered [10000 samples per shard] | 674.72 ms | 19.16 ms | 1.5 ops/s | 67.47 μs | 14.82 Kops/s |
| test_read_shuffled[100] | Read shuffled [100 samples per shard] | 7.06 ms | 114.83 μs | 141.8 ops/s | 70.64 μs | 14.16 Kops/s |
| test_read_shuffled[1k] | Read shuffled [1000 samples per shard] | 70.96 ms | 1.19 ms | 14.2 ops/s | 70.96 μs | 14.09 Kops/s |
| test_read_batched[batch32] | Read batched [32 samples per batch] | 66.99 ms | 832.17 μs | 14.9 ops/s | 66.99 μs | 14.93 Kops/s |
| test_read_batched[batch128] | Read batched [128 samples per batch] | 66.35 ms | 560.86 μs | 15.1 ops/s | 66.35 μs | 15.07 Kops/s |
| test_read_numpy_ordered[100] | Read numpy ordered [100 samples per shard] | 13.88 ms | 205.29 μs | 71.9 ops/s | 138.77 μs | 7.21 Kops/s |
| test_read_numpy_ordered[1k] | Read numpy ordered [1000 samples per shard] | 137.52 ms | 734.39 μs | 7.3 ops/s | 137.52 μs | 7.27 Kops/s |
| test_roundtrip_basic[100] | Roundtrip basic [100 samples per shard] | 14.95 ms | 163.26 μs | 66.7 ops/s | 149.52 μs | 6.69 Kops/s |
| test_roundtrip_basic[1k] | Roundtrip basic [1000 samples per shard] | 142.47 ms | 678.00 μs | 7.0 ops/s | 142.47 μs | 7.02 Kops/s |
| test_roundtrip_numpy[100] | Roundtrip numpy [100 samples per shard] | 40.01 ms | 519.89 μs | 24.9 ops/s | 400.09 μs | 2.50 Kops/s |
| test_roundtrip_numpy[500] | Roundtrip numpy [500 samples per shard] | 196.75 ms | 9.16 ms | 4.9 ops/s | 393.49 μs | 2.54 Kops/s |
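
The Med/sample and Samples/s columns in this section appear consistent with a simple normalization: the median divided by the number of samples processed, and its reciprocal. The sketch below shows that arithmetic under this assumption. `per_sample` is a hypothetical helper, and the report generator itself may compute these columns differently.

```python
def per_sample(median_seconds: float, n_samples: int) -> tuple[float, float]:
    """Return (median time per sample in seconds, samples per second)."""
    med_per_sample = median_seconds / n_samples
    samples_per_second = n_samples / median_seconds
    return med_per_sample, samples_per_second


# test_write_basic_shard[1k]: median 75.25 ms over 1000 samples.
write_1k_med, write_1k_rate = per_sample(75.25e-3, 1000)
```

For that row the result is about 75.25 μs per sample and about 13.29 thousand samples per second, matching the reported values. The OPS column does not follow from the median in the same way (1 / 75.25 ms is about 13.3, reported as 13.2), which suggests it is computed from a different statistic, likely the mean.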

Query System (ms scale)

Benchmarks for different query predicate types on a medium dataset, iterating through query results to access sample locations, query performance at different scales, manifest loading from disk, and manifest construction from samples.

Parameters: n_shards = number of shards (100 samples each), n = samples in manifest

| Test | Description | Median | IQR | OPS | Med/sample | Samples/s |
|---|---|---|---|---|---|---|
| test_query_simple_equality | Query simple equality | 8.69 ms | 138.95 μs | 114.0 ops/s | | |
| test_query_numeric_range | Query numeric range | 9.12 ms | 85.59 μs | 109.6 ops/s | | |
| test_query_combined | Query combined | 4.32 ms | 64.49 μs | 229.4 ops/s | | |
| test_query_isin | Query isin | 14.95 ms | 201.09 μs | 66.8 ops/s | | |
| test_query_no_results | Query no results | 1.83 ms | 25.72 μs | 544.0 ops/s | | |
| test_query_all_results | Query all results | 31.27 ms | 215.08 μs | 31.9 ops/s | | |
| test_iterate_equality_results | Iterate equality results | 8.52 ms | 302.73 μs | 117.6 ops/s | 42.61 μs | 23.47 Kops/s |
| test_iterate_range_results | Iterate range results | 18.66 ms | 197.45 μs | 53.4 ops/s | 33.93 μs | 29.48 Kops/s |
| test_iterate_large_result_set | Iterate large result set | 294.38 ms | 42.97 ms | 3.2 ops/s | 29.44 μs | 33.97 Kops/s |
| test_query_small | Query small | 2.07 ms | 43.44 μs | 482.5 ops/s | | |
| test_query_medium | Query medium | 18.69 ms | 198.57 μs | 53.5 ops/s | | |
| test_query_large | Query large | 162.90 ms | 2.71 ms | 6.1 ops/s | | |
| test_load_from_directory[2s] | Load from directory [2 shards, 100 samples each] | 3.30 ms | 45.89 μs | 301.5 ops/s | | |
| test_load_from_directory[5s] | Load from directory [5 shards, 100 samples each] | 8.16 ms | 80.07 μs | 115.2 ops/s | | |
| test_load_from_directory[10s] | Load from directory [10 shards, 100 samples each] | 16.50 ms | 231.94 μs | 60.5 ops/s | | |
| test_load_from_directory[20s] | Load from directory [20 shards, 100 samples each] | 33.15 ms | 751.63 μs | 28.8 ops/s | | |
| test_load_from_shard_urls[2s] | Load from shard urls [2 shards, 100 samples each] | 3.24 ms | 36.45 μs | 306.2 ops/s | | |
| test_load_from_shard_urls[5s] | Load from shard urls [5 shards, 100 samples each] | 8.10 ms | 72.78 μs | 121.9 ops/s | | |
| test_load_from_shard_urls[10s] | Load from shard urls [10 shards, 100 samples each] | 16.37 ms | 170.90 μs | 58.7 ops/s | | |
| test_manifest_build[100] | Manifest build [100 samples in manifest] | 1.30 ms | 22.12 μs | 767.2 ops/s | 13.00 μs | 76.95 Kops/s |
| test_manifest_build[1k] | Manifest build [1000 samples in manifest] | 7.82 ms | 107.78 μs | 123.0 ops/s | 7.82 μs | 127.93 Kops/s |
| test_manifest_build[5k] | Manifest build [5000 samples in manifest] | 38.88 ms | 1.37 ms | 21.8 ops/s | 7.78 μs | 128.60 Kops/s |
| test_manifest_write[100] | Manifest write [100 samples in manifest] | 3.62 ms | 110.35 μs | 274.5 ops/s | 36.21 μs | 27.61 Kops/s |
| test_manifest_write[1k] | Manifest write [1000 samples in manifest] | 6.98 ms | 166.17 μs | 140.3 ops/s | 6.98 μs | 143.20 Kops/s |

S3 Storage (ms+ scale)

S3 shard writing benchmarks via moto mock.

Parameters: n = samples per shard

| Test | Description | Median | IQR | OPS | Med/sample | Samples/s |
|---|---|---|---|---|---|---|
| test_s3_write_shards[100] | S3 write shards [100 samples per shard] | 20.67 ms | 630.20 μs | 43.8 ops/s | 206.67 μs | 4.84 Kops/s |
| test_s3_write_shards[500] | S3 write shards [500 samples per shard] | 87.05 ms | 942.02 μs | 10.4 ops/s | 174.09 μs | 5.74 Kops/s |
| test_s3_write_with_manifest | S3 write with manifest | 61.78 ms | 1.47 ms | 14.3 ops/s | 308.90 μs | 3.24 Kops/s |
| test_s3_write_cache_local | S3 write cache local | 44.56 ms | 2.46 ms | 20.2 ops/s | 222.81 μs | 4.49 Kops/s |
| test_s3_write_direct | S3 write direct | 37.47 ms | 586.03 μs | 26.5 ops/s | 187.36 μs | 5.34 Kops/s |
| test_s3_write_numpy | S3 write numpy | 42.68 ms | 1.02 ms | 23.2 ops/s | 426.80 μs | 2.34 Kops/s |

Serialization (μs scale)

Pure serialization/deserialization without disk I/O.

| Test | Description | Median | IQR | OPS |
|---|---|---|---|---|
| test_serialize_basic_sample | Serialize basic sample | 1.36 μs | 39.0 ns | 721.99 Kops/s |
| test_deserialize_basic_sample | Deserialize basic sample | 1.84 μs | 51.0 ns | 525.60 Kops/s |
| test_serialize_numpy_sample | Serialize numpy sample | 20.41 μs | 491.0 ns | 42.65 Kops/s |
| test_deserialize_numpy_sample | Deserialize numpy sample | 5.62 μs | 161.0 ns | 173.34 Kops/s |
| test_serialize_large_numpy | Serialize large numpy | 429.70 ms | 1.63 ms | 2.3 ops/s |
| test_deserialize_large_numpy | Deserialize large numpy | 33.63 ms | 260.17 μs | 29.7 ops/s |
| test_as_wds_basic | As wds basic | 4.68 μs | 70.0 ns | 209.80 Kops/s |
| test_as_wds_numpy | As wds numpy | 24.19 μs | 351.0 ns | 40.61 Kops/s |
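
To make the Median and IQR columns concrete for an in-memory measurement like these, the sketch below times a function over many rounds and summarizes the samples with quartiles. It is only an illustration: `json.dumps` and `json.loads` stand in for the serialization calls, because the atdata functions under test are not shown in this report. `time_call` is a hypothetical helper, and pytest-benchmark's own calibration and quartile method may differ.

```python
import json
import statistics
import time


def time_call(func, *args, rounds: int = 200) -> dict[str, float]:
    """Call func(*args) repeatedly and return the median and IQR in seconds."""
    samples = []
    for _ in range(rounds):
        start = time.perf_counter_ns()
        func(*args)
        samples.append((time.perf_counter_ns() - start) / 1e9)
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return {"median": median, "iqr": q3 - q1}


# Stand-in payload; no file or network access is involved.
sample = {"name": "example", "value": 42, "tags": ["a", "b"]}
encoded = json.dumps(sample)

serialize_stats = time_call(json.dumps, sample)
deserialize_stats = time_call(json.loads, encoded)
```

A small IQR relative to the median, as in most rows of this section, indicates that the individual rounds were tightly clustered.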