compute_row_group_size
- lsst.daf.butler.formatters.parquet.compute_row_group_size(schema: Schema, target_size: int = 1000000000) → int
Compute approximate row group size for a given arrow schema.
Given a schema, this routine computes the number of rows per row group so that the persisted size of each row group on disk is at or below the target size. The exact size on disk depends on the compression settings and ratios; typical binary data tables will have around 15-20% compression with the pyarrow default snappy compression algorithm.
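
The arithmetic behind this kind of estimate can be illustrated with a minimal sketch. This is not the library's implementation: the helper name `estimate_row_group_size`, the dictionary of per-column byte widths, and the assumption that `target_size` is measured in bytes are all hypothetical, chosen only to show how an uncompressed row width translates into a row count.

```python
def estimate_row_group_size(column_byte_widths, target_size=1_000_000_000):
    """Hypothetical helper: estimate rows per row group.

    column_byte_widths maps column names to an assumed uncompressed
    width in bytes; target_size is assumed to be in bytes.
    """
    bytes_per_row = sum(column_byte_widths.values())
    if bytes_per_row <= 0:
        raise ValueError("row width must be positive")
    # Integer division rounds down, so the uncompressed row group stays
    # at or below the target; at least one row is always returned.
    return max(1, target_size // bytes_per_row)


# Illustrative columns: two float64, one int32, one string assumed to be 10 bytes.
widths = {"ra": 8, "dec": 8, "flag": 4, "name": 10}
rows = estimate_row_group_size(widths)
print(rows)
```

Because the estimate uses uncompressed widths, compression can only shrink the result on disk, which is consistent with the "or smaller" behavior described above. A row count obtained this way would typically be passed to a writer option such as the `row_group_size` argument of `pyarrow.parquet.write_table`.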