ParquetTable
class lsst.pipe.tasks.parquetTable.ParquetTable(filename=None, dataFrame=None)
Bases: object
Thin wrapper around pyarrow’s ParquetFile object.
Call the toDataFrame method to get a pandas.DataFrame object, optionally passing specific columns.
The main purpose of having this wrapper, rather than using pyarrow.ParquetFile directly, is to make it nicer to load selected subsets of columns, especially from dataframes with multi-level column indices.
Instantiate with either a path to a Parquet file or a dataFrame.
Parameters:
- filename : str, optional
  Path to Parquet file.
- dataFrame : dataFrame, optional
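As a minimal usage sketch based only on the signatures documented on this page, a round trip through a file might look like the following. The helper name roundTrip, the column names, and the sample values are hypothetical, and the code assumes the LSST Science Pipelines (which provide lsst.pipe.tasks) and pandas are installed.

```python
import pandas as pd


def roundTrip(path):
    """Write a small DataFrame with ParquetTable, then read two columns back.

    Hypothetical helper for illustration; not part of lsst.pipe.tasks.
    """
    # Imported inside the function so this module can still be loaded
    # on a system without the LSST Science Pipelines.
    from lsst.pipe.tasks.parquetTable import ParquetTable

    df = pd.DataFrame(
        {"id": [1, 2, 3], "ra": [10.0, 10.1, 10.2], "dec": [-5.0, -5.1, -5.2]}
    )

    # Wrap the in-memory DataFrame and write it out as Parquet.
    ParquetTable(dataFrame=df).write(path)

    # Wrap the file on disk and load only the requested columns.
    table = ParquetTable(filename=path)
    return table.toDataFrame(columns=["ra", "dec"])
```

Calling roundTrip with a writable path should return a DataFrame holding only the ra and dec columns. Note that the write step uses an instance created from a dataFrame, since write is documented as writing a pandas dataframe.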
Attributes Summary
columnIndex: Columns as a pandas Index
columns: List of column names (or column index if df is set)
pandasMd
Methods Summary
toDataFrame([columns]): Get table (or specified columns) as a pandas DataFrame
write(filename): Write pandas dataframe to parquet
Attributes Documentation
columnIndex
Columns as a pandas Index
columns
List of column names (or column index if df is set)
This may either be a list of column names, or a pandas.Index object describing the column index, depending on whether the ParquetTable object is wrapping a ParquetFile or a DataFrame.
pandasMd
Methods Documentation
toDataFrame(columns=None)
Get table (or specified columns) as a pandas DataFrame
Parameters:
- columns : list, optional
  Desired columns. If None, then all columns will be returned.
write(filename)
Write pandas dataframe to parquet
Parameters:
- filename : str
  Path to which to write.