ParquetTable
class lsst.pipe.tasks.parquetTable.ParquetTable(filename=None, dataFrame=None)
Bases: object
Thin wrapper around pyarrow's ParquetFile object.
Call the toDataFrame method to get a pandas.DataFrame object, optionally passing specific columns.
The main purpose of this wrapper, rather than using pyarrow.ParquetFile directly, is to make it easier to load selected subsets of columns, especially from dataframes with multi-level column indices.
Instantiate with either a path to a Parquet file or a DataFrame.
Parameters: - filename : str, optional
Path to Parquet file.
- dataFrame : pandas.DataFrame, optional
Attributes Summary
columnIndex              Columns as a pandas Index
columns                  List of column names (or column index if df is set)
pandasMd
Methods Summary
toDataFrame([columns])   Get table (or specified columns) as a pandas DataFrame
write(filename)          Write pandas dataframe to parquet
Attributes Documentation
columnIndex
Columns as a pandas Index
columns
List of column names (or column index if df is set)
This may either be a list of column names, or a pandas.Index object describing the column index, depending on whether the ParquetTable object is wrapping a ParquetFile or a DataFrame.
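The two forms that columns can take can be illustrated with plain pandas. This is only a sketch: it does not use ParquetTable itself, and the level names, column names, and the helper as_name_list are hypothetical.

```python
import pandas as pd

# Hypothetical two-level column layout; the names are illustrative only.
column_index = pd.MultiIndex.from_tuples(
    [("g", "flux"), ("g", "fluxErr"), ("r", "flux"), ("r", "fluxErr")],
    names=["filter", "column"],
)
df = pd.DataFrame(
    [[1.0, 0.1, 2.0, 0.2], [3.0, 0.3, 4.0, 0.4]],
    columns=column_index,
)

# Form 1: a DataFrame exposes its columns as a pandas Index (here a MultiIndex).
index_form = df.columns

# Form 2: a flat list of column names is an ordinary Python list of strings.
list_form = ["flux", "fluxErr"]


def as_name_list(columns):
    """Return a plain list whichever of the two forms is passed in.

    For a MultiIndex, each entry is a tuple with one element per level.
    """
    return list(columns)


names_from_index = as_name_list(index_form)
names_from_list = as_name_list(list_form)
```

Code that consumes columns may therefore need to accept both a list of strings and an Index whose entries are tuples.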
pandasMd
Methods Documentation
toDataFrame(columns=None)
Get table (or specified columns) as a pandas DataFrame.
Parameters: - columns : list, optional
Desired columns. If None, then all columns will be returned.
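The behaviour described for the columns argument can be sketched with plain pandas on a DataFrame that has a multi-level column index. This is not the ParquetTable implementation: select_columns is a hypothetical stand-in, and the level and column names are assumptions chosen for illustration.

```python
import pandas as pd


def select_columns(df, columns=None):
    """Hypothetical stand-in for the documented behaviour.

    Returns the whole table when `columns` is None, otherwise only the
    requested columns. For multi-level columns, each requested column is
    a tuple with one element per level.
    """
    if columns is None:
        return df
    return df[list(columns)]


# Illustrative two-level column index.
column_index = pd.MultiIndex.from_tuples(
    [("g", "flux"), ("g", "fluxErr"), ("r", "flux"), ("r", "fluxErr")],
    names=["filter", "column"],
)
df = pd.DataFrame(
    [[1.0, 0.1, 2.0, 0.2], [3.0, 0.3, 4.0, 0.4]],
    columns=column_index,
)

# All columns (the columns=None case).
everything = select_columns(df)

# Only the flux columns for two filters, addressed by full tuples.
subset = select_columns(df, [("g", "flux"), ("r", "flux")])

# pandas can also take a cross-section on one level of the column index.
flux_by_filter = df.xs("flux", axis=1, level="column")
```

Addressing columns by full tuples keeps the selection explicit, while the cross-section drops the selected level and leaves one column per filter.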
write(filename)
Write pandas dataframe to parquet.
Parameters: - filename : str
Path to which to write.