Pandas

Introduction

General

DataFrames

Series

Memory usage

  • Stop wasting memory in your Pandas DataFrame! - YouTube • When reading a CSV with read_csv, load only the columns you need with usecols and set the data type of those columns with dtype, preferring compact types like category over broader types like strings. Profile before and after each change with df.info(memory_usage='deep') • Visual Studio Code 📺
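A minimal sketch of the tip above, not taken from the video. The CSV contents and column names (city, temp_c, notes) are made up for illustration, and the data is read from an in-memory string instead of a file so the example is self-contained. It compares the footprint of a plain read_csv call against one that uses usecols and dtype.

```python
import io

import pandas as pd

# Hypothetical CSV contents, standing in for a real file on disk.
CSV_TEXT = "city,temp_c,notes\n" + "".join(
    f"{city},{20 + i % 5},ignored free text {i}\n"
    for i, city in enumerate(["Oslo", "Lima", "Pune"] * 1000)
)


def total_bytes(df):
    """Total memory footprint in bytes, counting string contents."""
    return int(df.memory_usage(deep=True).sum())


# Baseline: read every column and let pandas infer the types.
naive = pd.read_csv(io.StringIO(CSV_TEXT))

# Tuned: keep only the needed columns and give each a compact dtype.
tuned = pd.read_csv(
    io.StringIO(CSV_TEXT),
    usecols=["city", "temp_c"],
    dtype={"city": "category", "temp_c": "int8"},
)

print(f"naive: {total_bytes(naive):,} bytes")
print(f"tuned: {total_bytes(tuned):,} bytes")
```

category pays off when a column has few distinct values repeated many times, as city does here. For a column of mostly unique strings it can end up using more memory, so it is worth measuring each change.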

Testing

Editing a *.parquet file

  1. Read the parquet file using pandas.read_parquet('some/path.parquet'), which returns a DataFrame
  2. Modify the df however you need
  3. Write the df back to a parquet file with df.to_parquet('some/path.parquet')

Pandas 2

Pandas 2.0: Everything You Need to Know - YouTube • Focuses on the new option to use PyArrow instead of NumPy under the hood (NumPy remains the default backend) • Rob Mulla 📺

Inbox