- Nice side-by-side comparison of pandas vs polars syntax for common operations
Original video by Anton T. Ruberts:
Polars is a Python package (written in Rust) for working with DataFrames. Think Pandas, but faster. Much faster! In this tutorial, I’ll explore the basics of Polars and compare it against Pandas in both speed and syntax. This is a notebook walkthrough video, so you’ll be able to follow along easily using the links below.
Links:
Notebook link - https://github.com/aruberts/tutorials/blob/main/polars/basics.ipynb
Dataset link - https://www.kaggle.com/datasets/datasnaek/youtube-new
Medium blog - https://medium.com/me/stats/post/b2ec500a1008
Polars documentation - https://pola-rs.github.io/polars/py-polars/html/reference/
0:00 Intro
0:08 Pandas vs Polars
1:01 Goal of the video
1:27 Setup
2:43 Notebook structure
4:03 Installing Polars
4:46 Set Polars configs
5:43 Reading data with Polars
6:18 DataFrame basic exploration
7:34 Column selection
10:52 DataFrame filtering
12:42 DataFrame quality checks (checking for NAs and static columns)
15:55 Data cleaning and pre-processing
18:55 Univariate analysis (value counts, mean, median, etc.)
21:35 Multivariate analysis (groupby, aggregates)
27:14 Custom functions with Polars
32:00 Saving Polars DataFrame
32:20 Summary