Parquet → CSV: Four Python Libraries Compared
Among Polars, DuckDB, PyArrow, and Pandas, which one delivers the fastest Parquet-to-CSV conversions?
Combining CSVs With Slightly Different Schemas
Learn three ways to merge multiple CSV files in Python when they don’t share exactly the same schema...
Recoding (Column) Values in Python
Data recoding is a dreaded task, but the results are well worth the effort. In this post, I share several methods for quickly recoding column values using the Polars and Pandas libraries in Python...
Renaming Columns in Python
This post shares several methods for renaming DataFrame columns using the Polars and Pandas libraries in Python...
Order Your Data with Intention – ggplot edition
Having trouble displaying data in ggplot? This post shares two strategies for sorting your chart data using ggplot…
Be Careful With (Data) Binning
Grouping data into bins or categories can make it easier to analyze. But beware: binning can lead to deceptive data (re)presentations…