My book is out: Python Data Analysis, Fourth Edition
Written with a co-author, and what the year of writing it was like.
Python Data Analysis, Fourth Edition, which I wrote with my co-author, is out today. You can get it here:
Before I describe what’s in it, I want to write down what the year of working on it was like.
The work
My co-author and I started this book over a year ago. I’ve worked in data for years and know the material, so I expected the writing to be straightforward. It wasn’t.
Knowing something and explaining it in a book are different jobs. Every example has to run. Every explanation has to make sense to someone reading it for the first time. And you have to keep doing that, chapter after chapter, for months. This book has 17 of them.
Writing with anxiety
I’ll be honest about this part. There were stretches of the year when my anxiety was high. Some days, opening the manuscript felt harder than it should have.
What worked was not waiting to feel better before writing. On bad days I wrote a paragraph, sometimes a bad one, and fixed it later. The manuscript kept moving, and imperfect pages slowly became a book.
If you’re building something while dealing with your own anxiety, this is the only advice I have: keep the work moving, even in small pieces.
The collaboration
The other thing that carried this book was working with a co-author. When one of us was stuck, the other pushed forward. When one of us wrote something unclear, the other caught it. We each brought different expertise, and the book shows it.
What’s in the book
The full title is Python Data Analysis: Master Python Analytics with Machine Learning, Deep Learning, GenAI, LLMs, and Data Engineering.
Modern data analysis goes beyond cleaning and visualizing data. Practitioners today build scalable pipelines, apply machine learning, work with text and image data, and use Generative AI and LLMs. Rather than focus on a single library, the book covers the whole workflow:
Foundations: Python libraries, NumPy, pandas, statistics, linear algebra, and visualization
Data work: retrieving, processing, storing, and cleaning messy data
Time-series analysis, forecasting, and signal processing
Supervised and unsupervised learning: regression, classification, clustering, dimensionality reduction, and anomaly detection
Ensemble methods, neural networks, and deep learning
Text and image analytics, including NLP and sentiment analysis
LLMs and Generative AI
Scaling with Dask, Modin, Ray, and PySpark
By the end, you can build end-to-end data analysis pipelines and apply modern data science and AI techniques to real problems.
It’s written for data analysts, data scientists, business analysts, statisticians, and students. You’ll need basic math and working knowledge of Python.
Thank you
Get the book here:
If you read it, tell me what you think. I read every reply and comment.


