Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter

Vital stats

Language

Python 3

Reading time

Approximately 57 days

What you will learn

Numerical Programming and Data Mining

Author

Wes McKinney

Published

2 years, 10 months ago

Packages you will be introduced to

matplotlib
numpy
scipy
pandas
statsmodels
scikit-learn


Book description

Get the definitive handbook for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.10 and pandas 1.4, the third edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You'll learn the latest versions of pandas, NumPy, and Jupyter in the process.

Written by Wes McKinney, the creator of the Python pandas project, this book is a practical, modern introduction to data science tools in Python. It's ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub.

Use the Jupyter notebook and IPython shell for exploratory computing
Learn basic and advanced features in NumPy
Get started with data analysis tools in the pandas library
Use flexible tools to load, clean, transform, merge, and reshape data
Create informative visualizations with matplotlib
Apply the pandas groupby facility to slice, dice, and summarize datasets
Analyze and manipulate regular and irregular time series data
Learn how to solve real-world data analysis problems with thorough, detailed examples
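
As a rough illustration of the groupby topic listed above, here is a minimal sketch of the split-and-summarize pattern in pandas. The `sales` table, its column names, and its values are hypothetical and invented for this example; they are not taken from the book.

```python
import pandas as pd

# Hypothetical sales records, invented purely for illustration.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south"],
        "units": [10, 4, 6, 8],
    }
)

# Split the rows by region, then sum the units within each group.
units_by_region = sales.groupby("region")["units"].sum()
print(units_by_region)
```

The result is a Series indexed by region, which can be plotted or merged back into other tables.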

Author credentials

The author, Wes McKinney, has the following credentials:

Python Software Foundation Fellow, a major contributor to the language or its community
Prominent person behind the widely used Python package pandas
Prominent person behind the Python package feather
Prominent person behind Apache Arrow: a cross-language development platform for in-memory data (for more information, see author's Wikipedia page)
Prominent person behind Apache Parquet: a columnar storage format available to any project in the Hadoop ecosystem (for more information, see author's Wikipedia page)

See 69 Reddit comments mentioning the book