Language
  • Python 3
Reading time
  • Approximately 57 days
What you will learn
  • Numerical Programming and Data Mining
Author
  • Wes McKinney
Published
  • 1 year, 7 months ago
Packages you will be introduced to
  • matplotlib
  • numpy
  • scipy
  • pandas
  • statsmodels
  • scikit-learn

Get the definitive handbook for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.10 and pandas 1.4, the third edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You'll learn the latest versions of pandas, NumPy, and Jupyter in the process.

Written by Wes McKinney, the creator of the Python pandas project, this book is a practical, modern introduction to data science tools in Python. It's ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub.

  • Use the Jupyter notebook and IPython shell for exploratory computing
  • Learn basic and advanced features in NumPy
  • Get started with data analysis tools in the pandas library
  • Use flexible tools to load, clean, transform, merge, and reshape data
  • Create informative visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Analyze and manipulate regular and irregular time series data
  • Learn how to solve real-world data analysis problems with thorough, detailed examples
The author, Wes McKinney, has the following credentials:

  • Python Software Foundation Fellow, a distinction awarded to major contributors to the language or its community
  • Creator of the widely used Python package pandas
  • Prominent person behind the Python package feather
  • Prominent person behind Apache Arrow: a cross-language development platform for in-memory data (for more information, see the author's Wikipedia page)
  • Prominent person behind Apache Parquet: a columnar storage format available to any project in the Hadoop ecosystem (for more information, see the author's Wikipedia page)