Language
  • Python 2
Reading time
  • Approximately 44 days
What you will learn
  • Web Scraping
  • Natural Language Processing
  • Machine Learning and AI
  • Numerical Programming and Data Mining
Author
  • Matthew A. Russell
Published
  • 10 years, 5 months ago
Packages you will be introduced to
  • nltk
  • numpy
  • matplotlib
  • networkx
  • scrapy
  • requests
  • beautifulsoup4
Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More by Matthew A. Russell

How can you tap into the wealth of social web data to discover who’s making connections with whom, what they’re talking about, and where they’re located? With this expanded and thoroughly revised edition, you’ll learn how to acquire, analyze, and summarize data from all corners of the social web, including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

  • Employ the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social websites
  • Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
  • Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
  • Build interactive visualizations with D3.js, an extraordinarily flexible HTML5 and JavaScript toolkit
  • Take advantage of more than two dozen Twitter recipes, presented in O’Reilly’s popular “problem/solution/discussion” cookbook format
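
For readers unfamiliar with TF-IDF, one of the text-mining techniques named in the list above, the following is a minimal sketch of the idea. It is not taken from the book or its repository. The `tf_idf` function, the whitespace tokenizer, and the three sample documents are illustrative assumptions, and the weighting shown (raw term frequency times the unsmoothed natural log of inverse document frequency) is only one common variant; library implementations such as those in NLTK may differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = float(len(tokens))
        scores.append({
            # tf = count / document length; idf = log(N / df)
            term: (count / total) * math.log(n_docs / float(df[term]))
            for term, count in counts.items()
        })
    return scores


# Made-up sample documents, for illustration only.
docs = [
    "mining the social web",
    "mining twitter data",
    "the social graph",
]
scores = tf_idf(docs)
```

In this toy corpus, "web" appears in only the first document, so it scores higher there than "mining" or "social", which each appear in two documents. A term present in every document would score zero under this weighting.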

The example code for this unique data science book is maintained in a public GitHub repository. It’s designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.