Python 3 Text Processing with NLTK 3 Cookbook

Vital stats

Language

Python 3

Reading time

Approximately 31 days

What you will learn

Natural Language Processing

Author

Jacob Perkins

Published

10 years, 10 months ago

Packages you will be introduced to

nltk
numpy
scipy
scikit-learn
pyenchant
pymongo
lxml
beautifulsoup4

Book description

Over 80 practical recipes on natural language processing techniques using Python's NLTK 3.0

About This Book

Break text down into its component parts for spelling correction, feature extraction, and phrase transformation
Learn how to do custom sentiment analysis and named entity recognition
Work through natural language processing concepts with simple, easy-to-follow programming recipes

Who This Book Is For

This book is intended for Python programmers interested in learning how to do natural language processing. Maybe you've learned the limits of regular expressions the hard way, or you've realized that human language cannot be deterministically parsed like a computer language. Perhaps you have more text than you know what to do with, and need automated ways to analyze and structure that text. This Cookbook will show you how to train and use statistical language models to process text in ways that are practically impossible with standard programming tools. Basic knowledge of Python and of basic text processing concepts is expected. Some experience with regular expressions will also be helpful.

What You Will Learn

Tokenize text into sentences, and sentences into words
Look up words in the WordNet dictionary
Apply spelling correction and word replacement
Access the built-in text corpora and create your own custom corpus
Tag words with parts of speech
Chunk phrases and recognize named entities
Grammatically transform phrases and chunks
Classify text and perform sentiment analysis
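
To make the first item in the list above concrete, here is a minimal sketch of two-level tokenization: text into sentences, then sentences into words. It is plain standard-library Python and is not how NLTK does it; the function names split_sentences and split_words and the two regular expressions are assumptions made for illustration. A naive splitter like this breaks on abbreviations such as "Dr." and similar edge cases, which is the kind of limitation of regular expressions that the book's trained tokenizers are meant to address.

```python
import re

# Naive sentence boundary: whitespace that follows '.', '!' or '?'.
# (Illustrative only; abbreviations like "Dr." will be split incorrectly.)
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

# A word is a run of letters, digits, or apostrophes;
# any other non-space character becomes its own token.
_TOKEN = re.compile(r"[A-Za-z0-9']+|[^\sA-Za-z0-9']")


def split_sentences(text):
    """Split text into sentences at terminal punctuation."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def split_words(sentence):
    """Split one sentence into word and punctuation tokens."""
    return _TOKEN.findall(sentence)


text = "Hello world. Can't stop now! Is this NLP?"
sentences = split_sentences(text)
tokens = [split_words(s) for s in sentences]

for sentence, words in zip(sentences, tokens):
    print(sentence, "->", words)
```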

In Detail

This book will show you the essential techniques of text and language processing. Starting with tokenization, stemming, and the WordNet dictionary, you'll progress to part-of-speech tagging, phrase chunking, and named entity recognition. You'll learn how various text corpora are organized, as well as how to create your own custom corpus. Then, you'll move on to text classification with a focus on sentiment analysis. And because NLP can be computationally expensive on large bodies of text, you'll try a few methods for distributed text processing. Finally, you'll be introduced to a number of other small but complementary Python libraries for text analysis, cleaning, and parsing.

This cookbook provides simple, straightforward examples so you can quickly learn text processing with Python and NLTK.

The author Jacob Perkins has the following credentials.

Works/Worked at AT&T