Natural Language Processing
Information
Natural Language Processing was originally scheduled to start on January 23, 2012; the start has now been moved to February 2012.
Natural language processing is the technology for dealing with our most ubiquitous product: human language, as it appears in emails, web pages, tweets, product descriptions, newspaper stories, social media, and scientific articles, in thousands of languages and varieties.
In the past decade, successful natural language processing applications have become part of our everyday experience, from spelling and grammar correction in word processors to machine translation on the web, from email spam detection to automatic question answering, from detecting people's opinions about products or services to extracting appointments from your email.
In this class, you'll learn the fundamental algorithms and mathematical models for human language processing and how you can use them to solve practical problems in dealing with language data wherever you encounter it.
The home page of the Stanford class CS 124/LINGUIST 180 provides slides. Many, but not all, of its lectures are the same as those in the online course.
Prerequisites
No background in natural language processing is required.
Students will be expected to know a bit of basic probability (know Bayes' rule), a bit about vectors and vector spaces (be able to length-normalize a vector), and a bit of calculus (know that the derivative of a function is zero at a maximum or minimum of the function), but we will review these concepts as we first use them.
You should have reasonable programming ability (know about hash tables and graph data structures), be able to write programs in Java or Python, and have a computer (Windows, Mac or Linux) with internet access.
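As a rough self-check for the probability and vector prerequisites above, the following minimal Python sketch shows Bayes' rule and length normalization. The function names and the spam-filter numbers are illustrative assumptions, not course material.

```python
import math


def bayes_posterior(prior, likelihood, evidence):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


def length_normalize(vec):
    """Scale a vector to unit Euclidean (L2) length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        # A zero vector has no direction; return it unchanged.
        return list(vec)
    return [x / norm for x in vec]


# Hypothetical numbers: P(spam) = 0.2, P(word | spam) = 0.5, P(word) = 0.25
posterior = bayes_posterior(prior=0.2, likelihood=0.5, evidence=0.25)

# The vector (3, 4) has length 5, so its unit-length version is about (0.6, 0.8).
unit = length_normalize([3.0, 4.0])
```

If both functions read as familiar, the mathematical background is probably sufficient for the first weeks.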
Recommended reading
Textbooks for the 2012 classes - several lists of books, a lot of them free.
Preparation
To prepare for the class in advance, you may consider reading through some sections of the textbooks (Jurafsky and Martin, Speech and Language Processing, 2nd Edition, abbreviated J+M below; and Manning, Raghavan and Schütze 2008, Introduction to Information Retrieval, abbreviated MR+S below).
The following topics will be covered in the first two weeks:
Introduction and Overview:
- Basic Text Processing: J+M Chapters 2.1, 3.9; MR+S Chapters 2.1-2.2
- Minimum Edit Distance: J+M Chapter 3.11
- Language Modeling: J+M Chapter 4
- Spelling Correction: J+M Chapter 5.9; Peter Norvig (2007), "How to Write a Spelling Corrector"
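
To give a flavor of one of the topics listed above, here is a minimal dynamic-programming sketch of minimum edit distance in Python. It assumes a cost of 1 for insertions and deletions and a configurable substitution cost (the `sub_cost` parameter is an illustrative choice; some presentations use a substitution cost of 2). It is not taken from the course materials.

```python
def min_edit_distance(source, target, sub_cost=1):
    """Minimum number of edits turning `source` into `target`.

    Insertions and deletions cost 1; substitutions cost `sub_cost`.
    """
    n, m = len(source), len(target)
    # dist[i][j] = cost of converting source[:i] into target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all i characters
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else sub_cost
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
    return dist[n][m]


print(min_edit_distance("intention", "execution"))              # 5
print(min_edit_distance("intention", "execution", sub_cost=2))  # 8
```

The table-filling approach runs in time proportional to the product of the two string lengths, which is why it is practical for tasks such as spelling correction.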