# [Resources] Textbooks for the 2012 classes?

A lot of people must be purchasing books for the Stanford classes. I bought Peter Norvig's AI book, and because of that, Amazon is now recommending Daphne Koller's Probabilistic Graphical Models book. Which textbooks, if any, are required for the Machine Learning, Probabilistic Graphical Models and Natural Language Processing classes? Which books would it be a good idea to have? As I said, I bought the Norvig book, and I'm glad I did, even though it wasn't required. I found that the more in-depth discussion of some of the topics helped me understand them more deeply.
## The NLP recommended textbooks:

These are mentioned in the course description in this post. Some of them are completely free, so there is no harm in getting them.

## Other free online books (for ML and PGM):

• Computer Vision: Models, Learning, and Inference (Simon J.D. Prince) - A free online version of the upcoming book. There are also a number of additional resources that complement each chapter. These include links to project pages, other descriptions of the same material and useful datasets.
• The Elements of Statistical Learning (Hastie, Tibshirani and Friedman (2008)). It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book.
• Reinforcement Learning: An Introduction (Richard S. Sutton and Andrew G. Barto). This book was designed to be used as a text in a one-semester course, perhaps supplemented by readings from the literature or by a more mathematical text such as the excellent one by Bertsekas and Tsitsiklis (1996). This book can also be used as part of a broader course on machine learning, artificial intelligence, or neural networks.
• Introduction to Machine Learning (Alex Smola and S.V.N. Vishwanathan (2008)). Check out his blog and the Purdue Machine Learning Summer School 2011 video lectures.
• Bayesian Reasoning and Machine Learning (Barber, D.). The book is designed to appeal to students with only a modest mathematical background in undergraduate calculus and linear algebra. No formal computer science or statistical background is required to follow the book, although a basic familiarity with probability, calculus and linear algebra would be useful. The book should appeal to students from a variety of backgrounds, including Computer Science, Engineering, applied Statistics, Physics, and Bioinformatics, who wish to gain an entry to probabilistic approaches in Machine Learning.
• Graphical Models, Exponential Families, and Variational Inference (Martin J. Wainwright and Michael I. Jordan (2008)). We describe how a wide variety of algorithms — among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations.

Amazon is fast! I asked this question yesterday, and today I have the Koller-Friedman Probabilistic Graphical Models book in my hand. First impression: this book is big. The package holding it alone had one of those yellow "Warning, Heavy" labels on it.

The book goes into considerably more depth on several main ideas of the AI class: naive Bayes networks, Bayes networks in general, Markov models, particle filters. If you didn't like those sections of the AI class, or if you found them difficult, I hope you're not taking the PGM class, because it will be more of the same.

This book costs a fortune, but having read the introductory section and the first part on Bayes networks, I'm glad I bought it. Koller & Friedman give clear, mathematically detailed expositions of what's going on. The equations, of which there are many, come with verbal explanations, making the text readable and easy to follow. In the naive Bayes section (and I presume in the other sections as well) they explain not only the mathematical underpinning of the technique, but how to apply the technique to real problems, which real problems to use the technique on, and where the pitfalls are. For algorithms, they supply code. I only gave a cursory look at the exercises, but they look good too. Koller & Friedman write with both precision and clarity. Five stars.

answered 30 Dec '11, 16:48 Anne Paulson

Thanks for the review. This may sound odd after some of my other comments in this thread, but I thought the book was a relative bargain. $72 for a current (2009) and well written 1200 page hardcover textbook seems very reasonable to me. My only complaint so far is that I am not impressed with the errata. It does not include typos and is not set up to distinguish between the different printings. This impression was not helped by the fact that the first thing I looked at in it (eq. A.4 on p. 1149 in appendix A) has what I believe to be an incorrect correction (I'd be interested in a second opinion if you have a chance to look; in my 4th printing it is actually p. 1151, and equation A.4 should add ^q1 not "neg q_1" as stated in the errata, if I understand correctly). The book web page is at http://pgm.stanford.edu/ and it includes the errata, TOC, and chapter 1. (30 Dec '11, 18:25) rseiter

@rseiter, evidently LaTeX isn't allowed in comments on comments. See my top-level comment about the errata. (30 Dec '11, 19:17) Anne Paulson

"If you didn't like those sections of the AI class, or if you found them difficult, I hope you're not taking the PGM class, because it will be more of the same." I would not say that. Someone may have had difficulties with the AI class, or not liked how this topic was taught there, but that doesn't necessarily mean Daphne Koller would have the same effect on the student. I don't know her teaching style. And given that this is a very central topic in AI, I don't think someone wishing to work in that field can avoid it. If you had difficulties with this topic and really want to do AI, work hard and take the course. (02 Jan '12, 21:26) AchilleTalon

I bought AIMA and thought it helpful for the additional depth it provided. I especially liked the historical notes and detailed bibliography.

For ML I did not use a textbook (except for AIMA where applicable). I looked around a bit, but did not find one I thought would add enough value to buy. One that tempted me was Pattern Recognition and Machine Learning. That might be worth considering if you really like learning from books.

I purchased the optional text for PGM: Probabilistic Graphical Models: Principles and Techniques. After reading the introductory material (chapters 1 and 2 and appendix A) I'm glad I did (it looks like a good book).

I ran across some online books that might be worth checking out as well. This book is recommended in PGM appendix A for more detail on optimization if desired: http://www.stanford.edu/~boyd/cvxbook/ This book has been recommended in a few aiqus threads: http://www.inference.phy.cam.ac.uk/mackay/itprnn/book.html

I purchased a copy of the first edition of Jurafsky and Martin for NLP. I'm hoping that will be sufficient. http://nlp.stanford.edu/IR-book/ (by Chris Manning, the other instructor) might be worth a look as well.

answered 29 Dec '11, 14:59 rseiter

I also bought Peter Norvig's AIMA and have hardly used it during the course. I used it extensively for the first lessons, but after that, job, family and that kind of thing forced me to only watch the videos and go to the book very few times (midterm Q1 about agents and environments, noise and Laplace smoothing, and so on). Now I'm going to take the Probabilistic Graphical Models course and I haven't planned to buy any book.

answered 29 Dec '11, 14:37 ezubiriagonz...

Same here ... read the first few chapters and half of another assignment, then just used it for reference. (The ink smell was unpleasant, so I tried airing out a few pages before reading them.) (29 Dec '11, 16:28) EllenL

As robrambusch wrote, the textbook for NLP is Jurafsky's Speech and Language Processing, available at Amazon: http://www.amazon.com/Speech-Language-Processing-Daniel-Jurafsky/dp/0131873210/ Take a look at who wrote the first review of it. I'll give you a hint: wild shirts. Take a look, too, at the price: $115.

answered 29 Dec '11, 15:16 Anne Paulson

That was why I bought the first edition, <$20. Note that Peter Norvig's review was for the first edition (presumably the second is even better, but I don't know about 6x better). (29 Dec '11, 15:32) rseiter

@rseiter: Where did you find it for <$20? Update: @rseiter: Thanks!! (29 Dec '11, 17:11) Mindy Bokser

There are a number of copies of the first edition (from 2000, NOT the current second edition from 2008) on Amazon for <$20 at http://www.amazon.com/gp/offer-listing/0130950696/ref=tmm_hrd_used_olp_sr?ie=UTF8&condition=used (29 Dec '11, 17:27) rseiter

NLP is a fast-moving field. I suspect that there was a lot of progress between 2000 and 2008. The old textbook is no doubt still valuable, but I'm guessing the new textbook has significant upgrades. (29 Dec '11, 18:39) Anne Paulson

You can get the international paperback version from amazon.co.uk for 36 GBP (~$56). US customers have to pay additional shipping costs though. (29 Dec '11, 18:59) Ceda

There is no textbook for Machine Learning.

answered 29 Dec '11, 14:22 EllenL

PGM - For additional depth, you can refer to the best-selling textbook, "Probabilistic Graphical Models: Principles and Techniques" by Daphne Koller and Nir Friedman.

NLP - The best textbook for the class is Jurafsky and Martin, Speech and Language Processing 2nd Edition, complemented by chapters from Introduction to Information Retrieval - Manning, Schütze and Raghavan 2008 (free); other useful, good books include Manning and Schütze 1999 (2 chapters free), and Natural Language Processing with Python - Bird, Klein and Loper 2009 (free).

Quoted from the class pages (see right-hand menu).

answered 29 Dec '11, 14:22 robrambusch
