Natural Language Understanding
by James Allen
Persian title: فهم زبان (Understanding Language)
Book details
Well, I was looking at the list of abbreviations for the categories (parts of speech) that the book uses, and I noticed, for the first time after owning this book for over 10 years, that no abbreviation for "conjunction" is listed. And indeed, after consulting the index and looking through the book, it is plain that this book doesn't treat conjunction at all!
I have many fond memories of this book--it is the book from which my beloved professor at grad school taught me NLP, and indeed, it contains far more information about NLP than most of its successors. For example, this book gives perhaps the best discussion of quantifier scope ambiguities of all the major NLP textbooks. (Compare Jurafsky and Martin's book, which devotes about half a sentence to quantifier scope ambiguities.)
But it has odd omissions, one of which is the lack of any treatment of conjunction/disjunction. After devoting so much time to quantifier scope, why does Allen leave me in the dark about whether "every woman" can take scope over "a man" in the sentence "A man and every woman hug each other"? Does that scope differently from "Every woman and a man hug each other"? Or what about "Every woman and her mother fight"? Can that mean "Every woman fights with her mother," or are we to look for another antecedent for "her"?
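For readers who haven't met the term: a sentence with two quantified noun phrases, such as "Every woman hugs a man," has one candidate reading per ordering of its quantifiers. Here is a minimal Python sketch of that enumeration. The triple representation and the function name `scope_readings` are my own inventions for illustration, not Allen's notation, and the sketch deliberately says nothing about the conjoined cases above, which are exactly the hard part.

```python
from itertools import permutations

def scope_readings(quantifiers, body):
    """Return one formula string per ordering of the quantifiers.

    Each quantifier is a hypothetical (kind, variable, noun) triple,
    where kind is "every" or "a". This is a toy representation only.
    """
    readings = []
    for order in permutations(quantifiers):
        formula = body
        # Wrap from the innermost quantifier outward, so the first
        # quantifier in `order` ends up with the widest scope.
        for kind, var, noun in reversed(order):
            if kind == "every":
                formula = f"forall {var}. ({noun}({var}) -> {formula})"
            else:
                formula = f"exists {var}. ({noun}({var}) & {formula})"
        readings.append(formula)
    return readings

if __name__ == "__main__":
    nps = [("every", "x", "woman"), ("a", "y", "man")]
    for reading in scope_readings(nps, "hug(x, y)"):
        print(reading)
```

The first reading lets each woman hug a possibly different man; the second requires one man hugged by every woman. Not every ordering is a genuine reading of every sentence, so a real system has to filter this list.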
Or again, take Allen's treatment of Prolog-esque definite clause grammars (DCGs). Allen deserves major kudos here for including them. It's obvious that he comes from the LISP side of the tracks, and most LISPy books on NLP ignore DCGs altogether (Norvig's "Paradigms of AI Programming" being a notable exception). But it seems almost as if Allen goes out of his way to present DCGs in the most unattractive light possible. Prolog has a nice syntactic sugar which makes a DCG look almost exactly like a context-free grammar specification, but you'd never know that if you only read this book--Allen chooses a weird way to translate strings into clauses, which results in a bizarre-looking Prolog grammar for them. The student naturally recoils in horror, but unless she reads a Prolog-oriented book on NLP, she would never know how much easier DCGs are to program than ATNs or the bottom-up parsing methods which Allen goes on to expound.
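To make the point concrete: in Prolog's DCG notation a rule is written as `s --> np, vp.` and a word as `det --> [every].`, and the system quietly threads the remaining input through each rule for you. The sketch below is a rough Python analogue, not Prolog and not Allen's formulation. The toy grammar and the names `GRAMMAR`, `parse`, and `recognizes` are assumptions of mine. It shows that the grammar itself can stay a plain context-free rule table while a few lines of generic code do the threading.

```python
# Toy grammar, made up for illustration. Each nonterminal maps to a list
# of alternative right-hand sides; any symbol that is not a key in the
# table is treated as a terminal word.
GRAMMAR = {
    "s":   [["np", "vp"]],
    "np":  [["det", "n"]],
    "vp":  [["v", "np"], ["v"]],
    "det": [["every"], ["a"]],
    "n":   [["woman"], ["man"]],
    "v":   [["hugs"], ["fights"]],
}

def parse(symbol, words, start=0):
    """Yield each position where `symbol` can end if it begins at `start`."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next word of the input.
        if start < len(words) and words[start] == symbol:
            yield start + 1
        return
    for rhs in GRAMMAR[symbol]:
        # Thread the set of reachable positions left to right through
        # the right-hand side, much as a DCG threads the remaining input.
        positions = [start]
        for part in rhs:
            positions = [end for p in positions
                         for end in parse(part, words, p)]
        yield from positions

def recognizes(sentence):
    """True if the whole sentence can be derived from the start symbol s."""
    words = sentence.split()
    return any(end == len(words) for end in parse("s", words))

if __name__ == "__main__":
    print(recognizes("every woman hugs a man"))
    print(recognizes("woman every hugs"))
```

Like a naive top-down DCG, this recognizer would loop forever on a left-recursive rule, and it only accepts or rejects a sentence rather than building a parse tree.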
Since this book was published, the field of NLP has taken a bit of a sidetrack through statistical learning of grammars--the thought being that, well, we really don't know how to do knowledge representation or pronoun resolution very well, so let's all spend a decade or so on how to induce grammars from corpora. This book doesn't cover any of this research, but frankly, I really don't consider that a criticism of the book, because now that grammar induction has been done to death, we're right back where this book leaves off. Computers can parse sentences all right (heck, these days, computers can even assign numbers between 0 and 1 to parse trees), but can computers UNDERSTAND sentences?
I would love to see a 3rd edition of this book, and I'm sure I'm not alone. What I'd like to see it cover is (surprise, surprise) conjunction/disjunction, discourse representation theory, underspecification, and a meatier discussion of knowledge representation and inference. A few chapters on natural language generation would also be nice, as would a discussion of dialogue. Skip the sections on ATNs and other parsing methods which are only of historical interest now.
Flaws and all, this book is beloved by generations of NLP researchers and is still indispensable, after all these years.