دانلود کتاب Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining
by ChengXiang Zhai, Sean Massung
|
عنوان فارسی: مدیریت و تجزیه و تحلیل داده های متن: یک مقدمه عملی برای بازیابی اطلاعات و استخراج متن |
دانلود کتاب
جزییات کتاب
This book provides a systematic introduction to many of these approaches, with an emphasis on covering the most useful knowledge and skills required to build a variety of practically useful text information systems. Because humans can understand natural languages far better than computers can, effective involvement of humans in a text information system is generally needed and text information systems often serve as intelligent assistants for humans. Depending on how a text information system collaborates with humans, we distinguish two kinds of text information systems. The first is information retrieval systems which include search engines and recommender systems; they assist users in finding from a large collection of text data the most relevant text data that are actually needed for solving a specific application problem, thus effecively turning big raw text data into much smaller relevant text data that can be more easily processed by humans. The second is text mining application systems; they can assist users in analyzing patterns in text data to extract and discover useful actionable knowledge directly useful for task completion or decision making, thus providing more direct task support for users. This book covers the major concepts, techniques, and ideas in information retrieval and text data mining from a practical viewpoint, and includes many hands-on exercises designed with a companion software toolkit (i.e., MeTA) to help readers learn how to apply techniques of information retrieval and text mining to real-world text data and how to experiment with and improve some of the algorithms for interesting application tasks. This book can be used as a textbook for computer science undergraduates and graduates, library and information scientists, or as a reference book for practitioners working on relevant problems in managing and analyzing text data.