Book Details
Machine Learning - Made Easy To Understand

If you are looking for a book to help you understand how the machine learning algorithms "Random Forest" and "Decision Trees" work behind the scenes, then this is a good book for you. These two algorithms are commonly used in a variety of applications, including big data analysis for industry and data analysis competitions like those you would find on Kaggle.

This book explains how Decision Trees work and how they can be combined into a Random Forest to reduce many of the common problems with decision trees, such as overfitting the training data.

Several Dozen Visual Examples

Equations are great for really understanding every last detail of an algorithm. But to get a basic idea of how something works, in a way that will stick with you six months later, nothing beats pictures. This book contains several dozen images which detail things such as how a decision tree picks what splits it will make, how a decision tree can overfit its data, and how multiple decision trees can be combined to form a random forest.

This Is Not A Textbook

Most books, and other information on machine learning, that I have seen fall into one of two categories: they are either textbooks that explain an algorithm in a way similar to "And then the algorithm optimizes this loss function", or they focus entirely on how to set up code to use the algorithm and how to tune the parameters.

This book takes a different approach that is based on providing simple examples of how Decision Trees and Random Forests work, and building on those examples step by step to encompass the more complicated parts of the algorithms. The actual equations behind decision trees and random forests get explained by breaking them down and showing what each part of the equation does, and how it affects the examples in question.

Python Files & Excel File For Many Of The Examples Shown In The Book

Some topics in machine learning don't lend themselves to equations in an Excel table.
Things like error checking or complicated conditionals are hard to replicate outside of code. However, some topics work quite well in a spreadsheet. Topics such as entropy and information gain, which is how a decision tree picks its splits, can be easily calculated in a spreadsheet. The spreadsheet used to generate many of the examples in this book is available for free download, as are all of the Python scripts that ran the Random Forests & Decision Trees in this book and generated many of the plots and images. If you are someone who learns by playing with the code, and editing the data or equations to see what changes, then use those resources along with the book for a deeper understanding.

Topics Covered

The topics covered in this book are:
An overview of decision trees and random forests
A manual example of how a human would classify a dataset, compared to how a decision tree would work
How a decision tree works, and why it is prone to overfitting
How decision trees get combined to form a random forest
How to use that random forest to classify data and make predictions
How to determine how many trees to use in a random forest
Just where does the "randomness" come from
Out of Bag Errors & Cross Validation - how good of a fit did the machine learning algorithm make?
Gini Criteria & Entropy Criteria - how to tell which split on a decision tree is best among many possible choices
And more

If you want to know more about how these machine learning algorithms work, but don't need to reinvent them, this is a good book for you.
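
The description above mentions entropy, information gain, and the Gini criterion as the ways a decision tree tells which split is best. As a rough illustration of that idea, here is a minimal Python sketch using the standard textbook definitions. It is not taken from the book's own scripts or spreadsheet; the function names and the toy "yes"/"no" labels are hypothetical and chosen only for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    probs = [n / total for n in Counter(labels).values()]
    return sum(p * log2(1 / p) for p in probs)


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(parent, left, right, impurity=entropy):
    """Parent impurity minus the size-weighted impurity of the two branches."""
    total = len(parent)
    weighted = (len(left) / total) * impurity(left) \
        + (len(right) / total) * impurity(right)
    return impurity(parent) - weighted


# Hypothetical toy data: labels before a split and in each branch after it.
parent = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
left = ["yes", "yes", "yes", "no"]
right = ["yes", "no", "no", "no"]

print(round(entropy(parent), 3))                                  # 1.0
print(round(information_gain(parent, left, right), 3))            # 0.189
print(round(information_gain(parent, left, right, gini), 3))      # 0.125
```

A split that separated the two classes perfectly would score the full 1.0 bit of information gain here, so a tree comparing candidate splits would prefer it over the partial split shown.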