Book details
Machine-generated contents note:
1. Prologue
   1.1. Machines that learn -- some recent history
   1.2. Twenty canonical questions
   1.3. Outline of the book
   1.4. A comment about example datasets
   1.5. Software
   Note
2. The landscape of learning machines
   2.1. Introduction
   2.2. Types of data for learning machines
   2.3. Will that be supervised or unsupervised?
   2.4. An unsupervised example
   2.5. More lack of supervision -- where are the parents?
   2.6. Engines, complex and primitive
   2.7. Model richness means what, exactly?
   2.8. Membership or probability of membership?
   2.9. A taxonomy of machines?
   2.10. A note of caution -- one of many
   2.11. Highlights from the theory
   Notes
3. A mangle of machines
   3.1. Introduction
   3.2. Linear regression
   3.3. Logistic regression
   3.4. Linear discriminant
   3.5. Bayes classifiers -- regular and naive
   3.6. Logic regression
   3.7. k-Nearest neighbors
   3.8. Support vector machines
   3.9. Neural networks
   3.10. Boosting
   3.11. Evolutionary and genetic algorithms
   Notes
4. Three examples and several machines
   4.1. Introduction
   4.2. Simulated cholesterol data
   4.3. Lupus data
   4.4. Stroke data
   4.5. Biomedical means unbalanced
   4.6. Measures of machine performance
   4.7. Linear analysis of cholesterol data
   4.8. Nonlinear analysis of cholesterol data
   4.9. Analysis of the lupus data
   4.10. Analysis of the stroke data
   4.11. Further analysis of the lupus and stroke data
   Notes
5. Logistic regression
   5.1. Introduction
   5.2. Inside and around the model
   5.3. Interpreting the coefficients
   5.4. Using logistic regression as a decision rule
   5.5. Logistic regression applied to the cholesterol data
   5.6. A cautionary note
   5.7. Another cautionary note
   5.8. Probability estimates and decision rules
   5.9. Evaluating the goodness-of-fit of a logistic regression model
   5.10. Calibrating a logistic regression
   5.11. Beyond calibration
   5.12. Logistic regression and reference models
   Notes
6. A single decision tree
   6.1. Introduction
   6.2. Dropping down trees
   6.3. Growing a tree
   6.4. Selecting features, making splits
   6.5. Good split, bad split
   6.6. Finding good features for making splits
   6.7. Misreading trees
   6.8. Stopping and pruning rules
   6.9. Using functions of the features
   6.10. Unstable trees?
   6.11. Variable importance -- growing on trees?
   6.12. Permuting for importance
   6.13. The continuing mystery of trees
7. Random Forests -- trees everywhere
   7.1. Random Forests in less than five minutes
   7.2. Random treks through the data
   7.3. Random treks through the features
   7.4. Walking through the forest
   7.5. Weighted and unweighted voting
   7.6. Finding subsets in the data using proximities
   7.7. Applying Random Forests to the Stroke data
   7.8. Random Forests in the universe of machines
   Notes
8. Merely two variables
   8.1. Introduction
   8.2. Understanding correlations
   8.3. Hazards of correlations
   8.4. Correlations big and small
   Notes
9. More than two variables
   9.1. Introduction
   9.2. Tiny problems, large consequences
   9.3. Mathematics to the rescue?
   9.4. Good models need not be unique
   9.5. Contexts and coefficients
   9.6. Interpreting and testing coefficients in models
   9.7. Merging models, pooling lists, ranking features
   Notes
10. Resampling methods
   10.1. Introduction
   10.2. The bootstrap
   10.3. When the bootstrap works
   10.4. When the bootstrap doesn't work
   10.5. Resampling from a single group in different ways
   10.6. Resampling from groups with unequal sizes
   10.7. Resampling from small datasets
   10.8. Permutation methods
   10.9. Still more on permutation methods
   Note
11. Error analysis and model validation
   11.1. Introduction
   11.2. Errors? What errors?
   11.3. Unbalanced data, unbalanced errors
   11.4. Error analysis for a single machine
   11.5. Cross-validation error estimation
   11.6. Cross-validation or cross-training?
   11.7. The leave-one-out method
   11.8. The out-of-bag method
   11.9. Intervals for error estimates for a single machine
   11.10. Tossing random coins into the abyss
   11.11. Error estimates for unbalanced data
   11.12. Confidence intervals for comparing error values
   11.13. Other measures of machine accuracy
   11.14. Benchmarking and winning the lottery
   11.15. Error analysis for predicting continuous outcomes
   Notes
12. Ensemble methods -- let's take a vote
   12.1. Pools of machines
   12.2. Weak correlation with outcome can be good enough
   12.3. Model averaging
   Notes
13. Summary and conclusions
   13.1. Where have we been?
   13.2. So many machines
   13.3. Binary decision or probability estimate?
   13.4. Survival machines? Risk machines?
   13.5. And where are we going?