Book details
■ To the best of our knowledge, STATISTICA Data Miner contains the most comprehensive selection of data mining methods available on the market (e.g., by far the most comprehensive selection of clustering techniques, neural network architectures, classification/regression trees, multivariate modeling (including MARSplines), and many other predictive techniques), and a larger selection of graphics and visualization procedures than any competing product;
■ A selection of comprehensive, complete data mining projects, ready to run and set up to competitively evaluate alternative models [using bagging (voting, averaging), boosting, stacking, meta-learning, etc.] and to produce presentation-quality summary reports;
■ An extremely easy-to-use, drag-and-drop user interface that can be used even by novices but is still highly flexible and customizable, and provides one-click access to the underlying scripts;
■ Powerful, interactive data exploration (drilling, slicing, dicing) tools, including the most comprehensive selection of interactive, exploratory graphics and visualization tools available in any product;
■ Ability to process multiple data streams simultaneously;
■ Optimized for processing extremely large data sets (including options to pre-screen more than a million variables, and/or to draw stratified or simple random samples of records using DIEHARD-certified random sampling procedures);
■ Flexible deployment engine, integrated with a custom development environment that allows you to manage optimized analytic objects (nodes) for data mining using quick, industry-standard Visual Basic scripts (VB is built into the system);
■ Extremely fast and efficient deployment via portable, XML-based PMML (Predictive Model Markup Language) files for prediction, predictive classification, or predictive clustering of large data files; trained models can be shared between desktop and WebSTATISTICA Data Miner (Client-Server) installations (see below);
■ Options to write predicted values, classifications, classification probabilities, prediction residuals, and so on directly into external databases for subsequent analyses, selection, etc.; because efficient IDP (In-Place Database Processing) technology is used to read and write information from and to external databases, extremely large data sets can be analyzed and scored (i.e., used to update predicted values, classification probabilities, and so on in the database);
■ Open, COM-based architecture, unlimited automation options, and support for custom extensions (using industry-standard VB (built in), Java, or C/C++/C#);
■ Desktop or Client-Server options;
■ Multithreading and distributed processing architecture that delivers unmatched performance (offered in the Client-Server version), including supercomputer-like parallel processing technology that optionally scales to multiple server computers working in parallel to rapidly process computationally intensive projects;
■ Complete Web-enablement options (via WebSTATISTICA, which supports all data mining operations, including interactive model building, through an Internet browser on any computer connected to the Web); this ultimate enterprise data analysis/mining system allows you to manage projects over the Web and work collaboratively "across the hall or across continents."
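The feature list mentions combining alternative models by voting and averaging. As a rough illustration of that general idea only, here is a minimal Python sketch. It is not STATISTICA code and makes no claim about how the product implements it: the function names `majority_vote` and `average_predictions` and the sample data are hypothetical.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Combine class labels from several models by majority vote.

    `predictions` holds one list of labels per model (hypothetical layout);
    all lists must cover the same cases in the same order.
    """
    combined = []
    for labels in zip(*predictions):
        # most_common(1) yields the label with the highest count; on a tie,
        # the label encountered first among the models is returned.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def average_predictions(predictions):
    """Combine numeric predictions from several models by simple averaging."""
    return [mean(values) for values in zip(*predictions)]


# Hypothetical outputs of three classifiers for three cases.
model_labels = [
    ["yes", "no", "yes"],
    ["yes", "yes", "no"],
    ["no", "no", "yes"],
]

# Hypothetical outputs of three regression models for the same three cases.
model_scores = [
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 5.0],
    [3.0, 2.0, 4.0],
]

voted = majority_vote(model_labels)
averaged = average_predictions(model_scores)
print(voted)     # ['yes', 'no', 'yes']
print(averaged)  # [2.0, 2.0, 4.0]
```

Voting suits categorical predictions, where a mean has no meaning, while averaging suits continuous predictions. Both assume every model scored the same cases in the same order.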