The correct bibliographic citation for this manual is as follows: deVille, Barry. 2006. Decision Trees for Business Intelligence and Data Mining: Using SAS® Enterprise Miner™. Cary, NC: SAS Institute Inc.
Decision Trees for Business Intelligence and Data Mining: Using SAS® Enterprise Miner™
Copyright © 2006, SAS Institute Inc., Cary, NC, USA
ISBN-13: 978-1-59047-567-6
ISBN-10: 1-59047-567-4
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.
1st printing, November 2006
SAS® Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/pubs or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
Contents
Preface
Acknowledgments
Chapter 1 Decision Trees—What Are They?
Introduction
Using Decision Trees with Other Modeling Approaches
Why Are Decision Trees So Useful?
Level of Measurement
Chapter 2 Descriptive, Predictive, and Explanatory Analyses
Introduction
The Importance of Showing Context
Antecedents
Intervening Factors
A Classic Study and Illustration of the Need to Understand Context
The Effect of Context
How Do Misleading Results Appear?
Automatic Interaction Detection
The Role of Validation and Statistics in Growing Decision Trees
The Application of Statistical Knowledge to Growing Decision Trees
Significance Tests
The Role of Statistics in CHAID
Validation to Determine Tree Size and Quality
What Is Validation?
Pruning
Machine Learning, Rule Induction, and Statistical Decision Trees
Rule Induction
Rule Induction and the Work of Ross Quinlan
The Use of Multiple Trees
A Review of the Major Features of Decision Trees
Roots and Trees
Branches
Similarity Measures
Recursive Growth
Shaping the Decision Tree
Deploying Decision Trees
A Brief Review of the SAS Enterprise Miner ARBORETUM Procedure
Chapter 3 The Mechanics of Decision Tree Construction
The Basics of Decision Trees
Step 1—Preprocess the Data for the Decision Tree Growing Engine
Step 2—Set the Input and Target Modeling Characteristics
Targets
Inputs
Step 3—Select the Decision Tree Growth Parameters
Step 4—Cluster and Process Each Branch-Forming Input Field
Clustering Algorithms
The Kass Merge-and-Split Heuristic
Dealing with Missing Data and Missing Inputs in Decision Trees
Step 5—Select the Candidate Decision Tree Branches
Step 6—Complete the Form and Content of the Final Decision Tree