Oracle® Data Mining Concepts
11g Release 2 (11.2)
Part Number E12216-02
Contents
List of Examples
List of Figures
List of Tables
Title and Copyright Information
Preface
Audience
Documentation Accessibility
Related Documentation
Conventions
What's New in Oracle Data Mining?
New Features in Oracle Data Mining 11g Release 2 (11.2)
New Features in Oracle Data Mining 11g Release 1 (11.1)
New Features in Oracle Data Mining 10g Release 2 (10.2)
Part I Introductions
1 What Is Data Mining?
What Is Data Mining?
Automatic Discovery
Prediction
Grouping
Actionable Information
Data Mining and Statistics
Data Mining and OLAP
Data Mining and Data Warehousing
What Can Data Mining Do and Not Do?
Asking the Right Questions
Understanding Your Data
The Data Mining Process
Problem Definition
Data Gathering and Preparation
Model Building and Evaluation
Knowledge Deployment
2 Introducing Oracle Data Mining
Data Mining in the Database Kernel
Data Mining Functions
Supervised Data Mining
Supervised Learning: Testing
Supervised Learning: Scoring
Unsupervised Data Mining
Unsupervised Learning: Scoring
Oracle Data Mining Functions
Data Mining Algorithms
Oracle Data Mining Supervised Algorithms
Oracle Data Mining Unsupervised Algorithms
Data Preparation
Supermodels
How Do I Use Oracle Data Mining?
Oracle Data Miner
PL/SQL Packages
SQL Functions
Java API
Oracle Spreadsheet Add-In for Predictive Analytics
Where Do I Find Information About Oracle Data Mining?
Oracle Data Mining Resources on the Oracle Technology Network
Oracle Data Mining Publications
Oracle Data Mining and Oracle Database Analytics
3 Introducing Oracle Predictive Analytics
About Predictive Analytics
Predictive Analytics and Data Mining
How Does it Work?
Predictive Analytics Operations
Oracle Spreadsheet Add-In for Predictive Analytics
APIs for Predictive Analytics
Predictive Analytics in the PL/SQL API
Predictive Analytics in the Java API
Example: Use OraProfileTask to Create Profile Results
Example: PREDICT
Behind the Scenes
EXPLAIN
PREDICT
Accuracy
PROFILE
Part II Mining Functions
4 Regression
About Regression
How Does Regression Work?
Linear Regression
Multivariate Linear Regression
Regression Coefficients
Nonlinear Regression
Multivariate Nonlinear Regression
Confidence Bounds
A Sample Regression Problem
Testing a Regression Model
Residual Plot
Regression Statistics
Root Mean Squared Error
Mean Absolute Error
Test Metrics in Oracle Data Miner
Regression Algorithms
5 Classification
About Classification
A Sample Classification Problem
Testing a Classification Model
Accuracy
Confusion Matrix
Lift
Lift Statistics
Receiver Operating Characteristic (ROC)
The ROC Curve
Area Under the Curve
ROC and Model Bias
ROC Statistics
Biasing a Classification Model
Costs
Costs Versus Accuracy
Positive and Negative Classes
Assigning Costs and Benefits
Priors
Classification Algorithms
6 Anomaly Detection
About Anomaly Detection
One-Class Classification
Anomaly Detection for Single-Class Data
Anomaly Detection for Finding Outliers
Sample Anomaly Detection Problems
Example: Find Outliers
Example: Score New Data
Algorithm for Anomaly Detection
7 Clustering
About Clustering
Interpreting Clusters
How are Clusters Computed?
Cluster Rules
Support and Confidence
Number of Clusters
Attribute Histograms
Centroid of a Cluster
Scoring New Data
Sample Clustering Problems
Example: Find Clusters
Example: Score New Data
Clustering Algorithms
8 Association
About Association
Transactions
Items and Collections
Sparse Data
Itemsets
Frequent Itemsets
A Sample Association Problem
Algorithm for Association Rules
9 Feature Selection and Extraction
Finding the Best Attributes
Feature Selection
Data for Attribute Importance
Example: Attribute Importance
Example: Predictive Analytics EXPLAIN
Feature Extraction
Data for Feature Extraction
Example: Features Created from Build Data
Example: Scored Features
Algorithms for Feature Selection and Extraction
Part III Algorithms
10 Apriori
About Apriori
Association Rules
Antecedent and Consequent
Confidence
Example: Calculating Rules from Frequent Itemsets
Metrics for Association Rules
Support
Confidence
Lift
Data for Association Rules
11 Decision Tree
About Decision Tree
Decision Tree Rules
Confidence and Support
Advantages of Decision Trees
Growing a Decision Tree
Splitting
Cost Matrix
Preventing Over-Fitting
XML for Decision Tree Models
Tuning the Decision Tree Algorithm
Data Preparation for Decision Tree
12 Generalized Linear Models
About Generalized Linear Models
GLM in Oracle Data Mining
Interpretability and Transparency
Wide Data
Confidence Bounds
Ridge Regression
Build Settings for Ridge Regression
Ridge and Confidence Bounds
Ridge and Variance Inflation Factor for Linear Regression
Ridge and Data Preparation
Tuning and Diagnostics for GLM
Build Settings
Diagnostics
Coefficient Statistics
Global Model Statistics
Row Diagnostics
Data Preparation for GLM
Data Preparation for Linear Regression
Data Preparation for Logistic Regression
Missing Values
Linear Regression
Coefficient Statistics for Linear Regression
Global Model Statistics for Linear Regression
Row Diagnostics for Linear Regression
Logistic Regression
Reference Class
Class Weights
Coefficient Statistics for Logistic Regression
Global Model Statistics for Logistic Regression
Row Diagnostics for Logistic Regression
13 k-Means
About k-Means
Data Preparation for k-Means
14 Minimum Description Length
About MDL
Data Preparation for MDL
15 Naive Bayes
About Naive Bayes
Advantages of Naive Bayes
Tuning a Naive Bayes Model
Data Preparation for Naive Bayes
16 Non-Negative Matrix Factorization
About NMF
How Does it Work?
Data Preparation for NMF
17 O-Cluster
About O-Cluster
Data Preparation for O-Cluster
User-Specified Data Preparation for O-Cluster
18 Support Vector Machines
About Support Vector Machines
Advantages of SVM
Advantages of SVM in Oracle Data Mining
Usability
Scalability
Kernel-Based Learning
Active Learning
Tuning an SVM Model
Data Preparation for SVM
Normalization
SVM and Automatic Data Preparation
SVM Classification
Class Weights
One-Class SVM
SVM Regression
Part IV Data Preparation
19 Automatic and Embedded Data Preparation
Overview
The Case Table
Data Type Conversion
Date Data
Text Transformation
Business and Domain-Sensitive Transformations
Automatic Data Preparation
Enabling Automatic Data Preparation
Overview of Algorithm-Specific Transformations
Binning
Normalization
Outlier Treatment
Algorithms and ADP
Embedded Data Preparation
Transformation Lists and ADP
Creating a Transformation List
Transforming a Nested Attribute
Oracle Data Mining Transformation Routines
Binning
Normalization
Outlier Treatment
Transparency
Model Details and the Build Data
Reverse Transformations
Altering the Reverse Transformation Expression
Part V Mining Unstructured Data
20 Text Mining
About Unstructured Data
How Oracle Data Mining Supports Unstructured Data
Mixed Data
Text Data Types
Text Mining Algorithms
Text Classification
Multi-Class Document Classification
Multi-Target Document Classification
Document Classification Algorithms
Text Clustering
Text Feature Extraction
Text Association
Text Attribute Importance
Preparing Text for Mining
Sample Text Mining Problem
Oracle Data Mining and Oracle Text
Glossary
Index