Interface 1999 Short Course:

Data Mining

Abstract

This course surveys the leading computer-intensive algorithms for inductive classification and estimation. The inner workings of methods drawn from Data Mining, Statistics, and Machine Learning are described and compared. The instructor will also briefly demonstrate their relative effectiveness on practical applications. We'll first review classical statistical techniques, both linear and nonparametric, then outline the ways in which these basic tools are modified and combined into more modern methods.

The course pays particular attention to four powerful approaches: Neural Networks, Polynomial Networks, Kernels, and Decision Trees. Actual scientific and business problems are examined to demonstrate useful accompanying techniques, such as scientific visualization, bootstrapping, and "bundling", that experienced analysts employ.

Along the way, major relative strengths and distinctive properties of the leading commercial software tools for Data Mining will be discussed.

Who should attend

Those who work with data and wish to understand the links between recent developments in pattern discovery, data mining, and inductive modeling. At the conclusion of this course, one should be able to discern the basic strengths of competing methods and select the appropriate tools for one's applications. Participants should have prior working experience with computers and introductory knowledge of applied statistical techniques.

Outline

Pattern Discovery: An Overview

Classical Statistical Techniques (brief review)

Modern Methods

Key General Tools

Data Issues

(Brief) Outline of Other Methods

Comparing and Combining Methods

Instructor

John F. Elder, Ph.D.
elder@datamininglab.com
Elder Research, Charlottesville, VA

Handouts

Participants will receive spiral-bound copies of the slides, annotated reference lists, and a copy of the invited book chapter "A Statistical Perspective on Knowledge Discovery in Databases", by John Elder and Daryl Pregibon.

A Note About The Course Scope

Each of the major topics discussed could clearly fill a semester-long course if presented in full detail. What this (admittedly intensive) short course provides, however, is a broad overview of the highlights, drawing connections between major developments in the diverse fields that contribute to the emerging discipline of Data Mining. Previous participants have found this "big picture" useful for identifying avenues worthy of further exploration, whether for research or practical problem-solving.

