Tamraparni Dasu, AT&T Labs - Research, tamr@research.att.com
Theodore Johnson, AT&T Labs - Research

Visualizing Attribute Relationships with DataSpheres

Keywords: Data mining, attribute relationships, visualization, DataSpheres

Abstract: Determining relationships between attributes is a significant component of data mining and analysis. Conventional visual methods for exploring attribute relationships rely on scatter plots of small subsets of attributes (typically pairs) with linked views of the data points. These techniques become unwieldy in high dimensions because of the large number of scatter plots required, while alternatives such as clustering and projection pursuit tend to be computationally expensive. In this paper we attempt to capture how attribute distributions vary with one another. We use the DataSphere technique, a scalable space-partitioning method proposed in our earlier work, to create a partition that captures distance and direction information. We then plot the marginal distributions of the attributes within each class of the partition. Because the partition boundaries place distance and directional constraints on the attribute values, we can use these constraints to visualize the linkage between variables. We illustrate the technique using real and artificial data.
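To make the idea of a partition that captures distance and direction concrete, the following is a minimal sketch, not the authors' implementation. It assumes that distance is measured from the attribute-wise mean after scaling by the standard deviation, that distance classes ("layers") are cut at quantiles of that distance, and that direction is identified by the attribute with the largest absolute scaled deviation together with its sign. The function names `datasphere_partition` and `marginals_by_class` and the parameter `num_layers` are hypothetical, and the per-class summary reports only attribute means as a stand-in for the full marginal distributions.

```python
import math
import statistics
from collections import defaultdict


def datasphere_partition(rows, num_layers=3):
    """Label each row with a (layer, direction) class.

    Assumed scheme: layer = quantile bin of the scaled distance from the
    center; direction = index and sign of the most deviant attribute.
    """
    dims = len(rows[0])
    cols = list(zip(*rows))
    centers = [statistics.fmean(c) for c in cols]
    # Guard against constant attributes, whose standard deviation is zero.
    scales = [statistics.pstdev(c) or 1.0 for c in cols]

    scaled = [[(v - m) / s for v, m, s in zip(row, centers, scales)]
              for row in rows]
    dist = [math.sqrt(sum(v * v for v in r)) for r in scaled]

    # Layer boundaries taken at equally spaced quantiles of the distances.
    ordered = sorted(dist)
    n = len(ordered)
    cuts = [ordered[min(n - 1, (k * n) // num_layers)]
            for k in range(1, num_layers)]

    labels = []
    for r, d in zip(scaled, dist):
        layer = sum(d > c for c in cuts)  # 0 is the innermost layer
        axis = max(range(dims), key=lambda j: abs(r[j]))
        sign = "+" if r[axis] >= 0 else "-"
        labels.append((layer, f"{axis}{sign}"))
    return labels


def marginals_by_class(rows, labels):
    """Summarize each attribute within each class (means only, for brevity)."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return {label: [statistics.fmean(col) for col in zip(*members)]
            for label, members in groups.items()}
```

Under these assumptions, every point in a class such as `(2, "1-")` lies in the outermost of three layers and has attribute 1 as its most extreme (negative) scaled deviation. Comparing the per-class summaries of the other attributes across classes is one way to see how they vary with attribute 1.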