Brian Everitt et al. – Cluster Analysis, 5th Edition
Description
Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics.
This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data.
Real-life examples are used throughout to demonstrate the application of the theory, and figures are used extensively to illustrate graphical techniques. The book is comprehensive yet relatively non-mathematical, focusing on the practical aspects of cluster analysis.
KEY FEATURES:
• Presents a comprehensive guide to clustering techniques, with focus on the practical aspects of cluster analysis.
• Provides a thorough revision of the fourth edition, including new developments in clustering longitudinal data and examples from bioinformatics and gene studies.
• Updates the chapter on mixture models to include recent developments and presents a new chapter on mixture modeling for structured data.
Practitioners and researchers working in cluster analysis and data analysis will benefit from this book.
TABLE OF CONTENTS
Preface.
Acknowledgement.
1 AN INTRODUCTION TO CLASSIFICATION AND CLUSTERING.
1.1 Introduction.
1.2 Reasons for classifying.
1.3 Numerical methods of classification – cluster analysis.
1.4 What is a cluster?
1.5 Examples of the use of clustering.
1.5.1 Market research.
1.5.2 Astronomy.
1.5.3 Psychiatry.
1.5.4 Weather classification.
1.5.5 Archaeology.
1.5.6 Bioinformatics and genetics.
1.6 Summary.
2 DETECTING CLUSTERS GRAPHICALLY.
2.1 Introduction.
2.2 Detecting clusters with univariate and bivariate plots of data.
2.2.1 Histograms.
2.2.2 Scatterplots.
2.2.3 Density estimation.
2.2.4 Scatterplot matrices.
2.3 Using lower-dimensional projections of multivariate data for graphical representations.
2.3.1 Principal components analysis of multivariate data.
2.3.2 Exploratory projection pursuit.
2.3.3 Multidimensional scaling.
2.4 Three-dimensional plots and trellis graphics.
2.5 Summary.
3 MEASUREMENT OF PROXIMITY.
3.1 Introduction.
3.2 Similarity measures for categorical data.
3.2.1 Similarity measures for binary data.
3.2.2 Similarity measures for categorical data with more than two levels.
3.3 Dissimilarity and distance measures for continuous data.
3.4 Similarity measures for data containing both continuous and categorical variables.
3.5 Proximity measures for structured data.
3.6 Inter-group proximity measures.
3.6.1 Inter-group proximity derived from the proximity matrix.
3.6.2 Inter-group proximity based on group summaries for continuous data.
3.6.3 Inter-group proximity based on group summaries for categorical data.
3.7 Weighting variables.
3.8 Standardization.
3.9 Choice of proximity measure.
3.10 Summary.
4 HIERARCHICAL CLUSTERING.
4.1 Introduction.
4.2 Agglomerative methods.
4.2.1 Illustrative examples of agglomerative methods.
4.2.2 The standard agglomerative methods.
4.2.3 Recurrence formula for agglomerative methods.
4.2.4 Problems of agglomerative hierarchical methods.
4.2.5 Empirical studies of hierarchical agglomerative methods.
4.3 Divisive methods.
4.3.1 Monothetic divisive methods.
4.3.2 Polythetic divisive methods.
4.4 Applying the hierarchical clustering process.
4.4.1 Dendrograms and other tree representations.
4.4.2 Comparing dendrograms and measuring their distortion.
4.4.3 Mathematical properties of hierarchical methods.
4.4.4 Choice of partition – the problem of the number of groups.
4.4.5 Hierarchical algorithms.
4.4.6 Methods for large data sets.
4.5 Applications of hierarchical methods.
4.5.1 Dolphin whistles – agglomerative clustering.
4.5.2 Needs of psychiatric patients – monothetic divisive clustering.
4.5.3 Globalization of cities – polythetic divisive method.
4.5.4 Women’s life histories – divisive clustering of sequence data.
4.5.5 Composition of mammals’ milk – exemplars, dendrogram seriation and choice of partition.
4.6 Summary.
5 OPTIMIZATION CLUSTERING TECHNIQUES.
5.1 Introduction.
5.2 Clustering criteria derived from the dissimilarity matrix.
5.3 Clustering criteria derived from continuous data.
5.3.1 Minimization of trace(W).
5.3.2 Minimization of det(W).
5.3.3 Maximization of trace(BW⁻¹).
5.3.4 Properties of the clustering criteria.
5.3.5 Alternative criteria for clusters of different shapes and sizes.
5.4 Optimization algorithms.
5.4.1 Numerical example.
5.4.2 More on k-means.
5.4.3 Software implementations of optimization clustering.
5.5 Choosing the number of clusters.
5.6 Applications of optimization methods.
5.6.1 Survey of student attitudes towards video games.
5.6.2 Air pollution indicators for US cities.
5.6.3 Aesthetic judgement
This is a digital download service: the course is available at Coursecui.com and is delivered as an email download.