## 62: Statistics

Statistics is the science of obtaining, synthesizing, predicting, and drawing inferences from data. Elementary calculations of mean and standard deviation suffice to summarize a large, finite, normally distributed dataset; the field of Statistics exists because data are not usually so nicely given. If we do not know all the elements of the dataset, we must discuss sampling and experimental design; if the data are not normal, we must use other parameters to summarize them, or resort to nonparametric methods; if several variables are involved, we study the measures of interaction among them. Other topics include the study of time-dependent data, and the foundations necessary to avoid ambiguity or paradox. Computational methods (e.g. for curve-fitting) are of particular importance in applications to the sciences and engineering as well as financial and actuarial work.
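As a small illustration of the point about summaries, the sketch below computes the mean and standard deviation of a dataset alongside two rank-based summaries (median and interquartile range). The function name `summarize` and the sample values are assumptions made up for this example; only Python's standard `statistics` module is used.

```python
import statistics


def summarize(data):
    """Return mean/SD and rank-based summaries of a fully known dataset."""
    # Quartile cut points; statistics.quantiles uses its default method here.
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.fmean(data),
        # Population SD, since every element of the dataset is assumed known.
        "stdev": statistics.pstdev(data),
        "median": statistics.median(data),
        "iqr": q3 - q1,
    }


# Hypothetical data: a roughly symmetric set, then the same set plus one outlier.
symmetric = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
skewed = symmetric + [100.0]

for name, values in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, summarize(values))
```

On the symmetric data the mean is 5.0 and the median 4.5, so either describes the center. Adding the single value 100 moves the mean to about 15.6 while the median only moves to 5.0, which is one reason non-normal data call for other summaries.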

Certainly the greatest field of overlap with Statistics is 60: Probability.

For experimental design see also 05: Combinatorics.

Some questions on matching data to geometric figures are more properly questions of geometry (especially when there is a unique right answer).

For numerical methods, see 65U05. This applies in particular to specific curve-fitting algorithms.

Clustering algorithms are related to nearest-neighbor methods in computational geometry.

Other fields with some overlap, as seen in the diagram, are areas 90 (Operations Research, Game Theory), 93 (Control Theory), 92 (Sciences), 65 (Numerical Analysis), 94 (Communication), 01 (History), 68 (Computer Science), and 15 (Linear Algebra).

This image has been slightly hand-edited for clarity.

- 62A01: Foundations
- 62B: Sufficiency and information
- 62C: Decision theory, see also 90A05, 90B50; for game theory, see 90D35
- 62D05: Sampling theory, sample surveys
- 62E: Distribution theory, see also 60EXX
- 62F: Parametric inference
- 62G: Nonparametric inference
- 62H: Multivariate analysis, see also 60EXX
- 62J: Linear inference, regression
- 62K: Design of experiments, see also 05BXX
- 62L: Sequential methods
- 62M: Inference from stochastic processes
- 62N: Survival analysis and censored data [Engineering statistics]
- 62P: Applications, see also 90-XX, 92-XX
- 62Q05: Statistical tables

This is one of the largest areas in the Math Reviews database; many of the subfields listed above are fairly large. Indeed, the sub-areas 62G05 (nonparametric inference; estimation) and 62M10 (time series) are among the largest of the 5-digit areas.

Browse all (old) classifications for this area at the AMS.

Comprehensive: "Encyclopedia of statistical sciences", edited by Samuel Kotz, Norman L. Johnson and Campbell B. Read. John Wiley & Sons, Inc., New York, 1989. 6317pp in 9 volumes, plus supplements. MR90g:62001

Dated but useful: Kendall, Maurice G.; Doig, Alison G., "Bibliography of statistical literature" in 3 volumes: pre-1940, 1940--49, 1950--58. Oliver and Boyd, Edinburgh 1968 356 pp. MR41#2810

"A dictionary of statistical terms", first authored by M. G. Kendall and W. R. Buckland; fifth edition by F. H. C. Marriott: Longman Scientific & Technical, Harlow; copublished in the United States with John Wiley & Sons, Inc., New York, 1990. 223 pp. ISBN 0-582-01905-2 MR91j:62001

Sachs, Lothar: "A guide to statistical methods and to the pertinent literature", Springer-Verlag, Berlin-New York, 1986. 212 pp. ISBN 3-540-16835-4 MR88a:62001

Deely, J. J.: "What is Bayesian statistics?", New Zealand Oper. Res. 2 (1974), no. 2, 108--132. MR53#9440

Chen, Louis H. Y.: "What is nonparametric statistics?" Math. Medley 16 (1988), no. 2, 66--71. CMP992344

Roberts, Harry V.: "For what use are tests of hypotheses and tests of significance?", Comm. Statist.---Theory Methods A5 (1976), no. 8, 753--761. MR56#1549

Good, I. J.: "What is the use of a distribution?" Multivariate Analysis, II (Proc. Second Internat. Sympos., Dayton, Ohio, 1968) pp. 183--203; Academic Press, New York 1969 MR41#4703

Speed, T. P.: "What is an analysis of variance?", With a discussion and a reply by the author. Ann. Statist. 15 (1987), no. 3, 885--941. MR88k:62126

Ruppert, David: "What is kurtosis? An influence function approach", Amer. Statist. 41 (1987), no. 1, 1--5. CMP882763

Spitzer, John J.: "A primer on Box-Cox estimation", Rev. Econom. Statist. 64 (1982), no. 2, 307--313. MR83h:62055

Koopmans, L. H.: "A spectral analysis primer", Time series in the frequency domain, 169--183, Handbook of Statist., 3; North-Holland, Amsterdam, 1983. CMP749786

There are several statistics USENET newsgroups: sci.stat.math, sci.stat.edu, sci.stat.consult, comp.soft-sys.stat.spss, comp.soft-sys.stat.systat

An online text [Jan de Leeuw]

Another online text in statistics, this one with links to similar stat projects.

Introduction to Factor Analysis

Mailing lists: see http://www.stats.gla.ac.uk/allstat or http://www.mailbase.ac.uk/lists/minitab/files/list-of-lists

A full index to statistical routines and software is available at StatLib.

GAMS Statistics and Probability software.

A number of online calculators and other illustrative statistical tools.

Popular commercial statistical software includes SAS, SPSS, S-plus, etc.

Statlets: Java applets to perform the full range of elementary statistical analysis.

Packages for Mathematica, versions 2.2 and 3.0.

- American Statistical Association
- Institute of Mathematical Statistics
- SIAM's statistics page.
- Here are the AMS and Goettingen resource pages for area 62.
- StatLib Index, CMU
- www.statistics.com - a commercial site, but with a good selection of tutorials and software reviews.
- Statistics page in the World-Wide Web Virtual Library.
- UTK archives page

- Elementary statistical paradox.
- Mean, median and mode viewed as minimizing total variation.
- Definition and application of Z-scores
- Hypothesis testing: how can we decide whether or not something is zero?
- Why aggregate errors by summing their squares? And what are the consequences of using the least-squares criterion?
- A Bayes problem: if two medical tests show negative, what is the probability I'm really sick?
- Elementary summary of marginal and conditional distributions
- Who was Weibull of the Weibull distribution?
- Using the Poisson distribution to debunk numerology based on the appearance of integers in a set of real numbers.
- How to find a good ellipse to match a cluster of points in the plane?
- Applications of fuzzy logic to clustering and image processing
- Clustering algorithms.
- General observations: what is model-fitting, if not just finding a straight line through data points?
- Sample model-fitting problem (one-parameter, nonlinear): deciding what exactly is the goal.
- Deciding the class of functions to use for a fit.
- Basic method of fitting data to a polynomial (uni- or multivariate) of fixed degree.
- Code to perform (4-parameter) curve fitting.
- Comparing slopes of various interpretations of least-squares lines.
- Fitting data to a particular (exponential) family of curves, or, why *not* to have a mathematician try to do statistics.
- How to find a *monotone* function to fit data?
- Hough transform of data, to find patterns.
- Source code for Hough transform
- Pointer to Minitab (statistical software)
- How to fit the best circle/ellipse to some points in the plane? (Summary, pointers, citations, code)
- More careful fitting of an ellipse to some data points.
- Fitting a curve with nonlinear parameters via GNUPLOT
- Fitting a plane to a large number of points -- numerical issues
- Citations, cautions to numerical fitting of polynomial to data
- Using multidimensional scaling to approximately embed metric data sets into the plane.
- Summary of multidimensional scaling (dimension reduction, singular-value decomposition, rather like principal component analysis) to pick out key data attributes -- or locate cities on a map.
- Matching data to a curve y=A sin(x+B)
- Statistical distributions of quantities derived from random points on spheres (Citations)
- Citation for statistical analysis of spherical data.
- Determining Lyapunov exponents of time series
- Why are Gaussian normal (or Poisson) distributions used in practice?
- Basic: regression results depend on objective function being minimized
- What does the R^2 statistic measure in a regression?
- Sample fit to curve with nonlinear parameters
- Fitting best planes to data points in R^3 (Deming's vs Pearson's methods)
- Reference: cluster algorithms for data points in R^3
- Mahalanobis metrics to estimate linearity of data swarms
- Fitting a circle to data in the plane
- Pointer: Fitting an ellipse to data in the plane
- Fitting a cylinder to data in R^3
- Generalities on finding approximations to functions on R^n
- Fitting a sum of (a few) exponential functions to data
- Fitting data to Johnson distributions
- Fitting a sum of Gaussian distributions to data
- Use DeMoivre-Laplace limit theorem to test hypotheses of distribution
- Mean of medians of sets of three trials; order statistics
- Statistical tests to use for comparing populations (t-test, chi-squared)
- Use the t-test for comparing means
- Spearman rank-order correlation
- The Law of the Unconscious Statistician
- Filtered acceleration of time series
- Use of incomplete Beta function to predict population proportions from a sample
- How do statisticians stay interested? :-)
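
Several entries in the list above (mean and median as minimizers, why errors are aggregated by summing squares, regression results depending on the objective function) turn on the same observation: the "best" fit depends on the loss being minimized. The sketch below shows this for the simplest model, a single constant fitted to data. The helper names, the sample data, and the brute-force grid search are all assumptions chosen for illustration; this is not the method used in any of the linked discussions.

```python
import statistics


def total_squared_error(data, c):
    """Sum of squared deviations of the data from the constant c."""
    return sum((x - c) ** 2 for x in data)


def total_absolute_error(data, c):
    """Sum of absolute deviations of the data from the constant c."""
    return sum(abs(x - c) for x in data)


def best_constant(data, loss, candidates):
    """Brute force: return the candidate constant with the smallest loss."""
    return min(candidates, key=lambda c: loss(data, c))


# Hypothetical sample with one large value.
data = [1.0, 2.0, 3.0, 4.0, 10.0]
grid = [i / 100 for i in range(0, 1101)]  # candidates 0.00, 0.01, ..., 11.00

least_squares_fit = best_constant(data, total_squared_error, grid)
least_absolute_fit = best_constant(data, total_absolute_error, grid)

print("least squares:", least_squares_fit, "mean:", statistics.fmean(data))
print("least absolute:", least_absolute_fit, "median:", statistics.median(data))
```

For this sample the least-squares constant coincides with the mean (4.0) and the least-absolute-deviation constant with the median (3.0), so the same data give two different "best" answers under the two criteria.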

Last modified 2001/05/14 by Dave Rusin.