Statistics

cosmos 3rd April 2019 at 3:18am
Data & Knowledge

Statistics is concerned with the study of collective properties of collections of objects (referred to as Populations). One can model properties of a population (like Frequency distributions) as Probability distributions, even though these don't have the interpretation of probabilities, as there isn't any randomness involved. Any quantity intended to summarize properties of the population may be called a statistic.
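As a rough illustration of the point above, here is a minimal sketch that treats the relative frequencies of a small population as a distribution and computes one statistic (the mean) from it. The population values and the helper name frequency_distribution are made up for the example.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical population: number of children in each of 10 households.
population = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]


def frequency_distribution(values):
    """Return relative frequencies; they are non-negative and sum to 1."""
    counts = Counter(values)
    n = len(values)
    return {v: Fraction(c, n) for v, c in sorted(counts.items())}


dist = frequency_distribution(population)

# A statistic summarizing the population: its mean, computed two ways.
mean_direct = Fraction(sum(population), len(population))
mean_from_dist = sum(v * p for v, p in dist.items())

print(dist)
print(mean_direct, mean_from_dist)
```

The frequencies behave formally like probabilities (they sum to 1, and the mean is an expectation under them), although nothing random has happened yet.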

Statistical inference

For reasons of time or cost we may not wish or be able to study each individual element of the population. In statistical inference (probably the main area of study in statistics), the object is to draw conclusions ("infer") about the unknown population characteristics on the basis of information on some characteristics of a suitably selected sample.

The combination of a population with a random Sampling procedure defines Random variables known as Samples, and allows for a more mathematically principled application of Probability (Probability theory). Functions of these samples are also known as Statistics.
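A small simulation may make this concrete: once a random sampling procedure is fixed, a function of the sample (here the sample mean) is itself a random variable with its own distribution. This is only a sketch; the population (the integers 1 to 100), the sample size, the seed and the helper name sample_mean are all assumptions chosen for illustration.

```python
import random
import statistics

population = list(range(1, 101))  # hypothetical population: values 1..100
population_mean = statistics.mean(population)


def sample_mean(rng, n):
    """Draw n elements with replacement and return their mean (a statistic)."""
    sample = rng.choices(population, k=n)
    return statistics.mean(sample)


rng = random.Random(0)  # fixed seed so the run is reproducible
means = [sample_mean(rng, 25) for _ in range(2000)]

# The statistic varies from sample to sample, but it is centred
# near the population mean.
mean_of_means = statistics.mean(means)
spread_of_means = statistics.stdev(means)

print(population_mean, mean_of_means, spread_of_means)
```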

General approaches:

Some types of statistical inference problems:

Parametric statistical inference

Here one tries to estimate a finite set of Parameters that parametrize a family of Probability distributions, one member of which is assumed to describe the population being studied.
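A minimal sketch of parametric inference, assuming the family is the normal distributions with parameters (mu, sigma) and using the maximum likelihood estimates (sample mean, and standard deviation with divisor n). The true parameter values, the seed and the function name fit_normal_mle are invented for the example; other estimators would be equally valid choices.

```python
import math
import random


def fit_normal_mle(data):
    """Maximum likelihood estimates (mu, sigma) for a normal model."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # MLE divides by n, not n - 1
    return mu, math.sqrt(var)


rng = random.Random(1)
true_mu, true_sigma = 10.0, 2.0  # assumed "unknown" population parameters
data = [rng.gauss(true_mu, true_sigma) for _ in range(5000)]

mu_hat, sigma_hat = fit_normal_mle(data)
print(mu_hat, sigma_hat)
```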

Statistical hypothesis testing

Nonparametric statistics

One doesn't assume that the population is described by a parametrized family of distributions, but by a more general class.
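One simple nonparametric estimate is the empirical distribution function, which estimates the population's cumulative distribution directly from the sample without assuming any parametric form. The sketch below is illustrative only; the sample values and the helper name empirical_cdf are made up.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return F where F(x) is the fraction of sample points <= x."""
    xs = sorted(sample)
    n = len(xs)

    def F(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(xs, x) / n

    return F


F = empirical_cdf([3.1, 0.5, 2.2, 2.2, 4.0])
print(F(0.0), F(2.2), F(3.0), F(4.0))
```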


Statistics

Sampling

Compressed sampling

Sampling without replacement

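Drawing without replacement makes the draws dependent, since each draw changes what remains in the population. The effect is negligible when the sample is much smaller than the population, but it matters when the sample size is close to the total population size.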
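To see the size of the effect, the sketch below enumerates every possible sample of size n = 4 from a tiny made-up population of N = 6 values and compares the exact variance of the sample mean with and without replacement. The ratio matches the standard finite population correction (N - n) / (N - 1). The population and the helper name var_of_mean are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

population = [1, 2, 3, 4, 5, 6]  # hypothetical tiny population
N, n = len(population), 4


def var_of_mean(samples):
    """Exact variance of the sample mean over equally likely samples."""
    means = [Fraction(sum(s), len(s)) for s in samples]
    centre = sum(means) / len(means)
    return sum((m - centre) ** 2 for m in means) / len(means)


# With replacement: all ordered n-tuples are equally likely.
with_repl = var_of_mean(list(product(population, repeat=n)))

# Without replacement: all n-element subsets are equally likely.
without_repl = var_of_mean(list(combinations(population, n)))

print(with_repl, without_repl, without_repl / with_repl)
```

With n this close to N the variance drops to 2/5 of its with-replacement value; for n much smaller than N the correction factor is close to 1.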

Resampling

Mathematical statistics

On the Mathematical Foundations of Theoretical Statistics

Sufficient statistic, Likelihood function

Mathematical Statistics Videos (some YouTube videos)

Statistical learning

aka predictive inference

Particular classes of statistical inference problems, where we attempt to infer ("learn") functions that allow one to make predictions about future data. Here we are interested in inferring things about the population that allow us to predict certain other things. In other statistical inference problems the aim may be different (e.g. estimating a parameter as accurately as possible), and therefore the measure of success may be different.
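The difference in the measure of success can be sketched with a toy example: the same fitted line is judged once by its prediction error on held-out data (the predictive view) and once by how close its slope is to the true slope (the parameter-estimation view). Everything here is an assumption for illustration: the synthetic data, the true slope 3.0 and intercept 1.0, the 150/50 split, and the helper names fit_line and mse.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(model, xs, ys):
    """Mean squared prediction error of a (slope, intercept) model."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(2)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3.0 * x + 1.0 + rng.gauss(0, 1.0) for x in xs]  # synthetic data

# Fit on the first 150 points, evaluate on the remaining 50.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

model = fit_line(train_x, train_y)
test_mse = mse(model, test_x, test_y)   # success measure for prediction
slope_error = abs(model[0] - 3.0)       # success measure for estimation

print(model, test_mse, slope_error)
```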

See Machine learning, Statistical learning theory

Decision theory


Resources

Wiki article

Understanding statistics through explorables

–> Seeing theory -- a visual introduction to probability and statistics

Train, validation, test

http://colorfulengineering.org/SCICOMP.html

Error propagation


Statistics software

R language