Psychological Statistics

Statistics has a wide variety of uses in the behavioral sciences.  It is useful to divide statistics into descriptive and inferential statistics.  Descriptive statistics describes a sample of subjects.  Inferential statistics uses that sample to draw conclusions about the population it came from.  In the behavioral sciences, the subjects are usually people, but they could be schools or other organizations, neurons or other parts of the brain, and so on.  In any case, you usually have a population that you are interested in (perhaps all humans on Earth, or all children between the ages of 2 and 10 in the United States), and you have data on a sample from that population.

The proper descriptive statistics to use depend on how the variable is measured.  Some variables are numeric, and of these, some are continuous and some are discrete.  For example, height, weight, IQ and income are continuous (or so close to continuous that it doesn’t matter), while number of siblings is discrete.  When a variable is numeric, you want measures of its central tendency (e.g. the mean, median, trimmed mean, or mode), spread (e.g. standard deviation, range, or interquartile range), and shape (skewness and kurtosis).  Other variables are categorical, and these can be either ordered or nominal.  Ordered variables (also called ordinal) have an order, while nominal variables do not.  For both of these, a frequency table may be useful.  For ordinal variables the range is also useful.
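
As a rough illustration of the measures just listed, here is a minimal Python sketch using only the standard library.  The data are made up for the example (a hypothetical count of siblings and a hypothetical ordinal survey item); they do not come from any real study.

```python
import statistics
from collections import Counter

# Hypothetical data: number of siblings for nine subjects (discrete numeric).
siblings = [0, 1, 1, 2, 2, 2, 3, 4, 9]

# Central tendency
mean = statistics.mean(siblings)
median = statistics.median(siblings)
mode = statistics.mode(siblings)

# Spread
sd = statistics.stdev(siblings)              # sample standard deviation (n - 1)
value_range = max(siblings) - min(siblings)
q1, _, q3 = statistics.quantiles(siblings, n=4)
iqr = q3 - q1                                # interquartile range

# Hypothetical ordinal variable: a frequency table is the natural summary.
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]
freq = Counter(responses)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"sd={sd:.2f} range={value_range} IQR={iqr}")
print(dict(freq))
```

In this made-up sample the mean is larger than the median, which is what you would expect when one large value (the subject with nine siblings) skews the distribution to the right.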

There are also descriptive statistics about the relationships among variables.  These include the correlation and its variants, which are useful for numeric variables, and crosstabulation tables, which are useful for categorical variables.  When there are more than two variables, descriptive methods include factor analysis (which is used for uncovering latent traits), multi-dimensional scaling, cluster analysis (which groups together subjects that are “near” each other), and others.
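
The two simplest cases above, a correlation between numeric variables and a crosstabulation of categorical ones, might be sketched as follows.  The helper name pearson_r and all of the data are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical numeric variables: hours of sleep and a mood rating.
hours_sleep = [5, 6, 7, 8, 9]
mood = [2, 4, 5, 4, 5]
r = pearson_r(hours_sleep, mood)
print(f"r = {r:.3f}")

# Hypothetical categorical variables: group membership and outcome.
group = ["treatment", "treatment", "control", "control", "treatment", "control"]
outcome = ["improved", "improved", "not improved", "improved",
           "not improved", "not improved"]

# A crosstabulation is just a count of each (row, column) combination.
table = Counter(zip(group, outcome))
for (g, o), count in sorted(table.items()):
    print(f"{g:10s} {o:13s} {count}")
```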

Inferential statistics are used to test hypotheses.  Research in the behavioral sciences often involves null hypothesis significance testing (NHST).  This process has its critics, but it is very widely used.  To do NHST, you first set up a null hypothesis, which is nearly always some variation on “nothing is going on”.  For example, the null hypothesis might be that boys and girls have equal IQs, or that there is no relationship between height and depression.  Then you ask how likely it is that data like yours (or more extreme) would have arisen if the null hypothesis were true.  If such data would be very unlikely, you reject the null hypothesis.
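
One way to make this logic concrete is an exact permutation test, sketched below.  This is only one of many possible tests, picked here because it needs nothing beyond the standard library; the two groups of scores are made up.  Under the null hypothesis the group labels are arbitrary, so you can relabel the subjects in every possible way and see how often the difference in means is at least as large as the one you observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical scores for two small groups of subjects.
group_a = [12, 15, 14, 16]
group_b = [10, 11, 13, 9]

observed = mean(group_a) - mean(group_b)

pooled = group_a + group_b
n_a = len(group_a)
indices = range(len(pooled))

n_splits = 0
n_extreme = 0
# Enumerate every way of assigning n_a of the pooled scores to "group A".
for chosen in combinations(indices, n_a):
    a = [pooled[i] for i in chosen]
    b = [pooled[i] for i in indices if i not in chosen]
    diff = mean(a) - mean(b)
    n_splits += 1
    # Two-sided: count splits at least as extreme as the observed one
    # (small tolerance guards against floating-point ties).
    if abs(diff) >= abs(observed) - 1e-12:
        n_extreme += 1

p_value = n_extreme / n_splits
print(f"observed difference = {observed}, p = {p_value:.4f}")
```

With these made-up numbers, 4 of the 70 possible relabelings are at least as extreme as the observed one, so the p-value is about 0.057.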

There is a huge variety of inferential statistical techniques.  Regression (including ordinary regression, logistic regression and Poisson regression) tries to find the relationship between a dependent variable and one or more independent variables.  t-tests and analysis of variance test whether different groups of subjects differ on a numeric variable.  Mixed models are often used when the observations are not independent (for example, when you are looking at the effectiveness of teaching methods, and students are in classes, which are in schools).
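
To give a feel for two of these techniques, the sketch below fits an ordinary regression with a single independent variable and computes a two-sample t statistic by hand.  The function names and data are hypothetical, and the t statistic shown is the classic pooled-variance version, which assumes the two groups have equal variances.  In practice you would normally use a statistics package, which also reports p-values; mixed models in particular are not something to code by hand.

```python
import math
from statistics import mean, variance

def simple_regression(x, y):
    """Least-squares slope and intercept for one independent variable."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pooled_t(a, b):
    """Two-sample t statistic and degrees of freedom (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / se, df

# Hypothetical data: hours of study (x) and test score gain (y).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = simple_regression(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} * x")

# Hypothetical scores for two groups of subjects.
group_a = [12, 15, 14, 16]
group_b = [10, 11, 13, 9]
t, df = pooled_t(group_a, group_b)
print(f"t = {t:.3f} on {df} degrees of freedom")
```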

This is just an introduction to the use of statistics in the behavioral sciences.  There is a huge variety of methods, and each has its pitfalls and benefits.