Independent and Dependent Statistical Variables

The difference between an independent variable and a dependent variable comes down to causation: the independent variable causes the dependent variable. Many people already understand this concept but get confused by the mathematical vocabulary. A simple way to remember it is that the "Dependent Variable" depends on what the "Independent Variable" is.

Some real-world examples of mathematical relationships are the amount of electricity you use (independent) and your electric bill (dependent); the amount of food you order (independent) and your food bill (dependent); or how much you drive (independent) and how much emptier your gas tank is (dependent). In all of these relationships, something we choose to do affects something else; in mathematics we say the independent variable affects or changes the dependent variable.
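To make the electricity example concrete, here is a minimal sketch in Python. The rate of $0.12 per kWh and the usage values are made-up numbers chosen only for illustration; the point is that the bill is computed from the usage, never the other way around.

```python
# Hypothetical illustration: the electric bill (dependent) is computed
# from the electricity used (independent). The rate is a made-up number.
RATE_PER_KWH = 0.12  # dollars per kilowatt-hour (assumed for illustration)

def electric_bill(kwh_used: float) -> float:
    """The bill depends on usage: change the usage and the bill changes."""
    return RATE_PER_KWH * kwh_used

for kwh in [100, 250, 500]:  # values we choose (independent)
    print(kwh, "kWh ->", f"${electric_bill(kwh):.2f}")  # result (dependent)
```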

Now that we have a firmer grasp of what independent and dependent mean in mathematics, we can see how they actually appear in problems. Mathematically we can represent these relationships in different ways, and graphically we usually put the independent variable on the x-axis and the dependent variable on the y-axis. Many people know the formula y = mx + b; looked at abstractly, we plug in x, do something to it, find our y value, and form a coordinate pair. Anyone who has worked with functions can see this even more clearly with f(x) = mx + b, because the notation literally says that f(x) is a function where we do something to an x-value to get a y-value. Another way to look at it is that m is the slope, which describes how a change in x causes a change in y.
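Here is a short sketch of that idea in Python; the slope m = 2 and intercept b = 1 are arbitrary example values, not taken from any particular problem. We choose the x-values (independent) and the function hands back the y-values (dependent).

```python
# Sketch: y = m*x + b, where x is the independent variable and y is dependent.
# The slope m and intercept b are arbitrary example values.
m, b = 2.0, 1.0

def f(x: float) -> float:
    """Do something to x (multiply by m, add b) to get the y-value."""
    return m * x + b

# x-values we choose (independent) -> y-values that result (dependent)
points = [(x, f(x)) for x in range(5)]
print(points)  # [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0), (4, 9.0)]
```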

In statistics, the relationship between independent and dependent variables is important because we want to find a statistical pattern connecting the two things. In the examples above we know from real-life experience that the independent variables affect the dependent variables, but in some real-world situations this is not so obvious, so statisticians have to use this vocabulary carefully when describing their findings. For instance, if we wanted to find the relationship between the number of fish eggs in a pond and the pond temperature, it is hard to determine which would cause the other. Maybe a warmer pond makes the fish more active so they lay more eggs; maybe, vice versa, the activity of the fish warms the pond; or maybe there happens to be a correlation or pattern but no causation at all. This is a problem many statisticians face, because sometimes you do not know which variable affects the other (A causes B, or B causes A) or whether they affect each other at all. For example, suppose crime rates have risen over the past few years and so has the global temperature; from that alone we cannot say that crime causes global warming or vice versa.
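One way to see why correlation alone cannot settle the question is to compute it. The sketch below uses made-up temperature and egg-count numbers purely for illustration; the correlation coefficient it prints tells us the two series move together, but nothing about which one (if either) drives the other.

```python
import numpy as np

# Made-up numbers, purely for illustration: pond temperature (deg C)
# and fish-egg counts over six observations. A real study would use measured data.
temperature = np.array([14.0, 15.5, 16.0, 17.2, 18.1, 19.0])
egg_count   = np.array([120,  150,  155,  180,  200,  215])

# Pearson correlation: measures how strongly the two move together,
# but says nothing about which variable causes the other.
r = np.corrcoef(temperature, egg_count)[0, 1]
print(f"correlation r = {r:.2f}")
```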

Sometimes there is even a third variable, C, involved in a relationship. For example, consider the price of gas (A), the temperature (B), and local beach attendance (C). Here A and B can both combine to affect C. Technically speaking, there are infinitely many possible combinations of relationships between variables, so a statistician has to use logic and collected data to determine what relationship, if any, exists between different variables.
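A small simulation can illustrate the idea of two variables jointly driving a third. The formula and coefficients below are invented only to show the shape of such a relationship; they are not a real model of beach attendance.

```python
import random

# Invented toy model: beach attendance (C) depends on both gas price (A)
# and temperature (B). Coefficients are arbitrary, for illustration only.
def beach_attendance(gas_price: float, temp_c: float) -> float:
    base = 500.0
    noise = random.gauss(0, 10)  # everyday randomness in who shows up
    return max(0.0, base - 80.0 * gas_price + 15.0 * temp_c + noise)

random.seed(0)
for gas, temp in [(3.00, 20.0), (3.00, 30.0), (4.50, 30.0)]:
    print(f"gas=${gas:.2f}, temp={temp:.0f}C -> attendance ~ {beach_attendance(gas, temp):.0f}")
```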