Use bivariate scatterplots (constructing them where needed) to describe the patterns, features and associations of bivariate datasets, justifying any conclusions
Describe bivariate datasets in terms of form (linear/non-linear) and in the case of linear, also the direction (positive/negative) and strength of association (strong/moderate/weak)
Identify the dependent and independent variables within bivariate datasets where appropriate
Describe and interpret a variety of bivariate datasets involving two numerical variables using real-world examples in the media or those freely available from government or business datasets
Calculate and interpret Pearson’s correlation coefficient using technology to quantify the strength of a linear association of a sample
A bivariate dataset is often generated by an experiment:
Experiments with Two Variables
One of the simplest experiments we can conduct involves:
An Independent Variable which is changed throughout the experiment, normally called $x$
A Dependent Variable which is measured throughout the experiment, normally called $y$
The reason behind experiments is usually to determine whether or not $x$ influences $y$.
Example
One possible experiment could be:
$x$ - how much chilli we put in a meal
$y$ - a numerical rating for how much we enjoyed the meal
Each meal would have a different amount of chilli and rating. We can think of this as a pair of numbers $(x, y)$.
After conducting the experiment multiple times, we get a sequence of pairs of numbers, which is a bivariate dataset.
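A bivariate dataset can be stored as a list of $(x, y)$ pairs. The sketch below uses made-up values for the chilli experiment; the numbers and the names `meals`, `chilli` and `rating` are illustrative assumptions, not measured data.

```python
# Hypothetical data: (amount of chilli, enjoyment rating) for each meal.
meals = [(0, 4), (2, 6), (4, 7), (6, 8), (8, 5), (10, 3)]

# Split the pairs into the independent (x) and dependent (y) variables.
chilli = [x for x, _ in meals]
rating = [y for _, y in meals]

print(chilli)
print(rating)
```

Keeping the values paired matters: each rating belongs to one particular meal, so the two lists must stay in the same order.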
Correlation
After analysing the dataset, we might find that there is a correlation:
Correlation
Correlation is when a relationship or trend occurs between two variables.
We mainly focus on linear correlation, but there are many types of correlation:
Types of Correlation
The main types of correlation are inspired by elementary functions. Some examples are below.
Linear Correlation
Quadratic Correlation
Exponential Correlation
Trigonometric Correlation
In this course we place most of our emphasis on linear correlation, and normally group all other types as “non-linear”.
No Correlation
Linear Correlation
To describe a linear correlation we need to classify the direction and strength:
Verbally Describing Linear Correlation
Describing Direction
The direction of a linear correlation can either be:
Positive - if $y$ tends to increase as $x$ increases
Negative - if $y$ tends to decrease as $x$ increases
From the scatterplot we can use the approximate gradient of the trend:
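When no plot is at hand, the gradient of the trend can also be estimated numerically. The sketch below uses the least-squares line of best fit, which is an assumption here (the lesson only reads the gradient off the scatterplot); the function names and data are hypothetical.

```python
def trend_gradient(xs, ys):
    """Gradient of the least-squares line of best fit through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def direction(xs, ys):
    """Classify the direction from the sign of the estimated gradient."""
    gradient = trend_gradient(xs, ys)
    if gradient > 0:
        return "positive"
    if gradient < 0:
        return "negative"
    return "no linear trend"


# Hypothetical datasets sharing the same x values.
xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 5, 4, 6]
ys_down = [9, 7, 8, 5, 4]

print(direction(xs, ys_up))
print(direction(xs, ys_down))
```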
Describing Strength
The better the dataset approximates a line, the stronger the linear correlation:
Linear correlation can also be measured with most standard scientific calculators using Pearson’s Correlation Coefficient $r$:
Calculating Linear Correlation
Pearson’s Correlation Coefficient $r$ is a number in the range $-1 \le r \le 1$ that describes both the direction and strength of a linear correlation.
Finding Pearson’s Correlation Coefficient
The video below is a good demonstration of how you can find Pearson’s Correlation Coefficient with a calculator:
Describing Direction
The direction is described by the sign of the coefficient, so if:
$r > 0$ then the linear correlation is positive
$r < 0$ then the linear correlation is negative
$r = 0$ then there is no linear correlation
Describing Strength
The strength is described by the magnitude of the coefficient, $|r|$. A rough guide as to how we can interpret the magnitude, from smallest to largest, is:
Strength:
none
weak
moderate
strong
If we have:
$r = 1$ then the dataset forms a perfect line with positive gradient
$r = -1$ then the dataset forms a perfect line with negative gradient
$-1 < r < 1$ then the dataset only approximates a line
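These three cases can be checked numerically. The sketch below writes out a small `pearson_r` helper (a hypothetical name) using the standard sum-of-products form of the coefficient, and applies it to made-up data.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient from sums of deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [0, 1, 2, 3, 4]
rising = [2 * x + 1 for x in xs]    # points exactly on y = 2x + 1
falling = [10 - 3 * x for x in xs]  # points exactly on y = 10 - 3x
scattered = [1, 4, 3, 7, 6]         # hypothetical points near a rising line

print(pearson_r(xs, rising))
print(pearson_r(xs, falling))
print(pearson_r(xs, scattered))
```

Note that the steepness of a perfect line does not matter: any exact line with positive gradient gives $r = 1$, and any with negative gradient gives $r = -1$.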
Describing Linear Correlation
We can use both of the above ideas to completely describe a linear correlation with Pearson’s Correlation Coefficient. Some examples are:
A dataset with has a positive, moderate linear correlation
A dataset with has a negative, weak linear correlation
A dataset with has a positive, strong linear correlation
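Combining sign and magnitude into a verbal description can be sketched as a small function. The cut-off values below (0.25, 0.5 and 0.75) are illustrative assumptions only; different textbooks use different boundaries, so they are left as adjustable parameters. The function name `describe` is also hypothetical.

```python
def describe(r, none_below=0.25, weak_below=0.5, moderate_below=0.75):
    """Describe a linear correlation in words.

    The three cut-offs are assumed values for illustration, not a standard.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")

    size = abs(r)  # strength comes from the magnitude
    if size < none_below:
        return "no linear correlation"
    if size < weak_below:
        strength = "weak"
    elif size < moderate_below:
        strength = "moderate"
    else:
        strength = "strong"

    direction = "positive" if r > 0 else "negative"  # direction comes from the sign
    return f"{direction}, {strength} linear correlation"


print(describe(0.6))
print(describe(-0.3))
print(describe(0.9))
```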
How the Coefficient Works
While we only need to be able to calculate and understand Pearson’s Correlation Coefficient, it is also useful to know how it works:
Pearson’s Correlation Coefficient
We first find the covariance which provides the direction of correlation:
Understanding Covariance
Covariance
The covariance between two variables is given by:
If the covariance is positive then the direction of correlation is positive
If the covariance is negative then the direction of correlation is negative
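One common way to write the covariance for a sample of $n$ pairs is given below. The symbol $s_{xy}$ and the division by $n$ are assumptions here; some texts divide by $n - 1$ instead, which changes the size of the value but not its sign, so the direction is the same either way.

$$
s_{xy} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
$$

Here $\bar{x}$ and $\bar{y}$ are the averages of the $x$ and $y$ values.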
How Covariance gives Direction
To understand how the covariance formula works, we need to first find the point of averages $(\bar{x}, \bar{y})$:
Considering the $(x - \bar{x})$ term, we know that if:
$x > \bar{x}$ then the term is positive
$x < \bar{x}$ then the term is negative
$x = \bar{x}$ then the term is $0$
Considering the $(y - \bar{y})$ term, we know that if:
$y > \bar{y}$ then the term is positive
$y < \bar{y}$ then the term is negative
$y = \bar{y}$ then the term is $0$
We can now consider the product term $(x - \bar{x})(y - \bar{y})$ if we remember that:
The product of two positive numbers is positive
The product of two negative numbers is positive
The product of a positive and negative number is negative
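So the total sum will be positive if most of the data points tend to be in positive regions, that is, regions where the product term is positive (above and to the right of the point of averages, or below and to the left of it). Similarly, it will be negative if most of the data points tend to be in negative regions (above and to the left, or below and to the right).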
So this means that:
If the linear correlation is positive → most data points tend to be in the positive regions → covariance is positive
If the linear correlation is negative → most data points tend to be in the negative regions → covariance is negative
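The region argument can be illustrated by counting how many points contribute a positive or a negative product. This is a sketch on hypothetical data; the function name `covariance_breakdown` and the division by $n$ are assumptions.

```python
def covariance_breakdown(xs, ys):
    """Count points in positive/negative regions and return the covariance."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]

    in_positive = sum(1 for p in products if p > 0)
    in_negative = sum(1 for p in products if p < 0)
    covariance = sum(products) / n  # assumes the divide-by-n convention
    return in_positive, in_negative, covariance


# Hypothetical dataset with an upward trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

in_positive, in_negative, covariance = covariance_breakdown(xs, ys)
print(in_positive, in_negative, covariance)
```

A point lying exactly level with the point of averages in either direction contributes a product of zero and is counted in neither region.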
We can then modify the covariance to obtain Pearson’s Correlation Coefficient:
Understanding Pearson’s Correlation Coefficient
Pearson’s Correlation Coefficient
We obtain this coefficient by scaling the covariance:
Expanding this formula gives:
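In one common notation (the symbols $s_{xy}$ for the covariance and $s_x$, $s_y$ for the standard deviations of $x$ and $y$ are assumptions here), the scaled form can be written as:

$$
r = \frac{s_{xy}}{s_x \, s_y}
$$

Writing each quantity as a sum over the data points, the divisions by the sample size cancel, leaving:

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

Because the numerator and denominator carry the same units, $r$ has no units and always lies between $-1$ and $1$.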
The Effect of Scaling
We
We can apply these concepts manually to a simple dataset:
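As a sketch of the manual process, the code below works through a hypothetical four-point dataset one step at a time, printing each deviation and product the way they would appear in a hand-drawn table. The data values are invented, and the covariance uses the divide-by-$n$ convention as an assumption.

```python
import math

# Hypothetical four-point dataset chosen so the arithmetic is easy to follow.
data = [(1, 2), (2, 3), (3, 5), (4, 6)]
n = len(data)

# Step 1: find the point of averages.
x_bar = sum(x for x, _ in data) / n
y_bar = sum(y for _, y in data) / n
print(f"point of averages: ({x_bar}, {y_bar})")

# Step 2: tabulate deviations, their product, and running totals.
sum_xy = sum_xx = sum_yy = 0.0
for x, y in data:
    dx = x - x_bar
    dy = y - y_bar
    print(f"x={x} y={y} dx={dx:+.1f} dy={dy:+.1f} product={dx * dy:+.2f}")
    sum_xy += dx * dy
    sum_xx += dx ** 2
    sum_yy += dy ** 2

# Step 3: covariance (sign gives direction), then scale to get r.
covariance = sum_xy / n  # assumes the divide-by-n convention
r = sum_xy / math.sqrt(sum_xx * sum_yy)
print(f"covariance = {covariance:.2f}, r = {r:.3f}")
```

Every product in this dataset is positive, so the covariance is positive and $r$ is close to $1$.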
A couple of fleshed out examples :)
Describing Correlation
Cambridge 2 Unit Year 12 | Exercise 9D
Question 1
Question 2
Question 4
Question 5
Calculating Correlation One
Cambridge 2 Unit Year 12 | Exercise 9E
Question 1 (a, b, c, d, e, f, g)
Question 2
Find the correlation with your calculator
Calculating Correlation Two
Cambridge 2 Unit Year 12 | Exercise 9F
Question 1
Only calculate Pearson's Correlation Coefficient
Question 4 (a, b)
Only calculate Pearson's Correlation Coefficient
Describing Scatterplots
Review Questions
A scatterplot must always follow a linear trend.
False
True
Which of the following is not used to describe a scatterplot?
Accuracy
Strength
Direction
Form
Linear Correlation
Review Questions
What is the range of the linear correlation coefficient?
[-1, 1]
[0, 1)
[0, 1]
(-1, 1)
How can we tell that a scatterplot has a strong correlation? Assume the direction of the scatterplot is positive.
The correlation coefficient is close to -1
The correlation coefficient is close to 1
The correlation coefficient is close to 0