Biserial Correlation

Dr. Meenakshi Shukla

Assistant Professor

Department of Psychology

Magadh University

Bodh Gaya


What is Biserial Correlation?

❑ Suppose you have a set of bivariate data from the bivariate normal distribution. The two variables have a correlation, sometimes called the product-moment correlation coefficient. Now suppose one of the variables is dichotomized by creating a binary variable that is zero if the original variable is less than a certain value and one otherwise.

❑ For example, you may want to calculate the correlation between IQ and the score on a certain test, but the only measurement available is whether the test was passed or failed. You could then use the biserial correlation to estimate the more meaningful product-moment correlation.

• The biserial correlation is a correlation between a continuous variable and a binary variable, where the binary variable is not a true binary variable but a continuous variable that has been dichotomized.

• Biserial correlation ($r_{bis}$ or $r_b$) is a correlational index that estimates the strength of a relationship between an artificially dichotomous variable and a true continuous variable. Both variables are assumed to be normally distributed in their underlying populations.
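The idea that the biserial correlation estimates the product-moment correlation of the undichotomized variables can be illustrated with a small simulation. This is only a sketch: the true correlation `RHO`, the sample size `N`, the cut point `CUT` and the seed are all assumed values chosen for illustration, and the biserial value is computed with the usual mean-difference formula.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(42)   # fixed seed so the run is repeatable (assumption)
RHO = 0.6         # assumed true product-moment correlation
N = 20_000        # assumed sample size
CUT = 0.5         # assumed cut point used to dichotomize

# Draw pairs from a standard bivariate normal distribution with correlation RHO.
x, latent = [], []
for _ in range(N):
    a = random.gauss(0, 1)
    b = RHO * a + math.sqrt(1 - RHO ** 2) * random.gauss(0, 1)
    x.append(a)
    latent.append(b)

# Dichotomize the second variable: 1 at or above the cut point, 0 below it.
binary = [1 if v >= CUT else 0 for v in latent]


def pearson(u, v):
    """Plain product-moment correlation of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


p = sum(binary) / N                               # proportion coded 1
q = 1 - p                                         # proportion coded 0
y = NormalDist().pdf(NormalDist().inv_cdf(p))     # ordinate at the dividing point
m1 = mean([a for a, d in zip(x, binary) if d == 1])
m0 = mean([a for a, d in zip(x, binary) if d == 0])

r_full = pearson(x, latent)            # correlation before dichotomizing
r_pb = pearson(x, binary)              # correlation with the 0/1 variable
r_b = (m1 - m0) / stdev(x) * p * q / y # biserial estimate

print(round(r_full, 3), round(r_pb, 3), round(r_b, 3))
```

In a run like this, the correlation computed directly on the 0/1 variable is noticeably smaller than the original correlation, while the biserial estimate should land close to it.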

Assumptions of Biserial Correlation

• Assumption #1: Both variables should be measured on a continuous scale.

• Assumption #2: One of the variables should then be artificially dichotomized. Examples of such artificial dichotomous variables include Pass or Fail, attendance above or below 75, Happy or Sad, and so forth.

• Assumption #3: There should be no outliers in either of the continuous variables. You can check for outliers using boxplots.

• Assumption #4: The continuous variables should be approximately normally distributed. You can test this using the Shapiro-Wilk test of normality.

• Assumption #5: The continuous variable should have equal variances in the two categories of the dichotomous variable. You can test this using Levene's test of equality of variances.
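As a rough illustration of the outlier check in Assumption #3, the boxplot rule can be applied numerically: values more than 1.5 times the interquartile range beyond the quartiles are flagged. The sketch below uses the 14 study-hours values from the worked example in this deck; the 1.5 × IQR fence is the conventional boxplot rule, and the quartile method is simply the default of Python's `statistics.quantiles`, so other software may give slightly different quartiles.

```python
from statistics import quantiles

# Study hours of the 14 students from the worked example in this deck.
study_hours = [2, 3, 3, 4, 5, 3, 3, 2, 1, 0, 3, 5, 0, 1]

# Quartiles (default "exclusive" method of statistics.quantiles).
q1, _median, q3 = quantiles(study_hours, n=4)
iqr = q3 - q1

# Conventional boxplot fences: 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [h for h in study_hours if h < lower_fence or h > upper_fence]
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

The Shapiro-Wilk and Levene tests are not part of the Python standard library, so they are left out of this sketch.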

Formula:

$$r_b = \frac{M_1 - M_0}{SD_t} \times \frac{pq}{y}$$

Where,

$M_0$ = mean score for data pairs with x = 0,

$M_1$ = mean score for data pairs with x = 1,

q = proportion of data pairs with x = 0,

p = proportion of data pairs with x = 1,

$SD_t$ = standard deviation of the continuous scores for the total sample (both groups combined),

y = ordinate (height) of the standard normal distribution at the point that divides the proportions p and q.
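The formula can be sketched as a small function. The function name `biserial_correlation`, its arguments and the eight data pairs are hypothetical, made up only to show the mechanics. The sketch assumes the standard deviation is computed with n - 1 in the denominator and obtains the ordinate y from the standard normal density instead of a printed table.

```python
from statistics import NormalDist, mean, stdev


def biserial_correlation(groups, scores):
    """Biserial correlation for 0/1 group codes paired with continuous scores.

    Assumption: SD_t is the sample standard deviation (n - 1) of all scores.
    """
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / len(scores)      # proportion coded 1
    q = 1 - p                        # proportion coded 0
    # Ordinate of the standard normal curve at the point dividing p and q.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(ones) - mean(zeros)) / stdev(scores) * (p * q) / y


# Hypothetical data: four cases coded 0 and four coded 1.
groups = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [3, 5, 4, 6, 5, 7, 6, 8]
r_b = biserial_correlation(groups, scores)
print(round(r_b, 2))
```

Because of the pq/y factor, the biserial coefficient is not bounded by 1 in the way a product-moment correlation is, so values slightly above 1 can occur in small or non-normal samples.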

A teacher wants to determine whether there is a relationship between the students' results (Pass or Fail) and the number of hours per week that they devoted to their studies. The data for 14 students are given below. Calculate the biserial correlation from these data:

Result Study hours

Pass 2

Pass 3

Pass 3

Pass 4

Pass 5

Pass 3

Pass 3

Pass 2

Pass 1

Fail 0

Fail 3

Fail 5

Fail 0

Fail 1


Let us code Pass as 1 and Fail as 0. The proportion of students who passed is then denoted by p, and the proportion who failed by q.

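$$M_1 = \frac{2 + 3 + 3 + 4 + 5 + 3 + 3 + 2 + 1}{9} = \frac{26}{9} = 2.89$$

$$M_0 = \frac{0 + 3 + 5 + 0 + 1}{5} = \frac{9}{5} = 1.80$$

$$SD_t = \sqrt{\frac{33.50}{13}} = \sqrt{2.5769} = 1.605$$

Here 33.50 is the sum of squared deviations of all 14 scores from their overall mean of 2.5, and 13 is N - 1.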
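The two group means and the total standard deviation can be checked with a few lines of Python. This is a sketch that takes the data from the table above and assumes, as the division by 13 implies, that the standard deviation uses N - 1 in the denominator.

```python
from statistics import mean, stdev

# Study hours from the table above, split by result.
pass_hours = [2, 3, 3, 4, 5, 3, 3, 2, 1]   # coded 1
fail_hours = [0, 3, 5, 0, 1]               # coded 0
all_hours = pass_hours + fail_hours

m1 = mean(pass_hours)        # mean of the Pass group
m0 = mean(fail_hours)        # mean of the Fail group
grand_mean = mean(all_hours) # overall mean of all 14 scores

# Sum of squared deviations from the overall mean, then SD with N - 1.
sum_sq = sum((h - grand_mean) ** 2 for h in all_hours)
sd_t = stdev(all_hours)

print(round(m1, 2), round(m0, 2), sum_sq, round(sd_t, 3))
```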


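p = 9/14 = .64

q = 5/14 = .36

Area from the mean to the dividing point = .50 - .36 = .14

• In the ordinate table, look up .14 under 'Area from mean' and read off the corresponding value of 'y', which is the ordinate. Here y = .3739.

[Figure: standard normal curve split 50% / 50% at the mean, with the areas p and q marked on either side of the dividing point.]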
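The table lookup can also be reproduced in software. The sketch below uses Python's `statistics.NormalDist` as a stand-in for the printed ordinate table (an assumption, since the slides work from a table); because it uses unrounded proportions and no table interpolation, the ordinate it gives may differ from the table value in the third decimal place.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal distribution (mean 0, SD 1)

p = 9 / 14                  # proportion of Pass
q = 5 / 14                  # proportion of Fail

# Area between the mean and the point that separates p from q.
area_from_mean = 0.5 - q

# z value at the dividing point, then the height of the curve there.
z_cut = std_normal.inv_cdf(0.5 + area_from_mean)
y = std_normal.pdf(z_cut)

print(round(area_from_mean, 2), round(z_cut, 2), round(y, 4))
```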


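$$
\begin{aligned}
r_b &= \frac{M_1 - M_0}{SD_t} \times \frac{pq}{y} \\
    &= \frac{2.89 - 1.80}{1.605} \times \frac{.64 \times .36}{.3739} \\
    &= \frac{1.09}{1.605} \times \frac{.2304}{.3739} \\
    &= .6791 \times .6162 \\
    &= .42
\end{aligned}
$$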
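As a check on the arithmetic, the same calculation can be run once with the rounded values used on the slide and once without intermediate rounding. This is a sketch: the unrounded version takes y from the standard normal density instead of the ordinate table, so the two results agree only after rounding to two decimals.

```python
from statistics import NormalDist, mean, stdev

# 1) Plugging in the rounded values exactly as on the slide.
r_b_slide = (2.89 - 1.80) / 1.605 * (0.64 * 0.36) / 0.3739

# 2) Same formula without intermediate rounding.
pass_hours = [2, 3, 3, 4, 5, 3, 3, 2, 1]
fail_hours = [0, 3, 5, 0, 1]
all_hours = pass_hours + fail_hours

p = len(pass_hours) / len(all_hours)
q = 1 - p
y = NormalDist().pdf(NormalDist().inv_cdf(p))   # ordinate at the dividing point
r_b_exact = (mean(pass_hours) - mean(fail_hours)) / stdev(all_hours) * p * q / y

print(round(r_b_slide, 2), round(r_b_exact, 2))
```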

Calculating biserial correlation from point-biserial correlation, and vice-versa

$$r_b = r_{pb} \times \frac{\sqrt{pq}}{y} \qquad \text{and} \qquad r_{pb} = r_b \times \frac{y}{\sqrt{pq}}$$
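The conversion in both directions can be sketched as two small helper functions. The names `rpb_to_rb` and `rb_to_rpb` are hypothetical. The sketch assumes the relation $r_b = r_{pb}\sqrt{pq}/y$, which follows from the biserial formula when the point-biserial correlation is written as $(M_1 - M_0)/SD_t \times \sqrt{pq}$, and it takes y from the standard normal density.

```python
import math
from statistics import NormalDist


def ordinate(p):
    """Height of the standard normal curve at the point dividing p and 1 - p."""
    return NormalDist().pdf(NormalDist().inv_cdf(p))


def rpb_to_rb(r_pb, p):
    """Convert a point-biserial correlation to a biserial correlation."""
    return r_pb * math.sqrt(p * (1 - p)) / ordinate(p)


def rb_to_rpb(r_b, p):
    """Convert a biserial correlation to a point-biserial correlation."""
    return r_b * ordinate(p) / math.sqrt(p * (1 - p))


# Using the worked example: r_b = .42 with p = 9/14.
r_pb = rb_to_rpb(0.42, 9 / 14)
r_b_back = rpb_to_rb(r_pb, 9 / 14)
print(round(r_pb, 2), round(r_b_back, 2))
```

Since $\sqrt{pq}/y$ is always greater than 1, the biserial value is larger in magnitude than the corresponding point-biserial value.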

Significance testing

z = [formula not recoverable from the source; surviving terms: 2 × .42, 5, 12, 514] = -.27

• Using the z-table, check the p-value to determine the significance of the biserial correlation.

• Remember to multiply the table value by 2 to get a two-tailed p-value in the case of a two-tailed hypothesis.

• The p-value obtained from the table is .39358. Since this is the p-value for a one-tailed test, multiply it by 2 to get the p-value for a two-tailed test.

• If you have a specific one-tailed hypothesis, you can use the one-tailed value from the table and do not need to multiply it by 2.

• To recap the concept of one-tailed and two-tailed tests, see the next two slides.

p-value for two-tailed test:

= .39358 × 2

= .78716

= .79

Since the p-value is .79, the biserial correlation is non-significant. This means that there is no significant relationship between result and study hours.
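The z-table lookup can be reproduced with the standard normal cumulative distribution function. The sketch below starts from the z value of -.27 reported above and uses Python's `statistics.NormalDist` in place of the printed table; the .05 significance level is an assumed conventional threshold, not one stated on the slides.

```python
from statistics import NormalDist

z = -0.27        # test statistic reported above
alpha = 0.05     # assumed conventional significance level

# One-tailed p-value: area in the tail beyond |z|.
p_one_tailed = NormalDist().cdf(-abs(z))

# Two-tailed p-value: double the one-tailed area.
p_two_tailed = 2 * p_one_tailed

print(round(p_one_tailed, 5), round(p_two_tailed, 2))
print("significant" if p_two_tailed < alpha else "non-significant")
```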

Practice question 1:

Question: From the following data, obtain the biserial correlation and interpret the result.

Negative affectivity    Score on Beck Depression Inventory

High 0

Low 12

High 14

High 54

Low 12

High 60

Low 43

Low 36

Low 9

High 58


Practice question 2:

Question: From the following data, obtain the biserial correlation and interpret the result.

Results IQ

Above average 80

Above average 85

Above average 90

Above average 104

Above average 88

Above average 110

Below average 100

Below average 110

Below average 98

Below average 88

Help: https://www.youtube.com/watch?v=RwqkiTDCgnc&t=699s

Practice question 3:

Question: From the following data, obtain the biserial correlation and interpret the result. (Hint: Use the Assumed Mean method to calculate the means, and then follow the regular process of calculating the biserial correlation.)

Class Interval    X (Trained)    Y (Untrained)

46-50    2    3

41-45    1    4

36-40    3    5

31-35    4    5

26-30    2    2

21-25    5    5

16-20    2    4

11-15    2    1

6-10    2    2

0-5    1    1

Help: https://www.youtube.com/watch?v=RwqkiTDCgnc&t=699s
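The Assumed Mean method mentioned in the hint can be sketched as follows: pick an assumed mean A, take each class midpoint's deviation d from A, and compute A + Σfd / N. The function name `assumed_mean`, the choice A = 28 and the use of class midpoints (including 2.5 for the 0-5 interval) are assumptions made for this illustration; only the mean step is shown, and the rest of the biserial calculation is left as the exercise intends.

```python
# Class midpoints for the intervals 46-50 down to 0-5 (assumed representation).
midpoints = [48, 43, 38, 33, 28, 23, 18, 13, 8, 2.5]
trained = [2, 1, 3, 4, 2, 5, 2, 2, 2, 1]      # frequencies, X (Trained)
untrained = [3, 4, 5, 5, 2, 5, 4, 1, 2, 1]    # frequencies, Y (Untrained)


def assumed_mean(mids, freqs, assumed):
    """Mean of grouped data via the assumed-mean method: A + sum(f * d) / N."""
    n = sum(freqs)
    sum_fd = sum(f * (m - assumed) for m, f in zip(mids, freqs))
    return assumed + sum_fd / n


def direct_mean(mids, freqs):
    """Ordinary weighted mean of the midpoints, used here only as a check."""
    return sum(f * m for m, f in zip(mids, freqs)) / sum(freqs)


mean_trained = assumed_mean(midpoints, trained, assumed=28)
mean_untrained = assumed_mean(midpoints, untrained, assumed=28)
print(round(mean_trained, 2), round(mean_untrained, 2))
```

Whatever value is chosen for A, the result equals the directly computed weighted mean; the method only simplifies the hand arithmetic.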

Biserial Correlation using SPSS

• A teacher wants to determine whether there is a relationship between the students' results (Pass or Fail) and the number of hours per week that they devoted to their studies. Data for 40 students are available.

• Therefore, two variables were created in the Variable View of SPSS Statistics: Result, which had two categories ("Pass" and "Fail"), and StudyHours (i.e., a variable denoting the number of hours per week that a student devoted to studies).

Click Analyze > Correlate > Bivariate... on the top menu, as shown below:

You will be presented with the following Bivariate Correlations screen:

SPSS Output

Thank you…
