Calculating Chi-squared with Perl

There are two Perl modules available on CPAN that deal with Chi-squared analysis (Statistics::ChiSquare and Statistics::Distributions). However, neither one outputs the Chi-squared value for a comparison of two binary populations, that is, for a 2x2 contingency table.

We can use the formula below to calculate the Chi-squared value with one degree of freedom.

χ² = [n(ad - bc)²] / [(a + b) (c + d) (a + c) (b + d)]

n = a + b + c + d

Where:

variable   population 1   population 2
+          a              b
-          c              d

Example:
Suppose we wish to determine the relationship between disease in two species. Both disease and the species are binary variables, so the Chi-squared test is applied:

Diseased species 1 species 2
No 57 36
Yes 63 88

n = (57 + 36 + 63 + 88) = 244

χ² = [244*(57*88 - 36*63)²] / [(57 + 36) (63 + 88) (57 + 63) (36 + 88)]

χ² = 8.82
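
As a cross-check on the arithmetic above, the same statistic can be obtained from the general Pearson definition, summing (observed - expected)² / expected over the four cells. For a 2x2 table this is algebraically the same as the shortcut formula. The sketch below is illustrative only; the variable names (@observed, @row, @col) are my own and not from any module.

```perl
use strict;
use warnings;

# Observed counts from the example: rows are No/Yes, columns are species 1/2.
my @observed = ( [57, 36], [63, 88] );

# Row totals, column totals and grand total.
my @row = map { $_->[0] + $_->[1] } @observed;
my @col = map { my $j = $_; $observed[0][$j] + $observed[1][$j] } 0 .. 1;
my $n   = $row[0] + $row[1];

# Sum (observed - expected)^2 / expected over the four cells,
# where expected = row total * column total / grand total.
my $chi2 = 0;
for my $i (0 .. 1) {
    for my $j (0 .. 1) {
        my $expected = $row[$i] * $col[$j] / $n;
        $chi2 += ($observed[$i][$j] - $expected) ** 2 / $expected;
    }
}

printf "n = %d, chi-squared = %.2f\n", $n, $chi2;   # n = 244, chi-squared = 8.82
```

Both routes give the same value, so the shortcut formula is simply a more convenient way to compute it when the table is 2x2.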

The critical values of the Chi-squared distribution at 1 degree of freedom, for several P-values, are:

D.F.   0.1    0.05   0.025   0.01   0.005
1      2.71   3.84   5.02    6.63   7.88

The χ² value (8.82) exceeds the critical value for P = 0.005 (7.88), so the corresponding P-value is below 0.005.

Since the P-value is less than 0.05 (P<0.05), we reject the null hypothesis that disease status is independent of species. The data suggest that the prevalence of disease is significantly higher in species 2 (88 of 124, about 71%) than in species 1 (63 of 120, 52.5%).
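
Instead of bracketing the P-value with a table, it can be computed directly. The sketch below relies on the fact that, with 1 degree of freedom only, the upper-tail probability of a Chi-squared value x equals erfc(sqrt(x / 2)). It assumes Perl 5.22 or later, where the core POSIX module provides erfc. The subroutine name chi2_pvalue_1df is hypothetical and chosen for illustration.

```perl
use strict;
use warnings;
use POSIX ();   # POSIX::erfc is assumed available (Perl 5.22 or later)

# Upper-tail probability of a Chi-squared statistic with 1 degree of freedom.
# Valid for 1 d.f. only: P(X > x) = erfc( sqrt(x / 2) ).
sub chi2_pvalue_1df {
    my ($chi2) = @_;
    return 1 if $chi2 <= 0;               # no evidence against the null hypothesis
    return POSIX::erfc( sqrt($chi2 / 2) );
}

my $p = chi2_pvalue_1df(8.8178);
printf "P = %.4f\n", $p;
```

For the example this gives a P-value of roughly 0.003, consistent with the table lookup (below 0.005). For tables with more degrees of freedom this shortcut does not apply, and a module such as Statistics::Distributions would be the more natural choice.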

Below is a Perl subroutine to automatically calculate Chi-squared.

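sub chi_squared {
     # Cell counts a, b, c, d of the 2x2 table, in the order used by the formula.
     # The names $a and $b are avoided because Perl's sort() uses them.
     my ($n_a, $n_b, $n_c, $n_d) = @_;
     my $n = $n_a + $n_b + $n_c + $n_d;
     my $denominator = ($n_a + $n_b) * ($n_c + $n_d) * ($n_a + $n_c) * ($n_b + $n_d);
     # If any row or column total is zero the statistic is undefined; return 0 in that case.
     return 0 if $denominator == 0;
     return $n * ($n_a * $n_d - $n_b * $n_c)**2 / $denominator;
}
print chi_squared(57, 36, 63, 88), "\n";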

Output:

8.81780430153469