• Welcome to Valhalla Legends Archive.
 

Covariance and correlation

Started by jd, May 31, 2004, 09:30 PM


jd

I am having trouble with a simple statistics problem. I have all of the numbers right, but I am either plugging them into the formula wrong or computing it wrong.
 X    Y
-3    4
 2    5
 0    3
 4   -2
-3   -3
I am coming up with a covariance of -0.25, and I know that the correlation is -0.022, but I am coming up with 0.017. Where am I going wrong?

jd

I figured it out! Covariance is correct at -0.25 and correlation is -0.022. I was just reading the problem wrong. Thanks.

Yoni

How did you do that? Please give an overview and show the formulas you used.

Adron

Hmm, I'm not getting the same results... Maybe I've forgotten statistics...


 x    y
-3    4
 2    5
 0    3
 4   -2
-3   -3

xmean = (-3 + 2 + 0 + 4 - 3) / 5 = 0
ymean = (4 + 5 + 3 - 2 - 3) / 5 = 7/5 = 1.4


xnorm  ynorm
  -3    2.6
   2    3.6
   0    1.6
   4   -3.4
  -3   -4.4

cov = ((-3)*2.6 + 2*3.6 + 0*1.6 + 4*(-3.4) + (-3)*(-4.4)) / 5 = -1/5 = -0.2
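
Adron's steps above can be reproduced with a short Python sketch. The function name `covariance` and its `ddof` parameter are illustrative choices, not anything from the thread. With `ddof=0` the sum of products is divided by n, as Adron does; with `ddof=1` it is divided by n - 1, as jd's formula later in the thread does. That difference in divisor appears to be the whole gap between -0.2 and -0.25.

```python
def covariance(xs, ys, ddof=0):
    """Covariance of two equal-length sequences.

    ddof=0 divides by n (population form); ddof=1 divides by n - 1 (sample form).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations from the means (the xnorm * ynorm terms).
    total = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return total / (n - ddof)


x = [-3, 2, 0, 4, -3]
y = [4, 5, 3, -2, -3]

print(round(covariance(x, y), 4))          # -0.2   (divide by n)
print(round(covariance(x, y, ddof=1), 4))  # -0.25  (divide by n - 1)
```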

jd

sum of X= 0
sum of Y = 7
sum of XY= -1
Sum of X squared= 38
sum of Y squared= 63

covariance

s = (-1 - (0)(7)/5) / 4 = (-1 - 0) / 4 = -1/4 = -0.25

correlation coefficient

r = (5(-1) - (0)(7)) / sqrt([5(38) - 0][5(63) - 49])
r = (-5 - 0) / sqrt((190)(266)) = -5 / sqrt(50540) = -5 / 224.811 = -0.022
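
The sums and both results can be checked with a few lines of Python. This is only a sketch of the computational (raw-sum) formulas as they appear to be used here; the variable names are made up for the example.

```python
import math

x = [-3, 2, 0, 4, -3]
y = [4, 5, 3, -2, -3]
n = len(x)

# Raw sums, matching the list of sums in the post.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Sample covariance via the computational formula (divides by n - 1).
s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

# Correlation coefficient via the computational formula.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 0 7 -1 38 63
print(s_xy)         # -0.25
print(round(r, 3))  # -0.022
```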

Sorry it took me so long to post this, and sorry if it is a little confusing. I couldn't paste the statistical notation into the message box.





Yoni

That looks interesting. What formulas did you use exactly? And what is the meaning of the covariance and correlation?

Adron

IIRC:

average(x) = E(x) = expected value of x

variance(x) = E((x-E(x))^2) = expected value of the squared difference between x and its average

covariance(x, y) = E( (x-E(x)) * (y - E(y)) ) = expected value of (difference of x from its average) times (difference of y from its average)

y can be another signal/series or just a time-delayed version of x. When y is a time-delayed version of x (the autocovariance), you're measuring the frequency content of the signal x.
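
To illustrate the time-delay remark, here is a minimal sketch that computes the covariance of a signal with a delayed copy of itself for several lags. The test signal (a sine wave with a period of 8 samples) and the helper name `autocovariance` are assumptions made for the example; nothing like them appears in the thread.

```python
import math


def autocovariance(signal, lag):
    """Covariance of a signal with a copy of itself delayed by `lag` samples.

    Uses the overall mean and divides by the number of overlapping pairs.
    """
    n = len(signal)
    mean = sum(signal) / n
    pairs = zip(signal[: n - lag], signal[lag:])
    return sum((a - mean) * (b - mean) for a, b in pairs) / (n - lag)


# Hypothetical test signal: sine wave with a period of 8 samples, 64 samples long.
period = 8
signal = [math.sin(2 * math.pi * t / period) for t in range(64)]

for lag in range(0, 9):
    print(lag, round(autocovariance(signal, lag), 3))
```

For this signal the value is largest at lag 0 and again at lag 8 (one full period), and most negative at lag 4 (half a period), which is how the delayed covariance exposes the period of the signal.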


Edit:

Correlation measures the same thing as covariance but is normalized, so it always lies between -1 and 1 and does not depend on the units of x and y.

standard deviation, stddev(x) = sqrt(variance(x))

correlation(x, y) = covariance(x, y) / (stddev(x) * stddev(y))
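
The definitions above translate almost line for line into Python. This is a sketch, not a reference implementation: the function names mirror the formulas, and the `ddof` parameter (0 to divide by n, 1 to divide by n - 1) is an addition for illustration. It also shows that the choice of divisor cancels in the correlation, so both conventions give the same r for the data in this thread.

```python
import math


def mean(values):
    """E(x): the arithmetic average."""
    return sum(values) / len(values)


def covariance(xs, ys, ddof=0):
    """E((x - E(x)) * (y - E(y))), dividing by n - ddof."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - ddof)


def stddev(xs, ddof=0):
    """Square root of the variance, where variance(x) = covariance(x, x)."""
    return math.sqrt(covariance(xs, xs, ddof))


def correlation(xs, ys, ddof=0):
    """Covariance normalized by both standard deviations."""
    return covariance(xs, ys, ddof) / (stddev(xs, ddof) * stddev(ys, ddof))


x = [-3, 2, 0, 4, -3]
y = [4, 5, 3, -2, -3]

print(round(correlation(x, y, ddof=0), 3))  # -0.022
print(round(correlation(x, y, ddof=1), 3))  # -0.022
```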


Adron

And on another note, Maple agrees with me about the covariance of those sequences:

Quote
with(stats):
describe[covariance]([-3,2,0,4,-3], [4,5,3,-2,-3]);

                                -1/5
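
For anyone without Maple, the same exact result can be reproduced in Python with rational arithmetic. This is a sketch that assumes Maple's `describe[covariance]` divides by n here, which is what its -1/5 output suggests.

```python
from fractions import Fraction

x = [-3, 2, 0, 4, -3]
y = [4, 5, 3, -2, -3]
n = len(x)

# Exact means as rational numbers.
x_mean = Fraction(sum(x), n)
y_mean = Fraction(sum(y), n)

# Exact sum of products of deviations from the means.
total = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))

print(total / n)  # -1/5
```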