Z-TEST

FISHER'S Z-TEST

The significance of the correlation coefficient in small samples can also be tested with the help of the Z-test given by Prof. Fisher. The sampling distribution of Z is very close to the normal curve. It is practically independent of the degree of correlation present in the population.

Fisher's Z-test, also known as the Z-transformation or Fisher's Z-transformation, is a statistical method used to assess the significance of the correlation coefficient (r) between two variables. It is named after Sir Ronald A. Fisher, the renowned statistician who contributed significantly to the field of statistics.

The primary purpose of Fisher's Z-test is to determine whether the observed correlation between two variables is statistically significant or whether it could have occurred by random chance. The test is particularly useful with small sample sizes, where the sampling distribution of the correlation coefficient is not approximately normal.

Here are the key steps and components of Fisher's Z-test:

Calculate the Sample Correlation (r): Start by calculating the sample correlation coefficient (r) between the two variables of interest. This coefficient measures the strength and direction of the linear relationship between the variables.

Transform r into the Z-score: Fisher's Z-test transforms the sample correlation coefficient (r) into a Z-score using the following formula:

Z = 0.5 * ln((1 + r) / (1 - r))

where:

Z is the Z-score.

ln represents the natural logarithm.

Compute the Standard Error (SE) of Z: The standard error of the Z-score is calculated to assess the precision of the Z-score estimate. The formula for SE is:

SE = 1 / √(n - 3)

where n is the sample size.

Calculate the Z-statistic: The Z-statistic is obtained by dividing the Z-score by its standard error:

Z-statistic = Z / SE

Determine Significance: The Z-statistic is compared to the critical values from the standard normal distribution (Z-distribution) table at a chosen significance level (alpha). If the absolute value of the Z-statistic is greater than the critical value (usually corresponding to the chosen alpha level), then the correlation coefficient (r) is considered statistically significant.

In summary, Fisher's Z-test is used to assess whether the correlation between two variables is statistically significant. It does this by transforming the correlation coefficient (r) into a Z-score and comparing it to critical values from the standard normal distribution. If the Z-score is beyond the critical value, it suggests that the correlation is unlikely to be due to random chance.

Fisher's Z-test is particularly valuable when working with small sample sizes or when the distribution of the correlation coefficient is non-normal, as it provides a more robust method for assessing the significance of correlations in such situations.
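As a sketch, the five steps above can be put together in a short Python function. The sample values r = 0.6 and n = 30 are illustrative assumptions, not taken from the text:

```python
import math

def fisher_z_test(r, n):
    """Test whether a sample correlation r (from n pairs)
    differs significantly from zero via Fisher's Z-transformation."""
    # Steps 1-2: transform r into the Z-score
    z = 0.5 * math.log((1 + r) / (1 - r))
    # Step 3: standard error of Z
    se = 1 / math.sqrt(n - 3)
    # Step 4: the Z-statistic
    z_stat = z / se
    # Step 5: compare with the critical value (1.96 at the 5% level, two-tailed)
    return z_stat, abs(z_stat) > 1.96

z_stat, significant = fisher_z_test(r=0.6, n=30)
print(round(z_stat, 4), significant)  # 3.6017 True
```

Since 3.6017 exceeds 1.96, a correlation of 0.6 in a sample of 30 would be judged statistically significant at the 5% level.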



The Z-test is performed for two purposes:

a) To test whether an observed value of r differs significantly from some hypothetical value:

Z = (z - ξ) / S.E.

where:

z = 1.1513 log10[(1 + r) / (1 - r)]

ξ = 1.1513 log10[(1 + ρ) / (1 - ρ)]

ρ = hypothetical correlation coefficient

S.E. = Standard Error = 1/√(n - 3), where n = size of the sample

Thus the computed value of Z is compared with the table value 1.96. If this value exceeds 1.96 (at the 5% significance level), the difference between the observed and hypothetical correlation coefficients is considered significant.
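A minimal sketch of case (a) in Python, using the natural-log form of the transformation (0.5 ln[(1+r)/(1-r)], which equals 1.1513 log10[(1+r)/(1-r)]). The values r = 0.8, ρ = 0.6 and n = 28 are illustrative assumptions:

```python
import math

def z_transform(r):
    # 0.5 * ln((1+r)/(1-r)) is the same as 1.1513 * log10((1+r)/(1-r))
    return 0.5 * math.log((1 + r) / (1 - r))

# Illustrative values: observed r = 0.8 from n = 28 pairs,
# hypothetical population correlation rho = 0.6
r, rho, n = 0.8, 0.6, 28
z = z_transform(r)          # transformed sample correlation
xi = z_transform(rho)       # transformed hypothetical correlation
se = 1 / math.sqrt(n - 3)   # standard error
Z = (z - xi) / se
print(round(Z, 4), abs(Z) > 1.96)  # 2.0273 True
```

Here Z exceeds 1.96, so the observed r = 0.8 differs significantly from the hypothetical ρ = 0.6 at the 5% level.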

b) To test whether the difference between the correlation coefficients of two small samples is significant or not:

Z = (Z1 - Z2) / S.E.

where:

Z1 = 1.1513 log10[(1 + r1) / (1 - r1)]

Z2 = 1.1513 log10[(1 + r2) / (1 - r2)]

S.E. = √[1/(n1 - 3) + 1/(n2 - 3)]

Here, r1 = Correlation coefficient of the first sample

 r2 = Correlation coefficient of the second sample

n1= Size of the first sample

n2= Size of the second sample.

Here also, the computed value of Z is compared with 1.96 at the 5% significance level, and it is decided whether the difference is significant or not.
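Case (b) can be sketched the same way. The values r1 = 0.5, n1 = 19, r2 = 0.3, n2 = 23 are illustrative assumptions:

```python
import math

def z_transform(r):
    # Fisher's Z-transformation of a correlation coefficient
    return 0.5 * math.log((1 + r) / (1 - r))

# Illustrative values: r1 = 0.5 from n1 = 19 pairs, r2 = 0.3 from n2 = 23 pairs
r1, n1 = 0.5, 19
r2, n2 = 0.3, 23
z1, z2 = z_transform(r1), z_transform(r2)
# Standard error of the difference between the two transformed coefficients
se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
Z = (z1 - z2) / se
print(round(Z, 4), abs(Z) > 1.96)  # 0.7149 False
```

Since Z is well below 1.96, these two correlation coefficients would not be judged significantly different at the 5% level.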

Z-scores in Python

Standardizing a distribution to a z-score means converting it from a raw distribution with a given mean and variance to a distribution with a mean of zero and a variance of one.
#Make data
data = {'ac' : [70, 67, 65, 75, 76, 73, 69, 68, 70, 76, 77, 75,
                85, 86, 85, 76, 75, 73, 95, 94, 89, 94, 93, 91],
        'teach' : [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
                   3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4],
        'text' : [1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2,
                  1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2]}
import pandas as pd #library for data frames
#Make data frame
df = pd.DataFrame(data)
df
import statistics
ac = df['ac']
z_numerator = ac - statistics.mean(ac)
z_denominator = statistics.stdev(ac)
z = z_numerator / z_denominator
z
0    -0.937148
1    -1.248091
2    -1.455387
3    -0.418910
4    -0.315262
5    -0.626205
6    -1.040796
7    -1.144444
8    -0.937148
9    -0.315262
10   -0.211614
11   -0.418910
12    0.617568
13    0.721215
14    0.617568
15   -0.315262
16   -0.418910
17   -0.626205
18    1.654045
19    1.550397
20    1.032159
21    1.550397
22    1.446750
23    1.239454
Name: ac, dtype: float64

from scipy import stats
ac = df['ac']
stats.zscore(ac, axis=0, ddof=1)
array([-0.93714826, -1.24809146, -1.45538693, -0.41890959, -0.31526186, -0.62620506, -1.04079599, -1.14444373, -0.93714826, -0.31526186, -0.21161412, -0.41890959, 0.61756775, 0.72121548, 0.61756775, -0.31526186, -0.41890959, -0.62620506, 1.65404508, 1.55039735, 1.03215868, 1.55039735, 1.44674962, 1.23945415])
import matplotlib.pyplot as plt
ac = df['ac']
z = stats.zscore(ac, axis=0, ddof=1)
ac_hist = plt.hist(ac)
z_hist = plt.hist(z)
import scipy as sp
sp.stats.skew(ac)
0.3882748054677566
sp.stats.kurtosis(ac)
-1.2190774612249433


