## CrossTab

### Testing proportions for significance (Z-test)

When survey data show that 65% of the persons interviewed answered *Yes* to a certain question and 23% answered *No*, there is little doubt that the two values differ significantly. But what if 46% answered *Yes* and 51% answered *No*? Can we still say that the values differ, given the conditions under which the sample was drawn (say, a 95% confidence level, a 5% error level, and the hypothesis of the study set at 0.5)? To answer this question we need an appropriate test statistic, and the Z-test for the difference between two proportions from independent samples fits the contingency table case.

Significance testing can be very useful when screening large numbers of contingency tables, for example from a marketing survey. With significance values available, one can very quickly locate the data that drive the most substantial differences in the tables.

The table above shows in cell D10 that 33.3% of English-speaking respondents belong to Client Class C. The string *c7* in cell D10 tells us that the proportion 33.3% is *significantly* larger than the 18.3% in cell C9, so we can conclude, at the tested level, that there are more English-speaking clients in Class C than in Class B. (Here *c7* means column 7, counting from the leftmost column of the table. The example table is taken from a larger analysis, which is why the column numbers start at c6 rather than c1.) Similarly, the string *c7c8* in D10 tells us that 48.4% differs significantly from both 33.3% and 18.3%. Finally, when two proportions do not differ at a significant level, CrossTab does not print any small-caps letters.

The last row of text below each table shows the level at which proportions are tested, which users typically set between 90% and 99%.
Cell counts smaller than 30 are flagged as *\*\* Small base*, and these values should be interpreted with caution; counts below 6 are flagged as *\*\*\* Very small base*.

Statistically speaking, the null ($H_0$) and alternative ($H_1$) hypotheses CrossTab sets are:

$$H_0: p_1 = p_2 \quad \text{or the difference} \quad p_1 - p_2 = 0$$

$$H_1: p_1 \neq p_2 \quad \text{or the difference} \quad p_1 - p_2 \neq 0$$

The test statistic, Z, is approximated by a standard normal distribution:

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}\,(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Where:

- $\hat{p}_1$ = Proportion of successes in sample 1.
- $\hat{p}_2$ = Proportion of successes in sample 2.
- $\hat{p}_i$ = Sample proportion from population $i$.
- $n_i$ = Size of sample $i$.
- $\bar{p}$ = Pooled estimate for the population proportion, $\bar{p} = \dfrac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$.

The probability used to decide whether to reject $H_0$ is taken from a standard normal distribution with mean 0 and standard deviation 1. The equation for the standard normal density function is:

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$

The null hypothesis is rejected if the Z value falls outside the critical values of the standard normal distribution. In other words, when the achieved probability is higher than the probability stated by the user, the two proportions are recognized as significantly different, the null hypothesis is rejected in favor of the alternative, and the column identifier appears under the significant value.
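
As an illustration of the test described above, here is a minimal Python sketch of a pooled two-proportion Z-test. It is not CrossTab's own code. The function names (`two_proportion_z_test`, `is_significant`, `base_label`) and the example counts are hypothetical, and the sketch assumes a two-sided test, so the achieved probability is computed as P(|Z| ≤ |z|). CrossTab's exact decision rule may differ.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, achieved_probability) for two independent samples.

    x1, x2 are the counts of successes; n1, n2 are the sample sizes.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    p1 = x1 / n1                      # sample proportion 1
    p2 = x2 / n2                      # sample proportion 2
    pooled = (x1 + x2) / (n1 + n2)    # pooled estimate of the proportion
    variance = pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2)
    if variance == 0.0:
        raise ValueError("pooled proportion is 0 or 1; Z is undefined")
    z = (p1 - p2) / math.sqrt(variance)
    # Assumption: two-sided test. For a standard normal variable,
    # P(|Z| <= |z|) equals erf(|z| / sqrt(2)).
    achieved = math.erf(abs(z) / math.sqrt(2.0))
    return z, achieved


def is_significant(x1, n1, x2, n2, level=0.95):
    """True when the achieved probability exceeds the user-stated level."""
    _, achieved = two_proportion_z_test(x1, n1, x2, n2)
    return achieved > level


def base_label(count):
    """Label a cell count using the small-base thresholds from the text."""
    if count < 6:
        return "*** Very small base"
    if count < 30:
        return "** Small base"
    return ""


# Hypothetical counts chosen to give 33.3% and 18.3% (40/120 vs 22/120).
z, achieved = two_proportion_z_test(40, 120, 22, 120)
print(f"z = {z:.3f}, achieved probability = {achieved:.4f}")
print("significant at 95%:", is_significant(40, 120, 22, 120, level=0.95))

# Hypothetical counts giving 46% and 51% from two samples of 200 each.
print("significant at 95%:", is_significant(92, 200, 102, 200, level=0.95))
```

With these illustrative counts the first pair of proportions passes the 95% level and the second pair does not, which matches the intuition in the introduction: a gap of a few percentage points on a moderate base is usually not enough to reject the null hypothesis.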