Non-parametric test for an effect of an intervention in a stratified population.
A company is testing a sales software package. Its sales force of 500 people is divided into four regions: Northeast, Southeast, Central and West. Each salesperson is expected to sell the same amount of product. During the last 3 months, only half of the sales representatives in each region were given the software program to help them manage their contacts.
The VP of Sales at the company, who is comfortable with statistics, wants to know the possible null and alternative hypotheses for a non-parametric test on this data using the chi-squared distribution. A non-parametric test is used on data that is qualitative or categorical, such as gender, age group, region, and color. It is used when it doesn’t make sense to look at the mean of such variables.
How could a chi-squared test of the effect of introducing the software program be set up and conducted in this scenario?
One way to test for an effect of the software program, taking each region in turn, is a non-parametric chi-squared test of independence. For each region, create a 2x2 contingency table that cross-classifies the salespeople by change in sales performance (improved or worsened) and by whether they were given the software program. If the software affects sales, whether by increasing them or by worsening them, the two variables are not independent, and the test would be expected to indicate this. The null hypothesis is then that the software has no effect (performance is independent of assignment), and the alternative is that the software has an effect (beneficial or counterproductive).
So the 2x2 table for each region would look like this:
performance \ assigned software program? |  yes  |  no   | total
improved                                 |   a   |   b   |  n1
worsened                                 |   c   |   d   |  n2
total                                    |  m1   |  m2   |  Nk
The chi-squared test for independence of the two variables 'performance' and 'assigned software program?' takes the form
`C = sum_(i=1)^2 sum_(j=1)^2 (O_(i,j)- E_(i,j))^2/E_(i,j) `
where `i ` denotes rows and `j ` denotes columns, and `O_(i,j) ` and `E_(i,j) ` are the observed and expected counts in the `(i,j) `th cell.
Straightforwardly, the `O_(i,j) ` are the observed counts in the cells, i.e. `a, b, c, d ` as shown. If half of the salespeople in the region are assigned the software, then the column totals `m_1 ` (given the software) and `m_2 ` (not given it) are equal to each other and to `N_k/2 `, where `N_k ` is the total number of salespeople in that region (indexed by `k `). The `n_i ` are the row totals. In this test of independence, the `E_(i,j) ` are calculated as
`E_(i,j) = (n_im_j)/N_k ` .
The number of degrees of freedom for the chi-squared test for each region is `d=(r-1)(c-1) ` where `r ` and `c ` are the number of rows and columns respectively. Here then, `d=1 `.
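As a rough illustration of the formulas above, the following Python sketch computes the expected counts, the statistic `C ` and the upper-tail probability for a single region. The counts `a, b, c, d ` are hypothetical (they are not data from the scenario), and the function names are made up for this example. It relies on the fact that a `chi^2_1 ` variable is the square of a standard normal variable, so its upper tail can be written with the complementary error function.

```python
import math

def chi2_stat_2x2(a, b, c, d):
    """Return (C, expected) for the 2x2 table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]   # n_1, n_2
    col_totals = [a + c, b + d]   # m_1, m_2
    total = a + b + c + d         # N_k
    # E_(i,j) = n_i * m_j / N_k
    expected = [[row_totals[i] * col_totals[j] / total for j in range(2)]
                for i in range(2)]
    # C = sum over cells of (O - E)^2 / E
    stat = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    return stat, expected

def p_value_df1(stat):
    """Upper tail of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical counts for one region (assumption, for illustration only):
# 60 salespeople given the software, 60 not.
a, b, c, d = 40, 25, 20, 35
stat, expected = chi2_stat_2x2(a, b, c, d)
p_value = p_value_df1(stat)
print(f"C = {stat:.3f}, p = {p_value:.4f}")
```

With equal column totals, the expected counts in each row are the same in both columns, which is a quick sanity check on the table.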
To test for effectiveness of the software program in each region, this test can be performed region by region: compare the statistic `C_k ` for the `k `th region to the `100(1-alpha) `th percentile of a `chi^2_1 ` distribution, and reject the null hypothesis if `C_k ` exceeds it. Here `alpha ` is the 'size' of the test, i.e. the probability of a type I error (rejecting the null hypothesis when it is true).
The overall 'size' `alpha ` (usually taken to be 0.05, corresponding to 5%) should be adjusted to take account of the multiple testing, however. To achieve an overall/global 'size' of at most `alpha `, each test should be performed with a 'size' of `alpha' = alpha/K = alpha/4 `, where `K=4 ` is the number of regions (this is the Bonferroni correction).
Since independent chi-squared variables can be added together to form a larger chi-squared variable (with the degrees of freedom simply added together), a global test of whether an effect is present over all the regions combined can also be performed. Simply add the 4 chi-squared statistics from the 4 regions and compare the resulting value to a `chi^2_4 ` distribution (i.e. with 4 degrees of freedom).
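The per-region tests at size `alpha/4 ` and the combined test on 4 degrees of freedom could be sketched in Python as follows. All counts are hypothetical and chosen only so that the regions total 500 salespeople with half in each region given the software. The helper names are invented for this example, and the tail-probability function only covers the two cases needed here (1 degree of freedom, or an even number of degrees of freedom, for which a closed form exists).

```python
import math

def chi2_stat_2x2(a, b, c, d):
    """Chi-squared statistic C for the 2x2 table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    rows = [a + b, c + d]   # n_1, n_2
    cols = [a + c, b + d]   # m_1, m_2
    total = a + b + c + d   # N_k
    return sum(
        (observed[i][j] - rows[i] * cols[j] / total) ** 2
        / (rows[i] * cols[j] / total)
        for i in range(2) for j in range(2)
    )

def chi2_upper_tail(x, df):
    """P(X > x) for a chi-squared variable; only df = 1 or even df."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df % 2 == 0:
        half = x / 2
        return math.exp(-half) * sum(half ** i / math.factorial(i)
                                     for i in range(df // 2))
    raise ValueError("only df = 1 or even df are handled in this sketch")

# Hypothetical counts (a, b, c, d) per region; NOT data from the scenario.
regions = {
    "Northeast": (40, 25, 20, 35),
    "Southeast": (35, 30, 30, 35),
    "Central":   (38, 30, 24, 32),
    "West":      (36, 28, 27, 35),
}

alpha = 0.05
alpha_per_region = alpha / len(regions)   # adjusted size alpha/K

# Global test: sum the regional statistics, compare to chi-squared with K df.
stats = {name: chi2_stat_2x2(*cells) for name, cells in regions.items()}
global_stat = sum(stats.values())
global_p = chi2_upper_tail(global_stat, df=len(regions))

# Drill down: test each region at the adjusted size.
significant = [name for name, c_k in stats.items()
               if chi2_upper_tail(c_k, df=1) < alpha_per_region]

print(f"global C = {global_stat:.3f}, p = {global_p:.4f}")
print("regions significant at alpha/4:", significant)
```

Comparing a p-value to `alpha/4 ` is equivalent to comparing `C_k ` to the corresponding percentile of the `chi^2_1 ` distribution. In practice a statistics library would normally supply the tail probabilities.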
This global test is like performing a chi-squared test on a 2x2x4 contingency table (imagine the four regional tables stacked on top of each other). Looking at the result from each table separately is like 'drilling down' to see where the evidence of an effect lies. Strictly, this should only be done after the global test has detected a significant effect. The size parameter `alpha ` needs to be reduced when drilling down because the more regions one examines, the more likely it is that at least one shows apparent evidence of an effect purely by chance (if you hunt ferociously for a strange coincidence, you'll find one! And that's not a coincidence anymore...).