How would one test the effect of two multi-levelled independent variables (ie factors) on a dependent variable?
The two independent variables have four possible outcomes each, namely (a,b,c,d) and (1,2,3,4).
Can the test be done in microsoft excel?
Write the levels of the two factors as indicator variables. Choose a baseline level combination eg`` "a" for the first combined with "1" for the second variable. Then you will have vector indicator variables `x_1` and `x_2` where
`x_1 = c(0,0,0)` if variable 1 is equal to "a"
`= c(1,0,0)` " " "b"
`= c(0,1,0)` " " "c"
`= c(0,0,1)` " " "d"
Define `x_2` similarly.
You can then fit a linear model of the dependent variable `y` on the independent vector variables `x_1` and `x_2`:
`y approx beta_0 + beta_1 x_1 + beta_2 x_2 + beta_(12) x_1x_2`
where `beta_0` is the intercept (here, the expected value of `y` given the independent variables take the baseline combination ("a","1") ) and where `beta_1` and `beta_2` are vector-valued coefficients of length 3 that give the difference in effect between a level of one the independent variables and it's baseline level. There might be an interaction between the two indepedent variables for different levels of those variables, which is accounted for by the vector-valued coefficient `beta_(12)` . This coefficient has length 4x4 - 1 = 15.
Once the parameters are estimated, their significance can be tested by constructing t-statistics. Which coefficients are significant will tell you where differences in factor levels make a significant difference to the measured outcome. If there are many parameters to test the significance of, beware of correcting for multiple testing.
This model could be fitted in Excel using the standard simple linear regression formulae, or possibly with a more fancy statistics tool. Better to download the widely-used R stats package though which is free (!) and easy to use after you have familiarised yourself with the basic techniques.