class: center, middle, inverse, title-slide

# Design of Experiment and Analysis of Variance
## Repetition
### Raju Rimal
### 30 Aug, 2017

---
background-image: url(https://www.nmbu.no/sites/all/themes/nmbu_university/images/logo-nb.png)

???

Image credit: [NMBU](https://www.nmbu.no/sites/all/themes/nmbu_university/images/logo-nb.png)

---
class: center, middle, inverse

# ANOVA Model

---

.left-column[
# ANOVA Model
## Random Effect Model

- <a href="images/2011-1cd.png" data-fancybox="2011" data-caption="Exam 2011: 1(c) and 1(d)">Exam 2011: 1(c), 1(d)</a><a href="images/2011-a2.png" data-fancybox="2011" data-caption="Appendix 2"></a>
- <a href="images/2012-1c.png" data-fancybox="2012" data-caption="Exam 2012: 1(c)">Exam 2012: 1(c)</a><a href="images/2012-a2.png" data-fancybox="2012" data-caption="Appendix 2"></a>
- <a href="images/2013-1c.png" data-fancybox="2013" data-caption="Exam 2013: 1(c)">Exam 2013: 1(c)</a><a href="images/2013-a2.png" data-fancybox="2013" data-caption="Appendix 2"></a>
- <a href="images/2014-1e.png" data-fancybox="2014" data-caption="Exam 2014: 1(e)">Exam 2014: 1(e)</a><a href="images/2014-a4.png" data-fancybox="2014" data-caption="Table 4"></a>
- <a href="images/2015-2c.png" data-fancybox="2015" data-caption="Exam 2015: 2(c)">Exam 2015: 2(c)</a><a href="images/2015-a6.png" data-fancybox="2015" data-caption="Table 6"></a>

].right-column[
### Intraclass Correlation Coefficient

The proportion of the total variation that is due to variation between groups.

`$$\rho = \frac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma^2}$$`

Plugging estimates of the variance components into this expression gives an estimate of the **intraclass correlation coefficient**.
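
The estimate mentioned above can be sketched as follows, assuming a balanced design with `\(a\)` groups and `\(n\)` observations per group (an assumption not stated on this slide). Equating the mean squares from the ANOVA table to their expectations, `\(E(\text{MS}_\text{treatment}) = \sigma^2 + n\sigma_\tau^2\)` and `\(E(\text{MSE}) = \sigma^2\)`, gives the usual method-of-moments estimates:

`\begin{aligned}
\hat{\sigma}^2 &= \text{MSE} \\
\hat{\sigma}_\tau^2 &= \frac{\text{MS}_\text{treatment} - \text{MSE}}{n} \\
\hat{\rho} &= \frac{\hat{\sigma}_\tau^2}{\hat{\sigma}_\tau^2 + \hat{\sigma}^2}
\end{aligned}`

If `\(\text{MS}_\text{treatment} < \text{MSE}\)`, the estimate `\(\hat{\sigma}_\tau^2\)` comes out negative and is usually set to zero.
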
### Confidence interval of overall mean

The `\(100(1-\alpha)\%\)` confidence interval for the overall mean `\(\mu\)` in a random effect model is,

`$$\hat{\mu} \pm t_{\alpha/2, a-1}\sqrt{\frac{\text{MS}_\text{treatment}}{N}}$$`

(Refer to Thore's Lecture on Random effect Model)
]

---

.left-column[
# ANOVA Model
## Random Effect Model

### Intraclass Correlation
`$$\rho = \frac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma^2}$$`

### CI for overall mean
`$$\hat{\mu} \pm t_{\alpha/2, a-1}\sqrt{\frac{\text{MS}_\text{treatment}}{N}}$$`

].right-column[
### Interpretation of Intraclass correlation coefficient

- *Proportion of the total variation* that is due to variation between groups
- Correlation between observations **within the same group**
- In the `besettning` and `fettprosent` example, a correlation of 0.90 means that most of the variation in `fettprosent` is explained by differences between herds, so the `cows` within the same herd are much alike, with a within-herd correlation of 0.90.

### Interpretation of Overall Mean

The interpretation of the overall mean extends to the whole population. For example, if `besetning` (farms) is a random factor, then the overall mean refers to the average `fettprosent` in the milk from the entire population of farms.
]

---

.left-column[
# ANOVA Model
## Random Effect Model

<img src="Day3_files/figure-html/unnamed-chunk-2-1.png" width="90%" />

.side-caption[
- <a href="images/2014-a4.png" data-fancybox="2014">Table 4: Anova Output</a>
- <a href="images/F0025.png" data-fancybox="2014">F distribution table at 0.025 level</a>
]].right-column[
### Confidence interval for correlation

`\(L\)` and `\(U\)` give the lower and upper limits of the confidence interval for `\(\sigma_\tau^2/\sigma^2\)`.
`\begin{aligned}
L &= \frac{1}{n}\left( \frac{\text{MS}_\text{treatments}}{\text{MSE}} \frac{1}{F_{\alpha/2, a-1, N-a}} - 1 \right) \\
U &= \frac{1}{n}\left( \frac{\text{MS}_\text{treatments}}{\text{MSE}} \frac{1}{F_{1-\alpha/2, a-1, N-a}} - 1 \right)
\end{aligned}`

So, the confidence interval for `\(\rho = \frac{\sigma_\tau^2}{(\sigma_\tau^2 + \sigma^2)}\)` is,

`$$\frac{L}{1 + L} \le \frac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma^2} \le \frac{U}{1 + U}$$`

<h3><a href="images/2014-1e.png" data-fancybox="2014">Exam 2014: 1(e)</a></h3>

Here, `\(L = 8.39\)` and `\(U = 158.385\)`, so the confidence interval for `\(\rho\)` is `\((8.39/9.39,\ 158.385/159.385)\)` = (0.893, 0.994).
]

---

.left-column[
# ANOVA Model
## Two factors

<img src="Day3_files/figure-html/unnamed-chunk-5-1.png" width="90%" />

.side-caption[
- Do we need interaction?
- How about Gender, is it significant?
- What can we see if interaction is not significant?
- Is blocking a two factor model?
]].right-column[
### ANOVA model with two factors

`$$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}$$`

where, `\(\varepsilon_{ijk} \sim N(0, \sigma^2)\)`, `\(i = 1, 2, \ldots, a\ (a = 3)\)`, `\(j = 1, 2, \ldots, b\ (b = 2)\)` and `\(k = 1, 2, \ldots, n\ (n = 4)\)`

When `\(\mu\)` is the overall mean, the effects satisfy the sum-to-zero constraints,

`\begin{aligned}
\sum_{i = 1}^a{\tau_i} = 0, && \sum_{j = 1}^b{\beta_j} = 0, && \sum_{i = 1}^a{(\tau\beta)_{ij}} = \sum_{j = 1}^b{(\tau\beta)_{ij}} = 0
\end{aligned}`

<h3><a href="images/2013-2ab.png" data-fancybox="2013">Exam 2013: 2(a)</a><a href="images/2013-a3.png" data-fancybox="2013"></a></h3>

```
Analysis of Variance Table

Response: Weight2
               Df  Sum Sq Mean Sq F value  Pr(>F)
Weight1         2 1012033  506017   13.03 0.00032 ***
Gender          1  124704  124704    3.21 0.08992 .
Weight1:Gender  2   28433   14217    0.37 0.69842
Residuals      18  698825   38824
--- Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1
```
]

---

.left-column[
# ANOVA Model
## Two factors

<img src="Day3_files/figure-html/unnamed-chunk-8-1.png" width="90%" />

```
                          Estimate
(Intercept)                 3582.1
Weight1(High)                234.2
Weight1(Low)                -265.8
Gender(Boy)                   72.1
Weight1(High):Gender(Boy)     29.2
Weight1(Low):Gender(Boy)     -48.3
```
].right-column[
### Prediction

<a href="images/2013-2ab.png" data-fancybox="2013">Exam 2013: 2(b)</a><a href="images/2013-a3.png" data-fancybox="2013"></a> asks us to predict the weight of the second child (a girl) when the first child has `High` weight.

**The Model:**

`$$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}$$`

where, `\(\varepsilon_{ijk} \sim N(0, \sigma^2)\)`, `\(i = 1, 2, 3\)`, `\(j = 1, 2\)` and `\(k = 1, 2, 3, 4\)`

So, the predicted weight for a `Girl` whose first sibling has `High` weight is,

`$$\hat{y}_{\texttt{High, girl}} = \hat{\mu} + \hat{\tau}_\texttt{High} + \hat{\beta}_\texttt{Girl} + (\widehat{\tau\beta})_\texttt{High, Girl}$$`

.right-column[
```
       factor2 Weight1(High) Weight1(Low) Weight1(Medium)
1  Gender(Boy)          29.2        -48.3              NA
2 Gender(Girl)            NA           NA              NA
```
].left-column[
```
          factor1 coef
1   Weight1(High)  234
2    Weight1(Low) -266
3 Weight1(Medium)   NA
```
]

.full-width[
The `NA` entries are not estimated directly; they follow from the sum-to-zero constraints: `\(\hat{\beta}_\texttt{Girl} = -\hat{\beta}_\texttt{Boy} = -72.083\)` and `\((\widehat{\tau\beta})_\texttt{High, Girl} = -(\widehat{\tau\beta})_\texttt{High, Boy} = -29.167\)`.

`$$\hat{y}_{\texttt{High, girl}} = 3582.083 + (234.167) + (-72.083) + (-29.167) = 3715\text{ grams}$$`
]]

---

.left-column[
# ANOVA Model
## Two factors

<img src="Day3_files/figure-html/unnamed-chunk-12-1.png" width="90%" />
<img src="Day3_files/figure-html/unnamed-chunk-13-1.png" width="90%" />
].right-column[
### Reducing a Two-Factor Model

In compulsory assignment 3(c), you are asked to choose between <a href="javascript:;" data-fancybox data-src="#hidden-content">two models.</a>

.hidden[
<div id="hidden-content">
<h3>Model 1</h3>
<table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Mean Sq </th> <th style="text-align:right;"> F value </th> <th
style="text-align:right;"> Pr(>F) </th> </tr> </thead>
<tbody>
<tr> <td style="text-align:left;"> Fortype </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 49.3 </td> <td style="text-align:right;"> 12.32 </td> <td style="text-align:right;"> 9.54 </td> <td style="text-align:right;"> 0.001 </td> </tr>
<tr> <td style="text-align:left;"> Dommer </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 29.0 </td> <td style="text-align:right;"> 9.67 </td> <td style="text-align:right;"> 7.48 </td> <td style="text-align:right;"> 0.004 </td> </tr>
<tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 12 </td> <td style="text-align:right;"> 15.5 </td> <td style="text-align:right;"> 1.29 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr>
</tbody>
</table>
<h3>Model 2</h3>
<table>
<thead>
<tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Df </th> <th style="text-align:right;"> Sum Sq </th> <th style="text-align:right;"> Mean Sq </th> <th style="text-align:right;"> F value </th> <th style="text-align:right;"> Pr(>F) </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Fortype </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 49.3 </td> <td style="text-align:right;"> 12.32 </td> <td style="text-align:right;"> 4.15 </td> <td style="text-align:right;"> 0.018 </td> </tr>
<tr> <td style="text-align:left;"> Residuals </td> <td style="text-align:right;"> 15 </td> <td style="text-align:right;"> 44.5 </td> <td style="text-align:right;"> 2.97 </td> <td style="text-align:right;"> </td> <td style="text-align:right;"> </td> </tr>
</tbody>
</table>
</div>
]

<details>
<summary>What happens when <code>Dommer</code> is removed from <code>Model 1</code>?</summary>
When we remove a significant factor, its variation is absorbed into the residual, which raises the model error: the residual sum of squares grows from 15.5 to 15.5 + 29.0 = 44.5 on 12 + 3 = 15 degrees of freedom, so the residual mean square rises from 1.29 to 2.97.
When the model error increases, differences between the treatments become harder to detect: the F value for <code>Fortype</code> drops from 9.54 to 4.15.
</details>

### Interaction Term and Degree of freedom

With only one observation for each combination of `Fortype` and `Dommer`, we cannot include an _interaction term_ in the model. The interaction would use 4 × 3 = 12 degrees of freedom, leaving no _degrees of freedom_ for the residuals. We would still be able to estimate the effects, but could not test their significance.
]

---

.left-column[
# ANOVA Model
## Model Assessment

- Errors should be random, i.e. free from any kind of pattern
- Errors should have constant variance across all the groups

].right-column[
### Assumption of random error with constant variance

<img src="Day3_files/figure-html/unnamed-chunk-18-1.png" width="48%" /><img src="Day3_files/figure-html/unnamed-chunk-18-2.png" width="48%" />

### Normality of Error term
]

---

.left-column[
# ANOVA Model
## Model Assessment

- Errors (residuals) should be randomly distributed
- The residuals should lie close to the reference line in the normal Q-Q plot
- You can also compare a histogram and/or density plot of the residuals with the normal distribution

].right-column[
### Assumption of random error with constant variance
### Normality of Error term

<img src="Day3_files/figure-html/unnamed-chunk-20-1.png" width="45%" /><img src="Day3_files/figure-html/unnamed-chunk-20-2.png" width="45%" /><img src="Day3_files/figure-html/unnamed-chunk-20-3.png" width="45%" /><img src="Day3_files/figure-html/unnamed-chunk-20-4.png" width="45%" />
]

---
class: center, middle, inverse

# Best of Luck
# Lykke til

---
class: center, middle, inverse

<img src="images/NMBUwhite.png" width="30%" style="display: block; margin: auto;" />