. * do-file for lecture 6 of VHM 802, Winter 2023
. version 17 /* works also with versions 14-16 */
. set more off
. cd "r:\"
r:\

.
. import delimited ch03ex1.csv, clear
(encoding automatically selected: ISO-8859-1)
(2 vars, 29 obs)

. oneway liverwgt diet, tabulate

            |        Summary of liverwgt
       diet |        Mean   Std. dev.       Freq.
------------+------------------------------------
          1 |   3.7457143   .28401046           7
          2 |        3.58   .18213024           8
          3 |   3.5983333   .09621159           6
          4 |      3.9225   .19710402           8
------------+------------------------------------
      Total |   3.7182759   .23998614          29

                        Analysis of variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      .578208876      3   .192736292      4.66     0.0102
 Within groups       1.0344049     25   .041376196
------------------------------------------------------------------------
    Total           1.61261378     28   .057593349

Bartlett's equal-variances test: chi2(3) =   5.1211    Prob>chi2 = 0.163

. anova liverwgt diet /* allows postestimation commands */

             Number of obs =         29    R-squared     =  0.3586
             Root MSE      =    .203411    Adj R-squared =  0.2816

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  .57820888          3   .19273629      4.66  0.0102
             |
        diet |  .57820888          3   .19273629      4.66  0.0102
             |
    Residual |  1.0344049         25    .0413762
  -----------+----------------------------------------------------
       Total |  1.6126138         28   .05759335

. regress /* estimates corresponding to anova model */

      Source |       SS           df       MS      Number of obs   =        29
-------------+----------------------------------   F(3, 25)        =      4.66
       Model |  .578208876         3  .192736292   Prob > F        =    0.0102
    Residual |   1.0344049        25  .041376196   R-squared       =    0.3586
-------------+----------------------------------   Adj R-squared   =    0.2816
       Total |  1.61261378        28  .057593349   Root MSE        =    .20341

------------------------------------------------------------------------------
    liverwgt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        diet |
          2  |  -.1657143   .1052754    -1.57   0.128     -.382533    .0511045
          3  |   -.147381   .1131677    -1.30   0.205    -.3804541    .0856922
          4  |   .1767857   .1052754     1.68   0.106    -.0400331    .3936044
             |
       _cons |   3.745714   .0768823    48.72   0.000     3.587372    3.904056
------------------------------------------------------------------------------

. regress liverwgt i.diet /* totally identical */

      Source |       SS           df       MS      Number of obs   =        29
-------------+----------------------------------   F(3, 25)        =      4.66
       Model |  .578208876         3  .192736292   Prob > F        =    0.0102
    Residual |   1.0344049        25  .041376196   R-squared       =    0.3586
-------------+----------------------------------   Adj R-squared   =    0.2816
       Total |  1.61261378        28  .057593349   Root MSE        =    .20341

------------------------------------------------------------------------------
    liverwgt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        diet |
          2  |  -.1657143   .1052754    -1.57   0.128     -.382533    .0511045
          3  |   -.147381   .1131677    -1.30   0.205    -.3804541    .0856922
          4  |   .1767857   .1052754     1.68   0.106    -.0400331    .3936044
             |
       _cons |   3.745714   .0768823    48.72   0.000     3.587372    3.904056
------------------------------------------------------------------------------

.
. xi: boxcox liverwgt i.diet /* Box-Cox analysis; note: needs xi: */
i.diet            _Idiet_1-4          (naturally coded; _Idiet_1 omitted)
Fitting comparison model

Iteration 0:  log likelihood =  .74765532
Iteration 1:  log likelihood =  1.6767683
Iteration 2:  log likelihood =  1.6782923
Iteration 3:  log likelihood =  1.6782967
Iteration 4:  log likelihood =  1.6782967

Fitting full model

Iteration 0:  log likelihood =  7.1860909
Iteration 1:  log likelihood =  8.0100122
Iteration 2:  log likelihood =  8.0113804
Iteration 3:  log likelihood =  8.0113804

                                                  Number of obs   =         29
                                                  LR chi2(3)      =      12.67
Log likelihood = 8.0113804                        Prob > chi2     =      0.005

------------------------------------------------------------------------------
    liverwgt | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      /theta |  -2.161941   2.514124    -0.86   0.390    -7.089534    2.765652
------------------------------------------------------------------------------

Estimates of scale-variant parameters
----------------------------
             | Coefficient
-------------+--------------
Notrans      |
    _Idiet_2 |  -.0025137
    _Idiet_3 |  -.0020231
    _Idiet_4 |    .002801
       _cons |   .4354793
-------------+--------------
      /sigma |   .0029048
----------------------------

---------------------------------------------------------
   Test         Restricted     LR statistic
    H0:       log likelihood       chi2       Prob > chi2
---------------------------------------------------------
theta = -1      7.9027094          0.22          0.641
theta =  0      7.6300907          0.76          0.383
theta =  1      7.1860909          1.65          0.199
---------------------------------------------------------

. * means and SE with CI
.
. anova liverwgt diet

             Number of obs =         29    R-squared     =  0.3586
             Root MSE      =    .203411    Adj R-squared =  0.2816

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  .57820888          3   .19273629      4.66  0.0102
             |
        diet |  .57820888          3   .19273629      4.66  0.0102
             |
    Residual |  1.0344049         25    .0413762
  -----------+----------------------------------------------------
       Total |  1.6126138         28   .05759335

. lincom _cons+1.diet /* similar for diets 2-4 */

 ( 1)  1b.diet + _cons = 0

------------------------------------------------------------------------------
    liverwgt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   3.745714   .0768823    48.72   0.000     3.587372    3.904056
------------------------------------------------------------------------------

. margins diet

Adjusted predictions                                        Number of obs = 29

Expression: Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        diet |
          1  |   3.745714   .0768823    48.72   0.000     3.587372    3.904056
          2  |       3.58   .0719168    49.78   0.000     3.431885    3.728115
          3  |   3.598333   .0830424    43.33   0.000     3.427304    3.769362
          4  |     3.9225   .0719168    54.54   0.000     3.774385    4.070615
------------------------------------------------------------------------------

. marginsplot /* interval plot */

Variables that uniquely identify margins: diet

. * pairwise comparisons
.
. oneway liverwgt diet, bonferroni /* bonferroni comparisons */

                        Analysis of variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      .578208876      3   .192736292      4.66     0.0102
 Within groups       1.0344049     25   .041376196
------------------------------------------------------------------------
    Total           1.61261378     28   .057593349

Bartlett's equal-variances test: chi2(3) =   5.1211    Prob>chi2 = 0.163

         Comparison of liverwgt by diet
                  (Bonferroni)
Row Mean-|
Col Mean |          1          2          3
---------+---------------------------------
       2 |   -.165714
         |      0.768
         |
       3 |   -.147381    .018333
         |      1.000      1.000
         |
       4 |    .176786      .3425    .324167
         |      0.633      0.015      0.041

. * general method for anova and regression
. anova liverwgt diet

             Number of obs =         29    R-squared     =  0.3586
             Root MSE      =    .203411    Adj R-squared =  0.2816

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  .57820888          3   .19273629      4.66  0.0102
             |
        diet |  .57820888          3   .19273629      4.66  0.0102
             |
    Residual |  1.0344049         25    .0413762
  -----------+----------------------------------------------------
       Total |  1.6126138         28   .05759335

. pwcompare diet, pv mcomp(noadjust) /* no adjustment - the default */

Pairwise comparisons of marginal linear predictions

Margins: asbalanced

-----------------------------------------------------
             |                            Unadjusted
             |   Contrast   Std. err.      t    P>|t|
-------------+---------------------------------------
        diet |
     2 vs 1  |  -.1657143   .1052754    -1.57   0.128
     3 vs 1  |   -.147381   .1131677    -1.30   0.205
     4 vs 1  |   .1767857   .1052754     1.68   0.106
     3 vs 2  |   .0183333   .1098547     0.17   0.869
     4 vs 2  |   .3424999   .1017057     3.37   0.002
     4 vs 3  |   .3241666   .1098547     2.95   0.007
-----------------------------------------------------

.
. pwcompare diet, pv mcomp(bon) /* Bonferroni method */

Pairwise comparisons of marginal linear predictions

Margins: asbalanced

---------------------------
             |    Number of
             |  comparisons
-------------+-------------
        diet |            6
---------------------------

-----------------------------------------------------
             |                            Bonferroni
             |   Contrast   Std. err.      t    P>|t|
-------------+---------------------------------------
        diet |
     2 vs 1  |  -.1657143   .1052754    -1.57   0.768
     3 vs 1  |   -.147381   .1131677    -1.30   1.000
     4 vs 1  |   .1767857   .1052754     1.68   0.633
     3 vs 2  |   .0183333   .1098547     0.17   1.000
     4 vs 2  |   .3424999   .1017057     3.37   0.015
     4 vs 3  |   .3241666   .1098547     2.95   0.041
-----------------------------------------------------

. pwcompare diet, pv mcomp(tukey) /* Tukey method */

Pairwise comparisons of marginal linear predictions

Margins: asbalanced

---------------------------
             |    Number of
             |  comparisons
-------------+-------------
        diet |            6
---------------------------

-----------------------------------------------------
             |                                 Tukey
             |   Contrast   Std. err.      t    P>|t|
-------------+---------------------------------------
        diet |
     2 vs 1  |  -.1657143   .1052754    -1.57   0.411
     3 vs 1  |   -.147381   .1131677    -1.30   0.570
     4 vs 1  |   .1767857   .1052754     1.68   0.355
     3 vs 2  |   .0183333   .1098547     0.17   0.998
     4 vs 2  |   .3424999   .1017057     3.37   0.012
     4 vs 3  |   .3241666   .1098547     2.95   0.032
-----------------------------------------------------

. * general approach to testing
. anova liverwgt diet

             Number of obs =         29    R-squared     =  0.3586
             Root MSE      =    .203411    Adj R-squared =  0.2816

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  .57820888          3   .19273629      4.66  0.0102
             |
        diet |  .57820888          3   .19273629      4.66  0.0102
             |
    Residual |  1.0344049         25    .0413762
  -----------+----------------------------------------------------
       Total |  1.6126138         28   .05759335

. test, showorder

Order of columns in the design matrix
      1: (diet==1)
      2: (diet==2)
      3: (diet==3)
      4: (diet==4)
      5: _cons

.
. matrix input mycon=(1,-1,0,0,0\1,0,-1,0,0\1,0,0,-1,0\0,1,-1,0,0\0,1,0,-1,0\0,0,1,-1,0)
. test, test(mycon) mtest

 ( 1)  1b.diet - 2.diet = 0
 ( 2)  1b.diet - 3.diet = 0
 ( 3)  1b.diet - 4.diet = 0
 ( 4)  2.diet - 3.diet = 0
 ( 5)  2.diet - 4.diet = 0
 ( 6)  3.diet - 4.diet = 0
       Constraint 3 dropped
       Constraint 4 dropped
       Constraint 6 dropped

--------------------------------------
       |  F(df,25)     df     p > F
-------+------------------------------
  (1)  |      2.48      1     0.1280*
  (2)  |      1.70      1     0.2047*
  (3)  |      2.82      1     0.1056*
  (4)  |      0.03      1     0.8688*
  (5)  |     11.34      1     0.0025*
  (6)  |      8.71      1     0.0068*
-------+------------------------------
  All  |      4.66      3     0.0102
--------------------------------------
* Unadjusted p-values

. test, test(mycon) mtest(bon) /* Bonferroni method */

 ( 1)  1b.diet - 2.diet = 0
 ( 2)  1b.diet - 3.diet = 0
 ( 3)  1b.diet - 4.diet = 0
 ( 4)  2.diet - 3.diet = 0
 ( 5)  2.diet - 4.diet = 0
 ( 6)  3.diet - 4.diet = 0
       Constraint 3 dropped
       Constraint 4 dropped
       Constraint 6 dropped

--------------------------------------
       |  F(df,25)     df     p > F
-------+------------------------------
  (1)  |      2.48      1     0.7682*
  (2)  |      1.70      1     1.0000*
  (3)  |      2.82      1     0.6333*
  (4)  |      0.03      1     1.0000*
  (5)  |     11.34      1     0.0147*
  (6)  |      8.71      1     0.0408*
-------+------------------------------
  All  |      4.66      3     0.0102
--------------------------------------
* Bonferroni-adjusted p-values

. test, test(mycon) mtest(holm) /* Holm method */

 ( 1)  1b.diet - 2.diet = 0
 ( 2)  1b.diet - 3.diet = 0
 ( 3)  1b.diet - 4.diet = 0
 ( 4)  2.diet - 3.diet = 0
 ( 5)  2.diet - 4.diet = 0
 ( 6)  3.diet - 4.diet = 0
       Constraint 3 dropped
       Constraint 4 dropped
       Constraint 6 dropped

--------------------------------------
       |  F(df,25)     df     p > F
-------+------------------------------
  (1)  |      2.48      1     0.3841*
  (2)  |      1.70      1     0.4094*
  (3)  |      2.82      1     0.4222*
  (4)  |      0.03      1     0.8688*
  (5)  |     11.34      1     0.0147*
  (6)  |      8.71      1     0.0340*
-------+------------------------------
  All  |      4.66      3     0.0102
--------------------------------------
* Holm-adjusted p-values

. * contrasts
.
. lincom (1.diet+2.diet+3.diet)/3-4.diet

 ( 1)  .3333333*1b.diet + .3333333*2.diet + .3333333*3.diet - 4.diet = 0

------------------------------------------------------------------------------
    liverwgt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.2811507    .084674    -3.32   0.003    -.4555401   -.1067614
------------------------------------------------------------------------------

. scalar tval=-.2811507/.084674 /* to get more decimals than in listing */
. di "SS: " tval^2*0.0413762 " in %: " tval^2*0.0413762/.57820888*100
SS: .45617217 in %: 78.89401

. di "Scheffe test: F = " tval^2/3 " P = " Ftail(3,25,tval^2/3)
Scheffe test: F = 3.6749965 P = .02546919

. lincom (2.diet+3.diet)/2-1.diet

 ( 1)  - 1b.diet + .5*2.diet + .5*3.diet = 0

------------------------------------------------------------------------------
    liverwgt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.1565476   .0944876    -1.66   0.110    -.3511484    .0380532
------------------------------------------------------------------------------

. lincom 2.diet-3.diet

 ( 1)  2.diet - 3.diet = 0

------------------------------------------------------------------------------
    liverwgt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.0183333   .1098547    -0.17   0.869    -.2445833    .2079167
------------------------------------------------------------------------------

.
. import delimited ch08ta6.csv, clear /* Example 8.6 */
(encoding automatically selected: UTF-8)
(3 vars, 12 obs)

.
. anova tfaa r50 r21 r50#r21 /* same with r50##r21 only */

             Number of obs =         12    R-squared     =  0.5459
             Root MSE      =    .301177    Adj R-squared =  0.3756

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  .87231412          3   .29077137      3.21  0.0834
             |
         r50 |  .65613639          1   .65613639      7.23  0.0275
         r21 |  .21440139          1   .21440139      2.36  0.1627
     r50#r21 |  .00177634          1   .00177634      0.02  0.8922
             |
    Residual |  .72566253          8   .09070782
  -----------+----------------------------------------------------
       Total |  1.5979766         11    .1452706

. regress /* Stata parametrization; note different P-values! */

      Source |       SS           df       MS      Number of obs   =        12
-------------+----------------------------------   F(3, 8)         =      3.21
       Model |  .872314116         3  .290771372   Prob > F        =    0.0834
    Residual |  .725662525         8  .090707816   R-squared       =    0.5459
-------------+----------------------------------   Adj R-squared   =    0.3756
       Total |  1.59797664        11  .145270604   Root MSE        =    .30118

------------------------------------------------------------------------------
        tfaa | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       1.r50 |   .4433333   .2459103     1.80   0.109    -.1237369    1.010403
       1.r21 |       .243   .2459103     0.99   0.352    -.3240702    .8100702
             |
     r50#r21 |
        1 1  |   .0486668   .3477697     0.14   0.892    -.7532916    .8506251
             |
       _cons |   1.709333   .1738848     9.83   0.000     1.308354    2.110313
------------------------------------------------------------------------------

. * interaction plot
. margins r50#r21

Adjusted predictions                                        Number of obs = 12

Expression: Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     r50#r21 |
        0 0  |   1.709333   .1738848     9.83   0.000     1.308354    2.110313
        0 1  |   1.952333   .1738848    11.23   0.000     1.551354    2.353312
        1 0  |   2.152667   .1738848    12.38   0.000     1.751687    2.553646
        1 1  |   2.444333   .1738848    14.06   0.000     2.043354    2.845313
------------------------------------------------------------------------------

. marginsplot, noci /* CIs often look messy */

Variables that uniquely identify margins: r50 r21

. * contrasts for SS decomposition computed manually
. egen tx=group(r50 r21) /* combined tx variable */
. table tx r50 r21 /* check of coding: 2 ~ r50=0, r21=1, etc. */

-------------------------------
               |      r21
               |  0   1   Total
---------------+---------------
group(r50 r21) |
  1            |
    r50        |
      0        |  3           3
      Total    |  3           3
  2            |
    r50        |
      0        |      3       3
      Total    |      3       3
  3            |
    r50        |
      1        |  3           3
      Total    |  3           3
  4            |
    r50        |
      1        |      3       3
      Total    |      3       3
  Total        |
    r50        |
      0        |  3   3       6
      1        |  3   3       6
      Total    |  6   6      12
-------------------------------

. anova tfaa tx

             Number of obs =         12    R-squared     =  0.5459
             Root MSE      =    .301177    Adj R-squared =  0.3756

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  .87231412          3   .29077137      3.21  0.0834
             |
          tx |  .87231412          3   .29077137      3.21  0.0834
             |
    Residual |  .72566253          8   .09070782
  -----------+----------------------------------------------------
       Total |  1.5979766         11    .1452706

. lincom 1.tx+2.tx-3.tx-4.tx /* r50 */

 ( 1)  1b.tx + 2.tx - 3.tx - 4.tx = 0

------------------------------------------------------------------------------
        tfaa | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.9353334   .3477697    -2.69   0.028    -1.737292    -.133375
------------------------------------------------------------------------------

. lincom 1.tx-2.tx+3.tx-4.tx /* r21 */

 ( 1)  1b.tx - 2.tx + 3.tx - 4.tx = 0

------------------------------------------------------------------------------
        tfaa | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.5346667   .3477697    -1.54   0.163    -1.336625    .2672916
------------------------------------------------------------------------------

. lincom 1.tx-2.tx-3.tx+4.tx /* r50*r21 */

 ( 1)  1b.tx - 2.tx - 3.tx + 4.tx = 0

------------------------------------------------------------------------------
        tfaa | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   .0486668   .3477697     0.14   0.892    -.7532916    .8506251
------------------------------------------------------------------------------

.
. import delimited ch08ta7.csv, clear /* Example 8.8 */
(encoding automatically selected: ISO-8859-1)
(5 vars, 54 obs)

. generate lnfault=ln(fault)
. anova lnfault alg##seq##size##alloc /* DFE=0 */

                     Number of obs =         54    R-squared     =  1.0000
                     Root MSE      =          0    Adj R-squared =

              Source | Partial SS         df         MS        F    Prob>F
  -------------------+----------------------------------------------------
               Model |  173.59316         53   3.2753426
                     |
                 alg |  2.5018375          1   2.5018375
                 seq |  24.639254          2   12.319627
             alg#seq |  .01763693          2   .00881847
                size |  41.691651          2   20.845825
            alg#size |  .02221451          2   .01110725
            seq#size |  .82895746          4   .20723936
        alg#seq#size |  .01456402          4   .00364101
               alloc |  92.697301          2    46.34865
           alg#alloc |  .06003955          2   .03001977
           seq#alloc |  9.5104704          4   2.3776176
       alg#seq#alloc |  .02600767          4   .00650192
          size#alloc |  .50430485          4   .12607621
      alg#size#alloc |  .00400155          4   .00100039
      seq#size#alloc |  1.0521223          8   .13151529
  alg#seq#size#alloc |  .02279619          8   .00284952
                     |
            Residual |          0          0
  -------------------+----------------------------------------------------
               Total |  173.59316         53   3.2753426

.
. anova lnfault alg##seq##size alg##seq##alloc alg##size##alloc seq##size##alloc

                 Number of obs =         54    R-squared     =  0.9999
                 Root MSE      =    .053381    Adj R-squared =  0.9991

          Source | Partial SS         df         MS        F    Prob>F
  ---------------+----------------------------------------------------
           Model |  173.57036         45   3.8571192   1353.60  0.0000
                 |
             alg |  2.5018375          1   2.5018375    877.98  0.0000
             seq |  24.639254          2   12.319627   4323.40  0.0000
         alg#seq |  .01763693          2   .00881847      3.09  0.1010
            size |  41.691651          2   20.845825   7315.55  0.0000
        alg#size |  .02221451          2   .01110725      3.90  0.0658
        seq#size |  .82895746          4   .20723936     72.73  0.0000
    alg#seq#size |  .01456402          4   .00364101      1.28  0.3548
           alloc |  92.697301          2    46.34865  16265.40  0.0000
       alg#alloc |  .06003955          2   .03001977     10.54  0.0057
       seq#alloc |  9.5104704          4   2.3776176    834.39  0.0000
   alg#seq#alloc |  .02600767          4   .00650192      2.28  0.1491
      size#alloc |  .50430485          4   .12607621     44.24  0.0000
  alg#size#alloc |  .00400155          4   .00100039      0.35  0.8365
  seq#size#alloc |  1.0521223          8   .13151529     46.15  0.0000
                 |
        Residual |  .02279619          8   .00284952
  ---------------+----------------------------------------------------
           Total |  173.59316         53   3.2753426

.
. import delimited ch10ta1.csv, clear /* Example 10.3 */
(encoding automatically selected: ISO-8859-1)
(3 vars, 17 obs)

. anova y a##b /* partial SS */

             Number of obs =         17    R-squared     =  0.6632
             Root MSE      =    10.6004    Adj R-squared =  0.5855

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  2876.8803          3    958.9601      8.53  0.0022
             |
           a |  499.95134          1   499.95134      4.45  0.0549
           b |  265.47133          1   265.47133      2.36  0.1483
         a#b |  65.244557          1   65.244557      0.58  0.4597
             |
    Residual |   1460.789         13   112.36839
  -----------+----------------------------------------------------
       Total |  4337.6693         16   271.10433

. anova y a##b, sequential /* SS for b without interaction */

             Number of obs =         17    R-squared     =  0.6632
             Root MSE      =    10.6004    Adj R-squared =  0.5855

      Source |    Seq. SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  2876.8803          3    958.9601      8.53  0.0022
             |
           a |  2557.0039          1   2557.0039     22.76  0.0004
           b |  254.63187          1   254.63187      2.27  0.1561
         a#b |  65.244557          1   65.244557      0.58  0.4597
             |
    Residual |   1460.789         13   112.36839
  -----------+----------------------------------------------------
       Total |  4337.6693         16   271.10433

. anova y b##a, sequential /* SS for a without interaction */

             Number of obs =         17    R-squared     =  0.6632
             Root MSE      =    10.6004    Adj R-squared =  0.5855

      Source |    Seq. SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  2876.8803          3    958.9601      8.53  0.0022
             |
           b |  2326.3487          1   2326.3487     20.70  0.0005
           a |  485.28704          1   485.28704      4.32  0.0581
         b#a |  65.244557          1   65.244557      0.58  0.4597
             |
    Residual |   1460.789         13   112.36839
  -----------+----------------------------------------------------
       Total |  4337.6693         16   271.10433

.
. import delimited ch09ta2.csv, clear /* Example 9.2 */
(encoding automatically selected: ISO-8859-1)
(5 vars, 32 obs)

.
. anova y a##b##c##d

             Number of obs =         32    R-squared     =  0.9881
             Root MSE      =    1.01458    Adj R-squared =  0.9769

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  1367.1988         15   91.146585     88.55  0.0000
             |
           a |  120.90125          1   120.90125    117.45  0.0000
           b |  204.02001          1   204.02001    198.20  0.0000
         a#b |  18.000003          1   18.000003     17.49  0.0007
           c |  472.78124          1   472.78124    459.29  0.0000
         a#c |  24.851261          1   24.851261     24.14  0.0002
         b#c |  27.380001          1   27.380001     26.60  0.0001
       a#b#c |  11.519999          1   11.519999     11.19  0.0041
           d |  335.40503          1   335.40503    325.83  0.0000
         a#d |  15.125003          1   15.125003     14.69  0.0015
         b#d |   10.81125          1    10.81125     10.50  0.0051
       a#b#d |  34.031238          1   34.031238     33.06  0.0000
         c#d |  6.4800031          1   6.4800031      6.30  0.0232
       a#c#d |  50.000005          1   50.000005     48.57  0.0000
       b#c#d |  22.111251          1   22.111251     21.48  0.0003
     a#b#c#d |  13.781253          1   13.781253     13.39  0.0021
             |
    Residual |  16.469993         16   1.0293746
  -----------+----------------------------------------------------
       Total |  1383.6688         31   44.634477

. anova y a b c d a#b#c#d

             Number of obs =         32    R-squared     =  0.9881
             Root MSE      =    1.01458    Adj R-squared =  0.9769

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  1367.1988         15   91.146585     88.55  0.0000
             |
           a |  120.90125          1   120.90125    117.45  0.0000
           b |  204.02001          1   204.02001    198.20  0.0000
           c |  472.78124          1   472.78124    459.29  0.0000
           d |  335.40503          1   335.40503    325.83  0.0000
     a#b#c#d |  234.09127         11   21.281024     20.67  0.0000
             |
    Residual |  16.469993         16   1.0293746
  -----------+----------------------------------------------------
       Total |  1383.6688         31   44.634477

.
. regress /* some indication that cell (1,1,1,1) is extreme */

      Source |       SS           df       MS      Number of obs   =        32
-------------+----------------------------------   F(15, 16)       =     88.55
       Model |  1367.19878        15  91.1465853   Prob > F        =    0.0000
    Residual |  16.4699933        16  1.02937458   R-squared       =    0.9881
-------------+----------------------------------   Adj R-squared   =    0.9769
       Total |  1383.66877        31  44.6344765   Root MSE        =    1.0146

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         2.a |      -15.6   1.014581   -15.38   0.000    -17.75082   -13.44918
         2.b |       -5.7   1.014581    -5.62   0.000    -7.850815   -3.549184
         2.c |  -3.500001   1.014581    -3.45   0.003    -5.650817   -1.349185
         2.d |       -4.5   1.014581    -4.44   0.000    -6.650816   -2.349184
             |
     a#b#c#d |
    1 1 2 2  |      12.75   1.434834     8.89   0.000     9.708288    15.79171
    1 2 1 2  |       12.4   1.434834     8.64   0.000     9.358287    15.44171
    1 2 2 1  |      12.05   1.434834     8.40   0.000     9.008288    15.09171
    1 2 2 2  |       25.3   2.029162    12.47   0.000     20.99837    29.60163
    2 1 1 2  |       14.5   1.434834    10.11   0.000     11.45829    17.54171
    2 1 2 1  |      13.55   1.434834     9.44   0.000     10.50829    16.59171
    2 1 2 2  |      25.55   2.029162    12.59   0.000     21.24837    29.85163
    2 2 1 1  |      12.15   1.434834     8.47   0.000     9.108287    15.19171
    2 2 1 2  |      25.55   2.029162    12.59   0.000     21.24837    29.85163
    2 2 2 1  |       27.7   2.029162    13.65   0.000     23.39837    32.00163
    2 2 2 2  |       37.2   2.684329    13.86   0.000     31.50948    42.89052
             |
       _cons |       26.8   .7174171    37.36   0.000     25.27914    28.32086
------------------------------------------------------------------------------

. generate cell_1111=(a==1)*(b==1)*(c==1)*(d==1)
. anova y a b c d cell_1111 a##b##c##d, sequential /* note: unnatural order */

             Number of obs =         32    R-squared     =  0.9881
             Root MSE      =    1.01458    Adj R-squared =  0.9769

      Source |    Seq. SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  1367.1988         15   91.146585     88.55  0.0000
             |
           a |  120.90125          1   120.90125    117.45  0.0000
           b |  204.02001          1   204.02001    198.20  0.0000
           c |  472.78124          1   472.78124    459.29  0.0000
           d |  335.40503          1   335.40503    325.83  0.0000
   cell_1111 |  217.35104          1   217.35104    211.15  0.0000
         a#b |  .04510221          1   .04510221      0.04  0.8368
         a#c |  .30012612          1   .30012612      0.29  0.5967
         b#c |    .765625          1     .765625      0.74  0.4012
       a#b#c |   .9472327          1    .9472327      0.92  0.3517
         a#d |  .34714256          1   .34714256      0.34  0.5695
         b#d |     1.8375          1      1.8375      1.79  0.2002
       a#b#d |  1.4062471          1   1.4062471      1.37  0.2596
         c#d |  5.1337486          1   5.1337486      4.99  0.0402
       a#c#d |  5.4675005          1   5.4675005      5.31  0.0349
       b#c#d |  .48999973          1   .48999973      0.48  0.5001
     a#b#c#d |          0          0
             |
    Residual |  16.469993         16   1.0293746
  -----------+----------------------------------------------------
       Total |  1383.6688         31   44.634477

. anova y a b c d cell_1111

             Number of obs =         32    R-squared     =  0.9760
             Root MSE      =    1.13018    Adj R-squared =  0.9714

      Source | Partial SS         df         MS        F    Prob>F
  -----------+----------------------------------------------------
       Model |  1350.4586          5   270.09171    211.45  0.0000
             |
           a |  39.331851          1   39.331851     30.79  0.0000
           b |  321.53343          1   321.53343    251.73  0.0000
           c |  628.69227          1   628.69227    492.20  0.0000
           d |  474.81596          1   474.81596    371.73  0.0000
   cell_1111 |  217.35104          1   217.35104    170.16  0.0000
             |
    Residual |  33.210218         26   1.2773161
  -----------+----------------------------------------------------
       Total |  1383.6688         31   44.634477

.
. import delimited ch08ta9.csv, clear /* Example 8.10 */
(encoding automatically selected: ISO-8859-2)
(7 vars, 96 obs)

. egen tx=group(at gt v)
.
. xi: boxcox amylase i.tx
i.tx              _Itx_1-32           (naturally coded; _Itx_1 omitted)
Fitting comparison model

Iteration 0:  log likelihood = -542.90683
Iteration 1:  log likelihood =  -542.5619
Iteration 2:  log likelihood = -542.56179
Iteration 3:  log likelihood = -542.56179

Fitting full model

Iteration 0:  log likelihood = -427.06096
Iteration 1:  log likelihood = -424.07141
Iteration 2:  log likelihood = -424.06421
Iteration 3:  log likelihood = -424.06421

                                                  Number of obs   =         96
                                                  LR chi2(31)     =     237.00
Log likelihood = -424.06421                       Prob > chi2     =      0.000

------------------------------------------------------------------------------
     amylase | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      /theta |   .0199989   .3958468     0.05   0.960    -.7558465    .7958443
------------------------------------------------------------------------------

Estimates of scale-variant parameters
----------------------------
             | Coefficient
-------------+--------------
Notrans      |
      _Itx_2 |  -.2257092
      _Itx_3 |  -.1653663
      _Itx_4 |  -.3410324
      _Itx_5 |   .2061606
      _Itx_6 |  -.0799097
      _Itx_7 |    .096261
      _Itx_8 |   .0100406
      _Itx_9 |   .3149527
     _Itx_10 |   .0366516
     _Itx_11 |   .1889919
     _Itx_12 |   .1886505
     _Itx_13 |   .2654907
     _Itx_14 |   .0600479
     _Itx_15 |   .1791769
     _Itx_16 |   .1178467
     _Itx_17 |   .1514475
     _Itx_18 |   .0031704
     _Itx_19 |   .0760831
     _Itx_20 |  -.1154264
     _Itx_21 |  -.0755018
     _Itx_22 |  -.2582509
     _Itx_23 |  -.0948975
     _Itx_24 |  -.2019227
     _Itx_25 |  -.1874994
     _Itx_26 |  -.4267404
     _Itx_27 |   -.203189
     _Itx_28 |  -.3183826
     _Itx_29 |  -.2422305
     _Itx_30 |  -.6219556
     _Itx_31 |  -.3050953
     _Itx_32 |  -.4370663
       _cons |   6.232128
-------------+--------------
      /sigma |   .0677822
----------------------------

---------------------------------------------------------
   Test         Restricted     LR statistic
    H0:       log likelihood       chi2       Prob > chi2
---------------------------------------------------------
theta = -1     -427.42898          6.73          0.009
theta =  0     -424.06549          0.00          0.960
theta =  1     -427.06096          5.99          0.014
---------------------------------------------------------

. generate lnam=ln(amylase)
. anova lnam atemp##gtemp##v

                Number of obs =         96    R-squared     =  0.9168
                Root MSE      =    .073916    Adj R-squared =  0.8765

         Source | Partial SS         df         MS        F    Prob>F
  --------------+----------------------------------------------------
          Model |  3.8523521         31   .12426942     22.74  0.0000
                |
          atemp |  3.0161274          7   .43087535     78.86  0.0000
          gtemp |  .00437953          1   .00437953      0.80  0.3740
    atemp#gtemp |  .08105969          7   .01157996      2.12  0.0539
              v |  .58956976          1   .58956976    107.91  0.0000
        atemp#v |  .02758221          7   .00394032      0.72  0.6544
        gtemp#v |  .08599279          1   .08599279     15.74  0.0002
  atemp#gtemp#v |  .04764068          7   .00680581      1.25  0.2916
                |
       Residual |  .34967081         64   .00546361
  --------------+----------------------------------------------------
          Total |  4.2020229         95   .04423182

. anova lnam atemp##gtemp gtemp##v

              Number of obs =         96    R-squared     =  0.8989
              Root MSE      =    .073806    Adj R-squared =  0.8768

       Source | Partial SS         df         MS        F    Prob>F
  ------------+----------------------------------------------------
        Model |  3.7771292         17   .22218407     40.79  0.0000
              |
        atemp |  3.0161274          7   .43087535     79.10  0.0000
        gtemp |  .00437953          1   .00437953      0.80  0.3727
  atemp#gtemp |  .08105969          7   .01157996      2.13  0.0504
            v |  .58956976          1   .58956976    108.23  0.0000
      gtemp#v |  .08599279          1   .08599279     15.79  0.0002
              |
     Residual |   .4248937         78   .00544736
  ------------+----------------------------------------------------
        Total |  4.2020229         95   .04423182

. * atemp modelled as continuous
.
. anova lnam c.atemp##gtemp gtemp##v

              Number of obs =         96    R-squared     =  0.3786
              Root MSE      =    .170336    Adj R-squared =  0.3440

       Source | Partial SS         df         MS        F    Prob>F
  ------------+----------------------------------------------------
        Model |  1.5907417          5   .31814835     10.97  0.0000
              |
        atemp |  .87537029          1   .87537029     30.17  0.0000
        gtemp |   .0214632          1    .0214632      0.74  0.3920
  gtemp#atemp |  .03542937          1   .03542937      1.22  0.2721
            v |  .58956976          1   .58956976     20.32  0.0000
      gtemp#v |  .08599279          1   .08599279      2.96  0.0886
              |
     Residual |  2.6112812         90   .02901424
  ------------+----------------------------------------------------
        Total |  4.2020229         95   .04423182

. regress

      Source |       SS           df       MS      Number of obs   =        96
-------------+----------------------------------   F(5, 90)        =     10.97
       Model |  1.59074174         5  .318148348   Prob > F        =    0.0000
    Residual |  2.61128117        90  .029014235   R-squared       =    0.3786
-------------+----------------------------------   Adj R-squared   =    0.3440
       Total |  4.20202291        95   .04423182   Root MSE        =    .17034

-------------------------------------------------------------------------------
         lnam | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
        atemp |    .007507   .0024196     3.10   0.003        .0027    .0123139
     25.gtemp |  -.0154915    .094255    -0.16   0.870    -.2027455    .1717624
              |
gtemp#c.atemp |
          25  |   .0037812   .0034218     1.11   0.272    -.0030168    .0105792
              |
          2.v |  -.0968751   .0491717    -1.97   0.052    -.1945632     .000813
              |
      gtemp#v |
        25 2  |  -.1197169   .0695392    -1.72   0.089    -.2578687     .018435
              |
        _cons |   5.671236   .0666483    85.09   0.000     5.538827    5.803644
-------------------------------------------------------------------------------

.
. anova lnam c.atemp##c.atemp##gtemp gtemp##v

                    Number of obs =         96    R-squared     =  0.8759
                    Root MSE      =     .07698    Adj R-squared =  0.8660

             Source | Partial SS         df         MS        F    Prob>F
  ------------------+----------------------------------------------------
              Model |  3.6805469          7   .52579241     88.73  0.0000
                    |
              atemp |  2.4887306          1   2.4887306    419.98  0.0000
        atemp#atemp |  2.0897161          1   2.0897161    352.64  0.0000
              gtemp |  .00217869          1   .00217869      0.37  0.5458
        gtemp#atemp |  .00044304          1   .00044304      0.07  0.7852
  gtemp#atemp#atemp |  .00008904          1   .00008904      0.02  0.9027
                  v |  .58956976          1   .58956976     99.49  0.0000
            gtemp#v |  .08599279          1   .08599279     14.51  0.0003
                    |
           Residual |  .52147602         88   .00592586
  ------------------+----------------------------------------------------
              Total |  4.2020229         95   .04423182

. regress

      Source |       SS           df       MS      Number of obs   =        96
-------------+----------------------------------   F(7, 88)        =     88.73
       Model |  3.68054689         7  .525792413   Prob > F        =    0.0000
    Residual |  .521476018        88  .005925864   R-squared       =    0.8759
-------------+----------------------------------   Adj R-squared   =    0.8660
       Total |  4.20202291        95   .04423182   Root MSE        =    .07698

---------------------------------------------------------------------------------------
                 lnam | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
----------------------+----------------------------------------------------------------
                atemp |   .0969251   .0067791    14.30   0.000      .083453    .1103971
                      |
      c.atemp#c.atemp |  -.0018086   .0001353   -13.37   0.000    -.0020775   -.0015397
                      |
             25.gtemp |  -.0036131   .1058513    -0.03   0.973    -.2139703    .2067441
                      |
        gtemp#c.atemp |
                  25  |   .0026214   .0095871     0.27   0.785    -.0164309    .0216737
                      |
gtemp#c.atemp#c.atemp |
                  25  |   .0000235   .0001914     0.12   0.903    -.0003569    .0004038
                      |
                  2.v |  -.0968751   .0222221    -4.36   0.000    -.1410369   -.0527133
                      |
              gtemp#v |
                25 2  |  -.1197169   .0314268    -3.81   0.000    -.1821711   -.0572627
                      |
                _cons |   4.755444   .0748482    63.53   0.000     4.606699    4.904189
---------------------------------------------------------------------------------------

.
anova lnam c.atemp##c.atemp##c.atemp##gtemp gtemp##v

                       Number of obs =         96    R-squared     =  0.8928
                       Root MSE      =    .072366    Adj R-squared =  0.8816

                 Source | Partial SS         df         MS        F    Prob>F
------------------------+----------------------------------------------------
                  Model |  3.7516521          9   .41685023     79.60  0.0000
                        |
                  atemp |  .02573811          1   .02573811      4.91  0.0293
            atemp#atemp |  .00253925          1   .00253925      0.48  0.4881
      atemp#atemp#atemp |  .04199339          1   .04199339      8.02  0.0058
                  gtemp |  .03099672          1   .03099672      5.92  0.0171
            gtemp#atemp |  .02921493          1   .02921493      5.58  0.0204
      gtemp#atemp#atemp |  .02844537          1   .02844537      5.43  0.0221
gtemp#atemp#atemp#atemp |   .0291118          1    .0291118      5.56  0.0207
                      v |  .58956976          1   .58956976    112.58  0.0000
                gtemp#v |  .08599279          1   .08599279     16.42  0.0001
                        |
               Residual |  .45037083         86   .00523687
------------------------+----------------------------------------------------
                  Total |  4.2020229         95   .04423182

. 
. * Example 10.1: analysis with missing value
. anova lnam atemp##gtemp gtemp##v if _n>1

           Number of obs =         95    R-squared     =  0.9006
           Root MSE      =    .073411    Adj R-squared =  0.8787

     Source | Partial SS         df         MS        F    Prob>F
------------+----------------------------------------------------
      Model |  3.7596044         17    .2211532     41.04  0.0000
            |
      atemp |  3.0260185          7   .43228835     80.22  0.0000
      gtemp |  .00297517          1   .00297517      0.55  0.4597
atemp#gtemp |  .06702757          7   .00957537      1.78  0.1040
          v |  .56512284          1   .56512284    104.86  0.0000
    gtemp#v |  .07849559          1   .07849559     14.57  0.0003
            |
   Residual |  .41496137         77   .00538911
------------+----------------------------------------------------
      Total |  4.1745657         94   .04441027

. margins gtemp#v, asbalanced /* least squares means */

Adjusted predictions                                        Number of obs = 95

Expression: Linear prediction, predict()
At: atemp (asbalanced)
    gtemp (asbalanced)
    v     (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     gtemp#v |
       13 1  |   5.847649   .0149849   390.24   0.000     5.817811    5.877488
       13 2  |   5.750774   .0149849   383.77   0.000     5.720936    5.780613
       25 1  |   5.916409   .0153643   385.08   0.000     5.885815    5.947004
       25 2  |   5.704424   .0149849   380.68   0.000     5.674586    5.734263
------------------------------------------------------------------------------

. margins gtemp#v /* all estimates different */

Predictive margins                                          Number of obs = 95

Expression: Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     gtemp#v |
       13 1  |   5.849246   .0149878   390.27   0.000     5.819402    5.879091
       13 2  |   5.752371   .0149878   383.80   0.000     5.722527    5.782216
       25 1  |   5.917001     .01534   385.72   0.000     5.886455    5.947547
       25 2  |   5.705016   .0149883   380.63   0.000      5.67517    5.734861
------------------------------------------------------------------------------

. 
end of do-file