-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      name:  <unnamed>
       log:  C:\vhm812-data\L5a-log_reg_dx.txt
  log type:  text
 opened on:   2 Feb 2016, 09:46:52

. 
. * open the Nocardia dataset
. use nocardia.dta, clear

. sum dcpct

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       dcpct |        108    75.56481     37.3964          0        100

. tab dcpct

   Pcnt. of |
   cows dry |
    treated |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          7        6.48        6.48
          1 |          2        1.85        8.33
          3 |          1        0.93        9.26
          5 |          3        2.78       12.04
          7 |          1        0.93       12.96
         10 |          1        0.93       13.89
         14 |          1        0.93       14.81
         20 |          2        1.85       16.67
         25 |          3        2.78       19.44
         30 |          2        1.85       21.30
         40 |          1        0.93       22.22
         50 |          7        6.48       28.70
         75 |          4        3.70       32.41
         80 |          1        0.93       33.33
         83 |          1        0.93       34.26
         90 |          1        0.93       35.19
         95 |          1        0.93       36.11
         99 |          3        2.78       38.89
        100 |         66       61.11      100.00
------------+-----------------------------------
      Total |        108      100.00

. egen dcpct3=cut(dcpct), at(0,50,100,1000)

. tab dcpct3

     dcpct3 |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         24       22.22       22.22
         50 |         18       16.67       38.89
        100 |         66       61.11      100.00
------------+-----------------------------------
      Total |        108      100.00

. 
. * residuals one per covariate pattern
. * fitting a logistic model
. logit casecont dneo##dclox i.dcpct3

Iteration 0:   log likelihood = -74.859896  
Iteration 1:   log likelihood = -52.081216  
Iteration 2:   log likelihood = -51.634967  
Iteration 3:   log likelihood = -51.632242  
Iteration 4:   log likelihood = -51.632242  

Logistic regression                             Number of obs     =        108
                                                LR chi2(5)        =      46.46
                                                Prob > chi2       =     0.0000
Log likelihood = -51.632242                     Pseudo R2         =     0.3103

------------------------------------------------------------------------------
    casecont |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dneo |
        yes  |    3.19238   .8361783     3.82   0.000       1.5535    4.831259
             |
       dclox |
        yes  |   .4529145   1.026657     0.44   0.659    -1.559296    2.465125
             |
  dneo#dclox |
    yes#yes  |  -2.532558   1.207714    -2.10   0.036    -4.899634   -.1654829
             |
      dcpct3 |
         50  |   1.361002    .819178     1.66   0.097    -.2445579    2.966561
        100  |   2.026562   .6855237     2.96   0.003     .6829604    3.370164
             |
       _cons |  -3.531226   .9364287    -3.77   0.000    -5.366593    -1.69586
------------------------------------------------------------------------------

. 
. * examining the covariate patterns
. predict cov, num

. predict pv, p

. sort cov

. * generate a count of the number of obs. in each cov. pattern
. quietly by cov: gen cnt=_N

. br cov cnt dcpct3 dneo dclox pv casecont

. 
. * examining Pearson residuals
. predict pear, res /*one per covariate pattern*/

. format pv pear  %5.3f

. sort pear

. summ pear

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        pear |        108    .1413821    .5669692  -.5835651   2.359985

. list cov cnt dcpct dneo dclox pv casecont pear  if abs(pear)>2, noobs sep(4)

  +-------------------------------------------------------------+
  | cov   cnt   dcpct   dneo   dclox      pv   casecont    pear |
  |-------------------------------------------------------------|
  |   4     1      83     no     yes   0.152        yes   2.360 |
  +-------------------------------------------------------------+
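
The fitted probability and Pearson residual listed for covariate pattern 4 can be reproduced by hand. The following is a minimal sketch, assuming the usual inverse-logit link and the Pearson residual for a covariate pattern with m observations and y cases. The coefficients are copied from the model output above; the Python variable names are illustrative and not part of the original do-file.

```python
import math

# Coefficients copied from the logit output (casecont on dneo##dclox i.dcpct3).
b_cons = -3.531226
b_dclox_yes = 0.4529145
b_dcpct3_50 = 1.361002

# Covariate pattern 4: dneo = no, dclox = yes, dcpct3 = 50 (dcpct = 83).
xb = b_cons + b_dclox_yes + b_dcpct3_50
pv = 1.0 / (1.0 + math.exp(-xb))

# Pearson residual for m observations with y cases: (y - m*p) / sqrt(m*p*(1-p)).
m, y = 1, 1
pear = (y - m * pv) / math.sqrt(m * pv * (1.0 - pv))

print(f"pv = {pv:.3f}, pear = {pear:.3f}")
```

Both values should agree with the listing (pv 0.152, pear 2.360) to within rounding.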

. 
. * Pearson goodness-of-fit tests
. *summary table - page 6
. preserve

. gen pear_sq=pear^2

. egen pos=sum(casecont), by(cov)

. collapse pv cnt pear pear_sq pos, by(cov casecont)

. sort cov casecont

. foreach var in pv cnt pear pos pear_sq {
  2. by cov:replace `var'=. if _n>1
  3. }
(8 real changes made, 8 to missing)
(8 real changes made, 8 to missing)
(8 real changes made, 8 to missing)
(8 real changes made, 8 to missing)
(8 real changes made, 8 to missing)

. table casecont, by(cov)  c(mean cnt mean pos mean pv mean pear mean pear_sq )

--------------------------------------------------------------------------------
covariate |
pattern   |
and Case  |
- Control |    mean(cnt)     mean(pos)      mean(pv)    mean(pear)  mean(pear~q)
----------+---------------------------------------------------------------------
1         |
       no |           12             1         0.028         1.144      1.308949
      yes |                                                                     
----------+---------------------------------------------------------------------
2         |
       no |            2             0         0.102        -0.478      .2283039
      yes |                                                                     
----------+---------------------------------------------------------------------
3         |
       no |            8             1         0.182        -0.416      .1731429
      yes |                                                                     
----------+---------------------------------------------------------------------
4         |
       no |                                                                     
      yes |            1             1         0.152         2.360      5.569528
----------+---------------------------------------------------------------------
5         |
       no |           11             2         0.259        -0.584      .3405482
      yes |                                                                     
----------+---------------------------------------------------------------------
6         |
       no |           11             4         0.416        -0.353      .1245677
      yes |                                                                     
----------+---------------------------------------------------------------------
7         |
       no |           10             7         0.735        -0.254      .0643712
      yes |                                                                     
----------+---------------------------------------------------------------------
8         |
       no |           38            33         0.844         0.416      .1731366
      yes |                                                                     
----------+---------------------------------------------------------------------
9         |
       no |            1             0         0.082        -0.298      .0890559
      yes |                                                                     
----------+---------------------------------------------------------------------
10        |
       no |            5             1         0.258        -0.295      .0872724
      yes |                                                                     
----------+---------------------------------------------------------------------
11        |
       no |            9             4         0.403         0.252      .0634578
      yes |                                                                     
--------------------------------------------------------------------------------

. qui summ pear_sq if cnt~=. //command to capture the sum of pear_sq

. di "Pearson X2 = " r(sum) "  Prob > chi2 =" chi2tail(11-6,r(sum))
Pearson X2 = 8.2223332  Prob > chi2 =.14440061

. restore

. 
. * Pearson GOF  
. ** Stata postestimation command
. estat gof

Logistic model for casecont, goodness-of-fit test

       number of observations =       108
 number of covariate patterns =        11
              Pearson chi2(5) =         8.22
                  Prob > chi2 =         0.1444
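
The Pearson statistic is the sum of the squared Pearson residuals over the covariate patterns, referred to a chi-square distribution with (number of patterns minus number of estimated parameters) degrees of freedom. The sketch below recomputes it from the pear_sq column of the summary table. The helper chi2_tail_df5 is a hypothetical name, and its closed form is valid only for 5 degrees of freedom; it stands in for Stata's chi2tail() so that the example needs nothing beyond the Python standard library.

```python
import math

# Squared Pearson residuals, one per covariate pattern (from the summary table).
pear_sq = [1.308949, 0.2283039, 0.1731429, 5.569528, 0.3405482, 0.1245677,
           0.0643712, 0.1731366, 0.0890559, 0.0872724, 0.0634578]

x2 = sum(pear_sq)
df = len(pear_sq) - 6   # 11 covariate patterns minus 6 estimated parameters


def chi2_tail_df5(x):
    """Upper-tail chi-square probability; closed form valid for 5 df only."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0)
            * (math.sqrt(x) + x ** 1.5 / 3.0))


p_value = chi2_tail_df5(x2)
print(f"Pearson X2 = {x2:.4f}, df = {df}, Prob > chi2 = {p_value:.4f}")
```

The result should match both the hand calculation and the estat gof output (chi2 of about 8.22 on 5 df, p of about 0.144).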

. 
. * Hosmer - Lemeshow Test
. estat gof, g(10) table

Logistic model for casecont, goodness-of-fit test

  (Table collapsed on quantiles of estimated probabilities)
  (There are only 7 distinct quantiles because of ties)
  +--------------------------------------------------------+
  | Group |   Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |
  |-------+--------+-------+-------+-------+-------+-------|
  |     1 | 0.0284 |     1 |   0.3 |    11 |  11.7 |    12 |
  |     2 | 0.1817 |     2 |   1.9 |    10 |  10.1 |    12 |
  |     3 | 0.2589 |     3 |   4.1 |    13 |  11.9 |    16 |
  |     4 | 0.4033 |     4 |   3.6 |     5 |   5.4 |     9 |
  |     5 | 0.4161 |     4 |   4.6 |     7 |   6.4 |    11 |
  |-------+--------+-------+-------+-------+-------+-------|
  |     6 | 0.7354 |     7 |   7.4 |     3 |   2.6 |    10 |
  |    10 | 0.8439 |    33 |  32.1 |     5 |   5.9 |    38 |
  +--------------------------------------------------------+

       number of observations =       108
             number of groups =         7
      Hosmer-Lemeshow chi2(5) =         2.16
                  Prob > chi2 =         0.8262

. 
. * Evaluating Important Observations in a Logistic Model
. * fitting a logistic model
. logit casecont i.dneo##dclox i.dcpct3 

Iteration 0:   log likelihood = -74.859896  
Iteration 1:   log likelihood = -52.081216  
Iteration 2:   log likelihood = -51.634967  
Iteration 3:   log likelihood = -51.632242  
Iteration 4:   log likelihood = -51.632242  

Logistic regression                             Number of obs     =        108
                                                LR chi2(5)        =      46.46
                                                Prob > chi2       =     0.0000
Log likelihood = -51.632242                     Pseudo R2         =     0.3103

------------------------------------------------------------------------------
    casecont |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dneo |
        yes  |    3.19238   .8361783     3.82   0.000       1.5535    4.831259
             |
       dclox |
        yes  |   .4529145   1.026657     0.44   0.659    -1.559296    2.465125
             |
  dneo#dclox |
    yes#yes  |  -2.532558   1.207714    -2.10   0.036    -4.899634   -.1654829
             |
      dcpct3 |
         50  |   1.361002    .819178     1.66   0.097    -.2445579    2.966561
        100  |   2.026562   .6855237     2.96   0.003     .6829604    3.370164
             |
       _cons |  -3.531226   .9364287    -3.77   0.000    -5.366593    -1.69586
------------------------------------------------------------------------------

. * predicting residuals and influence statistics
. capture drop cov 

. capture drop pv  

. capture drop cnt

. capture drop lev 

. capture drop pear_std 

. capture drop dx2

. capture drop db

. predict pv, p

. predict pear_std, rstandard

. predict lev, hat

. predict dx2, dx2 

. predict db, dbeta

. predict cov, num

. 
. **additional variables for listings and formatting
. bysort cov: gen cnt=_N

. bysort cov:gen wcov=_n

. bysort cov: egen opr=mean(casecont)

. foreach var in opr pv pear_std lev dx2 db {
  2. format `var' %4.3f
  3. }

. 
. * Identifying highest leverage points
. summ lev, d

                          leverage
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .1063734       .0550704
 5%     .2907239       .1063734
10%     .2907239       .1662458       Obs                 108
25%     .7028956       .1662458       Sum of Wgt.         108

50%     .8515815                      Mean            .729414
                        Largest       Std. Dev.      .2256387
75%     .8515815       .9453593
90%     .9453593       .9453593       Variance       .0509128
95%     .9453593       .9453593       Skewness      -1.434562
99%     .9453593       .9453593       Kurtosis       3.850434

. * graph of stand. resid. vs leverage
. scatter lev pv, mlabel(cov) xline(0.1 0.9) xlabel(0(0.1)1)

. scatter pear_std  lev , mlabel(cov) yline(-2 2)

. sort lev pv 

. list cov cnt dcpct3 dneo dclox opr pv pear_std lev if  pv>0.1 & pv<0.9 & wcov==1, noobs 

  +----------------------------------------------------------------------+
  | cov   cnt   dcpct3   dneo   dclox     opr      pv   pear_std     lev |
  |----------------------------------------------------------------------|
  |   4     1       50     no     yes   1.000   0.152      2.496   0.106 |
  |   2     2       50     no      no   0.000   0.102     -0.523   0.166 |
  |  10     5       50    yes     yes   0.200   0.258     -0.416   0.496 |
  |   7    10       50    yes      no   0.700   0.735     -0.465   0.703 |
  |   3     8      100     no      no   0.125   0.182     -0.801   0.730 |
  |----------------------------------------------------------------------|
  |  11     9      100    yes     yes   0.444   0.403      0.518   0.764 |
  |   8    38      100    yes      no   0.868   0.844      1.080   0.852 |
  |   6    11        0    yes      no   0.364   0.416     -1.073   0.892 |
  |   5    11      100     no     yes   0.182   0.259     -2.496   0.945 |
  +----------------------------------------------------------------------+

. 
. * residual  
. sort pear_std

. twoway (scatter pear_std cov [aweight=cnt], msymbol(Oh) mlcolor(black) mlwidth(medium)) ///
>                 (scatter pear_std cov, msize(vtiny) mlabel(cov)), legend(off) yline( -2 2)

. list cov cnt dcpct3 dneo dclox opr pv pear_std if wcov==1 & abs(pear_std)>2 ,noobs 

  +--------------------------------------------------------------+
  | cov   cnt   dcpct3   dneo   dclox     opr      pv   pear_std |
  |--------------------------------------------------------------|
  |   5    11      100     no     yes   0.182   0.259     -2.496 |
  |   4     1       50     no     yes   1.000   0.152      2.496 |
  +--------------------------------------------------------------+

. 
. * evaluating delta chisq 
. scatter dx2 pv, mlabel(cov) yline(3.84) /*delta chi2*/

. sort dx2

. list cov cnt dcpct3 dneo dclox pv dx2 lev pear if dx2>3.84 & wcov==1, noobs 

  +--------------------------------------------------------------------+
  | cov   cnt   dcpct3   dneo   dclox      pv     dx2     lev     pear |
  |--------------------------------------------------------------------|
  |   4     1       50     no     yes   0.152   6.232   0.106    2.360 |
  |   5    11      100     no     yes   0.259   6.232   0.945   -0.584 |
  +--------------------------------------------------------------------+

. 
. * evaluating delta betas
. sort db

. summ db, d

                      Pregibon's dbeta
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0545994       .0054927
 5%     .1701865       .0545994
10%     .5125833       .0545994       Obs                 108
25%     .7564373       .1701865       Sum of Wgt.         108

50%      6.69328                      Mean           14.65434
                        Largest       Std. Dev.      31.68869
75%      6.69328       107.8308
90%     107.8308       107.8308       Variance       1004.173
95%     107.8308       107.8308       Skewness       2.581599
99%     107.8308       107.8308       Kurtosis        7.77457

. scatter db pv, ml(cov) yline(1)

. scatter db lev, ml(cov) yline(1)

. scatter dx2 pv [aweight=db], msymbol(Oh)  || scatter dx2 pv, ml(cov) yline(3.84) legend(off) ///
>         ytitle("Delta Chi2")

. sort db 

. l cov cnt dcpct dneo dclox opr pv lev dx2 db if abs(db) > 1 & wcov==1, noobs 

  +----------------------------------------------------------------------------+
  | cov   cnt   dcpct   dneo   dclox     opr      pv     lev     dx2        db |
  |----------------------------------------------------------------------------|
  |   3     8     100     no      no   0.125   0.182   0.730   0.642     1.739 |
  |   8    38     100    yes      no   0.868   0.844   0.852   1.167     6.693 |
  |   6    11       5    yes      no   0.364   0.416   0.892   1.152     9.504 |
  |   5    11     100     no     yes   0.182   0.259   0.945   6.232   107.831 |
  +----------------------------------------------------------------------------+
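
The diagnostics for covariate pattern 5 are linked by simple identities: the standardized Pearson residual is the Pearson residual divided by sqrt(1 - leverage), delta chi-square is its square, and Pregibon's delta-beta is delta chi-square times leverage / (1 - leverage). The sketch below applies them to the values in this log. It assumes these are the definitions behind predict's rstandard, dx2 and dbeta options; the variable names simply mirror the log.

```python
import math

# Covariate pattern 5: Pearson residual and leverage taken from the log
# (the minimum of pear and the largest value of lev both belong to pattern 5).
pear = -0.5835651
lev = 0.9453593

pear_std = pear / math.sqrt(1.0 - lev)   # standardized Pearson residual
dx2 = pear_std ** 2                      # delta chi-square
db = dx2 * lev / (1.0 - lev)             # Pregibon's delta-beta

print(f"pear_std = {pear_std:.3f}, dx2 = {dx2:.3f}, db = {db:.3f}")
```

The leverage close to 1 is what turns a moderate raw residual (-0.584) into a very large delta-beta: the factor lev / (1 - lev) is roughly 17 for this pattern.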

. 
. * dropping the highest db covariate pattern and refitting the model
. logit casecont dneo##dclox i.dcpct3 if cov~=5

note: 0.dneo#1.dclox != 0 predicts success perfectly
      0.dneo#1.dclox dropped and 1 obs not used

note: 1.dneo#1.dclox omitted because of collinearity
Iteration 0:   log likelihood = -66.354507  
Iteration 1:   log likelihood = -44.475074  
Iteration 2:   log likelihood = -44.191216  
Iteration 3:   log likelihood = -44.189538  
Iteration 4:   log likelihood = -44.189538  

Logistic regression                             Number of obs     =         96
                                                LR chi2(4)        =      44.33
                                                Prob > chi2       =     0.0000
Log likelihood = -44.189538                     Pseudo R2         =     0.3340

------------------------------------------------------------------------------
    casecont |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dneo |
        yes  |    3.24774   .8455286     3.84   0.000     1.590534    4.904945
             |
       dclox |
        yes  |  -2.081316   .6762839    -3.08   0.002    -3.406808   -.7558245
             |
  dneo#dclox |
     no#yes  |          0  (empty)
    yes#yes  |          0  (omitted)
             |
      dcpct3 |
         50  |   1.086617   .8180359     1.33   0.184    -.5167036    2.689938
        100  |    2.13252   .6950519     3.07   0.002     .7702429    3.494796
             |
       _cons |   -3.58071   .9466209    -3.78   0.000    -5.436053   -1.725367
------------------------------------------------------------------------------

. * the interaction cannot be estimated: cov 5 is the only cov. pattern with dneo=no and dclox=yes that has both cases and controls
. * the other cov. pattern with this combination is 4, but it has only one obs. (a case)
. table casecont dneo dclox

------------------------------------
          | Cloxacillin used on farm
          |and Neomycin used on farm
Case -    | --- no ---    --- yes --
Control   |   no   yes      no   yes
----------+-------------------------
       no |   20    15       9    10
      yes |    2    44       3     5
------------------------------------

. table casecont dneo dclox if cov~=5

------------------------------------
          | Cloxacillin used on farm
          |and Neomycin used on farm
Case -    | --- no ---    --- yes --
Control   |   no   yes      no   yes
----------+-------------------------
       no |   20    15            10
      yes |    2    44       1     5
------------------------------------

. 
. * refitting and comparing the models 
. logit casecont dneo##dclox i.dcpct3 

Iteration 0:   log likelihood = -74.859896  
Iteration 1:   log likelihood = -52.081216  
Iteration 2:   log likelihood = -51.634967  
Iteration 3:   log likelihood = -51.632242  
Iteration 4:   log likelihood = -51.632242  

Logistic regression                             Number of obs     =        108
                                                LR chi2(5)        =      46.46
                                                Prob > chi2       =     0.0000
Log likelihood = -51.632242                     Pseudo R2         =     0.3103

------------------------------------------------------------------------------
    casecont |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dneo |
        yes  |    3.19238   .8361783     3.82   0.000       1.5535    4.831259
             |
       dclox |
        yes  |   .4529145   1.026657     0.44   0.659    -1.559296    2.465125
             |
  dneo#dclox |
    yes#yes  |  -2.532558   1.207714    -2.10   0.036    -4.899634   -.1654829
             |
      dcpct3 |
         50  |   1.361002    .819178     1.66   0.097    -.2445579    2.966561
        100  |   2.026562   .6855237     2.96   0.003     .6829604    3.370164
             |
       _cons |  -3.531226   .9364287    -3.77   0.000    -5.366593    -1.69586
------------------------------------------------------------------------------

. estimate store final

. * without cov pattern 5
. logit casecont dneo##dclox i.dcpct3 if cov~=5, asis

Iteration 0:   log likelihood =  -66.98248  
Iteration 1:   log likelihood = -44.531702  
Iteration 2:   log likelihood = -44.212331  
Iteration 3:   log likelihood = -44.194711  
Iteration 4:   log likelihood = -44.190532  
Iteration 5:   log likelihood = -44.189712  
Iteration 6:   log likelihood = -44.189579  
Iteration 7:   log likelihood = -44.189547  
Iteration 8:   log likelihood =  -44.18954  
Iteration 9:   log likelihood = -44.189538  

Logistic regression                             Number of obs     =         97
                                                LR chi2(5)        =      45.59
                                                Prob > chi2       =     0.0000
Log likelihood = -44.189538                     Pseudo R2         =     0.3403

------------------------------------------------------------------------------
    casecont |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dneo |
        yes  |    3.24781   .8455337     3.84   0.000     1.590594    4.905025
             |
       dclox |
        yes  |   17.32151   1658.542     0.01   0.992     -3233.36    3268.003
             |
  dneo#dclox |
    yes#yes  |  -19.40303   1658.542    -0.01   0.991    -3270.085    3231.279
             |
      dcpct3 |
         50  |   1.086567   .8180393     1.33   0.184    -.5167607    2.689894
        100  |   2.132465   .6950524     3.07   0.002     .7701876    3.494743
             |
       _cons |  -3.580685   .9466216    -3.78   0.000    -5.436029   -1.725341
------------------------------------------------------------------------------

. estimates store wocov5

. * without cov pattern 6
. logit casecont dneo##dclox i.dcpct3 if cov~=6

Iteration 0:   log likelihood = -67.188877  
Iteration 1:   log likelihood = -44.268414  
Iteration 2:   log likelihood = -43.951293  
Iteration 3:   log likelihood = -43.949794  
Iteration 4:   log likelihood = -43.949794  

Logistic regression                             Number of obs     =         97
                                                LR chi2(5)        =      46.48
                                                Prob > chi2       =     0.0000
Log likelihood = -43.949794                     Pseudo R2         =     0.3459

------------------------------------------------------------------------------
    casecont |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dneo |
        yes  |   3.639127     .99651     3.65   0.000     1.686003    5.592251
             |
       dclox |
        yes  |   .8080203   1.132313     0.71   0.475    -1.411272    3.027313
             |
  dneo#dclox |
    yes#yes  |  -3.018298    1.34906    -2.24   0.025    -5.662408   -.3741881
             |
      dcpct3 |
         50  |   .1199465   1.407715     0.09   0.932    -2.639124    2.879017
        100  |   .8134231   1.299859     0.63   0.531    -1.734255    3.361101
             |
       _cons |  -2.671599   1.073272    -2.49   0.013    -4.775174   -.5680239
------------------------------------------------------------------------------

. estimates store wocov6

. * without cov pattern 8
. logit casecont dneo##dclox i.dcpct3 if cov~=8

Iteration 0:   log likelihood = -42.760501  
Iteration 1:   log likelihood = -36.670668  
Iteration 2:   log likelihood = -36.284326  
Iteration 3:   log likelihood = -36.280224  
Iteration 4:   log likelihood = -36.280224  

Logistic regression                             Number of obs     =         70
                                                LR chi2(5)        =      12.96
                                                Prob > chi2       =     0.0238
Log likelihood = -36.280224                     Pseudo R2         =     0.1515

------------------------------------------------------------------------------
    casecont |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dneo |
        yes  |   2.518415   .9964233     2.53   0.011     .5654611    4.471369
             |
       dclox |
        yes  |   .7045019   1.065468     0.66   0.508    -1.383776     2.79278
             |
  dneo#dclox |
    yes#yes  |   -2.05333   1.255048    -1.64   0.102     -4.51318    .4065195
             |
      dcpct3 |
         50  |   1.173495   .7931855     1.48   0.139    -.3811195     2.72811
        100  |   1.168346   1.025769     1.14   0.255     -.842125    3.178817
             |
       _cons |   -2.97189   .9807858    -3.03   0.002    -4.894194   -1.049585
------------------------------------------------------------------------------

. estimates store wocov8

. estimates table final wocov5 wocov6 wocov8 , b(%5.3f)  stats(N) star( .05 .01 .001)

------------------------------------------------------------------
    Variable |   final        wocov5       wocov6       wocov8    
-------------+----------------------------------------------------
        dneo |
        yes  |   3.192***     3.248***     3.639***     2.518*    
             |
       dclox |
        yes  |   0.453       17.322        0.808        0.705     
             |
  dneo#dclox |
    yes#yes  |  -2.533*     -19.403       -3.018*      -2.053     
             |
      dcpct3 |
         50  |   1.361        1.087        0.120        1.173     
        100  |   2.027**      2.132**      0.813        1.168     
             |
       _cons |  -3.531***    -3.581***    -2.672*      -2.972**   
-------------+----------------------------------------------------
           N |     108           97           97           70     
------------------------------------------------------------------
                             legend: * p<.05; ** p<.01; *** p<.001

. 
. *Predictive ability of the model
. * sensitivity and specificity of logistic model
. logit casecont dneo##dclox i.dcpct3 

Iteration 0:   log likelihood = -74.859896  
Iteration 1:   log likelihood = -52.081216  
Iteration 2:   log likelihood = -51.634967  
Iteration 3:   log likelihood = -51.632242  
Iteration 4:   log likelihood = -51.632242  

Logistic regression                             Number of obs     =        108
                                                LR chi2(5)        =      46.46
                                                Prob > chi2       =     0.0000
Log likelihood = -51.632242                     Pseudo R2         =     0.3103

------------------------------------------------------------------------------
    casecont |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dneo |
        yes  |    3.19238   .8361783     3.82   0.000       1.5535    4.831259
             |
       dclox |
        yes  |   .4529145   1.026657     0.44   0.659    -1.559296    2.465125
             |
  dneo#dclox |
    yes#yes  |  -2.532558   1.207714    -2.10   0.036    -4.899634   -.1654829
             |
      dcpct3 |
         50  |   1.361002    .819178     1.66   0.097    -.2445579    2.966561
        100  |   2.026562   .6855237     2.96   0.003     .6829604    3.370164
             |
       _cons |  -3.531226   .9364287    -3.77   0.000    -5.366593    -1.69586
------------------------------------------------------------------------------

. estat class

Logistic model for casecont

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |        40             8  |         48
     -     |        14            46  |         60
-----------+--------------------------+-----------
   Total   |        54            54  |        108

Classified + if predicted Pr(D) >= .5
True D defined as casecont != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   74.07%
Specificity                     Pr( -|~D)   85.19%
Positive predictive value       Pr( D| +)   83.33%
Negative predictive value       Pr(~D| -)   76.67%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   14.81%
False - rate for true D         Pr( -| D)   25.93%
False + rate for classified +   Pr(~D| +)   16.67%
False - rate for classified -   Pr( D| -)   23.33%
--------------------------------------------------
Correctly classified                        79.63%
--------------------------------------------------
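
Every percentage in the classification report follows from the four cell counts of the 2x2 table. A minimal sketch, with the counts copied from the table above and illustrative variable names:

```python
# Counts from the classification table at the default cutpoint of 0.5.
tp, fp = 40, 8     # classified +: true D, true ~D
fn, tn = 14, 46    # classified -: true D, true ~D

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
accuracy = (tp + tn) / (tp + fp + fn + tn)

for name, value in [("Sensitivity", sensitivity), ("Specificity", specificity),
                    ("PPV", ppv), ("NPV", npv), ("Correctly classified", accuracy)]:
    print(f"{name:22s}{100 * value:6.2f}%")
```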

. * two-graph ROC (sensitivity and specificity vs. probability cutpoint)
. lsens, lpattern(solid dash)

. * changing the cutpoint and producing an ROC curve and LR table
. egen pv_cat=cut(pv), at(0(.05)1) 

. roctab casecont pv_cat, graph sum detail 

Detailed report of sensitivity and specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
------------------------------------------------------------------------------
( >= 0 )          100.00%         0.00%       50.00%       1.0000     
( >= .05 )         98.15%        20.37%       59.26%       1.2326       0.0909
( >= .1 )          98.15%        22.22%       60.19%       1.2619       0.0833
( >= .15 )         98.15%        25.93%       62.04%       1.3250       0.0714
( >= .25 )         94.44%        38.89%       66.67%       1.5455       0.1429
( >= .4 )          88.89%        62.96%       75.93%       2.4000       0.1765
( >= .7 )          74.07%        85.19%       79.63%       5.0000       0.3043
( >= .8 )          61.11%        90.74%       75.93%       6.6000       0.4286
( >  .8 )           0.00%       100.00%       50.00%                    1.0000
------------------------------------------------------------------------------


                      ROC                    -Asymptotic Normal--
           Obs       Area     Std. Err.      [95% Conf. Interval]
     ------------------------------------------------------------
           108     0.8488       0.0370        0.77621     0.92132
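
The likelihood ratios in the detailed report are functions of sensitivity and specificity alone: LR+ = Se / (1 - Sp) and LR- = (1 - Se) / Sp. The sketch below checks the ( >= .7 ) row. It assumes that the reported 74.07% and 85.19% correspond to 40/54 and 46/54, which follows from the 54 cases and 54 controls in the data.

```python
# The ( >= .7 ) row reports Se = 74.07% and Sp = 85.19%; with 54 cases and
# 54 controls these correspond to 40/54 and 46/54.
se = 40 / 54
sp = 46 / 54

lr_pos = se / (1.0 - sp)        # likelihood ratio of a positive result
lr_neg = (1.0 - se) / sp        # likelihood ratio of a negative result

print(f"LR+ = {lr_pos:.4f}, LR- = {lr_neg:.4f}")
```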

. estat class, cut(0.25) // not in the notes - change cutpoint to 0.25 

Logistic model for casecont

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |        51            33  |         84
     -     |         3            21  |         24
-----------+--------------------------+-----------
   Total   |        54            54  |        108

Classified + if predicted Pr(D) >= .25
True D defined as casecont != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   94.44%
Specificity                     Pr( -|~D)   38.89%
Positive predictive value       Pr( D| +)   60.71%
Negative predictive value       Pr(~D| -)   87.50%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   61.11%
False - rate for true D         Pr( -| D)    5.56%
False + rate for classified +   Pr(~D| +)   39.29%
False - rate for classified -   Pr( D| -)   12.50%
--------------------------------------------------
Correctly classified                        66.67%
--------------------------------------------------

.                                           // lowering the cutpoint increases Se and decreases Sp
. * ROC plot and AUC (postestimation command)
. lroc

Logistic model for casecont

number of observations =      108
area under ROC curve   =   0.8460

. 
end of do-file

. log close
      name:  <unnamed>
       log:  C:\vhm812-data\L5a-log_reg_dx.txt
  log type:  text
 closed on:   2 Feb 2016, 09:47:11
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------