Use the LOGISTIC macro.

1. Issue the following commands to SAS (with appropriate substitutions for "U:\").

    LIBNAME GFLib 'U:\dm\sasdata' ;
    LIBNAME RCLib 'U:\dm\RC' ;

2. If necessary, apply macros to convert Excel data to SAS (EXCELSAS, see MacroInstr.txt), to divide the data into training/test or training/validation/test subsets (RANSPLIT, see MacroInstr.txt), or to explore the data (UNIVAR and FREQ/FREQUENCY, see MacroInstr2.txt).

3. If necessary, assign numerical designations to categories and shorten variable names so that you can fit them all in the appropriate fields of the LOGISTIC macro. For example:

    DATA RCLIB.SATRAIN2;
    SET RCLIB.SATRAIN;
    IF FAMHIST = 'Present' THEN FAM=1;
    IF FAMHIST = 'Absent' THEN FAM=0;
    to = tobacco;
    ad = adiposity;
    al = alcohol;
    ta = typea;
    ob = obesity;
    RUN;

    DATA RCLIB.SAVALID2;
    SET RCLIB.SAVALID;
    IF FAMHIST = 'Present' THEN FAM=1;
    IF FAMHIST = 'Absent' THEN FAM=0;
    to = tobacco;
    ad = adiposity;
    al = alcohol;
    ta = typea;
    ob = obesity;
    RUN;

    DATA RCLIB.SATEST2;
    SET RCLIB.SATEST;
    IF FAMHIST = 'Present' THEN FAM=1;
    IF FAMHIST = 'Absent' THEN FAM=0;
    to = tobacco;
    ad = adiposity;
    al = alcohol;
    ta = typea;
    ob = obesity;
    RUN;

4. Apply the LOGISTIC macro to your training data set. How to fill in the blanks is described on pages 187-191 of the textbook. I recommend converting non-dichotomous categorical variables to dichotomous indicators and then including both the continuous variables and dichotomous indicators in the fourth and fifth fields. (Filling in the third field changes the kind of output you get.)

For output like that in {RCLIB.SATRAIN224.rtf} (i.e., to obtain a fitted model and suggestions about which variables can be removed), try:

    RCLIB.SATRAIN2
    chd
    [blank]
    sbp to ldl ad fam ta ob al age
    sbp to ldl ad fam ta ob al age
    [blank]
    none
    [blank]
    [blank]
    subject
    24
    word
    U:\dm\RC\
    [blank]
    0.50

For output like that in {RCLIB.SATRAIN222.rtf} (i.e., to obtain diagnostic plots suggesting which quadratic/interaction terms may help), try:

    RCLIB.SATRAIN2
    chd
    [blank]
    sbp to ldl ad fam ta ob al age
    sbp to ldl ad fam ta ob al age
    yes
    none
    [blank]
    [blank]
    subject
    22
    word
    U:\dm\RC\
    [blank]
    0.50

The LSCORE macro does not work! You can use the code below to obtain correct classification rates for the validation (or test) subset.

If you insert the line
    WHERE CHD = 1;
immediately after the line starting with PROC MEANS, the code will give you sensitivities instead of correct classification rates.

If you insert the line
    WHERE CHD = 0;
immediately after the line starting with PROC MEANS, the code will give you specificities instead of correct classification rates.

For the validation subset in {savalid2.sas7bdat}, the correct classification rates range from 35.7% (when we predict CHD for any subject with estimated risk at least 5%) to 73.9% (when we predict CHD for any subject with estimated risk at least 50%). The sensitivities decline from 97.4% (when we predict CHD for any subject with estimated risk at least 5%) to 0% (when we predict CHD for any subject with estimated risk at least 95%).

    DATA RCLIB.SAVALID2;
    * Change RCLIB.SAVALID2 to the name of your validation (or test) subset. ;
    SET RCLIB.SAVALID2;
    ESTLOGITRISK = -6.9699 + SBP*0.0169 + TO*0.1255 + LDL*0.1399 + AD*0.0479
                 + FAM*0.9254 + TA*0.0383 + OB*-0.0584 + AL*-0.00531 + AGE*0.0163;
    * This fitted model is shown on page 15 of {RCLIB.SATRAIN224.rtf}. ;
    ESTRISK = exp(ESTLOGITRISK)/(1+exp(ESTLOGITRISK));
    PREDWITHCUTOFF05 = 1 - (ESTRISK < 0.05);
    PREDWITHCUTOFF10 = 1 - (ESTRISK < 0.10);
    PREDWITHCUTOFF15 = 1 - (ESTRISK < 0.15);
    PREDWITHCUTOFF20 = 1 - (ESTRISK < 0.20);
    PREDWITHCUTOFF25 = 1 - (ESTRISK < 0.25);
    PREDWITHCUTOFF30 = 1 - (ESTRISK < 0.30);
    PREDWITHCUTOFF35 = 1 - (ESTRISK < 0.35);
    PREDWITHCUTOFF40 = 1 - (ESTRISK < 0.40);
    PREDWITHCUTOFF45 = 1 - (ESTRISK < 0.45);
    PREDWITHCUTOFF50 = 1 - (ESTRISK < 0.50);
    PREDWITHCUTOFF55 = 1 - (ESTRISK < 0.55);
    PREDWITHCUTOFF60 = 1 - (ESTRISK < 0.60);
    PREDWITHCUTOFF65 = 1 - (ESTRISK < 0.65);
    PREDWITHCUTOFF70 = 1 - (ESTRISK < 0.70);
    PREDWITHCUTOFF75 = 1 - (ESTRISK < 0.75);
    PREDWITHCUTOFF80 = 1 - (ESTRISK < 0.80);
    PREDWITHCUTOFF85 = 1 - (ESTRISK < 0.85);
    PREDWITHCUTOFF90 = 1 - (ESTRISK < 0.90);
    PREDWITHCUTOFF95 = 1 - (ESTRISK < 0.95);
    CORRECT05 = (PREDWITHCUTOFF05 = CHD);
    * Change CHD to the name of your response variable. ;
    CORRECT10 = (PREDWITHCUTOFF10 = CHD);
    CORRECT15 = (PREDWITHCUTOFF15 = CHD);
    CORRECT20 = (PREDWITHCUTOFF20 = CHD);
    CORRECT25 = (PREDWITHCUTOFF25 = CHD);
    CORRECT30 = (PREDWITHCUTOFF30 = CHD);
    CORRECT35 = (PREDWITHCUTOFF35 = CHD);
    CORRECT40 = (PREDWITHCUTOFF40 = CHD);
    CORRECT45 = (PREDWITHCUTOFF45 = CHD);
    CORRECT50 = (PREDWITHCUTOFF50 = CHD);
    CORRECT55 = (PREDWITHCUTOFF55 = CHD);
    CORRECT60 = (PREDWITHCUTOFF60 = CHD);
    CORRECT65 = (PREDWITHCUTOFF65 = CHD);
    CORRECT70 = (PREDWITHCUTOFF70 = CHD);
    CORRECT75 = (PREDWITHCUTOFF75 = CHD);
    CORRECT80 = (PREDWITHCUTOFF80 = CHD);
    CORRECT85 = (PREDWITHCUTOFF85 = CHD);
    CORRECT90 = (PREDWITHCUTOFF90 = CHD);
    CORRECT95 = (PREDWITHCUTOFF95 = CHD);
    RUN;

    TITLE ' ';
    PROC MEANS DATA=RCLIB.SAVALID2 MEAN;
    VAR CORRECT05 CORRECT10 CORRECT15 CORRECT20 CORRECT25 CORRECT30 CORRECT35
        CORRECT40 CORRECT45 CORRECT50 CORRECT55 CORRECT60 CORRECT65 CORRECT70
        CORRECT75 CORRECT80 CORRECT85 CORRECT90 CORRECT95;
    RUN;
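
The repeated PREDWITHCUTOFFxx and CORRECTxx statements can also be written as a loop. What follows is a minimal sketch, not part of the course macros. It assumes the same fitted coefficients and the same variables in RCLIB.SAVALID2 as the code above, and the names WORK.SACUTOFFS, PCT, CUTOFF, PRED, and CORRECT are hypothetical, introduced only for illustration. The DATA step writes one row per subject per cutoff, so one PROC MEANS with a CLASS statement should report the correct classification rate at every cutoff. The second PROC MEANS adds CHD to the CLASS statement, so the CHD=1 rows are sensitivities and the CHD=0 rows are specificities, with no need to insert a WHERE line.

```sas
* Sketch only. WORK.SACUTOFFS, PCT, CUTOFF, PRED and CORRECT are hypothetical names. ;
DATA WORK.SACUTOFFS;
    SET RCLIB.SAVALID2;   * Change to the name of your validation (or test) subset. ;
    ESTLOGITRISK = -6.9699 + SBP*0.0169 + TO*0.1255 + LDL*0.1399 + AD*0.0479
                 + FAM*0.9254 + TA*0.0383 + OB*-0.0584 + AL*-0.00531 + AGE*0.0163;
    ESTRISK = exp(ESTLOGITRISK)/(1+exp(ESTLOGITRISK));
    * Loop over whole percentages so the cutoffs are not built up by repeated addition. ;
    DO PCT = 5 TO 95 BY 5;
        CUTOFF = PCT/100;
        PRED = (ESTRISK >= CUTOFF);   * 1 means predict CHD and 0 means predict no CHD. ;
        CORRECT = (PRED = CHD);       * Change CHD to the name of your response variable. ;
        OUTPUT;                       * One output row per subject per cutoff. ;
    END;
    KEEP PCT CUTOFF CHD ESTRISK PRED CORRECT;
RUN;

TITLE 'Correct classification rate by cutoff';
PROC MEANS DATA=WORK.SACUTOFFS MEAN;
    CLASS PCT;
    VAR CORRECT;
RUN;

TITLE 'Sensitivity (CHD=1) and specificity (CHD=0) by cutoff';
PROC MEANS DATA=WORK.SACUTOFFS MEAN;
    CLASS PCT CHD;
    VAR CORRECT;
RUN;
TITLE ' ';
```

Because the sketch writes to the WORK library, it leaves RCLIB.SAVALID2 unchanged. If you want the results kept between sessions, replace WORK.SACUTOFFS with a data set name in one of your own libraries.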