Use Enterprise Miner to fit a neural network.

1. Submit the following statements in SAS (with appropriate substitutions for "U:\").

   LIBNAME GFLib 'U:\dm\sasdata';
   LIBNAME RCLib 'U:\dm\RC';

2. If necessary, apply macros to convert Excel data to SAS (EXCELSAS, see MacroInstr.txt), to divide the data into training/test or training/validation/test subsets (RANSPLIT, see MacroInstr.txt), or to explore the data (UNIVAR and FREQ/FREQUENCY, see MacroInstr2.txt).

3. If necessary, assign numerical designations to categories and shorten variable names. Illustrative code may be found in MacroInstr5.txt.

4. In SAS, go to the "Solutions" menu. Go to "Analysis" and then select "Enterprise Miner". Do not invoke the tutorial (unless you want to do so to satisfy your own curiosity). From the "File" menu, choose "New" and then "Project". A box will appear with "Name" and "Location" fields as well as "Create", "Cancel", and "Browse" buttons. Press the "Browse" button and select a subdirectory such as 'U:\dm\RC'. In the "Name" field, type a name like 'NeuralNet'. Then press "Create".

5. In the left panel of the SAS Enterprise Miner window, you will see a diagram with 'NeuralNet' and, immediately below it, 'Untitled'. You can right-click 'Untitled' to assign a name such as 'Diabetes'.

6. Near the top of the SAS Enterprise Miner window, click the "Input Data Source" icon (far left) and, while holding the mouse button down, drag it into the right panel. Assuming that you have training, validation, and test data, repeat this process twice so that you have three "Input Data Source" icons in the right panel.

7. Double-click one of the "Input Data Source" icons in the right panel. Press the "Select" button and then choose an appropriate library such as 'RCLib'. Choose a training data set like 'DIABTR'. In the "Role" field, change "RAW" to "TRAIN". Then click the "Variables" tab near the top of the "Input Data Source" box.
You can alter entries in the "Model Role" column by right-clicking and then selecting "Set Model Role". The response variable should be identified as "target", the explanatory variables as "input", and the ID variable (if any) as "ID". Any variables that you know will not be used may be identified as "rejected". Also, make any necessary adjustments in the "Measurement" column. When finished, click the white-on-red X in the upper right corner of the "Input Data Source" box and confirm the changes. Assuming that you have validation and test data, repeat this process twice (except that "RAW" will be changed to "VALIDATE" and "TEST").

8. Near the top of the SAS Enterprise Miner window, click the "Neural Network" icon (middle-right) and drag it into the right panel. By holding the left mouse button down, 'draw' arrows from the data set icons to the "Neural Network" icon. Also, drag the "Reporter" icon (right) into the right panel, then 'draw' an arrow from the "Neural Network" icon to the "Reporter" icon.

9. Double-click the "Neural Network" icon. Click the "General" tab. For "Model selection criteria", choose "Average [Squared] Error" if the target is continuous and "Misclassification Rate" if the target is categorical. Click the "Basic" tab. For "Runtime limit", choose 10 minutes. Clicking the arrow for "Network architecture" will yield a menu with a "Hidden neurons" item. You can set the number of hidden neurons yourself or let Enterprise Miner decide based on the anticipated level of noise in your data; some guidelines are provided below. When finished, close the "Neural Network" box and assign a model name such as 'DiabNN'.

   Noise    Continuous target                         Categorical target
   High     50% reduction in average squared error    40% misclassified
   Medium   70% reduction in average squared error    25% misclassified
   Low      90% reduction in average squared error    10% misclassified

10. Right-click the "Neural Network" icon and choose "Run".
You will be asked if you want to view the results. You can answer "No" because you will get what you need in the next step.

11. Right-click the "Reporter" icon and choose "Run". You can "Open" the report now or view it later; note the subdirectory in which it has been saved. For a continuous target, you are interested in the average squared error for the training, validation, and test data sets. For a categorical target, you are interested in the misclassification rate for the training, validation, and test data sets. For either kind of target, you can click the "Datastep Score Code" link to get SAS code that will generate predictions for every individual in any data set that has the same explanatory variables as the training data set.
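
The two quantities named in step 11 can be illustrated with a small sketch. This is not Enterprise Miner output or SAS code; it is a minimal Python illustration that assumes the usual definitions (average squared error as the sum of squared prediction errors divided by the number of cases, and misclassification rate as the fraction of cases assigned to the wrong class). The function names and the tiny data values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def average_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def misclassification_rate(actual: Sequence[object], predicted: Sequence[object]) -> float:
    """Fraction of cases whose predicted class differs from the actual class."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a != p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values for illustration only (not taken from any real data set).
ase = average_squared_error([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 5.0])
rate = misclassification_rate(
    ["yes", "no", "no", "yes", "no"],
    ["yes", "no", "yes", "yes", "no"],
)
print(ase)   # 0.375
print(rate)  # 0.2
```

Computing the same quantity separately for the training, validation, and test data sets, as the report does, shows whether the fitted network performs comparably on data it was not trained on.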