Software needed:
  MLwiN [free to UK academic users]
  REALCOM-IMPUTE [free]
  MATLAB runtime installer [free]

All are available from http://www.cmm.bristol.ac.uk/
For the MATLAB installer, you must register your organisation and email on the site.

Data needed: class_size_data.wsz

(a) Fitting model of interest, eq(1), to the partially observed data:
=====================================================================

Start MLwiN.
From the MLwiN toolbar click on File->Open worksheet and select class_size_data.wsz
The 'Names' window appears showing the following variables:

  clsnr      - class identifier
  pupil      - pupil identifier
  nlitpre    - see Table 1
  nmatpre    - see Table 1
  nlitpost   - see Table 1
  nmatpost   - see Table 1
  csize      - categorical class size variable, categories
               1 : 19 or fewer; 2 : 20-24; 3 : 25-29; 4 : 30 or greater.
  cons       - constant
  csize20-24 - dummy variable indexing csize=2
  csize25-29 - dummy variable indexing csize=3
  csize>=30  - dummy variable indexing csize=4

From the MLwiN toolbar click on Model->Equations
The Equations window will appear, showing the model (1) in the paper.
From the toolbar click on 'Start'.
The model will be fitted by maximising the restricted likelihood (REML), giving the parameter estimates shown in Table 3, in the column headed 'Complete cases'.

(b) Creating imputation data set
================================

In MLwiN, from the toolbar, select Model -> Imputation -> Save Imputation Specification
A dialogue headed 'Specify imputation' will open.
In the 'Number of response variables' box, choose 5
In the 'Number of auxiliary variables' box, choose 1
In the 'Level 2 identifier' box, choose the level 2 identifier, 'clsnr', from the drop down menu.
In the 'Variables to be imputed' dialogue, click in the top left box and from the drop down menu select 'nmatpost'.
In the same dialogue, click in the top right box and from the drop down menu select 'Normal'.
In the same dialogue, click in the cell below 'nmatpost' and from the drop down menu select 'nlitpost'.
Click in the box immediately to the right of this and from the drop down menu select 'Normal'.
In the same dialogue, click in the cell below 'nlitpost' and from the drop down menu select 'nmatpre'.
Click in the box immediately to the right of this and from the drop down menu select 'Normal'.
In the same dialogue, click in the cell below 'nmatpre' and from the drop down menu select 'nlitpre'.
Click in the box immediately to the right of this and from the drop down menu select 'Normal'.
In the same dialogue, click in the cell below 'nlitpre' and from the drop down menu select 'csize'.
Click in the box immediately to the right of this and from the drop down menu select 'Unordered categorical'.
In the same dialogue, on the right hand side in the Auxiliary variables area, click in the 'Column' box and select the variable 'cons'.
The dialogue box should look like Figure 2.
Click on 'Done' at the centre bottom of this box.

[Note: if you chose a number greater than 5 in the 'Number of response variables' box, the table will have more rows than you require (row 6 and above), and you may get the message 'incomplete specification' when you click on 'Done'. In this case, enter '5' in the 'Number of response variables' box, which will resolve this.]

Now a dialogue appears asking for a file name to save the data for imputation.
We assume you choose the filename 'data_for_impute'.
Click on 'Save'. A .txt file will be created with the data for imputation.
Now it's easiest to minimise the MLwiN window (but not exit MLwiN).

(c) Imputing the missing values
===============================

Start the program REALCOM-IMPUTE.
Wait a few seconds, and a dialogue like the left side of Figure 3 will appear, headed 'Two level mixed response model'.
In this dialogue click on 'Open data file', change to the directory where you stored 'data_for_impute', select 'Plain text files' in 'Files of type', and then select 'data_for_impute'.
Click in the 'Show equations' box, and an Equations window, as shown in the right part of Figure 3, should appear.
Click on 'MCMC estimation settings' and the box 'MCMC estimation settings' should open.
Choose your burn in length (see paper): for example 2000
Choose the number of iterations (see paper): for example 10000
Choose the screen refresh rate: for example 100
Click on 'Done'.
Click on 'Impute' and the dialogue shown in Figure 4 should appear.
Enter the iteration numbers at which you wish to create imputations, and choose the directory name where the imputed data will be stored.
In the paper, we choose 500, 1000, 1500,...,10000.
Note, you do not choose the file name, just the directory name, so you may wish to use the 'create new directory' option here.
Click on 'Done'.
You should now be ready to start the MCMC process.
Return to the main dialogue and click on 'Start MCMC run'.
The DOS window behind the REALCOM-IMPUTE dialogue may be brought to the front to monitor progress.

NOTE: With the burn in length, the spacing between imputations and the number of imputations chosen above, a typical imputation run takes about one hour.

(d) Fitting the model of interest to each imputed data set and
    combining the results using Rubin's rules.
==============================================================

Return to MLwiN.
From the MLwiN toolbar select 'Model -> Imputation -> Retrieve Imputation'.
This prompts for the name of the file of imputed data created by REALCOM-IMPUTE.
From the MLwiN toolbar select 'Model -> Imputation -> Start Analysis'.
MLwiN then fits the model of interest to each imputed data set, combines the results using Rubin's rules, and displays them in the 'Equations' window (in blue).
Up to Monte-Carlo error, the results should agree with those shown in Table 3.
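
For readers who want to see what combining by Rubin's rules involves for a single parameter, here is a minimal sketch in Python. It is not MLwiN's code and its details may differ from what MLwiN does internally. The function name rubin_combine and the numbers in the usage lines are hypothetical and are not taken from Table 3. The pooled estimate is the mean of the per-imputation estimates, and the pooled variance is the average within-imputation variance plus (1 + 1/M) times the between-imputation variance, where M is the number of imputed data sets.

```python
import math


def rubin_combine(estimates, std_errors):
    """Pool one parameter across M imputed data sets using Rubin's rules.

    estimates  : the M point estimates, one per imputed data set
    std_errors : the M corresponding standard errors
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")

    # Pooled point estimate: the mean of the per-imputation estimates.
    q_bar = sum(estimates) / m

    # Within-imputation variance: the mean of the squared standard errors.
    within = sum(se ** 2 for se in std_errors) / m

    # Between-imputation variance: sample variance of the estimates.
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)

    # Total variance, inflated by (1 + 1/M) for the finite number of imputations.
    total = within + (1.0 + 1.0 / m) * between

    return q_bar, math.sqrt(total)


# Hypothetical estimates and standard errors from three imputed data sets.
pooled_estimate, pooled_se = rubin_combine([0.52, 0.47, 0.55], [0.10, 0.11, 0.10])
print(pooled_estimate, pooled_se)
```

The same calculation is applied separately to each parameter in the model of interest. The sketch returns only the pooled estimate and standard error and leaves out the degrees of freedom used for reference distributions.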