FORTRAN 90 AND SAS-IML PROGRAMS FOR COMPUTATION OF CRITICAL VALUES FOR MULTIPLE TESTING AND SIMULTANEOUS CONFIDENCE INTERVALS

Without loss of generality, we can assume all the hypotheses Hi are one-sided and of the form Hi: ci μ < δi, and all the confidence intervals for ci μ are also one-sided and of the form (-∞, γi). This is because the two-sided hypothesis Hi: ci μ ≠ δi can be regarded as a set of two one-sided hypotheses Hia: ciμ < δi, and Hib: ci μ < -δi. In similar fashion, a two-sided confidence interval (γia, γib) for ciμ can be regarded as a set of two simultaneous one-sided confidence intervals (-∞, γib) for ciμ, and (-∞, γia) for ciμ.


Introduction
Let x be an estimate of the k-vector mean µ µ µ µ, where x is MVN(µ µ µ µ, Σσ 2 ). We assume Σ is known and if we do not know σ 2 , we have an independent estimate s 2 of σ 2 with ν degrees of freedom such that ν s 2 / σ 2 is a chi-square variate. We have a set B of contrasts c 1 , c 2 , ... , c p , and are interested either in simultaneous hypotheses testing or obtaining simultaneous confidence intervals.
Without loss of generality, we can assume all the hypotheses H i are one-sided and of the form H i : c i ' µ µ µ µ < δ i , and all the confidence intervals for c i ' µ µ µ µ are also one-sided and of the form (-∞, γ i ). This is because the two-sided hypothesis H i : c i ' µ µ µ µ ≠ δ i can be regarded as a set of two one-sided hypotheses H ia : c i ' µ µ µ µ < δ i , and H ib :c i ' µ µ µ µ < -δ i . In similar fashion, a two-sided confidence interval (γ ia , γ ib ) for c i ' µ µ µ µ can be regarded as a set of two simultaneous one-sided confidence intervals (-∞, γ ib ) for c i ' µ µ µ µ, and (-∞, -γ ia ) forc i ' µ µ µ µ.
To simultaneously test H 1 , H 2 , ... ,H p , we calculate t i = (δ ic i 'x) /v i 1/2 where v i is the variance (estimated if necessary) of c i 'x. We will accept H i whenever t i > the "critical value" which we define to be d (or q/2 1/2 ). The value d (or q) is chosen such that the probability of accepting one or more of the H i is less than α (type I error) when in fact c i ' µ µ µ µ = 0 for i = 1, ..., p. Use of d is consistent with most of the literature of multiple comparisons. Use of q is consistent with Tukey's multiple comparison procedure.
Our problem is obtaining the value of d (or q), which is a function of ν, k, α, and B. With no loss in generality, we may assume µ µ µ µ = 0. Then in the k dimensional space x 1 , x 2 , ... ,x k , we need to choose d such that the probability content of the region (call it A d ) defined by c i 'x / v 1/2 < d , for c ∈ B, is 1 -α.
Two Fortran 90 programs and two SAS-IML programs have been written to implement the procedures. The Fortran 90 program QBATCH4.FOR requires that two files exist prior to making a run. They are QCARB.IN, which contains the input for the program and QCARB.OUT, which is output for the program. QINTER4 is an interactive version of QBATCH4.
The Fortran 90 programs QBATCH4.FOR and QINTER4.FOR, and the SAS-IML programs QBATCH4.SAS and QINTER4.SAS calculate the value of q necessary for hypotheses testing or simultaneous confidence intervals and multiple comparisons, given B, α, k and ν. If σ 2 is known, ν is set to -1. For 15 different multiple comparison procedures, e.g. those of Tukey, Dunnett, Hsu's MCB, etc., the programs generate the set B. The programs also accept user generated sets B, and can also be used for "step-down" testing, Dunnett and Tamhane (1991). QBATCH4.FOR and QBATCH4.SAS are batch programs while QINTER4.FOR and QINTER4.SAS are interactive programs.
Two recent extensions of QBATCH4.FOR and QINTER4.FOR are DBATCH2.FOR and DINTER2.FOR. They extend the capabilities to calculating critical values for "step-up" testing. The hypotheses (or confidence intervals) can be one-sided, two-sided or a mixture. They also directly calculate d rather than q as the critical constant.

Methodology
The following is an overview of the methodology involved in calculating the value for d (or q). In what follows we confine ourselves to discussing the calculation of d.
First, a coordinate transformation (Cholesky) is made so that in the new coordinate system, we have k uncorrelated normal (or spherically symmetric t) random variables with unit variances. The transformed region is now bounded by p hyperplanes, each of which is at a distance d from the origin.
Consider a randomly selected point in the k dimensional space. The square of the distance to the point is the sum of the squares of the individual coordinates. Thus if σ 2 is known, it is distributed as χ 2 with k degrees of freedom. If σ 2 is not known, the distance squared, divided by k, has the F distribution with k degrees of freedom for the numerator and the degrees of freedom for the denominator the same as those for the estimate of σ 2 . An unbiased estimate of the probability content of A d is the probability of obtaining a distance less than or equal to the distance to the boundary.
The problem is to find the value of d such that the probability content of the region A d is 1 -α. Divide the region A d into two subregions, the region inside the hypersphere of radius d and centered at the origin, and the portion outside the hypersphere (say A). The probability content of the hypersphere is Pr[F < d 2 /k] if σ 2 is known, and Pr[χ 2 < d 2 ] if σ 2 is not known.
It is convenient to use reciprocal distances instead of distances. Let H(y) be the cumulative distribution function for y, the reciprocal distance from the origin to the boundary in a randomly selected direction and let f(y) be the probability density function for y Then the probability content of the region A is given by where the integration is from 0 to 1/d. Quadrature is used to estimate the value of the integral. The empirical cumulative step function estimate of H(y) is obtained by Monte Carlo with the location of the steps chosen so as to optimize the quadrature (Gauss-Legendre). "Brent's method" (Press et al., 1986) is used to iteratively solve the integral equation P(d) = 1 -α for d. (The SAS programs use "Ridders' method"). An important feature of the methodology is that the empirical cumulative step function estimate of H(y) need only be estimated one time. This is because the structure of A d is independent of the value of d, and the Gauss-Legendre quadrature scales the range of integration to (-1, 1).
A more detailed description of the methodology and a discussion of its efficiency can be found in Somerville (1997Somerville ( ,1999.
Sections 2 -6 refer specifically to the Fortran 90 programs. Although the Fortran 90 and the SAS programs are extremely similar, there are some differences. Section 7 details the differences and gives examples for the SAS-IML programs.

Usage of the FORTRAN 90 programs
DBATCH2.FOR and DINTER2.FOR calculate the proper d values for the fifteen different multiple comparison procedures. For these procedures, the programs generate B, the appropriate set of contrasts. The programs will also calculate the proper values of d for an arbitrary set B, which of course must be part of the input.
The d values can be calculated both for the case when the estimates x 1 , x 2 , ... , x k are uncorrelated with equal variances, and when the estimates are correlated or have different variances. In the latter case either the variance covariance or the correlation matrix needs to be input.
For the Dunnett procedures, if the estimates are not uncorrelated with equal variances, it is necessary to input the population number of the control.
For Hsu's multiple comparison with the best (MCB), when the estimates are not uncorrelated with equal variances, the program calculates the q values associated with each population.
The user may choose to use "mocar" random directions to obtain "irep" preliminary estimates of d. The "irep" estimates are then combined and used to obtain standard error of the estimate. (If the user sets "irep" equal to 1, the program uses an empirical formula to obtain the standard error of the estimate.) An alternate procedure is to specify a desired standard error of the estimate ("givense"). The program then uses an empirical formula to find the number of random directions necessary to achieve the specified accuracy. To use the alternate procedure, in addition to specifying a value for "givense", "irep" must be a negative integer.
The variance σ 2 , may be known or estimated. The degrees of freedom ("ndenom") of the estimate s 2 , must be included in the input. If σ 2 is known, "ndenom" is set to -1.
Values of 1000 and 10 for mocar and irep respectively (a total of 10,000 random directions) should result in accuracies adequate for most applications.

Description of the input file QCARB.IN
The file QCARB.IN must contain the following: i) two lines common to all procedures * ii) k lines for the lower triangular portion of the variance covariance or correlation matrix. ** iii) a line containing one or more constants *** iv) mm lines each containing a contrast from the set Β Β Β Β i) The first two lines are as follows: swcov swtype conf givense k seed ndenom mocar irep mm swcov This is a switch which tells the programs whether the estimates are uncorrelated with equal variances (swcov = 1) or whether the variance covariance or correlation matrix will be included in the input (swcov = 0) swtype This is a switch which tells which multiple comparison procedure q values are required 0 Arbitrary set of contrasts (to be included in the input) 1 Tukey (1953)  2 Bofinger (1985)  3 Hsu's MCB (1981) 4 Dunnett (one-sided) (1955) 5 Dunnett (two-sided) (1955) 6 Hayter (  Successive ordered treatments (1-sided), Liu et al (1999) 8 Successive ordered treatments (2-sided), Liu et al (1999) 9 Umbrella contrasts, Mack and Wolfe (1981) 10 Generalized umbrella contrasts Bretz and Hothorn -(submitted) 11 Williams type contrasts, Bretz (1999) 12 Marcus type contrasts, Bretz (1999) 13 Hirotsu contrasts, Hirotsu (1979Hirotsu ( ,1997 McDermott et al (1993)  15 Isotonic contrasts Bretz (1999) 20 Somerville et al (2001)  21 Step-up constants (one-sided) 22 Step-up constants (two-sided) (swtypes 20, 21 and 22 are not yet available for the SAS versions) conf The value of 1 -α, or overall confidence.
givense The user requested se of estimate (used only for alternate procedure described in section 2, in which case irep is set to any negative integer). Otherwise input any integer (say 999). k Dimension of x and µ µ µ µ.
seed Seed for the random number generator.
mocar Number of random directions for each peliminary estimate Not used for the "alternate" procedure, in which case input any integer, e.g. 999.
irep Number of preliminary estimates. For the "alternate" procedure which specifies the required standard error of the estimate by "givense", replace irep with any negative integer, e.g. -999. ii) If the estimates x 1 , x 2 , ... , x k are correlated or their variances are unequal, then the next input is the lower triangular portion of either the variance covariance or correlation matrix. k lines are used, one line for each row of the matrix. This is true for each of the procedures.
iii) If swtype = 4 or 5, this line contains the number of the control population. If swtype = 9, line contains the number of the peak population. If swtype = 10, line contains the interval containing the number of the "peak" population. If swtype is between 11 and 15, line contains n 1 , n 2 , ... ,n k , the k sample sizes. If swtype = 20, line contains ifix, qlower.
iv) If swtype = 0, 20, 21, or 22, the mm contrasts of the set B are next input, with one line being used for each contrast. A line consists of the coefficients of x 1 , x 2 , ... , x k , in that order. For 20, 21 or 22, a "d" value is needed at the end of each line.
Several calculations for d may be made in the same run, with the instructions for each run being in sequence in QCARB.IN. A line with 4 or more negative ones, for example "-1 -1 -1 -1 -1" is used to denote the end of the run.
A special note is in order regarding swtype 20. It is used for calculation of the special constants needed when combining one-sided and two-sided confidence interval procedures for successive ordered treatment effects, Somerville et al (2000). For swtype 20, the region is bounded by the hyperplanes .. , p. In the application the value of dlower is the critical value calculated using swtype 7.
For swtype 20, 21 and 22 the "structure" of the region A is a function of d. For this reason the computation is more complicated and running times are considerably greater. Table 1       -------------------v) Suppose we wish two-sided 99% confidence intervals for µ 1 -µ 2 , and one-sided confidence intervals for each of µ 1 -µ 4 , µ 2 -µ 4 and µ 3 -µ 4 . We use the variance covariance example of the previous example and with 1000 random directions repeated 10 times. σ 2 is estimated with 25 degrees of freedom. We may use the following input in QCARB.

Calculation of constants for step-down testing
Let t i be the test statistic for H i (if H i is two-sided use |t i |). Relabel the hypotheses so that t 1 denotes the least significant test statistic, and t p denotes the most significant. The t's are compared with a set of critical constants d 1 < d 2 < ... < d p . The procedure stops the first time t m < d m is observed, with the hypotheses for which i > m accepted, and the rest rejected. Dunnett and Tamhane (1991) showed that for step-down testing, the critical constants d m are those for which the equations Pr[t 1 < d m , ... , t m < d m ] = 1 -α, for m = 1, ... , p. Denote by B m , the set of contrasts corresponding to H 1 , H 2 , ... , H m . The value d m can be calculated given the values for k, ν, α, and the set B m .

SAS/IML Programs
This section provides a short description of the SAS/IML programs and provides examples of their use. The programs QBATCH4.SAS and QINTER4.SAS, in general follow closely the coding for QBATCH4.FOR and QINTER4.FOR. The major differences are in some SAS specific modifications. Differences in the q-values between the Fortran 90 and the SAS-IML programs are due to the use of a different routine for the generation of random normal variates. Accuracies for the Fortran 90 and SAS-IML programs are equivalent.
System requirements: The SAS programs have been written and tested using Release 6.12. However, users of older releases of Version 6, should be able to execute the programs. The following modules are required for processing of the SAS programs: SAS/BASE, SAS/STAT, SAS/IML. Description: QBATCH4.SAS is a batch program written using SAS/IML. No additional input or output files are required. The user need only insert the correct input statements at the beginning of the program before submission.
The SAS programs use the same variable names as the Fortran programs. Section 3 gives a complete description.

Use of the Program:
Instead of requiring a separate input file, as in QBATCH4.FOR, the user need only insert the "situation parameters" immediately following the "proc iml" statement. An easy way to do this is to insert statements corresponding to those in QCARB .IN (see 3). For  ; /* contr statement is optional! */ The variables givense and mm are not needed for this example and instead of assigning them arbitrary values, could be omitted or treated as missing values, i.e. givense = . ; mm = . ; It is permissible to input the complete vc matrix! It is important to note that that for all variables included in the input for QBATCH4.FOR, i.e. in QCARB.IN, and used, a statement for that variable must be included with the inserted statements.
Submission of QBATCH4.SAS for above example the results in the following output: The q-value is 3.0475009 . The seed is 1893 5 populations, 0.9 confidence.
variance is assumed known 10000 random directions used. swtype= 5 Standard error of estimate is 0.0041347 The output resembles the Fortran 90 output. A difference is the addition of the contrast matrix. Differences in the q-value and the standard error of estimate occur because a different random number generator is used. The SAS code uses some facilities provided by SAS, and not available in Fortran 90, namely: -PROBNORM for generating standard normal variates, -ROOT for the Cholesky decomposition, -PROBF for computing univariate F-probabilities, and -RIDDERS for obtaining the root of a real continuous function.
QINTER4.SAS is the interactive program. The interactive data entry is done by use of SAS/Macro-%window commands. Vectors or matrices are input one element at a time. Note that for the variancecovariance matrix only the upper (or lower) triangular portion is required. (If the lower triangular portion is used, the order is row by row). Press 'ENTER' after typing each element. No error control has been included. Pressing 'CTRL' and 'PAUSE' simultaneously interrupts the process and the user may start the program again. The order of events of the queries is given below: /* Order of events, where info_i(var) describe the %window of number i asking for the input variable var */ info_9(mocar) if ((swcov = 0) and ((swtype = 4) or (swtype = 5))) then info_12(icontrol) if (swtype = 0) then info_13(mm) info_15(seed) info_10 swcov if (swcov = 0) then info_11(vc) info_14(contr)

Examples:
Four additional examples are provided. On the left are the statements (situation parameters) which must be inserted after "proc iml". On the right the output is given (omitting the contrast matrix). Example 1: uncorrelated Bofinger, df = 49, 1 -α = 0.9, 4 groups, 10 estimates with 1000 replications each

Summary and conclusions
Fortran 90 and SAS-IML programs have been written which calculate the constants necessary for several multiple comparison and multiple testing procedures under fairly general conditions. The programs can also be used to obtain the constants for arbitrary sets of contrasts, and also for step-down testing. The Fortran programs can be used to obtain step-up constants. There are batch and interactive versions of both the Fortran 90 and SAS programs. A compiled version of the Fortran 90 programs which should run on any PC with Windows 95 or later can be found at http://pegasus.cc.ucf.edu/~somervil/home.html It should be observed that the programs can be used to obtain the constants necessary for multiple comparisons in the general linear model, with no approximations being necessary.