RRreg : An R Package for Correlation and Regression Analyses of Randomized Response Data

The randomized response (RR) technique was developed to improve the validity of measures assessing attitudes, behaviors, and attributes threatened by social desirability bias. The RR removes any direct link between individual responses and the sensitive attribute to maximize the anonymity of respondents and, in turn, to elicit more honest responding. Since multivariate analyses are no longer feasible using standard methods, we present the R package RRreg that allows for multivariate analyses of RR data in a user-friendly way. We show how to compute bivariate correlations, how to predict an RR variable in an adapted logistic regression framework (with or without random eﬀects), and how to use RR predictors in a modiﬁed linear regression. In addition, the package allows for power analysis and robustness simulations. To facilitate the application of these methods, we illustrate the beneﬁts of multivariate methods for RR variables using an empirical example.


Introduction
A common goal in the social and behavioral sciences is to investigate behaviors, attitudes, and attributes that are associated with certain individual, legal, or social norms.It is well known, however, that responses to direct questions merely reflect what participants tell investigators rather than their true status on the attribute of interest.As a consequence, prevalence estimates of sensitive, incriminating, or illegal attributes are systematically biased towards respondents' perception of what is socially acceptable (e.g., Tourangeau and Yan 2007), in turn obscuring associations between sensitive attributes and other variables.
The randomized response (RR) technique (Warner 1965) was developed as a means to reduce this response bias by providing a higher degree of anonymity and confidentiality than traditional direct questioning (DQ) formats -based on the idea that respondents should be more willing to respond truthfully in a truly anonymous situation.RR is best understood as a family of techniques that share the common rationale to add random noise to the individual answers of participants such that an individual's true status is not directly identifiable from her observed response.For example, in the historically first RR model proposed by Warner (1965), respondents either receive the sensitive statement (e.g., "I have used cocaine") or its negation ("I have never used cocaine"), depending on the outcome of a randomization process with known probability distribution (such as the roll of fair dice).The randomization outcome is only known to the respondent, so an individual's yes or no response does not disclose whether she actually has engaged in the sensitive behavior.Although the randomization thereby ensures the anonymity of individual respondents, elementary probability calculations allow for group based prevalence estimates of the attribute under consideration.
Importantly, typical research questions in the social and behavioral sciences go beyond determining the prevalence of a sensitive attribute, but rather refer to associations between several variables.In other words, the prime interest lies in correlations and predictions involving one or more covariates.Applying traditional correlation or regression analyses to data obtained via RR is inappropriate, however, as the randomization component of the RR distorts the response variable.A simple approach to circumvent this problem is to perform a median-split on a continuous covariate and to compare the estimates across the resulting groups (e.g., Latkin and Vlahov 1998;Ostapczuk and Musch 2011;Rosenfeld et al. 1991;Soeken and Macready 1986;Hilbig, Moshagen, and Zettler 2015).However, median-splits suffer from several shortcomings, including loss of information, decreased power, and difficulties when considering more than one variable at a time (Royston, Altman, and Sauerbrei 2006).Thus, multivariate analyses of RR data using appropriately adapted correlation and regression procedures are clearly preferable.
Fortunately, correlation and regression models appropriate for the analysis of RR variables have become available.However, for several years only two Stata (StataCorp 2015) routines provided basic implementations for the prediction of binary RR variables for the original Warner model and two other RR designs (i.e., RRLOGIT and RRREG;Jann 2008Jann , 2005)).More recently, logistic RR regression for four RR designs has become available in R (R Core Team 2018) as implemented in the package rr (Blair, Zhou, and Imai 2016), which also includes univariate power analyses.In addition, more general-purpose software for the analysis of misclassified data could potentially be adopted for the RR technique (e.g., using the SIMEX method; Lederer and Küchenhoff 2006;Carroll, Küchenhoff, Lombard, and Stefanski 1996).To further facilitate the multivariate analysis of RR data, herein we introduce the R package RRreg (Heck and Moshagen 2018).Package RRreg improves upon existing implementations by providing a unifying modeling and computational framework and thereby facilitating the application and comparison of a wide range of RR designs for several types of multivariate analysis.In particular, going beyond previous packages, package RRreg implements both logistic and linear RR regression models for a large class of RR designs (including those addressing instruction non-adherence as well as RR designs for continuous data; see Table 2), includes logistic regression with random effects (which is particularly important for large-scale surveys with clustered data), allows to compute RR correlations (including correlations between RR variables), and provides easy-to-use Monte Carlo simulation methods to estimate statistical power and to investigate robustness in both univariate and multivariate settings (allowing, for example, to select the most efficient RR design, randomization probabilities, and statistical approach before administering a survey).
In the following, we first describe several variants of the RR technique and their specification in package RRreg.Next, we introduce the implemented multivariate methods for RR variables.Finally, we illustrate the benefits and the implementation of these approaches in package RRreg via an empirical example.The package is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=RRreg.

Randomized response models
As indicated above, RR models are a family of different techniques that share the common idea of scrambling an individual's response by means of some sort of random process.Here, we restrict ourselves to compulsory RR models (as opposed to optional models, e.g., Chaud-huri and Mukerjee 1985) that use a single randomization outcome (as opposed to multiple randomization outcomes, e.g., Ambainis, Jakobsson, and Lipmaa 2004;Mangat and Singh 1990) to assess a single attribute (as opposed to multiple attributes, e.g., Christofides 2005;Tamhane 1981).First, we briefly describe five specific RR designs to illustrate the general idea of the RR technique and to show how RR models are specified in package RRreg using two arguments, the RR name model and the randomization probability p.The package vignette, obtainable by typing vignette("RRreg", package = "RRreg"), contains a more detailed list of the implemented models including depictions of the RR designs as probability trees.For a comprehensive overview over different RR variants, see Antonak and Livneh (1995), Chaudhuri andChristofides (2013), andFox (2015).

Warner's model and crosswise model
According to the original RRT version by Warner (1965), respondents either answer the sensitive question with probability p or its negation with probability 1 − p.Therefore, the probability of an affirmative response is given by (1) so that the prevalence can be estimated as π = ( λ + p − 1)/(2p − 1) based on the relative frequency of observed yes responses, λ = n yes /N .
Recently, Yu et al. (2008) proposed to refrain from classical randomization devices such as dice throwing and to use instead a second, irrelevant question with known prevalence p (e.g., "Is your birthday in November or December?",p = 2/12).Essentially, the crosswise model prompts participants to provide a single response to both questions simultaneously.Participants are to choose option (A) if the response is either yes to both questions or no to both questions; or option (B) if one response is yes while the other is no.The instructions of the crosswise model are easy to understand, do not force participants to respond to the sensitive question directly, and lead to improved prevalence estimates in practice (Jann et al. 2012;Hoffmann and Musch 2017;Hoffmann et al. 2017;Hoffmann, Diedenhofen, Verschuere, and Musch 2015; but see also Höglinger and Jann 2016 for possible limitations and biases).From a statistical point of view the model is identical to Warner's model, because the probability of responding to option (A) is identical to the probability of responding yes as given in Equation 1.In all functions of the package RRreg, the model is specified either by model = "Warner" or model = "Crosswise" and the randomization probability p = p.

Unrelated question technique
In the unrelated question technique (UQT; Horvitz et al. 1967), participants are randomly assigned (with probability p) to either the sensitive question or to an independent, nonsensitive question (e.g., "Is your birthday in the first half of the year?") with (known or unknown) prevalence π UQ .Under this model, the probability of observing an affirmative response is thus In package RRreg, the UQT with known prevalence is specified using the arguments model = "UQTknown" and the randomization probabilities p = c(p, piUQ estimate both prevalence rates, π and π UQ , simultaneously.This version of the UQT requires the additional argument group, which determines the group membership of each respondent (i.e., a vector of length N with values of 1 and 2; see Section 4).The basic structure of the UQT allows for a generalization to continuous sensitive variables (such as income or number of abortions; Greenberg et al. 1971).

Forced response
Unlike the variants described above, the forced response (FR) model (Boruch 1972;Greenberg, Abul-Ela, Simmons, and Horvitz 1969;Peeters, Lensvelt-Mulders, and Lasthuizen 2010) relies on only one question, but prompts a certain proportion of participants to disregard the question entirely and to provide a prespecified response.In the symmetric dichotomous FR format, respondents are asked to respond yes with probability p y , no with probability p n , or truthfully to the sensitive question with probability 1 − p y − p n , depending on the outcome of the randomization device, so that This FR design easily generalizes to polytomous response options (e.g., never, sometimes, often) by choosing a randomization device that can take values across all of the possible response categories.In package RRreg, the responses across the M categories have to be coded using integers between 0, 1, . . ., M − 1.For analysis, the FR model is specified by model = "FR" and the unconditional randomization probabilities p = c(p0, p1, p2, ...).

Stochastic lie detector
Even though RR designs have been shown to reduce response biases, some participants still refuse to respond truthfully (e.g., Höglinger et al. 2016;Coutts and Jann 2011;Hoffmann et al. 2017).Addressing this issue, a core feature of more recently proposed RR variants is to take instruction non-adherence into account.One of these extended models is the stochastic lie detector (SLD; Moshagen et al. 2012), which is based on Mangat's RR procedure (Mangat 1994).The SLD has two free parameters, the proportion π of participants endorsing the sensitive question and the proportion of carriers of the sensitive attribute who respond honestly (t).As in Mangat's variant, anonymity is provided by assigning participants not holding the sensitive attribute to either the sensitive question with probability p or its negation with probability 1 − p. Participants who in fact hold the sensitive attribute are always asked to respond to the sensitive question.Unlike Mangat's variant, the SLD assumes that only a proportion t of participants holding the sensitive attribute responds according to the RR instructions by responding yes, whereas the remaining proportion (1 − t) of this subgroup is assumed to choose the innocuous response (no).In order to identify the two free parameters π and t, respondents are randomly assigned to two groups that differ only in the applied randomization probabilities p 1 and p 2 (p 1 = p 2 ).The probability of responding yes in group i is then given by In package RRreg, the SLD is specified by model = "SLD", p = c(p1, p2), and a vector group of length N with values of 1 and 2 (see the empirical example in Section 4).

Randomized response as misclassification
More generally, RR designs for discrete data can be described as misclassification schemes that add random noise to the true responses (Van den Hout and Van der Heijden 2002).Hence, it is possible to specify each RR design by a misclassification matrix P , in which the entries P ij are defined as the conditional probabilities to respond i (e.g., yes or no) given that the true state of a respondent is j (e.g., cocaine user or non-user).By definition, the misclassification matrix P links the true probabilities of latent states π to the expected observed proportions of responses λ by λ = P π. (5) Package RRreg allows to specify an RR design by the misclassification matrix P using the arguments model = "custom" and, for instance, in case of a binary RR variable, a 2 × 2 matrix P = matrix(c(p00, p10, p01, p11), nrow = 2, ncol = 2).Similarly, an M ×M misclassification matrix can be used for polytomous responses between 0, 1, . . ., M − 1.To facilitate a comparison of the prespecified models above and the misclassification approach, the function getPW() returns the misclassification matrices of all of the implemented RR designs.For instance, the FR model with unconditional randomization probabilities p 0 = 0.1 and p 1 = 0.15 to respond no or yes, respectively, is equivalent to an RR design specified by the misclassification matrix R> getPW(model = "FR", p = c(0.1,0.15)) 0 1 0 0.85 0.1 1 0.15 0.9 Note that for all RR models that require two independent groups (e.g., the UQT with unknown prevalence), the misclassification probabilities depend on the second parameter and on the group membership, passed by the additional arguments par2 and group, respectively.If the RR procedure is independently applied to several questions, the joint misclassification matrix of these variables is given by the Kronecker product of the misclassification matrices of each individual RR design (Bourke 1982;Van den Hout and Van der Heijden 2002).Hence, univariate and multivariate models for a single discrete RR variable can easily be generalized to multiple discrete RR variables.

RRcor(): Randomized response correlations
Since the randomization component inherent in any RR model distorts the response variable in a principally predictable way (due to the requirement of a known probability distribution of the randomization process), multivariate analyses of RR data can be performed relying on misclassification and measurement error models (Boruch 1972;Bross 1954;Fuller 2006).Based on this approach, Kraemer (1980) derived an unbiased estimator of correlations using binary RR variables for the Warner and the FR model, which was later generalized by Fox and Tracy (1984) to polytomous and continuous RR variables.This approach allows for estimating the test-retest reliability of an RR measure or the correlation between two RR variables, which can even differ in scale and RR design.
The method by Fox and Tracy (1984) reformulates the probability of an observed RR response y i for a participant i as a mixture distribution (similar to the UQT).Either the true underlying state x i or a value u i from an independent masking distribution is observed with probability p t and (1 − p t ), respectively: Since the probability p t to respond truthfully and the mean and variance of the masking distribution u are known by design, it is possible to estimate the correlation of the true states x with a second (directly or indirectly measured) variable by the method of moments (see Fox and Tracy 1984, for details).In addition to the point estimate of the correlation, the function RRcor() provides an option to run a parametric or nonparametric bootstrap (Efron and Tibshirani 1997) to obtain standard errors.
Note that RR models other than the UQT need to be reparameterized to fit the structure in Equation 6.For instance, the FR model with the unconditional randomization probabilities p 1 and p 0 of being asked to respond yes or no regardless of the true state, respectively, can be reparameterized to p t = 1 − p 1 − p 0 .Moreover, the masking distribution of the binary FR is a Bernoulli random variable with mean µ u = p 1 /(p 1 + p 0 ), which is the conditional probability to answer yes given that the respondent disregarded the sensitive question and responded in a predetermined way.Other RR variants are reparameterized by package RRreg accordingly.

RRlog(), RRmixed(): Predicting RR variables in a logistic regression
Often, the interest is to predict a binary sensitive outcome measured by RR using continuous or categorical predictors.For this purpose, the standard logistic regression model with the link function logit(p) = [log p/(1 − p)] needs to be adapted to take the randomization procedure into account (Maddala 1983;Scheers and Dayton 1988;Van den Hout, Van der Heijden, and Gilchrist 2007).In both the standard and the RR logistic regression, the individual expected prevalence rates π i are modeled by where β is the vector of regression coefficients and X i the i-th row of the design matrix X.
To obtain the likelihood for the standard logistic regression model, it is assumed that the observed responses y are binomially distributed with means given by the underlying prevalence rates π as modeled in Equation 7. In binary RR designs, however, the observed responses are binomially distributed with means given by the expected probabilities λ of observing an affirmative response.More specifically, in the general case of expressing an RR design by its misclassification matrix P (Equation 5), this yields the likelihood where y i and n i are the numbers of affirmative and total responses, respectively.
In other words, the true prevalence rate is transformed by the inverse logit function as in a standard logistic regression, whereas the link function for the likelihood changes according to the specific RR design ( Van den Hout et al. 2007;Van der Heijden and Van Gils 1996;Kerkvliet 1994).RRlogit() employs the expectation-maximization (EM) algorithm by Magder and Hughes (1997) to obtain starting values for a gradient-based search by the R function optim().
Asymptotic standard errors are estimated by taking the square root of the diagonal elements of the inverted Fisher information matrix.Similarly, likelihood ratio tests allow for testing parameter restrictions.
The logistic regression belongs to the class of generalized linear models and can thus easily be extended to hierarchical or so-called mixed effects models (Pinheiro and Bates 2000).Such extended models are appropriate if participants are clustered in several groups that differ in their overall intercept, slope, or both.Based on an adjusted link function for RR designs ( Van den Hout et al. 2007), the function RRmixed() uses the estimation routines of package lme4 (Bates et al. 2015) and thus allows for fitting a wide range of mixed effects models (e.g., level-1 and level-2 predictors, crossed random effects; see Bates et al. 2015).Currently (i.e., using RRreg 0.6.7),however, the application of this method is restricted to one-group RR designs that use the same randomization scheme across all responses and participants.

RRlin(): RR predictors in a linear regression
Instead of using RR variables as a criterion in a logistic regression, it is also possible to include RR variables as predictors in a linear regression.This model was proposed by Van den Hout and Kooiman (2006) based on the interpretation of RR designs as misclassification mechanisms on discrete predictors (Van den Hout and Van der Heijden 2002, cf.Section 2.5).
It suffices to state the likelihood of the model for a single RR predictor because several independent RR variables can be modeled using the Kronecker product of their misclassification matrices (Van den Hout and Kooiman 2006).In the following, the first entry of the vector of regression coefficients β refers to the RR variable, whereas the remaining entries refer to q directly observable predictors (e.g., variables measured by direct questions).
The likelihood of this linear RR regression model consists of two components.First, the likelihood of the latent distribution π of the true states is given by the mixture distribution of the observed responses and depends on the misclassification matrix P and the n * j observed frequencies of the j = 1, . . ., J possible RR responses: where C is a constant.
Second, it is assumed that the dependent variable y is conditionally normally distributed with standard deviation σ.For each observation y i , the expected value is determined by a linear combination of the true, latent states on the RR predictors and the observed values for directly observed variables (DQ).Because the true states of the participants on the RR variable are unknown, the marginal likelihood of the regression coefficients β needs to be computed to remove the uncertainty induced by misclassification, (10) where the design matrix X (j) has the j-th possible RR response in the first column and the directly observable predictor values in the remaining columns; P (i)j gives the probability to observe the RR response of the i-th person if she were in the j-th true state (see Van den Hout and Kooiman 2006, for details).Eventually, the joint loglikelihood of the whole model is obtained by the sum of Equations 9 and 10.The function RRlin() maximizes this likelihood and returns asymptotic standard errors based on the observed Fisher information matrix.
As mentioned in Section 2.5, multiple RR predictors are included by using the Kronecker product of their individual misclassification matrices, thereby taking multicollinearity into account.Note, however, that the randomization devices must be independent of each other and also independent of all sensitive attributes.Such requirements have to be carefully considered especially in RR designs that use irrelevant questions to protect anonymity (Moshagen et al. 2011).Unlike logistic RR regression, the method is not restricted to dichotomous RR variables, but can also accommodate polytomous RR variables.However, because a finitedimensional misclassification matrix P has to be specified, continuous RR variables cannot be used in this framework.

Empirical example: Attitudes against minarets in Germany
To illustrate the outlined methods1 , we use data from a study involving 1,621 participants (985 female participants; mean age = 23.7 years) who were asked whether they endorse a ban against the construction of minarets in Germany. 2 The study was conducted in 2010, shortly after a national referendum in favor of a respective law yielded a majority in Switzerland.The referendum result came as a surprise to many, because only a month before the referendum, pre-election polls (based on direct questioning) indicated a clear minority in favor of such law (Fetzer and Soper 2012).The unexpected referendum result led to a heated public debate about the prevalence of attitudes against minarets in Switzerland and German-speaking countries in general.
To test whether the RR technique provides higher -and in this case presumably more valid (Hoffmann and Musch 2017) -prevalence estimates, participants were randomly assigned with a ratio of 4:1 to either the RR (n = 1, 256) or the DQ format (n = 365).In the DQ format, the sensitive question read: "A minaret is a tower next to a mosque from which a muezzin calls Muslims to prayer five times a day.Are you in favor of prohibiting the construction of minarets in Germany?"The SLD (Moshagen et al. 2012) was chosen as RR variant.As explained above, respondents actually endorsing a minaret ban were asked to respond truthfully to the sensitive statement ("I am in favor of prohibiting the construction of minarets.").In contrast, the remaining participants were prompted to respond to the sensitive statement with probability p 1 = 2/12 and p 2 = 10/12 within two subgroups, respectively, but to the negation of the sensitive statement ("I am against prohibiting the construction of minarets.")with counter-probabilities 1 − p 1 and 1 − p 2 , respectively.Respondents' month of birth (unknown to the investigators) served as a randomization device.In addition, an 11-point self-placement scale ranging from −5 (left) to 5 (right) was used to assess political orientation.

Prevalence estimate
First, we consider the univariate prevalence estimates of attitudes against minarets obtained by means of RR and DQ, respectively.The code to load the data into R, to select only the subsample of participants in the RR condition, and to obtain the RR prevalence estimate is R> library("RRreg") R> data("minarets", package = "RRreg") R> head(minarets) The data frame minarets includes the binary response variable rrt (0 = no, 1 = yes) and the grouping variable condition (values 1 or 2).The latter determines the applied randomization probability.Moreover, the RR design is specified by the argument model = "SLD" and the corresponding randomization probabilities in the vector p.
As expected, the RR format yielded a higher prevalence estimate of attitudes against minarets (π RR = 0.87, SE = 0.04) than the DQ format (π DQ = 0.43, SE = 0.03). 3The difference was significant in a likelihood ratio test restricting the prevalence rate to be equal in both formats (χ 2 (1) = 84.1,p < 0.01). 4Moreover, even though the RR procedure secured the full anonymity of respondents, the probability of truthful responding among respondents endorsing a minaret ban was only t = 0.64 (SE = 0.02).

Correlations: Covariates of attitudes against minarets
The code to estimate the correlation of political left-right orientation and age with the sensitive attribute is: From the data frame minarets_RR, we first select three columns, that is, the response to the sensitive question, age, and political left-right orientation.RR models are specified by the vector models in order of their appearance ("d" indicates directly measured variables).
The randomization probabilities for all included RR designs have to be defined in p.list in the same order.The remaining two arguments bs.type and bs.n determine that 1,000 nonparametric bootstrap samples should be drawn to estimate standard errors and p values, respectively.The results show that politically right-oriented participants were more likely to endorse a ban of minarets (ρ = 0.59, SE = 0.14, p < 0.01), whereas no significant relationship to age was observed (ρ = −0.05,SE = 0.12, p = 0.61).

Logistic regression: Predicting attitudes against minarets
In the present example, two substantive research questions can be investigated using logistic regression analyses.First, do the covariates age and left-right orientation independently or interactively predict attitudes against minarets?Second, is the predictive power of any of these variables stronger in the RR compared to the DQ format?right, where age.c* leftRight.cestimates both main effects and the interaction term.The RR design is specified in the same way as in the case of bivariate correlations.
The results of this analysis are congruent with the correlations obtained in the preceding section.When shifting one point to the right on the political orientation scale, the odds that participants endorsed a ban against the construction of minarets increased by a factor of 1.87 ( β = 0.62, SE = 0.16, χ 2 (1) = 16.0,p < 0.01).In contrast, neither age (odds ratio = 1.01, β = 0.01, SE = 0.06, χ 2 (1) = 0.1, p = 0.80) nor the interaction between age and left-right orientation (odds ratio = 0.98, β = −0.02,SE = 0.01, χ 2 (1) = 2.0, p = 0.16) significantly contributed to the prediction of endorsement.Figure 1 shows the model predictions of the reduced model containing only left-right orientation as a predictor.As a graphical test, we computed prevalence estimates separately for each subgroup of left-right orientation with n > 20.Overall, the logistic regression model was in line with these separate estimates and resulted in smaller confidence intervals compared to the univariate analysis as expected.
To investigate the second research question concerning different effects of political left-right orientation in the DQ and the RR response format, we analyzed the complete data set including participants in both questioning conditions.For a comparison of DQ and RR, the function RRlog() requires a vector group defining the group membership of each participant; in the present example 0 = DQ, 1 = SLD(p 1 ), and 2 = SLD(p 2 ).Moreover, we included an effectcoded variable RRdesign for the response format (−1 = DQ, 1 = RR), the mean-centered political left-right orientation, and the interaction of both terms:  Note that the parameter t is tested against the null hypothesis that all participants answered truthfully (i.e., H0: t=1).
The results show that the odds of endorsing a ban of new minarets increased by a factor of 8.26 in the RR compared to the DQ condition ( β = 1.06,SE = 0.18, χ 2 (1) = 75.6,p < 0.01).Across both response formats, a shift of one point to the right on the political orientation scale increased the odds of endorsing the sensitive question by a factor of 1.61 ( β = 0.48, SE = 0.08, χ 2 (1) = 29.9,p < 0.01).Moreover, the regression coefficient of the interaction term descriptively supported the contention that left-right orientation is a better predictor in the RR than in the DQ format, but was not statistically significant (odds ratio = 1.10, β = 0.10, SE = 0.08, χ 2 (1) = 1.64, p = 0.20).It is evident from Figure 1 that the estimated prevalence of attitudes against minarets was generally higher in the RR than in the DQ condition and increased for politically right-oriented respondents.The (nonsignificant) interaction is visible as the steeper slope in the RR condition.

Linear regression: Predicting political left-right orientation
Endorsement of the sensitive question concerning a ban against minarets is not a very plausible The model is again specified by a formula including the dependent and independent variables on the left and right side, respectively.RR predictors have to be included first and are specified by the argument models in order of their appearance.If more predictors are included than RR models listed, the remaining variables are treated as being directly measured without error (DQ variables).

Power-analysis simulations
Package RRreg provides functions to assess two major practical issues that have to be considered in any RR application: statistical power and robustness.The RR technique necessarily reduces statistical power, because it adds random noise to the responses by design (Ulrich, Schröter, Striegel, and Simon 2012).Therefore, compared to standard correlation and regression analyses, larger sample sizes are required to obtain the same level of precision.The statistical power of multivariate RR models depends on several factors such as the prevalence π of the sensitive attribute, the type of RR design, the corresponding randomization probabilities, and, of course, on the magnitude of effect.
To estimate the statistical power by means of simulation, RRsimu() generates bivariate data assuming either a binary RR variable and a normally distributed covariate or two binary RR variables (relevant only for RRcor()).For instance, the code to estimate the power of multivariate methods for the SLD in case of our empirical example is: R> sim.power <-RRsimu(numRep = 1000, n = 1256, pi = 0.87, cor = 0.3, + model = "SLD", p = c(2/12, 10/12), groupRatio = 0.45, + complyRates = c(0.64,1), sysBias = c(0, 0), nCPU = 1) Most of the arguments of this function are self-explaining, that is, the number of bootstrap samples (numRep), the sample size (n), the prevalence of the sensitive attribute (pi), the correlation between the sensitive question and the normally distributed covariate (cor = 0.3), the RR model (model), the randomization probability (p), the ratio of participants in the two SLD conditions (groupRatio), and the number of parallel cores to use (nCPU).
Moreover, the vector complyRates defines the probabilities of adherence in both of the latent subgroups, that is, separately for participants holding the sensitive attribute and for those who do not.If any of these two compliance rates is smaller than one, the assumptions of standard RR designs (such as the Warner model) are violated.In contrast, the SLD explicitly models noncompliance of participants holding the sensitive attribute, and we can thus set the first compliance rate equal to the estimated adherence rate of t = 0.64 without violating the SLD assumptions.The second compliance rate is set to 1 because the SLD assumes that all noncarriers comply with the instructions.To specify the type of noncompliance in detail, the vector sysBias defines the conditional probability of responding yes in case of noncompliance for the two latent subgroups.In the code above, sysBias = c(0, 0) specifies that noncomplying participants always respond no.
As output, RRsimu() shows the mean and standard deviation of the estimated parameters by RRcor(), RRlog(), and RRlin(): To facilitate a comparison of the loss of efficiency in RR designs, the output includes parameters based on analyzing the true underlying sensitive attribute directly (e.g., cor.true, beta.true).The results of this simulation show that the power to detect the medium-sized correlation ρ = 0.3 is 73.0%, 82.6%, and 89.2% for the functions RRcor(), RRlog(), and RRlin(), respectively.
Often the interest lies in planning the required sample size to obtain a desired level of statistical power prior to conducting a study (see Ulrich et al. 2012, for power analyses of univariate tests).For such purposes, the function powerplot() plots the estimated power of the multivariate RR methods for different effect sizes as a function of the sample size.In comparison to printed tables or graphs, this function thus allows for a simple power analysis adjusted to the scenario of interest.To demonstrate such an application, we compare the power of a logistic regression using either the DQ format or the symmetric FR model with randomization probabilities of p y = p n = 0.1 (with a probability of responding to the sensitive question directly of p t = 1 − p y − p n = 0.8; see Section 3.1) or p y = p n = 0.2 (p t = 0.6).The code to estimate and plot the power for the FR design assuming a true prevalence of the sensitive attribute of π = 0.3 is R> power <-powerplot(numRep = 1000, model = "FR", p = c(0.2,0.2), pi = 0.3, + n = seq(100, 1000, 100), cor = seq(0, 0.25, 0.05), method = "RRlog", + alpha = 0.05, nCPU = 1) R> plot(power) The only arguments that differ from the functions above are the vectors for the sample sizes n (a sequence from 100 to 1,000 in steps of 100) and the true correlations cor (a sequence from 0 to 0.25 in steps of 0.05).As expected, Figure 2 reveals that the power increases for larger effect and sample sizes.More importantly, the power is largest for the DQ format and decreases with the amount of noise added in each of the two FR designs.Depending on the topic of interest, an RR design with appropriate randomization probabilities can be selected in order to balance the loss of efficiency and the increase in anonymity.

Robustness simulations for RR designs
Despite the increased anonymity afforded by RR, participants might still engage in selfprotecting response behavior, for instance, due to a lack of understanding or mistrust of the protection mechanism (James, Nepusz, Naughton, and Petróczi 2013;Landsheer et al. 1999).Therefore, robustness simulations are useful to judge the possible amount of biases in parameter estimates under plausible scenarios of non-adherence.
To simulate data for more complex scenarios (for example including multiple continuous and/or discrete covariates) appropriate covariance structures can be generated by using any other R package.Given a data set that includes true, latent states, each variable can be passed separately to the function RRgen() to simulate the outcome of a specific RR procedure with known compliance rates complyRates and systematic biases sysBias.For example, the FR design is applied to a vector of true, latent states as follows: R> true <-rbinom(n = 100, size = 1, prob = 0.5) R> new.dat <-RRgen(trueState = true, model = "FR", p = c(0.1,0.15), + complyRates = c(0.9,1), sysBias = c(0.5,0.5)) R> head(new.dat) Based on the observed RR responses (column response), the simulated data can be analyzed using functions such as RRlog() or RRmixed() to assess the power and robustness of the RR technique in a specific scenario.

Discussion
The RR technique may elicit more truthful responding and can yield estimates that are less biased by social desirability (Lensvelt-Mulders et al. 2005).Nevertheless, methods to compute correlation and regressions involving RR variables are rarely used in substantive research, probably as a result of their statistical complexity and the scarcity of ready-to-use software.As a remedy, we introduced the package RRreg that implements three multivariate methods for the analysis of RR variables: Computing bivariate correlations including one or two RR variables (Fox and Tracy 1984), predicting RR variables in a logistic regression (with or without random effects; Van den Hout et al. 2007), and using RR variables as predictors in a linear regression (Van den Hout and Kooiman 2006).
We restricted ourselves to these methods, because these are broadly applicable and yield correlation and regression coefficients that closely resemble those from standard statistical models.However, future versions of package RRreg could include multivariate RR methods for more specific situations such as computing sum scores of RR variables (Cruyff 2008), constructing scales comprised of RR items (Himmelfarb 2008), using RR in cross-sectional (Frank, Van den Hout, and Van der Heijden 2009) or cross-national studies (De Jong, Pieters, and Stremersch 2012), combining RR and propensity scoring (Gingerich 2010), or generalizing single and multidimensional item response theory to RR (Böckenholt and Van der Heijden 2007;Fox 2005;Fox, Entink, and Avetisyan 2014).
Despite the potential superiority of RR over direct questioning when assessing attributes that might be distorted by socially desirable responding, it must be kept in mind that the RR is a large sample technique.This drawback immediately follows from the basic idea of the RR to scramble individual responses by adding random noise, so that the goals of obtaining both efficient and valid estimates (through increased anonymity) necessarily conflict (Ljungqvist 1993).This is even more of an issue considering the advanced, more complex models described herein.The simulation options of the RRreg package may be of help in deciding on the required sample size for a particular model.Overall, package RRreg broadens the scope of potential applications of the RR technique and provides substantive researchers with a powerful tool to investigate sensitive topics.

Figure 1 :
Figure 1: Predicted probability of endorsing a ban against minarets as a function of political left-right orientation (measured on an 11-point self-placement scale) using either RR or DQ.Both logistic regression models include 95% confidence bands.Univariate RR and DQ prevalence estimates are shown for each subgroup of political left-right orientation with n > 20 (including 95% confidence intervals).

Figure 2 :
Figure 2: Statistical power of a logistic regression using either the DQ format or the symmetric FR design with different randomization probabilities.Estimates are based on a true prevalence rate of π = 0.3, and a significance level of α = 0.05.
).If the prevalence of the nonsensitive question π UQ is unknown, two independent random samples are drawn and different randomization probabilities p 1 and p 2 are applied in order to (Bates et al. 2015), and RRlin() uses RR variables as predictors in a linear regression of a continuous criterion.
The first research question can be investigated by simply fitting the modified logistic regression model for a dependent RR variable including the two (mean-centered) predictors age and left-right orientation: Similar to the standard regression function lm() in R, the argument formula defines the model by stating the dependent variable on the left and the independent variables on the causal predictor for any of the available continuous variables.Nevertheless, for illustration purposes, we will use it to predict individual left-right orientation.Additionally, we included age as a second, continuous predictor.The model is estimated by the function RRlin() as follows: