Published by the Foundation for Open Access Statistics. Editors-in-chief: Bettina Grün, Torsten Hothorn, Rebecca Killick, Edzer Pebesma, Achim Zeileis. ISSN 1548-7660; CODEN JSSOBK
Authors: Giulio Barcaroli
Title: SamplingStrata: An R Package for the Optimization of Stratified Sampling
Abstract: When designing a sample survey, constraints are usually set on the desired precision levels for one or more target estimates (the Ys). If a sampling frame is available, containing auxiliary information related to each unit (the Xs), it is possible to adopt a stratified sample design. For any given stratification of the frame, in the multivariate case it is possible to solve the problem of the best allocation of units in strata by minimizing a cost function subject to precision constraints (or, conversely, by maximizing the precision of the estimates under a given budget). The problem is to determine the best stratification of the frame, i.e., the one that ensures the overall minimal cost of the sample necessary to satisfy the precision constraints. The Xs can be categorical or continuous; continuous ones can be transformed into categorical ones. The most detailed stratification is given by the Cartesian product of the Xs (the atomic strata). One way to determine the best stratification is to explore exhaustively the set of all possible partitions derivable from the set of atomic strata, evaluating each one by calculating the corresponding cost in terms of the sample required to satisfy the precision constraints. This is unaffordable in practical situations, where the dimension of the space of the partitions can be very high. Another way is to explore the space of partitions with an algorithm that is particularly suitable in such situations: the genetic algorithm. The R package SamplingStrata, based on a genetic algorithm, makes it possible to determine the best stratification for a population frame, i.e., the one that ensures the minimum sample cost necessary to satisfy the precision constraints, in a multivariate and multi-domain case.
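
To make the exhaustive approach described in the abstract concrete, here is a minimal Python sketch. It is not the SamplingStrata implementation and uses none of its functions. It assumes a single target variable, a single domain, a target coefficient of variation on the estimated mean, Neyman allocation with a finite population correction, and at least two sampled units per stratum. The atomic strata values and every function name are hypothetical, chosen only for illustration.

```python
from math import ceil, sqrt

# Hypothetical atomic strata: (population size N_h, mean of Y, std. dev. of Y).
atoms = [(1000, 10.0, 2.0), (1000, 11.0, 2.0),
         (1000, 50.0, 10.0), (1000, 52.0, 10.0)]

def partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        # Add `first` to each existing block in turn ...
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        # ... or let it form a new block of its own.
        yield [[first]] + p

def merge(block):
    """Collapse atomic strata into one stratum (N, mean, std. dev.).

    Assumes each std. dev. is a population value (divisor N_h).
    """
    n_tot = sum(n for n, _, _ in block)
    mean = sum(n * m for n, m, _ in block) / n_tot
    second_moment = sum(n * (s * s + m * m) for n, m, s in block) / n_tot
    return n_tot, mean, sqrt(max(second_moment - mean * mean, 0.0))

def required_sample(strata, cv, min_units=2):
    """Sample size needed to reach the target CV on the estimated mean.

    Neyman allocation with finite population correction; each stratum then
    gets at least `min_units`. Assumes strata are large enough that the
    allocation never exceeds N_h.
    """
    n_pop = sum(n for n, _, _ in strata)
    ybar = sum(n * m for n, m, _ in strata) / n_pop
    target_var = (cv * ybar) ** 2
    ws = [(n / n_pop) * s for n, _, s in strata]         # W_h * S_h
    ws2 = sum((n / n_pop) * s * s for n, _, s in strata)  # sum of W_h * S_h^2
    total = sum(ws)
    if total == 0:  # no variability: only the minimum units are needed
        return min_units * len(strata)
    n = total ** 2 / (target_var + ws2 / n_pop)
    return sum(max(min_units, ceil(n * w / total)) for w in ws)

def cost(partition, atoms, cv):
    """Sample size required by the stratification that `partition` defines."""
    strata = [merge([atoms[i] for i in block]) for block in partition]
    return required_sample(strata, cv)

all_partitions = list(partitions(list(range(len(atoms)))))
best = min(all_partitions, key=lambda p: cost(p, atoms, 0.05))
print(len(all_partitions), best, cost(best, atoms, 0.05))
```

With four atomic strata there are only 15 partitions to evaluate, but the count follows the Bell numbers and already reaches 115975 for ten atomic strata, which is why exhaustive search stops being practical and a heuristic search such as a genetic algorithm becomes attractive.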

Page views: 5845. Submitted: 2012-12-05. Published: 2014-11-03.
Paper: SamplingStrata: An R Package for the Optimization of Stratified Sampling     Download PDF (Downloads: 10226)
SamplingStrata_1.0-3.tar.gz: R source package Download (Downloads: 243; 642KB)
v61i04.R: R example code from the paper Download (Downloads: 307; 15KB)

DOI: 10.18637/jss.v061.i04

This work is licensed under the following licenses:
Paper: Creative Commons Attribution 3.0 Unported License
Code: GNU General Public License (at least one of version 2 or version 3) or a GPL-compatible license.