Survey design effect - R programming
Manuela Alcañiz, Montserrat Guillén & Zaida Vicente
Information on a typical survey can be found in:
- Alcañiz, M., Mompart, A., Guillén, M., Medina, A., Aragay, J.M., Brugulat, P. and Tresserras, R. (2014) "New design of the Health Survey of Catalonia (Spain, 2010-2014): A step forward in health planning and evaluation [Nuevo diseño de la Encuesta de Salud de Cataluña (2010-2014): un paso adelante en planificación y evaluación sanitaria]" Gaceta Sanitaria, 28, 338-340.
DATA DESCRIPTION
Name | Content description |
ilustra.xls | RS_7 = id for stratum level 1; SEXEDAT = id for stratum level 2; PES_1 = sample weight; variable = a 0/1 indicator; municrec = id for cluster. |
ilustra.RData | Same data set in workspace R format. |
ilustra.sas7bdat | Same data set in SAS format. |
ilustra.dta | Same data set in STATA format. |
SIMPLE RANDOM SAMPLE vs MORE COMPLEX DESIGNS
In this example we use a sample data set to show how to calculate standard errors for a proportion under the simple random sampling approach.
Additionally, we show how to account for sample weights, stratification, clustering, and all these features together.
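As a rough illustration of how these design features might be declared in R, the sketch below uses the survey package (assumed to be installed) on a simulated data frame. The data frame toy, the helper columns unit, cl and stratum, and all values are made up for illustration; only the column names RS_7, SEXEDAT, PES_1, variable and municrec come from the data description. Crossing the two stratum levels into one identifier and treating clusters as nested in strata are assumptions of this sketch, not necessarily the design of the actual survey.

```r
library(survey)  # assumed to be installed: install.packages("survey")

# Simulated stand-in for the ilustra data: same column names, made-up values.
set.seed(2014)
toy <- expand.grid(unit = 1:10, cl = 1:5, SEXEDAT = 1:2, RS_7 = 1:3)
toy$PES_1    <- runif(nrow(toy), min = 0.5, max = 2)      # sample weights
toy$variable <- rbinom(nrow(toy), size = 1, prob = 0.3)   # 0/1 indicator
# Assumption: the two stratum levels are crossed into one stratum id.
toy$stratum  <- interaction(toy$RS_7, toy$SEXEDAT, drop = TRUE)
# Assumption: clusters are nested in strata, so give each a unique id.
toy$municrec <- as.integer(interaction(toy$cl, toy$stratum, drop = TRUE))

# Weights only (no strata, no clusters).
des_w <- svydesign(ids = ~1, weights = ~PES_1, data = toy)

# Weights, stratification and clustering together.
des_all <- svydesign(ids = ~municrec, strata = ~stratum,
                     weights = ~PES_1, data = toy)

# Estimated proportion, standard error and design effect.
# deff = "replace" compares against simple random sampling with replacement.
est_w   <- svymean(~variable, des_w)
est_all <- svymean(~variable, des_all, deff = "replace")
print(est_w)
print(est_all)
```

Both designs give the same weighted point estimate; only the standard error changes with the declared design. With the real data, toy would be replaced by the loaded data set, and the strata and cluster declarations should be checked against the survey documentation.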
Under simple random sampling, the 95% confidence interval for the proportion p of elements that have some condition identified by a binary variable, like the one in our example, is: \begin{equation} \hat{p} \pm 1.96 \cdot \sqrt{\left( \displaystyle\frac{N-n}{N}\right) \cdot \left( \displaystyle\frac{\hat{p}(1 - \hat{p})}{n-1}\right)} \end{equation}
where p̂ is the estimated proportion, n is the sample size, and N is the population size.
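The interval above can be computed in a few lines of base R. The following is a minimal sketch: the function name srs_prop_ci, the toy vector y and the population size N = 10000 are hypothetical and chosen only for illustration.

```r
# Confidence interval for a proportion under simple random sampling,
# including the finite population correction (N - n) / N.
srs_prop_ci <- function(y, N, z = 1.96) {
  y <- y[!is.na(y)]                      # drop missing responses
  n <- length(y)                         # sample size
  stopifnot(n > 1, N >= n, all(y %in% c(0, 1)))
  p_hat <- mean(y)                       # estimated proportion
  se <- sqrt(((N - n) / N) * (p_hat * (1 - p_hat) / (n - 1)))
  c(estimate = p_hat, se = se,
    lower = p_hat - z * se, upper = p_hat + z * se)
}

# Toy data: 200 sampled units, 60 of them with the condition.
y <- rep(c(1, 0), times = c(60, 140))
res <- srs_prop_ci(y, N = 10000)         # N is a made-up population size
print(round(res, 4))
```

When N is much larger than n, the correction factor (N - n) / N is close to 1 and has little effect on the interval.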
REFERENCES
[1] Lohr, S. (2010) Sampling: Design and Analysis. Brooks/Cole.
[2] Lohr, S. (2010) Solutions Manual for Sampling: Design and Analysis. Brooks/Cole.
[3] Lohr, S. (2010) Computer Programs for Sampling: Design and Analysis. Brooks/Cole.
[4] Tillé, Y. (2006) Sampling Algorithms. Springer.
[5] Survey Analysis in R.