On Bootstrap Algorithms in Survey Sampling

DOI:

https://doi.org/10.15678/KREM.2024.1004.0207

Keywords:

survey sampling, small area estimation, bootstrap, estimation and prediction accuracy

Abstract

Objective: The aim of this paper is to present bootstrap algorithms for measuring the accuracy of estimation and prediction under the design-based and model-based approaches in survey sampling and small area estimation. Three proposed estimators of the prediction mean squared error are also examined.

Research Design & Methods: Various bootstrap procedures are shown and used to estimate the design- and prediction-mean squared errors based on real data. Computations are supported by two R packages.
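To make the design-based idea concrete, the following is a minimal sketch of a bootstrap variance estimate, assuming simple random sampling and the expansion estimator of a population total. It is a naive with-replacement bootstrap, not one of the algorithms studied in the paper and not the API of either R package; the function name `bootstrap_variance_of_total`, the population size, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_variance_of_total(y, N, B=2000, seed=0):
    """Naive with-replacement bootstrap variance of the estimator N * mean(y).

    Hypothetical helper for illustration; it ignores the finite population
    correction and unequal inclusion probabilities.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    # Draw B resamples of size n from the observed sample, with replacement.
    idx = rng.integers(0, n, size=(B, n))
    # Recompute the estimator of the total on every resample.
    totals = N * y[idx].mean(axis=1)
    # The bootstrap variance is the empirical variance of the replicates.
    return totals.var(ddof=1)

# Synthetic sample (assumed data, not the real data used in the paper).
rng = np.random.default_rng(1)
sample = rng.gamma(shape=2.0, scale=50.0, size=100)
v_boot = bootstrap_variance_of_total(sample, N=1000)
print(f"bootstrap variance of the estimated total: {v_boot:.1f}")
```

Under these assumptions the result should be close to N² times the sample variance divided by n. Bootstrap methods designed for sampling with unequal probabilities replace the resampling step with schemes that respect the sampling design.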

Findings: Three prediction-mean squared error estimators are proposed.

Implications / Recommendations: For the data considered, the bootstrap algorithms used in the design-based approach give similar variance estimates for the estimator studied. When the algorithms have similar properties, their computational speed may therefore be an important criterion for practitioners. In the model-based approach, the proposed estimators of the prediction mean squared error produce higher estimates than the other estimators, which indicates a positive bias and can be interpreted as a pessimistic assessment of accuracy.

Contribution: All the presented bootstrap algorithms are easily applicable using two R packages, available on CRAN and GitHub. Three double bootstrap prediction-MSE estimators are proposed and analysed in the real-data application.
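The estimators proposed in the paper are double bootstrap estimators. As a simpler point of reference, the sketch below shows a single-level parametric bootstrap estimate of the prediction MSE. It assumes a deliberately simple model (independent normal observations with a common mean and variance) in which the sample mean predicts the population mean; the function name `parametric_bootstrap_mse` and the data are hypothetical, and the paper's models and estimators are more elaborate.

```python
import numpy as np

def parametric_bootstrap_mse(y_sample, N, B=2000, seed=0):
    """Single-level parametric bootstrap estimate of the prediction MSE.

    Assumed model: y_i = mu + e_i with e_i ~ N(0, sigma^2), i = 1..N.
    Predictor of the population mean: the sample mean.
    """
    rng = np.random.default_rng(seed)
    y_sample = np.asarray(y_sample, dtype=float)
    n = y_sample.size
    # Fit the assumed model to the observed sample.
    mu_hat = y_sample.mean()
    sigma_hat = y_sample.std(ddof=1)
    # Generate B bootstrap populations from the fitted model; the first n
    # units of each population play the role of the (fixed) sample.
    y_star = rng.normal(mu_hat, sigma_hat, size=(B, N))
    predictor = y_star[:, :n].mean(axis=1)   # predictor on the bootstrap sample
    target = y_star.mean(axis=1)             # bootstrap population mean
    # Average squared prediction error over the bootstrap replicates.
    return np.mean((predictor - target) ** 2)

# Synthetic sample (assumed data, for illustration only).
rng = np.random.default_rng(2)
y = rng.normal(loc=10.0, scale=3.0, size=50)
mse_boot = parametric_bootstrap_mse(y, N=1000)
print(f"parametric bootstrap prediction MSE: {mse_boot:.4f}")
```

For this toy model the prediction MSE has the closed form σ²(1/n − 1/N), so the bootstrap value can be checked against it with σ replaced by its sample estimate. A double bootstrap adds a second level of resampling inside each replicate.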

Published

01-07-2024 — Updated on 09-09-2024

Section

Articles

How to Cite

Żądło, T. (2024). On Bootstrap Algorithms in Survey Sampling. Krakow Review of Economics and Management / Zeszyty Naukowe Uniwersytetu Ekonomicznego w Krakowie, 2(1004), 121–138. https://doi.org/10.15678/KREM.2024.1004.0207 (Original work published 2024)