Chinese Journal of Sociology ›› 2024, Vol. 44 ›› Issue (3): 173-219.

Generating Macro-Level Data Using Latent Variable Modeling and Dynamic Bayesian Methods

ZHANG Gaoxiang, CHEN Zhe, CHEN Yunsong   

  • Published: 2024-05-29

Abstract: In contemporary quantitative sociological research, the testing of causal mechanisms and macro theories has driven researchers' need for high-quality time-series data at the district cluster level. Compared with fields such as economics, however, sociology has far less access to large-scale, long-running longitudinal data. Aggregating individual-level social survey data from multiple sources into panel data is an important way to alleviate this scarcity, but it is constrained by the uneven spatial and temporal coverage of social surveys and by the variability across surveys. In this paper, we introduce a dynamic Bayesian latent variable modeling framework designed to generate complete panel data at the cluster level. We demonstrate its implementation through a practical example and compare its performance with several common missing data imputation techniques. The results show that the dynamic Bayesian latent variable model has noticeable advantages in temporal-spatial imputation, in the integration of multi-dimensional social indices, and in the incorporation of parameter uncertainty. The method shows potential for estimating and imputing data for years and regions missing from surveys, and it points to future applications in panel data generation and dimension integration for macro-level sociological research. Its practical application still faces certain limitations, however, such as data availability, “synonym repetition”, and insufficient sensitivity to drastic changes. In view of this, the paper proposes corresponding optimization strategies to enhance the applicability and flexibility of the modeling framework, thereby expanding its scope of application in the social sciences.
The research in this paper provides practical guidance for applying the dynamic Bayesian latent variable modeling approach and offers inspiration for future related studies.
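
To make the idea of temporal imputation with a dynamic linear model concrete, the following is a minimal sketch and not the authors' model. It assumes the simplest dynamic linear model, a local-level (random walk plus noise) model for a single region, and runs a Kalman filter that skips missing survey years. The function name `local_level_filter`, the variance values, and the example series are all hypothetical; the framework described in the abstract additionally combines a Bayesian item response theory measurement model with the dynamic component, which this sketch omits.

```python
import math

def local_level_filter(y, obs_var=1.0, state_var=0.1, m0=0.0, c0=1e6):
    """Kalman filter for a local-level dynamic linear model.

    y         : yearly observations for one region; NaN marks a missing year
    obs_var   : assumed observation (survey) noise variance
    state_var : assumed year-to-year variance of the latent level
    m0, c0    : assumed prior mean and (diffuse) prior variance of the level
    Returns the filtered means and variances of the latent level per year.
    """
    means, variances = [], []
    m, c = m0, c0
    for obs in y:
        # Predict step: the latent level follows a random walk.
        a, r = m, c + state_var
        if math.isnan(obs):
            # No survey this year: keep the prediction, uncertainty grows.
            m, c = a, r
        else:
            # Update step: weigh the prediction against the observation.
            gain = r / (r + obs_var)
            m = a + gain * (obs - a)
            c = (1.0 - gain) * r
        means.append(m)
        variances.append(c)
    return means, variances

# Hypothetical regional index with two unsurveyed years.
series = [2.0, 2.1, math.nan, math.nan, 2.6, 2.7]
means, variances = local_level_filter(series)
```

In this sketch the estimate for a missing year carries the last filtered level forward while its variance widens, which is one simple way a dynamic model can report the uncertainty of imputed values. A full Bayesian treatment would also smooth backward in time and estimate the variances rather than fix them.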

Key words: data generation, dimension integration, latent variables, Bayesian item response theory model, dynamic linear model