Uncertainty of Statistical Models and Propensity Score Methods

Chinese Journal of Sociology ›› 2017, Vol. 37 ›› Issue (1): 186-210.

HU Anning

Department of Sociology, Fudan University

Online: 2017-01-20 Published: 2017-01-20
Corresponding author: HU Anning, E-mail: huanning@fudan.edu.cn
Supported by:
This study was supported by the National Social Science Foundation Youth Project (15CSH030), the Scientific Research Innovation Program of the Shanghai Municipal Education Commission (15ZS001), and the Outstanding Scholar (Zhuoxue) Grant from Fudan University.

Abstract

Quantitative sociological research typically rests on specific statistical models. Propensity score methods, which have become increasingly popular over the past decade or so, are no exception. By explicitly estimating the probability of receiving a specific treatment or intervention, they allow sociologists to mimic randomized experiments when estimating causal effects. In practice, these methods require fitting two models: a propensity score model that estimates the propensity scores, and an outcome model that estimates the causal effect. However, both the functional form and the coefficient estimates of a statistical model carry non-negligible uncertainty. Within the framework of propensity score analysis, this study systematically reviews and explains model form uncertainty and model coefficient uncertainty, together with strategies for handling each. By analyzing Monte Carlo simulated data and empirical survey data, the paper demonstrates how researchers can use Bayesian model averaging to choose among multiple candidate propensity score models, and how joint maximum likelihood estimation can address coefficient uncertainty in the propensity score model and the outcome model. The paper also shows that, once the uncertainty in propensity score estimation is taken into account, the causal effect estimated in the outcome model has a narrower confidence interval and higher statistical efficiency.

Key words: Propensity Score Method, Statistical Efficiency, Model Coefficient Uncertainty, Model Form Uncertainty, Bayesian Averaging
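
To make the idea of averaging over candidate propensity score models concrete, the following is a minimal Python sketch, not the paper's own implementation. It assumes that the candidate models are all non-empty subsets of three simulated covariates, that each candidate is a logistic regression, and that posterior model probabilities are approximated by weights proportional to exp(-BIC/2). The function names `fit_logit` and `bma_propensity`, the simulated data, and the coefficient values are all hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np


def fit_logit(X, t, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson; return (beta, log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (t - p)                 # score vector
        hess = X.T @ (X * w[:, None])        # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-X @ beta)), 1e-12, 1.0 - 1e-12)
    loglik = np.sum(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))
    return beta, loglik


def bma_propensity(covariates, t):
    """Average propensity scores over all non-empty covariate subsets.

    Weights are proportional to exp(-BIC/2), a BIC-based approximation
    to posterior model probabilities (an assumption of this sketch).
    """
    n, k = covariates.shape
    intercept = np.ones((n, 1))
    bics, scores, subsets = [], [], []
    for size in range(1, k + 1):
        for subset in itertools.combinations(range(k), size):
            X = np.hstack([intercept, covariates[:, list(subset)]])
            beta, loglik = fit_logit(X, t)
            bics.append(-2.0 * loglik + X.shape[1] * np.log(n))
            scores.append(1.0 / (1.0 + np.exp(-X @ beta)))
            subsets.append(subset)
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))  # subtract the minimum for numerical stability
    weights = weights / weights.sum()
    averaged = weights @ np.array(scores)         # model-averaged propensity score per unit
    return averaged, weights, subsets


# Hypothetical simulated data: treatment depends on x0 and x1 only; x2 is irrelevant.
rng = np.random.default_rng(0)
n = 2000
covariates = rng.normal(size=(n, 3))
true_logit = 0.5 + 1.0 * covariates[:, 0] - 0.8 * covariates[:, 1]
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

ps_bma, weights, subsets = bma_propensity(covariates, t)
for subset, weight in zip(subsets, weights):
    print(subset, round(float(weight), 4))
```

The averaged score `ps_bma` could then be used for matching, weighting, or stratification in place of a score from a single chosen specification. Enumerating every subset is feasible only for a handful of covariates; larger model spaces would need a search or sampling strategy instead.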

HU Anning. Uncertainty of Statistical Models and Propensity Score Methods[J]. Chinese Journal of Sociology, 2017, 37(1): 186-210.
