* Chinese higher education during expansion: Quality and social stratification
* CJS 2015 Vol. 3
* Xiaoyang Ye & Yanqing Ding
* April 5, 2014; modified May 5, 2015

/*
This program creates all the tables (except tables 2&3) in the paper:
  - Ye, Xiaoyang, and Ding Yanqing. 2015. "Chinese higher education during
    expansion: Quality and social stratification." Chinese Journal of
    Sociology (in Chinese). Vol(3).

The program is organized into the following sections:
  - Setup, log file and data file
  - Basic variable cleaning
  - Part 1 - Access to elite college (table 4)
  - Part 2 - Effects of college quality (tables 5-9)
  - Summary stat - (table 1)

NOTE ABOUT THE DATA.
The "2011 Beijing Survey" data are confidentially provided by the Institute of
Economics of Education at Peking University (http://iee.gse.pku.edu.cn/).
Researchers who want to use the data should complete the application process.
The full survey data cover 5 types of colleges: 985 (6), 211 (16),
1st&2nd tier (24), private (3), vocational (10). We only use 985, 211 and
1st&2nd tier in this paper.

Figures 1-4.
These figures were plotted in 2011 using Excel. We can't provide the Stata
code here. The statistics are directly from the Chinese Education Statistics
Yearbook. If you have any questions w.r.t. these figures, please contact us.
*/

****************************************************************************
**                 STANDARD HEADERS TO SETUP YOUR SESSION                 **
****************************************************************************
version 13.1            /* set the version */
capture clear           /* clear any existing data */
capture log close       /* close any open logs */
set more off, perm      /* set 'more' to off in scrolling */

****************************************************************************
**                 SETUP DIRECTORIES & SPECIFY DATA FILE                  **
****************************************************************************
global basedir "C:\Users\...\Data"
global logfile "${basedir}log"

****************************************************************************
**                           DATA and LOG FILE                            **
****************************************************************************
log using "${logfile}higher_ed_`c(current_date)'.log", replace
use "${basedir}\2011Beijing Survey", clear
keep if school>3        /* keep 985, 211 and 1st/2nd tiers */

****************************************************************************
**                               Variables                                **
****************************************************************************
/* Variables (e.g., male, hanzu, eduy_father) were cleaned by the researchers
   in the IEE at Peking U. */
/* ISEI was coded by the authors */

***** Dependent var
/* 985 */
tab university985
/* 1st tier (1), 211 (2), 985 (3) */
tab school
/* key high school */
tab highschool

***** Independent var
/* province of origin (Hukou) */
g prov=q1e1_2
/* high school ranking (1=10%, 5<75% reference group) */
tab rank, g(r)

* Global controls
/* individual: male, race, only-child */
global individual "male hanzu sibling"
/* family: parental education, ISEI, log income, city hukou, province FE */
global family "eduy_father eduy_mother isei ln_income2010 city i.prov"
/* high school: science track, key high, ranking */
global highschool "science highschool r1 r2 r3 r4"

*******************
***** TABLE 3 *****
*******************
/* The national % are
from Lu (2003). */

/* Father and mother % */
tab q1g1a, gen(f_)
tab q1g2a, gen(m_)

preserve
collapse (mean) f_* m_*    /* total */
/* father social class rate % */
gen rate_f_1=f_1/4.8
gen rate_f_2=f_2/42.9
gen rate_f_3=f_3/17.5
gen rate_f_4=f_4/11.2
gen rate_f_5=f_5/7.1
gen rate_f_6=f_6/7.2
gen rate_f_7=f_7/4.6
gen rate_f_8=f_8/1
gen rate_f_9=f_9/1.6
gen rate_f_10=f_10/2.1
save "${basedir}\t3_1.dta", replace
restore

preserve
collapse (mean) f_* m_*, by(school)    /* by school */
/* father social class rate % */
gen rate_f_1=f_1/4.8
gen rate_f_2=f_2/42.9
gen rate_f_3=f_3/17.5
gen rate_f_4=f_4/11.2
gen rate_f_5=f_5/7.1
gen rate_f_6=f_6/7.2
gen rate_f_7=f_7/4.6
gen rate_f_8=f_8/1
gen rate_f_9=f_9/1.6
gen rate_f_10=f_10/2.1
save "${basedir}\t3_2.dta", replace
restore

****************************************************************************
**                      Q1: Access to elite college                       **
****************************************************************************

***** Non-missing data
/* CHECK THE RANDOM-MISSING ASSUMPTION */
/* Results are similar using the missing dummies, thus we use case deletion */
drop if missing(university985, school,male,hanzu,sibling,eduy_father,eduy_mother)
drop if missing(isei,ln_income2010,city,prov, science, highschool, rank)
save "${basedir}\data-0405", replace

*******************
***** TABLE 4 *****
*******************
/* Raw coefficients (log odds) are reported */

***** Regression
use "${basedir}\data-0405", clear
g grade=q1c2_2
g robust=college*major4*grade    /* cluster robust s.e.
*/

*** College entry
* Logit
logit university985 $individual $family, cl(robust)
est store m1
logit university985 $individual $family $highschool, cl(robust)
est store m2

/* Grade-trend - not reported in the table */
logit university985 $individual $family $highschool if grade==1, cl(robust)
est store m11
logit university985 $individual $family $highschool if grade==2, cl(robust)
est store m12
logit university985 $individual $family $highschool if grade==3, cl(robust)
est store m13
logit university985 $individual $family $highschool if grade==4, cl(robust)
est store m14
outreg2 [m11 m12 m13 m14] using reg1.xls, replace dec(3)

* Multinomial
mlogit school $individual $family, cl(robust)    /*not reported in the table*/
mlogit school $individual $family $highschool, cl(robust)
est store m3

/* Grade-trend - not reported in the table */
mlogit school $individual $family $highschool if grade==1, cl(robust) b(1)
est store m11
mlogit school $individual $family $highschool if grade==2, cl(robust) b(1)
est store m12
mlogit school $individual $family $highschool if grade==3, cl(robust) b(1)
est store m13
mlogit school $individual $family $highschool if grade==4, cl(robust) b(1)
est store m14
outreg2 [m11 m12 m13 m14] using reg2.xls, replace dec(3)

*** High school entry
g ranking=6-rank
logit highschool $individual $family, cl(robust)
est store m4
ologit ranking $individual $family, cl(robust)
est store m5
outreg2 [m1 m2 m3 m4 m5] using reg.xls, replace dec(3)    /* Table 4 */

****************************************************************************
**                     Q2: Effects of college quality                     **
****************************************************************************

***** Dependent var
/* 4th grade */
tab destination
/* Re-order the destinations */
recode destination 6=.
/* no */
recode destination 5=5
recode destination 1=5
recode destination 2=1
recode destination 3=2
recode destination 4=3
recode destination 6=4

/* all grades */
* Whether has a plan
tab q39a
recode q39a 2=0
* Plan
tab grapro
recode grapro 6=5
label define occupation 1 "State" 2 "Foreign" 3 "Private" 4 "Government" 5 "School"
label values grapro occupation
label values destination occupation

***** Independent var
* Factor analysis (see Table 2)
factor q21s1-q21s7 q22s1-q22s9, ml               /* test */
screeplot
factor q21s1-q21s7 q22s1-q22s9, factors(3) ml    /* three factors used */
screeplot
rotate
predict f1                                       /* prediction */

* Generate college-grade-major factor scores
sort robust
egen f1a=mean(f1), by(robust)    /* quality measure 1 */
/* by school-year */
g syear=(college+100)*grade
egen f1b=mean(f1), by(syear)     /* quality measure 2 */

*******************
***** TABLE 5 *****
*******************

***** Regression
*** No interactions
global family "eduy_father eduy_mother isei ln_income2010 city"
global college "partystatus ganbu i.school i.major4"

* Multinomial
set matsize 5000

/* destination */
/* school-grade-major */
drop if missing(eduy_father,eduy_mother,isei,ln_income2010,city, ///
    partystatus,ganbu, school, major4, male, hanzu, sibling )
mlogit destination f1a if grade==4, b(5) cl(robust)
est store m1
mlogit destination f1a $individual $family $college if grade==4, b(5) cl(robust)
est store m2
g f1aisei=f1a*isei
mlogit destination f1a f1aisei $individual $family $college if grade==4, b(5) cl(robust)
est store m3

/* school-grade */
mlogit destination f1b if grade==4, b(5) cl(robust)
est store m4
mlogit destination f1b $individual $family $college if grade==4, b(5) cl(robust)
est store m5
outreg2 [m2 m5] using reg.xls, replace bd(3)    /* used in table 5 */

***********************
***** TABLE 6,7,8 *****
***********************
/* plan for grades 1-4 */
mlogit grapro f1a $individual $family $college, b(5) cl(robust)
est store m5
mlogit grapro f1b $individual $family $college, ///
    b(5) cl(robust)
est store m6
outreg2 [m5 m6] using reg.xls, replace bd(3)    /* used in table 7 */

*** Interactions
tab school, g(s)
g fs21=f1a*s2
g fs31=f1a*s3
g fsb2=f1b*s2
g fsb3=f1b*s3
global fa "fs21 fs31"
global fb "fsb2 fsb3"

/* regressions */
mlogit destination f1a $fa $individual $family $college if grade==4, b(5) cl(robust)
est store m7
mlogit grapro f1a $fa $individual $family $college, b(5) cl(robust)
est store m9
mlogit destination f1b $fb $individual $family $college if grade==4, b(5) cl(robust)
est store m8
mlogit grapro f1b $fb $individual $family $college, b(5) cl(robust)
est store m10
outreg2 [m7 m8] using reg.xls, replace bd(3)     /* used in table 6 */
outreg2 [m9 m10] using reg.xls, replace bd(3)    /* used in table 8 */

*******************
***** TABLE 9 *****
*******************

*** Additional - log wage (tobit model)
g lwage=ln(q40b_1)
tobit lwage f1a $individual $family $college if grade==4, ll(7.6) ul(10.3) cl(robust)
est store m1
tobit lwage f1a if grade==4, ll(7.6) ul(10.3) cl(robust)
est store m10
tobit lwage f1b if grade==4, ll(7.6) ul(10.3) cl(robust)
est store m2
tobit lwage f1b $individual $family $college if grade==4, ll(7.6) ul(10.3) cl(robust)
est store m11
outreg2 [m1 m2] using reg.xls, replace bd(3)     /* used in table 9 */

/* Low beginning */
drop if missing(q39c1a,q39c2a)
ologit q39c1a f1a $individual $family $college, cl(robust)
est store m21
ologit q39c2a f1a $individual $family $college, cl(robust)
est store m22
outreg2 [m21 m22] using reg.xls, replace bd(3)   /* used in table 9 */

*********************************
***** TABLE 1 - descriptive *****
*********************************

***** Descriptive
sum partystatus
sum partystatus if school==1
sum partystatus if school==2
sum partystatus if school==3
sum ganbu
sum ganbu if school==1
sum ganbu if school==2
sum ganbu if school==3
tab market
tab market if school==1
tab market if school==2
tab market if school==3

* All
preserve
keep male hanzu sibling eduy_father eduy_mother isei ///
    ln_income2010 city ///
    highschool r1 school partystatus ganbu f1a f1b
outreg2 using sum.doc, sum(detail) replace eqkeep(N mean sd)
restore

* College type
forvalues i=1/3 {
    preserve
    keep male hanzu sibling eduy_father eduy_mother isei ln_income2010 city ///
        highschool r1 school partystatus ganbu f1a f1b
    keep if school==`i'
    outreg2 using sum_`i'.xls, sum(detail) replace eqkeep(N mean sd)
    restore
}

* Additional - destination
tab grapro
bysort school: tab grapro

********************************************************************************
log close
exit, clear