The Difference of Occupation Choice Among Graduates with Different Majors and Degrees

The shares of graduates who work in education institutes, industry, or government differ by major and degree. This article looks at how major, degree, and other factors may affect people's occupation choices.

The data come from the 2013 National Survey of College Graduates, which has 104,599 observations and 515 variables. I extracted a subset of 87,145 respondents who had graduated before 2013 and held a job during the survey reference month (February 2013). Nine variables are of interest:

Variable in the Raw Data	Description	Variable Used in Regression Models	Categories
emsecsm	Employer sector: 1 = education institute, 2 = government, 3 = industry	job_edu	1 = education institute, 0 = others
dgrdg	Degree: 1 = bachelor, 2 = master, 3 = PhD, 4 = professional	degree	1 = bachelor, 2 = master, 3 = PhD or professional
ndgmemg	Major: 1 = Computer and mathematical sciences, 2 = Biological, agricultural and environmental life sciences, 3 = Physical and related sciences, 4 = Social and related sciences, 5 = Engineering, 6 = S and E-Related Fields, 7 = Non-S and E Fields	(same)
dgryr	Year of graduation	gradyr	number of years after graduation: 2013 - dgryr
gender	Gender	(same)
salary	Salary	ln_salary	log of salary
ctzn	Citizenship: 1 = US citizen, native, 2 = US citizen, naturalized, 3 = permanent resident, 4 = temporary resident	citizen	1 = US citizen, 2 = permanent resident, 3 = temporary resident
satsal	Satisfaction with salary: 1 = very satisfied, 2 = somewhat satisfied, 3 = somewhat dissatisfied, 4 = very dissatisfied	satis	4 = very satisfied, 3 = somewhat satisfied, 2 = somewhat dissatisfied, 1 = very dissatisfied
wtsurvy	Weight	(same)
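The recoding in the table can be written out explicitly. The following is a rough Python sketch, not the SAS data step used for this post. The function name `recode_row` and the example record are made up for illustration, and the sketch assumes the raw codes can be read as integers and that salary is positive.

```python
import math

SURVEY_YEAR = 2013  # survey year stated in the post

def recode_row(raw):
    """Map one raw survey record (a dict) to the regression variables."""
    dgrdg = int(raw["dgrdg"])
    ctzn = int(raw["ctzn"])
    return {
        # 1 if the employer is an education institute, 0 otherwise
        "job_edu": 1 if str(raw["emsecsm"]) == "1" else 0,
        # PhD (3) and professional (4) are merged into level 3
        "degree": min(dgrdg, 3),
        # number of years after graduation
        "gradyr": SURVEY_YEAR - int(raw["dgryr"]),
        # natural log of salary (assumes salary > 0)
        "ln_salary": math.log(float(raw["salary"])),
        # native (1) and naturalized (2) citizens are merged into level 1
        "citizen": max(ctzn - 1, 1),
        # reverse the scale so that 4 = very satisfied
        "satis": 5 - int(raw["satsal"]),
    }

# Hypothetical record, for illustration only
example = {"emsecsm": "1", "dgrdg": 4, "dgryr": 2005,
           "salary": 80000, "ctzn": 2, "satsal": 1}
recoded = recode_row(example)
print(recoded)
```

Reversing satsal with `5 - satsal` makes a larger value of satis mean more satisfaction, which matches the category column above.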

The table below shows the share of graduates working in education institutes or industry, by major and degree. The PhD columns also include professional degrees, following the degree recoding above. Graduates with PhD degrees are more likely than other graduates to work in education institutes, while most graduates with bachelor's or master's degrees work in industry.

	Education Institute			Industry
	Bachelor	Master	PhD	Bachelor	Master	PhD
Computer & Math	0.143	0.218	0.476	0.771	0.703	0.451
Biology, Agriculture, & Environment	0.199	0.320	0.499	0.624	0.478	0.386
Physics	0.192	0.327	0.406	0.662	0.541	0.502
Social Science	0.162	0.320	0.491	0.690	0.492	0.398
Engineering	0.043	0.095	0.269	0.837	0.783	0.648
Science & Engineering Related	0.164	0.294	0.266	0.753	0.602	0.658
Non Science & Engineering	0.177	0.359	0.351	0.691	0.535	0.511

The table above can be produced with SAS proc sql and proc transpose as follows:

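proc sql;
    /* Share of graduates in each sector, by major and degree.
       The mean of a 0/1 comparison is the share of rows where it holds. */
    create table job_major_degree as
    select degree,
           ndgmemg,
           mean(emsecsm = '1') as Education,
           mean(emsecsm = '2') as Government,
           mean(emsecsm = '3') as Industry
    from proj_jobchoice
    group by ndgmemg, degree
    order by ndgmemg, degree;  /* the BY statement in proc transpose needs sorted input */
quit;

/* One row per major, one column per degree: education shares */
proc transpose data = job_major_degree out = job_major_degree2;
    by ndgmemg;
    id degree;
    var Education;
run;

/* Same layout for the industry shares */
proc transpose data = job_major_degree out = job_major_degree3;
    by ndgmemg;
    id degree;
    var Industry;
run;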
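The `mean(emsecsm = '1')` expression works because the comparison evaluates to 1 or 0 for each row, so its mean within a group is the share of that group working in the sector. The same calculation is spelled out below as a small Python sketch. The records are made up, and `sector_shares` is a hypothetical helper, not part of the analysis in this post.

```python
from collections import defaultdict

# Made-up records: (major code, degree code, employer sector code)
records = [
    (5, 1, "3"), (5, 1, "3"), (5, 1, "1"), (5, 1, "2"),
    (5, 3, "1"), (5, 3, "3"),
]

def sector_shares(rows):
    """Return {(major, degree): {sector: share}} from (major, degree, sector) rows."""
    counts = defaultdict(lambda: defaultdict(int))
    for major, degree, sector in rows:
        counts[(major, degree)][sector] += 1
    shares = {}
    for key, by_sector in counts.items():
        total = sum(by_sector.values())
        # Sector codes: "1" = education, "2" = government, "3" = industry
        shares[key] = {s: by_sector.get(s, 0) / total for s in ("1", "2", "3")}
    return shares

shares = sector_shares(records)
print(shares[(5, 1)])
```

Like the SAS query, this sketch is unweighted. It does not use the survey weight wtsurvy.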

The graph below shows the share of graduates working in industry, education institutes, or government, plotted against their year of graduation. A much higher share of people who graduated in the last 5 years (after 2008) work in education institutes. One possible explanation is the tenure track: some faculty members may move to industry during the 5-to-6-year tenure-track period.

The SAS code for the data and the graph is as follows. Proc sgplot is used to draw the line plot.

proc sql;
    create table job_grad as
    select dgryr,
           mean(emsecsm = '1') as Education,
           mean(emsecsm = '2') as Government,
           mean(emsecsm = '3') as Industry
    from proj_jobchoice
    group by dgryr;
quit;

proc sgplot data=job_grad (where=(1959<dgryr<2012));
    title "Occupation of Graduates";
    series x=dgryr y=Education / lineattrs = (thickness = 2);
    series x=dgryr y=Government / lineattrs = (thickness = 2 pattern = 2);
    series x=dgryr y=Industry / lineattrs = (thickness = 2 pattern = 4);
    xaxis label = 'Year of Graduation';
    yaxis label = 'Percentage';
run;

Next I fit a logistic model to study the relationship between occupation sector and degree, major, and other factors. The dependent variable, "job_edu", indicates whether a person works at an education institute. The model has 6 independent variables: degree, major (plus the interaction between degree and major), number of years after graduation, gender, log of salary, and citizenship. The SAS code is as follows. The "plots = " option requests estimation plots such as the ROC curve and the confidence interval plot, the "ctable" option with "pprob = " produces a classification table at the specified cutoffs, and the "lackfit" option runs the lack-of-fit test.

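ods graphics on;

proc logistic data = proj_jobchoice descending plots=all;
    /* Reference levels: bachelor, computer and math, US citizen */
    class degree (ref = '1') ndgmemg (ref = '1') citizen (ref = '1') gender / param = ref;
    /* Variable names follow the table above; a list of cutoffs for pprob goes in parentheses */
    model job_edu = degree|ndgmemg gradyr gender ln_salary citizen / ctable pprob = (0.4 0.5 0.6) lackfit;
    weight wtsurvy;
run;

ods graphics off;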
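To make the classification table more concrete, here is a minimal Python sketch of what such a table summarizes at each cutoff. The outcomes and predicted probabilities are made up, and `classification_table` is a hypothetical helper. The sketch simply predicts an event when the probability reaches the cutoff, whereas SAS additionally applies a bias adjustment to the predicted probabilities, so the numbers from proc logistic would not match a naive count exactly.

```python
def classification_table(y_true, p_hat, cutoffs):
    """Return one summary row per cutoff; assumes both outcomes occur in y_true."""
    rows = []
    for cutoff in cutoffs:
        # Predict an event (1) when the predicted probability reaches the cutoff
        predicted = [1 if p >= cutoff else 0 for p in p_hat]
        pairs = list(zip(y_true, predicted))
        tp = pairs.count((1, 1))  # events predicted as events
        fn = pairs.count((1, 0))  # events predicted as non-events
        fp = pairs.count((0, 1))  # non-events predicted as events
        tn = pairs.count((0, 0))  # non-events predicted as non-events
        rows.append({
            "cutoff": cutoff,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / len(pairs),
        })
    return rows

# Made-up outcomes (1 = works at an education institute) and predicted probabilities
y = [1, 1, 1, 0, 0, 0, 0, 0]
p = [0.9, 0.55, 0.35, 0.65, 0.45, 0.3, 0.2, 0.1]
table = classification_table(y, p, [0.4, 0.5, 0.6])
for row in table:
    print(row)
```

Raising the cutoff labels fewer observations as events, so sensitivity can only fall and specificity can only rise. That trade-off is what the ROC curve traces out.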

I post part of the results below, including the estimation results, the classification table, and the ROC curve. From the estimation table, all 6 independent variables are significantly related to the probability of working at an education institute. PhDs are more likely to work at education institutes, and engineering graduates are the least likely to. Women are more likely to work at education institutes. Compared with US citizens, temporary residents are more likely to work at education institutes, while permanent residents are less likely to. People who graduated earlier or earn higher salaries are less likely to work at education institutes.
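For reading signs like these, it helps to remember that a logistic coefficient acts on the log-odds. The Python sketch below shows the conversion. The two numbers in it are hypothetical and chosen only for illustration; they are not estimates from this model.

```python
import math

def inv_logit(x):
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit change in the predictor."""
    return math.exp(beta)

# Hypothetical values, for illustration only
baseline_logit = -1.5   # log-odds for a reference-group graduate
beta_phd = 1.2          # coefficient on a PhD dummy

p_base = inv_logit(baseline_logit)
p_phd = inv_logit(baseline_logit + beta_phd)
print(round(p_base, 3), round(p_phd, 3), round(odds_ratio(beta_phd), 3))
```

A positive coefficient means an odds ratio above 1 and a higher probability, and a negative coefficient means the opposite. Because the model includes the degree and major interaction, a degree coefficient on its own describes the effect for the reference major only.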


Jason's Blog
