Differences in Occupation Choice Among Graduates with Different Majors and Degrees

The shares of graduates who work in educational institutes, industry, or government differ across majors and degrees. This article looks at how major, degree, and other factors may affect people's occupation choices.

The data come from the 2013 National Survey of College Graduates, with 104,599 observations and 515 variables in total. I extracted a subset of 87,145 observations: people who had graduated before 2013 and held a job during the survey reference month (Feb 2013). There are 9 variables of interest:

Variable in the Raw Data                                        Variable Used in Regression Models

Employer sector:
1 = education institute,                                        1 = education institute,
2 = government,                                                 0 = others
3 = industry

Highest degree:
1 = bachelor,                                                   1 = bachelor,
2 = master,                                                     2 = master,
3 = PhD,                                                        3 = PhD or professional
4 = professional

Major:                                                          (same coding)
1 = Computer and mathematical sciences
2 = Biological, agricultural and environmental life sciences
3 = Physical and related sciences
4 = Social and related sciences
5 = Engineering
6 = S and E-Related Fields
7 = Non-S and E Fields

Year of graduation                                              number of years after graduation: 2013 - dgryr

Salary                                                          log of salary

Citizenship:
1 = US citizen, native,                                         1 = US citizen,
2 = US citizen, naturalized,                                    2 = permanent resident,
3 = permanent resident,                                         3 = temporary resident
4 = temporary resident

Satisfaction of Salary:
1 = very satisfied,                                             4 = very satisfied,
2 = somewhat satisfied,                                         3 = somewhat satisfied,
3 = somewhat dissatisfied,                                      2 = somewhat dissatisfied,
4 = very dissatisfied                                           1 = very dissatisfied
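
The recodings listed above could be done in a single data step. The following is only a sketch under assumptions, not the code actually used for this article: the input data set raw_survey and the raw variable names raw_degree, raw_citizen, raw_satsal and salary are hypothetical placeholders, and they are assumed to be numeric. Only emsecsm, dgryr, jobedu, gradyr, lnsalary, degree and citizen are names that appear elsewhere in this article. The filter on current employment is left out because the article does not name that variable.

```sas
/* Sketch only: raw_survey, raw_degree, raw_citizen, raw_satsal and salary
   are hypothetical names chosen for illustration. */
data proj_jobchoice;
    set raw_survey;
    where 0 < dgryr < 2013;                 /* graduated before 2013 */

    jobedu = (emsecsm = '1');               /* 1 = education institute, 0 = others */
    gradyr = 2013 - dgryr;                  /* number of years after graduation */
    if salary > 0 then lnsalary = log(salary);   /* stays missing otherwise */

    /* merge PhD (3) and professional (4) into one level */
    if raw_degree in (1, 2) then degree = raw_degree;
    else if raw_degree in (3, 4) then degree = 3;

    /* merge native (1) and naturalized (2) US citizens */
    if raw_citizen in (1, 2) then citizen = 1;
    else if raw_citizen = 3 then citizen = 2;
    else if raw_citizen = 4 then citizen = 3;

    /* reverse the scale so that 4 = very satisfied (satsal is a hypothetical name) */
    if raw_satsal in (1, 2, 3, 4) then satsal = 5 - raw_satsal;
run;
```

Explicit if/else branches are used instead of min() or max() so that a missing raw value stays missing rather than being mapped to a valid category.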

The table below shows the proportion of graduates working in education institutes or industry, by major and degree. In most majors, graduates with PhD degrees are more likely to work in education institutes than graduates with lower degrees, while industry is the largest employer of graduates with bachelor's or master's degrees in every major.

                                        Education Institute           Industry
                                        Bachelor  Master    PhD       Bachelor  Master    PhD
Computer & Math                         0.143     0.218     0.476     0.771     0.703     0.451
Biology, Agriculture, & Environment     0.199     0.320     0.499     0.624     0.478     0.386
Physics                                 0.192     0.327     0.406     0.662     0.541     0.502
Social Science                          0.162     0.320     0.491     0.690     0.492     0.398
Engineering                             0.043     0.095     0.269     0.837     0.783     0.648
Science & Engineering Related           0.164     0.294     0.266     0.753     0.602     0.658
Non Science & Engineering               0.177     0.359     0.351     0.691     0.535     0.511

The table above can be produced with SAS proc sql and proc transpose as follows:

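proc sql;
/* share of each employer sector within every major-by-degree cell */
create table job_major_degree as
select ndgmemg, degree,
       mean(emsecsm = '1') as Education,
       mean(emsecsm = '2') as Government,
       mean(emsecsm = '3') as Industry
from proj_jobchoice
group by ndgmemg, degree
order by ndgmemg, degree;
quit;

/* one row per major, one column per degree: education share */
proc transpose data = job_major_degree out = job_major_degree2;
by ndgmemg;
id degree;
var Education;
run;

/* same layout for the industry share */
proc transpose data = job_major_degree out = job_major_degree3;
by ndgmemg;
id degree;
var Industry;
run;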


The graph below shows the share of graduates working in industry, education institutes, or government against their year of graduation. A much higher share of people who graduated in the last 5 years (after 2008) work in education institutes. One possible explanation is the tenure track: some faculty members may move to industry during the 5 to 6 years of the tenure-track period.

The SAS code for the data and the graph is as follows. "proc sgplot" is used to generate the line plot.

proc sql;
/* share of each employer sector by year of graduation */
create table job_grad as
select dgryr,
       mean(emsecsm = '1') as Education,
       mean(emsecsm = '2') as Government,
       mean(emsecsm = '3') as Industry
from proj_jobchoice
group by dgryr
order by dgryr;
quit;

/* line plot for graduation years 1960 to 2011 */
proc sgplot data=job_grad (where=(1959<dgryr<2012));
title "Occupation of Graduates";
series x=dgryr y=Education / lineattrs = (thickness = 2);
series x=dgryr y=Government / lineattrs = (thickness = 2 pattern = 2);
series x=dgryr y=Industry / lineattrs = (thickness = 2 pattern = 4);
xaxis label = 'Year of Graduation';
yaxis label = 'Percentage';
run;


Next, I fit a logistic regression model to study the relationship between occupation sector and degree, major, and other factors. The dependent variable, "jobedu", indicates whether the graduate works at an education institute. The model has 6 independent variables: degree, major (plus the interaction between degree and major), number of years after graduation, gender, log of salary, and citizenship. The SAS code is as follows. The "plots = " option generates plots such as the ROC curve and confidence interval plots, the "ctable" option with "pprob = " generates classification tables at specific cutoffs, and the "lackfit" option runs the Hosmer-Lemeshow lack-of-fit test.

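ods graphics on;
proc logistic data = proj_jobchoice descending plots=all;
/* reference coding; level 1 is the baseline for degree, major, and citizenship */
class degree (ref = '1') ndgmemg (ref = '1') citizen (ref = '1') gender / param = ref;
/* degree|ndgmemg expands to both main effects plus their interaction */
model jobedu = degree|ndgmemg gradyr gender lnsalary citizen / ctable pprob = 0.4 0.5 0.6 lackfit;
weight wtsurvy;
run;

ods graphics off;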
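
Written out, the model specified by the class and model statements above has roughly the following form. This is my own notation, read off the SAS code rather than taken from the output: the symbols p_i, D, M, C, G and the Greek coefficients are assumptions made for illustration, and the gender dummy G_i equals 1 for whichever gender level SAS does not pick as the reference.

```latex
\[
\log\frac{p_i}{1-p_i}
  = \beta_0
  + \sum_{d=2}^{3} \alpha_d D_{id}
  + \sum_{m=2}^{7} \gamma_m M_{im}
  + \sum_{d=2}^{3}\sum_{m=2}^{7} \delta_{dm} D_{id} M_{im}
  + \beta_1\,\mathrm{gradyr}_i
  + \beta_2\,G_i
  + \beta_3\,\mathrm{lnsalary}_i
  + \sum_{c=2}^{3} \theta_c C_{ic},
\qquad
p_i = \Pr(\mathrm{jobedu}_i = 1)
\]
```

Here D_{id}, M_{im} and C_{ic} are 0/1 dummies for degree level d, major group m and citizenship status c, with level 1 as the reference in each case (bachelor, computer and mathematical sciences, US citizen). In the classification tables, an observation is classified as working at an education institute when its estimated probability is at least the cutoff (0.4, 0.5 or 0.6).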

Parts of the results are posted below, including the estimation results, the classification table, and the ROC curve. The estimation table shows that all 6 independent variables are significantly related to the probability of working at an education institute. PhDs are more likely to work at education institutes, and engineering graduates are the least likely. Women are more likely to work at education institutes. Compared with US citizens, foreigners on temporary visas are more likely to work at education institutes, while permanent residents are less likely. People who graduated earlier and people who earn higher salaries are less likely to work at education institutes.
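
Statements such as "more likely" and "less likely" come from the signs of the estimated coefficients. As a reminder of how a logistic coefficient is read (a general property of the model, not a number from this analysis; the symbols x_k and beta_k are my notation), a one-unit increase in a predictor x_k multiplies the odds of working at an education institute by e^{beta_k}, holding the other variables fixed:

```latex
\[
\frac{\mathrm{odds}(x_k + 1)}{\mathrm{odds}(x_k)} = e^{\beta_k},
\qquad
\mathrm{odds} = \frac{p}{1-p},
\qquad
p = \Pr(\mathrm{jobedu} = 1)
\]
```

A positive coefficient therefore gives an odds ratio above 1 (more likely) and a negative one gives an odds ratio below 1 (less likely). Because degree and major enter the model with an interaction, the effect of a degree level is not a single odds ratio: it depends on the major group, and the other way round.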


