PROC GMAP Example: Voter Turnout

IPUMS-CPS collects voter turnout information for each quadrennial presidential election and each midterm election. I used respondents' answers to the question of whether they voted in the 2008 presidential election, together with their state of residence, to visualize state turnout rates for the 2008 election.

Before visualizing the data with PROC GMAP, I first read in the data and computed the turnout rate for each state. A DATA step and PROC MEANS were used as follows:

%let path=...; 
libname mylab "&path";
filename ASCIIDAT "&path/cps.dat";

/* Read the fixed-width IPUMS-CPS extract */
data mylab.cps;
  infile ASCIIDAT pad missover lrecl=14;
  input
    STATEFIP    1-2
    WTSUPP      3-12 .4
    VOTEREG     13-14
  ;
run;

/* Drop records coded 99, then build a 0/1 indicator that equals 1 when VOTEREG = 2 */
data vote08 (rename = (STATEFIP = state));
  set mylab.cps;
  if VOTEREG = 99 then delete;
  else if VOTEREG = 2 then vote_indicator = 1;
  else vote_indicator = 0;
run;

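proc sort data = vote08;
  by state;
run;

/* The weighted mean of the 0/1 indicator is the weighted turnout rate per state. */
/* The result is saved in mylab because the mapping macro reads it from there. */
proc means data = vote08 noprint;
  by state;
  var vote_indicator;
  weight WTSUPP;
  output out = mylab.turnout_rates_by_state (drop = _type_ _freq_)
         mean(vote_indicator) = turnout_rate;
run;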
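
As a rough cross-check (not part of the original program), the same weighted rate can be computed by hand as the sum of weight times indicator divided by the sum of weights. This is only a sketch: the table name "turnout_check" and the column name "turnout_rate_check" are made up for illustration, and the WHERE clause assumes that records with non-positive weights should contribute nothing, which is how PROC MEANS treats them by default.

```sas
/* Sketch: weighted turnout rate = sum(weight * indicator) / sum(weight).    */
/* turnout_check and turnout_rate_check are hypothetical names.              */
proc sql;
  create table turnout_check as
  select state,
         sum(WTSUPP * vote_indicator) / sum(WTSUPP) as turnout_rate_check
  from vote08
  where WTSUPP > 0
  group by state
  order by state;
quit;
```

The values in turnout_rate_check should agree with the turnout_rate column produced by PROC MEANS, which makes this a quick sanity check on the WEIGHT statement.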

The output dataset "turnout_rates_by_state" contains the two variables needed for the map: the state FIPS code and the turnout rate. Calling PROC GMAP without any additional options produces a map that is not very attractive, so I added two settings: customized colors, and a label showing the turnout rate on each state. The program is wrapped in a SAS macro so that it can be packaged and called directly in the future. The SAS code is as follows:

%macro turnout_map;
%local i colors colorscount;
%let colors = cb0020 df4949 ef9277 f5c0a9 f6e4dd e0ebf1 b3d5e6 82bbd8 4196c7 0471b0;
%let colorscount = %sysfunc(countw(&colors.));

libname mylab "...";

data turnout_rates_by_state;
set mylab.turnout_rates_by_state;
format turnout_rate percent7.2;
run;

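/* Annotate dataset: one label per state with the postal code and turnout rate */
data mapanno1;
  length function $8 text $16 size 8;
  retain xsys ysys '2' hsys '3' when 'a' style "'Albany AMT'";
  merge turnout_rates_by_state maps.uscenter( where = ( fipstate( state ) ne 'PR' ) );
  by state;
  text = catx( " / ", fipstate(state), strip(putn(turnout_rate,'percent7.2')));
  lagocean = lag( ocean );
  size = 1.5;
  /* Offshore record of a small coastal state: write "XX / rate" beside the */
  /* point, then move there so a leader line can start from it              */
  if ocean = 'Y' then do;
    function = 'label';
    position = '6';
    output;
    function = 'move';
    output;
  end;
  position = '5';
  if ocean = 'N' then do;
    /* Record that follows an offshore point: draw the leader line to the state */
    if lagocean = 'Y' then do;
      function = 'draw';
      size = round( 1.5 / 4, .01 );
    end;
    /* Every other state: postal code above the center, rate at the center */
    else do;
      function = 'label';
      position = '2';
      text = fipstate(state);
      output;
      position = '5';
      text = putn(turnout_rate, 'percent7.2' );
    end;
  end;
  output;
run;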

options nodate nonumber orientation = landscape;

%do i = 1 %to &colorscount.;
pattern&i. v = s c = cx%scan(&colors., &i.);
%end;

legend1 label = (position = top j = l 'Turnout Rate') shape = bar(.1in, .1in);
title ls = 1.5 "State Turnout Rates (2008 Election)";

ods listing close;
ods pdf file = '...\turnout_map.pdf' notoc;

proc gmap data = turnout_rates_by_state map = maps.us ( where = ( fipstate(state) ne 'PR' ) ) all;
  id state;
  choro turnout_rate / anno = mapanno1 legend = legend1 levels = &colorscount.;
run;
quit;

ods pdf close;
ods listing;
%mend;

Here, "mapanno1" is an annotate dataset that places the state postal codes and turnout rates on the map. The customized colors are stored in the local macro variable "colors", and "colorscount" holds the number of colors, which sets both the number of PATTERN statements and the number of map levels. PROC GMAP then generates the state turnout rate map for the 2008 election, and "ods pdf" exports it to a PDF file.
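
A possible way to run the macro and compare its output with the plain map is sketched below. It is an illustration rather than part of the original program: it assumes that %turnout_map has already been compiled and that the elided paths ("...") have been filled in. With MPRINT on, the log should show the ten generated statements, from "pattern1 v = s c = cxcb0020;" through "pattern10 v = s c = cx0471b0;".

```sas
/* Sketch only: assumes the %turnout_map macro above is compiled and the */
/* "..." paths inside it have been replaced with real locations.         */
options mprint;            /* echo the macro-generated statements to the log */
%turnout_map

/* For comparison: the same data with default patterns and no annotation. */
/* PATTERN definitions are global, so they are reset first.               */
goptions reset = pattern;
proc gmap data = turnout_rates_by_state map = maps.us ( where = ( fipstate(state) ne 'PR' ) ) all;
  id state;
  choro turnout_rate;
run;
quit;
```

The comparison step reuses the WORK copy of turnout_rates_by_state that the macro creates, so it has to run after the macro call.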

Comments

  1. Hi Jason,

    Thank you for the tutorial. I am trying to use SAS maps to show substance use rates in the US. I do not have data for all states, and when I draw my map, some states are missing from the image. I was wondering why this happens?
