POLS 3045: Public Opinion and Political Behavior: Fall 2021
Research Paper Assignment
In this paper assignment, you will write a research paper that analyzes either public opinion on a particular issue or
political behavior (voting and/or other forms of participation in the political system) using data from the
2020 American National Election Study (ANES).
Throughout this paper assignment, we will use the example of writing a paper on the topic of attitudes toward
same-sex (or gay) marriage. You are required to select a different topic.
You should approach this assignment as follows:
1. Using Survey Documentation and Analysis (SDA),
https://sda.berkeley.edu/sdaweb/analysis/?dataset=nes2020full, you should find a variable that measures an
aspect of public opinion or political behavior you are interested in studying.
The easiest way to do this is to use the Search panel to find variables using keywords; for example, if you
want to investigate public opinion on same-sex marriage, you might search for terms such as “marriage” or “gay.”
A search finds that V201416 is a variable that measures attitudes toward same-sex marriage; its label is
“PRE: R position on gay marriage.” In this label, “PRE” means that the question was asked in the pre-election
interview (similarly, “POST” means the question was asked in the post-election interview). “R” typically
means “respondent,” or the person who was interviewed. So we can tell that this variable probably has the
information we need in it.
2. Once you have identified one or more variables that might reflect the attitude or behavior you want to study,
you should investigate each variable in more detail by using the View button next to the variable name. This
will pop up a window showing the distribution of responses to the question, along with a “value” or code for each response.
For example, if we were to view V201416, we would find that 68.2% of respondents (corresponding to 5,568
interviewees) took the position “Gay and lesbian couples should be allowed to legally marry”; 18.6% (1,517)
said “Gay and lesbian couples should be allowed to form civil unions but not legally marry”; and 13.2%
(1,076) said “There should be no legal recognition of a gay or lesbian couple’s relationship.”[1] If there were
other variables about same-sex marriage, we would also want to view those.
3. Now that you have identified a variable that measures the concept you want to study, you should put aside the
SDA website for a while and find some existing academic research on the topic you plan to look at or similar
topics. There are a number of tools that are suitable for finding academic research: Google Scholar at
https://scholar.google.com/ is fairly good for finding academic articles, working papers, and books; GALILEO
is also a useful resource for this. It may also be helpful to work with a reference librarian to find existing
research. You should then read through the things you have found, and keep track of what you looked at so
you can use those articles and books as sources when writing the paper itself and so you will remember to
properly cite them.
For example, we would want to search for academic articles and books on attitudes toward gay marriage. If
we couldn’t find much on this topic, we probably would widen the net to include opinion about related issues
(e.g. attitudes toward gays and lesbians more generally, along with attitudes toward other restrictions on
A good research paper will typically rely on several distinct sources; if you identify fewer than five, you
probably need more.
[1] You will also note that there are categories for people who refused to answer the question and those who said they did not have an opinion
(“don’t know”). These are not counted among the “valid” responses and we would typically ignore them in an analysis.
4. Based on the articles and other research you examine, you should identify at least two factors that you think
might influence the attitude or behavior (outcome) you want to study. These factors could be things that
other studies used, or they could be factors that you think are important but were not included in earlier
studies. It may be helpful to consider factors that influence other attitudes and behaviors as well.
Once you have identified some factors that might influence the outcome you wish to study, you should search
the 2020 ANES for relevant variables that might reflect the factors you have identified. It may prove
impossible to find a variable for some of the factors, but there should be data on the most common factors.
For example, attitudes toward same-sex marriage are likely to be influenced by age, race and ethnicity, gender,
party identification, religiosity (level of devoutness to one’s religion), religious affiliation (denomination or
faith tradition, such as Catholic or Hindu), ideology, and numerous other factors. There are variables in the
ANES reflecting each of these concepts (age, for example, is V201507x). You may need to use alternative
search terms; “religiosity” does not find any variables, but words like “church,” “religion,” etc. would.
5. You should now come up with at least two hypotheses about the relationship between the outcome and the
factor(s) that might affect it. For example, you might hypothesize that younger people are more likely to
support same-sex marriage than older people, while people who consider themselves “born-again Christians”
would be less likely to support same-sex marriage than Christians who do not consider themselves “born-again.”
6. You can now test your hypotheses using the SDA system under the Analysis tab. There are a number of
options in SDA for hypothesis testing:
Tables: Contingency tables (crosstabulations or crosstabs) are typically best when both the explanatory and
outcome variables are categorical variables (including ordinal variables); that is, when the variables are
measured with values that have no intrinsic numeric meaning. For example, the same-sex marriage
variable (V201416) is categorical since the three values just reflect three different responses to the question.
If either variable is interval or ratio (i.e. a variable that has meaningful numeric values, like age or
weight), you should recode the variable to have categories. For example, you would want to recode the
age variable (V201507x) into categories before using it in a contingency table. You might recode age
into three categories—18–39, 40–64, and 65 and over—or you could recode into generations (“World
War 2,” “Baby boomers,” “Generation X,” “Millennials,” “Generation Z”) based on how old each
generation was in 2020.
Means: The comparison of means (difference of means) approach works best when you have an outcome
variable that is interval or ratio. The explanatory (row) variable should be categorical; if your
explanatory variable is interval or ratio, you will want to recode it to have categories to do a comparison of means.
Regression: Like the comparison of means approach, this works when you have an interval or ratio outcome
variable. The explanatory variable (or variables) should be interval or ratio; any categorical explanatory
variables should be transformed into one or more dummy variables. (You should not use dummy
variables for outcome variables in regression.)
Logit/probit: These are more advanced approaches that can be used when you have a dichotomous (two-category)
outcome variable or can convert the outcome to be dichotomous (i.e. a dummy variable); for example,
whether or not someone voted, or whether someone voted for Biden or Trump in 2020 (you would
need to exclude those who didn’t vote or voted for independent/minor party candidates).
As in regression, your explanatory variables would need to be interval or ratio, or converted to dummy variables.
Note: In general I would recommend avoiding regression or logit/probit unless you are familiar with using them
already. You should typically be able to answer these questions using crosstabs or difference of means.
For more information on how to use the SDA system for recoding and creating dummy variables, you can
refer to the documentation or the practice assignment below.
7. You should then write your research paper based on the findings of this analysis. The paper should include an
introduction, a literature review, statements of your hypotheses, a brief description of the data you analyzed
and how you analyzed it, your results, and a discussion of those results, with some conclusions about your
findings and how they fit into the existing research on your topic.
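If it helps to see what SDA is computing in steps two and six, the following is a minimal Python sketch, not part of the required workflow (SDA does all of this for you). It shows how valid percentages, a three-category age recode, a crosstab with column percentages, and a comparison of means can be calculated by hand. The function names are invented for illustration; the V201416 counts are the ones quoted in step two, while the individual respondents and ratings are made up and are not ANES data.

```python
from collections import Counter, defaultdict
from statistics import mean


def valid_percentages(counts):
    """Percentage of each response among valid responses (one decimal place)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def recode_age(age):
    """Collapse age in years into the three categories suggested in step six."""
    if age < 40:
        return "18-39"
    if age < 65:
        return "40-64"
    return "65+"


def crosstab_column_pct(pairs):
    """pairs: (explanatory category, outcome category) for each respondent.

    Returns {explanatory: {outcome: percent within that explanatory category}}.
    """
    pairs = list(pairs)
    cell_counts = Counter(pairs)
    column_totals = Counter(column for column, _ in pairs)
    table = {}
    for (column, outcome), n in cell_counts.items():
        table.setdefault(column, {})[outcome] = round(100 * n / column_totals[column], 1)
    return table


def mean_by_group(pairs):
    """pairs: (category, numeric outcome). Returns the mean outcome per category."""
    groups = defaultdict(list)
    for category, value in pairs:
        groups[category].append(value)
    return {category: mean(values) for category, values in groups.items()}


# Counts of valid responses to V201416, as reported in step two.
v201416 = {"marry": 5568, "civil unions": 1517, "no recognition": 1076}
print(valid_percentages(v201416))

# Made-up respondents (age, position), for illustration only.
respondents = [(25, "marry"), (31, "marry"), (38, "civil unions"), (45, "marry"),
               (52, "no recognition"), (70, "civil unions"), (81, "no recognition")]
print(crosstab_column_pct((recode_age(age), position) for age, position in respondents))

# Made-up interval outcome (a 0-100 rating) by age category.
ratings = [("18-39", 70), ("18-39", 80), ("65+", 40), ("65+", 50)]
print(mean_by_group(ratings))
```

The crosstab percentages here are computed within each category of the explanatory variable, so each age group's percentages sum to 100; this is the comparison you would use to judge whether support differs across age groups.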
Your paper should typically be somewhere between eight and twelve pages in length; it should be typed and
double-spaced using a reasonably sized font (11 or 12 point, proportional) and reasonable (one-inch) margins. Be
sure to properly cite any paraphrased or quoted material, and include a list of works cited at the end of the paper.
For further guidance on requirements for written work for this class, please refer to the syllabus.
You should also include any appropriate graphs or tables in your paper, usually by cutting and pasting; if printing in
black and white, disabling the “color coding” option in SDA will make any tables you include clearer. (You should
not, however, include the full output of each SDA procedure; instead, you should be judicious in selecting the
important part of each analysis to include.)
Some useful resources in writing your paper may include:
• Lisa A. Baglione. 2019. Writing a Research Paper in Political Science: A Practical Guide to Inquiry, Structure, and
Methods, 4th ed. Washington, D.C.: Sage/CQ Press.
• Jane E. Miller. 2015. The Chicago Guide to Writing About Numbers, 2nd ed. Chicago: U of Chicago Press.
• Jane E. Miller. 2015. The Chicago Guide to Writing About Multivariate Analysis, 2nd ed. Chicago: U of Chicago Press.
You should complete steps one and two of the assignment (identifying a topic to study) by Wednesday, October
13th at 11:59 p.m. Once you have identified a topic and a suitable outcome variable, you should send me an email
that briefly describes the topic of interest and at least one variable in the 2020 ANES that reflects attitudes or
behavior on that topic. I will review the information and give you guidance on the next steps; if you have found
multiple potential outcome variables, I will also suggest which outcome variable is most likely to lead to a fruitful
analysis. (Completing this part is worth 15% of the final paper grade.)
Then, you should complete steps three, four, and five; upon completing these steps (before Wednesday, November
3rd at 11:59 p.m.) you should send me an email with the relevant information from steps one through five; namely:
the outcome you are studying and the variable that reflects it, the explanatory factors you will explore and associated
variables, and your preliminary hypotheses about the relationships between the outcome and explanatory variables. I
will review your submission and, again, provide any necessary guidance to proceed further; this will likely include
suggestions about the analysis to conduct in step six. (This part is also worth 15% of the final paper grade.)
From there you should complete steps six and seven. I will be available for consultation as you conduct your analysis
if you have any questions or need assistance. The final paper will be due in the Brightspace drop box on the last day
of class (Wednesday, December 1st at 11:59 p.m.). Please refer to the syllabus for late penalties. The paper itself
will count as 70% of the final paper grade; it will be evaluated using a rubric that will be posted in Brightspace.
SDA Practice Assignment (optional)
This is intended as a brief walk-through of how to use the SDA system for analyzing public opinion and political
behavior data. You do not need to turn in this assignment; it is simply provided to give you some practice using
the SDA tool if needed.
You will need to access the “General Social Survey (GSS) Cumulative Data File 1972–2016” data set for this
exercise. You can either search for this data from the SDA website at https://sda.berkeley.edu/ or enter the
following address into your browser directly: https://sda.berkeley.edu/sdaweb/analysis/?dataset=gss16
We are going to investigate trends in party identification over time. This variable is called PARTYID in the GSS
cumulative file. Using the SDA system, answer the following questions as part of a short research paper (i.e. write a
single essay that answers these questions; do not respond to each question separately):
1. In 2016, what percentage of respondents said they were “strong” or “weak” Democrats?
Hint: Using “Tables” in SDA, the “Row” variable should be PARTYID(0-6)[2] and the “Selection Filter” should be
set to YEAR(2016). You should add up the percentages for these two categories.
2. In 1972, what percentage of respondents said they were “strong” or “weak” Democrats?
Hint: The “Selection Filter” should be changed to YEAR(1972). Again, you should add up the percentages for
the “strong” and “weak” categories.
3. To investigate the trend in party identification, rather than producing a chart for each year it would probably
be easier to combine all the years of data into a single chart. We can do this by removing the selection filter,
YEAR(1972), and using the “year of study” variable, YEAR, as the “Column variable.”
To simplify the chart, “recode” the “Row” variable by replacing that entry with:
PARTYID(r: "D" 0-1; "I" 2-4; "R" 5-6)
Doing this will consolidate the “Democrats,” “Independents,” and “Republicans” into separate categories.
This chart will also be clearer if you use a Line Chart under Chart Options.
What general trend(s), if any, can you identify in party identification over the 1972–2016 period? Based on
the readings and your own knowledge, what reason(s), if any, can you offer to explain these trend(s)?
4. Use one of the following variables as a control variable to investigate differences between individuals based on
• Gender: male or female, SEX.
• Race: black or white, RACE.
In 2016, what was the distribution of party identification for each category of your control variable? What
reason(s), if any, can you offer for why the control variable makes a difference in people’s party identification?
Hint: As in part 1, the “Selection Filter” should be set to YEAR(2016); you should use your control variable in
the “Control” box. The “Row” box should contain PARTYID, while the “Column” box should be empty.
I recommend choosing either a Bar Chart or Stacked Bar Chart here.
You should get two sets of results: one for each category of the control variable, and an overall set of results
including both categories (showing “All Valid Cases”); you do not need to report the overall set of results.
[2] The (0-6) part means that we will only include responses coded between 0 and 6, representing the “strong Democrats” through the “strong
Republicans,” excluding the “other” responses.
5. What is the trend in party identification over the 1972–2016 period for each group of the control variable
you were assigned in part 4? Can you offer any explanation for this trend, based on historical events or other
Hint: The column variable should be set to YEAR and the selection filter should be empty. As in part 3, it will
be easier to identify the trend if the row variable is recoded as PARTYID(r: "D" 0-1; "I" 2-4; "R" 5-6).
The chart will be clearer if you use a Line Chart in this case.
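For those curious about what the PARTYID recode and the YEAR column variable accomplish, here is a minimal Python sketch of the same logic; you do not need it to complete the practice assignment. It collapses codes 0-6 into “D,” “I,” and “R,” drops any other code (as the PARTYID(0-6) range does), and computes each group's percentage within each year. The function names are invented for illustration, and the records are made up; they are not GSS results.

```python
from collections import Counter, defaultdict


def recode_partyid(code):
    """Collapse the 0-6 party identification scale into D / I / R.

    Codes outside 0-6 return None so they can be dropped from the analysis.
    """
    if 0 <= code <= 1:
        return "D"
    if 2 <= code <= 4:
        return "I"
    if 5 <= code <= 6:
        return "R"
    return None


def party_shares_by_year(records):
    """records: (year, PARTYID code) per respondent.

    Returns {year: {group: percent of valid responses in that year}}.
    """
    counts = defaultdict(Counter)
    for year, code in records:
        group = recode_partyid(code)
        if group is not None:
            counts[year][group] += 1
    return {
        year: {group: round(100 * n / sum(by_group.values()), 1)
               for group, n in by_group.items()}
        for year, by_group in sorted(counts.items())
    }


# Made-up records for illustration only; code 7 is dropped as out of range.
records = [(1972, 0), (1972, 1), (1972, 3), (1972, 6), (1972, 7),
           (2016, 0), (2016, 2), (2016, 4), (2016, 5)]
for year, shares in party_shares_by_year(records).items():
    print(year, shares)
```

Reading the output across years for one group (for example, “D”) is the same comparison the Line Chart makes visually.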