How to Read Survey Data (and How Not To)

Through some of my classes here, I’ve become more interested in the math and science of surveys and polls. I know that some people don’t trust polls, but the more I learn about them, the more fascinating I find them. I’d like to know more about how they work and how people should use them. Experts, what can you tell me about polls and surveys? Why is it that some people don’t trust them? What sort of opportunities are there in terms of careers and research related to polls?

Polls and surveys are powerful tools that we can use to better understand what people think, see, believe, and do. But they are also often misunderstood: plenty of people say they don’t trust survey data or “the polls,” and point to what they see as high-profile failures in things like election polls.

The reality is a little more complicated, though, and much rosier for pollsters than the general public might assume. The central problem has to do with how people perceive the workings and purpose of polls and surveys (by the way, a distinction is sometimes made between a single “poll” question and a multi-question “survey,” but, for convenience, we’ll use the words interchangeably here). To many casual observers, it seems that certain conclusions follow naturally from polls: for instance, if one political candidate is winning a poll, observers may presume that the poll is claiming to predict the coming election.

But that’s not really true. Part of the key to understanding poll data is knowing how to read it–and knowing when you may safely draw a conclusion based on the results. Polls and surveys don’t tell us what to think: they just help us measure certain things. What we choose to do with the raw information that surveys give us is our own responsibility.

Let’s take a step back and talk about how polls and surveys are conducted, because this will help us see where we might stumble when we read their data.

The reason polls and surveys work is that statistics lets us measure how well a small group of opinions represents a larger and more complex picture. With a large enough and representative enough sample, we can be reasonably sure that our manageable pool of respondents reflects a much larger group that we could never poll in an affordable way. In short, polls and surveys work because you don’t have to ask everyone to know what everyone’s thinking: you just have to ask a subset of everyone and extrapolate from there.

That means that one of the trickiest and most important parts of designing a survey is determining who you will ask your questions to, and how. The experts involved in the biggest polling organizations on the planet work very carefully to make sure that they don’t skew their data by, for instance, relying only on landline phones to call potential respondents. They know that they need to design their survey in such a way that it results in a pool of respondents that truly represents the group they actually want to know about–whether that group is the citizens of a state or a country, customers of a certain store, or any other group.

The size and appropriateness of the sample is the key to an accurate survey. But no survey can be perfectly accurate, since the whole point is that we’re using a sample to estimate what the results would be if we could ask everyone. There is always a margin of error, and that’s very important to keep in mind.
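To make the link between sample size and margin of error concrete, here is a minimal sketch of the standard textbook approximation for a proportion. It assumes a simple random sample and a 95% confidence level (z = 1.96), and the function name is our own; real polling organizations apply weighting and other adjustments that this does not capture.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Assumes a simple random sample. proportion=0.5 is the worst case
    (widest margin), and z=1.96 corresponds to 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Larger samples give narrower margins, but with diminishing returns.
for n in (100, 400, 1000, 2500):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} points")
```

Notice that quadrupling the sample only cuts the margin in half, which is part of why many polls settle for around a thousand respondents.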

When all is said and done, a survey gives us raw data. It tells us what people answered, and it tells us how close, statistically speaking, we should assume the results are to what they would be if we could ask the entire group. The most powerful surveys allow for cross tabulation, enabling readers and researchers to break down results further and further by demographic groups and other categories. It’s a lot to consider!

Now that we know more about how polls and surveys work, we can start to see how easy it is to fall into pitfalls when we read the data.

For one, we must remember that we’re looking at a sample, and that there is a margin of error to account for here. We should pay attention to the methodology of the survey, and we should take into account the reputation and expertise of the group that ran it. Some observers use polling averages to smooth out the noise that any one poll might carry from its own quirks, mistakes, or simple bad luck.
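A polling average can be sketched in a few lines. The version below weights each poll by its sample size, which is only one of several possible approaches; the function name and the poll numbers are made up for illustration, and published averages often also adjust for things like recency and pollster track record.

```python
def polling_average(polls):
    """Sample-size-weighted average of one candidate's share.

    polls: list of (share_percent, sample_size) tuples.
    """
    total_respondents = sum(size for _, size in polls)
    if total_respondents <= 0:
        raise ValueError("polls must contain at least one respondent")
    weighted_sum = sum(share * size for share, size in polls)
    return weighted_sum / total_respondents

# Hypothetical polls: (candidate's share in percent, number of respondents)
example_polls = [(48.0, 800), (51.0, 1200), (46.0, 500)]

print(f"Weighted average: {polling_average(example_polls):.2f}%")
```

Averaging like this helps with random noise, but it cannot fix an error that all of the polls share, such as every pollster missing the same group of voters.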

Being able to break down results by demographic groups can yield fascinating insights, but we also must remember that this reduces the size of our sample. Pay careful attention to margins of error for sub-groups within polls, and don’t use a survey for purposes it wasn’t designed for!
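To see why sub-group results deserve extra caution, here is a small sketch that compares the margin of error for a full sample with the margins for its sub-groups. The respondent counts are hypothetical, and the formula is the same simple-random-sample approximation at 95% confidence, so treat the numbers as rough guides rather than what any real poll would report.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error, assuming a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Hypothetical respondent counts by age group (not from a real survey)
respondents = {"18-29": 150, "30-44": 250, "45-64": 350, "65+": 250}
full_sample = sum(respondents.values())

print(f"Full sample (n={full_sample}): "
      f"+/- {margin_of_error(full_sample) * 100:.1f} points")
for group, count in respondents.items():
    print(f"  {group} (n={count}): "
          f"+/- {margin_of_error(count) * 100:.1f} points")
```

In this made-up example, the smallest sub-group carries a margin more than twice as wide as the full sample's, so a gap that looks meaningful within that group may be nothing more than noise.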

And, above all, remember that surveys show us data–not hypotheses. If we look at an election poll that has one candidate ahead of the other, that tells us how popular that candidate is (among “likely voters,” in the case of most polls). But it doesn’t tell us what those people might think of the candidates a month from now. It doesn’t tell us how procedural rules like the Electoral College might affect things. And it doesn’t tell us if people are telling the truth about how likely they are to vote, or if they may become more or less likely to do so in the days before election day.

Understanding the complexities of polls can be very rewarding, and there are a lot of ways you can explore this subject more. The study of statistics is one of the most direct ways you can learn more about survey and polling methodology. You may also want to look for internship and job opportunities with major polling organizations. Political writers and other journalists deal with polls often, as do political operatives and campaign managers. And, of course, there are professionals who design polls, people who administer them, and experts who write reports and present the data to the public, private clients, and other parties. Which area of polling interests you most is something that only you can answer!