A Discussion on Survey Weights

A brief explanation of survey weighting

by William Schatten

There has been much discussion over the Forum Poll’s recent Toronto mayoral poll which showed Doug Ford closing the gap on John Tory. The discussion has focused on one subtable in the data release which shows that Doug Ford is only winning amongst 18-34 year old voters, while every other age category has John Tory in the lead.

To respond to this discussion, a brief examination of the statistical methods behind survey weighting is required.

What is a weight?

A weight is a value assigned to each respondent in a survey sample. This value represents how much a respondent's answers count toward the final survey results. Researchers assign these values based on how closely the sample matches the actual population. For actual population numbers, Forum uses Statistics Canada Census data.

Why do you use weights?

Weights are used in conjunction with random dialing to make a survey sample as representative as possible of the target population, in this case Toronto.

Example: Weighting by Gender

For example, let’s assume the city of Toronto has approximately a 50-50 split of males and females. (The actual figures for voters in Toronto are approximately 47% male and 53% female, based on the Census Metropolitan Area of Toronto.)

Let's also assume that a survey was completed and it has a randomly selected sample of 600 males and 400 females. 

Here there is a discrepancy between the sample and the population: the sample is 60% male and 40% female, while the population is split 50-50.

To solve this issue, researchers would assign all male respondents a value that decreases the weight of their responses, and all female respondents a value that increases the weight of theirs. These weights bring the sample's distribution in line with the population's.

Example Calculation

The weighting formula is: Population Distribution / Sample Distribution = Weight

In this case the equation is:

Male: 50% / 60% = 0.833

Female: 50% / 40% = 1.25

Therefore the assigned weight for all male respondents is 0.833 and for all females it is 1.25. 
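
The same calculation can be sketched in a few lines of Python. This is only an illustration of the formula above using the example's numbers; the function name `cell_weights` and the dictionary layout are assumptions made for the sketch, not Forum's actual tooling.

```python
# Hypothetical helper: weight = population share / sample share, per group.
def cell_weights(population_share, sample_counts):
    total = sum(sample_counts.values())
    return {
        group: population_share[group] / (count / total)
        for group, count in sample_counts.items()
    }

# Figures from the example: an assumed 50-50 population and a 600/400 sample.
population_share = {"male": 0.50, "female": 0.50}
sample_counts = {"male": 600, "female": 400}

weights = cell_weights(population_share, sample_counts)

# Multiplying each group's count by its weight gives the weighted sample.
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}

for group in sample_counts:
    print(f"{group}: weight {weights[group]:.3f}, "
          f"weighted count {weighted_counts[group]:.0f}")
```

Once the weights are applied, the 600 males and 400 females each count as roughly 500 respondents, which matches the assumed 50-50 population split.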

Why is Doug ahead if only 18-34 year old voters favour him?

A bi-variate subtable, such as the age table discussed above, shows the vote broken down by only one variable. The final popular vote numbers Forum publishes take into account multiple variables, including but not limited to: Age, Gender, Region, and vote likelihood (when applicable).

Therefore the final calculation considers not only age but also a series of other factors.

Building on our male and female example above, let’s see what happens when we add another factor into the mix.

Example: Adding another variable - Region

Continuing with our example above, we are now going to add region to our weighting. Let's assume the GTA is made up of approximately 750,000 people living in Old Toronto and 1,250,000 living in the suburbs to the West, North, and East. This population distribution equates to approximately 38% of eligible voters living in Old Toronto and 62% in the suburbs. 

Remember that males have a lower weight than females already based on the gender data. Now let’s assume that of the 600 males surveyed in our example above, 45% were in Old Toronto and 55% were in the suburbs. 

The regional weight calculation therefore is:

Males in Old Toronto: 38% / 45% = 0.844

Males in the Suburbs: 62% / 55% = 1.13

The males who are in Old Toronto will again receive a slight reduction to their overall weight, while those who live in the suburbs will receive a slight increase. Again, the goal of the increase or decrease is to make the sample proportionate to actual Census data, so that the sample is more reliable and representative of the public as a whole.

The process continues on in this manner until all applicable variables have been considered. 
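
As a rough sketch of how two factors stack up, the Python below combines the gender and regional weights for the male respondents in the example. It assumes the factors are simply multiplied together; the article does not spell out how Forum combines them, and production weighting often uses iterative procedures such as raking. The variable names are hypothetical.

```python
# Figures from the worked example. Multiplying the two factors together
# is an assumption made for illustration only.
gender_weight_male = 0.50 / 0.60  # population share / sample share

region_population = {"Old Toronto": 0.38, "Suburbs": 0.62}
region_sample_males = {"Old Toronto": 0.45, "Suburbs": 0.55}

# Regional weight for males: population share / sample share, per region.
region_weight = {
    region: region_population[region] / region_sample_males[region]
    for region in region_population
}

# Combined weight: the gender factor adjusted by the regional factor.
combined_weight = {
    region: gender_weight_male * region_weight[region]
    for region in region_weight
}

for region in region_population:
    print(f"Males, {region}: regional weight {region_weight[region]:.3f}, "
          f"combined weight {combined_weight[region]:.3f}")
```

Under these assumptions, a male respondent in Old Toronto ends up with a combined weight of roughly 0.70, while a male respondent in the suburbs ends up at roughly 0.94.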

Why bother showing demographic subtables if they can't be used for vote prediction?

The subtables in our data releases have value, but that value does not lie in calculating the final vote from the information in one table alone.

What the age table reveals to us is that overall - without considering any other factors - Doug Ford is the most popular amongst young voters aged 18-34. Tory is most popular amongst every other age category. This is the only conclusion we can draw from this table. We cannot project a final vote from this data, and if we did it would not be representative of Toronto.


William Schatten is a Research Director at Forum Research. He can be reached at WSchatten@ForumResearch.com.