Inductive Generalizations

Thaddeus Robinson

Common Inductive Arguments

14 Inductive Generalizations

Section 1: Introduction

The ability to reason using generalizations is one of our most basic rational functions. We generalize all the time, and once we believe a generalization we readily apply it to new cases. Put otherwise, we have a natural capacity to reason from experience to a generalization and once we believe some generalization is true, we are able to easily put this knowledge to use. We never suspect, for example, that the elderly woman walking behind us down the sidewalk intends to rob us. We never suspect this because we have the general belief that few elderly women are street thieves. Reasoning to and from generalizations is largely an inductive process, and in this chapter we will focus on the practice of reasoning to a generalization from particular instances. We will discuss what makes these arguments distinctive, identify some crucial questions for determining logical strength, and will end by considering a common bias that influences the process of generalization.

Section 2: Generalizations

Generalizations are common in our reasoning. In order to think about them clearly, let’s get some examples on the table (starting with one of the examples from the introduction):

Ex. 1: Few elderly women are street thieves.

Ex. 2: Most science textbooks cost more than $150.

Ex. 3: 77% of felony criminals are repeat offenders.

Ex. 4: All birds are warm-blooded.

You can probably see a pattern here. In each of these examples, we are being told something about the members of a group. Among the group of elderly women there are few street thieves, and among the group of science textbooks most cost more than $150, and so on. Accordingly, we will say that a generalization is a claim about how many members of a particular group have some property, feature, or characteristic.

There are three notable parts to any generalization. First, there is a quantity term (‘few’, ‘most’, ‘77%’, and ‘all’). As you can see from the examples above, the quantity term can be more or less precise. Second, there is the subject term. In a generalization, the subject term refers to the group we are talking about a quantity of (‘elderly women’, ‘science textbooks’, ‘felony criminals’, and ‘birds’ respectively). This group, the group referred to by the subject term, is called the subject class. Third, there is the predicate term. The predicate term refers to the property, feature, or characteristic had by the members referred to by the subject term (‘street thieves’, ‘cost more than $150’, ‘repeat offenders’, and ‘warm-blooded’ respectively). We will refer to the group of all things that have the property, feature, or characteristic identified by the predicate as the predicate class. Using these terms, we can understand Ex. 4 as saying that every individual that is a member of the subject class ‘birds’ is also a member of the predicate class ‘warm-blooded things’. This will probably strike you as a somewhat stilted and unnatural way to read this simple sentence. It is, but putting things this way will help us understand more clearly what is going on in generalizations moving forward.

Generalizations that make a claim about all or none of the members of the subject class are called universal generalizations (see Ex. 4 above). For the time being we will put aside these generalizations to focus on generalizations that claim something about most or few of the subject class. Such generalizations are called statistical generalizations. We are not always clear about whether we intend a universal or a statistical generalization, and in many cases people leave the quantity term unstated. Consider the following:

Ex. 5: Dogs like to chew on bones.

The quantity term is unstated, and so this generalization is ambiguous. While it isn’t always important to identify precisely what an author or speaker intends, sometimes it is. In these cases, normally it is charitable to interpret ambiguous generalizations as expressing statistical generalizations. The reason for this is that it only takes one counter-example to show that a universal generalization is false. That is, if we interpret Ex. 5 as a universal generalization all we need to do is find one dog that doesn’t like bones. A statistical generalization claims only that most dogs like to chew on bones and so will not be falsified by a single counter-example.

How do we use generalizations in our reasoning? Perhaps the most important way we use generalizations is to make predictions about particular people, objects, or events. To return to a previous example, given the generalization that few elderly women are street thieves as a premise, we can apply this to a particular case and conclude that the elderly woman walking down the street behind us is probably not going to try to rob us. We not only use generalizations as premises, however; often we reason to a generalization in a conclusion. Put otherwise, sometimes a generalization is the result of our reasoning, not a starting point for it. In what follows we will separate our discussion of generalizations into these two contexts—reasoning from a generalization (Chapter 15) and reasoning to a generalization (this chapter).

Section 3: Inductive Generalizations

Let’s begin by looking at a couple of examples of reasoning to a generalization. Here is an example that might appear in everyday conversation:

Ex. 6:

Imani says, “The textbook prices for science classes are out of control! Most science textbooks cost more than $150! I mean over the last two years I’ve bought 6 science textbooks, and they have all cost more than $150.”

Imani is drawing two conclusions here. First, on the basis of her experience buying science textbooks, she is concluding that most science textbooks cost more than $150. Second, she is using the generalization about textbook prices to argue that textbook prices are out of control. Let’s zero in on the first argument because this is where she reasons to a generalization.

She argues:

I’ve bought 6 science textbooks and each one cost more than $150.
So, most science textbooks cost more than $150.

Here is another example:

Ex. 7:

Carly: I’m thinking about trading in my laptop for a new Copperbook.

Jason: Why would you do that? I had a Copperbook before, and I was always having problems with it. If you ask me, Copperbooks are junk!

Jason is reasoning to the conclusion that Copperbooks are junk, on the basis of his experience owning a Copperbook. Standardized, Jason is arguing:

I had a Copperbook before, and was always having problems with it.
So, Copperbooks are junk!

When these examples are standardized, it is not difficult to see that both arguments have a significant gap. Both arguments start with an experiential statement: Imani’s experiences buying textbooks and Jason’s experience with his Copperbook. Both arguments then jump to a much broader generalized conclusion. Imani draws a conclusion about most science textbooks, and Jason concludes that most Copperbooks are junk. Given the Golden Rule of Argument Interpretation, we will take it for granted that each speaker is assuming a connection between the stated premise and conclusion. What is it likely to be?

A stack of Biology books — “Biology and Neuroscience books” by brewbooks CC BY-SA 2.0

Let’s focus for a moment on Ex. 6. Despite the fact that Imani is drawing a conclusion about most science textbooks, Imani has not bought, or even seen, most science textbooks; after all, there are hundreds of science textbooks on the market, and she has experience shopping for only a few of these. In this argument, she is applying what she knows about the textbooks she’s bought, to textbooks outside of her experience, and this only makes sense if she is assuming that the science textbooks she doesn’t know about are just like the ones she does know about. We can capture this idea using the term ‘representative’:

I’ve bought 6 science textbooks and each one cost more than $150.
The books I’ve bought are representative of most science textbooks with regard to price. (MP)
So, most science textbooks cost more than $150.

The deep analysis of this argument shows us how the premise connects to the conclusion, namely because Imani takes her experiences to be like or representative of science textbooks more generally. More broadly speaking, in arguments of this form a conclusion is drawn about a group on the basis of observations or experiences of only a part of that group. Arguments with this form are known as inductive generalizations.

In thinking about inductive generalizations, it will be helpful to add two more terms to our vocabulary: sample and population. In an inductive generalization, the sample is the group of experiences or observations that serve as premises, whereas the population is the group the conclusion generalizes about. In different terms, the population is the subject class of the concluding generalization. Thus, Jason’s sample is the one Copperbook within his experience, and the population is Copperbooks in general. As a brief aside, the term ‘population’ is often used to refer only to groups of people, as in “the population of Cincinnati is about 300,000.” However, we are using this term much more broadly to cover anything that is the subject of a generalization. Given these terms, we can define inductive generalizations more specifically:

An inductive generalization concludes that the population has some characteristic because the sample has that characteristic.

Inductive generalizations are a particularly well-known and well-studied inductive form, and this is, in part, due to the important role they play in public discourse. After all, polls and surveys are inductive generalizations. Consider the following:

Ex. 8:

According to a Muhlenberg College/Morning Call poll taken in early fall of 2016, Hillary Clinton was leading Donald Trump in the presidential race 40% to 32% among likely Pennsylvania voters.^[1]

This makes a claim about a very large population of people—namely likely Pennsylvania voters. How did the pollsters behind the Muhlenberg College/Morning Call poll figure this out, that is, what was their evidence for this conclusion? They did not go out and ask every likely Pennsylvania voter (doing so would cost too much and take too long); rather they took a poll of a small group of likely Pennsylvania voters (as it turns out—405 people) and generalized upon those results to make a claim about Pennsylvania voters as a whole. We can represent the reasoning here as follows:

In early fall of 2016, 40% of the 405 polled likely Pennsylvania voters planned on voting for Clinton while 32% planned on voting for Trump.
The sample is representative of the population with respect to voting plans. (MP)
So, in early fall of 2016, close to 40% of all likely Pennsylvania voters planned on voting for Clinton while close to 32% planned on voting for Trump.

Ultimately, in an inductive generalization we extend or extrapolate upon our experience to draw a conclusion about a whole class of unexperienced objects or events. In doing so, we claim that our experiences of some class match or are representative of that class more broadly. Our goal, when it comes to inductive generalizations, is to distinguish the logically strong ones from the logically weak ones. Let’s take up this task.

Section 4: Representative Samples and Logical Strength

Our deep analysis above shows that the core element of inductive generalizations is the claim that the sample matches, reflects, or is otherwise representative of the population. This means that the most important part of evaluating an inductive generalization for logical strength is to determine whether the sample really is representative. Before we walk through the key steps in this process, we need to take note of the fact that a sample can be more or less representative of the population with respect to some characteristic. A sample can be perfectly representative, in which case the sample is exactly like the population, or it can be less than perfectly representative, in which case it approximates the population to a certain degree. In a logically strong inductive generalization, the sample approximates the population to a significant degree. Put this way, this is a pretty vague requirement, but in most everyday contexts we won’t be able to be any more specific. When it comes to professional surveys and polls, however, a number of techniques allow pollsters to be very precise. The Muhlenberg College/Morning Call poll above, for example, gives a margin of error of 5.5 percent. The margin of error in a survey or poll is a measure of how representative the sample is of the population with respect to the characteristic in question. In giving the margin of error in the poll above, the authors are saying that in early fall of 2016, somewhere between 34.5% and 45.5% of all likely Pennsylvania voters planned on voting for Clinton (40% plus or minus 5.5%), and somewhere between 26.5% and 37.5% planned on voting for Trump (32% plus or minus 5.5%). Overall, when we are evaluating inductive generalizations for logical strength, we are trying to determine whether the sample approximates the population to a significant degree.
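The plus-or-minus arithmetic in the poll example can be written out as a short Python sketch. The function name `interval` is an illustrative choice, not something taken from the poll itself; the numbers are the ones reported above.

```python
def interval(estimate_pct, margin_pct):
    """Return (low, high) for an estimate plus or minus a margin of error.

    Both arguments are in percentage points.
    """
    return (estimate_pct - margin_pct, estimate_pct + margin_pct)


MARGIN = 5.5  # margin of error reported for the poll, in percentage points

clinton = interval(40, MARGIN)  # 40% plus or minus 5.5%
trump = interval(32, MARGIN)    # 32% plus or minus 5.5%

print("Clinton:", clinton)
print("Trump:", trump)
```

Running this reproduces the two ranges given in the text: 34.5% to 45.5% for Clinton and 26.5% to 37.5% for Trump.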

Section 5: Is the Sample Representative?

So how can we know that a sample approximates the population to a significant degree? The answer is twofold. First, the sample must be sufficiently large. In order for a sample to be representative it must be large enough to capture the diversity of the population. To illustrate, return to Ex. 7 above. Jason is drawing a conclusion about most Copperbooks, namely that they are junk, on the basis of one Copperbook. Clearly this is a weak argument. The fact that one Copperbook laptop is unreliable is no more a guide to the reliability of Copperbooks overall than the fact that one human is 8 feet tall is a guide to the height of humans more generally. Why? Just as one finds a lot of variability in the height of humans, so too one would expect to find a lot of variability in the reliability of a particular brand of laptop, and a sample that is not large enough to reflect this variability cannot be a representative sample. To generalize on a sample that is too small is to draw a premature conclusion that is not justified by the evidence. Let us call a failure with respect to sample size a hasty generalization. More specifically, we will say that:

A person gives a hasty generalization when they generalize on the basis of a sample that is too small to capture the diversity of the population.

Thus, Jason has made a hasty generalization in drawing his conclusion that Copperbooks are junk. Again, his sample size is way too small to capture the diversity of Copperbooks with respect to reliability.

Sample size itself does not guarantee that a sample is representative. After all, a sample might be large enough to capture the diversity of the population without actually doing so. This brings us to the second key issue when it comes to evaluating inductive generalizations: is the sample sufficiently diverse? A sample is not sufficiently diverse if it disproportionately includes observations or experiences that will likely bias the sample. If we wanted to know, for example, what percentage of American baseball fans favor the Yankees, it would be a bad idea to get our sample by polling people at the entrance to a Yankees home game! After all, we would get a disproportionate number of Yankees fans in our sample. Rather, we would want to collect a sample of baseball fans from all over the country. An inductive generalization whose sample is not sufficiently diverse is called a biased generalization. More specifically, we will say that:

A person gives a biased generalization when they generalize on the basis of a sample that does not adequately reflect the diversity of the population.

As is probably clear, sample size and diversity are related. A sample cannot be diverse if it is too small. But, again, just because a sample is large does not mean that it is sufficiently diverse. One of the most famous mistakes in polling history illustrates just this fact. The candidates in the presidential election of 1936 were Franklin Roosevelt and Alf Landon. As part of an effort to predict the outcome of the election, a magazine called Literary Digest sent out 10 million questionnaires and received 2.5 million back. The returned questionnaires indicated that Landon would win 56% to 44%. In reality, the result was the reverse: Roosevelt beat Landon 62% to 38%. The sample size in this case was sufficiently large; in fact, it was massive. The problem is that the addresses to which the questionnaires were sent were taken from phone books and various club membership lists, and in 1936 many people, particularly the poor, did not have phones. Similarly, these same people tended not to show up on club membership lists. The sample did not reflect the views of this sub-group of the American population and, most importantly, this sub-group voted overwhelmingly for Roosevelt. Thus, although the sample was large enough, its diversity was compromised, and the resulting generalization was biased.
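The point that a large sample can still mislead can be illustrated with a small simulation. The following Python sketch uses an entirely made-up electorate: the group sizes and levels of support are assumptions chosen for illustration, not historical data, and the names `build_population`, `share_for_roosevelt`, and `compare_samples` are hypothetical.

```python
import random


def build_population():
    """Build a made-up electorate as (listed, supports_roosevelt) pairs.

    Assumed figures: 40,000 voters appear on phone or club lists and
    40% of them support Roosevelt; 60,000 voters do not appear on any
    list and 75% of them support Roosevelt.
    """
    listed = [(True, i < 16_000) for i in range(40_000)]
    unlisted = [(False, i < 45_000) for i in range(60_000)]
    return listed + unlisted


def share_for_roosevelt(voters):
    """Fraction of the given voters who support Roosevelt."""
    return sum(1 for _, supports in voters if supports) / len(voters)


def compare_samples(seed=0):
    """Return (true share, biased-sample share, random-sample share)."""
    rng = random.Random(seed)
    population = build_population()
    listed_only = [voter for voter in population if voter[0]]

    # Large sample, but drawn only from voters who appear on a list.
    biased = rng.sample(listed_only, 20_000)
    # Much smaller sample, but every voter is equally likely to be chosen.
    randomized = rng.sample(population, 1_000)

    return (
        share_for_roosevelt(population),
        share_for_roosevelt(biased),
        share_for_roosevelt(randomized),
    )


true_share, biased_share, random_share = compare_samples()
print(f"whole population:           {true_share:.3f}")
print(f"biased sample of 20,000:    {biased_share:.3f}")
print(f"random sample of 1,000:     {random_share:.3f}")
```

Under these assumptions the large sample lands near the support level of the listed voters alone, while the much smaller random sample lands close to the true figure for the whole population. The sketch is only meant to show the pattern; it does not model how the Literary Digest poll was actually conducted.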

Consider another illustration of the relationship between sample size and diversity: a population with little to no variability or diversity with regard to some characteristic. Here is an example:

Ex. 9:

A chef is making soup as part of the evening’s menu. As she finishes, she takes a quick taste of the soup, and decides the soup needs a little salt.

It might be surprising, but this is an inductive generalization. Is it a good one? On its face, the answer seems to be no. The chef took one taste and drew a conclusion about the soup as a whole, and so we might think this argument is as weak as Jason’s in Ex. 7. But remember, we insist on large sample sizes primarily to ensure diversity. Soup is something that is largely homogeneous: one spoonful of soup tends to be just the same as any other. There just isn’t much difference between one spoonful of soup and another with regard to taste. Because there isn’t much diversity among spoonfuls, a very small sample size (maybe even one) can suffice. There are many examples of populations that are mostly or entirely homogeneous with respect to some characteristic or another: most fish have gills, most cars have radios, and so forth. Overall, then, when we have reason to believe that some characteristic is not likely to vary much across a population, a small sample size can be sufficient to ensure logical strength.
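The contrast between the chef’s spoonful and Jason’s laptop can also be sketched numerically. In the following Python example both populations are invented: “saltiness” and “reliability” are scored on an arbitrary scale, and the amount of spread given to each is an assumption made purely for illustration. The function name `typical_error_of_one_observation` is likewise hypothetical.

```python
import random
import statistics


def typical_error_of_one_observation(population, trials=2_000, seed=0):
    """Average distance between a single random observation and the
    population mean, estimated over many repeated single draws."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = [abs(rng.choice(population) - true_mean) for _ in range(trials)]
    return statistics.fmean(errors)


rng = random.Random(1)

# Assumed: spoonfuls of soup barely differ from one another.
soup = [rng.gauss(5.0, 0.1) for _ in range(10_000)]
# Assumed: individual laptops of one brand differ a great deal.
laptops = [rng.gauss(5.0, 3.0) for _ in range(10_000)]

soup_error = typical_error_of_one_observation(soup)
laptop_error = typical_error_of_one_observation(laptops)

print(f"one spoonful is typically off by {soup_error:.2f}")
print(f"one laptop is typically off by   {laptop_error:.2f}")
```

With these made-up numbers a single spoonful is a far better guide to the whole pot than a single laptop is to the whole brand, which is the sense in which a sample of one can suffice for a homogeneous population.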

Bowl of coconut pumpkin soup — “coconut pumpkin soup” by stu_spivack CC BY-SA 2.0

As a final note, when it comes to scientific polls and surveys, it is important to be aware that there are sophisticated mathematical and statistical methods pollsters can use to find out how large a sample must be. Similarly, there are a variety of sampling techniques one can call upon to try to ensure a diverse sample. A discussion of these methods and techniques is outside the scope of this introduction. Nevertheless, there are a few things a non-specialist should keep in mind when it comes to polls and surveys in particular. First, finding out how big a sample was and how it was collected will often give you enough information to identify suspect polls and surveys. Second, statisticians have found that a sample of around 1000 people, if properly selected, can accurately represent the views of millions of people. For example, according to the National Council on Public Polls, the average result of the 19 major nationwide polls collected right before the 2008 presidential election varied less than 1% from the actual results of the election.^[2] While this is an amazing level of precision, it is important to point out that national polls are not always this accurate (as the U.S. presidential election of 2016 showed). Third, whether a sample is biased or not is largely a matter of how the sample was collected. The single best way to prevent bias from creeping into your sample is to take a random sample. A sample is random when every member of the population has an equal chance of being chosen as a member of the sample. That a sample is random does not guarantee that it is unbiased, but it greatly reduces the likelihood of bias. The Muhlenberg College/Morning Call poll, for example, was drawn from a random sample of telephone numbers (conventional and cellular).
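Although the statistical methods themselves are beyond our scope, one standard textbook approximation helps show why a sample of around 1000 can be enough. The Python sketch below assumes a simple random sample, a 95% confidence level (the multiplier 1.96), and the most cautious case in which opinion is split 50/50. These are assumptions of the sketch, not a description of how any particular pollster computes its figures.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error (as a proportion) for a simple
    random sample of size n.

    p is the assumed share holding the view (0.5 is the most cautious
    choice) and z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


for n in (100, 405, 1_000, 10_000):
    print(f"n = {n:>6}: about +/- {100 * margin_of_error(n):.1f} points")
```

Two things stand out. The size of the population does not appear in the formula at all, which holds as long as the population is much larger than the sample. And a sample of 1000 already gives a margin of roughly 3 percentage points, while going all the way to 10,000 only brings it down to about 1 point. A published margin of error may differ from this idealized figure, for instance when a pollster adjusts for how the sample was weighted.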

Where does this leave us? The crucial question to answer for any inductive generalization is whether the sample is representative of the population. We have identified two steps to answering this question. First, we determine whether the observations or experiences in question constitute a large enough sample. Second, we determine whether the observations or experiences in question constitute a diverse enough sample.

Two Questions to Ask of Inductive Generalizations:

Is the sample large enough? (Failure: Hasty Generalization)
Is the sample diverse enough? (Failure: Biased Generalization)

Section 6: The Psychology of Generalization

In everyday contexts we rarely have access to carefully and rigorously collected information. More commonly when we generalize we merely consult our experience. Think about the example at the beginning of the chapter: few elderly women are street thieves. Why do you believe this generalization? Presumably it is not because you have examined crime statistics; rather you likely believe this because you can’t recall ever having heard of a mugging perpetrated by an elderly woman. That is, you have likely never experienced this nor even heard of a case. Because this is an unheard-of event, you conclude that this is probably pretty unusual overall.

In general, using this procedure to judge the frequency of events is very common; in many everyday contexts we use remembered experiences as a guide for drawing and assessing generalizations. Of course we might be wrong. After all, as we have seen, it may be that our experiences are simply not representative. But there is a deeper potential problem—it may be that our memory of our own experiences is not representative, even if our actual experiences are!

In the 1970s, psychologists Amos Tversky and Daniel Kahneman used a number of simple experiments to show that when we consult our experiences in this way, we unreflectively tend toward one strategy in particular. When we appeal to this strategy, which they called the Availability Heuristic, we use the ease with which instances of some event come to mind, or are available, as an indicator of the frequency of the event. On one hand this might not sound like much of a discovery. After all, it is probably difficult to think of a mugging perpetrated by an elderly woman because this simply doesn’t happen very often! Tversky and Kahneman were well aware of this, and observed that “Availability is a useful clue for assessing frequency or probability, because instances of large classes are usually recalled better and faster than instances of less frequent classes.”^[3] Despite its usefulness, psychologists have found that this heuristic lies at the root of a number of different kinds of mistakes.

In general, the problem with the availability heuristic is that some events come to mind more easily than others, but not because we have experienced them more often. As it turns out, there are a number of factors which tend to make some events stick out in our minds more easily than others. Events that are especially vivid, interesting, or surprising are easier to recall; so too are recent events and events that have some kind of personal significance. Because of these kinds of factors, our judgments of the frequency of particular kinds of events will be skewed, and so, therefore, will the generalizations we make from them.

For example, consider the prevalence of heavy drinking among college students. Students who are themselves heavy drinkers tend to overestimate the number of other students at their institution who are heavy drinkers, and the availability heuristic explains why. Because drinking tends to be a social event, it wouldn’t be surprising to find out that it is much easier for a heavy drinker to bring to mind other heavy drinkers, than it is to bring to mind non-drinkers. Another interesting illustration of the availability heuristic can be found in people’s (mis)judgments of risk. People tend to think that dying in an accident is much more likely than dying by stroke, when, in fact, the opposite is the case. So too people mistakenly tend to think that dying by tornado is more likely than dying by asthma attack.^[4] Given the availability heuristic it is not difficult to understand these kinds of mistakes. Accidents and tornados are highly significant events to which people (and the media!) tend to pay a lot of attention. The same cannot be said for strokes or asthma attacks.

One final example of the availability heuristic is found in the judgments that people living together make about each other’s contributions to household chores and other shared responsibilities. In general, people tend to overestimate their own contribution and underestimate the contribution of others. This is actually an example of a much broader phenomenon. Overestimation of one’s contribution is common in cases where two or more people are working together towards a shared goal.^[5] Again, the availability heuristic explains why. We tend to remember our own contributions to the group’s work much more easily than the contributions of others. Given the availability heuristic, it seems to us as if our contributions were more frequent than others’.

Again, this is not to say that we should not make use of this strategy. In fact, we can’t help but make use of it, and it gives us accurate frequency judgments at least some of the time. Rather, the lesson to take from these observations about the psychology of generalization is that we should pause to second-guess important generalizations that are based on intuitive frequency judgments.

Exercises

Exercise Set 14A:

Directions: For each of the following first determine whether it is a generalization or not. If it is a generalization, say whether it is a universal or a statistical generalization, and then identify the quantity term or likely quantity term (if the claim is ambiguous), the subject class, and the predicate class.

#1:

Most new video games cost more than $50.

#2:

No prisoner has ever succeeded in breaking out of this prison.

#3:

The diamond was stolen either by the student or by the lawyer.

#4:

For 10 percent of the time U.S. drivers are behind the wheel, their eyes are off the road due to eating, reaching for the phone or texting. ^[6]

#5:

Cattle are frightened by shiny objects.

#6:

Almost everybody who started the race finished it.

Exercise Set 14B:

Directions: Determine whether each of the following is an inductive generalization or not. If so, identify the sample and the population and then comment on the argument’s logical strength. Is the argument likely to be logically strong or not, and why?

#1:

Almost everybody on the soccer team is closely following the World Cup tournament, so I suspect that most Americans are following it too.

#2:

Your friends keep telling you about this hilarious show and they have watched several seasons of it. So, you sit down and watch an episode. You don’t think it was very funny, and conclude “I don’t know what my friends were thinking, that show isn’t very funny.”

#3:

You dip your foot into the river, and conclude that the water is too cold for swimming.

#4:

Most species of tomato love hot weather, so these cherry tomatoes are probably doing pretty well.

#5:

A nationwide poll of a random sample of thousands of college graduates revealed that 75 percent of them are in favor of the proposed debt forgiveness program for student loans. Therefore, roughly 75 percent of the adult population is in favor of this program.

#6:

I don’t think that true, selfless love really exists, since I’ve never experienced it or seen it in other people.

#7:

It is unfair that you won’t let me dance in the recital because of my injury. You let Peter dance last year, and my situation is no different from his.

#8:

A major internet news website asks users to contribute to a daily poll. Today’s question is: when you buy gas do you normally buy the lowest octane fuel available at the station or do you buy higher octane gas? About 2500 people responded, and according to the results, only 20 percent of these people normally buy the lowest octane fuel. Commenting on this poll, a writer for the website writes a piece entitled: “Why about 80% of Americans buy higher octane fuel”.

Exercise Set 14C:

#1:

We generalize on or extrapolate from our experience all the time. Give three examples of generalizations you have made or heard.

#2:

Pick one of the examples you just gave and evaluate it for logical strength.

#3:

Other than those mentioned above, what kinds of events do you think people overestimate due to the availability heuristic?

[1] Olson, L. (2016, Sept. 17). New Morning Call/Muhlenberg College poll shows Clinton ahead in Pennsylvania. Morning Call. https://www.mcall.com/news/local/mc-pa-trump-clinton-poll-20160917-story.html
[2] National Council on Public Polls. (2008, Dec. 18). NCPP Analysis of Final Presidential Pre-Election Polls, 2008. http://www.ncpp.org/files/NCPP_2008_analysis_of_election_polls_121808%20pdf_0.pdf
[3] Tversky, A., & Kahneman, D. (1974). “Judgment under Uncertainty: Heuristics and Biases.” Science, 185(4157), 1127.
[4] Lichtenstein, S., Slovic, P., Fischhoff, B., Layman, M., & Combs, B. (1978). “Judged frequency of lethal events.” Journal of Experimental Psychology: Human Learning and Memory, 4, 551–578.
[5] Ross, M., & Sicoly, F. (1979). “Egocentric Biases in Availability and Attribution.” Journal of Personality and Social Psychology, 37, 322–336.
[6] US Drivers Take Eyes Off the Road 10 Percent of the Time. (2014, Jan. 1). UPI.com. http://www.upi.com/Health_News/2014/01/01/US-drivers-take-eyes-off-the-road-10-percent-of-the-time/UPI-23811388634974/

License

Creative Commons Attribution-NonCommercial 4.0 International License