| Professor of Sociology |
| Chapter 1 | Introduction |
| Chapter 2 | The Nature of Science |
| Chapter 3 | Elements of Research Design |
| Chapter 4 | Measurement |
| Chapter 5 | Sampling |
| Chapter 6 | Experimentation |
| Chapter 7 | Experimental Designs |
| Chapter 8 | Survey Research |
| Chapter 9 | Survey Instrumentation |
| Chapter 10 | Field Research |
| Chapter 11 | Research Using Available Data |
| Chapter 12 | Multiple Methods |
| Chapter 13 | Evaluation Research |
| Chapter 14 | Data Processing and Elementary Data Analysis |
| Chapter 15 | Multivariate Analysis |
| Chapter 16 | Research Ethics |
Introduction
1. People may be consumers of research evidence as professionals reading research articles and reports, or as lay persons learning about research indirectly through the media. In either case, it is important to be able to understand and evaluate research evidence and to know the limits of social scientific knowledge. Only then can one be alert to the methodological errors and misinterpretations of data that all too often occur.
2. Experimenters isolate and systematically manipulate
some feature of the environment and then observe whether systematic
changes occur in subjects’ behavior. Survey researchers administer
questionnaires and interviews to carefully selected and relatively
large groups of individuals. Field researchers immerse themselves in
a naturally-occurring situation in order to gain firsthand information
and often to understand the world as their subjects see it. Researchers
using available data collect and analyze information that has been produced
for purposes other than those for which the researcher uses it.
The Nature of Science
1. The ultimate aim of science is to understand and explain some aspect of the empirical world.
2. The world exists. There is an order to the world. We can know the world through our senses. Empirically verifiable knowledge is possible. Events are causally related.
3. Examples of scientific questions: (a) Why is capital punishment legal in some states but not in others? (b) How are attitudes toward abortion related to religious affiliation? (c) How are levels of intelligence related to social class? Examples of nonscientific questions: (a) Is capital punishment morally wrong? (b) Should abortion be legalized? (c) Should intelligence be valued over physical prowess?
4. (1) One word, one concept; (2) scientists must agree on ways of tying concepts to tangible objects and events; (3) concepts should be judged by their usefulness.
5. The twin objectives of scientific knowledge
are to explain the past and present and to predict the future. Scientific
explanation and prediction are alike in that both have the same logically
deductive pattern: the event to be explained or predicted is a conclusion
that logically follows from a set of statements, one of which is a general
empirical proposition. For example, objects denser than air fall when
dropped. This object is denser than air. Therefore, it fell (explanation)—or
it will fall (prediction)—when dropped.
6. Hypotheses, empirical generalizations, and
laws are statements linking two or more observable events. Hypotheses
are predictions, often deduced from a larger theory, that have not been
tested. Empirical generalizations are inferred from observations. Laws
have been verified repeatedly and are widely accepted. Theories are
sets of abstract statements which have the same form as, but are more
general than, hypotheses and laws.
7. One theory may be judged superior to another
to the extent that it (1) involves the fewest number of statements and
assumptions, (2) explains the broadest range of phenomena, and (3) makes
more accurate predictions.
8. Describing the causal process that connects
events provides a sense of understanding, which is the goal of scientific
research. Moreover, unless a hypothesis or empirical generalization
conveys a causal relationship, it cannot both explain and predict.
9. One such generalization comes from sociobiological
explanations of altruism: Evolution through natural selection has promoted
altruism toward one’s kin. That is, parents who put their children’s
welfare ahead of their own will be more likely to have their genes passed
on to future generations than parents who disregard their children’s
welfare. This generalization claims that altruism exists because of
its survival function. But, of course, the same can be said of any existing
characteristic of humans.
10. Scientific knowledge is considered tentative
for two related reasons. First, complete understanding is not possible
in that every answer inevitably leads to new questions, and every new
fact, law, or theory presents new problems. Second, science bases the
validity of its statements on observable evidence, which is always open
to change through reinterpretation or to possible contradiction by new
evidence.
11. The scientific process cycles between observation
and theory; where one begins in the cycle is arbitrary. At one point,
scientists observe and record facts; then they formulate theories to
describe and explain what they see; then they make predictions on the
basis of their theories, which they check against observations, and
so on. Durkheim began with observations in the form of suicide statistics
from various European nations. He developed a sociological theory, based
on the notion of social integration, to account for certain patterns
in these data. Then he further showed how the theory could explain other
variations in suicide rates.
12. The primary distinction between deductive
and inductive reasoning concerns the strength or certainty with which
one can claim that the conclusion follows from the evidence. In deductive
reasoning, the conclusion must be true if all the evidence is true,
whereas in inductive reasoning, the conclusion is probably but not necessarily
true if the evidence is true.
13. Scientists reason inductively when moving
from specific observations to empirical generalizations and theories;
for example, Durkheim inferred that Protestants were more likely to
commit suicide than Catholics based on the fact that predominantly
Protestant nations had higher suicide rates than predominantly Catholic
nations; he also reasoned inductively when he came up with his theory
of egoistic suicide to explain variations in suicide rates that were
a function of religion, marital and family status, and political crises.
Scientists reason deductively when they proceed from general principles
to specific observations or facts, as when they show how specific facts
follow logically from a hypothesis or theory.
14. (1) Empiricism, a way of knowing the world
which relies directly on what we experience through our senses; (2)
objectivity, the making of observations in such a way as to permit agreement
among scientists on the results of the observation; (3) control, the
use of procedures and techniques that eliminate or minimize factors
that might confound the interpretation of one’s observations.
15. Indirect observation involves the use of
instruments, such as a thermometer, scale, or questionnaire, which aid
and extend the scientist’s ability to observe.
16. Because one’s observation (i.e.,
interpretation) of the world is affected by past and present experiences,
including one’s culture and language (factors of which one
is not always conscious), it is rarely, if ever, possible to be completely
free of bias.
17. Objectivity, in science, means that observations
are made and recorded under conditions that allow two or more independent
scientists to agree on what has been observed. This is sometimes called
intersubjective testability.
18. The public dissemination of scientific
knowledge contributes to its objectivity by allowing other researchers,
with their own peculiar biases, independently to evaluate and verify
research evidence.
19. Scientific controls rule out factors, such
as personal bias and error, that might confound the interpretation of
research findings.
20. The depiction of social science as science
is idealized in that theoretical knowledge is not well developed, with
far fewer well-established generalizations and less accurate predictions
than are found in the physical sciences. The presentation of science
also may imply that research follows a smooth path from theory to hypothesis
to observation to generalization, and so on, whereas in reality the
path tends to be very irregular and circuitous. Finally, there are pervasive
subjective influences on scientists that affect the selection of research
problems and methods, interpretation of evidence, and adherence to theoretical
ideas.
21. Phenomenologists reject the positivist
model insofar as it assumes that human beings can be treated the same
as nonhuman objects. In contrast to nonhuman objects, humans can interpret
and act upon their interpretations of the world, making it necessary
for the researcher to understand the world from the subject’s
frame of reference. Others claim that the positivist model is inappropriate
because it fails to promote social change, falsely assumes the possibility
of universal laws, and is not self-conscious about the way that language
and culture shape scientific knowledge.
Elements of Research Design
1. The text mentions five general
factors: (1) structure and theoretical development of the discipline;
(2) social problems; (3) personal values and predilections of the
researcher; (4) social premiums such as the climate of opinion in
society and the availability of funding; and (5) practical considerations
such as the time and other resources available to conduct research.
2. (a) the country or nation; (b) the college;
(c) the family; (d) the individual; (e) the novel.
3. Erroneously using information about an
aggregate (e.g., a school) to draw inferences about the units of analysis
that make up the aggregate (e.g., individual students).
4. The unit of analysis is the school. This
conclusion is not valid because it commits the ecological fallacy:
one cannot draw conclusions about individual students from collective
information about schools.
5. (a) School quality is an independent variable;
academic achievement is the dependent variable; social class level
is an uncontrolled extraneous variable. (b) Pay rate is a control
variable; job satisfaction is an independent variable; productivity
is a dependent variable. (c) Conflict between groups is an independent
variable; cohesion within groups is the dependent variable.
6. Research does not consist of the blind
and unguided acquisition of facts. The social world consists of an
unlimited number of variables that may be investigated and facts that
may be observed. Therefore, researchers always face decisions about
what variables to measure and what to ignore, which facts are relevant
and which are not, and how the facts should be interpreted. To make
such decisions, researchers necessarily are guided—implicitly
or explicitly—by anticipated relationships, which constitute
the “theoretical context” of the research.
7. Some examples are: (a) The more hours
of televised violence to which a child is exposed, the more aggressive
he or she will tend to be. As level of education increases, annual
income increases. The higher the population density of an area, the
higher the crime rate. The greater the unemployment rate, the greater
the homicide rate. (b) The lower the median income of an area, the
higher the rate of infant mortality. The more cigarettes smoked per
day, the lower the life expectancy. As age increases, attendance at
religious services decreases.
8. Strength refers to the extent to which
two (or more) variables are associated or correlated; form refers
to how changes in one variable are related to changes in another.
For qualitative variables, we can only describe statistically the
strength of a relationship; however, for quantitative variables we
can statistically measure two properties of the form of the relationship:
directionality and linearity.
9. This means that the probability is less
than 1 in 1000 that this relationship could have occurred by chance
or random processes (in this case the selection of a random sample
of all high school students in the U.S.), assuming that no relationship
exists among high school students nationally. Therefore, the association
is likely to exist among all high school students nationally.
10. Association (Are the variables statistically
associated?), direction of influence (Is the direction of influence
from the presumed cause to the presumed effect?), and nonspuriousness
(Do the variables have a common cause that might have produced the
statistical association between them?).
11. Identifying variables that intervene
between a cause and an effect adds to the theoretical understanding
of relationships. It also provides additional evidence that a relationship
is nonspurious, thereby strengthening acceptance of the causal
relationship.
12. This statement captures the idea that
“correlation” is only one of three criteria necessary
to establish a causal relationship. Other examples of obviously spurious
correlations that do not imply causation are (1) a correlation between
the number of birds and the number of leaves on a tree, and (2) a
correlation between the number of drownings and the amount of ice
cream consumed per day.
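A spurious correlation of this kind can be illustrated with a small sketch. The data below are invented, and the common cause (season) is an assumption added for illustration; the answer above does not name one. The point is only that an association can be strong overall yet vanish once the common cause is held constant.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical daily records: (season, ice cream consumed, drownings).
days = [
    ("cool", 1, 1), ("cool", 1, 2), ("cool", 2, 1), ("cool", 2, 2),
    ("hot", 4, 4), ("hot", 4, 5), ("hot", 5, 4), ("hot", 5, 5),
]

# Association ignoring season: strong and positive.
overall_r = pearson_r([d[1] for d in days], [d[2] for d in days])

# Association within each season (season held constant): none.
within_r = {
    season: pearson_r(
        [d[1] for d in days if d[0] == season],
        [d[2] for d in days if d[0] == season],
    )
    for season in ("cool", "hot")
}
```

Here `overall_r` is 0.9 while both within-season correlations are 0, which is the pattern one expects when a third variable produces the association.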
13. Galle et al. (1972) speculated that the
statistical association between crowding and social pathology might
be due to the variable social class. That is, the lower one’s
social class, the more likely one is to live in areas with a high
population density and the more likely one is to be the victim of
social pathology. (However, when they controlled for social class,
they still found a significant correlation between density and pathology,
thereby providing evidence of nonspuriousness.)
14. (a) The independent variable is whether
or not a student has taken a course in research methods; the dependent
variable is grades (or subsequent grades) in sociology courses. (b)
Academic major (sociology/non-sociology major), class (first-/second-/third-/fourth-year),
grade-point average (GPA), gender. (c) Class standing may create a
spurious relationship insofar as third- and fourth-year students are
more likely to have taken a course in research methods and tend to
have higher grades overall than first- and second-year students. Academic
major may create a spurious relationship if sociology majors are more
likely to take a course in research methods and also are likely to
receive higher grades in sociology courses than non-sociology majors.
GPA could create a spurious relationship if a higher GPA were associated
both with taking a course in research methods and with higher grades
in sociology courses.
15. (a) This statement does not specify how
the variables are related; for example, are Republicans more likely
to be members of the upper class, middle class, or lower class? (b)
This statement clearly spells out the statistical relationship, although
one cannot tell which variable is cause and which is effect. (c) The
hypothesis here is clear except for the ambiguity of the category
“adult.” A better statement would specify the ages over
which intellectual ability is expected to decline. (d) Both “broken
homes” and “delinquency” are not variables, even
though they imply variables. Therefore, this is a poor statement of the
hypothesis. To clarify the implied relationship, one should convert
the key terms into variables; for example, children from single-parent
families are more likely to be delinquent than children from two-parent
families.
16. Examples: (1) The higher one’s
level of education, the less likely one is to believe in life after
death. (2) Republicans are more likely to favor capital punishment
than Democrats and Independents. (3) Men are more likely to believe
in life after death than women.
17. None of the hypotheses contains two quantitative
variables; therefore, continuous statements are not possible. Restating
the second hypothesis above as a difference statement, we have: If
someone’s party affiliation is Republican, then he or she is
conservative or reactionary; if someone’s party affiliation
is Democrat or Independent, then he or she is radical or liberal.
18. Exploration, description, and hypothesis
testing.
19. In contrast to descriptive and explanatory
research, which is highly structured, exploratory research is not
based on clearly formulated ideas or a set research design with specific
guidelines for the collection and analysis of data. Hypothesis-testing
research is designed to test specifically formulated hypotheses, whereas
basic descriptive research is designed primarily to collect information
about isolated variables.
20. (1) Selection and formulation of the
research problem; (2) preparation of the research design; (3) measurement;
(4) sampling; (5) data collection; (6) data processing; and (7) data
analysis and interpretation.
Measurement
1. Conceptualization and operationalization.
2. Each indicator of a concept is subject to
error and unlikely to capture the precise meaning of a concept. The
use of multiple indicators reduces error and generally provides a better
overall representation of the underlying concept.
3. An operational definition is a description
of the research operations or procedures required to measure a concept
or variable.
4. Manipulated operational definitions introduce
systematic changes into the environment that are designed to represent
the values or categories of a variable; nonmanipulated operational definitions
estimate the existing, naturally occurring values or categories of variables.
For example, you could measure exposure to violent television by manipulating
the programs subjects watch, such as by having them watch one of two
television programs, one with violent content and the other without
violent content; or you could ask subjects to list their four or five
favorite programs, assign a violence rating to each program, and then
compute an average violence score for every subject.
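The nonmanipulated measure described above can be sketched in a few lines. The program names, the 0-to-5 rating scale, and the subjects are all hypothetical; only the procedure (rate each favorite program, then average per subject) comes from the answer.

```python
from statistics import mean

# Hypothetical violence ratings (0 = none, 5 = extreme), invented for illustration.
VIOLENCE_RATING = {
    "Crime Drama": 4,
    "Cartoon Hour": 2,
    "Cooking Show": 0,
    "Nature Film": 1,
    "Action Series": 5,
}

def violence_score(favorites):
    """Average the violence ratings of a subject's listed favorite programs."""
    return mean(VIOLENCE_RATING[program] for program in favorites)

# Each subject lists four favorite programs.
subjects = {
    "subject_1": ["Crime Drama", "Action Series", "Cartoon Hour", "Nature Film"],
    "subject_2": ["Cooking Show", "Nature Film", "Cartoon Hour", "Crime Drama"],
}

scores = {name: violence_score(favs) for name, favs in subjects.items()}
```

With these invented ratings, subject_1 averages 3 and subject_2 averages 1.75, so the measure estimates an existing difference in exposure rather than creating one.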
5. An indicator is a single, observable representation
of a variable, such as a questionnaire item; an index or scale is a
combination of two or more indicators, such as the sum of the responses
to two or more questionnaire items.
6. Leadership: the specific form of measurement
would depend on the context; for example, we would measure campus leadership
differently than leadership in an ad hoc discussion group. For the latter,
one might ask each group member, “who is most influential?”
or “who is the leader of the group?” and then rank order
members in terms of how often they were identified. Campus leadership
could be measured by recording the number of campus organizations (dormitories,
athletic teams, clubs, student government, etc.) in which a person has
held formal positions of leadership (e.g., president, vice president).
Campus involvement: the number of campus clubs and organizations to
which a person belongs. One might obtain verbal reports of actual memberships
by having persons respond to a check list of clubs and organizations.
Quality of life: ask people how happy they are—very happy, pretty
happy, or not too happy? or what is their state of health—excellent,
good, fair, or poor? or how they find their life, in general—exciting,
pretty routine, or dull? (See Davis, Smith, and Marsden, 2002:179-80)
Social class: principal indicators, derived from verbal reports, are
(1) the highest year or grade in school completed (level of education),
(2) occupation (from which one can arrive at an occupational prestige
rating), and (3) annual income. Interpersonal attraction: ask people
how much they like another—very much, quite a bit, a fair amount,
a little, or not at all? (See Byrne, 1971)
8. The numbers in nominal measurement function
as shorthand labels or names for classifying cases into different categories;
the numbers in ordinal measurement indicate rank order: first, second,
third, etc.
9. (a) ordinal; (b) ratio; (c) nominal; (d)
ordinal; (e) ordinal.
10. (a) Presence or absence of an income (although
some researchers would treat this as ordinal as opposed to nominal measurement
because some income is greater than none). (b) under $10,000; $10,000
to $19,999; $20,000 to $29,999; $30,000 to $39,999; $40,000 to $49,999;
$50,000 or greater. (c) Responses to the question, what is your annual
income?
11. For studies in Western societies, the list
might be Protestant, Catholic, Jewish, Other, None. For studies in Asian
societies, it might be Hindu, Muslim, Buddhist, Christian, Other, None.
Common errors here are for students to (1) include overlapping categories
such as Protestant and Baptist, (2) use inappropriate categories such
as “atheist” and “agnostic,” which pertain to
specific beliefs rather than formal religious groups, and (3) omit the
categories “other” or “none,” which are necessary
to make the list exhaustive.
12. (a) not exhaustive (add under $3,000);
(b) neither exhaustive nor mutually exclusive (the GSS uses the following
categories: working full time; working part time; with a job but not
at work because of temporary illness, vacation, or strike; unemployed
or laid off; retired; in school; keeping house); (c) not mutually exclusive
(change last category to “over 30 years”).
13. Reliability is a necessary but not a sufficient
condition for validity. It is possible to have a highly reliable measure
that is invalid, but a valid measure is necessarily reliable.
14. Operational definitions may reflect: (1)
true differences in the property being measured; (2) biases in the method
of measurement (systematic error); and (3) errors due to random or chance
factors (random error).
15. Error may be created by personal differences
in (1) ability to articulate or write, (2) prior knowledge of the material,
(3) fatigue or health, and (4) mental preparedness or confidence; by
situational effects such as (5) distracting noise or an oppressively
warm room temperature; by features of the test such as (6) a set of
questions that are too difficult or unrepresentative of the material
and (7) lack of clarity in the instructions for taking the exam or in
one or more individual questions. (1) and (2) create systematic error
insofar as they bias the test in favor of persons possessing characteristics
other than that which is being tested, and (6) creates systematic error
by underestimating the extent to which all students know the material.
The other sources of error create random or chance differences in the
scores of individuals taking the test.
16. Random errors affect reliability, which
in turn affects validity. Systematic errors affect validity; however,
they do not reveal themselves in tests of reliability since they do
not cause measures to fluctuate randomly either from one respondent
to the next or from one administration of the measure to the next.
17. With verbal report measures, respondents
measured twice may (1) remember and simply repeat their first response,
especially if the interval between the two measures is short, (2) change
their response because of real changes in the property being measured,
and (3) change their response because of changes brought about by the
first administration of the measure. In addition, it is often uneconomical
to “test” the same group of respondents twice.
18. A stability estimate is based upon consistency
over time, as in the test-retest procedure. An equivalence estimate
of reliability is based on measurements made on the same occasion. As
the definitions imply, an equivalence estimate would be preferable;
if a change in the variable occurs over time, a stability estimate will
underestimate reliability.
19. Both techniques for assessing reliability
apply to multiple-item measures (i.e., scales and indexes), and both
assume the equivalence of items within the scale—the equivalence
of either subsets of items, as in split-half reliability, or of each
item in the scale with every other item.
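As a rough sketch of the two approaches, the code below computes a split-half correlation and an internal-consistency coefficient for a made-up four-item scale. The responses are invented, and the choice of Cronbach's alpha as the internal-consistency statistic is an assumption; the answer above does not name a specific coefficient. The split-half value is the raw correlation between halves, with no length correction applied.

```python
from math import sqrt
from statistics import variance

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def split_half_r(rows):
    """Correlate the sum of items 1, 3, ... with the sum of items 2, 4, ..."""
    first_half = [sum(row[0::2]) for row in rows]
    second_half = [sum(row[1::2]) for row in rows]
    return pearson_r(first_half, second_half)

def cronbach_alpha(rows):
    """Internal consistency based on all item variances and the total-score variance."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: one row per respondent, one column per scale item.
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
]

half_r = split_half_r(responses)
alpha = cronbach_alpha(responses)
```

Both statistics are high for these data because respondents who score high on one item tend to score high on the others, which is what the equivalence assumption requires.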
20. Intercoder reliability refers to the degree
of consistency across different interviewers, coders, or observers applying
the same measure.
21. Reliability can be increased by (1) pretesting
to uncover and eliminate misinterpretations and ambiguous wording, (2)
increasing the number of different indicators of a concept (in general,
the more items, the more reliable the measure), (3) conducting item
analyses to identify weak items, and (4) carefully instructing respondents,
or carefully training interviewers in the use of an instrument or measure.
22. Face validity is generally unsatisfactory
because it depends solely on the judgment of the researcher rather than
on objective evidence.
23. Content validity is most appropriate in
tests of skill, knowledge, and achievement, when one can clearly define
the components of the conceptual domain and show that the items of the
test clearly represent these components.
24. All forms of validation are subjective
in the sense that validity rests on the interpretation of research evidence
and ultimately depends on the consensus of the scientific community
that an operational definition adequately represents a particular concept.
25. Criterion-related validation should be
used when validity is strictly a matter of how well a measure identifies
or predicts a particular property (called the criterion), and when a
clearly defined and reliable measure of the criterion exists. It is
only as good, however, as the appropriateness and quality of the criterion
measure.
26. The test of criterion-related validity
is simply how strongly a measure is correlated with or predicts a particular
criterion. The test of construct validity is an accumulation of research
evidence that a measure adequately represents the meaning of a concept;
it is not based on a single prediction but rather on a set of correlations
indicating that the measure discriminates the underlying concept from
related concepts (discriminant validity) and correlates with measures
of other theoretically-related concepts (convergent validity).
27. (1) Positive correlations with measures
of related variables; (2) consistency across different indicators or
different methods of measurement (e.g., verbal reports and observations
of behavior); (3) low correlations with unrelated variables; and (4)
predicted differences among groups known to differ on the concept being
measured.
28. (a) split-half reliability; (b) criterion-related
validity; (c) construct validity; (d) content validity; (e) construct
validity (“known groups” validation, which differs here
from criterion-related validity insofar as this is a measure of a theoretical
construct and is not intended to serve a practical, predictive purpose);
(f) test-retest reliability; (g) construct validity (or lack of discriminant
validity).
Sampling
1. (1) To represent population variability adequately;
(2) to make an investigation feasible by reducing time and costs; and
(3) in the case of extremely large populations, to increase accuracy.
2. The planning and operation of research are more
manageable in the study of a sample than an entire population. More
attention can be given to other elements of research design, such as
questionnaire development, to procedures for locating respondents, and
to the training and supervision of interviewers, all of which enhance
the quality of the data.
3. The Literary Digest poll had a (1) biased sampling
frame, constructed from such sources as phone books, automobile registration
lists, and the Digest’s own subscription list; and (2) large nonresponse
bias created by a reliance on voluntary responses to mailed sample ballots,
and exacerbated by a low response rate. Both of these problems created
a disproportionate number of responses among the financially well-off,
precisely those who were least likely to vote for Roosevelt. The Gallup
poll of 1948 failed partly because (1) undecided voters at the time
of the poll voted predominantly for Truman, and (2) the quota sampling
method then in vogue was biased toward the selection of Republican voters.
4. The first step is the identification of the target
population, the population to which the researcher wishes to generalize.
The second step is the construction of a sampling frame, the set of
all cases from which the sample is actually selected.
5. Sampling frames may be constructed (or operationalized)
by (1) listing all cases or (2) establishing a rule defining membership.
An example of (1) would be a campus telephone directory which lists
all students currently enrolled. An example of (2) would be all persons
who exit a particular polling site on the day of an election.
6. It is impossible to judge a sample by its overall
representativeness because one can never be sure that a sample is like
a population in all respects. Moreover, it is rarely possible even to
evaluate the representativeness of specific sample characteristics,
since this requires knowledge of relevant population parameters, which
researchers usually do not have. Consequently, samples are judged in
terms of the procedures (the sampling design) that produced them.
7. Probability sampling always involves a process
of random selection, so that all cases in the population have a known
probability of being included in the sample. Nonprobability sampling
involves some form of nonrandom case selection.
8. Sampling is biased when cases have different,
unknown chances of being selected. Random selection guards against bias
by giving each case in the population an equal or known chance of being
included in the sample.
9. Every possible combination of cases has an equal
chance of being included in the sample. For example, for a population
of 100 cases, there are 4,950 different combinations of samples consisting
of 2 cases. With simple random sampling, each of these combinations
has an equal chance of being selected as the sample.
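The count of 4,950 can be checked directly. This is a small verification sketch using Python's standard library; the variable names are illustrative.

```python
from itertools import combinations
from math import comb

population_size = 100
sample_size = 2

# Number of distinct samples of 2 cases from 100: 100 * 99 / 2.
n_samples = comb(population_size, sample_size)

# Cross-check by enumerating every possible pair of cases.
n_enumerated = sum(1 for _ in combinations(range(population_size), sample_size))

# Under simple random sampling each combination is equally likely.
prob_each_sample = 1 / n_samples
```

Both counts come to 4,950, so each possible sample of two cases has a 1-in-4,950 chance of being the one selected.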
10. To increase the accuracy of a simple random sample,
one must increase sample size.
11. Stratified random sampling is more accurate (i.e.,
produces a smaller standard error) than simple random sampling (SRS)
when the stratifying variable is related to the variable that the researcher
is estimating or studying. Therefore, since one can increase the accuracy
of a SRS by sampling more cases, stratified sampling is more efficient
provided that the cost of stratifying is low relative to the cost of
increasing sample size.
12. Disproportionate stratified random sampling is
used when the proportion of cases in a stratum is so small that proportionate
sampling would yield too few cases for reliable statistical estimates
of the stratum. Disproportionate sampling thus is designed to increase
the number of cases selected from particular strata.
13. Single-stage cluster sampling involves breaking
the population down into groups of cases, called clusters, selecting
a random sample of clusters, and then selecting all cases within each
randomly selected cluster. Stratified random sampling involves breaking
the population down into variable categories, called strata, and then
selecting a random sample of cases from each stratum.
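The contrast between the two designs can be sketched as follows. The strata, clusters, and sizes are hypothetical, and the functions are a minimal illustration rather than a full sampling routine.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of cases from every stratum."""
    return {name: rng.sample(cases, n_per_stratum) for name, cases in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every case inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

rng = random.Random(0)  # seeded only so the illustration is repeatable

# Hypothetical strata: students grouped by a variable category (class year).
strata = {
    "first-year": [f"fy{i}" for i in range(50)],
    "fourth-year": [f"sr{i}" for i in range(30)],
}

# Hypothetical clusters: students grouped by where they live (dormitory).
clusters = {f"dorm{d}": [f"d{d}_s{i}" for i in range(20)] for d in range(6)}

strat = stratified_sample(strata, 5, rng)   # some cases from every stratum
clus = cluster_sample(clusters, 2, rng)     # all cases from a few clusters
```

Every stratum is represented in the stratified sample, whereas the cluster sample contains complete dormitories and omits the rest entirely.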
14. Cluster sampling is used to reduce the costs
of data collection, such as the costs of interviewer travel and constructing
a list of the population.
15. We can reduce sampling error by selecting more
clusters and fewer cases within clusters. Increasing the number of clusters
reduces sampling error at the stage that is subject to the greatest
error.
16. If each cluster had an equal chance of being
selected, and the clusters varied in size, then the cases in larger
clusters would have a lower probability of being selected than those
in smaller clusters. Making the probability of selecting a cluster proportionate
to its size gives each case in the population an equal chance of being
selected.
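A short calculation makes the point concrete. It assumes a simplified two-stage design in which one cluster is selected and a fixed number of cases is then drawn from it; the cluster names and sizes are invented.

```python
from fractions import Fraction

# Hypothetical clusters of unequal size.
cluster_sizes = {"A": 100, "B": 400, "C": 500}
cases_per_cluster = 10  # assumed fixed number of cases drawn from the selected cluster
total_cases = sum(cluster_sizes.values())

def case_probability(p_cluster, size):
    """P(case selected) = P(its cluster selected) * P(case drawn within the cluster)."""
    return p_cluster * Fraction(cases_per_cluster, size)

# Equal chance for every cluster: cases in large clusters are disadvantaged.
equal_chance = {
    name: case_probability(Fraction(1, len(cluster_sizes)), size)
    for name, size in cluster_sizes.items()
}

# Probability proportionate to size: the cluster size cancels out.
pps = {
    name: case_probability(Fraction(size, total_cases), size)
    for name, size in cluster_sizes.items()
}
```

With equal cluster probabilities, a case in cluster A has a 1/30 chance of selection and a case in cluster C only 1/150. With probability proportionate to size, every case has the same chance, 1/100.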
17. If one does not have a computerized list and
must draw a sample manually, a systematic sample is easier to draw than
a simple random sample. You do not need to number all cases, as in simple
random sampling, and you only need to use the table of random numbers
to select the first case. The danger is that the population listing
(i.e., the sampling frame) may have a periodic or cyclical pattern that
corresponds to the sampling interval, thereby creating a biased sample.
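The periodicity danger described in answer 17 can be illustrated with a minimal Python sketch. The function name `systematic_sample` and the apartment listing are hypothetical, chosen so that the cycle in the list matches the sampling interval exactly.

```python
def systematic_sample(frame, n, start):
    """Select every k-th case from the frame, beginning at position `start`.

    k is the sampling interval; `start` would normally be chosen at random
    from 0..k-1 (for example, with a table of random numbers).
    """
    k = len(frame) // n
    if not 0 <= start < k:
        raise ValueError("start must fall within the first sampling interval")
    return [frame[start + i * k] for i in range(n)]


# Hypothetical frame: 100 apartments listed floor by floor, 10 per floor,
# with the first unit on each floor being a larger corner apartment.
frame = [{"unit": i, "corner": i % 10 == 0} for i in range(100)]

first_start = systematic_sample(frame, n=10, start=0)  # interval 10 matches the cycle
other_start = systematic_sample(frame, n=10, start=3)

print(sum(case["corner"] for case in first_start), "of 10 are corner units (start=0)")
print(sum(case["corner"] for case in other_start), "of 10 are corner units (start=3)")
```

Corner units make up 10 percent of this frame, yet the sample contains either all corner units or none, depending on the starting point. Either outcome misrepresents the population.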
18. Nonprobability sampling is called for when (1)
an extremely small number of cases are to be studied, (2) the availability
of cases is limited because of death, inaccessibility, or refusals to
cooperate, (3) the research is in the early stages of investigating a problem, and (4)
the population is either very small or not readily identifiable.
19. A convenience sample is selected on the basis
of availability, whereas a purposive sample is selected so as to represent
the population in certain respects.
20. Both quota sampling and stratified random sampling
partition the population according to the categories (or strata) of
one or more variables. But whereas in stratified random sampling cases
are randomly selected within each stratum, in quota sampling the quota
of cases within a stratum is selected by nonprobability methods such
as convenience sampling.
21. Interviewers using quota sampling may select
a disproportionate number of friends, visible persons in highly concentrated
populations, and persons living in attractive homes and neighborhoods.
Such selection is likely to produce samples that are biased in terms
of age, education, socioeconomic status, and numerous other related
variables.
22. The two most common combinations in social research
are (1) multistage cluster sampling with quota sampling of individual
respondents at the final stage, and (2) nonprobability (usually purposive)
sampling of areas with probability sampling of cases within each area.
Such combinations are used to reduce the costs associated with probability
sampling.
23. With sufficient resources and a readily identifiable
target population, one may select a relatively large probability sample
and screen for members of the target population. Ordinarily, however,
researchers use some form of referral sampling, such as network sampling,
in which targeted respondents are asked to identify other members of
the target population with whom they are socially linked in some way
(e.g., neighbor, brother); or snowball sampling, in which targeted respondents
are asked to identify other members of the target population, who are
asked to identify others, and so on.
24. The (1) stage of research and (2) intended use
of the data determine how accurate the sample must be. Accuracy is least
important in the early, exploratory phases of research and most important
in the later stages of research and in large-scale fact-finding studies
that influence policy decisions. (3) Available resources, such as time,
money, materials, and personnel, place limitations on how cases can
be selected. (4) The four data collection approaches typically incorporate
different sampling designs: convenience sampling in experiments; probability
sampling in surveys and available data research; and purposive sampling
in field research.
25. The five considerations affecting sample size
are (1) heterogeneity of the population with respect to the variable
under investigation, (2) desired level of precision in a sample interval
estimate, (3) sampling design, (4) time and money available to conduct
a study, and (5) number of breakdowns planned during the data analysis.
The more heterogeneous the population, the more precise the estimate
must be, and the more planned breakdowns, the larger the sample must
be. A given level of precision requires a larger cluster sample than
a simple random sample and a larger simple random sample than a stratified
random sample. Of course, larger samples also require more time and
money.
26. One might respond by saying, first, that it is
the absolute size of the sample rather than the proportion of the population
sampled that determines precision, and second, that the sample need
not be exceedingly large to yield precise results.
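The point made in answer 26 can be illustrated with a rough Python sketch. It computes the standard error of a sample proportion under simple random sampling, including the finite population correction. The population sizes, the sample size of 1,500, and the proportion of 0.5 are assumptions chosen for illustration.

```python
import math


def proportion_standard_error(p, n, N):
    """Standard error of a sample proportion under simple random sampling
    without replacement, including the finite population correction."""
    fpc = (N - n) / (N - 1)
    return math.sqrt(p * (1 - p) / n * fpc)


p = 0.5   # most conservative assumption about the population proportion
n = 1500  # sample size

for N in (100_000, 1_000_000, 100_000_000):
    se = proportion_standard_error(p, n, N)
    print(f"N = {N:>11,}: sampling fraction = {n / N:.5%}, SE = {se:.4f}")

se_small_city = proportion_standard_error(p, n, 100_000)
se_nation = proportion_standard_error(p, n, 100_000_000)
```

The same 1,500 cases yield a standard error of about 0.013 whether they represent 1.5 percent of the population or a tiny fraction of 1 percent, so the proportion of the population sampled has almost no bearing on precision.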
27. (1) Incomplete sampling frames (coverage error)
and (2) incomplete data collection created by refusals to cooperate,
unreturned questionnaires, and missing records (nonresponse bias).
28. (a) Chances are that the target population—sociology
majors—is not very big. If there are fewer than 100 majors, a
survey study should question the entire population; and if the population
numbers in the hundreds, it would be relatively easy to select a simple
random sample. (b) Because members of the gay community will not be
readily identifiable, some form of nonprobability sampling is necessary.
The sampling procedure will depend on one’s knowledge of and contacts
within the gay community, and may include snowball sampling. (c) The
size and geographic dispersion of the population necessitates a multistage
cluster sample.
30. (a) The list of all students currently enrolled
that you obtained from the registrar. (b) All students currently enrolled
at your school. (c) First, number all of the cases in the sampling frame;
second, determine the sample size; third, from the size of the sampling
frame, determine the block size of numbers (e.g., three-digits, four-digits,
etc.) to be read in the random numbers table; fourth, proceed through
the table of random numbers until you have selected the number of cases
required for your sample. (d) Stratified random sampling increases the
accuracy of a sample provided that the stratifying variable is related
to the characteristic being estimated. Thus, if major is related to
time spent studying (which it probably is), selecting a stratified random
sample, with major as the stratifying variable, should increase the
accuracy of an estimate of average study time on your campus. (e) A
cluster sample might be drawn by using living units such as dormitories
as the primary sampling units, first selecting a sample of dormitories,
and then selecting a simple random sample of students in each selected
dormitory. However, this sampling design is clearly inappropriate because
it is used primarily to reduce the costs of sampling large and geographically
dispersed populations, which rarely, if ever, applies to a college or
university. (f) If the target population is relatively large (say, N
> 5,000) and the list of students is not computerized, then systematic
sampling would be easier than taking a simple random sample. However,
stratified random sampling (probably disproportionate), as described
in (d), would be preferable, particularly if one needs to make sure
that each major is adequately represented in the sample.
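The manual procedure in answer 30(c) can be sketched in Python as follows. The function name `srs_from_random_digits` is hypothetical, and the digit string merely stands in for a row of a random numbers table. The sketch assumes the cases in the sampling frame have already been numbered from 1 to the frame size.

```python
def srs_from_random_digits(digits, frame_size, sample_size):
    """Draw a simple random sample of case numbers (1..frame_size) by reading
    fixed-width blocks of digits, as one would from a table of random numbers."""
    block = len(str(frame_size))  # e.g., 3-digit blocks for a frame of 250 cases
    selected = []
    for i in range(0, len(digits) - block + 1, block):
        number = int(digits[i:i + block])
        # Skip numbers outside the frame and numbers already selected.
        if 1 <= number <= frame_size and number not in selected:
            selected.append(number)
        if len(selected) == sample_size:
            return selected
    raise ValueError("ran out of random digits before the sample was complete")


# Illustrative digit string standing in for a row of a random numbers table.
digits = "104801501101536020118164791646"
sample = srs_from_random_digits(digits, frame_size=250, sample_size=4)
print(sample)
```

With a frame of 250 cases the blocks are three digits wide, so the blocks 801, 501, and 536 are passed over as out of range and the first four usable numbers become the sample.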
31. (a) This is a two-stage cluster sample. (b) This
problem could be resolved by drawing a stratified random sample, with
size of school (i.e., number of students) as the stratifying variable.
One might, for example, divide schools into three strata (less than
3,000 students, 3,000 to 10,000 students, and greater than 10,000 students),
and randomly select schools from each stratum.
32. (1) Random sampling error—random variation
from sample to sample. Of course, random sampling error is a function
of sample size and sampling design. (2) Coverage error due to incomplete
sampling frames. The quality of sampling frames almost certainly varies
from one poll to another. We would expect a master voter list to be
better than a phone book, for example. If a phone book or random-digit
dialing were used, the pollster would need to ask a set of screening
questions to identify eligible respondents—registered voters who
will vote in the upcoming election. Differences in the screening techniques
may affect the coverage error. (3) Nonresponse error. If the pollster
used telephone interviewing, then refusals and failures to reach not-at-home
respondents can create error. How often do the pollsters call back if
there is no answer? If they do not call back two or three times at the
minimum, they are likely to undercount young people and the affluent
and overcount people more likely to be home—those with young children
and the elderly. (In addition to these sources of sampling error, there
also could be measurement error due to the wording of questions, respondent’s
lack of knowledge of the candidates [Do respondents know who the candidates
are? Only 30 percent of the public can even name a congressperson from
their state. Some unknowledgeable respondents may respond to the poll
so as not to appear uninformed.], and interviewer effects [see chapter
8]).
Experimentation
1. Evidence of association is demonstrated
by differences among experimental conditions on measures of the dependent
variable: a statistically significant difference shows that the independent
variable, in terms of which the experimental conditions differ, is related
to the dependent variable. Direction of influence is established by
the ordering of events in an experiment: the manipulation of the independent
variable always occurs before the measurement of the dependent variable.
Plausible rival explanations are eliminated by randomization, which
controls for characteristics that subjects bring to the experiment,
and by the constancy of conditions other than the manipulation of the
independent variable, which controls for extraneous factors during the
course of the experiment.
2. Tests of statistical significance indicate
whether the results—the observed differences among experimental
conditions—are likely to have occurred by chance.
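One way to see what "likely to have occurred by chance" means is an exact permutation test, sketched below in Python. This is only one of several kinds of significance test and is not necessarily the one the text has in mind. The scores are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical scores on the dependent variable; not data from any real study.
treatment = [7, 8, 9, 10]
control = [3, 4, 5, 6]


def mean(values):
    return Fraction(sum(values), len(values))


def permutation_p_value(group_a, group_b):
    """Exact two-sided p-value: the share of all possible random assignments
    that produce a difference in means at least as large as the one observed."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(group_a)):
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed:
            extreme += 1
    return Fraction(extreme, total)


p_value = permutation_p_value(treatment, control)
print(f"p = {p_value} (about {float(p_value):.3f})")
```

Of the 70 ways to split these eight scores into two groups of four, only 2 produce a difference as large as the observed one, so chance alone is an unlikely explanation at the conventional .05 level.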
3. No. Although matching creates experimental
conditions that are similar on the matched characteristics, the conditions
may still differ in the distribution of other, unmatched characteristics
unless subjects are also randomly assigned.
4. Sampling refers to the selection of cases
for a study; random sampling indicates that every case has an equal
chance of being selected. Once subjects are selected for an experiment,
which rarely involves random sampling, they must be randomly assigned
to the conditions of the experiment. Thus, random sampling is a method
of drawing a sample of cases, such as the pool of subjects in an experiment,
whereas random assignment is a method of assigning subjects from the
pool to experimental conditions.
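The distinction drawn in answer 4 can be shown in a few lines of Python. The population of numbered students, the pool size of 40, and the fixed seed are all assumptions made for the sake of a repeatable illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Hypothetical population of 1,000 students, identified by number.
population = list(range(1, 1001))

# Random sampling: every case in the population has an equal chance of
# ending up in the subject pool.
subject_pool = rng.sample(population, 40)

# Random assignment: shuffle the pool, then split it between conditions,
# so chance alone decides who receives the treatment.
shuffled = subject_pool[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
conditions = {
    "experimental": shuffled[:half],
    "control": shuffled[half:],
}

for name, subjects in conditions.items():
    print(name, len(subjects), "subjects")
```

In most experiments the first step is replaced by recruiting volunteers, but the second step is still required.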
5. Internal validity refers to the validity
of the study design—whether the study allows one to infer a causal
relationship between the independent and dependent variables. External
validity refers to the extent to which experimental results may be generalized
to situations beyond the specific context of the experiment.
6. The typical sample of subjects in an experiment
consists of college students who have “volunteered” to participate
in exchange for a small payment or course “credit.”
7. Experimenters argue that (1) differences
in background characteristics such as age and education are likely to
have little effect on subjects’ reactions, (2) sampling considerations
are secondary to the primary aim of establishing the existence of a
causal relationship, and (3) the generality of experimental results
with regard to different subject populations can be demonstrated through
replication.
8. External validity is commonly increased
by replicating the experiment while varying one or more features, such
as the nature of the subject population, the setting, the experimental
manipulation, or the measurement of the dependent variable.
9. The four main stages of an experiment are
(1) introduction to the experiment, (2) manipulation of the independent
variable, (3) measurement of the dependent variable, and (4) debriefing
and/or postexperimental interview.
10. The cover story—a plausible false
explanation of the nature of the experiment—is designed to deceive
subjects about the true intent of the experiment so that they will not
be preoccupied with guessing the hypothesis or trying to be helpful
by acting in accord with a presumed hypothesis.
11. Multiple meanings refers to the measurement
problem that occurs when an experimental manipulation is open to a variety
of interpretations. In general, the more complex the experimental situation,
the more likely that subjects’ interpretations of the situation
will vary and differ from the experimenter’s intended meaning.
12. Manipulation checks assess the validity
of experimental manipulations by determining if they are appropriately
experienced or interpreted by subjects.
13. Behavioral measures are less likely to
be contaminated by subjects’ self-censoring of responses. Also,
if overt behavior is the object of study, it is better to measure it
directly rather than obtain an indirect self-report of how subjects
say they will behave.
14. When deception is used, debriefing serves
to inform subjects about the nature of and reasons for the deception.
It is also a time to explain the experiment’s true purpose and
importance, to learn about subjects’ thoughts and reactions during
the experiment, and to convince subjects not to tell others about the
experiment.
15. An experiment is high in experimental realism
when it has an impact on subjects, so that they pay careful attention,
regard the situation seriously, and feel involved rather than detached.
An experiment is high in mundane realism when the setting and events
of the experiment are similar to everyday experiences.
16. In a judgment experiment, subjects make
judgments from materials provided by the experimenter, whereas in an
impact experiment, which is higher in experimental realism, subjects
directly experience the manipulation. This difference would be reflected,
for example, in reading a description of a stimulus person as opposed
to actually interacting with a stimulus person.
17. It is important to consider the social
nature of an experiment because the motives and expectations of subjects
and their interaction with the experimenter may have as much to do with
how subjects respond as do the experimental manipulations.
18. Subjects in an experiment agree to place
themselves under the control of the experimenter and to carry out assigned
tasks unquestioningly.
19. The “good” subject believes
in the value of the research, willingly complies with all instructions
and requests, and hopes to help out the experimenter by acting so as
to validate the experimental hypothesis. Such motives may heighten subjects’
sensitivity to demand characteristics, so that the latter account for
their actions more than the intended experimental manipulation.
20. The “anxious” subject is concerned
about being evaluated, and therefore motivated to make a positive impression
or at least avoid a negative one. The “bad” subject, out
of hostility or disdain, is motivated to sabotage the research by providing
useless or invalid responses.
21. Experimenters can bias findings by making
recording or computational errors, by falsifying data, or by inadvertently
allowing personal characteristics or expectancies to affect subjects’
behavior.
22. Experimenter expectancies appear to be
conveyed nonverbally—through facial expressions, gestures, voice
quality, and tone of voice.
23. Unlike most experiments, studies of experimenter
effects (1) tend to involve highly ambiguous tasks with (2) experimenters
running subjects in only one condition.
24. The effects of demand characteristics may
be minimized by (1) pretesting to identify subjects’ perceptions
of demand characteristics, (2) using a cover story to satisfy subjects’
suspicions about the purpose of the experiment, (3) increasing experimental
realism, (4) physically separating the settings for the experimental
manipulation and the measurement of the dependent variable, (5) conducting
the experiment in a natural setting in which subjects are unaware that
an experiment is taking place, and (6) asking subjects to play the role
of “faithful” subject. Experimenter effects may be reduced
by (1) keeping both subject and experimenter blind to which condition
a subject is in (the “double-blind technique”), (2) using
two or more experimenters, each of whom is blind to some part of the
experiment, (3) conducting a single experimental session for all subjects,
and (4) conveying instructions through audio or videotapes.
25. Compared to a tape recording, a “live”
experimenter will enhance experimental realism but may introduce variation
in the manipulation or experimental conditions through inadvertent nonverbal
cues.
26. Compared to laboratory experiments, field
experiments are higher in mundane realism and external validity, often
completely eliminate the effects of demand characteristics, and are
more amenable to applied research. On the other hand, field experiments
usually offer less control than laboratory experiments, so that they
may only approximate a true experimental design, and often cannot incorporate
standard ethical safeguards of subjects’ rights such as informed
consent and debriefing.
27. Experimentation may be carried out in conjunction
with surveys by systematically varying the wording of questions contained
in a questionnaire or the factors presented in decision-making vignettes,
and by creating different sets of questionnaires.
28. No. It is common for experiments to involve
dyads (pairs of individuals) or larger groups of individuals. The text
cites an example of a study in which neighborhoods were the units of
analysis.
Experimental Designs
1. The basic principle of good design is allowing
only one factor to vary at a time while controlling for all other factors.
2. Experimental designs are valid to the extent
that they offer sound evidence that the manipulated independent variable
is the only viable explanation of observed differences in the dependent
variable. Threats to validity refer to extraneous variables which, if
uncontrolled, offer plausible alternative explanations of such differences.
3. Effects of the independent variable.
4. History and maturation refer to factors
that are concurrent with but extraneous to the experimental manipulation.
History effects are events in the subjects’ environment that affect
the experimental outcome; maturation effects are psychological or physiological
changes in subjects that affect the outcome. Testing and instrumentation
refer to factors that influence the measurement of the dependent variable.
A testing effect occurs when pretesting affects subjects’ responses
on the posttest; an instrumentation effect occurs when the method of
measuring the dependent variable changes over time or differs across
experimental conditions.
5. Statistical regression poses a probable
threat to internal validity when subjects are selected for a particular
experimental condition because of their extreme scores on the dependent
variable.
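The regression effect in answer 5 can be simulated with a brief Python sketch. It assumes a simple model in which each observed score is a stable true score plus random error, and in which no treatment is applied between the two measurements. All parameters are invented.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed for a repeatable illustration

# Hypothetical model: each observed score = stable true score + random error.
true_scores = [rng.gauss(100, 10) for _ in range(10_000)]
pretest = [t + rng.gauss(0, 10) for t in true_scores]
posttest = [t + rng.gauss(0, 10) for t in true_scores]  # no treatment applied

# Select the 500 subjects with the most extreme (highest) pretest scores.
ranked = sorted(range(len(pretest)), key=lambda i: pretest[i], reverse=True)
extreme = ranked[:500]

pre_mean = mean(pretest[i] for i in extreme)
post_mean = mean(posttest[i] for i in extreme)
print(f"Extreme group pretest mean:  {pre_mean:.1f}")
print(f"Extreme group posttest mean: {post_mean:.1f}")
```

The extreme group scores closer to the overall mean on the second measurement even though nothing was done to it, which is why a change in such a group cannot be credited to the treatment without a comparable control group.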
6. First-year students may be more likely than
upperclass students to experience maturational changes that produce
improvement in writing skills. Therefore, differences in writing improvement
between the writing-intensive and other students may not be due to their
experience in the course but rather to the selection of groups for the
experimental conditions that differ in their susceptibility to a maturation
effect.
7. (a) The primary threats to validity in the
one-shot case study are history, maturation, and attrition. (b) In the
one-group pretest-posttest design, the primary threats are history,
maturation, testing, and instrumentation. (c) In the static-group comparison,
the major threats are selection and differential attrition.
8. Random assignment in this design eliminates
the effects of selection and statistical regression by making the experimental
and control groups similar in composition. The presence of an experimental
and a control group, both of which are pretested and exposed to the
same general environment, means that the effects of history, maturation,
and testing should be felt equally in both groups. Instrumentation is
controlled provided that the measurement of the dependent variable is
the same for both groups. Finally, the effects of differential attrition
may be controlled by comparing the pretest scores of those subjects
who drop out of each group.
9. Randomization rules out (d) selection and
(e) statistical regression as threats to internal validity because it
eliminates systematic differences between experimental conditions in
the composition of subjects, which is the source of these validity threats.
However, randomization does not affect events or processes that occur
once subjects are assigned to conditions, which are the sources of (a)
maturation and (b) history effects; nor does it affect the measurement
of the dependent variable, which may produce an (c) instrumentation
effect.
10. The principal threat to external validity
is testing-X interaction, which means that the effect of the independent
variable (X) may depend upon the presence of a pretest.
11. The main advantage is that it eliminates
the possibility of testing-X interaction. It is also simpler and therefore
more economical.
12. (a) Selection-X interaction is minimized
by using heterogeneous samples of subjects and is made less plausible
by replicating an experiment with different subject populations. (b)
Maturation-X interaction may be controlled by systematically varying
conditions, such as the time of day, which could cause maturation effects,
and may be checked by replicating an experiment under varying conditions.
(c) History-X interaction also may be checked by replication, so that
an experimental outcome is subject to different historical influences.
13. Within-subjects designs (1) require fewer
subjects and (2) reduce the error associated with individual differences
that arises when different groups of subjects experience each experimental condition.
On the other hand, participating in one experimental condition may affect
how subjects respond to another, creating possible testing and order
effects. Even though such effects can be controlled or estimated by
counterbalancing, they cannot be eliminated. Therefore, within-subjects
designs should be used with caution and not at all when it is highly
likely that participating in one condition of an experiment will influence
how subjects will respond to another.
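Complete counterbalancing, mentioned in answer 13, can be sketched in Python as follows. The three condition labels and the twelve subject identifiers are hypothetical. In practice subjects would be assigned to orders at random rather than in sequence.

```python
from collections import Counter
from itertools import cycle, permutations

conditions = ["A", "B", "C"]                # hypothetical experimental conditions
subjects = [f"S{i}" for i in range(1, 13)]  # 12 hypothetical subjects

# Complete counterbalancing: use every possible order of the conditions
# and rotate through the orders so each is used equally often.
orders = list(permutations(conditions))
schedule = dict(zip(subjects, cycle(orders)))

for subject, order in schedule.items():
    print(subject, " -> ".join(order))

# Count how often each condition appears in each serial position.
position_counts = Counter(
    (position, condition)
    for order in schedule.values()
    for position, condition in enumerate(order)
)
```

Each condition appears first, second, and third equally often, so order effects are spread evenly across conditions. They are balanced, not removed, which is the caution the answer raises.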
14. The Solomon four-group design contains
two factors, each with two levels: the treatment (presence or absence)
and the pretest (presence or absence).
15. Sigall and Ostrove found an interaction
effect: the effect of the defendant’s attractiveness on recommended
sentence depended on the type of crime; that is, a shorter sentence
was recommended for the attractive than the unattractive defendant when
the crime was burglary, but a longer sentence was recommended when the
crime was a swindle. Longer recommended sentences for the burglary than
the swindle, irrespective of the defendant’s attractiveness, indicate
a main effect for type of crime.
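The difference between a main effect and an interaction can be computed directly from cell means, as in the Python sketch below. The numbers are invented to match the pattern described in answer 15. They are not the values reported by Sigall and Ostrove.

```python
# Hypothetical mean recommended sentences (in years) for a 2 x 2 design.
# Invented numbers that follow the described pattern; not the study's data.
cell_means = {
    ("burglary", "attractive"): 5.0,
    ("burglary", "unattractive"): 9.0,
    ("swindle", "attractive"): 4.0,
    ("swindle", "unattractive"): 2.0,
}


def marginal_mean(factor_index, level):
    """Average the cell means over the levels of the other factor."""
    cells = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(cells) / len(cells)


# Main effect of crime type: compare marginal means across crimes.
crime_effect = marginal_mean(0, "burglary") - marginal_mean(0, "swindle")

# Simple effects of attractiveness within each type of crime.
effect_in_burglary = (
    cell_means[("burglary", "attractive")] - cell_means[("burglary", "unattractive")]
)
effect_in_swindle = (
    cell_means[("swindle", "attractive")] - cell_means[("swindle", "unattractive")]
)

# Interaction: the attractiveness effect differs (here, reverses) across crimes.
interaction = effect_in_burglary - effect_in_swindle

print("Main effect of crime type:", crime_effect)
print("Attractiveness effect for burglary:", effect_in_burglary)
print("Attractiveness effect for swindle:", effect_in_swindle)
print("Interaction contrast:", interaction)
```

A nonzero interaction contrast means the effect of attractiveness cannot be described without specifying the type of crime.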
16. Factorial designs are more cost efficient,
allow for the assessment of interaction effects, and enhance external
validity by determining the effects of one variable under various conditions
(represented by “levels” of other variables included in
the factorial design).
17. Quasi-experimental designs omit one or
more features of true experimental designs, such as randomization, a
control group, or the constancy of conditions.
18. The separate-sample pretest-posttest design
uses separate groups for a pretest and posttest; if subjects are randomly
assigned to the pretest and posttest conditions, this design controls
for selection, and the use of separate groups eliminates testing and
testing-X interaction. Nonequivalent control group designs lack randomization
but include at least one control group; the more similar the comparison
groups in recruitment and history, the more likely that this design
controls effectively for history, maturation, testing, and regression.
19. Rival explanations are ruled out in quasi-experimental
designs by (1) including special design features, (2) examining additional
data that bear on specific threats, and (3) reasoning against the plausibility
of particular validity threats. (a) The Clore et al. study (1) used
multiple measures of the dependent variable (interracial attitudes)
as well as separate-sample and multiple-group pretest-posttest designs,
(2) analyzed additional data bearing upon initial acquaintances and
allegiance to the child’s living unit, and (3) ruled out maturation
because of the brevity of the camp experience and statistical regression
because children were not selected for their extreme attitudes or behavior.
(b) Campbell and Ross (1) used interrupted and multiple time-series
designs, (2) analyzed additional data pertaining to weather conditions,
improvements in highways and automobile safety features (history), and
changes in record keeping (instrumentation), and (3) showed the implausibility
of threats such as testing, history, and statistical regression in view
of the continued steady decline in traffic deaths in Connecticut after
1956.
20. (a) The Deutsch and Collins study involves
a static-group comparison (Design 3). (b) The major internal validity
threats are selection and differential attrition. Selection may account
for the results if the tenants in the integrated projects were less
prejudiced than tenants in the segregated projects before they moved
into the projects. Differential attrition could explain the results
if a greater number of highly prejudiced tenants moved out of the integrated
projects than moved out of the segregated projects. It is also possible
that historical events unique to each of the projects or each of the
cities (e.g., gang fights between black and white youths) could account
for differences in levels of prejudice. Finally, instrumentation may
produce different prejudice levels if there were differences between
the projects in how prejudice was measured.
21. (a) The major problem with this study,
which uses the static-group comparison, is selection. The two groups
are likely to differ in numerous ways other than the use of phonics
versus word method of learning to read. There may be differences between
the towns in socioeconomic and educational levels of the children’s
parents, in the quality of schools, and in the skill and/or commitment
of teachers, all of which could affect reading ability. (b) To control
for selection biases and the effect of teachers, irrespective of teaching
method, the study ideally should randomly assign children within each
of numerous classes to either the phonics or the word method. If the
study involves more than one school system, the design should be replicated,
with both methods used in each system to control for differences between
residents and schools. Assuming that such control is not possible and
that a quasi-experimental design is necessary, it would be crucially
important to locate a set of schools or school system that uses both
methods, in order to minimize selection biases.
Survey Research
1. (1) Large probability samples, (2) systematic
questionnaire or interview procedures, and (3) computerized, quantitative
data analysis.
2. A survey study of campus drinking norms
and policy might treat colleges and universities as units of analysis,
perhaps interviewing key administrators such as the dean of students
and director of fraternity affairs as well as campus opinion leaders.
A study of the implementation of early retirement policies might use
business organizations as units and involve interviews with company
presidents or other key officials responsible for such policies.
3. Unstructured interviews have very general
and loosely defined objectives that allow interviewers considerable
freedom in questioning; structured interviews have highly specific and
well defined objectives, which are met through tight restrictions on
the order and form of questioning; and semi-structured interviews have
specific objectives, but allow some freedom with regard to the formulation
of questions.
4. The General Social Survey (GSS) is an omnibus
personal interview survey of a national probability sample conducted
since 1972 by the National Opinion Research Center. Until 1994, the
survey was conducted annually (except for 1979, 1981, and 1992) with
a sample of about 1,500 respondents. Starting in 1994 the GSS shifted
to biennial surveys with twice as many respondents. The objective of
the survey is to provide high-quality data to the social science research
community.
5. Survey questions may include requests for
social background information, reports of past behavior, statements
of attitudes, beliefs, values, and behavioral intentions, and sensitive
information.
6. Relative to experiments, surveys generally
can address a wider range of topics and collect substantially more information
from much larger and more representative samples; thus, they are more
flexible and more economical in that they can address several research
questions at one time. On the other hand, surveys are less effective
in testing causal relationships than experiments, and they are limited
to self-reports of behavior, which not only are subject to self-censoring
of responses but also cannot substitute for studies of overt behavior.
7. Surveys, like experiments, are susceptible
to reactive measurement effects.
8. Contextual designs and social network designs
provide direct information about interpersonal relations and social
contexts—important objects of social research—whereas in
cross-sectional surveys such information is limited by the extent and
accuracy of individuals' reports about the people and groups with whom
they interact.
9. Both trend studies and panel studies are
longitudinal, that is, involve surveys of respondents at different points
in time. Trend studies survey separate, independent samples of respondents,
whereas panel studies survey the same respondents repeatedly over time.
Only panel studies enable one to assess individual changes.
10. Cohort studies examine influences due to
age, historical period, and membership in a particular cohort.
11. The major decision points in planning a
survey are (1) formulate research objectives, (2) review literature,
(3) select units of analysis and variables, (4) develop sampling plan,
and (5) construct survey instrument (4 and 5 are usually concurrent
activities).
12. Structured survey instruments reduce error
and increase reliability but may adversely affect validity by dampening
respondent motivation and by assuming that all respondents will interpret
questions similarly. Unstructured interviewing facilitates exploratory
research and may enhance validity; however, it requires more highly
trained interviewers and more complex data analysis, and therefore greater
cost per respondent.
13. Unless the target population is highly
concentrated geographically, face-to-face interview studies almost always
involve multistage cluster sampling. This is the most cost-efficient
sampling design in view of the time and travel required to reach respondents.
14. Interviews provide more flexibility by
allowing the researcher (or interviewer) to clarify questions, to elicit
more complete responses, to ascertain the order in which questions are
answered, to use more varied question formats, and to reach respondents
unable or unwilling to respond to a questionnaire.
15. Major problems with face-to-face interviewing
are (1) high cost per respondent, (2) difficulty in reaching some respondents,
(3) difficulties in supervising a widely dispersed staff of interviewers,
and (4) response biases introduced by interviewers.
16. Relative to face-to-face interviewing,
telephone interviewing is (1) substantially less costly and time-consuming,
(2) much simpler in terms of staff supervision, and (3) easier for making
call-backs to not-at-home respondents.
17. CAPI (computer-assisted personal interviewing) prompts
the interviewer with instructions and question wording in the proper
order, skips questions not relevant to particular respondents, ensures
that the interviewer enters appropriate response codes for each question,
and may even identify when respondents are giving inconsistent responses.
In addition to this assistance, CATI (computer-assisted telephone interviewing)
may automatically sample and dial phone numbers, schedule callbacks,
screen and select the person to be interviewed at each sampled phone
number, record responses in a computer data file, and provide sampling
and interviewing updates to supervisors.
18. (a) Response rates and sample quality tend
to be highest in face-to-face interview studies, slightly lower with
telephone interviews, and much lower with mailed questionnaires. (b)
Face-to-face interviews generally are much more costly and time-consuming
than the other survey modes, while telephone interviews generally cost
more but take less time than mailed questionnaires. (c) Face-to-face
interviews allow one to ask the most complex and sensitive questions;
telephone interviews must ask questions simple enough for respondents
to understand and retain while formulating an answer and, like mailed
questionnaires, tend to yield shorter answers (and more nonresponses)
to open-ended questions.
19. A mail survey is recommended for specialized
groups who are likely to have a high response rate, when a large sample
is desired, when costs must be kept low, and when moderate response
rates are tolerable.
20. Internet or Web surveys substantially reduce
costs, including the costs of increasing sample size, require less time
to carry out, and, like other computer-mediated methods, offer considerable
flexibility in questionnaire design. On the other hand, they are subject
to coverage error, as they can only reach those with access to the Internet,
and early research indicates that response rates for Web surveys tend
to be low, at least as low as, if not lower than, those for mailed questionnaire surveys.
21. To increase the response rate in a readership
survey of a Catholic diocesan newspaper, I mixed a mail survey with
telephone interviews. Initially I sent a mail questionnaire survey to
a random sample of subscribers; after a second follow-up to the mail
survey, I conducted telephone interviews with those who failed to respond
by mail.
22. Field administration of a survey entails
(1) interviewer selection, (2) interviewer training and pretesting,
(3) gaining access to respondents, (4) interviewing and interviewer
supervision, and (5) follow-up efforts.
23. Interviewers should be neat and businesslike
in appearance, articulate, tolerant, pleasant and cooperative, good
listeners, show an interest in the survey topic, and be concerned about
accuracy and detail.
24. Interviewer training begins with a description
of the survey and sample. Interviewers then should learn basic interviewing
principles and rules, become acquainted with the interview schedule,
and engage in supervised practice in using the interview schedule.
25. The purpose of pretesting is to try out
the survey instrument on persons similar to those in the target group
in order to check for ambiguous questions, inappropriate response options,
and the like. (See Chapter 9.)
26. A cover letter should (1) identify the
researcher and/or sponsor, (2) describe the general purpose of the study,
(3) show how the findings may be of benefit, (4) explain how the respondent
was selected, (5) assure confidentiality and/or anonymity, (6) indicate
how long the questionnaire or interview will take to complete, and (7)
promise to answer questions about or provide a summary of the study's
findings.
27. The principal argument in favor of standardization
is that it reduces error associated with how interviewers ask questions
and respond to respondent queries. Those who oppose strict standardization
contend that it inhibits the interviewer’s ability to establish
rapport and motivate respondents to respond fully and honestly and that
it disregards the need to detect and correct communication problems
such as the misinterpretation of questions.
28. (a) Interviewers may affect responses and
introduce error as a result of their physical characteristics, including
race, sex, and age, and by conveying expectations to respondents about
how to respond. (b) Respondents may distort or give false responses
because of poor memory, desire to make a favorable impression on the
interviewer, embarrassment, and dislike or distrust of the interviewer.
29. Interviewer supervision involves (1) distributing
materials, keeping records, and paying interviewers, (2) overseeing
the schedule of interviews, (3) collecting and checking interview schedules,
(4) regularly meeting with interviewers, (5) being available to answer
questions, and (6) sitting in on a few interviews.
30. Maintaining supervision and contact provides
a mechanism for motivating interviewers, boosting morale, and maintaining
the quality of interviewing.
31. Follow-up efforts are essential to produce
adequate response rates, that is, to make sure that as many of the sampled
respondents as possible are interviewed or questioned. In the case of
interview refusals, no more than one follow-up should be used. In the
case of mailed questionnaires, three follow-up mailings are the norm,
with special procedures such as certified mail sometimes invoked for
the third mailing.
Survey Instrumentation
1. Respondents must (1) understand the literal
and intended meaning of the question, (2) retrieve relevant information
from memory, (3) formulate a response in accord with the question and
the retrieved information, and (4) communicate a response deemed appropriate.
2. If respondents believe that interviewers
will ask only clear (“clarity”) questions that are relevant
(“relevance”) to their personal situation, they may feel
pressure to provide prompt, but sometimes inadequate, responses rather
than telling the interviewer that the questions are vague, ambiguous,
or inappropriate.
3. Respondents are likely to expend the minimum
effort (“satisficing”) when questions are difficult for
them to answer and when their motivation level is low. Conversely, easily
answered questions and high motivation are more likely to produce maximum
(“optimizing”) effort.
4. By allowing respondents freedom in answering,
open-ended questions can yield a wealth of information as well as clarify
the researcher’s understanding in areas where it is not well developed.
On the other hand, open-ended questions require more work—of respondents
in answering, of interviewers in recording answers, and of the researcher
in coding and analyzing responses—and may yield uneven responses
due to respondent differences in articulateness, verbosity, or willingness
to answer. Closed-ended questions require less effort from respondents
and interviewers, provide response options that may clarify the question
or make self-disclosure more palatable, and are easier to code and analyze.
However, they also are difficult to develop, requiring considerable
prior knowledge of respondents, may force respondents into choosing
among alternatives that do not correspond to their true feelings, and
may dampen respondents’ motivation by restraining spontaneity.
Open-ended questions work best in the early
stages of research, when less is known about respondents, and should
be used when the survey objectives are broad, respondents are highly
motivated, and respondents vary widely in their knowledge or prior thought
about the issue. They should be used sparingly in self-administered
questionnaires because writing takes more effort than speaking and because
an interviewer is not available to use probes that ask for elaboration
or clarification.
5. (a) Open; (b) closed; (c) open; (d) closed;
(e) open.
6. The open version gives the best picture
of American concerns. Sixty percent of the respondents chose one of
the four problems listed as part of the closed question even though
these problems are rarely mentioned in responses to the open question.
Thus, by containing infrequently chosen options, a closed question may
distort research findings.
7. Indirect questions may be used when respondents
are unable or unwilling to reveal certain characteristics or experiences
directly to the researcher. Responses to such questions are difficult
to code, often are open to various interpretations, and are ethically
problematical.
8. The use of questionnaire items and scales
from previous research is considered good research practice and is not
at all unethical unless one uses copyrighted material without permission.
9. A good opening question should be relatively
easy to answer, interesting, and consistent with respondent expectations,
so that it engages respondents’ interest and motivates them to
complete the survey.
10. (a) End; (b) end; (c) middle; (d) beginning.
11. Compared to Question (a), Question (b)
is more general and therefore more susceptible to order effects.
12. “(In a survey of sex-role attitudes)
Now we would like you to read some statements that describe differing
attitudes toward the role of women in society. For each statement, express
your feeling. . . .” Transitions are designed to improve the flow
of a survey, to enhance respondents’ understanding and motivation
and refocus their attention by providing a brief rationale or description
of the ensuing questions.
13. Formulate your research objectives clearly
before you begin to write questions.
14. (a) Lack of precision and, possibly, inappropriate
vocabulary (“sibling”). (How many brothers do you have?
How many sisters do you have?) (b) Double-barreled. (Do you think the
man or the woman should initiate the first date? Who do you think should
pay for the first date—the man or the woman?) (c) Leading question
(“just as much right to”). (In divorce and separation cases,
men and women should have equal right to custody of the children. Or,
in divorce and separation cases, women should have more right to custody
of the children than men.) (d) Leading question. (A woman’s place
is in the home.) (e) Ambiguous. (A series of questions is needed to
address this issue adequately.) (f) Insensitive wording. (Does your
mother work outside the home?) (g) Inappropriate vocabulary. (In general,
who has the most say in important family decisions—you or your
husband [wife]?)
15. A funnel sequence establishes a frame of
reference for specific questions by first asking general questions,
which often are open-ended, and then moving progressively to more and
more specific questions. The inverted funnel sequence reverses this
sequence, asking more specific questions first, which form the frame
of reference for a general opinion question.
16. The accounting scheme could contain some
of the same elements as the scheme for determining why students select
a particular school. Thus, we might ask the following questions: What
is your major? When did you select this as your major? Did you switch
from another major? If so, why? Were there any other areas of study
in which you were interested? What especially appealed to you about
this major? Is this major related to specific career interests? Did
your instructors, friends, parents or anyone else help you decide on
this major? Who? How much influence did they have?
17. The two problems are forgetting and distortion:
respondents may be unable to recall information and/or may not recall
it objectively. The most effective ways to increase the accuracy of
recall are (1) providing a context for answering, such as by asking
questions in life sequence, (2) providing lists, and (3) asking respondents
to check records.
18. The tendency to give socially desirable
responses may be minimized by carefully wording and placing sensitive
questions, assuring anonymity and emphasizing scientific importance,
making statements sanctioning less socially desirable responses, and
building interviewer-respondent rapport.
19. Response sets may be avoided by (1) clearly
spelling out the content of response options rather than using simple
agree-disagree categories, and (2) varying the arrangement of questions
and the manner in which they are asked (e.g., writing attitude statements
so that an “agree” response represents one end of the attitude
continuum on some items and the opposite end on other items).
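Varying the direction of attitude statements only works if the reversed items are reverse-coded before scoring. A minimal sketch, assuming a hypothetical four-item scale scored 1 (strongly disagree) to 5 (strongly agree) in which items q2 and q4 are worded in the opposite direction:

```python
def score_scale(responses, reversed_items, low=1, high=5):
    """Sum item scores after reverse-coding items worded in the opposite direction."""
    total = 0
    for item, value in responses.items():
        if not low <= value <= high:
            raise ValueError(f"{item}: response {value} outside {low}-{high}")
        # Reverse-coding maps 5 -> 1, 4 -> 2, and so on.
        total += (low + high - value) if item in reversed_items else value
    return total


# A hypothetical "yea-sayer" who strongly agrees with every statement.
yea_sayer = {"q1": 5, "q2": 5, "q3": 5, "q4": 5}
yea_sayer_score = score_scale(yea_sayer, reversed_items={"q2", "q4"})  # 5 + 1 + 5 + 1 = 12
```

With balanced wording, indiscriminate agreement lands at the midpoint of the possible range (4 to 20) rather than at one extreme of the attitude continuum.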
20. Contingency questions are questions intended
for only part of the sample of respondents; filter questions determine
who is to answer which of the subsequent contingency questions.
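The relation between filter and contingency questions can be sketched as simple skip logic. The questionnaire below is hypothetical: "employed" is the filter question, and "hours" and "job_search" are contingency questions asked only of respondents who gave a particular answer to the filter.

```python
# Each question has an optional condition: (filter question id, required answer).
QUESTIONNAIRE = [
    {"id": "employed", "text": "Are you currently employed?", "ask_if": None},
    {"id": "hours", "text": "How many hours do you work per week?", "ask_if": ("employed", "yes")},
    {"id": "job_search", "text": "Are you looking for work?", "ask_if": ("employed", "no")},
    {"id": "age", "text": "What is your age?", "ask_if": None},
]


def applicable_questions(questionnaire, answers):
    """Return the ids of questions a respondent should answer, given answers so far."""
    result = []
    for question in questionnaire:
        condition = question["ask_if"]
        # Unconditional questions go to everyone; contingency questions only
        # to respondents whose filter answer matches the condition.
        if condition is None or answers.get(condition[0]) == condition[1]:
            result.append(question["id"])
    return result


employed_path = applicable_questions(QUESTIONNAIRE, {"employed": "yes"})
```

Here `employed_path` is `["employed", "hours", "age"]`; a respondent answering "no" would be routed to `job_search` instead of `hours`.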
21. Pretesting a survey instrument involves
trying it out on a small sample of respondents. It facilitates the revision
and improvement of the instrument by identifying such problems as low
response rates to sensitive questions, lack of variation in responses,
item ambiguity, the appropriateness of response options to closed questions,
and the analytical complexity of answers to open-ended questions.
22. Cognitive interviewing techniques are used
first to diagnose question wording, ordering, and formatting problems
in a draft survey instrument. Typically, small, unrepresentative samples
of paid subjects are asked to verbalize their thought processes during
(“thinkalouds”) or after (follow-up probes, paraphrasing
requests) answering each question being pretested. Then the revised
survey instrument and personnel are field pretested under realistic
interviewing conditions with a group of respondents similar to the target
population for which the survey is designed. Field pretesting supplements
cognitive interviewing techniques by identifying instrument problems
associated with subgroups of diverse target populations, with interviewer
behaviors, and with interviewer-respondent interactions.
23. (1) In “thinkaloud” interviews,
respondents are asked to think aloud, reporting everything that comes
to mind, as they determine a response to pretest questions. (2) In the
probing question technique, interviewers ask follow-up probes to explore
the respondents’ thought processes in formulating pretest question
responses. (3) In paraphrasing follow-ups, respondents are asked to
summarize or repeat the question in their own words.
24. (1) In behavioral coding, live or taped
interviewer-respondent interactions are systematically coded to identify
the frequency of problematic respondent and interviewer behaviors on
each question. (2) In respondent debriefings, structured follow-up questions
at the end of pretest interviews are used to identify instrument problems
from the respondent’s perspective. Similarly, instrument problems
from the interviewer’s perspective are obtained from (3) interviewer
debriefings which usually involve focus group discussions. (4) In response
analysis, the responses of pretest respondents are tabulated and examined
for problematic response patterns. (5) Split-panel tests are used to
compare instrument versions by experimental manipulations of question
ordering, wording, or formats.
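The experimental manipulation in a split-panel test rests on random assignment of pretest respondents to instrument versions. A minimal sketch, assuming two hypothetical versions labeled "A" and "B" and respondent ids 0 through 9:

```python
import random


def assign_split_panel(respondent_ids, versions=("A", "B"), seed=None):
    """Randomly assign respondents to instrument versions in near-equal groups."""
    rng = random.Random(seed)
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)
    # After shuffling, alternate versions so group sizes differ by at most one.
    return {rid: versions[i % len(versions)] for i, rid in enumerate(shuffled)}


assignment = assign_split_panel(range(10), seed=0)
group_sizes = {v: list(assignment.values()).count(v) for v in ("A", "B")}
```

Because assignment is random, systematic differences in responses between the two groups can be attributed to the difference in question ordering, wording, or format.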
Field Research
1. The term “qualitative” is misleading
because field research sometimes involves quantification; the term “observational”
is misleading because observation in some form characterizes all scientific
research and field research may involve much more than direct observation.
2. (a) A case study is a holistic analysis
of a single social phenomenon or setting, such as a particular community,
organization, or small informal group. (b) An ethnography is the description
of a particular culture. Each could be considered a specialized form
of field research.
3. Methodological empathy involves taking the
role of another person or group, trying to see things as they see them,
and using their categories of thought to organize and describe their
experience. It differs from sympathy in that sympathy suggests agreement,
but it is not necessary to agree with another’s perspective in
order to understand it.
4. Field research is particularly appropriate
when studying fleeting or dynamic situations, when it is necessary to
preserve and examine the natural order and entirety of a situation,
when methodological problems or ethics prevent the use of other approaches,
and when very little is known about the topic under investigation.
5. Compared to experiments and surveys, field
research is better suited to exploratory research and the investigation
of topics about which little is known, and it is likely to provide a
better understanding of reality from the subjects’ point of view.
On the other hand, field research is more difficult to replicate and
generally produces less generalizable findings. Surveys provide quicker,
more reliable descriptions of population characteristics, and experiments,
because of their greater control, are superior for testing causal relationships.
6. Many elements of research design are worked
out by the field researcher in the field, during the course of observation
or data collection, whereas in experiments and surveys, the research
design is carefully worked out prior to data collection.
7. Field researchers usually have little choice
but to use nonrandom selection. Neither the kinds of interactive units
studied nor the delicate problems of gaining access to settings and
the cooperation of informants allow for probability sampling.
8. Survey research ordinarily uses probability
sampling with the goal of generating accurate statistical descriptions
of a particular population. Deciding on the sampling design and selecting
units occur prior to data collection. However, the objectives of field
research—to get an insider’s view, to describe a particular
social setting, and to develop working hypotheses—do not require
probability sampling. Rather, sampling ordinarily is carried out over
the entire course of field research as a means of extending, testing,
or filling in gaps in one’s information. This is often accomplished
through some form of purposive sampling to maximize variability.
9. Unlike casual, everyday observation, field
observation is planned, methodically carried out, and involves deeper
and more complex interpretations of reality. Unlike generic scientific
observation, which may be direct or indirect and occur in a natural
or laboratory setting, field observation is limited to direct observation
with the naked eye and always takes place in natural settings.
10. Nonparticipant observation entails observing
people without interacting with them, often without their awareness
of being observed, whereas participant observation is based on an active
involvement in the lives of the people and situations under study. Participant
observation is more common in field research.
11. Field observation may vary from structured
observation involving explicit, preset plans for selecting, recording,
and encoding data, to relatively unstructured observation involving
few decisions about who, when, where, what, and how to observe. The
advantage of more structured observation is that it allows for greater
control of sampling error associated with the selection of observation
sites and times and of measurement error associated with the methods
of recording observations. This control, in turn, permits stronger generalizations
as well as checks on reliability and validity.
12. Being a participant is advantageous insofar
as one gains access to information withheld from outsiders and develops
a more complete, deeper level of understanding by seeing the world as
the participants see it.
13. Participant observers may experience stress
as a result of physical discomfort, awkward and embarrassing encounters,
hostile and suspicious challenges to their intentions, conflicts between
their roles as participant and observer, and repulsion and/or attraction
for the people they study.
14. “Going native” occurs when
the field researcher becomes so deeply involved in the setting or culture
that he or she loses sight of his or her observer role and ceases to
be an objective, independently-minded observer.
15. Interviewing informants enables the field
researcher to cross-validate observations and interpretations of events
and to fill in gaps in information, such as about events that occur
in the researcher’s absence or events to which he or she is not
privy.
16. Structured interviewing, which typifies
survey research, involves standardized questions asked in a particular
order of all respondents. Casual interviewing occurs in ordinary conversations
that are a natural extension of participant observation, such as when
researchers ask orientational questions or when they probe to expand
their information about actions and events. In-depth interviewing entails
relatively unstructured questioning and takes much longer, often requiring
more than one session.
17. (1) Selecting a research setting; (2) gaining
access to the setting; (3) presenting oneself; (4) gathering information;
(5) data analysis and interpretation.
18. “Starting where you are,” or
selecting a research setting with which one is familiar, makes access
easier, may enhance the researcher’s interest, and eases the development
of relations with informants.
19. Gaining access to (a) formal organizations
requires the permission of those in charge, which may be facilitated
by having someone vouch for the researcher; (b) public settings does
not require permission but, if the research involves prolonged observation,
should include seeking the cooperation of those who are likely to question
one’s presence; and (c) private settings usually involves developing
key contacts, who are either the gatekeepers or persons who can introduce
the researcher to the gatekeepers of such settings.
20. Gatekeepers are the persons with the authority
to decide who can or cannot be admitted to private organizations and
settings. Key informants are central figures in a community or setting
who facilitate the researcher’s acceptance by the community and
contact with gatekeepers.
21. The four master roles are complete observer,
complete participant, participant as observer, and observer as participant.
Most field research involves a combination of participation and observation.
Both the complete observer and complete participant conceal their identities
as researchers from those observed, which may raise ethical questions.
In addition, playing either of these roles may prevent access to
certain information—for example, insider information that is hidden
from an observer.
22. Covert research poses ethical problems
in semi-public or private settings, especially when the researcher misrepresents
his or her identity as a researcher in order to gain access.
23. As participants, participant observers
may adopt peripheral, active, or complete membership roles. Peripheral
members remain marginal to the group, as they do not strive for full-member
status and limit their involvement in the group’s activities.
Active members adopt a functional role in the setting, but do not commit
themselves completely to the group. Complete members become fully immersed
in the setting and attain full-member status. (a) Complete members are
best able and peripheral members least able to develop trust and rapport;
(b) complete members have access to the greatest range of information,
including the most personal and intimate matters; and (c) complete membership
is likely to be the most intense and time-consuming, and to generate
the greatest conflict between the member and researcher roles.
24. Be open and honest about who you are and
what you are doing, but do not reveal the details of the study or volunteer
information beyond what is necessary to gain access and maintain good
relations in the setting.
25. Both a tape recorder and on-the-spot note
taking are obtrusive, which may evoke guarded and disingenuous behavior.
In addition, a tape recording does not include nonverbal behavior and
may produce a useless overload of information. A heavy reliance on memory
risks the loss of much information.
26. While in the field, field researchers take
field jottings—phrases, quotes, key words, and the like. At the
end of each day, or as soon after making observations as possible, they
write up field notes, based on their field jottings and memory. Field
notes should provide as detailed and complete a description of events
as possible.
27. Grounded theory refers to theory that is
generated from observation and analysis. A common product of field research
is the development of grounded theory.
28. Coding consists of assigning numbers or
symbols to observational categories. Coding provides a systematic framework
for recording observations and helps to organize and give direction
to the researcher’s analysis.
29. (1) Immediately record sudden insights
when reviewing field notes or observations; (2) Ask questions about
the data, such as what type of unit (e.g., encounter, relationship,
group) is this? What are its structure and characteristics? How frequently
does this action or event occur? (3) Count, i.e., record how often and
how consistently particular patterns occur.
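The counting step can be sketched as a tally of coded field-note entries. The entries and code labels below are hypothetical; the sketch counts the number of entries in which each code appears at least once.

```python
from collections import Counter


def code_frequencies(coded_notes):
    """Count how many field-note entries contain each code (once per entry)."""
    counts = Counter()
    for entry in coded_notes:
        # set() ensures a code repeated within one entry is counted only once.
        counts.update(set(entry["codes"]))
    return counts


# Hypothetical coded field notes from three days of observation.
notes = [
    {"day": 1, "codes": ["greeting", "conflict"]},
    {"day": 2, "codes": ["greeting"]},
    {"day": 3, "codes": ["greeting", "conflict", "conflict"]},
]
frequencies = code_frequencies(notes)
```

Here "greeting" appears in all three entries and "conflict" in two, which gives a rough indication of how consistently each pattern occurs across observation periods.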
30. The validity of interpretations may be
checked by generating and testing alternative interpretations of one’s
observations; by consciously looking for evidence that would disconfirm
a particular interpretation; and by getting feedback from respondents.
Field researchers should also seek corroborating evidence for observations;
check for inconsistencies between informants and find out why they disagree;
and, when possible, check reports and observations against other data
sources, such as institutional records.
Research Using Available Data
1. Experiments, surveys, and field research
all involve the firsthand collection of data, whereas in research using
available data, the data are “second hand,” having been
collected for a different research purpose or manufactured from the
myriad information deposited by societies throughout history.
2. (1) Public documents and official records;
(2) private documents; (3) mass media; (4) physical, nonverbal evidence;
(5) social science data archives.
3. (1) One might use Fortune magazine’s
list of the top 500 corporations, Who’s Who in America, and Current
Biography to study the ethnicity of business elites and to test the
hypothesis that the chief executive officers of these corporations overrepresent
white, male, English Protestants in the American population. (2) One
might use marriage records from a given city or state to test the hypothesis
that the frequency of interreligious marriages has increased over the
past two decades. (3) One might examine editorials in the campus newspaper
to study changes in student values and to test the hypothesis that students
today are less concerned with economic issues and more concerned with
issues of inequality and peace than students in the 1980s and 1990s.
4. In the state of Massachusetts, birth certificates
contain name, sex, date and place of birth, father’s name, occupation,
and birthplace, and mother’s maiden name and name at birth of
child; death certificates contain name, date of death, whether deceased
was a war veteran and of which war, sex, marital status, age, disease
or cause of death, residence at time of death, name of spouse, place
of death and place of birth, name and birthplace of both father and
mother, place of burial or cremation, and date of record.
5. The Census Bureau releases (1) aggregate
data from the decennial census which describe population characteristics
of states, counties, cities, and other geographic units, (2) individual
census records, called enumeration schedules, after a period of 72 years,
and (3) a sample of individual files (the Public Use Microdata Sample)
which have names and other personal identifying information removed
to protect confidentiality.
6. The manuscript census consists of the actual
enumeration schedules filled out by census takers or individual citizens.
These data are released 72 years after their collection. Since 1960,
the bureau also has released a sample of individual-level data (see
answer to question 5).
7. Private documents consist of information
recorded by individuals or organizations about their activities which
is not intended for public consumption. An example would be various
bills, such as telephone bills. College students might examine their
telephone bills to determine when they are most likely to make long-distance
calls home or to their friends. Does the timing of such calls correspond
to peak periods of anxiety, such as the period immediately preceding
midterm or final exams? (This research question is suggested by social
psychological research relating anxiety and affiliation.)
8. For a study of changes in the 20th century,
films would offer an ideal medium. Major films during five-year periods,
starting perhaps in the 1920s, could be analyzed to determine the number
and nature of the black characters portrayed. Alternatively, one might
study radio shows from the 1930s to the mid-1950s and then television
shows from the mid-1950s to the present day. After obtaining lists of shows
airing on the major networks, one could identify those shows with black
characters and determine the nature of these characters, including occupation
and other key traits.
9. There are two types of data archives, each
of which derives data from a different source: data collected from large-scale
survey research, and ethnographic reports from field research in foreign
cultures.
10. Compared to other approaches, the use of
available data more often involves nonreactive measurement; lends itself
more readily to the analysis of large-scale social processes; is better
suited to studies of the distant past and social change, and to cross-cultural
research; makes it easier to increase sample size and conduct replications;
and generally costs less per case.
11. Reactivity is a major problem in laboratory
experiments, surveys, and field research because the research participants
are usually aware that they are being studied or observed. The problem
also arises in available data research involving the secondary analysis
of surveys and ethnographies or the analysis of historical documents
written by authors for the public domain. However, physical evidence
and many factual records are often nonreactive insofar as those who
produce the evidence do so without anticipation or foreknowledge of
the researcher’s particular use of it.
12. First, the research hypothesis or