questionnaire validity

7/29/2019 Questionnaire Validity


Questionnaire Validity

The validity of a questionnaire relies first and foremost on reliability. If the questionnaire cannot

be shown to be reliable, there is no discussion of validity.

But there is good news. Demonstrating validity is easy, compared to reliability. If you have

reached this point and have a reliable instrument for measuring the issues or phenomena you are

after, demonstrating its validity will not be difficult.

Validity refers to whether the questionnaire or survey measures what it intends to

measure. While there are very detailed and technical ways of proving validity that are beyond

the level of this discussion, there are some concepts that are useful to keep in mind. The

overriding principle of validity is that it focuses on how a questionnaire or assessment process is

used. Reliability is a characteristic of the instrument itself, but validity comes from the way the

instrument is employed.

The following ideas support this principle:

As nearly as possible, the data gathering should match the decisions you need to

make. This means if you need to make a priority-focused decision, such as allocating

resources or eliminating programs, your assessment process should be a comparative one

that ranks the programs or alternatives you will be considering.

Gather data from all the people who can contribute information, even if they are

hard to contact. For example, if you are conducting a survey of customer service, try to

get a sample of all the customers, not just those who are easy to reach, such as those who

have complained or have made suggestions.

A perfect example of a questionnaire that may have high reliability but poor validity is a

standardized questionnaire that has been used in hundreds of companies. These instruments are marketed

aggressively using promises of "industry norms" to compare your results with. Weigh

carefully the value of such comparisons against the almost certain lack of fit with your

culture, philosophy and way of managing. A good diagnosis of your organization is not

likely to come from a generic instrument with lots of normative comparisons.

If you're going after sensitive information, protect your sources. It has been said

that in the Prussian army at the turn of the century, decisions were made twice, once

when officers were sober, again when they were drunk. This concept acknowledges the

power of the "socially acceptable response" to questions or requests. Don't assume that a

simple statement printed on the questionnaire that "all individual responses will be kept

confidential" will make everybody relax and provide candid answers. Give respondents the

freedom to decide which information about themselves they wish to withhold, and employ

other administrative procedures, such as handing out Login IDs and Passwords separately

from the e-mail inviting people to participate in the survey.

Questionnaire Validity


The validity of a questionnaire relies first and foremost on reliability. If the questionnaire cannot

be shown to be reliable, there is no discussion of validity.

But even reliable instruments may not be valid if they are employed for situations they were not

designed for. A good example of a questionnaire that may have high reliability, but poor validity is

a standardized questionnaire that is used over and over in hundreds of companies. These

instruments are marketed aggressively using promises of "industry norms" to compare your

results with. Validity is not a characteristic of a particular instrument, attached to it in a

way that ensures it will always produce accurate information no matter where or when

it is used. If you want validity, you have to be able to demonstrate validity in your situation; it is

not built into the instrument.

But there is good news. Demonstrating validity is relatively straightforward, compared to

reliability. If you have reached this point and have a reliable instrument for measuring the issues

or phenomena you are after, demonstrating its validity will not be difficult.

How Do We Measure Validity?

While there are detailed and technical ways of establishing validity that are beyond the level of this

discussion, the following are brief descriptions of the three basic approaches. All proofs of validity

employ one or more of these methods:

Content Validity: If the content of a test or instrument matches an actual job or situation that is being studied, then the test has content validity. For example, a Training Needs Assessment for middle managers should have content (such as skills, activities and abilities) relevant to the jobs of middle managers. Skills that pertain to landscaping workers would not be appropriate in a needs assessment instrument for managers.

Predictive Validity: This form of validity comes from an instrument's ability to predict an outcome or event in the future. If a questionnaire or instrument is developed to assess promotion potential of a group of newly hired workers, the results of the test should be able to predict which of the group will actually be promoted. The predictive validity of the instrument is shown in the correlation between the scores from the test and the persons promoted.

Construct Validity: This form of validity derives from the correlation between the test or questionnaire and another instrument or process that measures the same construct. The Myers-Briggs Type Indicator (MBTI) is a well-established test of personality types. A new instrument developed to assess the same characteristics would have construct validity if the scores from the new instrument correlated highly with the scores from the MBTI.
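
Both the predictive and the construct approaches described above come down to a correlation. The following is a minimal sketch of that calculation, not a prescribed procedure; the scores, the promotion outcomes, and the variable names are all hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: promotion-potential scores for eight new hires and
# whether each person was later promoted (1) or not (0).
test_scores = [52, 61, 47, 78, 85, 66, 90, 58]
promoted = [0, 0, 0, 1, 1, 0, 1, 0]

# Predictive validity: correlation between the score and the later outcome.
# With a 0/1 outcome this is the point-biserial correlation.
predictive_r = pearson(test_scores, promoted)

# Hypothetical scores for the same people on an established instrument
# that is assumed to measure the same construct.
established_scores = [50, 64, 45, 75, 88, 63, 92, 55]
construct_r = pearson(test_scores, established_scores)

print(f"predictive r = {predictive_r:.2f}")
print(f"construct r  = {construct_r:.2f}")
```

A coefficient near 1 suggests the new scores track the outcome or the established instrument closely. How high is high enough depends on the decision the instrument will support, and eight people is far too small a sample for a real study.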

These are the methods for proving validity. Providers of questionnaires and surveys who are

unable or unwilling to talk about validity in those terms (content, predictive, or construct)

should be avoided. You will sometimes hear discussions of face validity. Despite the use of the

term, face validity is not a form of validity assessment. It is simply a subjective appraisal of

how an instrument appears to a person who examines it. There are numerous examples of valid

instruments without face validity and completely bogus instruments with loads of face validity.



Reliability and Validity in Questionnaire Design

In today's world organisations need strategic goals and targets, and clear measurements are needed to assess progress towards these goals. Some of these targets are easy to define and the measurements are clear cut, particularly certain financial goals, production and quality control targets. However, some of the most vital aspects of a well-functioning organisation are more complex to measure. For example, the climate and culture of an organisation is known to be central to optimising employee wellbeing, productivity and innovation. Similarly, it is important to select executives or employees with certain character traits and dynamics for them to function effectively in their roles. Unlike annual income or production, which can be directly measured, many of the psychological aspects of an organisation are intangible constructs and can only be measured indirectly.

The classic example of an intangible construct is Intelligence Quotient (IQ). Most of us agree that there is such a thing as intelligence and that some people have more of it than others! But unlike height or weight it can't be measured with a tape-measure or a set of bathroom scales.

Figuring Out What You Want To Measure

Often the first step in measuring an intangible construct is coming up with an Operational Definition. This means defining what the construct is, what it's comprised of and what measures it. This stage tends to include a review of previous research on the topic to identify what is known about the subject and how people have tried to measure it in the past.

In this type of work, our clients usually have a model of what makes up their construct, or we can help them develop one. As a fictitious example, they might want to measure Organisational Effectiveness and they hypothesise that it is made up of four organisational traits: Morale, Innovation, Management and Teamwork. In this case, each of the four traits needs to be measured. Questionnaires are generally used to collect this type of information. For example, a good design might be a questionnaire with six questions about each of the traits. The responses from the six questions about each trait will later be aggregated to give a measurement of Morale, Innovation, Management and Teamwork.
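
The aggregation step in the fictitious example can be sketched in a few lines. The following assumes a 1-5 answer scale, a fixed question order (questions 1-6 for Morale, 7-12 for Innovation, and so on), and a simple average per trait; none of these details come from the example itself, and the function name is hypothetical.

```python
from statistics import mean

# Assumed layout: 24 questions, six per trait, answered on a 1-5 scale.
TRAITS = ["Morale", "Innovation", "Management", "Teamwork"]
QUESTIONS_PER_TRAIT = 6

def trait_scores(responses):
    """Average one respondent's 24 answers into four trait scores.

    `responses` must be ordered so that the first six answers belong to
    the first trait, the next six to the second trait, and so on.
    """
    expected = len(TRAITS) * QUESTIONS_PER_TRAIT
    if len(responses) != expected:
        raise ValueError(f"expected {expected} answers, got {len(responses)}")
    scores = {}
    for i, trait in enumerate(TRAITS):
        start = i * QUESTIONS_PER_TRAIT
        scores[trait] = mean(responses[start:start + QUESTIONS_PER_TRAIT])
    return scores

# Hypothetical answers from a single respondent.
respondent = [4, 5, 4, 4, 3, 4,
              2, 3, 2, 2, 3, 3,
              5, 5, 4, 5, 5, 4,
              3, 3, 4, 3, 3, 2]
result = trait_scores(respondent)
for trait, score in result.items():
    print(f"{trait}: {score:.2f}")
```

Averaging is only one possible aggregation; a sum or a weighted score would fit the same structure.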

After defining the construct and its components (traits), and producing questions to measure each of these, a testing stage is strongly recommended. The aim of testing is to ensure that the questions are measuring what they are intended to: that is, that they produce a reliable and valid measurement.

Reliability

Reliability means the consistency or repeatability of the measure. This is especially important if the measure is to be used on an on-going basis to detect change. There are several forms of reliability, including:

Test-retest reliability - whether repeating the test/questionnaire under the same conditions produces the same results; and

Reliability within a scale - that all the questions designed to measure a particular trait are indeed measuring the same trait.
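
Both forms of reliability listed above have common numerical summaries. The sketch below uses Cronbach's alpha for reliability within a scale and a Pearson correlation between two administrations for test-retest reliability. These statistics are a conventional choice rather than something the text prescribes, and all data are hypothetical.

```python
from math import sqrt
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for one scale.

    `rows` holds one list per respondent, each with that person's
    answers to the k questions of the scale.
    """
    k = len(rows[0])
    columns = list(zip(*rows))  # one tuple of answers per question
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical answers: five respondents, three questions on one trait.
scale_answers = [[4, 4, 5], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
alpha = cronbach_alpha(scale_answers)

# Hypothetical scale totals for the same five people on two occasions.
first_sitting = [13, 7, 14, 9, 5]
second_sitting = [12, 8, 14, 9, 6]
retest_r = pearson(first_sitting, second_sitting)

print(f"Cronbach's alpha = {alpha:.2f}")
print(f"test-retest r    = {retest_r:.2f}")
```

An alpha of about 0.7 or higher is a widely quoted rule of thumb for an acceptable scale, though the right cut-off depends on how the scores will be used.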

Validity

Validity means that we are measuring what we want to measure. There are a number of types of validity including:



Face Validity - whether at face value, the questions appear to be measuring the construct. This is largely a common-sense assessment, but also relies on knowledge of the way people respond to survey questions and common pitfalls in questionnaire design;

Content Validity - whether all important aspects of the construct are covered. Clear definitions of the construct and its components come in useful here;

Criterion Validity/Predictive Validity - whether scores on the questionnaire successfully predict a specific criterion. For example, does the questionnaire used in selecting executives predict the success of those executives once they have been appointed; and

Concurrent Validity - whether results of a new questionnaire are consistent with results of established measures.

Validating a Model

Going back to our hypothetical example, the client has a model of Organisational Effectiveness that is made up of four organisational traits: Morale, Innovation, Management and Teamwork. They also have a questionnaire with questions that are intended to measure each of these traits. However, as they are using the questionnaire to infer levels of Morale, Innovation, Management and Teamwork, it is important to assess whether the results are consistent with this model being accurate. There are a number of statistical methods available to test whether the data collected using the questionnaire supports the model, or whether either the questionnaire or the model needs revision or development. Principal components analysis and exploratory or confirmatory factor analysis are among the statistical techniques often used to assess a model.
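
Principal components and factor analysis are normally run in a statistics package. As a much simpler first look in the same spirit, one can check whether questions written for the same trait correlate more strongly with each other than with questions written for a different trait. The sketch below does only that inter-item correlation check. It is not a factor analysis, and the two traits, question labels, and answers are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical answers (1-5 scale) from eight respondents.
# Each key is (trait the question was written for, question label).
items = {
    ("Morale", "q1"): [5, 4, 1, 2, 5, 4, 2, 1],
    ("Morale", "q2"): [4, 5, 2, 1, 5, 4, 1, 2],
    ("Teamwork", "q1"): [5, 1, 4, 2, 4, 2, 5, 1],
    ("Teamwork", "q2"): [4, 2, 5, 1, 5, 1, 4, 2],
}

# Sort every pair of questions into "same trait" or "different trait".
within, between = [], []
for (key_a, answers_a), (key_b, answers_b) in combinations(items.items(), 2):
    r = pearson(answers_a, answers_b)
    if key_a[0] == key_b[0]:
        within.append(r)
    else:
        between.append(r)

avg_within = mean(within)
avg_between = mean(between)
print(f"average same-trait correlation:      {avg_within:.3f}")
print(f"average different-trait correlation: {avg_between:.3f}")
```

If the same-trait average is not clearly higher than the different-trait average, that is a hint that the questions or the model may need revisiting, and a signal to run the fuller techniques named above.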

These techniques can often provide a deeper understanding of the issues being surveyed, and can reveal that questions are measuring more or less than they were intended to. For example, many years ago Data Analysis Australia staff were assisting a client with survey data relating to occupational health and safety (OHS) issues. One of the questions might be paraphrased as "My supervisor puts my health and safety above productivity", which was created to measure OHS issues. However, analysis revealed responses to this question instead related mainly to the first words "my supervisor", and showed more about industrial relations than OHS.

Another benefit of using techniques such as factor analysis to assess a questionnaire is improved efficiency. We are often able to advise clients on ways in which they can reduce the length of their questionnaires while maintaining or increasing the information that can be obtained. Reducing the number of questions in an overly lengthy questionnaire makes it easier for respondents to complete, and increases response rates.

Generalisability and Confounding Issues

In testing the questionnaire, the test sample is also important. For example, IQ tests were used incorrectly in the US many years ago on migrants with limited English; in this case they received poor scores, but the test was inadvertently measuring their ability to read and respond to a test written in English rather than their actual IQ. There are two important lessons that can be taken from this example. The first is that other issues that alter our results can pop up in research if we don't give sufficient thought to what we are really measuring. As in the OHS example earlier, even a question that appears fine on the surface can be confounded by other issues in some cases.

The second lesson is to be cautious in generalising results to other groups. If a questionnaire is designed for a specific group it is important to test it on a representative group. A questionnaire that will be used for assessing Board members should be tested on current/prospective Board members if these are the people that the information is required for. If the questionnaire is to be used on many different groups of people, it's important to test it on the different groups it will be used for to ensure it is valid in all its intended usages.

Which of These Issues Do I Need to Consider For My Questionnaire?

The types of reliability and validity issues that need to be considered vary from one situation to the next, depending on what the questionnaire is measuring and its intended use. There is a range of statistical procedures designed to test reliability and validity. In addition, specific survey designs may be necessary to ensure that the required information is available to establish some of the more complex types of validity or reliability.

A number of Data Analysis Australia's clients work in specialist areas in which a small number of rigorously tested survey products form their core business. For these questionnaires in particular, attending to issues of reliability and validity is important to ensure their products are of a high quality. Ongoing research and development of the survey products allows clients to maintain an edge in the marketplace.

For simpler surveys where a questionnaire is gathering information that only needs to be used in a practical way rather than inferential way, the reliability and validity requirements are more basic. However, even in these situations, it is important to make sure consideration is given to whether the survey is measuring what it should be.

Personality Questionnaire Validity and Reliability

Synopsis

December, 1998

Overall: We estimate that the Personality Questionnaire will indicate an English-speaking adult's

personality type accurately 85% of the time in a non-controlled (i.e. over the internet) environment.

Resulting types are repeatable 75% of the time in a non-controlled environment, and 95% of the

time in a controlled environment. Unless otherwise noted, these statistics were generated from a

set of 100,000 subjects.

1st Validation Technique: Best Approach Method. This method incorporates Content and

Criterion-related validity assessment. Best Approach was used during the development of the

Personality Questionnaire to ensure that we were starting with a good basic indicator. We were

assured of content validity by ensuring that the creator of the Personality Questionnaire was an

expert on psychological type, and on determining which behaviors were attributed to which

personality functions and attitudes.

2nd Validation Technique: Comparison Method. This method was used during first-phase and

second-phase testing and validation of the Personality Questionnaire. It was used primarily to

validate the end-results of the Personality Questionnaire. We compared Personality Questionnaire

results against the results of other well-known instruments, namely the MBTI and Keirsey's

Temperament Sorter. Our goal was to produce the same type as these comparable indicators at

least 75% of the time in an uncontrolled environment. We released the questionnaire once we

achieved this goal. Another revision after release brought us up to 85% matching.
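
The Comparison Method amounts to counting how often two instruments assign the same type to the same person. A minimal sketch of that calculation follows; the type lists, the function name, and the constant are hypothetical, and only the 75% release target comes from the text.

```python
def match_rate(our_types, reference_types):
    """Share of respondents given the same type by both instruments."""
    if len(our_types) != len(reference_types):
        raise ValueError("both lists must cover the same respondents")
    matches = sum(a == b for a, b in zip(our_types, reference_types))
    return matches / len(our_types)

# Hypothetical results for eight people who took both instruments.
ours = ["INTJ", "ENFP", "ISTJ", "ESFJ", "INFP", "ENTP", "ISFP", "ESTJ"]
reference = ["INTJ", "ENFP", "ISTJ", "ESFP", "INFP", "ENTJ", "ISFP", "ESTJ"]

RELEASE_TARGET = 0.75  # the 75% goal stated in the text
rate = match_rate(ours, reference)
meets_target = rate >= RELEASE_TARGET
print(f"match rate = {rate:.0%}, meets target: {meets_target}")
```

This treats a result as a match only when all four letters agree. A per-letter comparison would be a different, more lenient measure.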

3rd Validation Technique: Averages Method. We used this method to validate the individual

questions that make up the Personality Questionnaire. This was used periodically throughout the

implementation and revision of the questionnaire.

For this method, we checked that answers to specific questions fell within ten percentage points of expected norms. Expected norms of the general population are as follows: 60% Extraverted, 40% Introverted, 75% Sensing, 25% Intuitive, 50% Thinking, 50% Feeling, 50% Judging, 50% Perceiving. Using 5000 questionnaire results, we checked that each question rendered results within 10 percentage points of these expected norms. If the question did not meet these standards, it was revised and re-tested until it did.
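
The per-question check in the Averages Method can be sketched as follows. The norms and the ten-point tolerance are the ones quoted above. The single-letter coding of answers, the function name, and the sample data are assumptions made for illustration.

```python
# Expected norms quoted in the text, as proportions of the population.
EXPECTED_NORMS = {
    "E": 0.60, "I": 0.40,   # Extraverted / Introverted
    "S": 0.75, "N": 0.25,   # Sensing / Intuitive
    "T": 0.50, "F": 0.50,   # Thinking / Feeling
    "J": 0.50, "P": 0.50,   # Judging / Perceiving
}
TOLERANCE = 0.10  # ten percentage points

def question_within_norm(answers, preference):
    """Compare one question's observed share for `preference` to its norm.

    `answers` is a list of single-letter preferences, one per respondent.
    Returns the observed share and whether it is within the tolerance.
    """
    observed = answers.count(preference) / len(answers)
    within = abs(observed - EXPECTED_NORMS[preference]) <= TOLERANCE
    return observed, within

# Hypothetical answers to two Extraverted/Introverted questions.
balanced_question = ["E"] * 65 + ["I"] * 35
skewed_question = ["E"] * 85 + ["I"] * 15

share_ok, ok = question_within_norm(balanced_question, "E")
share_bad, bad = question_within_norm(skewed_question, "E")
print(f"balanced question: {share_ok:.0%} Extraverted, within norm: {ok}")
print(f"skewed question:   {share_bad:.0%} Extraverted, within norm: {bad}")
```

Under this check, the second question would be a candidate for revision and re-testing.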

Reliability Technique: All reliability data was determined via Repetition.
