
Questionnaire Design MPH86

A study can only be as good as the data.

Martin Bland

Saharnaz Nedjat, MD, PhD

Position of Data Gathering

Statement of the Problem

ROL & Problem Analysis Network

Objectives: Descriptive & Inferential

Methodology: Variables & Data Gathering Tools


In the study methodology, we construct the table of variables needed to achieve the study objectives and identify the appropriate data gathering tools.


The most important consideration in the design and administration of a questionnaire is that it must be able to measure accurately what it is designed to measure.

Accuracy of the data has two components: reliability and validity.


Methodology and Data Gathering Tools

Their ultimate objective is to eliminate (or at least limit) the sources of error:

Random error

Biases: selection bias, information bias


Questionnaire Development Steps

Content Development

Deciding about questionnaire administration

Individual questions wording

Questions sequence

Face Validity

Pilot


No degree of reliability and construct validity can compensate for lack of content validity.


Content Sampling

The domain under consideration should be fully described in advance, rather than being defined after the test has been prepared.

This is done through systematic examination of the domains in textbooks or on the internet, review of standard questionnaires, and consultation with experts.


Guard against any tendency to overgeneralize regarding the domain sampled by the test.

Prevent the possible inclusion of irrelevant factors in the test scores.


Specific procedure

Thorough and systematic examination of relevant domains and subjects in textbooks and internet

Consultation with experts

What is Quality of Life?

What are the QoL domains?

What are the objectives of QoL assessment?


Test Specification

1. the content area or topic to be covered

2. the instructional objectives or processes to be tested

3. the relative importance of individual topics or processes

On this basis, the number of items of each kind to be prepared on each topic can be established.
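The step from relative importance to item counts can be sketched in a few lines. This is only an illustration: the domain names, the weights and the total of 26 items are hypothetical, and the largest-remainder rounding is one possible choice rather than a rule stated in the slides.

```python
def allocate_items(weights, total_items):
    """Split total_items across topics in proportion to their weights.

    Uses largest-remainder rounding so the counts always sum to total_items.
    """
    total_weight = sum(weights.values())
    exact = {t: total_items * w / total_weight for t, w in weights.items()}
    counts = {t: int(x) for t, x in exact.items()}
    leftover = total_items - sum(counts.values())
    # Hand the remaining items to the topics with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for topic in by_remainder[:leftover]:
        counts[topic] += 1
    return counts


# Hypothetical test specification: relative importance of each domain.
blueprint = {"physical": 3, "psychological": 3, "social": 2, "environment": 2}
item_counts = allocate_items(blueprint, 26)
print(item_counts)
```

The weights themselves would come from the literature review and expert consultation described earlier; the code only turns them into a count per topic.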


Quality of Life Questionnaire

Question 1: Definition of quality of life

Question 2: The domains it comprises

Question 3: Relative importance of the domains

Question 4: Objectives of the domains


Qualitative Evaluation of Content Validity

In the development of multi-item rating scales, the content validity of the questionnaire items may be examined by using an expert panel, focus groups or in-depth interviews with respondents. Focus groups may be formed with a range of subjects representing typical extremes (for example very dissatisfied and very satisfied patients).


Think Aloud

Analysis of the types of common errors made on a test: test individuals with the instruction to “think aloud” while answering each question.

Without this, are you assuming that you understand the reasons for the choices respondents make?


Questionnaire Manual (Cont.)

The manual should provide information on the content areas or instructional objectives covered by the test, together with some indication of the number of items in each category.


Questionnaire Manual

The manual of a test should include a description of the procedures followed in ensuring that the test content is appropriate and representative.

The number and professional qualifications of the experts who participated should be stated.

Information should be provided about the number and nature of published materials consulted, including publication dates.





Questionnaire Administration

Who: Self-administered / Interview

Where: Home / Institution / Street

How: Face to face / Telephone


It is better to collect fewer questionnaires with good-quality responses than a large number of questionnaires that are inaccurate or incomplete.


Factors shown to increase response rates

The questionnaire is clearly designed and has a simple layout

It offers participants incentives or prizes in return for completion

It has been thoroughly piloted and tested

Participants are notified about the study in advance with a personalised invitation

The aim of study and means of completing the questionnaire are clearly explained

If using a postal questionnaire, a stamped addressed envelope is included

The participant feels they are a stakeholder in the study

Questions are phrased in a way that holds the participant’s attention

The questionnaire is appealing to look at





Question Wording

Clear-content questions:

Avoid double-barreled questions

Avoid ambiguous words

Avoid jargon

Value-free questions:

Questions must not be insulting

Questions should be non-judgmental

Judicious wording of sensitive questions


Double-barreled questions

A double-barreled question asks about two issues (dependent or independent) at the same time.

Example: The use of children in television advertising and the advertising of non-nutritious foods (such as cheese puffs) should be banned.


Ambiguous Wording

Using words that have different interpretations for different people.

Example: Increasing social justice should be one of the goals of development programs.


Jargon Words

Using words whose exact meaning is known only within a circle of professionals.

Example: The health system should include the prevention of opportunistic infections in the free care of AIDS patients.


Insulting or Judgmental Questions

The interviewee should not feel embarrassed or humiliated by questions.

Example: Did you fail to take your child for vaccination on time because of laziness?

Example: Have you not had a vasectomy so far because you want to put the burden of family planning on your wife's shoulders?


Question formats: Closed questions

A closed question has a number of predefined responses.

Consider including these responses: Others - Don’t know - No response - Not applicable - Unidentified

Closed-ended items often cause frustration, usually because researchers have not considered all potential responses.


Categories should be mutually exclusive and totally inclusive (exhaustive).

Use a wide range (extremes) for quantifying behaviors, especially socially undesirable ones, because people tend to choose the extreme categories less frequently. For example, for “How many drinks do you have in a day?” compare (A) 0 / 1-2 / 3-4 / 5+ with (B) 0 / 1-2 / 3-4 / 5-6 / 7-8 / 9-10 / 11-12 / 13+.

Likert scale variables (satisfaction, agreement, etc.)
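For numeric categories such as the drinks-per-day example, the "mutually exclusive and totally inclusive" requirement can be checked mechanically. The sketch below is an illustration only: the function name, the (start, end) representation of a category and the deliberately faulty scale are all assumptions made for the example.

```python
import math


def check_categories(categories, low, high):
    """Check integer response categories given as inclusive (start, end) pairs.

    Returns a list of problems; an empty list means the categories do not
    overlap and together cover every value from low to high.
    """
    problems = []
    ordered = sorted(categories)
    if ordered[0][0] > low:
        problems.append(f"no category for values {low}-{ordered[0][0] - 1}")
    for (a_start, a_end), (b_start, b_end) in zip(ordered, ordered[1:]):
        if b_start <= a_end:
            problems.append(f"overlap between {a_start}-{a_end} and {b_start}-{b_end}")
        elif b_start > a_end + 1:
            problems.append(f"gap: no category for values {a_end + 1}-{b_start - 1}")
    if ordered[-1][1] < high:
        problems.append(f"no category for values {ordered[-1][1] + 1}-{high}")
    return problems


# Scale (A) from the slide: 0 / 1-2 / 3-4 / 5+ (the open end is written as infinity).
scale_a = [(0, 0), (1, 2), (3, 4), (5, math.inf)]
# A deliberately faulty scale (hypothetical): 2 appears twice and 5 is missing.
faulty = [(0, 0), (1, 2), (2, 4), (6, math.inf)]

print(check_categories(scale_a, 0, math.inf))
print(check_categories(faulty, 0, math.inf))
```

A check like this only covers numeric ranges; for nominal categories the "Others" and "Not applicable" options mentioned earlier play the same role.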


Question formats: Open-ended questions

Questions whose possible responses we do not know well (or think we know, until the pilot shows otherwise)

Assessment of opinions, beliefs, attitudes

Sensitive, tense issues (death in the emergency department, extramarital pregnancy, child abuse, sexual violence)

Responses should be recorded in the respondent’s own words.


Pros and Cons of closed questions

Pros:

1. Rapid documentation of responses

2. Ease of analysis

3. Less susceptible to observer bias

Cons:

1. Less suitable for the illiterate

2. Discourage explanation

3. Too many closed questions in succession lead to loss of interest


Pros and cons of open questions

Pros:

1. Answers we did not know increase our insight into the subject

2. Encourage explanation & elaboration

Cons:

1. More susceptible to observer bias

2. Need more skill and experience on the part of the interviewer

3. More time-consuming analysis (categorization of responses)





Questions Sequence

General information should come first.

Sensitive issues should not come too early, but not too late either.

Attitude questions are better placed in the later stages of the questionnaire.


Question Sequence

Effect of sequence on response. Example of two consecutive statements:

Stalin's management style in the forced labor camps was inhumane.

My manager pays attention to the human needs of the staff under him or her.

Related questions should not be scattered, unless justified by technical issues.


Some respondents (known as yea sayers) tend to agree with statements rather than disagree. For this reason, do not present your items so that strongly agree always links to the same broad attitude.

For example, on a patient satisfaction scale, if one question is “my GP generally tries to help me out,” another question should be phrased in the negative, such as “the receptionists are usually impolite.”
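When some items are phrased in the negative, their scores have to be reversed before the items are summed, otherwise agreement with a negative statement would raise the satisfaction score. A minimal sketch, assuming a 5-point scale coded 1 (strongly disagree) to 5 (strongly agree); the item keys, including the third item, are hypothetical.

```python
def score_scale(responses, negative_items, low=1, high=5):
    """Sum Likert responses after reverse-coding negatively worded items."""
    total = 0
    for item, value in responses.items():
        if not low <= value <= high:
            raise ValueError(f"{item}: {value} is outside {low}-{high}")
        # On a 1-5 scale a reversed item maps 5 -> 1, 4 -> 2, and so on.
        total += (low + high - value) if item in negative_items else value
    return total


# A "yea sayer" who ticks "strongly agree" (5) for every statement.
answers = {"gp_helps": 5, "receptionists_impolite": 5, "easy_appointments": 5}
negative = {"receptionists_impolite"}
yea_sayer_score = score_scale(answers, negative)
print(yea_sayer_score)  # 5 + 1 + 5 = 11
```

With the negative item reversed, uniform agreement no longer produces the maximum score of 15, which is the point of mixing item directions.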





Face Validity: Definition

Face validity refers not to what the test actually measures, but to what it superficially appears to measure.


Face validity pertains to whether the test “looks valid” to the examinees who take it, the administrative personnel who decide on its use, and other technically untrained observers.


Face validity should never be regarded as a substitute for content validity.

It cannot be assumed that improving the face validity of a test will improve its content validity, nor can it be assumed that when a test is modified so as to increase its face validity, its validity remains unaltered.


If test content appears irrelevant, inappropriate, silly or childish, the result will be poor cooperation, regardless of the actual validity of the test.





Pilot

Most defects remaining from previous stages become evident to respondents, interviewers, or researchers at this stage.

Try to include some respondents from every subgroup defined by variables that affect responses (e.g. gender, age, education). As a trade-off that still covers these subgroups, pilot at least 20-30 questionnaires.

There is no definite sample size formula.


Pilot

- Change in the wording and/or number of questions: the pilot may reveal ambiguity, wording that suggests an unintended response, upsetting or embarrassing wording, or too many or too few questions on specific topic(s)

- Change in sources of data: the pilot may reveal that some sources do not readily give out data, or uncover readily available sources not considered before

- Inadequate performance of some questioners or supervisors: re-train or drop


Pilot

During piloting, take detailed notes on how participants react to both the general format of your instrument and the specific questions.

How long do people take to complete it? Do any questions need to be repeated or explained? How do participants indicate that they have arrived at an answer? Do they show confusion or surprise at a particular response—if so, why?


Reliability

Test-Retest

Internal Consistency


Test-Retest Reliability

[Figure: the same test given at time 1 and at time 2; agreement between the two sets of scores indicates stability over time]


Test-Retest Reliability

Considerations:

Assumes that scores will be stable over time

Correlation between the two tests depends on the amount of time between tests

The shorter the time gap, the higher the correlation

A time period of 2 to 6 weeks is optimal
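Test-retest reliability is commonly summarized as the correlation between the scores from the two administrations. The sketch below assumes Pearson's r is the chosen coefficient, and the scores of the six respondents are invented purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical total scores of the same six respondents, a few weeks apart.
time_1 = [12, 15, 9, 20, 17, 11]
time_2 = [13, 14, 10, 19, 18, 10]
print(round(pearson_r(time_1, time_2), 3))
```

Each position in the two lists must belong to the same respondent, which is why the questionnaire needs a respondent name or code.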


Split-half Reliability

Definition:

Randomly divide the test into two forms (A and B)

Calculate scores for Form A and Form B

Calculate Pearson r as the index of reliability
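The three steps of the definition can be sketched as follows. The response matrix (six respondents, six items) is invented, and the fixed random seed is only there to make the illustration reproducible; neither comes from the slides.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_r(scores, seed=0):
    """Randomly split the items into two forms and correlate the form scores.

    scores holds one list of item scores per respondent.
    """
    n_items = len(scores[0])
    items = list(range(n_items))
    random.Random(seed).shuffle(items)          # step 1: random split
    half_a, half_b = items[: n_items // 2], items[n_items // 2:]
    form_a = [sum(row[i] for i in half_a) for row in scores]  # step 2: form scores
    form_b = [sum(row[i] for i in half_b) for row in scores]
    return pearson_r(form_a, form_b)            # step 3: Pearson r


# Hypothetical answers of six respondents to a six-item scale (1-5 each).
data = [
    [4, 5, 4, 5, 4, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 1, 2, 1],
    [3, 4, 3, 4, 3, 3],
]
print(round(split_half_r(data), 2))
```

A different seed gives a different split and therefore a slightly different coefficient, which is why a single split-half value depends on how the items happened to be divided.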


[Figure: split-half correlation for a six-item test (items 1-6); one half consists of items 1, 3 and 4, and the correlation between the two halves is .87 (internal consistency reliability)]


Cronbach’s alpha & Kuder-Richardson-20

Measures the extent to which items on a test are homogeneous

Mean of all possible split-half combinations

Kuder-Richardson-20 (KR-20): for dichotomous data

Cronbach’s alpha: for non-dichotomous data
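In practice both coefficients are usually computed from item and total-score variances rather than by averaging split halves. The sketch below uses those standard variance-based formulas, which is an assumption about the computation and not something shown on the slides; population variances are used throughout, and both data sets are invented.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """KR-20 for items scored 0/1: item variance is p * (1 - p)."""
    n, k = len(scores), len(scores[0])
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(pi * (1 - pi) for pi in p) / total_var)


# Hypothetical 1-5 ratings: six respondents, six items.
likert = [
    [4, 5, 4, 5, 4, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 1, 2, 1],
    [3, 4, 3, 4, 3, 3],
]
# Hypothetical right/wrong (1/0) answers: six respondents, four items.
binary = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(likert), 2))
print(round(kr20(binary), 2), round(cronbach_alpha(binary), 2))
```

On 0/1 data the two functions return the same value, because the population variance of a dichotomous item equals p(1 - p); KR-20 is the special case of alpha for dichotomous items.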


[Figure: Cronbach’s alpha (α) for a six-item test (items 1-6), shown as the mean of all possible split-half correlations: SH1 = .87, SH2 = .85, SH3 = .91, SH4 = .83, SH5 = .86, ..., SHn = .85; α = .85 (internal consistency reliability)]


Identification Section

Includes the time, place and title of the study (blindness taken into account)

Brief introduction of the study to stimulate cooperation of the respondent

Name and signature of the interviewer and supervisor (+/- reviewer)

Serial No. (+/- address & record No.)

Respondent’s name or “code” (more power of the paired t-test)


Formatting

Each question must have a number.

For questions/variables with categorized answers: Each answer must have a number.
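Numbering every question and every answer makes data entry easy to check. The sketch below shows one possible way to hold such a codebook; the two questions, their answer codes and the use of 9 for "No response" are hypothetical choices, not part of the slides.

```python
# Hypothetical codebook: question number -> question text and numbered answers.
CODEBOOK = {
    1: {"text": "Sex", "codes": {1: "Male", 2: "Female", 9: "No response"}},
    2: {
        "text": "Overall satisfaction with the service",
        "codes": {
            1: "Very dissatisfied",
            2: "Dissatisfied",
            3: "Neutral",
            4: "Satisfied",
            5: "Very satisfied",
            9: "No response",
        },
    },
}


def invalid_answers(record, codebook):
    """Return the question numbers whose recorded code is missing or undefined."""
    return [q for q, spec in codebook.items() if record.get(q) not in spec["codes"]]


print(invalid_answers({1: 2, 2: 4}, CODEBOOK))  # every code is defined
print(invalid_answers({1: 2, 2: 7}, CODEBOOK))  # code 7 is not defined for question 2
```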


Standard Questionnaires

Increasingly, health services research uses standard questionnaires designed for producing data that can be compared across studies. For example, clinical trials routinely include measures of patients’ knowledge about a disease, satisfaction with services, or health related quality of life.

The validity of this approach depends on whether the type and range of closed responses reflects the full range of perceptions and feelings that people in all the different potential sampling frames might hold. Importantly, health status and quality of life instruments lose their validity when used beyond the context in which they were developed.

Thanks
