Principles and methods of classroom test construction CHONG WAN LYNN & ELANIE TAN
“...developing a test is easy; developing a good test requires knowledge, skill, [and] time…” GALLAGHAR, 1998
By the end of this session, you will be able to:
● List at least 4 objective-type items
● Name at least 3 guidelines for writing objective-type items
● List at least 2 subjective-type items
● Name at least 2 guidelines for writing subjective-type items
● State 5 characteristics of a good test
● State the process of building a test blueprint
TYPES OF ASSESSMENT ITEMS
OBJECTIVE TYPES (Selected Response Tests)
● Multiple Choice Items*
● Binary-Choice Items*
● Multiple Binary-Choice Items*
● Matching Items*
● Assertion-Reason Questions
● Multiple Response
● Sequencing*
● Fill in the blanks
SUBJECTIVE TYPES (Constructed Response Tests)
● Short Answer
● Restricted Response
● Extended Response
(Gronlund, 2003)
Test items are good when... • they match the intended learning outcome • the learning outcome is well defined • their contribution to measurement error is minimised • the format suits the purpose of the test • they are well written, and follow an agreed style • they satisfy legal and ethical considerations. (Irving, 2005)
Objective Test • Select a response from a list of options • items that can be objectively scored
1. Multiple Choice Items
Parts of a multiple choice item:
● Stem
● Options / alternatives
● Distractors
● Key
Reasons to use multiple choice items
• Easy to score
• Easy to sample widely across domain of interest
• Highly manageable
• Raise mean achievement – fewer missing data responses
Disadvantages of MCQs
• Hard to write quality items
• Believed to test surface-level processing, usually because of poor construction
• Guessing factor
• May require good reading level
(A) Rules for Writing Good Stems
Guidelines
1. The stem should be meaningful by itself and present a definite problem / task
➔ a student who knows the content should be able to answer before reading the options
2. State the stem in positive form. Use a negatively stated stem only when significant learning outcomes require it.
➔ When used, highlight (underline/capitalize/bold) the negative word.
3. Avoid window dressing (excessive verbiage)
➔ eliminate irrelevant information from the stem
4. Include the central idea and most of the phrasing in the stem
➔ Load the stem, keep the options light
5. Avoid irrelevant clues such as grammatical structure, well-known verbal associations or connections between the stem and answer
6. Present practical or real-world situations to students
7. Use pictorial materials that require students to apply principles and concepts
8. Use charts, figures or tables that require interpretation
(B) Rules for Writing Good Options
Guidelines
1. Word the options clearly and concisely, preferably positively
2. Keep the options mutually exclusive, i.e. independent and not overlapping
3. Keep the options homogeneous in content
4. Keep the options free from clues to the correct response
- be grammatically consistent with the stem
- keep the options almost similar in length
- avoid textbook language or verbatim phrasing
- be careful with the use of specific determiners (never, always, only, all)
- avoid repeating keywords from the stem in any of the options
5. Generally, avoid the options “all of the above” and “none of the above”
6. Avoid the use of humour when developing options
7. Be sure that there is only one correct or clearly best answer
8. When possible, place options in some logical order (e.g. chronological, numerical, alphabetical)
9. Make all the options plausible and attractive to the less knowledgeable or skilful students
(C) Rules for Writing Distractors
Guidelines
Use plausible distractors
Techniques that make distractors more plausible
1. Incorporate common errors or misconceptions of students in distractors
2. Use familiar yet incorrect phrases as distractors, or use true statements that are reasonably close to the correct answer but do not correctly answer the item
(D) Procedural and content-related rules for writing MCQ
Guidelines
1. Focus on a single problem or idea for each test item
2. Test for significant information - base each item on a learning outcome of the course, not trivial information
3. Use good grammar, punctuation and spelling consistently
4. Minimise time required to read each item
5. Keep vocabulary consistent with the examinees’ level of understanding. Use straightforward language.
6. Avoid items based on opinions
7. Randomly distribute the correct option (key) among the alternative positions throughout the test.
➔ have approximately the same proportion of options A, B, C, D as the correct response
8. Avoid cueing one item with another; keep items independent of one another
9. Be consistent and clear in the presentation/layout of the items
➔ e.g. format the item vertically, not horizontally
➔ avoid crowding too many questions on one page
10. Keep all parts of an item on one page
➔ avoid changing pages in the middle of an item
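Guideline 7 above (randomly distributing the key) can be checked mechanically. The following is a minimal sketch, not a prescribed procedure: the item structure (a dictionary with `stem`, `options` and `key_index`) and the function names `shuffle_options` and `key_position_counts` are hypothetical, chosen only for illustration. It assumes every item has four distinct options.

```python
import random
from collections import Counter

LETTERS = "ABCD"

def shuffle_options(item, rng):
    """Return a copy of the item with its options shuffled and the key position updated."""
    options = item["options"][:]          # copy, so the draft item is left untouched
    correct = options[item["key_index"]]  # remember the text of the correct option
    rng.shuffle(options)
    return {"stem": item["stem"], "options": options, "key_index": options.index(correct)}

def key_position_counts(items):
    """Count how often each position (A, B, C, D) holds the key."""
    return Counter(LETTERS[item["key_index"]] for item in items)

# Hypothetical 40-item draft in which the writer always typed the key first.
draft = [
    {"stem": f"Item {n}", "options": ["right", "wrong 1", "wrong 2", "wrong 3"], "key_index": 0}
    for n in range(1, 41)
]

rng = random.Random(2016)  # fixed seed so the same shuffle can be reproduced
shuffled = [shuffle_options(item, rng) for item in draft]

print("Before shuffling:", dict(key_position_counts(draft)))
print("After shuffling: ", dict(key_position_counts(shuffled)))
```

Shuffling suits items whose options have no natural order. Where options are placed in a logical order (rule B8), the counts from `key_position_counts` can still be used to check that the key is not concentrated in one position.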
2. Binary-Choice Items (Alternative-Response Type)
● True/False, Yes/No, Right/Wrong, Correct/Incorrect
● High guessing factor, so not preferred
● Can be improved by asking students to correct the statement when the answer is false
● Can often be easily converted to MCQ
● Less demand on reading ability (compared to MCQs)
● Can administer a large number of questions in a short time
● Scoring is easy, objective and reliable
2. Binary-Choice Items (Cont)
● Include only one central idea in each statement
● Be specific and precise
● Avoid double negatives
● Avoid opinions
Write TRUE / FALSE in the box.
1. The sun rises from the west.
2. An elephant is the biggest land animal.
3. A firefighter works in a hospital.
4. Rose plant is our national flower.
5. The hibiscus plant has sharp thorns.
3. Multiple Binary-Choice Items
● Same as binary-choice items, but students respond to several True/False or Right/Wrong statements within one item
● Reduces guessing – harder than MCQ!
● Provides more coverage of curriculum
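The guessing factor can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes blind, independent guessing, and it assumes a multiple binary-choice item earns its mark only when every statement is answered correctly (a scoring rule the slides do not specify). The function name `chance_of_full_marks` is hypothetical.

```python
from fractions import Fraction

def chance_of_full_marks(choices_per_response, number_of_responses):
    """Probability of getting every response right by blind guessing."""
    return Fraction(1, choices_per_response) ** number_of_responses

single_true_false = chance_of_full_marks(2, 1)  # one True/False statement
four_option_mcq = chance_of_full_marks(4, 1)    # one MCQ with options A to D
five_true_false = chance_of_full_marks(2, 5)    # five statements, all must be right

print(single_true_false, four_option_mcq, five_true_false)  # 1/2 1/4 1/32
```

Under these assumptions a single True/False statement is easier to guess than a four-option MCQ, while a cluster of five statements is harder to guess than either.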
4. Matching Items
● Some major disadvantages:
* if one is wrong, then at least two must be wrong
* if the others are correct, no knowledge is required to get the last one correct
* very difficult to test higher-order thinking skills (HOTS)
Item-writing guidelines:
1. Employ homogeneous lists.
2. Use relatively brief lists, placing the shorter words or phrases at the right.
3. Employ more responses than premises.
4. Order the responses logically.
5. Describe the basis for matching and the number of times responses may be used.
6. Place all premises and responses for an item on a single page.
4. Matching Items
(a) Match the phrases in List A to the suitable phrases in List B. An example is given below. [2 marks]

List A (Premises)                List B (Responses)
Novels are sold at               held at Aliran Bookstore.
Lucky shoppers will              a discount of 50%-70%.
Happy Hour will be               receive door gift.
The biggest book sale will be    announced every morning at 10 a.m.
5. Multiple Response Items
▶ A variation of multiple choice
▶ Students can choose more than one answer
6. Sequencing
▶ Options are given and are to be arranged according to order or priority.
Arrange the following things based on our basic needs. 1 is the most important, 5 is the least important.
( ) Food
( ) Clothes
( ) Houses
( ) Luxury cars
( ) Jewellery
7. Assertion-Reason Questions
▶ Combine multiple choice and true/false questions
▶ Test more complicated issues
▶ Require a higher level of thinking
▶ Consist of two statements: an assertion and a reason
Example: Dolphins deliver babies because they are mammals.
a) Both assertion and reason are true, and the reason is the correct explanation of the assertion.
b) Both are true, but the reason is NOT a correct explanation of the assertion.
c) The assertion is true and the reason is false.
d) The assertion is false and the reason is true.
e) Both assertion and reason are false.
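The five options a) to e) follow from three judgements: whether the assertion is true, whether the reason is true, and whether the reason explains the assertion. The sketch below shows that mapping as a small decision function. The name `assertion_reason_key` is hypothetical, and the three judgements themselves must still come from the item writer.

```python
def assertion_reason_key(assertion_true, reason_true, reason_explains):
    """Return the option letter (a to e) that is the key for an assertion-reason item."""
    if assertion_true and reason_true:
        # Both statements are true; the key depends on whether the reason explains the assertion.
        return "a" if reason_explains else "b"
    if assertion_true:
        return "c"  # assertion true, reason false
    if reason_true:
        return "d"  # assertion false, reason true
    return "e"      # both false

# An item whose assertion and reason are both judged true, with the reason
# judged a correct explanation, is keyed "a".
print(assertion_reason_key(True, True, True))
```

When either statement is false, the third argument is ignored, because a false statement cannot serve as a correct explanation in this scheme.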
8. Fill in the Blanks ▶ Certain important words or phrases are omitted and students are expected to fill in missing words. Item-Writing Guidelines ● Key words should not be missed ● Don’t take sentences directly from the text ● Don’t have too many blanks in the statement
SEKOLAH KEBANGSAAN SERI SITIAWAN
ENGLISH LANGUAGE SOCIETY
SINGING COMPETITION
Day/Date : Wednesday / 20 April 2016
Time : 8.30 a.m. – 1.20 p.m.
Venue : Music Studio
The competition is open to the level two pupils of Sekolah Kebangsaan Seri Sitiawan. Those who are interested may give your names to the teacher-in-charge, Mr. Hew Sing Meng.
The Secretary, English Language Society,
Zairel Abas

Based on the notice given, complete the text below with the correct information.

The English Language Society of Sekolah Kebangsaan Seri Sitiawan, will be having a ______________________________ (1). The competition is on Wednesday, ________________(2). It is from _____________________________(3) to 1.20 p.m. It will be held in the _____________________(4). The competition is open to the level two pupils of the school. Those who are interested can give their names to the teacher-in-charge, __________________________ (5).
Subjective Test
▶ Known as constructed-response or ‘supply-type’ items
▶ Require students to produce what they know
▶ Easy to construct
▶ Can be quite time-consuming to answer
Please download the video ‘popham_constructed learning.mp4’ from the folder and watch it! Taken from: http://www.k-state.edu/ksde/alp/module8/
1. Short-Answer Items
• Students supply a word, short phrase, number or brief response.
• Students must recall or create their answer
• Efficient in assessing lower-level thinking skills
Example: Provide two reasons why one should participate in outdoor activities.
______________________________________________________________
______________________________________________________________
1. Short-Answer Items
Item-Writing Guidelines
● Choose direct questions over incomplete statements.
● Structure an item so that it seeks a brief, unique response
● Use only one or two blanks (incomplete statements)
● Place the blank at the end of the statement
● Standardize the length of blanks
● Provide sufficient space for answers
Essay questions
▶ Used to gauge a student’s ability to synthesize, evaluate and compose.
▶ Strengths: can be used to measure complex learning outcomes
▶ Weaknesses: they are difficult to write properly, and scoring responses reliably can also be a challenge.
Essay questions- Item writing Guidelines
1. Provide a clear idea of the extensiveness of the response desired. Whether the item is restricted-response or extended-response, provide a certain amount of space or a word limit.
2. Describe the student’s task explicitly.
3. Provide students with the approximate time to be spent on each item, as well as each item’s value.
4. Do not employ optional items.
5. Create a trial response to the item.
Essay questions- Guidelines to Writing Prompts ● Present a clearly formulated problem or situation ● Provide specific instructions that tell students everything they need to do ● Present the instructions in the form of statements rather than questions whenever possible (e.g., “Explain three reasons . . .” rather than “What are three reasons . . .”). ● Avoid unnecessary detail in both the prompt and instructions. --Taking Center Stage, (Sacramento: California Department of Education, 2001), pp.74,75
Essay questions- Restricted Response
● Place strict limits on the answer
● Restrict the form and scope of answer
● Have more specific learning outcomes
● Score more easily
● Measure comprehension, application and analysis
Essay questions- Extended Response
● Unlimited freedom to determine form and scope
● Demonstrate skills of synthesis and evaluation
● Less reliable
Example:
Section B : Continuous Writing [50 marks] [Time suggested : One hour]
Write a composition of about 350 words on one of the following topics.
1. Describe what makes you happy and explain why.
2. Social networking has caused a lot of problems. How far do you agree?
3. Why is having good neighbours important?
4. Write a story about someone you know who took a big risk and had a good result. Begin your story with: “Everybody said that the plan would never work. It was far too risky … ”
5. ‘Honesty is always the best policy.’ Describe an experience when this was true for you.
Evaluating MCQs
An example of a flawed MCQ
Good writing is
A. predicated on the eschewing of obfuscatory verbiage
B. the culmination of a euphoric and ethereal procreation
C. the residue of relentless, onerous effort
D. all of the above
Design flaws:
1. The stem should be a self-contained question or problem.
2. The stem should contain as much of the item content as possible.
3. The vocabulary is too difficult: many of the words used are far too obscure. Use simpler and more comprehensible words.
4. “All of the above” should be avoided.
Characteristics of a Good Test…
1. Validity (Kesahan)
2. Reliability (Kebolehpercayaan)
3. Objectivity
4. Fairness
5. Practicality
1. VALIDITY
• The validity of a tool/instrument means how well it measures what it is supposed to measure.
• Example: A valid test measures:
➢ what the teacher intended for the students to learn
➢ what the teacher actually taught
• A valid test is FAIR
Questions about Validity • Does the test actually measure what you intend it to measure? • Did you teach the content and skills that are being tested? • Does the test require the student to know or do something other than what you intended and/or taught? • Does some aspect of the test prevent the student who may know the material from responding correctly?
Content validity • As a teacher you will need to have content validity in your tests. • To ensure content validity of a test, the test should: ➔ be based on a representative sample of learning outcomes in a curriculum ➔ have all the items relevant to the learning outcomes which have been chosen for that sample
An example to illustrate how an item could be valid or not in relation to the learning outcome.
Learning Outcome : Given a list of animals, select those which can fly.

Test Items                                                             Comment on Validity
1. Explain why some animals can fly.                                   Not Valid
2. In the given list, tick ( √ ) those which are mammals.              Not Valid
3. In the following list of animals, tick ( / ) those which can fly.   Valid
2. RELIABILITY
• A reliable instrument provides accurate and consistent results
• Example: A perfectly reliable test would give identical results under all conditions.
However, there are some factors that contribute to the unreliability of a test:
1. Student-Related Reliability
- Fluctuations in the students (physical health, memory, guessing, fatigue, forgetting etc.)
2. Rater Reliability (Inter-rater and Intra-rater)
- Fluctuations in the scoring between raters (lack of attention to scoring criteria, inexperience, inattention or preconceived biases)
- Fluctuations within one rater (unclear scoring criteria, fatigue, bias towards particular “good” or “bad” students)
3. Test Administration Reliability
- Fluctuations in test administration (the conditions in which the test is administered)
4. Test Reliability
- Fluctuations in the test itself (the test is too long or has a time limit)
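Fluctuations in scoring can be made visible by comparing the marks two raters give to the same scripts. The sketch below computes the proportion of scripts on which two raters agree exactly. This is only one simple indicator and is not taken from the slides; the function name `exact_agreement` and the scores are hypothetical.

```python
def exact_agreement(scores_a, scores_b):
    """Proportion of scripts on which two raters awarded exactly the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("Both raters must score the same, non-empty set of scripts")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)

# Hypothetical scores (out of 5) given by two raters to the same eight essays.
rater_1 = [4, 3, 5, 2, 4, 3, 5, 1]
rater_2 = [4, 3, 4, 2, 4, 2, 5, 1]

print(exact_agreement(rater_1, rater_2))  # 0.75
```

A low proportion suggests that the scoring criteria need to be clarified before the marks are used.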
What is the relationship between Reliability and Validity?
RELIABILITY AND VALIDITY • Reliability has to do with consistency • Validity has to do with accuracy • To have validity we must first have reliability – i.e. reliability is a prerequisite for validity
• Reliability is a necessary but not sufficient condition for validity
Reliability and Validity Reliable √ , Valid X
Reliable X , Valid X
Valid √ and Reliable √
3. OBJECTIVITY
• Refers to the degree to which equally competent scorers obtain the same results
• An inconsistent scorer/marker will affect the objectivity of the measures
• Also refers to the element of ‘fairness’ when designing the test
• The test has to be fair to the students, i.e. they are tested on what they should know (after the necessary input or lesson)
• Relates to what has been taught and learnt
• Not influenced by personal beliefs or feelings
• Test is based on standard curriculum, syllabus and specifications.
4. FAIRNESS
• Equal opportunities to all students
➔ to learn what is being assessed.
➔ to demonstrate achievement.
• Unbiased and non-discriminatory ➔ not influenced by irrelevant or subjective factors like race, gender, ethnic background, handicapping condition etc.
• Students are told about the assessment. ➔ clear what will and will not be tested ➔ how they will be scored.
5. PRACTICALITY Smooth implementation in a classroom or an examination hall.
➢ Time efficient ➢ Easily manageable ➢ Cost efficient ➢ Interpretability
*Test blueprint
- The test blueprint, sometimes also called the table of specifications, provides a listing of the major content areas and cognitive levels intended to be included on each test form.
- It also includes the number of items each test form should include within each of these content and cognitive areas.
Features of a test blueprint
*It is a matrix or chart reporting the numbers and types of test questions.
*The questions represent the topics in the content area.
*The questions are based on the learning objectives from each topic.
*It also identifies the percentage (%) weighting of cognitive dimensions.
Below is an example of a test blueprint for a 40-item exam:

          Knowledge   Comprehension   Application   Analysis   Synthesis   Evaluation   TOTAL
Topic A   1           2               4             3          0           0            10 (25%)
Topic B   2           1               4             2          1           0            10 (25%)
Topic C   1           2               3             3          0           1            10 (25%)
Topic D   1           2               4             2          1           0            10 (25%)
TOTAL     5 (12.5%)   7 (17.5%)       15 (37.5%)    10 (25%)   2 (5%)      1 (2.5%)     40 (100%)

the ratio 3:5:2
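The totals and percentages in the example blueprint can be reproduced with a short script, which is a convenient way to check a blueprint before writing items. This is a minimal sketch: the item counts are copied from the example above, while the variable and function names are hypothetical.

```python
LEVELS = ["Knowledge", "Comprehension", "Application", "Analysis", "Synthesis", "Evaluation"]

# Item counts per topic, in the same order as LEVELS (copied from the example blueprint).
blueprint = {
    "Topic A": [1, 2, 4, 3, 0, 0],
    "Topic B": [2, 1, 4, 2, 1, 0],
    "Topic C": [1, 2, 3, 3, 0, 1],
    "Topic D": [1, 2, 4, 2, 1, 0],
}

def topic_totals(table):
    """Number of items allocated to each topic (row totals)."""
    return {topic: sum(counts) for topic, counts in table.items()}

def level_totals(table):
    """Number of items allocated to each cognitive level (column totals)."""
    return [sum(column) for column in zip(*table.values())]

def percentage(count, total):
    """Share of the whole test, as a percentage."""
    return 100 * count / total

total_items = sum(topic_totals(blueprint).values())

for topic, count in topic_totals(blueprint).items():
    print(f"{topic}: {count} ({percentage(count, total_items):g}%)")
for level, count in zip(LEVELS, level_totals(blueprint)):
    print(f"{level}: {count} ({percentage(count, total_items):g}%)")
print(f"TOTAL: {total_items}")
```

If a row or column total does not match the intended weighting, the counts in `blueprint` can be adjusted and the script run again.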
Benefits of a blueprint
*Gives feedback on students’ progress and on teachers’ delivery of the curriculum.
*From the student’s point of view, shows how well they attain the objectives.
*Provides a guide to both students and teachers.
*Determines the reliability and validity of the examination.
*Bloom’s taxonomy helps in developing all of the written questions and some aspects of the practical questions.
Steps to form a test blueprint Content analysis
Determining the types of questions
Determination of learning objectives
Determination of no. of items for each topic based on learning objectives
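The last step, deciding the number of items for each topic, is often done in proportion to a weight per topic. The sketch below shows one possible way to do this: a largest-remainder allocation, which is a technique chosen here for illustration and not one named in the slides. The function name `allocate_items` and the weights (for example, the number of learning objectives per topic) are hypothetical.

```python
def allocate_items(weights, total_items):
    """Split total_items across topics in proportion to their weights.

    Each topic first receives the whole-number part of its share; any items
    left over go to the topics with the largest remainders.
    """
    weight_sum = sum(weights.values())
    counts = {topic: (w * total_items) // weight_sum for topic, w in weights.items()}
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(
        weights,
        key=lambda topic: (weights[topic] * total_items) % weight_sum,
        reverse=True,
    )
    for topic in by_remainder[:leftover]:
        counts[topic] += 1
    return counts

# Hypothetical weights: number of learning objectives per topic.
print(allocate_items({"Topic A": 5, "Topic B": 4, "Topic C": 3}, 40))
# {'Topic A': 17, 'Topic B': 13, 'Topic C': 10}

# Equal weights give the 10 items per topic used in the example blueprint.
print(allocate_items({"Topic A": 1, "Topic B": 1, "Topic C": 1, "Topic D": 1}, 40))
```

Integer arithmetic is used throughout so that the allocated counts always add up to the requested number of items.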
A blueprint is a TOOL to…. * ensure alignment of assessment to standards - content - depth of knowledge
* increase the validity of an assessment
References
* Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. United States of America: Pearson Education, Inc.
* Dawn, M. Z. (2010). Writing Good Multiple-Choice Exams. Austin: Center for Teaching and Learning.
* Gronlund, N. E. (2003). Assessment of student achievement. Boston: Allyn and Bacon.
* Popham, W. J. (2005). Classroom Assessment: What teachers need to know. Boston: Allyn and Bacon.
* Lembaga Peperiksaan Malaysia (2005)
* Lembaga Peperiksaan Kementerian Pendidikan Malaysia (2013). Pentaksiran Kemahiran Berfikir Aras Tinggi.
* http://www.caacentre.ac.uk/resources/objective_tests/index.shtml
* http://www.slideshare.net/kirankushwaha129/blueprint-in-education
* www.protesting.com/test_topics/steps_3.php
* http://www.aspiringminds.com/research-articles/how-to-create-a-test-blueprint
* http://www.k-state.edu/ksde/alp/module8/
* http://publish.uwo.ca/~craven/504/504r&v.htm
Self check
In your mind...
a) list 4 objective-type items.
b) name 3 guidelines for writing objective-type items.
c) list 2 subjective-type items.
d) name 2 guidelines for writing subjective-type items.
e) state 5 characteristics of a good test.
f) state the 4 processes of building a test blueprint.