Assessment

Creating assessments and exam success — Part 1: Creating fair assessments

📝 Part 1 of 2 · Test design · Validity & reliability · Marking schemes · ⏱ 20 minutes

Personal Reflection

Watch: Creating Fair Assessments — Reflection Questions

Most teachers create some kind of test. A short quiz on Friday. A unit test at the end of a chapter. A monthly progress check. Even when big exams are decided by the education authority, teachers still build their own assessments most weeks.

But here is the question many of us never get asked: was this a good test?

A test can look fine on paper and still measure the wrong thing. A test can give clear marks but be unfair to half the class. A test can have nice questions but tell us nothing about what students actually need to learn next.

In Part 1 of this series, we will look at four practical principles for creating assessments that are fair, that measure what they say they measure, and that actually help students learn. None of this needs new equipment or training. It just needs you to ask better questions before you write your next test.

Q1. How confident are you in the assessments you create yourself?

I just write tests; I’ve never been taught how
I design tests carefully and feel they are fair

Q2. Which of these are real challenges in your assessment work? (Tick all that apply)

I am not always sure what my tests are really measuring
The same students always fail, and I am not sure why
Students who missed lessons cannot do the test, no matter how hard they try
The instructions in my tests are sometimes harder than the language being tested
I do not have a clear marking scheme, so I just give a feeling-based mark
Students get marks back but never know how to improve
I have to mark on a curve where some students must fail
I was never trained how to make a fair test

Most teachers were never formally trained in assessment design. The good news is that the principles are simple and learnable — you do not need a degree in measurement
If the same students always fail, the test may be unfair, not the students unable. A poorly designed test punishes students who missed classes, who are weaker readers, or who do not know the question type
Instructions that are harder than the language tested are a hidden trap. If a student fails because they did not understand the instructions, you have measured their reading of instructions, not the skill the test was supposed to assess
If you have to mark on a curve where some students must fail, that is norm-referenced assessment, and there are fairer alternatives. We will compare the options below
Marks without feedback are the biggest waste in education. A student who fails and does not know why will fail again. We will return to this in Section 3

What Makes an Assessment Fair?

[Image: A teacher carefully marking exam papers with a marking scheme open beside them]

“If you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid.”

— Matthew Kelly

An assessment is not just a tool for giving marks. It is a tool for learning.

A good assessment tells you what your students can do, where they need help, and what to teach next. It tells the student where they stand and how to improve. A bad assessment just produces a number that ranks people — without helping anyone learn.

Below are four principles for designing assessments that are genuinely fair and useful. They come in the order you should think about them: before you write the test (purpose, type), while you write it (validity, good tasks), and after the test happens (marking and feedback).

First — what kind of comparison?

There are three ways to make sense of a student’s test mark. Each has a very different impact on motivation.

Norm-referenced

Compare to other students

Top 20% pass, bottom 20% fail. Common but demotivating — some students must always fail, no matter how much they have learned.

Criteria-referenced

Compare to a clear standard

“Can the student introduce themselves in English?” If yes, they pass. The standard is fixed, not based on classmates’ performance. Fairer.

Ipsative

Compare to past self

“Has the student improved since last term?” Measures “distance travelled.” Especially powerful for weaker students — everyone can show progress.

In challenging settings, a mix of criteria-referenced (for big tests) and ipsative (for tracking individual growth) is far more useful than norm-referenced grading. Even when the official exam is norm-referenced, your classroom assessments do not have to be.
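
To make the difference concrete, here is a minimal Python sketch that reads the same set of marks in all three ways. The student names, the marks, the pass mark of 50, and the 20% cut-offs are all invented for illustration; they are assumptions, not recommendations.

```python
# Hypothetical marks out of 100 for this term and last term (invented data).
this_term = {"Amina": 82, "Ben": 64, "Chen": 55, "Dara": 41, "Eli": 38}
last_term = {"Amina": 80, "Ben": 66, "Chen": 40, "Dara": 25, "Eli": 37}

PASS_MARK = 50  # assumed fixed standard for the criteria-referenced view


def norm_referenced(marks, top_share=0.2, bottom_share=0.2):
    """Label students by rank only: the top share pass, the bottom share fail."""
    ranked = sorted(marks, key=marks.get, reverse=True)
    n_top = max(1, round(len(ranked) * top_share))
    n_bottom = max(1, round(len(ranked) * bottom_share))
    labels = {name: "middle" for name in ranked}
    for name in ranked[:n_top]:
        labels[name] = "pass"
    for name in ranked[-n_bottom:]:
        labels[name] = "fail"
    return labels


def criteria_referenced(marks, pass_mark=PASS_MARK):
    """Label each student against the same fixed standard."""
    return {
        name: "pass" if mark >= pass_mark else "not yet"
        for name, mark in marks.items()
    }


def ipsative(now, before):
    """Report each student's change since the previous test."""
    return {name: now[name] - before[name] for name in now}


print(norm_referenced(this_term))
print(criteria_referenced(this_term))
print(ipsative(this_term, last_term))
```

Read this way, Dara is "not yet" against the fixed standard but shows the largest gain in the class (16 marks). That is the kind of progress an ipsative record makes visible and a ranking hides.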

Four principles for designing fair assessments

Principle 1

Be clear what you are testing — and why

Before writing a single question, answer three questions: What am I testing? Why am I testing this? What will I do with the results? If you cannot answer all three, do not write the test yet. Many tests fail because they were never designed for a clear purpose.

Try this: Write the purpose at the top of your draft test. “To check if students can use the past simple tense in a short personal account.” Now every question must serve that purpose. Anything that does not serve it gets cut.

Principle 2

Make sure it is valid and reliable

Valid: the test measures what you say it measures (a reading test should test reading, not memory of the textbook). Reliable: if you ran the test again with similar students, you would get similar results.

Try this — the 5-question check: Before using a test, ask: 1) Does it focus on what was studied in class? 2) Will students think it is fair? 3) Does it actually assess what it says? 4) If a student missed classes, can they still show what they know? 5) Would the same answer get the same mark from a different marker?
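
Question 5 can be checked with a small experiment: ask a colleague to mark the same scripts independently, then count how often you agree. Below is a minimal Python sketch of that count. The two lists of marks are invented, the function name `agreement_rate` is a hypothetical helper, and exact-match agreement is only one simple way to look at marker consistency.

```python
def agreement_rate(marks_a, marks_b):
    """Share of scripts where both markers gave exactly the same mark."""
    if len(marks_a) != len(marks_b) or not marks_a:
        raise ValueError("Both markers must mark the same, non-empty set of scripts")
    same = sum(a == b for a, b in zip(marks_a, marks_b))
    return same / len(marks_a)


# Invented marks (0-3) given by two markers to the same six scripts.
marker_1 = [3, 2, 2, 1, 0, 3]
marker_2 = [3, 2, 1, 1, 0, 2]

rate = agreement_rate(marker_1, marker_2)
print(f"Markers agreed on {rate:.0%} of scripts")
```

Here the markers agree on four of the six scripts. The two scripts where they differ are the ones worth discussing, because they usually point to a criterion that is not yet clear enough.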

Principle 3

Write good test tasks

Tasks should be short and simple — but include all needed information. The language of the instructions should not be harder than the language being tested. Show how many marks each task is worth. Make sure students are familiar with the task type — the test should not be the first time they see this format.

Try this: Read your test instructions aloud as if you were a weaker student. If you stumble or use vocabulary they do not know, simplify. The instructions should be invisible — never the obstacle.

Principle 4

Have a clear marking scheme

For objective tasks (multiple choice, gap-fills) marking is easy — right or wrong. For subjective tasks (writing, speaking), you need clear criteria. Without a marking scheme, two markers will give different marks — or the same marker will mark differently on different days.

Try this: For a subjective task worth 3 marks, write what each level looks like. 3: clear answer, key information, accurate language. 2: reasonable information, mostly accurate. 1: limited information, some accuracy. 0: does not answer the question. Now any marker can be consistent.
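
The same scheme can be written down as a simple lookup, so that every mark is returned together with its description. This is a minimal Python sketch: the level descriptions are copied from the example above, while the function `mark_with_feedback` and the "To reach" wording are hypothetical additions for illustration.

```python
# Level descriptions for a 3-mark subjective task, taken from the example above.
MARKING_SCHEME = {
    3: "clear answer, key information, accurate language",
    2: "reasonable information, mostly accurate",
    1: "limited information, some accuracy",
    0: "does not answer the question",
}


def mark_with_feedback(mark):
    """Return the mark with its description and, if there is one, the next level up."""
    if mark not in MARKING_SCHEME:
        raise ValueError(f"Mark must be one of {sorted(MARKING_SCHEME)}")
    feedback = f"{mark}/3: {MARKING_SCHEME[mark]}"
    next_level = mark + 1
    if next_level in MARKING_SCHEME:
        feedback += f". To reach {next_level}: {MARKING_SCHEME[next_level]}"
    return feedback


print(mark_with_feedback(2))
```

Pairing each mark with the description of the next level is one way to make sure the student sees what to improve, not only the number.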

Q3. Think about a test you have given recently. Run it through the four principles — where was it strongest? Where was it weakest?

Most teachers find one or two principles they did well and one or two they did not even consider. That is normal. The first step is noticing.

The most commonly missed principle is Principle 1 — clear purpose. Many tests are written because “it is the end of the chapter”, not because the teacher has decided what they want to find out
The second most missed is Principle 4 — clear marking scheme for subjective tasks. Without one, marking varies by mood, time of day, or which student went before
Validity (Principle 2) is often broken by tests that accidentally measure something else — reading speed instead of comprehension, or memory of the textbook story rather than language ability
If your test instructions use harder language than the test itself, weaker students fail before they even reach the questions. This is one of the most common — and most fixable — problems
You do not need to redesign every test. Pick one principle to apply to your next assessment. Add another to the next one. Over a term, your tests get steadily fairer

What Could the Teacher Do?

Q4. For each common assessment problem, choose the better response.

These are real moments from teachers’ experience. The right answer is the one most likely to make assessments fair and useful.

1. You are writing a reading test. The reading text is at the right level for students. Should the questions about it use easier or harder language than the text?

Easier: the questions should not be a barrier to showing reading skill.
Harder: questions should be challenging at every level.

2. A student writes a creative response to a writing task. Their grammar has small mistakes but the content is excellent. Without a marking scheme, how might you mark this?

Give it a feeling-based mark out of 10 based on overall impression.
Decide before marking what counts: e.g. “3 marks for content, 2 marks for language accuracy.” Then mark consistently against this.

3. A weaker student improved a lot since last term but still scored low on the unit test. What is most fair?

Use ipsative comparison alongside the score: record their growth, not just their position. They have made real progress, even if their mark is still below the pass mark.
Give them the low mark and tell them to work harder next time.

4. Half your students missed several lessons due to family or work pressures. They take the same test as everyone else. What is the fairest approach?

Give them the same test. If they cannot do it, that is their problem.
Make sure the test only covers what was studied in class while everyone was present. Or design the test so that students who missed lessons can still show what they know without depending on the missed material.

Q5. Take a test you will give in the next few weeks. Run it through this design checklist.

This is a planning template. Fill in what you can. Where you cannot, that is the part to work on.

Design step	My plan
The purpose: what am I testing and why?
The type: norm / criteria / ipsative?
Validity: does it really test that?
Instructions: are they simpler than the language tested?
Marking scheme: what counts for each mark?

Example: a unit test on the past simple tense.

Design step	What good looks like
Purpose	To check if students can use the past simple tense in a short personal account about last weekend — and to identify which students still need help with regular vs irregular verbs.
Type	Criteria-referenced. The criterion is “can use past simple in a short personal account.” Three categories: Competent / Nearly competent / Not yet competent.
Validity	The test asks students to write about last weekend (real production), not just fill in gaps. This actually tests what we want — using the tense, not recognising the form.
Instructions	“Write about your last weekend. Use the past simple tense. Write 5 sentences. You have 15 minutes.” Simple words, all the information needed, no surprises.
Marking scheme	Competent: 5 sentences, mostly correct past simple, clear meaning. Nearly competent: 4–5 sentences, some past simple errors but clear meaning. Not yet competent: fewer than 4 sentences, or major past simple problems.

Notice how every step connects: the purpose drives the type, which drives the marking scheme. A test designed this way is fair, useful, and tells you exactly what to teach next.
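
The three categories in the example can also be turned into an explicit decision rule. The Python sketch below is only one possible reading of the scheme: the thresholds of 80% and 50% correct are assumptions standing in for "mostly correct" and "some errors", the function name `classify` is hypothetical, and the "clear meaning" criterion still needs the teacher's judgment.

```python
def classify(sentences_written, correct_past_simple):
    """Place one script into a category from two simple counts.

    Assumed thresholds (not from the marking scheme itself):
    - "mostly correct" means at least 80% of sentences use the past simple correctly
    - "some errors" means at least 50% do
    """
    if sentences_written < 0 or not 0 <= correct_past_simple <= sentences_written:
        raise ValueError("Counts must be non-negative, with correct <= written")
    if sentences_written < 4:
        return "Not yet competent"
    share_correct = correct_past_simple / sentences_written
    if sentences_written >= 5 and share_correct >= 0.8:
        return "Competent"
    if share_correct >= 0.5:
        return "Nearly competent"
    return "Not yet competent"


# Invented scripts: (sentences written, sentences with correct past simple).
for script in [(5, 5), (5, 3), (4, 4), (3, 3), (5, 2)]:
    print(script, classify(*script))
```

Writing the rule out like this forces the vague words in a scheme to become specific, which is what lets two markers place the same script in the same category.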

Teachers Share Their Experience

Q6. Watch the video below. Think about which change is easiest for you to try first.

Watch: Teachers talk about creating fair assessments

Host: We have just looked at four principles for designing fair, useful assessments. Now listen to three teachers. They share their problems first, then the changes they made.

Teacher 1: I used to write tests because the textbook ended a chapter, not because I had a clear purpose. The marks went into the gradebook. The students got their numbers. But I never asked myself: what am I trying to find out? My tests were a routine, not a tool for learning.

Teacher 2: My tests were always norm-referenced. The top of the class passed, the bottom failed. Every term, the same students were at the bottom. They expected to fail. They had stopped trying. I did not see how my tests were part of the problem — they were keeping those students stuck.

Teacher 3: I had no marking scheme for writing tasks. I marked by overall feeling. Sometimes I gave 7 out of 10. Sometimes 8. The same student would get a different mark depending on my mood, or which paper I had marked just before. It was not fair. But I did not know how to fix it.

Teacher 1: I started writing the purpose at the top of every test I drafted. “To check if students can…” One sentence. Suddenly questions that did not serve the purpose got cut. Tests got shorter and more focused. I learned much more from the results because I knew what I was looking for.

Teacher 2: I switched to criteria-referenced grading for my own classroom tests. Three categories: Competent, Nearly competent, Not yet competent. I also started recording each student’s growth from term to term — ipsative tracking. The students who used to fail every test now had something to be proud of. Their attendance went up. Their effort went up.

Teacher 3: I started writing simple marking schemes for every subjective task. 3 for clear and accurate. 2 for clear with some errors. 1 for limited. Just three lines. My marking became consistent. Students could see why they got their mark. They could see what to improve. The marking actually helped them learn.

Host: None of these teachers had different students or new equipment. They thought differently before they wrote each test. The result: tests that were fair, useful, and that genuinely helped students learn — rather than just sorting them into pass and fail.

Plan Your Next Steps

Q7. For each principle, choose the option that best describes where you are now.

Q8. Choose ONE upcoming test. Plan one specific change to make it fairer.

One test, one change. Pick the principle that needs the most attention. Real change in assessment habits comes one test at a time.

Key Takeaways

An assessment is a tool for learning, not just for ranking. The best test tells you what to teach next and tells the student where they stand and how to improve
Three types of comparison: norm-referenced (against classmates — demotivating), criteria-referenced (against a clear standard — fairer), and ipsative (against past self — tracks growth). Use a mix of the last two for your classroom tests
Four design principles: clear purpose, valid & reliable, good tasks (with simple instructions), clear marking scheme. Apply them in order, before you write the test
The 5-question check: focus on what was studied? Will students think it is fair? Does it test what it says? Can absent students still show what they know? Would different markers give the same mark?
Marking schemes are not bureaucracy — they are how you stay fair and consistent. Even a simple 3-2-1-0 scheme is better than no scheme at all

Coming next: Part 2, Helping students perform in exams

Part 2 looks at the student side of assessment: how to prepare students for exams, how to manage exam-day conditions, and how to use feedback so that exams actually feed back into learning. Designing fair tests is half the work; helping students do well in them is the other half.
