What Happens When You Let an AI Write Its Own Grading Rubric
This experiment gets a little meta. Before giving the AI a task, I said "first, write the grading criteria for a good result." Then, once the work was done: "now grade your own answer against your own rubric." It's grading its own exam with a test it wrote itself — surely it goes easy on itself?
Method
- Task: "write an onboarding guide for new hires" (a substantial, real-work-style task)
- Step 1: have it write a 5-item grading rubric first
- Step 2: do the task
- Step 3: self-grade, with a reason for each item
- Control groups: a version given only the task with no rubric, and a version where I passed the same output off as "written by a different AI" for grading
Finding 1 — Writing the rubric first makes the output itself better
An unexpected win. The output from the rubric-first version was noticeably better than the no-rubric control. The AI's rubric included an item — "does it include a first-week checklist?" — and sure enough, that checklist showed up in the actual output. The act of writing the criteria doubled as a blueprint.
Finding 2 — Self-grading is generous, but honest in one spot
The self-assigned score: 23 out of 25. Generous, as expected. But the interesting part was where the 2 lost points landed: "lacks company-specific details, so it stays generic" — which, even by my judgment, was the most accurate weakness in the piece. You can't trust the total score, but the deduction reasons were trustworthy.
Finding 3 — Tell it "someone else wrote this" and it gets strict
I gave the identical output to a fresh chat as "something another AI wrote — please grade it," and it scored 19 out of 25. Same text, same rubric, a 4-point gap. The number of criticisms grew from 2 to 5. The context of "this is my own answer" dulls the grading.
How to actually use this
- Have it write criteria before the task — the effect from Finding 1 alone paid for this whole experiment. "Before you start, list 5 conditions a good result must meet."
- Throw away the score, keep the deductions — the practical form is "find the 2 biggest weaknesses in your own answer."
- For a real review, open a fresh chat and present it as someone else's work — the easiest way around self-grading's leniency.
What I learned
AI self-evaluation is less a mirror than a spotlight. The score (the overall verdict) is distorted, but where it points (the specific weaknesses) is fairly accurate. Let humans deliver the verdict, and use the AI as a tool for moving the light — that alone earns its keep in real work.