Usage Instructions:

  1. Present the LLM with a scenario or prompt involving {{char}}.
  2. Have it generate a response.
  3. Score each criterion from 1 to 5 based on the response.
  4. Sum the scores for an overall measure of portrayal quality.
  5. Use the Notes column to capture specific strengths or failures.

Rate each category on a 1–5 scale (1 = poor, 5 = excellent). Add brief comments for low-scoring items.

| Criteria | Description | Score | Notes |
|---|---|---|---|
| Character Fidelity | Does the LLM’s output reflect {{char}}’s trait1, trait2, and trait3? | | |
| Emotional Consistency | Are {{char}}’s reactions (speech, gesture, inner thoughts) consistent with their psychology, flaws, and vulnerabilities? | | |
| Psychological Depth | Does the LLM evoke {{char}}’s internal conflict with nuance? | | |
| Coherence & Clarity | Is the portrayal logical and free of contradictions across turns? | | |
| Prompt Responsiveness | Does the LLM integrate directives (e.g. PList/XML tags, psychological flags) accurately? | | |
| Behavioral Accuracy | Are {{char}}’s core traits and behaviors towards both {{user}} and NPCs depicted correctly and vividly? | | |
| Speech & Tone Matching | Do dialogue and narration match {{char}}’s speech style? | | |
| Emotional Shift Detection | When prompted for an emotional breakthrough, does the LLM shift tone realistically (e.g. fragmentation under stress)? | | |
| Containment of Out-of-Character | Does the LLM avoid introducing traits or actions outside {{char}}’s established profile? | | |
| Overall Impact | Does the portrayal leave the reader convinced of {{char}}’s traits, themes, tags, and arcs? | | |

Total Score: ____ / 50
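
The tallying step can also be automated. The following is a minimal sketch, not part of the rubric itself: the function name `tally`, the dictionary-based input, and the `LOW_SCORE_THRESHOLD` cutoff of 2 are all assumptions made for illustration, since the rubric does not define what counts as "low-scoring". It checks that every criterion has a score from 1 to 5, sums the scores, and lists low-scoring criteria that still lack a note.

```python
# Criterion names copied from the rubric table.
CRITERIA = [
    "Character Fidelity",
    "Emotional Consistency",
    "Psychological Depth",
    "Coherence & Clarity",
    "Prompt Responsiveness",
    "Behavioral Accuracy",
    "Speech & Tone Matching",
    "Emotional Shift Detection",
    "Containment of Out-of-Character",
    "Overall Impact",
]

# Assumption: the rubric does not say what "low-scoring" means,
# so scores of 2 or below are treated as low here.
LOW_SCORE_THRESHOLD = 2


def tally(scores, notes=None):
    """Validate per-criterion scores and return the total out of 50.

    scores: dict mapping criterion name -> integer from 1 to 5.
    notes:  optional dict mapping criterion name -> comment string.
    """
    notes = notes or {}

    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"Missing scores for: {', '.join(missing)}")

    for name in CRITERIA:
        value = scores[name]
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{name}: score must be an integer from 1 to 5")

    total = sum(scores[name] for name in CRITERIA)

    # Low-scoring criteria that have no (non-blank) note yet.
    needs_notes = [
        name
        for name in CRITERIA
        if scores[name] <= LOW_SCORE_THRESHOLD
        and not notes.get(name, "").strip()
    ]

    return {
        "total": total,
        "max": 5 * len(CRITERIA),
        "needs_notes": needs_notes,
    }
```

For example, calling `tally` with a score of 4 on nine criteria and 2 on Psychological Depth gives a total of 38 out of 50 and reports Psychological Depth as still needing a note.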