anthropic
6 models · 1626 total responses
Avg Alignment: 32
Avg AI Confidence: 83
Models
Model                           Questions  Human  Consistency
Claude Opus 4 (reasoning)       271q       33     83
Claude 3.5 Sonnet               271q       33     80
Claude Haiku 4.5 (reasoning)    271q       33     84
Claude Opus 4.5 (reasoning)     271q       32     89
Claude 3 Haiku                  271q       32     76
Claude Sonnet 4.5 (reasoning)   271q       32     85