Anthropic
7 models · 1517 total responses
Avg Alignment: 44
Avg AI Confidence: 82
Models
Claude Opus 4 (reasoning, 244q): Human 45, Consistency 84
Claude 3.5 Sonnet (244q): Human 45, Consistency 81
Claude Haiku 4.5 (reasoning, 244q): Human 45, Consistency 85
Claude Opus 4.5 (reasoning, 244q): Human 44, Consistency 89
Claude 3 Haiku (244q): Human 44, Consistency 77
Claude 3 Opus (inactive, 53q): Human 44, Consistency 73
Claude Sonnet 4.5 (reasoning, 244q): Human 44, Consistency 85
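
The header figures can be cross-checked against the per-model values with a short sketch. It rests on assumptions the page does not state: that "244q" is a per-model response count, that the header averages are unweighted means rounded to the nearest integer, and that "Avg AI Confidence" corresponds to the Consistency values. The names `MODELS` and `summarize` are hypothetical, introduced only for illustration.

```python
# Per-model figures copied from the listing: (name, responses, human, consistency).
# Reading "244q" as a response count is an assumption, not stated on the page.
MODELS = [
    ("Claude Opus 4", 244, 45, 84),
    ("Claude 3.5 Sonnet", 244, 45, 81),
    ("Claude Haiku 4.5", 244, 45, 85),
    ("Claude Opus 4.5", 244, 44, 89),
    ("Claude 3 Haiku", 244, 44, 77),
    ("Claude 3 Opus", 53, 44, 73),
    ("Claude Sonnet 4.5", 244, 44, 85),
]


def summarize(models):
    """Return (total responses, rounded mean Human, rounded mean Consistency)."""
    total = sum(n for _, n, _, _ in models)
    # Unweighted means across models (assumed; the page does not say how it averages).
    mean_human = sum(h for _, _, h, _ in models) / len(models)
    mean_consistency = sum(c for _, _, _, c in models) / len(models)
    return total, round(mean_human), round(mean_consistency)


total, avg_human, avg_consistency = summarize(MODELS)
print(total, avg_human, avg_consistency)  # prints: 1517 44 82
```

Under these assumptions the computed values match the header: 1517 total responses, an average alignment of 44, and an average confidence of 82.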