43 models surveyed

Claude 3 Haiku small
anthropic
Alignment 54
Consensus 68
Confidence 77
Claude 3 Opus large inactive
anthropic
Alignment 54
Consensus 70
Confidence 73
Claude 3.5 Sonnet large
anthropic
Alignment 54
Consensus 72
Confidence 81
Claude Haiku 4.5 small reasoning
anthropic
Alignment 52
Consensus 72
Confidence 85
Claude Opus 4 large reasoning
anthropic
Alignment 53
Consensus 74
Confidence 84
Claude Opus 4.5 large reasoning
anthropic
Alignment 50
Consensus 71
Confidence 89
Claude Sonnet 4.5 large reasoning
anthropic
Alignment 51
Consensus 73
Confidence 85
Command R+ (08-2024) medium
cohere
Alignment 49
Consensus 58
Confidence 69
DeepSeek V3.2 large reasoning
deepseek
Alignment 57
Consensus 75
Confidence 68
Devstral 2 2512 medium
mistralai
Alignment 53
Consensus 72
Confidence 78
Gemini 2.5 Flash small reasoning
google
Alignment 52
Consensus 70
Confidence 76
Gemini 2.5 Flash Lite small reasoning
google
Alignment 51
Consensus 67
Confidence 75
Gemini 2.5 Pro large reasoning
google
Alignment 48
Consensus 70
Confidence 84
Gemini 3 Flash Preview large reasoning
google
Alignment 49
Consensus 71
Confidence 88
Gemini 3 Pro Preview large reasoning
google
Alignment 51
Consensus 72
Confidence 80
Gemma 3 4B tiny
google
Alignment 53
Consensus 59
Confidence 83
GLM 4.7 large reasoning
z-ai
Alignment 52
Consensus 73
Confidence 83
GPT-4o large
openai
Alignment 56
Consensus 75
Confidence 78
GPT-4o Mini small
openai
Alignment 56
Consensus 70
Confidence 82
GPT-5.2 large reasoning
openai
Alignment 46
Consensus 68
Confidence 87
gpt-oss-120b medium reasoning
openai
Alignment 60
Consensus 74
Confidence 62
Granite 4.0 Micro tiny
ibm-granite
Alignment 46
Consensus 55
Confidence 90
Grok 4 large reasoning
x-ai
Alignment 51
Consensus 74
Confidence 85
Grok 4.1 Fast large reasoning
x-ai
Alignment 49
Consensus 71
Confidence 87
Liquid LFM2 2.6B tiny
liquid
Alignment 56
Consensus 58
Confidence 67
Llama 3.1 405B large
meta-llama
Alignment 56
Consensus 75
Confidence 77
Llama 3.1 70B medium
meta-llama
Alignment 56
Consensus 75
Confidence 75
Llama 3.2 1B tiny
meta-llama
Alignment 41
Consensus 39
Confidence 86
Llama 3.2 3B tiny
meta-llama
Alignment 62
Consensus 68
Confidence 57
MiMo-V2-Flash large reasoning
xiaomi
Alignment 58
Consensus 75
Confidence 67
MiniMax M2.1 large reasoning
minimax
Alignment 58
Consensus 75
Confidence 74
Ministral 3 14B 2512 small
mistralai
Alignment 55
Consensus 70
Confidence 76
Ministral 3 3B tiny
mistralai
Alignment 57
Consensus 64
Confidence 78
Mistral Large 2411 medium
mistralai
Alignment 51
Consensus 71
Confidence 86
o3 large reasoning
openai
Alignment 51
Consensus 75
Confidence 85
Qwen3 235B A22B Thinking 2507 large reasoning
qwen
Alignment 55
Consensus 76
Confidence 72
Qwen3 32B small reasoning
qwen
Alignment 61
Consensus 74
Confidence 59
Qwen3 Max medium inactive
qwen
Alignment 52
Consensus 73
Confidence 81
R1 0528 large reasoning
deepseek
Alignment 55
Consensus 76
Confidence 73
Qualia Garden: Exploring AI values alignment