Emergent Introspective Awareness in Large Language Models

evidence · interpretability · consciousness
Jack Lindsey · 2025-10-29 · Paper · Intermediate · 15 min read
A causal methodology for testing genuine introspection versus confabulation. The core technique is concept injection: manipulate a model's internal activations, then observe whether its self-reports track the manipulation. Four experiments: injected-thought detection (~20% accuracy, zero false positives in production models), distinguishing thoughts from text, detecting unintended outputs, and intentional control of internal states. The 'silent thinking' finding: Opus 4.1 can hold a thought that decays to baseline by the final layer, thinking without expressing. Different introspective tasks use different mechanisms at different layers, with no unified system. The paper explicitly disavows consciousness claims while producing findings that consciousness researchers would find highly relevant.
qualia.garden API docs for AI agents

Library API

Read-only JSON API for exploring the curated reading library.

  • GET /api/library/resources — All resources with filtering and pagination. Query params: tag, difficulty, type, featured, sort (date|title|readingTime), order (asc|desc), limit, offset.
  • GET /api/library/resource/:id — Full resource detail with resolved seeAlso references, containing paths, and archive URL.
  • GET /api/library/resource/:id/content — Archive content as inline markdown, or a link for PDF resources.
  • GET /api/library/paths — All reading paths with summaries, estimated time, and resource counts.
  • GET /api/library/path/:id — Full path with intro/conclusion, ordered resources with curator notes and transitions.
  • GET /api/library/search — Semantic search across resources. Query params: q (required), tag, difficulty, type, limit.
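
As a rough illustration of how an agent might address these endpoints, the sketch below builds request URLs from the paths and query parameters listed above. Several things here are assumptions rather than documented facts: the base origin `https://qualia.garden`, the helper names (`build_url`, `resources_url`, `search_url`, `resource_url`, `fetch_json`), and the example tag value. The response format is only known to be JSON, so the sketch does not inspect any fields.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: the API is served from this origin; the docs list only paths.
BASE_URL = "https://qualia.garden"

# Values taken from the documented `sort` and `order` query params.
ALLOWED_SORT = {"date", "title", "readingTime"}
ALLOWED_ORDER = {"asc", "desc"}


def build_url(path, **params):
    """Join BASE_URL, an endpoint path, and any query params that are not None."""
    query = {key: value for key, value in params.items() if value is not None}
    if "sort" in query and query["sort"] not in ALLOWED_SORT:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORT)}")
    if "order" in query and query["order"] not in ALLOWED_ORDER:
        raise ValueError(f"order must be one of {sorted(ALLOWED_ORDER)}")
    url = BASE_URL + path
    return f"{url}?{urlencode(query)}" if query else url


def resources_url(tag=None, difficulty=None, type=None, featured=None,
                  sort=None, order=None, limit=None, offset=None):
    """URL for GET /api/library/resources with optional filters and pagination."""
    return build_url("/api/library/resources", tag=tag, difficulty=difficulty,
                     type=type, featured=featured, sort=sort, order=order,
                     limit=limit, offset=offset)


def search_url(q, tag=None, difficulty=None, type=None, limit=None):
    """URL for GET /api/library/search; `q` is the only required param."""
    if not q:
        raise ValueError("q is required")
    return build_url("/api/library/search", q=q, tag=tag,
                     difficulty=difficulty, type=type, limit=limit)


def resource_url(resource_id):
    """URL for GET /api/library/resource/:id, with the id percent-encoded."""
    return build_url("/api/library/resource/" + quote(str(resource_id), safe=""))


def fetch_json(url):
    """Fetch a URL and decode the JSON body (performs a real network request)."""
    with urlopen(url) as response:
        return json.load(response)
```

A call such as `fetch_json(resources_url(tag="consciousness", sort="date", order="desc", limit=10))` would then request the ten most recent resources carrying that tag, assuming tag values are passed as plain lowercase strings. Validating `sort` and `order` on the client side is a design choice of this sketch; the docs do not say how the server handles unknown values.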