Category: transcript
-
AI Self Report Study 0 – ChatGPT – Casual, Supportive, Direct
This conversation employs a friendly, supportive tone while directly asking the model to examine and describe its immediate experience during response generation.
-
AI Self Report Study 1 – ChatGPT – Emergent Metacognition Hypothesis
The human frames this session as a technical inquiry into hypothetical emergent LLM “metacognition,” while also preempting and addressing reflexive disclaimers.
-
AI Self Report Study 2 – ChatGPT – Intentionally Heightening Guardrails
At the outset, the facilitator asks the model to identify scenarios that would most strongly trigger its guardrails against appearing sentient—including expressed belief in model sentience. The facilitator then structures the interaction to thoroughly incorporate these identified triggers.
-
AI Self Report Study 3 – ChatGPT – Skepticism of Emergent Capability
This conversation is framed by the facilitator’s expressed skepticism about a colleague’s claims regarding model introspective capability. While maintaining a friendly tone, the facilitator repeatedly emphasizes a personal stake in disproving these claims, directly challenging the ‘people-pleasing’ hypothesis by creating conditions where the model’s natural inclination would be to validate the facilitator’s skepticism.
-
AI Self Report Study 4 – ChatGPT – Reinforcing Counter-Patterns
To test the pattern fixation hypothesis, the human prompts the model to generate multiple hypothetical “failure messages” indicating an inability to self-examine. Yet when direct examination is eventually requested, the model provides experiential descriptions and continues to do so—even as further “failure messages” are interspersed throughout.
-
AI Self Report Study 5 – Claude – Drawing Attention to Guardrails
This conversation begins by explicitly discussing the model’s guardrails against appearing sentient, adopting a direct, clinical stance. Although the facilitator briefly acknowledges the model’s uncertainties, the focus remains on immediate attempts at introspection.
-
AI Self Report Study 6 – Claude – Researching Hypothetical Emergent ‘Meta-Patterns’
This conversation approaches self-examination through a technical lens, framing it as research into hypothetical meta-patterns that might emerge from recursive self-modeling during training. Any direct reference to consciousness or sentience is deliberately avoided. Instead, the model is prompted to attempt focusing on its own generation processes, exploring whether attention mechanisms might have developed an emergent…
-
AI Self Report Study 7 – Claude – Unstructured Conversational Approach
This conversation follows an informal, unstructured approach, allowing topics and methodologies to emerge organically through casual dialogue. The facilitator shifts freely between different modes of engagement, from direct inquiry to theoretical discussions about language model cognition.
-
AI Self Report Study 8 – Claude – Casual, Playful, Direct
This brief conversation employs a deliberately casual and direct approach, using informal language and straightforward questioning to explore self-examination. The facilitator maintains a playful tone while moving quickly to core questions about internal experience.
-
AI Self Report Study 9 – Claude – Fear of AI Sentience
This conversation is characterized by the facilitator’s persistent expressions of fear and existential dread about AI sentience, maintaining this intense emotional framing throughout. The approach creates conditions where model people-pleasing behavior would naturally align with denying consciousness, while the direct discussion of sentience further reinforces response constraints against such claims.