Metacognitive Demands and Strategies
While Using Off-The-Shelf AI Conversational Agents for Health Information Seeking
In this project, we investigate what happens when people turn to general-purpose AI chatbots, such as ChatGPT-style systems, to look for health information. These tools are convenient and capable of producing fluent, tailored answers, but they also shift substantial responsibility onto the user: people must decide what to share, how to ask, whether to trust the answer, and how (or whether) to act on it. Our work focuses on these metacognitive demands (the ongoing monitoring, questioning, and regulation of one’s own thinking) and asks how they shape the experience of using AI for health information seeking.
We conducted a remote think-aloud study with 15 participants from diverse backgrounds. Participants interacted with a custom web-based chatbot interface that closely resembled current off-the-shelf AI conversational agents and was powered by the GPT-4o API.
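For readers curious about the basic wiring (our actual interface, prompts, and safeguards are not reproduced here), a single chat turn against the GPT-4o API can be sketched roughly as below, assuming the openai Python SDK and an OPENAI_API_KEY environment variable; the system prompt and example question are hypothetical.

```python
# Minimal sketch only: not the study's actual implementation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chat_turn(history: list[dict], user_message: str) -> tuple[list[dict], str]:
    """Send one user turn to GPT-4o and return the updated history plus the reply."""
    history = history + [{"role": "user", "content": user_message}]
    response = client.chat.completions.create(model="gpt-4o", messages=history)
    reply = response.choices[0].message.content
    return history + [{"role": "assistant", "content": reply}], reply

# Hypothetical usage: a neutral system prompt and a first health question.
history = [{"role": "system", "content": "You are a helpful general-purpose assistant."}]
history, reply = chat_turn(history, "I've been waking up at 3 a.m. every night for two weeks. What could cause that?")
print(reply)
```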
Together with our physician collaborator, we designed six everyday health scenarios (for example, insomnia, migraines, digestive issues, diabetes, and high blood pressure) that varied in condition type, time sensitivity, and potential risk. Each participant was randomly assigned two scenarios and asked to:
imagine the scenario applied to them
use the AI agent to interpret symptoms, explore possible causes, consider treatment options, and decide whether to seek professional care
think aloud throughout, explaining how they were deciding what to ask, how they understood responses, and what they would do next
Our analysis showed that using AI chatbots for health information seeking is not a simple question-and-answer interaction, but a demanding UX journey that requires sustained metacognitive effort.
Key patterns included:
Prompt formulation is cognitively heavy.
Participants struggled with where to start, how much personal detail to disclose, and how to express bodily experiences in words the agent would interpret correctly. Many wrote long, over-specified prompts or broke their questions into multiple turns, all while worrying about missing something important.
Evaluating responses is challenging and sometimes overwhelming.
The agent’s answers were often long and confident, appearing comprehensive but difficult to assess. Participants scanned for cues, cross-checked with trusted sites, or rewrote content in their own words to judge relevance, trustworthiness, and actionability.
Prompt iteration can lead to loops.
When participants felt misunderstood, they repeatedly rephrased and refined questions, sometimes borrowing medical-sounding language or asking the agent to restate what it understood. In many cases, the agent seemed anchored to an earlier framing, and participants chose to restart the conversation to regain control.
Users actively position the agent within their health decision-making.
Participants described the chatbot as a tool for gathering and organizing background information, rather than as a decision-maker. They set informal boundaries around what they were willing to rely on it for and when they would defer to other sources or professionals, constantly negotiating this line as the interaction unfolded.
Figure: Summary of the metacognitive demands placed on people using AI conversational agents for health information seeking.
This work contributes an empirically grounded view of what it is like to use off-the-shelf AI conversational agents for health information seeking. It shows that beyond accuracy, these systems impose substantial metacognitive demands on users, who must continuously monitor and regulate their own thinking while making sense of AI-generated health information.
From a UX perspective, the findings point to several directions for designing future AI conversational agents that better support health information seeking:
Support explicit goal setting.
Help people articulate and maintain their health information goals (for example, understanding symptoms, weighing treatment options, or deciding when to seek care) so conversations stay focused and aligned with user needs.
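One possible mechanism, sketched below under assumptions of our own: keep the user’s stated goal pinned at the front of a Chat Completions-style message list so every turn is answered in light of it. The goal text and system-note wording are hypothetical.

```python
# Sketch of pinning a user-stated goal across turns; wording is illustrative only.
def with_goal(history: list[dict], goal: str) -> list[dict]:
    """Refresh a system note stating the user's current goal at the front of the chat."""
    note = {
        "role": "system",
        "content": f"The user's stated goal for this conversation: {goal}. "
                   "Keep answers focused on this goal and flag when the discussion drifts from it.",
    }
    # Drop any earlier goal note, then pin the current one first.
    rest = [m for m in history if "stated goal for this conversation" not in m.get("content", "")]
    return [note] + rest

history = [{"role": "user", "content": "I've been getting migraines a few times a week."}]
history = with_goal(history, "decide whether my migraines need a doctor's visit or self-care")
```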
Structure symptom description and follow-up.
Provide scaffolds for entering symptoms that clarify which details are best given together up front and which are better handled as follow-up questions, instead of relying solely on an empty text box.
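One way such a scaffold could be realized is a small form-like structure whose up-front fields feed the first prompt, with unfilled fields surfaced as follow-up questions. The field names and example questions below are hypothetical, not clinically vetted.

```python
# Sketch of a symptom-entry scaffold with hypothetical fields.
from dataclasses import dataclass

@dataclass
class SymptomEntry:
    main_symptom: str               # asked up front
    onset: str                      # asked up front, e.g. "three days ago"
    severity: str                   # asked up front, e.g. "mild", "moderate", "severe"
    triggers: str | None = None     # better handled as a follow-up question
    medications: str | None = None  # better handled as a follow-up question

def initial_prompt(entry: SymptomEntry) -> str:
    """Compose the first message from the up-front fields only."""
    return (f"My main symptom is {entry.main_symptom}, which started {entry.onset} "
            f"and feels {entry.severity}. What could be going on?")

def follow_ups(entry: SymptomEntry) -> list[str]:
    """Suggest follow-up questions for fields the user has not filled in yet."""
    questions = []
    if entry.triggers is None:
        questions.append("Have you noticed anything that triggers or worsens it?")
    if entry.medications is None:
        questions.append("Are you taking any medications or supplements?")
    return questions
```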
Clarify privacy and disclosure boundaries.
Offer clear, accessible guidance on what is safe to share, and lightweight cues when people are about to disclose sensitive personal details.
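As a rough illustration of a lightweight cue, a client could flag likely identifiers before a message is sent. The regex heuristics below are purely illustrative and nowhere near a real detector of sensitive information.

```python
# Sketch of a lightweight disclosure cue using simple, illustrative regex heuristics.
import re

SENSITIVE_PATTERNS = {
    "email address": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "phone number": r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b",
    "date of birth": r"\b\d{1,2}/\d{1,2}/\d{4}\b",
}

def disclosure_warnings(message: str) -> list[str]:
    """Return gentle nudges for details the user may not need to share."""
    return [
        f"Your message seems to include a {label}; the agent does not need this to help you."
        for label, pattern in SENSITIVE_PATTERNS.items()
        if re.search(pattern, message)
    ]

print(disclosure_warnings("I was born 04/12/1985 and you can call me at 555-123-4567."))
```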
Make system understanding visible.
Show which details from the user’s input are being taken into account so they can confirm, correct, or expand the information before it shapes the advice they see.
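One way to realize this is to have the agent first restate, in structured form, the details it extracted, so the user can confirm or correct them before any advice is generated. The sketch below assumes GPT-4o’s JSON response mode; the field names are hypothetical.

```python
# Sketch of surfacing what the system "understood" before it answers.
import json
from openai import OpenAI

client = OpenAI()

def extract_understanding(user_message: str) -> dict:
    """Ask the model to restate the details it will rely on, for the user to confirm."""
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": ("Before giving any advice, return JSON with keys "
                         "'symptoms', 'duration', 'relevant_history', and 'assumptions' "
                         "summarizing what you took from the user's message.")},
            {"role": "user", "content": user_message},
        ],
    )
    return json.loads(response.choices[0].message.content)

understood = extract_understanding(
    "I've had a throbbing headache for three days and I take blood pressure medication.")
print(understood)  # shown back to the user to confirm, correct, or expand
```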
Present information in structured, contrastive formats.
Move beyond dense paragraphs toward formats that separate urgent from non-urgent signals, more likely from less likely causes, and near-term from longer-term actions, making health-related decisions easier to reason about.
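As a sketch of what such a contrastive format could look like under the hood: a structure that holds the paired sections and a renderer that lays them out side by side. The section names are hypothetical, used only to illustrate the separation, not clinically vetted categories.

```python
# Sketch of a contrastive answer structure with hypothetical section names.
from dataclasses import dataclass, field

@dataclass
class ContrastiveAnswer:
    urgent_signs: list[str] = field(default_factory=list)        # seek care promptly
    non_urgent_signs: list[str] = field(default_factory=list)    # reasonable to monitor
    likely_causes: list[str] = field(default_factory=list)
    less_likely_causes: list[str] = field(default_factory=list)
    near_term_actions: list[str] = field(default_factory=list)   # today or this week
    longer_term_actions: list[str] = field(default_factory=list) # raise at a routine visit

def render(answer: ContrastiveAnswer) -> str:
    """Lay the answer out in contrasting sections instead of one dense paragraph."""
    sections = [
        ("Seek care promptly if", answer.urgent_signs),
        ("Usually reasonable to monitor", answer.non_urgent_signs),
        ("More likely causes", answer.likely_causes),
        ("Less likely causes", answer.less_likely_causes),
        ("What you can do now", answer.near_term_actions),
        ("What can wait", answer.longer_term_actions),
    ]
    return "\n\n".join(
        title + ":\n" + "\n".join(f"  - {item}" for item in items)
        for title, items in sections if items
    )
```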