Health-care AI is here. We don’t know if it actually helps patients.

I don’t need to tell you that AI is everywhere.

Or that it is being used, increasingly, in hospitals. Doctors are using AI to help them with notetaking. AI-based tools are trawling through patient records, flagging people who may require certain support or treatments. They are also used to interpret medical exam results and X-rays.

A growing number of studies suggest that many of these tools can deliver accurate results. But there’s a bigger question here: Does using them actually translate into better health outcomes for patients?

We don’t yet have a good answer.

That’s what Jenna Wiens, a computer scientist at the University of Michigan, and Anna Goldenberg, of the University of Toronto, argue in a paper published in the journal Nature Medicine this week.

Wiens tells me she has spent years investigating how AI might benefit health care. For the first decade of her career she tried to pitch the technology to clinicians. Over the last few years, she says, it’s as though “a switch flipped.” Health-care providers not only appear much more interested in the promise of these technologies, they have also begun rapidly deploying them.

The problem is that many providers aren’t rigorously assessing how well they actually work.

Take “ambient AI” tools, for example. Also known as AI scribes, they “listen” to conversations between doctors and patients, then transcribe and summarize them. Multiple tools are available, and they are already being widely adopted by health-care providers.

A few months ago, a staffer at a major New York medical center who develops AI tools for doctors told me that, anecdotally, medics are “overjoyed” by the technology—it allows them to focus all their attention on their patients during appointments, and it saves them from a lot of time-consuming paperwork. Early studies support these anecdotes and suggest that the tools can reduce clinician burnout.

That’s all well and good. But what about patient health outcomes? “[Researchers] have evaluated provider or clinician and patient satisfaction, but not really how these tools are affecting clinical decision-making,” says Wiens. “We just don’t know.”

The same holds true for other AI-based technologies used in health-care settings. Some are used to predict patients’ health trajectories, others to recommend treatments. They are designed to make health care more effective and efficient.

But even a tool that is “accurate” won’t necessarily improve health outcomes. AI might speed up the interpretation of a chest X-ray, for example. But how much will a doctor rely on its analysis? How will that tool affect the way a doctor interacts with patients or recommends treatment? And ultimately: What will this mean for those patients?

The answers to those questions might vary between hospitals or departments and could depend on clinical workflows, says Wiens. They might also differ between doctors at various stages of their careers.

Take the AI scribes, as another example. Some research on AI use in education suggests that such tools can impact the way people cognitively process information. Could they affect the way a doctor processes a patient’s information? Will the tools affect the way medical students think about patient data in a way that impacts care? These questions need to be explored, says Wiens. “We like things that save us time, but we have to think about the unintended consequences of this,” she says.

In a study published in January 2025, Paige Nong at the University of Minnesota and her colleagues found that around 65% of US hospitals used AI-assisted predictive tools. Only two-thirds of those hospitals evaluated the tools’ accuracy. Even fewer assessed them for bias.

The number of hospitals using these tools has probably increased since then, says Wiens. Those hospitals, or entities other than the companies developing the tools, need to evaluate how much they help in specific settings. There’s a possibility that they could leave patients worse off, although it’s more likely that AI tools just aren’t as beneficial as health-care providers might assume they are, says Wiens.

“I do believe in the potential of AI to really improve clinical care,” says Wiens, who stresses that she doesn’t want to stop the adoption of AI tools in health care. She just wants more information about how they are affecting people. “I have to believe that in the future it’s not all AI or no AI,” she says. “It’s somewhere in between.”

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter.

Opinion: I started medical school at 69 and will begin residency at 72. Here’s what I learned

Since I was 7, my goal has been to become a doctor. But life had other plans. I grew up in a blue-collar family in Levittown, N.Y., in the 1950s and ’60s, so it often felt like the world ended in Jersey. When I landed in Lansing, Mich., to attend Michigan State, I expected the Rocky Mountains to be visible. I ended up getting a degree in nursing, but I always had another goal: to become an M.D.

This year, at the age of nearly 73, my dream will finally come true. Soon after, I will start my residency in family medicine. My perspective on medical school and medicine is unique not only because I attended late in life, but because it came after more than 40 years as a nurse practitioner.

Psychedelics get a boost from the White House

President Trump recently signed an executive order that aims to increase access to psychedelic drug treatments. He was joined at the signing by podcaster Joe Rogan, who said he’d messaged the president about research on the psychedelic ibogaine.

In this week’s STATus Report, host Alex Hogan chats with STAT Washington correspondent Daniel Payne about what the executive order does and doesn’t do. Hogan also looks at why ibogaine, and psychedelic drugs more broadly, are increasingly being taken seriously for stubbornly hard-to-treat conditions like addiction, depression, and PTSD.

Opinion: The local news crisis is also a public health crisis

The past four months have been a whirlwind for Pittsburgh’s journalism landscape. On Jan. 7, the Pittsburgh Post-Gazette, Western Pennsylvania’s largest news organization, announced it would cease publication on May 3 after nearly 240 years. Then, on April 14, just over two weeks before that closure date, the Baltimore-based Venetoulis Institute for Local Journalism said it would acquire the paper’s assets and continue publication.

Like many Pittsburghers, I experienced the emotional rollercoaster of anger, disappointment, hope, and relief tied to these announcements. I grew up in the Pittsburgh area, where I vividly remember running barefoot down my driveway as a child to grab the Post-Gazette. Years later, I interned there as a health and science reporter and have since contributed as a freelancer.

Growing use of guest editors has turned some journals into a ‘playground of bad science’

Should academic journals begin to second-guess guest editors?

That question gained new urgency last week when the British Medical Journal’s publishing group retracted nearly all of a guest-edited special edition of the Journal of Medical Genetics dedicated to cancer immunotherapies. In the retraction note, the journal writes that the decision was due, in part, to “compromised peer review in almost all articles.” The notice garnered attention for its scope, but also because it exemplified larger concerns that research integrity advocates have about guest-edited editions, which some journals call special issues.

Ensemble-based working memory updating and its computational rules.

Psychological Review, Vol 133(3), Apr 2026, 515-533; doi:10.1037/rev0000569

Manipulation plays a critical role in working memory, and how items are represented during manipulation is a fundamental question. Previous studies on manipulation have primarily assumed independent representations by default (independent hypothesis). Here, we propose the ensemble hypothesis to challenge this conventional notion, suggesting that items are represented as ensembles undergoing updating during manipulation. To test these hypotheses, we focused on working memory updating in accordance with new information by conducting three delayed-estimation tasks under addition, removal, and replacement scenarios (Study 1). The critical manipulation systematically shifted the mean orientation of all memory stimuli, either increasing it (clockwise) or decreasing it (counterclockwise), after the updating process. Following the independent hypothesis, memory errors would be similar under both conditions. Conversely, considering the biasing effect of the ensemble on individual representations, the ensemble hypothesis predicts that memories of individual items would be updated, aligning with the ensemble’s change direction. Namely, memory errors would be more positive in the increase-mean condition compared to the decrease-mean condition. Our results supported the ensemble hypothesis. Furthermore, to investigate the mechanisms underlying ensemble computations in updating scenarios, we conducted three ensemble tasks (Study 2) with similar designs to Study 1 and developed a computational model to quantify the contributions of each memory item. The results consistently demonstrated that addition involved complete updating, while removal led to incomplete updating. Across these studies, we propose that items are represented as dynamic ensembles during working memory updating processes. Furthermore, we elucidate the computational principles underlying ensembles throughout this process.
(PsycInfo Database Record (c) 2026 APA, all rights reserved)

Simplicity and complexity of probabilistically defined concepts.

Psychological Review, Vol 133(3), Apr 2026, 560-583; doi:10.1037/rev0000563

Human concept learning is known to be impaired by conceptual complexity: Simpler concepts are easier to learn, and more complex ones are more difficult. However, the simplicity bias has been studied almost exclusively in the context of deterministic concepts defined over Boolean features and is comparatively unexplored in the more general case of probabilistic concepts defined over continuous features. This article reports a series of experiments in which subjects were asked to learn probabilistic concepts defined over a novel 2D continuous feature space. Each concept was a mixture of several distinct Gaussian components, and the complexity of the concepts was varied by manipulating the positions of the mixture components relative to each other while holding the number of components constant. The results confirm that the positioning of mixture components strongly impacts learning, independent of the intrinsic statistical separability of the concepts, which was manipulated independently. Moreover, the results point to an information-theoretic framework for quantifying the complexity of probabilistic concepts, centered on the notion of compressive complexity: Simple concepts are those that can be approximately recovered from a projection of the concept onto a lower dimensional feature space, while more complex concepts are those that can only be represented by combining features. The framework provides a consistent, coherent, and broadly applicable measure of the complexity of probabilistic concepts.

Adaptive computation as a new mechanism of dynamic human attention.

Psychological Review, Vol 133(3), Apr 2026, 534-559; doi:10.1037/rev0000572

A key role for attention is to continually focus visual processing to satisfy our goals. How does this work in computational terms? Here we introduce adaptive computation, a new computational mechanism of human attention that bridges the momentary application of perceptual computations with their impact on decision outcomes. Adaptive computation is a dynamic algorithm that rations perceptual computations across objects on the fly, enabled by a novel and general formulation of task relevance. We evaluate adaptive computation in a case study of multiple object tracking (MOT), a paradigmatic example of selection as a dynamic process, where observers track a set of target objects moving amid visually identical distractors. Adaptive computation explains the attentional dynamics of object selection with unprecedented depth. It not only recapitulates several classic features of MOT (e.g., trial-level tracking accuracy and localization error of targets), but also captures properties that have not previously been measured or modeled, including both the subsecond patterns of attentional deployment between objects and the resulting sense of subjective effort. Critically, this approach captures such data within a framework that is in principle domain-general, and, unlike past models, without using any MOT-specific heuristic components. Beyond this case study, we also look to the future, discussing how adaptive computation may apply more generally, providing a new type of mechanistic model for the dynamic operation of many forms of visual attention.

Rational causal induction from events in time.

Psychological Review, Vol 133(3), Apr 2026, 584-618; doi:10.1037/rev0000570

A longstanding focus in the causal learning literature has been on inferring causal relations from contingencies, where these abstract away from time by collating independent instances or by aggregating over regularly demarcated trials. In contrast, individual causal learners encounter events in their daily lives that occur in a continuous temporal flow with no such demarcation. Consequently, the process of learning causal relationships in naturalistic environments is comparatively less understood. In this article, we lay out a rational framework that foregrounds the role of time in causal learning. We work within the Bayesian rational analysis tradition, starting by considering how causal relations induce dependence between events in continuous time and how this can be modeled by stochastic processes from the Poisson–Gamma distribution family. We derive the qualitative signatures of causal influence and the general computations needed to infer structure from temporal patterns. We show that this rational account can parsimoniously explain the human preference for causal models that invoke shorter, more reliable, and more predictable causal influences. Furthermore, we show this provides a unifying explanation for human judgments across a wide variety of tasks in the reanalysis of seven experimental data sets. We anticipate the framework will help researchers better understand the many manifestations of continuous-time causal learning across human cognition and the tasks that probe it, from explicit causal structure induction settings to implicit associative or reinforcement learning settings.
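
As a rough illustration of why temporally reliable causes can be favored, the sketch below scores a handful of cause-to-effect delays under two Gamma delay distributions that share the same mean but differ in variance. This is not the article's model: the function names `gamma_logpdf` and `delay_loglik`, the parameter values, and the example delays are all assumptions chosen for illustration.

```python
import math

def gamma_logpdf(x, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)

def delay_loglik(delays, shape, rate):
    """Summed log-likelihood of independent cause-to-effect delays."""
    return sum(gamma_logpdf(d, shape, rate) for d in delays)

# Hypothetical delays (arbitrary time units), tightly clustered around 1.0.
delays = [0.9, 1.0, 1.1]

# Both candidates have mean delay shape / rate = 1.0 but different variance
# (variance = shape / rate**2).
reliable = delay_loglik(delays, shape=100.0, rate=100.0)  # variance 0.01
unreliable = delay_loglik(delays, shape=1.0, rate=1.0)    # variance 1.0

print(reliable > unreliable)  # True: the low-variance account fits better
```

When the observed delays are consistent, the low-variance candidate assigns them a higher likelihood, which is one simple way a preference for reliable, predictable causal influences could arise in a likelihood-based comparison.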

Computation-limited Bayesian updating: A resource-rational analysis of approximate Bayesian inference.

Psychological Review, Vol 133(3), Apr 2026, 619-635; doi:10.1037/rev0000573

Data and computational capacity are essential resources for any intelligent system that updates its beliefs by integrating new information. However, both of these resources are inherently limited. Here, we introduce a new resource-rational analysis of belief updating that formalizes these constraints using information-theoretic principles. Our analysis reveals an interaction between data and computational limitations: when computational resources are scarce, agents may struggle to fully incorporate new data. The resource-rational belief updating rule we derive provides a novel explanation for conservative Bayesian updating, where individuals tend to underweight the likelihood of new evidence. Our theory also generates predictions consistent with several process models, particularly those based on approximate Bayesian inference.
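
Conservative updating of the kind described above is often illustrated with a tempered likelihood, where the likelihood is raised to a power between 0 and 1 before it is combined with the prior. The sketch below uses that generic device. It is an assumption for illustration, not the updating rule derived in the article, and the function name `tempered_update` and the exponent `beta` are hypothetical.

```python
def tempered_update(prior, likelihood, beta=1.0):
    """Posterior proportional to prior * likelihood**beta over discrete hypotheses.

    beta = 1.0 is standard Bayesian updating; 0 < beta < 1 underweights
    the likelihood, giving a more conservative posterior.
    """
    unnormalized = [p * (l ** beta) for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Two hypotheses with equal prior belief; the evidence favors the first 4 to 1.
prior = [0.5, 0.5]
likelihood = [0.8, 0.2]

full = tempered_update(prior, likelihood, beta=1.0)
conservative = tempered_update(prior, likelihood, beta=0.5)

print(full[0] > conservative[0] > prior[0])  # True
```

With these numbers, full updating moves belief in the first hypothesis from 0.5 to 0.8, while `beta=0.5` moves it only to about 0.67: the evidence still shifts belief, but by less than Bayes' rule prescribes.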