|
AI is reshaping healthcare fast. Below are 3 key AI developments, 2 studies, and 1 takeaway for this week to help you better lead with AI. Target read time: 5 minutes.
|
Two announcements this week illustrate the diverging strategies for getting AI into health plan operations. Anterior, a clinician-led AI platform, closed a $40M Series B (NEA, Sequoia, Kinnevik) to build direct payer relationships from scratch. Their pitch: 99.24% clinical accuracy across prior auth, payment integrity, and risk adjustment, independently validated by KLAS. Meanwhile, Palantir partnered with Cognizant to embed its AI platform directly into TriZetto — the claims engine that processes over 1 billion claims annually for 300+ health plans including Anthem and multiple BCBS networks.
So what?
Health plan sales cycles are long — 12 to 24 months is common. To survive, AI startups need either deep capital reserves (Anterior's $64M war chest) or instant distribution through entrenched infrastructure (Palantir riding TriZetto's installed base). Both paths are viable. Ultimately, distribution is the key bottleneck to overcome.
Read about Anterior → · Read about Palantir + TriZetto →
Cleveland Clinic, HCA Healthcare, Stanford, Highmark Health, MedStar, and a coalition of other major systems launched the AI Care Standard — the first operational standard focused on AI that communicates directly with patients. The standard defines 10 Core Pillars spanning safety, equity, governance, and real-world usability. It applies to patient portal chatbots, care navigation tools, and any AI that interacts with patients or families.
So what?
With federal AI regulation still evolving and state rules fragmented, the industry is self-organizing. Health plans, hospitals, and even patient engagement vendors evaluating patient-facing AI now have a concrete framework to reference.
Read the announcement →
Digital health consolidation is accelerating. Sword Health acquired Kaia Health for $285M (MSK expansion + German market entry). Spring Health acquired Alma (mental health scale). NOCD acquired Rebound Health (trauma care). Notably, venture-backed companies, not health systems or payers, are doing the buying. The common thread: acquirers are filling AI capability gaps, accelerating market access, or extending runway.
So what?
Not every M&A deal is a sign of strength. Some are survival moves dressed up as strategy. To parse: does the acquirer gain a specific capability (AI, geography, clinical depth), or just more time? With flat 2027 Medicare Advantage rates on the horizon, the difference matters.
Read the full story →
|
A randomized study of 1,298 UK adults tested three leading LLMs (GPT-4o, Llama 3, Command R+) as patient-facing medical assistants. Tested in isolation, the models identified relevant conditions in 94.9% of cases. But when real patients used those same models to assess their own symptoms, accuracy collapsed — fewer than 35% identified the right condition, no better than the control group using a standard internet search. Participants also consistently underestimated condition severity regardless of whether they used AI.
Why it matters
This is the largest RCT to date testing LLMs as direct-to-patient health tools. The 94.9% vs. <35% gap is a reality check for the "AI doctor" narrative: the model's knowledge doesn't automatically transfer to the patient's decision.
Read the study →
A single-blind RCT with 60 licensed physicians in Pakistan found that doctors using an LLM for diagnostic assistance scored 71.4% on clinical vignettes — compared to 42.6% for those using conventional resources alone. The key: physicians first completed a 20-hour AI-literacy curriculum covering LLM capabilities, appropriate use, and limitations before the trial began.
Why it matters
The 20-hour AI-literacy curriculum is the underrated detail here. It wasn't just "here's a chatbot" — physicians were trained on LLM capabilities, limitations (including hallucinations), appropriate use, and the necessity of output verification before using them clinically.
Read the study →
|
Same AI. Different Users. Different Results.
In the span of four days, two studies tested the same class of large language models in healthcare — and reached opposite conclusions.
On February 6, Nature Health published results from Pakistan showing that physicians using LLMs dramatically improved their diagnostic accuracy, by 28.8 percentage points (71.4% vs. 42.6%), in a single-blind RCT with 60 licensed doctors.
On February 10, Nature Medicine showed the flip side. When 1,298 UK patients used those same LLMs to assess their own symptoms, they performed no better than Googling. Tested in isolation, the models identified the relevant conditions 94.9% of the time, but that knowledge evaporated when real people tried to use it. Patients couldn't formulate the right questions, evaluate the AI's outputs, or calibrate severity.
The difference isn't the algorithm. It's who's using it — and how.
Takeaway
The question isn't "does medical AI work?" — it clearly does. The question is "work for whom, and under what conditions?" For health plan and system leaders, this reframes the entire deployment conversation. Patient-facing AI without guardrails and guidance is likely to underperform. Clinician-facing AI with even modest training can transform care. The investment in training and workflow design may matter more than the investment in the model itself.