JUNE 2025
7 Studies from July: Education, Health, and Sound as the New Interface
From AI diagnosticians that outperform doctors at a fraction of the cost to sound as UX and the future of universities. July's key research, explained.
Angelina Zaitseva
01. The One-Man-Band Doctor: Microsoft Teaches AI to Diagnose (and Save Money)
Microsoft is peering into the immediate future of medical diagnostics — one that will determine how much you pay for tests, how long you spend in line, and how quickly you hear the correct diagnosis. Microsoft AI assembled SDBench — a "testing ground" of 304 real clinical case discussions, but with an important distinction: as in real life, every step costs time and money. The metric is not guessing the disease, but $ per correct Dx — how many dollars are spent to arrive at the correct diagnosis when the doctor (or AI) sequentially: asks questions → orders tests → draws a conclusion. On top of the base models, the authors deployed MAI-DxO — an "orchestrator," i.e., a set of coordinated AI roles: one generates hypotheses, another selects tests, a third pushes back on expensive procedures, a fourth monitors costs, and a fifth follows a checklist. In this setup, AI becomes both more accurate and more affordable.

How the study was conducted: each case was converted into a chain of "visits"; prices were based on U.S. billing rates. Diagnostic accuracy was verified by an "LM judge" — another model that scored outputs according to a rubric (disease core, cause, localization, specificity, completeness). Where text could be ambiguous, agreement with physicians was measured; credit was given at 4 out of 5 points or higher. Physician participants (n=21) achieved 19.9% diagnostic accuracy at $2,963 per case; GPT-4o — 49.3% at $2,745; o3 — 78.6% at $7,850 (expensive, but it more frequently ordered appropriate tests). With MAI-DxO on top of o3, the result was 81.9% at $4,735; in "budget mode" — 79.9% at $2,396; a model ensemble raised the bar to 85.5% at $7,184. Results held on a held-out validation set. An illustrative episode: in a poisoning case, the base model "latched onto" an incorrect hypothesis and "burned" $3,431; the orchestrator clarified the source (hand sanitizer) → ordered a single targeted test → reached the diagnosis for $795. For completeness, the authors compared their AI against other top models (OpenAI, Gemini, Claude, Grok, DeepSeek, Llama).
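The headline metric can be sketched in a few lines: average spend per case divided by diagnostic accuracy gives "dollars per correct diagnosis." The figures below are the reported (cost, accuracy) pairs from the study; the division itself is my reading of the metric, not the authors' code.

```python
# A minimal sketch of SDBench's "$ per correct Dx" metric: average cost
# per case divided by the share of cases diagnosed correctly.
# Numbers are the reported results; the aggregation is an assumption.

def cost_per_correct_dx(avg_cost_per_case: float, accuracy: float) -> float:
    """Average dollars spent to obtain one correct diagnosis."""
    return avg_cost_per_case / accuracy

# Reported (avg. cost per case in USD, diagnostic accuracy) pairs:
results = {
    "physicians (n=21)":    (2963, 0.199),
    "GPT-4o":               (2745, 0.493),
    "o3":                   (7850, 0.786),
    "MAI-DxO on o3":        (4735, 0.819),
    "MAI-DxO, budget mode": (2396, 0.799),
}

for name, (cost, acc) in results.items():
    print(f"{name:22s} ${cost_per_correct_dx(cost, acc):>8,.0f} per correct Dx")
```

By this measure, budget-mode MAI-DxO is the cheapest route to a correct answer (roughly $3,000 per correct diagnosis), while the physician baseline, with its low accuracy, is the most expensive (roughly $14,900).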

If confirmed in real-world practice, clinical workflows will shift from "chatbots" to agentic orchestrators with built-in budget logic: patient triage and consultations will more often land on the correct diagnosis at lower testing costs; insurers and regulators will gain a transparent metric — "dollars per correct diagnosis" — and the "black box" will open as an auditable step log; medical training will move from isolated exercises to sequential reasoning simulators. But the limitations matter: NEJM cases represent a complex "academic" selection (it is unclear how the model would perform on routine, straightforward patients); prices reflect U.S. rates and do not account for logistics, wait times, or invasiveness; physicians worked without consultations; and the "LM judge" assessment, while close to physician judgment, still requires validation against real outcomes. In other words: the potential is substantial — but before changing protocols, prospective clinical trials are needed.
02. AI in Every Service: The AI Index 2025
Within the next 1–2 years, "smart" features will be embedded in nearly every service — from banking to navigation — affecting prices, privacy, and the quality of the decisions made around us. The Artificial Intelligence Index 2025 (Stanford HAI) reports that the cost of running a model response (inference) has dropped roughly 280-fold, to ~$0.07 per 1 million tokens. This means AI will become a mass-market layer of digital life, but for now businesses see modest returns (operational cost savings are most often <10% and revenue growth <5%), so the competition is not for the "biggest AI" but for the right balance of cost and utility while adhering to safety standards.
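The 280-fold figure is easy to sanity-check. A short sketch: the implied "before" price is my back-calculation, not a number quoted in the report.

```python
# Back-of-the-envelope check on the AI Index figure: a ~280-fold drop to
# ~$0.07 per 1M tokens implies a starting price near $20 per 1M tokens.
# The implied earlier price is an inference, not a figure from the report.

PRICE_NOW = 0.07   # USD per 1M tokens (AI Index 2025)
DROP = 280

print(f"implied earlier price: ${PRICE_NOW * DROP:.2f} per 1M tokens")  # $19.60

def inference_cost(tokens: int, usd_per_million: float = PRICE_NOW) -> float:
    """USD cost of serving `tokens` tokens at the given rate."""
    return tokens / 1_000_000 * usd_per_million

# At today's rate, a ~1,000-token chatbot answer costs well under a cent:
print(f"${inference_cost(1_000):.5f}")  # $0.00007
```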

If the authors' conclusions hold, we can expect more on-device AI powered by NPUs (built-in neural processing units) and a growing role for SLMs — small, specialized models that run faster and cheaper than "giants." Interfaces will feature more "explainability" modes and accountability labels, while content will carry authenticity watermarks (invisible markers indicating the synthetic origin of text or images). The competitive edge will go to ecosystems that can carefully blend real and synthetic data (AI-generated data used for training) without degrading model quality.
03. A Diagnosis for Humanity from the WHO
The annual World Health Statistics 2025 report paints a troubling picture: the share of households whose out-of-pocket medical expenses consume >10% of their budget remains at 13.5%, and in 2019 medical payments pushed 344 million people into extreme poverty. At the same time, after two decades of progress, healthy life expectancy (HALE) has dropped by 1.5 years, to 61.9. In other words, universal health coverage (UHC) is expanding too slowly, and the cost and accessibility of treatment continue to hit households hard.

Between 2000 and 2019, HALE rose by 5.4 years, but between 2019 and 2021 it fell to 61.9 — due to COVID and its indirect effects; mental health disorders additionally contributed approximately –0.12 years. The world is falling short of UN targets: maternal mortality stands at 197 per 100,000 (far from the target of <70 by 2030); child mortality — U5MR at 37 per 1,000 (deaths under age 5) and NMR at 17 per 1,000 (neonatal deaths), with enormous disparities across regions. Premature mortality from non-communicable diseases (heart attack, stroke, cancer, diabetes) is barely declining: to meet the target of "–⅓ by 2030," an average annual reduction rate of ~2.7% is needed, whereas the actual rate is only ~0.5–1.3%. Road traffic injuries claim 1.18 million lives per year (with the highest rates in Africa and low-income countries); suicides — 727,000; homicide rates are highest in the Americas. Air pollution is linked to 6.7 million deaths (2019); water, sanitation, and hygiene (WASH) issues — to 1.4 million. HIV incidence has declined by 48% since 2010; tuberculosis is trending downward; malaria has been rising since 2015; antimicrobial resistance is a growing threat. Of the WHO's "Triple Billion" initiative (an additional +1 billion people with healthier lives; +1 billion with UHC; +1 billion protected from health emergencies), only the first target has been exceeded; on UHC, the 2025 gain is only ~+500 million against a target of +1 billion; emergency protection — ~+697 million.
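The ~2.7% figure follows from compound-decline arithmetic. A sketch, assuming the SDG 3.4 window of 2015–2030 (15 years); the constant annual rate r needed for a one-third reduction satisfies (1 − r)^15 = 2/3.

```python
import math  # not strictly needed; the power operator suffices

# Checking the WHO arithmetic: a one-third cut in premature NCD mortality
# over 15 years requires a constant annual decline r with (1 - r)**15 == 2/3.
# The 15-year window (2015-2030) is my reading of SDG target 3.4.

def required_annual_decline(total_reduction: float, years: int) -> float:
    """Constant yearly decline rate achieving `total_reduction` over `years`."""
    return 1 - (1 - total_reduction) ** (1 / years)

r = required_annual_decline(1 / 3, 15)
print(f"required: {r:.1%} per year")  # ≈ 2.7%, matching the report

# At the actual observed pace (~0.5–1.3% per year), the 2030 shortfall:
for actual in (0.005, 0.013):
    achieved = 1 - (1 - actual) ** 15
    print(f"at {actual:.1%}/yr → only {achieved:.0%} reduction by 2030")
```

Even at the top of the observed range (1.3% per year), the cumulative reduction by 2030 falls well short of one third.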

Some of the data lags (especially for 2021–2023); cause-of-death classifications during the pandemic are sometimes conflated; the quality of national registries varies; and projections are sensitive to policy and economic conditions. This is a global panorama — to act on it, it must be broken down into local detail. But the overall signal is clear: without accelerating UHC, reducing inequalities, and investing in primary care, we will lose more years of healthy life, and millions of families will continue to overpay for basic medicine.
04. Haven't Read It, but I Disapprove
The Digital News Report 2025 finds that social media feeds and short-form video are becoming the primary "gateway" to the news agenda, and the first AI chatbots are already serving as a news source (on average for 7% of the audience, and 15% of those under 25). Trust in media hovers around ~40%, while influencers and national politicians are named as the leading sources of potential disinformation. Willingness to pay for news is limited to an average of 18% across 20 countries.

News increasingly arrives via platforms — Facebook (36% use it for news weekly), YouTube (30%), Instagram/WhatsApp (19% each), TikTok (16%), X (formerly Twitter) (12%). Video consumption has risen to 72% weekly; social video — to 65% (up from 52% in 2020). Concern about distinguishing truth from falsehood is felt by 58% of respondents (up to 73% in Africa and the U.S.). In this environment, audiences expect AI in newsrooms to deliver "faster and cheaper," but fear opacity and errors — which is why the working strategy for publishers is video-first products, subscription bundles, and "human in the loop" when deploying AI (an editor verifies, explains, and takes responsibility).

If the researchers are to be believed, the media ecosystem will become even more platform-dependent: the winners will be video newsrooms and creators who know how to work with TikTok/YouTube and chat-search, as well as brands with transparent moderation standards; the losers will be "text by default" and standalone paywall models. But the report has important limitations: it is an online survey, which means it poorly captures people without stable internet access, older demographics, and lower-income groups. In several countries (India, Kenya, Nigeria, South Africa), the sample consisted primarily of English-speaking users aged 18–50 — this is not the country's entire audience. The sample is non-random, so small differences (of 1–2 percentage points) should not be interpreted as genuine shifts. And: some sections were produced with the support of industry partners (including platforms) — this is worth bearing in mind. But the overall signal is clear: news is increasingly consumed via video and social media, artificial intelligence is becoming a new "gateway" to news, and trust in sources needs to be rethought and rebuilt.
05. Sound as the New Interface
Sound has already become an everyday interface — from voice assistants and short-form videos to podcasts and notifications. If you are building a product, media outlet, or research project, the academic volume Listening in: Perspectives on Sound, Voice, and (Popular) Music Studies will help you hear context: where sound "resonates" (evokes a response), how voice is packaged in cultural and gender codes, and what exactly platforms impose on us. In other words, this is a manual for getting tone, voice, and audio UX right.

Inside are five sections covering different aspects of working with sound: RESONATING (why some sounds are "accepted" and others are not), HEARING (field acoustics and "sound mapping" of a cathedral; the attempt to "hear" even static images), PERFORMING (improvisation and historical practices for understanding the meaning of performance), GENDERING (how voice constructs and contests gender — from The Magic Flute to punk and the hijra community), DIGITALIZING (how smartphone interfaces, Spotify, and voice assistants encode behavior; what "fan labor" in K-pop is — fans' contributions to promotion and content production).

The researchers forecast that voice products will move away from "universal" timbre toward inclusive and context-sensitive voices; "like/dislike" metrics will give way to measurements of resonance and trust; media and brands will begin designing audio together with fan communities (co-creation) rather than on top of them. The winners will be those who design the context and relationships around sound. Limitations: chapters are uneven and localized (some are student case studies); these are interpretations, not rigorous causal proofs; the data is insufficient for "panoramic statistics"; findings need to be validated against your own audiences and use cases. But if you have never thought about sound as a social interface, this volume is an excellent entry point into the subject.
06. The University of the Future
If the conclusions of the EDUCAUSE Horizon Report 2025 come to pass, we can expect universities where AI is integrated into the learning process under clear rules: every tool has well-defined boundaries of application, every assignment has transparent criteria, and your degree competencies are recorded not only through grades but also through digital badges (micro-credentials for specific skills) and CLR (Comprehensive Learner Record — a "through-line portfolio" of achievements). Courses will become more blended (offline + online), micro-learning will emerge (short modules in place of long lectures), and LMS platforms (Learning Management System — a course's digital "gradebook") will host AI assistants with human-in-the-loop oversight.

According to the report's authors, most universities will adopt registries of "approved AI" (curated catalogs of vetted tools and permitted uses); courses will be designed "around skills" rather than solely "around credit hours"; and digital badges/CLR (machine-readable evidence of specific competencies) will become a fixture in student résumés. IT departments will shift from "block by default" to flexible AI governance, and instructors will receive short, structured upskilling programs on GenAI instead of ad hoc workshops. Simultaneously, defenses against attacks on the EdTech perimeter (everything connecting learning services and student data) will be strengthened, and AI agents will take on the role of course assistants — but with action logging and a clear "access button" to a human. In the market, universities with a fast "governance loop" and teaching support centers will win; those that bet on "total bans" or lock into a single AI vendor will lose.

If you are a student — expect more AI tools alongside your courses and more responsibility for fact-checking. If you are an instructor — prepare for short but regular GenAI upskilling and for clear rules on AI use in assignments and assessment. If you run a program — without AI governance, cyber hygiene, and critical digital literacy, there will be no resilience, even if the tools are "state of the art."
07. GenAI in Education: What the Effect Sizes Say
The special issue of the journal Educational Technology & Society shows that if schools and universities integrate AI with smart pedagogy, the average student will learn measurably better (mean effect g ≈ 0.57–0.69: outcomes shift from the "middle" of the group to approximately the 70th–76th percentile).

The editors selected 11 peer-reviewed studies from 53 submissions on GenAI in education — spanning K-12 through university. Two meta-analytic papers found a sustained "medium-to-large" positive effect: overall learning outcomes improved across three domains simultaneously — cognitive (knowledge and skills, g ≈ 0.60), behavioral (learning engagement, g ≈ 0.70), and affective (motivation/confidence, g ≈ 0.48). Field experiments show that AI is particularly beneficial when wrapped in reflection and structured feedback.
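The "percentile" reading of these effect sizes comes from a standard conversion: under a normal model, the average treated student lands at the Φ(g)-th percentile of the untreated distribution. A sketch, assuming normally distributed outcomes:

```python
from statistics import NormalDist

# Standard interpretation of Cohen's d / Hedges' g: the mean student in the
# treatment group sits at the Phi(g)-th percentile of the control group.
# Assumes (as the meta-analyses implicitly do) roughly normal outcomes.

def percentile_of_average_student(g: float) -> float:
    """Percentile of the control distribution reached by the mean treated student."""
    return NormalDist().cdf(g) * 100

for g in (0.48, 0.57, 0.60, 0.69, 0.70):
    print(f"g = {g:.2f} → ~{percentile_of_average_student(g):.0f}th percentile")
```

For the reported range g ≈ 0.48–0.70, this gives roughly the 68th–76th percentile, consistent with the figures cited in the issue.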

According to the authors, learning courses will undergo a mass transition to a "Human-GenAI symbiosis" model (human + AI together): for programming — GenAI-assisted debugging, for writing — "AI as editor and coach," for argumentation — an AI interlocutor that challenges and asks for justification. Schools and universities will begin standardizing assessment to avoid inflating results through "in-house" tests, and will introduce learning analytics (the secure analysis of AI interaction logs to understand what actually helps). Narrow "grammar checkers" will be replaced by agentic AI assistants, and curricula will include dedicated modules on critical digital literacy (how to verify facts, identify errors, and detect AI "hallucinations").

That said, effects remain heterogeneous for now: studies are short, samples are often small, and many are quasi-experiments (transferability to other schools and courses is limited). Where researcher-designed tests are used, the effect is frequently higher than on standardized measures — meaning that part of the "success" may be attributable to the assessment method itself. In programming, the impact on computational thinking (the ability to decompose problems, abstract, and model) remains ambiguous.