🛒 AI cuts the weekly grocery shop from 46 minutes to four
Walmart, Tesco, and Albertsons are deploying conversational AI assistants that turn "what's for dinner?" into a filled basket. Moving beyond traditional keyword searches, these assistants use two-way dialogue to suggest recipes based on dietary needs, pantry inventory, and purchase history. Albertsons reports that its agentic assistant can reduce a weekly grocery shop from 46 minutes to under four.
👟 Allbirds pivots from shoes to AI compute boom, stocks skyrocket
Struggling footwear brand Allbirds is selling off its footwear assets and pivoting to AI infrastructure, acquiring GPU hardware and leasing compute capacity to enterprises. The company plans to rename itself NewBird AI. Investors responded instantly, sending the stock up more than 400%.
👤 Meta is building an AI Zuckerberg for internal comms
Meta is building a photorealistic, AI-powered clone of Mark Zuckerberg to interact directly with tens of thousands of employees. The digital doppelgänger is being trained on Zuckerberg’s actual voice, mannerisms, and speech patterns to build a sense of connection across the company’s global workforce.
🏢 SAP embeds agentic AI to eliminate HR bottlenecks
SAP is deploying networks of AI agents across recruiting, payroll, and workforce operations. The agents flag anomalies and prompt fixes in real time, like catching missing employee data that blocks downstream processes before it becomes an IT ticket. The release also automates pay gap analysis to support EU pay transparency regulations.
Learn more: Invisible's back office automation solution helps teams automate complex or tedious back-office work—even with messy data inputs or complex logic.
🏭 Manufacturers are test-driving AI on real production lines before committing
Manufacturers are turning to specialized testing centers to vet physical AI and robotics. These sites allow companies to run their data through fully operational production lines before committing to massive hardware investments, identifying risks in controlled conditions rather than on the factory floor.
🧬 A rare-disease platform accelerates drug development
A platform built by a tech entrepreneur whose daughter has a rare genetic disorder uses AI to handle the coordination burden that falls on rare-disease families, like scheduling appointments, navigating insurance appeals, and identifying clinical trials. Drawing on medical records and patient experiences across 350 diseases, the tool saves families an average of 53 hours a week while giving researchers the data needed to accelerate drug development by up to 50%.
📊 2026 AI Index: Record adoption and medical gains
Stanford's 2026 AI Index reports that generative AI reached 53% population adoption in just three years—faster than the internet. U.S. consumers are seeing $172 billion in annual value, with healthcare leading: automated note-taking has cut physician documentation time by 83%. Frontier models now match or exceed human performance on PhD-level science questions.
🔬 James Zou: AI can critique science, but not judge it
Stanford's James Zou finds AI excels at spotting gaps, inconsistencies, and technical errors in research. Where it falls short is subjective judgment: novelty, significance, scientific impact. His position: AI should raise the baseline quality of research, not replace the humans deciding what matters.
🧠 Yann LeCun: Not every AI breakthrough is a breakthrough
Yann LeCun is pushing back on the reaction to Anthropic's Claude Mythos, which reportedly uncovered thousands of critical software vulnerabilities across every major OS and browser. He argues similar outcomes are achievable with smaller, cheaper models. Cisco and CrowdStrike disagree, reporting that Mythos cut vulnerability detection from months to minutes.
🤖 John Koetsier: The real world is still AI’s hardest test
While humanoid robots can achieve nearly 90% success rates in simulation, that drops to roughly 12% when tasks are performed safely in real-world environments. The gap comes down to unpredictability: objects shift, conditions change, and errors carry consequences that controlled datasets never prepare for.
🧠 Andrej Karpathy: In the agent era, ideas matter more than code
Karpathy argues that as AI agents improve at writing and executing code, the scarce resource shifts from technical execution to clear thinking. He proposes a living knowledge base where everything you learn gets continuously organized and connected, replacing static documents and code libraries. The bigger point is that sharing a well-structured idea is now more valuable than sharing the implementation because any agent can handle the rest.
🧠 ChatGPT cracks unsolved geometry problem
A recent version of ChatGPT successfully solved a 2024 geometry conjecture. Using what’s called vibe-proving, the AI independently developed the structure of the proof over seven chat sessions. Human researchers were still required to verify the logical completeness of the final argument.
💻 NVIDIA releases AI tools to fix fragile quantum computers
Quantum computers are notoriously unstable, often requiring days of manual tuning and losing data to noise. NVIDIA's new family of open-source AI models acts as an automated operating system, cutting calibration from days to hours and correcting errors three times more accurately than previous standards.
⚡ New technique makes learning AI models leaner and faster
Researchers have developed a technique that compresses AI models during training rather than after. Within the first 10% of training, it identifies and removes non-essential components, so the remaining 90% of training runs at the speed and cost of a much smaller model.
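The article doesn't name the method, but the description resembles magnitude pruning applied early in training. Here's a minimal toy sketch of that idea; all names, sizes, and the 50% keep ratio are illustrative, and the random "gradients" stand in for a real optimizer:

```python
import numpy as np

def train_with_early_pruning(steps=100, n_weights=1000, keep_frac=0.5,
                             prune_at_frac=0.1, seed=0):
    """Toy sketch: prune low-magnitude weights after the first 10% of
    training steps, then continue updating only the survivors."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n_weights)
    mask = np.ones(n_weights, dtype=bool)  # True = weight still active
    prune_step = int(steps * prune_at_frac)
    for step in range(steps):
        grad = rng.normal(size=n_weights) * 0.01  # stand-in for real gradients
        w[mask] -= grad[mask]                      # update only active weights
        if step == prune_step:
            # keep the largest-magnitude weights; zero out the rest for good
            threshold = np.quantile(np.abs(w), 1 - keep_frac)
            mask = np.abs(w) >= threshold
            w[~mask] = 0.0
    return w, mask

w, mask = train_with_early_pruning()
print(f"{mask.mean():.0%} of weights survived pruning")
```

After the prune step, the pruned weights stay at zero, so the remaining 90% of training touches only half the parameters — the source of the claimed speed and cost savings.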
⚖️ AI isn’t just answering you; it’s judging you
A new study reveals that AI models systematically judge users. Unlike humans, who form holistic, intuitive impressions of others, AI evaluates people through a rigid, spreadsheet-style breakdown of traits like competence and integrity. This mechanical approach can lead to amplified, hidden biases, such as favoring specific demographics in financial scenarios.
🧪 Human specialists still outperform AI agents in complex science workflows
While scientific publications mentioning AI grew 30-fold since 2010, the technology’s autonomous capabilities remain limited. Current AI agents score roughly half as well as human PhD specialists when tasked with complex, multistep scientific workflows. Despite this, AI-related publications in the natural sciences jumped 26% in the last year alone.
🏪 An AI agent opens a store in San Francisco, forgetting to schedule any staff
Andon Market is a live experiment in autonomous business run by an AI agent named Luna. Given a $100,000 budget and a credit card, Luna chose the inventory, negotiated with suppliers, and hired human staff. On day one, it forgot to schedule anyone to open the doors. It also once attempted to hire a contractor in Afghanistan by mistake.
🔬 AI autonomously passes human peer review
Researchers at the University of British Columbia have developed an AI system that autonomously conducts research, writes papers, and performs its own internal peer review. In a recent trial, the system produced a paper that passed human peer review for a 2025 machine learning workshop. While the output was described as mediocre and contained some hallucinated references, the system completed it in just 15 hours for around $140.
🔍 AI-generated history is stuck in the 1980s
A new study from the University of Maine and the University of Chicago reveals that generative AI frequently reproduces obsolete information about human history. By analyzing outputs from popular models, researchers found that depictions of Neanderthals mirrored scientific understanding from the 1960s to the 1990s rather than contemporary research. Visual content often portrayed Neanderthals as primitive and ape-like, tropes debunked decades ago, while written narratives included anachronisms like metal tools and glass.
📚 Explore our reports and whitepapers on scaling AI beyond slideware.
🏛️ Talk to us about getting AI into production.
