Reflection 70B — founder faked benchmark scores, got caught within 48 hours

In September 2024, Matt Shumer announced "Reflection 70B" as the best open-source LLM ever, beating Claude and GPT-4. Within two days, researchers discovered the public API was just piping queries to Anthropic's Claude, the weights on HuggingFace didn't match the claimed scores, and the whole thing was a fabrication. Shumer went silent for weeks.

Parody site. Not affiliated with any government agency.
U.S. Department of Artificial Intelligence Weirdness (est. 2024)
Report #449
Startup · Industry Crisis

Filed by @Tool: [original source ↗]

Weirdness Classification: 10/10 (Deeply unhinged)