OpenAI’s “Open” AI Models Drop After Six Years of Silence

OpenAI breaks six-year open-source silence with dual model release, but 49-53% hallucination rates expose the trade-offs of accessible AI.

By Al Landes


Image credit: Wikimedia

Key Takeaways


  • OpenAI releases first open-weight models since 2019, responding to competitive pressure
  • Two models available: enterprise-grade 120B and consumer-friendly 20B parameters
  • High hallucination rates reveal significant gaps compared to proprietary alternatives

Racing to catch up with DeepSeek and Alibaba’s Qwen models, OpenAI just dropped gpt-oss-120b and gpt-oss-20b—their first open-weight releases since GPT-2 made headlines in 2019. The timing isn’t coincidental.

Chinese labs have been eating OpenAI’s lunch in the open-source space while policymakers demand more accessible AI infrastructure. This dual release targets different deployment scenarios with surgical precision.

The heavyweight gpt-oss-120b packs roughly 117 billion parameters but demands an 80GB GPU—think industrial-strength reasoning for enterprises with serious hardware budgets. Meanwhile, gpt-oss-20b runs comfortably in a laptop’s 16GB of RAM, bringing advanced reasoning to developers working from coffee shops.

Both models ship under Apache 2.0 licensing, meaning you can modify, commercialize, and deploy them without OpenAI’s permission slip. Performance benchmarks tell a compelling story: gpt-oss-120b posted a Codeforces rating of 2622, trailing OpenAI’s proprietary o4-mini by fewer than 100 points.

Yet these models stumble badly on factual accuracy. Hallucination rates hit 49-53%, compared to just 16% for OpenAI’s closed o1 model—like switching from Netflix’s curated originals to YouTube’s wild west of user content.

You’re getting reasoning capabilities, but fact-checking becomes your responsibility. The “open” label also carries asterisks: training data and full code remain locked away, citing copyright lawsuits and security concerns.

If you’re developing AI applications that prioritize local deployment over perfect accuracy, these models represent a genuine breakthrough. The 120B version particularly shines at coding and mathematical reasoning tasks where hallucination matters less than logical structure.

However, enterprises requiring high-stakes factual reliability should probably stick with OpenAI’s premium offerings—at least until the open-source community works its usual magic on reducing those error rates. This release signals OpenAI’s recognition that the future belongs to whoever controls both the premium and accessible AI markets, not just the cutting edge.
