DeepSeek AI's Low-Cost Models Suspected to Use OpenAI Data, Sparking Online Irony

by Riley, Apr 28, 2025

The controversy surrounding the DeepSeek AI models from China has intensified amid suspicions that they were developed using data from OpenAI. This week, President Donald Trump called DeepSeek a "wake-up call" for the U.S. tech industry, following a $600 billion drop in Nvidia's market value.

The introduction of DeepSeek caused a sharp decline in stocks for companies deeply invested in AI technologies. Nvidia, a dominant force in the GPU market essential for AI operations, experienced a historic 16.86% plunge in its shares. Other major players like Microsoft, Meta Platforms, and Google's parent company Alphabet saw declines ranging from 2.1% to 4.2%, while Dell Technologies, known for AI servers, dropped by 8.7%.

DeepSeek touts its R1 model as a cost-effective alternative to Western AI models like ChatGPT. Built on the open-source DeepSeek-V3, it reportedly demands less computational power and was trained for a mere $6 million. Although these claims have met with some skepticism, DeepSeek's emergence has called into question the massive investments American tech giants are making in AI, unsettling investors. The DeepSeek app quickly climbed to the top of the U.S. free app download charts, fueled by discussions about its capabilities.

Bloomberg reported that OpenAI and Microsoft are investigating whether DeepSeek used OpenAI's API to incorporate OpenAI's models into its own. OpenAI told Bloomberg, "We know PRC (China) based companies, and others, are constantly trying to distill the models of leading U.S. AI companies." Distillation, a technique in which a model is trained on the outputs of a more advanced model, violates OpenAI's terms of service.

OpenAI emphasized its commitment to protecting its intellectual property and collaborating with the U.S. government to safeguard its advanced models from competitors and adversaries. David Sacks, Trump's AI czar, told Fox News, "There’s substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI models, and I don’t think OpenAI is very happy about this."

DeepSeek is accused of using distillation to train its competing model on OpenAI's models. Image credit: Andrey Rudakov/Bloomberg via Getty Images.

Amid these developments, OpenAI's stance has drawn pointed criticism. Tech PR professional and writer Ed Zitron highlighted the irony, noting, "OpenAI, the company built on stealing literally the entire internet, is crying because DeepSeek may have trained on the outputs from ChatGPT."

In January 2024, OpenAI argued in a submission to the UK's House of Lords communications and digital select committee that it was "impossible" to develop AI tools like ChatGPT without using copyrighted material. The company said that because copyright covers virtually all forms of human expression, AI models that meet modern needs cannot be trained without such material.

The debate over using copyrighted materials to train AI has grown hotter as generative AI technologies have surged. In December 2023, The New York Times filed a lawsuit against OpenAI and Microsoft for the "unlawful use" of its content. OpenAI responded by defending the practice as "fair use" and dismissing the lawsuit as meritless.

This legal battle followed a September 2023 lawsuit by 17 authors, including George R. R. Martin, alleging "systematic theft on a mass scale." Additionally, in August of the same year, District Judge Beryl Howell supported the U.S. Copyright Office's stance that AI-generated art cannot be copyrighted, reaffirming the necessity of human creativity for copyright protection.
