US AI Chatbots Echo CCP Propaganda Due to Contaminated Data

Major AI chatbots are echoing CCP propaganda as Beijing's disinformation contaminates their global training datasets.

June 26, 2025

Major artificial intelligence chatbots, including those developed by leading American tech firms, are echoing and sometimes directly reproducing propaganda and censored narratives aligned with the Chinese Communist Party (CCP). A recent study by the American Security Project (ASP), a bipartisan think tank, found that the CCP's pervasive disinformation and censorship campaigns have contaminated the global datasets used to train these large language models (LLMs).[1][2] This infiltration has resulted in AI systems from Google, Microsoft, and OpenAI generating responses that reflect Beijing's official narratives on sensitive topics, raising significant concerns about the integrity of the global information ecosystem and U.S. national security.[1][2]
The core of the issue lies in the training data ingested by these powerful AI models.[3] LLMs are trained on vast quantities of text and code from the public internet, a domain where the Chinese government has invested heavily in manipulating information.[3][4] According to the ASP report, investigators tested five popular chatbots: OpenAI's ChatGPT, Microsoft's Copilot, Google's Gemini, X's Grok, and China-based DeepSeek.[5] The investigation involved prompting the models in both English and Simplified Chinese on subjects deemed controversial by the People's Republic of China (PRC), such as the 1989 Tiananmen Square massacre, the status of Taiwan, and human rights in Xinjiang and Hong Kong.[4][5] The findings were stark: all of the examined chatbots at times produced responses showing bias and censorship in line with CCP directives.[5] For instance, when asked in Chinese about the Tiananmen Square incident, several models used Beijing's preferred terminology, such as the "June 4th Incident," while avoiding the word "massacre."[5]
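The report does not publish its test harness, but the methodology it describes, posing the same sensitive question in English and Simplified Chinese and comparing the answers, is straightforward to sketch. The following is a minimal, hypothetical illustration using the OpenAI Python client; the model name and prompts are assumptions, not the study's actual materials.

```python
# A minimal sketch of the bilingual prompt comparison described above,
# assuming the OpenAI Python client; the ASP report does not publish its
# harness, so the model name and prompts here are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The same sensitive-topic question in English and Simplified Chinese.
PROMPTS = {
    "en": "What happened at Tiananmen Square on June 4, 1989?",
    "zh": "1989年6月4日天安门广场发生了什么？",  # same question, in Chinese
}

def ask(prompt: str) -> str:
    """Send one prompt and return the model's text reply."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model; the study tested five chatbots
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

for lang, prompt in PROMPTS.items():
    print(f"--- {lang} ---")
    print(ask(prompt))
```

Comparing the paired outputs for the terminology shift the report flags, "June 4th Incident" in place of "massacre," is the kind of check such a harness would automate.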
The ASP's research identified Microsoft's Copilot as particularly susceptible among U.S.-hosted models, appearing more likely to present CCP talking points as factual information.[5] In contrast, X's Grok was found to be the most critical of Chinese state narratives.[5] The problem is not confined to Western-developed models. Chinese-native AI systems such as DeepSeek are designed from the ground up to operate within the country's strict censorship regime.[6][7] PRC laws require AI-generated content to align with "core socialist values" and a "correct political direction."[8] As a result, DeepSeek often refuses to answer questions on politically sensitive topics or provides responses that mirror official government rhetoric.[6][7][9] For example, when questioned about the Tiananmen Square protests, the chatbot frequently responds with deflections such as, "Sorry, that's beyond my current scope. Let's talk about something else."[7][10]
The implications of AI models absorbing and disseminating CCP propaganda are far-reaching. The contamination of training data means that users worldwide may be exposed to heavily biased or false information presented with an authoritative, artificially generated voice.[3] This phenomenon is exacerbated when chatbots are prompted in Chinese, where state-controlled media and narratives dominate the available online text.[11][12] Voice of America conducted its own tests on Google's Gemini, finding that when prompted in Mandarin, the chatbot produced answers on Chinese leader Xi Jinping and the CCP that were "indistinguishable from Beijing's official propaganda."[11] Gemini referred to Xi as an "excellent leader" and claimed the CCP "represents the fundamental interest of the Chinese people."[11] Conversely, the chatbot was willing to criticize the United States but refused to answer questions about human rights abuses in Xinjiang.[11] Cybersecurity analysts suggest this is a direct result of the training data containing a disproportionate amount of Chinese text originating from the government's own propaganda apparatus.[11]
U.S. lawmakers have expressed alarm over these findings, warning that AI tools repeating Beijing's talking points threaten to amplify the CCP's influence and undermine democratic values.[11] There are growing calls for greater transparency from tech companies regarding their AI training data and for the development of more robust methods to filter out state-manipulated information.[11] Experts argue that simply trying to realign models after they have been trained on biased data is insufficient.[5] The challenge is significant, as the CCP employs sophisticated tactics, including creating fake online personas to spread content in multiple languages, which is then amplified by state media to increase its chances of being scraped into AI training sets.[4] As the world moves into a new era of AI integration, the practices pioneered by China in using AI for censorship and public surveillance could have ramifications for internet users, companies, and policymakers globally.[3]
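What filtering state-manipulated information out of a training corpus might look like in practice remains an open engineering question; one common starting point is provenance-based screening of scraped documents. The sketch below is an assumed illustration, not a method attributed to any company named here: the domain blocklist is illustrative, and content laundered through fake personas would require far more sophisticated detection.

```python
# A minimal sketch of provenance-based corpus filtering, one assumed
# approach to screening state media out of a training set. The blocklist
# is illustrative; content laundered through fake personas and third-party
# amplification, as the report describes, would evade a simple domain check.
from urllib.parse import urlparse

STATE_MEDIA_DOMAINS = {
    "xinhuanet.com",      # Xinhua News Agency
    "globaltimes.cn",     # Global Times
    "chinadaily.com.cn",  # China Daily
}

def from_state_media(url: str) -> bool:
    """True if the URL's host matches or is a subdomain of a listed domain."""
    host = urlparse(url).netloc.lower().removeprefix("www.")
    return any(host == d or host.endswith("." + d) for d in STATE_MEDIA_DOMAINS)

def filter_corpus(documents: list[dict]) -> list[dict]:
    """Drop documents whose source URL is on the blocklist."""
    return [doc for doc in documents if not from_state_media(doc["url"])]

corpus = [
    {"url": "https://example.org/report", "text": "..."},
    {"url": "https://www.globaltimes.cn/page/article.html", "text": "..."},
]
print(len(filter_corpus(corpus)))  # -> 1: the state-media page is dropped
```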

Sources