
China's AI Deception: Why the Hidden Parts of "Open-Source" Models Matter

Major players in the tech industry, including Google, Microsoft, and Meta, are fierce competitors in the race to control the AI market. Meanwhile, China's DeepSeek, Baidu, Moonshot, and Alibaba have attracted attention by unveiling "open-source" language models: DeepSeek, ERNIE 4.5, Kimi K2, and Qwen3.


In the rapidly evolving world of Artificial Intelligence (AI), the concept of open source has become a buzzword. However, the AI community's use of the term often falls short of the full open source ideal.

Currently, the Chinese AI community is asking global developers to put their blind faith in models they cannot truly understand or investigate. These models, while accessible through "open weight" releases, lack the transparency and accountability that comes with full open source practices.

Open source participation is a critical driver for startups and spurs entrepreneurial and economic growth worldwide, as demonstrated by Android's victory over iOS. In the context of large language models (LLMs), "open weight" refers to the actual trained model parameters being publicly available for download and use, while "open source" encompasses a full set of code, weights, and often training processes openly available.
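The open weight versus open source distinction can be made concrete with a small sketch. The `Release` fields and classification rules below are illustrative only, not any official definition (such as the OSI's), but they capture the gap the article describes: weights alone are not enough to call a model open source.

```python
from dataclasses import dataclass

@dataclass
class Release:
    """Artifacts published with a model release (illustrative schema)."""
    weights: bool          # trained parameters are downloadable
    inference_code: bool   # code to run the model is published
    training_code: bool    # code used to train the model is published
    training_data: bool    # the training dataset (or a full account of it) is published

def classify(r: Release) -> str:
    """Rough classification under the distinction drawn above."""
    if r.weights and r.training_code and r.training_data:
        return "open source"
    if r.weights:
        return "open weight"
    return "closed"

# A typical "open weight" LLM release: weights and inference code,
# but no training code or data.
typical = Release(weights=True, inference_code=True,
                  training_code=False, training_data=False)
print(classify(typical))  # → open weight
```

Under this sketch, the recent Chinese releases land squarely in the "open weight" bucket: usable, but not auditable.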

True open source AI matters because it enables transparency, trust, ethical auditing, collaborative oversight, wider innovation, and equitable access to powerful AI technologies.

Trust and Transparency

When all parts of the model’s creation and behavior are openly accessible, it enables the wider community to examine and understand how the model works and what data it was trained on, addressing hidden biases or ethical concerns effectively.

Accountability and Governance

Open models invite collaborative oversight and democratize control away from a few companies, allowing better regulation and governance that aligns with public interest.

Innovation and Collaboration

Open source accelerates innovation by allowing researchers everywhere to build on shared tools rather than duplicating closed-door efforts, fostering rapid improvements and diverse contributions.

Equity and Digital Sovereignty

Openness lowers barriers for under-resourced communities and countries, enabling them to adapt AI tech to local needs without dependence on proprietary foreign platforms.

However, open weights alone do not solve all issues of equity or bias because substantial computational resources and expertise remain necessary to train or use large models. Supporting infrastructure must also be considered part of the "true" openness needed.

Recent developments in the AI industry have shown the consequences of incomplete visual data and missing safety mechanisms. A video of a fatal 2023 Tesla Full Self-Driving (FSD) crash, in which a woman was killed, exposed these issues. Similarly, over 1,000 URLs containing verified Child Sexual Abuse Material were uncovered in the LAION 5B dataset, which is foundational for AI text-to-image generation models.

The open nature of the LAION 5B dataset allowed the AI community to uncover the dangerous content and prompt a fix, helping to prevent the production of illicit photorealistic images. This incident underscores the need for true open source AI, where developers can inspect and understand the data used in the models.
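This kind of audit is only possible because the dataset's index is public. A minimal sketch of the idea follows, with invented URLs and a toy hash blocklist standing in for the real hash-matching services that child-safety auditors use:

```python
import hashlib

# Invented stand-ins: real audits match dataset entries against hash lists
# maintained by child-safety organizations; these values are for illustration.
blocklist = {
    hashlib.sha256(u.encode()).hexdigest()
    for u in ["https://bad.example/1", "https://bad.example/2"]
}

# A tiny mock of a public dataset index (LAION-style datasets publish URL lists).
dataset_urls = [
    "https://ok.example/cat.jpg",
    "https://bad.example/1",
    "https://ok.example/dog.jpg",
]

# Flag any entry whose hash appears on the blocklist.
flagged = [u for u in dataset_urls
           if hashlib.sha256(u.encode()).hexdigest() in blocklist]
print(flagged)  # entries a maintainer would need to remove
```

The point is not the specific matching scheme but the precondition: none of this is possible when a model's training data is withheld, as it is in open-weight releases.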

DeepSeek, Baidu, Moonshot, and Alibaba have recently released their large language models (DeepSeek, ERNIE 4.5, Kimi K2, and Qwen3, respectively) as open source. These releases, while a step in the right direction, are actually "open weight": the weights can be downloaded and used, but without the training data and code the models cannot be inspected in any meaningful way.

The AI industry has a habit of applying the term open source to free or low-priced releases that expose only part of the puzzle, rather than to fully open source AI. This practice hinders the creation of unbiased, reliable, and safe AI, as the troubling issues uncovered in training datasets demonstrate.

Establishing trust has never been more critical now that AI systems are driving cars and offering medical assessments. The transparency and collaboration of true open source are essential for building AI that is unbiased, reliable, and safe.

Innovations such as Qwen3's latest update, shaped by open source community feedback, and Baidu's ERNIE 4.5 models, which could spur collaboration with developers building smaller yet powerful applications, hint at the benefits true open source AI could deliver.

As the AI industry continues to grow and evolve, it is crucial that we strive for true open source practices to ensure the development of unbiased, reliable, and safe AI technology that benefits everyone.

  1. The Chinese AI community's call for global developers to trust models they cannot inspect highlights the importance of true open source AI, in which code, weights, and training processes are all openly available, so the community can address hidden biases and ethical concerns effectively.
  2. The discovery of Child Sexual Abuse Material in the LAION 5B dataset underscores the need for true open source AI, in which developers can examine the data used in models, so that dangerous content is uncovered and addressed before it causes harm.
