AI is quietly poisoning itself and pushing models toward collapse - but there's a cure




ZDNET's key takeaways

  • When AI LLMs "learn" from other AIs, the result is GIGO.
  • You will need to verify your data before you can trust your AI answers.
  • This approach requires a dedicated effort across your company.

According to analyst firm Gartner, AI data is rapidly becoming a classic Garbage In/Garbage Out (GIGO) problem for users. That's because organizations' AI systems and large language models (LLMs) are being flooded with unverified, AI-generated content that cannot be trusted. 

Model collapse

You know this better as AI slop. While it's merely annoying to you and me, it's deadly to AI because it poisons LLMs with fabricated data. The result is what's called in AI circles "Model Collapse." AI company Aquant described the trend this way: "In simpler terms, when AI is trained on its own outputs, the results can drift further away from reality." 

Also: 4 new roles will lead the agentic AI revolution - here's what they require

However, I think that definition is much too kind. It's not a case of "can" -- with bad data, AI results "will" drift away from reality.  
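
To make that drift concrete, here is a minimal, purely illustrative sketch of the feedback loop -- my own toy example, not Aquant's or Gartner's. Treat a "model" as nothing more than a Gaussian fitted to data, then train each new generation only on samples drawn from the previous generation's fit. The sample sizes and seed are arbitrary assumptions.

```python
# Toy model-collapse demo (illustrative only): each "model" is a Gaussian fit,
# and every generation is trained solely on the previous generation's outputs.
import numpy as np

rng = np.random.default_rng(0)
real_data = rng.normal(loc=0.0, scale=1.0, size=200)   # original, human-made data

mu, sigma = real_data.mean(), real_data.std()
for gen in range(1, 21):
    # Train the next "model" only on synthetic samples from the previous one.
    synthetic = rng.normal(loc=mu, scale=sigma, size=200)
    mu, sigma = synthetic.mean(), synthetic.std()
    if gen % 5 == 0:
        print(f"generation {gen:2d}: mean={mu:+.3f}  std={sigma:.3f}")
```

Run it a few times and the fitted mean and standard deviation wander away from the original data, with a well-documented tendency for the variance to shrink over generations -- the statistical version of an LLM slowly forgetting the long tail of human knowledge.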

Zero trust

This issue is already apparent. Gartner predicted that 50% of organizations will have a zero‑trust posture for data governance by 2028. These enterprises will have no choice, because unverified AI‑generated data is proliferating across corporate systems and public sources. 

The analyst argued that enterprises can no longer assume data is human‑generated or trustworthy by default, and must instead authenticate, verify, and track data lineage to protect business and financial outcomes.
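
In practice, "authenticate, verify, and track data lineage" boils down to refusing to feed a model anything whose origin and integrity you can't prove. Here is a minimal sketch of what such a zero-trust check might look like; it's my own illustration, not Gartner's framework, and the trusted sources, record layout, and field names are all hypothetical.

```python
# Minimal zero-trust data check (hypothetical layout, not Gartner's framework):
# a record is only accepted if its origin is on an approved list AND its
# content hash still matches, proving it hasn't been swapped or tampered with.
import hashlib
import json

TRUSTED_ORIGINS = {"internal-crm", "vetted-vendor-feed"}   # hypothetical sources

def fingerprint(payload: dict) -> str:
    """Stable content hash used to detect silent tampering or substitution."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

def verify_record(record: dict) -> bool:
    """Reject data unless its origin is known and its hash still matches."""
    provenance = record.get("provenance", {})
    if provenance.get("origin") not in TRUSTED_ORIGINS:
        return False
    return provenance.get("sha256") == fingerprint(record["payload"])

payload = {"customer_id": 42, "churn_risk": 0.17}
record = {"payload": payload,
          "provenance": {"origin": "internal-crm", "sha256": fingerprint(payload)}}
print(verify_record(record))   # True only when origin and content both check out
```

The mechanics are trivial; the hard part is the people work -- deciding which origins deserve trust and keeping the provenance metadata honest as data moves between systems.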

Ever try to authenticate and verify data from AI? It's not easy. It can be done, but AI literacy isn't a common skill. 

Also: Got AI skills? You can earn 43% more in your next job - and not just for tech work

As IBM distinguished engineer Phaedra Boinodiris told me recently: "Just having the data is not enough. Understanding the context and the relationships of the data is key. This is why you need to have an interdisciplinary approach to who gets to decide what data is correct. Does it represent all the different communities that we need to serve? Do we understand the relationships of how this data was gathered?" 

Making matters worse, GIGO now operates at AI scale, which means flawed inputs can cascade through automated workflows and decision systems, producing progressively worse results. Yes, that's right: if you think AI bias, hallucinations, and simple factual errors are bad today, wait until tomorrow. 

To counter this concern, Gartner said businesses should adopt zero‑trust thinking. Originally developed for networks, zero-trust is now being applied to data governance in response to AI risks. 

Also: Deploying AI agents is not your typical software launch - 7 lessons from the trenches

Stronger mechanisms

Gartner suggested many companies will need stronger mechanisms to authenticate data sources, verify quality, tag AI‑generated content, and continuously manage metadata so they know what their systems are actually consuming. The analyst proposed the following steps:

  • Appoint an AI governance leader: Establish a dedicated role responsible for AI governance, including zero-trust policies, AI risk management, and compliance operations. However, this individual can't do the work by themselves. They must work closely with data and analytics teams to ensure AI-ready data and systems can handle AI-generated content.
  • Foster cross-functional collaboration: Cross-functional teams must include security, data, analytics, and other relevant stakeholders to conduct comprehensive data risk assessments. I'd add representatives of any department in your company that uses AI. Only the users can tell you what they really need from AI. This crew's job is to identify and address AI-generated business risks.
  • Leverage existing governance policies: Build on current data and analytics governance frameworks and update security, metadata management, and ethics-related policies to address AI-generated data risks. You'll have more than enough work without reinventing the wheel. 
  • Adopt active metadata practices: Enable real-time alerts when data is stale or requires recertification (a minimal sketch of such a check follows this list). I've already seen many examples where stale data produces wrong answers. For example, I recently asked several AI chatbots what Linux's default process scheduler is today. The common answer: the Completely Fair Scheduler (CFS). Yes, CFS is still in use on older kernels, but starting with 2023's 6.6 kernel, it was retired as the default in favor of the Earliest Eligible Virtual Deadline First (EEVDF) scheduler. My point is that anyone who doesn't already know Linux well would never realize the AI's answer was out of date. 
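
As a concrete illustration of that last recommendation, a crude version of the "stale data" alert Gartner describes can be as simple as recording when a dataset was last certified by a human and flagging anything past its window. This is a minimal sketch of the idea; the record fields, sources, and recertification windows are my own assumptions, not Gartner's tooling.

```python
# Minimal active-metadata staleness check (illustrative assumptions throughout):
# every dataset carries a certification date, and anything past its
# recertification window raises an alert before it feeds a model.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class DatasetRecord:
    name: str
    source: str                # e.g. "human-curated" or "ai-generated"
    certified_at: datetime     # last time a person verified this data
    max_age: timedelta         # recertification window

def needs_recertification(record: DatasetRecord, now: datetime | None = None) -> bool:
    now = now or datetime.now(timezone.utc)
    return now - record.certified_at > record.max_age

catalog = [
    DatasetRecord("kernel-faq", "ai-generated",
                  certified_at=datetime(2023, 1, 15, tzinfo=timezone.utc),
                  max_age=timedelta(days=180)),
]

for rec in catalog:
    if needs_recertification(rec):
        print(f"ALERT: '{rec.name}' ({rec.source}) is stale -- recertify before use")
```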

So, will AI still be useful in 2028? Sure, but ensuring it's useful -- and not leading you down a primrose path to a bad answer -- will require a lot of good, old-fashioned people work. On the bright side, that governance role will at least be a new job generated by the so-called AI revolution.
