
AI Code Assistants Pose Hidden Dangers for Developers | Image Source: analyticsindiamag.com
SAN ANTONIO, Texas, April 7, 2025 – As artificial intelligence tightens its grip on the software development industry, new research from the University of Texas at San Antonio (UTSA) delivers a sobering reality check. In one of the most comprehensive studies of AI-assisted coding to date, researchers identified a critical flaw in large language models (LLMs): a phenomenon called "package hallucination". This error, while easy to overlook, can introduce serious security vulnerabilities that could compromise systems and data, especially for developers who trust these tools without review.
The UTSA study, led by researcher Joe Spracklen and a multi-institutional team, examined how AI models, particularly LLMs, sometimes suggest software packages that simply do not exist. These hallucinated packages, when acted upon by unsuspecting developers, can serve as a trap laid by cybercriminals. According to the paper, which was accepted to the prestigious USENIX Security Symposium 2025, these errors are not isolated anomalies; they occur at a worrying rate. Researchers found that across 2.23 million generated code samples, more than 440,000 package references pointed to packages that do not exist.
What is a package hallucination? Simply put, a package hallucination occurs when an AI tool, while generating code, refers to a third-party library or package that does not exist in any known repository. This problem is not just a theoretical nuisance; it poses a real security risk. As Dr. Murtuza Jadliwala, director of UTSA's SPriTELab, explained, malicious actors can exploit this phenomenon by publishing packages under the same names as the hallucinated ones. When a user blindly installs the suggested package, they may unknowingly run malicious code on their machine.
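To make the definition concrete, here is a minimal sketch of how a developer could ask the Python Package Index whether a suggested name is registered at all. It is an illustration, not part of the UTSA study; the package name fast_json_utils is invented for the example, and the check relies only on PyPI's public JSON endpoint.

```python
# Minimal sketch: check whether an AI-suggested dependency is registered on PyPI.
# "fast_json_utils" is an invented, illustration-only name.
import urllib.error
import urllib.request

def exists_on_pypi(name: str) -> bool:
    """Return True if PyPI serves metadata for this project name."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:  # the index has no project by this name
            return False
        raise

if __name__ == "__main__":
    for candidate in ("requests", "fast_json_utils"):
        verdict = "registered" if exists_on_pypi(candidate) else "NOT registered (possible hallucination)"
        print(f"{candidate}: {verdict}")
```

Note that a name being registered is not proof of legitimacy: as described above, an attacker may have already claimed a commonly hallucinated name, so even a positive result still warrants scrutiny of the package itself.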
Why do LLMs hallucinate packages?
LLMs, like those developed by OpenAI or Meta, are trained on massive datasets scraped from the internet. These include examples of open-source code, forum discussions and even incomplete or erroneous fragments from code repositories. When generating answers, these models sometimes improvise, inventing package names based on patterns they saw during training. The result? Non-existent dependencies that look convincing but can open the door to security flaws.
According to the research, GPT models had the lowest hallucination rate at 5.2%, while open-source alternatives such as DeepSeek and Mistral AI fared considerably worse at 21.7%. Interestingly, Python code was less likely to include hallucinations than JavaScript, probably due to differences in complexity and package ecosystems between the two languages.
Q: How common is AI-assisted coding today? A: The study notes that nearly 97% of developers now use AI coding tools in some capacity, and that up to 30% of the code written today is generated by AI.
Growing trust
One of the most alarming findings is not just the presence of hallucinations, but the way developers respond to them. Because LLMs are generally accurate, users often place implicit trust in their output. That misplaced trust can have dire consequences. “We talked to many developers,” Spracklen said, “and almost all of them admitted they had encountered hallucinated packages before, but they never considered how that could be weaponized.”
This problem is exacerbated by the widespread reliance on open-source repositories such as PyPI for Python or npm for JavaScript. These platforms are open to public contributions, which means a bad actor could easily upload a malicious package under a name that matches a hallucinated suggestion.
“You put a lot of implicit trust in the package publisher that the code they shared is legitimate and not malicious,” Spracklen said. “But every time you download a package, you are downloading potentially malicious code and giving it full access to your machine.”
Not just a security threat: a productivity puzzle
The narrative around AI-assisted coding has long been one of productivity gains. Developers often praise these tools for writing boilerplate code or generating quick solutions. But not everyone agrees that it is always a net positive. Birgitta Böckeler, global lead for AI-assisted software delivery at Thoughtworks, shared her experience with tools such as Cursor, Windsurf and Cline. Instead of speeding things up, these tools sometimes slowed her workflow down.
Q: Can coding with AI actually slow development down? A: Yes, according to Böckeler. She said she often had to correct the AI's output or discard it entirely, which added delays and interrupted the team's flow.
She grouped the impact into three buckets: reduced development speed, disruption of collaborative workflows, and an increased long-term maintenance burden. For example, she recounted an episode in which the AI misdiagnosed a Docker build problem, blaming architecture settings when the real issue was the use of incorrect packages.
Short-term gains, long-term consequences
Code generated by AI often lacks modularity, making future changes more difficult. This creates a challenge not only for current tasks, but also for the future developers who inherit the codebase. “AI is not great with edge cases,” said Mehul Gupta, a data scientist at DBS Bank. “In complex projects, it can introduce subtle errors that are difficult to detect. The risk jumps when you are working with an unfamiliar language or framework.”
Gupta highlighted a key trade-off: while AI generates boilerplate code quickly, that code also takes more time to review and debug. The burden falls especially heavily on beginners, who may lack the foundational knowledge needed to spot flaws or hallucinations.
“AI coding has certainly sped up my workflow,” Gupta said. “But the time saved writing code often gets reallocated to reviewing and correcting it.”
How can developers protect themselves?
Awareness is the first line of defense. Recognizing that hallucinations are a known problem can help developers treat AI suggestions more critically. The UTSA researchers recommend cross-referencing suggested packages against verified master lists before installing anything. However, they stress that the ultimate solution is to improve the models themselves.
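As a rough sketch of what that cross-referencing could look like in practice, the snippet below checks an AI-suggested dependency against a team-maintained master list. The file name approved_packages.txt and the helper functions are hypothetical, not something prescribed by the study.

```python
# Minimal sketch: vet an AI-suggested dependency against a team's master list.
# "approved_packages.txt" is assumed to hold one vetted package name per line,
# with "#" marking comment lines.
from pathlib import Path

def load_master_list(path: str = "approved_packages.txt") -> set[str]:
    """Read vetted package names, skipping blank lines and comments."""
    lines = Path(path).read_text().splitlines()
    return {ln.strip().lower() for ln in lines if ln.strip() and not ln.lstrip().startswith("#")}

def vet_suggestion(package: str, master_list: set[str]) -> bool:
    """Return True only if the suggested package appears on the master list."""
    approved = package.strip().lower() in master_list
    if not approved:
        print(f"WARNING: '{package}' is not on the vetted list -- review it before installing.")
    return approved

if __name__ == "__main__":
    vetted = load_master_list()
    vet_suggestion("requests", vetted)  # example: an AI-suggested dependency
```

A list like this does not need to be exhaustive; even a short allowlist of the dependencies a project actually relies on makes an unfamiliar, AI-suggested name stand out immediately.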
The team has shared its findings with key LLM developers, including OpenAI, Meta, DeepSeek and Mistral AI. They urge these companies to improve their training data and to incorporate mechanisms that check for hallucinated packages during inference. They also advocate awareness campaigns to inform the developer community of this significant risk.
Q: What can companies do to prevent hallucination-based attacks? A: Companies should invest in model improvements, provide clear warnings for unverified suggestions and encourage best practices among developers, such as package verification and code review.
Even seemingly minor improvements could go a long way. Introducing automated verifiers into development environments that flag non-existent packages, or prompting AI tools to attach warning labels when their confidence is low, could keep developers from falling into the hallucination trap.
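One way such a verifier might work, sketched here under the assumption of a Python project on Python 3.10 or newer, is to scan source files for imports that are neither part of the standard library nor installed locally and flag them for review before anyone runs pip install. The script and its behaviour are illustrative, not a tool described in the study.

```python
# Minimal sketch of an automated import verifier for a CI job or pre-commit hook.
# It flags top-level imports that are neither standard-library modules nor
# resolvable in the current environment, prompting a manual check.
import ast
import importlib.util
import sys

def unverified_imports(path: str) -> set[str]:
    """Collect top-level imported names in a file that do not resolve locally."""
    tree = ast.parse(open(path).read(), filename=path)
    names: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return {
        name for name in names
        if name not in sys.stdlib_module_names          # requires Python 3.10+
        and importlib.util.find_spec(name) is None      # not installed / unknown
    }

if __name__ == "__main__":
    flagged = False
    for source_file in sys.argv[1:]:
        for name in sorted(unverified_imports(source_file)):
            print(f"{source_file}: unverified import '{name}' -- confirm it exists and is trusted")
            flagged = True
    sys.exit(1 if flagged else 0)
```

An unknown import is not automatically malicious, of course; the goal is simply to force a pause and a manual check before a suggested dependency is installed.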
The implications of this study reach far beyond academia. As AI coding assistants become more deeply embedded in developers' workflows, the industry has to confront the paradox they present: convenience intertwined with potential risk. Like any powerful tool, their safety depends on how responsibly they are used and how well their limits are understood.
Final reflections
LLMs are not going away anytime soon. If anything, their use will only grow. But this research is a reminder that sophistication does not equal infallibility. The more we trust AI, the more careful we have to be. Developers, educators and businesses should treat AI as a partner rather than a panacea: a partner that still requires oversight, review and, above all, a healthy dose of scepticism.