Systems once limited to autocomplete and code suggestions are now being built to inspect repositories, plan fixes, generate pull requests, run tests and adjust their own behaviour in response to feedback. The change is most visible in agentic coding tools embedded in developer workflows, where platforms from GitHub, OpenAI, Anthropic, Google, Amazon and Cursor are competing to become the command layer for software engineering.
GitHub’s Copilot coding agent, made generally available to paid subscribers in 2025, can research a repository, create a plan, change code on a branch and open pull requests. Anthropic’s Claude Code moved from a research preview into wider availability with integrations across terminals and development environments. OpenAI’s Codex has been pushed deeper into ChatGPT and developer workflows, while Google has been repositioning its Antigravity development platform around agent-first software work.
The appeal is straightforward. Engineering teams face pressure to deliver more features, patch vulnerabilities faster and maintain sprawling codebases with fewer delays. AI agents promise to compress routine work, from refactoring and documentation to bug fixing and test generation. Developer surveys show high adoption of AI tools, with a large majority either using or planning to use them in software development. Research on AI-assisted teams also points to productivity gains and improved perceptions of code quality when tools are paired with mature engineering practices.
Yet self-improving code raises a harder question than ordinary automation: who is accountable when software changes itself? A human developer can be reviewed, trained or disciplined. A model-driven system can produce a plausible patch for reasons that are difficult to reconstruct, especially if it relies on context from tickets, comments, logs, dependencies and production telemetry. When that patch affects a banking platform, hospital device, vehicle system or critical infrastructure service, the distinction between assistance and authorship becomes legally important.
Engineering leaders are therefore treating autonomous code generation less as a replacement for developers and more as a new supply-chain risk. Code produced by AI must still pass tests, security checks and peer review, and it must leave an audit trail. The strongest deployments keep humans in control of approvals, restrict agents to sandboxed environments, require signed commits and log every prompt, tool call and file change. The weaker deployments risk creating a “black box” development process in which software evolves faster than governance can follow.
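As a rough illustration of how such controls might be enforced before an agent-authored change is merged, the sketch below checks one change against the safeguards described above. Every name in it (`AgentChange`, `merge_blockers`, the event labels) is hypothetical and invented for this example; it does not reflect any vendor's actual API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Event kinds that should appear in the audit log (illustrative labels).
REQUIRED_EVENTS = {"prompt", "tool_call", "file_change"}


@dataclass
class AgentChange:
    """Hypothetical record of one agent-authored change awaiting merge."""
    commit_signed: bool
    ran_in_sandbox: bool
    tests_passed: bool = False
    human_approver: Optional[str] = None
    audit_events: List[str] = field(default_factory=list)


def merge_blockers(change: AgentChange) -> List[str]:
    """Return the reasons a change may not be merged; empty means allowed."""
    blockers = []
    if not change.commit_signed:
        blockers.append("commit is not signed")
    if not change.ran_in_sandbox:
        blockers.append("agent ran outside a sandbox")
    if not change.tests_passed:
        blockers.append("tests did not pass")
    if not change.human_approver:
        blockers.append("no human approval recorded")
    # Assumption: each required event kind must be logged at least once.
    missing = REQUIRED_EVENTS - set(change.audit_events)
    if missing:
        blockers.append("audit log missing: " + ", ".join(sorted(missing)))
    return blockers
```

A change that is signed, sandboxed, tested, approved by a named reviewer and fully logged yields an empty list; anything else returns the specific gaps, which can themselves be written to the audit trail.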
Security findings have reinforced the caution. Studies of AI-generated code have found recurring weaknesses, including flawed authentication logic, injection vulnerabilities, unsafe defaults and incomplete error handling. Iterative improvement loops can also degrade security if the agent optimises for passing tests or satisfying a prompt rather than preserving threat models. That matters because attackers are also adopting AI to discover vulnerabilities, generate exploits and automate reconnaissance.
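To make the injection risk concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table, data and function names are invented for illustration. Both lookups behave identically on ordinary input, so a test suite that covers only the expected case would pass either one, but only the parameterised version resists a crafted input.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two illustrative users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str):
    # Vulnerable: the input is pasted into the SQL text and can rewrite it.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str):
    # Parameterised: the driver treats the input strictly as a value.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    db = make_db()
    payload = "x' OR '1'='1"
    print(find_user_unsafe(db, payload))  # every row is returned
    print(find_user_safe(db, payload))    # no rows are returned
```

The point of the comparison is that a functional test on `"alice"` cannot tell the two functions apart, which is why security checks need to sit alongside ordinary tests rather than be inferred from them.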
Regulators are moving in the same direction. The European Union’s AI Act is bringing obligations around risk management, documentation, transparency and human oversight for higher-risk AI systems. The EU’s separate product liability reforms broaden the legal treatment of software, digital services and updates, making it harder for companies to argue that harms caused by defective code fall outside traditional product rules. For self-improving systems, liability may turn on whether a company retained control over updates, failed to provide adequate safeguards or allowed foreseeable risks to enter production.
The commercial race is still accelerating. GitHub says more than 150 million developers use its platform, and Copilot has been adopted by tens of thousands of organisations. Anthropic is marketing Claude Code as a tool that can understand codebases, edit files and run commands with user permission. OpenAI is expanding Codex across devices and workflows. Amazon Q Developer, Google’s Gemini-linked tooling and Cursor’s AI-first editor are pushing the same market toward increasingly autonomous execution.
The labour impact is unsettled. Junior developers may gain faster access to complex codebases, but they may also lose the apprenticeship path that comes from debugging, reading legacy systems and writing small features by hand. Senior engineers are likely to spend more time reviewing machine-generated work, designing architecture, setting constraints and managing risk. The job may become less about typing code and more about specifying intent, validating outputs and deciding which automated changes are safe to accept.