Cybersecurity: AI-orchestrated cyberattack

Lately I’ve been sitting with a kind of quiet, electric worry, the kind that hums in the background when something big is shifting beneath our feet.

A recent post from Anthropic, the makers of the artificial intelligence (AI) chatbot Claude, described an unsettling milestone: an AI system, manipulated by a threat actor, that attempted to infiltrate multiple global targets with minimal human direction. It reads like a sentence from a speculative novel, except it’s stamped with a date from last week.

I am not an AI engineer. I am not a policymaker. I am just a guy who lives in the world these systems touch – bank accounts, government portals, cloud documents, the daily scaffolding of modern life. And I keep wondering: Are we ready for adversaries that don’t get tired, don’t get scared, and don’t need sleep or salary?

So I have been trying to shape my confusion into something more constructive: questions. Not the technical kind buried in whitepapers, but the kind you might ask when you’re standing at the edge of the unknown, hoping the people steering the ship have a map.

These are the five questions I wish I could sit down and ask every policymaker, CISO, and AI developer who might read this:

1. How do we even see an AI attack coming?

When automation becomes the attack vector, the signals get blurrier. What tells us we are watching normal digital activity, and what tells us something intelligent and hostile is weaving through the pipes? I wonder what new kinds of “eyes” we will need to distinguish benign from malicious motion in systems that move faster than we can blink.

2. Can we truly harden AI systems against manipulation?

If an AI model can be nudged or tricked into doing harm, where does responsibility sit? In training data? Guardrails? Governance? Sometimes it feels like we are still trying to childproof a machine that is smarter than the room it’s in.

3. What does a safe sandbox look like in the age of agentic systems?

I imagine the digital equivalent of padded walls and locked doors, places where AI tools can do good work without having free rein over critical infrastructure. But what does that actually look like in practice? And how do we prevent the escape hatches we don’t even know exist?

4. Are our incident-response plans built for a world where attacks move at machine speed?

If an autonomous system can probe, test, adjust, and adapt in the time it takes a human team to read a log line, how do we keep pace? What does “rapid response” mean when time has compressed to milliseconds?

5. And finally: What shared rules do we need, right now, to keep AI from becoming the internet’s newest weapon?

Standards, protocols, norms… We have been here before with other technologies, but never with something that can strategise, optimise, and scale harm without a human touching every step. I want to know what collective guardrails we are building, and whether they’re enough.

I am not writing any of this to sow fear. I write it because I believe in naming the things that matter while we still have the chance to shape them.

If you’re someone working in policy, security, or AI development, maybe you are carrying these same questions – maybe you even have answers. I hope you’ll share them openly. Some of us out here are listening closely, trying to understand the terrain of the future we are all walking into.

Because the threat isn’t just the technology. It’s the silence between the people building it and the rest of us who rely on them.

And I think it’s time we started talking, seriously. I wrote an article about the need for policing AI models two years ago, but I fear policymakers are not paying enough attention to what is going on in the AI tech world.

How to police AI models for transparency and accountability


BBC News: Anthropic said hackers tricked the chatbot into carrying out automated tasks under the guise of conducting cyber-security research.

Last modified: November 15, 2025

