Will AI replace application security testing? Since Anthropic announced Claude Mythos in April, I’ve been having the same conversation almost every day. With customers, with prospects, with our own field teams. The model that was “too dangerous to release,” restricted to a small circle of organizations through Project Glasswing, triggered a wave of questions that go right to the heart of our industry.
In April I wrote about how to frame AppSec in the AI era (“AI Didn’t Break Application Security. It Exposes the Next Evolution.”). That post laid out the landscape. This one answers the three questions I now get in nearly every conversation, and the answers come from real testing, not speculation.
Question 1: Will AI replace SAST?
Probably not soon, and definitely not by itself. Two reasons.
First, a model is not a security solution. An LLM is a building block, more like a database engine than an application. To turn raw model intelligence into something an AppSec team can rely on, you need a lot around it: a harness that guarantees the entire codebase actually gets analyzed (coding agents are surprisingly bad at this on their own), repeatable results, findings reported in a structured format rather than scraped from chat output, deduplication against what your existing tools already found, and a platform for portfolio management, reporting, and CI/CD integration. Ask a coding agent to “find security issues in my project” and it will find some. It will not give you a program.
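To make two of those harness pieces concrete, here is a minimal sketch of structured findings plus deduplication against what existing tools already reported. The `Finding` fields, the `new_findings` helper, and the matching rule (same weakness, same file, same line) are illustrative assumptions, not the format or logic of any particular product; real deduplication usually needs fuzzier matching than an exact line number.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Finding:
    """One structured finding. All field names here are illustrative."""
    rule: str    # weakness identifier, e.g. "CWE-79"
    file: str    # path relative to the repository root
    line: int    # line number where the issue is reported
    source: str  # which tool reported it, e.g. "sast" or "agent"

    def key(self) -> Tuple[str, str, int]:
        # Identity used for deduplication: same weakness at the same location.
        # Paths are normalized so "src\\App.java" and "src/app.java" match.
        return (self.rule, self.file.replace("\\", "/").lower(), self.line)


def new_findings(agent: List[Finding], existing: List[Finding]) -> List[Finding]:
    """Return agent findings that the existing tools did not already report."""
    seen = {f.key() for f in existing}
    result = []
    for finding in agent:
        if finding.key() not in seen:
            seen.add(finding.key())  # also drops repeats within the agent run
            result.append(finding)
    return result


sast = [Finding("CWE-79", "src/web/Profile.java", 42, "sast")]
agent = [
    Finding("CWE-79", "src\\web\\Profile.java", 42, "agent"),  # already known
    Finding("CWE-565", "src/auth/Login.java", 17, "agent"),    # new
    Finding("CWE-565", "src/auth/Login.java", 17, "agent"),    # repeat
]
only_new = new_findings(agent, sast)
```

Here `only_new` holds just the one `CWE-565` finding. The point is that the output is data a platform can store, compare, and report on, rather than prose in a chat window.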
Second, economics. Token prices are rising as providers correct years of below-cost pricing. In our own testing of agentic code analysis, a scan of a realistic business application costs hundreds of dollars in tokens on a premium model. Mythos pricing is published at $25 per million input tokens and $125 per million output tokens, five times the price of Claude Opus, so the same scan would clear $1,000. That’s fine for an occasional deep review; you would pay more for a pen test. It is not viable on every commit, across every application, every day.
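The arithmetic is easy to check. The sketch below uses the published Mythos prices quoted above; the token volumes, commit count, and application count are hypothetical numbers picked only to show the shape of the calculation, not measurements from our testing.

```python
# Published Mythos prices quoted above, in dollars per million tokens.
INPUT_PRICE_PER_M = 25.0
OUTPUT_PRICE_PER_M = 125.0


def scan_cost(input_tokens: int, output_tokens: int) -> float:
    """Token cost in dollars for one scan at the prices above."""
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    )


# Hypothetical volumes for one agentic scan of a business application.
per_scan = scan_cost(input_tokens=30_000_000, output_tokens=2_000_000)

# Hypothetical portfolio: scanning on every commit, across every application.
commits_per_day = 10
applications = 50
per_day = per_scan * commits_per_day * applications

print(f"per scan: ${per_scan:,.0f}")  # per scan: $1,000
print(f"per day:  ${per_day:,.0f}")   # per day:  $500,000
```

Under those assumptions a single scan is affordable and a per-commit policy is not, which is the whole argument in two lines of multiplication.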
So the realistic pattern looks like this: traditional SAST remains the daily first line of defense because it is fast, deterministic, and cheap enough to run constantly. AI-powered agentic analysis becomes the periodic deep sweep, run weekly or monthly, much the way pen testing complements DAST today. We call this hybrid analysis, and we think it’s where the whole market lands.
The interesting part is what agentic analysis finds that SAST cannot. In our testing, two examples stand out. One is architectural flaws, like an authentication design that stores the username in an unsigned cookie with no server-side session. Any attacker can forge that cookie. No dataflow rule catches it, because the problem isn’t a flow, it’s the design. (I once found exactly this flaw in a real banking application during a pen test. It happens. Admittedly, a long time ago.) The other is persistent cross-site scripting, which involves two separate data flows, user input into a database, then database content out to a page. Traditional SAST handles reflected XSS well because it’s a single flow; the two-flow version is where AI’s broader code comprehension genuinely adds value.
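The cookie flaw is easy to see in a few lines. The sketch below is a hypothetical illustration of the design problem, not code from any application we tested: the function names and the secret are made up, and HMAC signing is shown as one possible mitigation (a server-side session is another).

```python
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b"server-side-secret"  # hypothetical; load from a secret store in practice


def read_user_unsigned(cookie: str) -> str:
    # Flawed design: the server trusts whatever username the client sends.
    return cookie


def sign(username: str) -> str:
    # Issue a cookie of the form "<username>.<hex HMAC-SHA256 tag>".
    tag = hmac.new(SECRET_KEY, username.encode(), hashlib.sha256).hexdigest()
    return f"{username}.{tag}"


def read_user_signed(cookie: str) -> Optional[str]:
    # Accept the username only if the tag matches what the server would issue.
    username, sep, tag = cookie.rpartition(".")
    if not sep:
        return None  # no tag at all
    expected = hmac.new(SECRET_KEY, username.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks.
    if hmac.compare_digest(tag.encode(), expected.encode()):
        return username
    return None


forged = "admin"  # attacker simply types the name they want
print(read_user_unsigned(forged))           # admin
print(read_user_signed(forged))             # None
print(read_user_signed("admin.deadbeef"))   # None
print(read_user_signed(sign("alice")))      # alice
```

Notice that `read_user_unsigned` contains no tainted flow into a dangerous sink. Nothing in it looks wrong line by line, which is why spotting the flaw takes an understanding of what the code is for.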
That added value is real. It’s not hype. But it’s complementary to SAST, not a replacement for it.
Question 2: We don’t have Mythos access. Are we exposed?
Less than the headlines suggest. Here’s the finding from our work that surprised people most: deep agentic analysis does not require a Mythos-class model. In our prototype testing, broadly available models handle core detection well, with a premium model doing scoping and validation of results. Public benchmarks tell the same story: Mythos represents a gradual improvement over the best generally available models, not a fundamental leap.
What customers actually need, the ability to run deep AI-powered review of their own code inside their existing AppSec program, is achievable today with models anyone can access. That’s exactly what we’re building into the Fortify portfolio: an agentic analysis capability, currently in development with availability planned for the second half of this year, that plugs into SSC and Fortify on Demand, reports only findings your existing scans missed, and works with a wide range of LLMs. When you do get Mythos access, you’ll be able to use it. You won’t be waiting for it.
And much of Fortify’s AI capability is shipped and in production right now. Fortify Remediation Aviator uses AI to audit SAST findings and propose fixes; we run it on OpenText’s own development organization, 7,000 developers and more than 2,000 applications, with over a million issues audited to date. We’ve been shipping detection rules for the OWASP LLM Top 10 since late 2023. And our free, open-source MCP server and Agent Skills put Fortify directly inside the coding agents where developers now live.
Question 3: What if the attackers have Mythos?
Assume they will, whether it’s Mythos itself or an equivalent capability. The realistic attack patterns are clear. Open source code is the natural target, precisely because it’s open to analysis: AI-driven hunting for zero-days at scale, and reverse-engineering freshly published security patches into working exploits. Neither technique is new. What changes is speed and scale, and that changes the game.
The defense, though, is not some exotic new technology. It’s established application security practice, executed with more discipline and urgency. That means software composition analysis with aggressive remediation of impactful findings, because that’s where AI-accelerated attacks will land first. It also means testing all code before it ships, because the other side of this coin is developers using AI to write code. Independent benchmarks like BaxBench show that even top models given explicit security instructions still produce insecure implementations a meaningful share of the time: in recent results, roughly one in five functionally correct solutions. SAST analyzes AI-generated code exactly as it analyzes human-written code. The practice is more relevant now, not less.
That’s the conclusion I keep coming back to. The current AI threat landscape doesn’t make application security obsolete. It makes it more important than it has ever been. AI creates real new AppSec risk and real new AppSec capability at the same time, and the organizations that win will be the ones that build both into a single program: the efficiency and determinism of traditional analysis, combined with the depth of agentic methods, on one platform.
If you’re wrestling with these questions in your own organization, talk to us. I’m having these conversations every day, and I’m happy to have one with you.