Incorporating AI agents and multi-modal analysis into SAST significantly reduces the false positives that plague traditional and rules-based SAST tools.
About a year ago, I authored an article titled “Selecting the Optimal SAST Tool,” which explored the advantages and disadvantages of two distinct eras of static application security testing (SAST):
- First-generation SAST (Traditional): Offers extensive coverage through thorough scans, yet introduces significant bottlenecks due to prolonged execution times.
- Second-generation SAST (Rules-based): Emphasized developer efficiency with quicker, adaptable rules, but its scope was confined to explicitly defined parameters.
Back then, these were essentially the sole choices available. Frankly, neither approach was particularly effective. Fundamentally, both generations were designed to flag code vulnerabilities largely resolved by other advancements (e.g., compiler and framework improvements eradicated entire categories of CWEs). These tools have failed to keep pace with contemporary application development. They primarily depend on syntactic pattern matching, occasionally augmented by intraprocedural taint analysis. However, modern applications are far more intricate, frequently leveraging middleware, frameworks, and infrastructure to mitigate security risks.
Consequently, even as the burden of addressing weaknesses shifted to other layers of the stack (thanks to innovations in memory safety, frameworks, and infrastructure), SAST tools continue to generate an abundance of false positives (FPs) at the granular code level. Regardless of whether you employ first or second-generation SAST, between 68% and 78% of reported findings turn out to be FPs. This necessitates extensive manual review by security teams. Compounding the issue, current code vulnerabilities are more often rooted in logic errors, misuse of valid features, and contextual misconfigurations. Regrettably, regex-based SAST tools cannot adequately comprehend these types of problems. Thus, in addition to FPs, you also encounter a high incidence of false negatives (FNs). Moreover, with the increasing adoption of AI code assistants, we anticipate a rise in logic and architectural flaws that traditional SAST tools are ill-equipped to detect.
Is AI Capable of Resolving the SAST Challenge?
As the cybersecurity field began leveraging AI to tackle previously intractable issues, a compelling question frequently arose: Can AI contribute to developing a truly effective SAST?
Indeed, it can. This gave rise to the third generation of SAST:
- Third-generation SAST (AI SAST): Employs AI agents and a multi-modal analytical approach to pinpoint business logic vulnerabilities and dramatically minimize false positives.
Let’s be precise! A high-caliber AI SAST must transcend being merely a first or second-generation tool augmented with a ChatGPT interface. For optimal performance, the tool requires an understanding of your code and architectural context. However, resist the urge to feed your entire code repository into a large language model (LLM). This would consume an excessive number of tokens and rapidly become financially unsustainable at an enterprise level.
When assessing AI SAST offerings, I recommend seeking out a multi-modal analysis strategy that integrates rules, dataflow analysis, and LLM-driven reasoning. This comprehensive approach mirrors the manual workflow employed by security professionals: interpreting code, tracking data flow, and evaluating business logic.
Syntax-Based Rule Enforcement
Rules are dead, long live rules!
Fixed-pattern examinations (through rules) remain an effective method for identifying specific vulnerabilities with minimal runtime overhead. To paraphrase a cybersecurity adage, an effective AI SAST will adopt a layered defense strategy, utilizing rules to detect clear security flaws, with AI deployed subsequently in the analysis pipeline. For instance, a rule can swiftly pinpoint the deployment of an obsolete encryption algorithm or the omission of input validation on a crucial API endpoint.
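To make the rules layer concrete, here is a minimal sketch of a syntax-level rule that flags a deprecated hash algorithm. The rule pattern and the finding fields are illustrative assumptions, not any vendor's actual rule format:

```python
import re

# Illustrative fixed-pattern rule: flag weak hash algorithms in Python code.
# The rule name and finding structure are hypothetical, for demonstration only.
WEAK_HASH_RULE = re.compile(r"hashlib\.(md5|sha1)\s*\(")

def scan_source(path: str, text: str) -> list[dict]:
    """Return one finding per line that matches the weak-hash pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if WEAK_HASH_RULE.search(line):
            findings.append({
                "file": path,
                "line": lineno,
                "rule": "weak-hash-algorithm",
                "message": "Deprecated hash algorithm in use; prefer SHA-256.",
            })
    return findings

sample = "import hashlib\ndigest = hashlib.md5(data).hexdigest()\n"
print(scan_source("app.py", sample))  # one finding, on line 2
```

Because a check like this is a single regex pass, it adds almost no runtime overhead, which is why rules remain the cheapest first layer before any AI is involved.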
As you examine AI SAST solutions, investigate the origin of their rules:
- Are they simply general-purpose linters, or is a dedicated research group refining them for precision?
- Are these rules validated against actual codebases?
- Do the identified issues come with comprehensive context and suggestions for resolution?
- Does the system allow you to incorporate natural language rules? (This is crucial, as rule writing can be quite tedious.)
Each of these aspects can significantly enhance AI-driven triage at scale by minimizing the token consumption required to analyze a codebase.
Analyzing Data Flow
Imagine a rule flags a vulnerable encryption function used at two different points in the code. Detection alone doesn't confirm either finding as a genuine threat, and this is precisely where dataflow analysis becomes invaluable. The AI SAST follows data across files and functions, performing taint analysis that tracks inputs from their sources to their sinks. The objective of this phase is to eliminate or deprioritize findings that are not practically exploitable. (This is somewhat akin to reachability analysis in software composition analysis, or SCA.) While AI is capable of performing this work, it's also advantageous for the tool to have a non-AI program analysis engine to accelerate the process.
During your assessment of AI SAST solutions for their dataflow analysis capabilities, inquire about the following:
- How is the analysis conducted? Is it powered by AI, conventional program analysis, or a combination of both?
- Is the tool capable of performing analysis across multiple files and functions?
- What substantiation is offered to confirm the exploitability of the code?
- What proportion of false positives can this analysis identify?
Expect the tool to illustrate the attack path an adversary could exploit within your application's environment, turning theoretical problems into practical insights. Dataflow analysis is also a strong application for AI agents, so their involvement at this stage is to be expected.
LLM-Powered Reasoning
Not long ago, a blend of rules and dataflow analysis might have seemed sufficient. However, that combination still produces false positives (FPs) because the tool flags potential vulnerabilities without understanding any compensating controls already in place. Frequently, the limitation is a SAST tool's inability to analyze across files, and adding more rules can be counterproductive: more patterns mean more detected issues, but without adequate context, many of those findings will be low quality. Furthermore, these legacy tools are inherently incapable of identifying intricate logic flaws.
This is precisely where AI SAST truly shines, offering greater value by indicating the criticality of a finding. Through AI-driven triage, the tool can evaluate discoveries within the complete codebase and any pertinent metadata, mirroring a human security specialist’s approach, to establish definitive conclusions and priority levels. This concluding triage phase can uncover logic errors, remove further FPs, or potentially lower the severity of issues based on specific runtime setups, inter-component dependencies, or subtle business logic intricacies.
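The triage step described above might be sketched as follows. The prompt shape and the injected `ask_llm` callable are assumptions for illustration, not any vendor's API; the point is that each finding is paired with cross-file context before the model is asked to confirm or downgrade it:

```python
# Hypothetical sketch of AI-driven triage: each rule/dataflow finding is
# bundled with relevant code context, then handed to an LLM for a verdict.

def build_triage_prompt(finding: dict, context_files: dict[str, str]) -> str:
    """Assemble a triage prompt from one finding plus surrounding code."""
    parts = [
        "You are a security reviewer. Decide if this finding is exploitable.",
        f"Finding: {finding['rule']} at {finding['file']}:{finding['line']}",
        f"Detail: {finding['message']}",
        "Relevant code:",
    ]
    for path, snippet in context_files.items():
        parts.append(f"--- {path} ---\n{snippet}")
    parts.append("Answer with a severity (none/low/medium/high) and one reason.")
    return "\n".join(parts)

def triage(finding: dict, context_files: dict[str, str], ask_llm) -> str:
    # `ask_llm` is injected so any model backend (or a stub) can be plugged in.
    return ask_llm(build_triage_prompt(finding, context_files))
```

Injecting the model as a parameter is deliberate: it keeps the triage logic testable with a stub and matches the bring-your-own-key deployment option discussed later.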
Key inquiries to pose to an AI SAST provider include:
- Does the tool possess an understanding across multiple files?
- What types of files or documentation are provided to the tool for analysis?
- Is the tool capable of identifying sophisticated logic vulnerabilities?
- Can the tool improve its performance based on feedback from engineers?
Vendor’s Data Management Practices
Ultimately, prior to committing to an AI SAST solution, it is imperative to thoroughly comprehend the vendor’s data handling policies. Inquire about:
- What is the scope of the analysis?
- Will my source code be stored?
- Is my information utilized for model training purposes?
- What features can I decline, and how might that affect the solution’s precision?
You might consider a 'Bring Your Own LLM' (BYO LLM) approach, which seems like a straightforward solution. However, operating your own LLM demands substantial infrastructure, a task that is neither simple nor cost-effective. A feasible middle ground is supplying your own API key, perhaps as simply as `AZURE_OPENAI_API_KEY=your_azure_openai_api_key`.
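A bring-your-own-key setup usually reduces to a configuration check like the sketch below. The variable name follows the `AZURE_OPENAI_API_KEY` example above; the exact name and failure behavior are deployment-specific assumptions:

```python
import os

# Sketch of a bring-your-own-key check: read the customer-supplied key from
# the environment and fail fast if the AI stages cannot run without it.
def load_llm_key() -> str:
    key = os.environ.get("AZURE_OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "AZURE_OPENAI_API_KEY is not set; the AI triage stage cannot run."
        )
    return key
```

Failing fast at startup, rather than midway through a scan, is the usual design choice here: a half-triaged report is worse than an immediate, explicit configuration error.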
Is AI SAST the Right Choice for Your Needs?
If SAST has evolved into a burdensome compliance item within your organization, causing frustration among both developers and security engineers, then it’s certainly worth investigating if an AI SAST solution aligns with your requirements. As AI coding tools advance, we envision a future where design, architectural, and logical vulnerabilities represent the primary remaining security concerns. Eventually (and possibly sooner than anticipated), your current first or second-generation SAST might cease to identify the inherent risks in your codebase. AI SAST could effectively equip you for this evolving landscape.
Below is a concise comparison table outlining the advantages and disadvantages of each option.
| | Conventional SAST (Gen 1) | Rule-Driven SAST (Gen 2) | AI-Powered SAST (Gen 3) |
|---|---|---|---|
| Summary | Slow but thorough | Rapid but prone to noise | Swift and accurate |
| Advantages | Achieves maximum possible coverage | Swift, integrates with CI/CD<br>Highly adaptable with custom rules<br>Developer-centric, smooth workflow integration | Identifies intricate logic defects (minimal FNs)<br>Comprehends code context (few FPs)<br>Agents can potentially learn crucial insights |
| Disadvantages | Slow, late in the SDLC, unsuitable for agile methods<br>Restricted customization capabilities<br>Comprehensive coverage often means more FPs<br>Unable to identify advanced business logic flaws (FNs)<br>Necessitates additional tools or procedures | Relies on rules, may need specialized knowledge<br>Demands verification that rules suit specific scenarios (e.g., language compatibility)<br>Speed often traded for higher FNs and FPs<br>Incapable of spotting complex business logic defects (FNs) | Requires comfort with an LLM accessing proprietary code |
---
The Tech Innovation Hub offers a platform for prominent figures in technology—including suppliers and external experts—to delve into and debate new enterprise technologies with unparalleled depth and scope. The chosen topics are subjective, reflecting our assessment of technologies deemed significant and most appealing to InfoWorld’s audience. InfoWorld does not publish promotional materials and retains the authority to revise any submitted contributions. For all questions, please reach out to doug_dineley@foundryco.com.