Modern backends aren't C or legacy Java. They're FastAPI/Flask/Django and Express/NestJS/Next.js. Yet we still judge detection tools with sink-centric, synthetic benchmarks that ignore framework semantics. We built the Unsafe Code Detection Benchmark, a reproducible way to score both SAST and LLMs on intentionally vulnerable, minimal micro-apps across today's web frameworks.
Our benchmark couples an open corpus with a single harness, unified ground truth, and a failure-mode taxonomy mapped to CWE/OWASP. It measures precision/recall and cost/latency, controls for prompt/temperature variance, and includes "appears-vulnerable-but-safe" scenarios to stress false positives.
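To make the precision/recall measurement concrete, here is a minimal sketch of scoring one tool's findings against unified ground truth. It is not the benchmark's actual harness: the `Finding` type, the `score` function, the micro-app names, and the CWE labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One reported or expected issue (hypothetical schema, not the real harness)."""
    app: str  # micro-app identifier (illustrative name)
    cwe: str  # CWE label, e.g. "CWE-862"


def score(reported: set, ground_truth: set) -> dict:
    """Compute precision and recall by exact match on (app, cwe)."""
    tp = len(reported & ground_truth)   # correctly reported issues
    fp = len(reported - ground_truth)   # reported, but not in ground truth
    fn = len(ground_truth - reported)   # in ground truth, but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}


# Ground truth lists only the real vulnerabilities. The
# "appears-vulnerable-but-safe" app has no entry, so flagging it
# counts as a false positive.
truth = {
    Finding("flask-authz-order", "CWE-862"),
    Finding("express-param-pollution", "CWE-235"),
}
reported = {
    Finding("flask-authz-order", "CWE-862"),     # true positive
    Finding("flask-safe-lookalike", "CWE-89"),   # false positive on a safe app
}

result = score(reported, truth)
print(result)
```

In this sketch a safe look-alike lowers precision without affecting recall, which is the effect such scenarios are meant to expose.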
Initial results may surprise: on source-proximate issues common in modern stacks (parameter merging/pollution, middleware/decorator-order authz bypasses, subtle type coercion), state-of-the-art general-purpose LLMs outperform industry-leading SASTs in their default configuration, a gap we trace to weak framework awareness and imprecise source modeling. The twist: with simple, framework-aware custom rules, SAST surpasses LLMs, showing why deterministic, organization-specific rules remain a force multiplier. LLMs provide strong raw recall but exhibit prompt sensitivity and a tendency to conflate stylistic "best practices" with real vulnerabilities.
Attendees will leave with a practical methodology and tooling to evaluate their own SAST and LLMs on modern stacks, concrete guidance to raise real-world detection rates, and a clear path to extend and rerun the benchmark internally. We will release the benchmark specification, the harness for running selected SAST tools and LLMs, and the open-source corpus.
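As an illustration of one issue class named above, the following sketch shows how decorator order alone can produce an authorization bypass. It uses a tiny stand-in for a framework's routing layer rather than any real framework API: `ROUTES`, `route`, `login_required`, and `dispatch` are hypothetical names, and the example is not taken from the benchmark corpus.

```python
from functools import wraps

ROUTES = {}  # hypothetical stand-in for a framework's URL map


def route(path):
    """Register the callable it receives under `path` (illustrative only)."""
    def register(fn):
        ROUTES[path] = fn  # stores exactly the function passed in
        return fn
    return register


def login_required(fn):
    """Reject calls that carry no user."""
    @wraps(fn)
    def wrapper(user=None):
        if user is None:
            return "401 Unauthorized"
        return fn(user)
    return wrapper


# Safe: `route` is outermost, so it registers the auth-wrapped function.
@route("/safe")
@login_required
def safe_view(user=None):
    return "secret"


# Vulnerable: `route` runs first and registers the raw view; the auth
# wrapper is applied afterwards and is never reachable via ROUTES.
@login_required
@route("/unsafe")
def unsafe_view(user=None):
    return "secret"


def dispatch(path, user=None):
    """Call whatever was registered for `path`, as a router would."""
    return ROUTES[path](user)


print(dispatch("/safe"))    # anonymous request is rejected
print(dispatch("/unsafe"))  # anonymous request reaches the view
```

Both views contain the same decorators and no dangerous sink, so telling them apart requires knowing what registration does, which is the kind of framework awareness discussed above.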
By:
Andrew Konstantinov | Security Engineer
Irina Iarlykanova | Student
https://ift.tt/DR1Juzp
source https://www.youtube.com/watch?v=0v3pnoR8IyY