Video: How Apiiro Uses LLMs to Detect Risks at the Design Stage

At the recent PyData Meetup in Tel Aviv, Apiiro Senior Data Scientist Arnon Dagan shared how Apiiro is using open-source large language models (LLMs) to automate risk detection at the earliest possible stage—before a single line of code is written.

Arnon explained the cost of late-stage risk detection: the later a security issue is found (especially in production), the more expensive and disruptive it is to fix. By identifying risks at the design stage, organizations strengthen security and cut remediation costs.

The Challenge: Identifying Risks Before Code Is Written

Threat modeling is crucial for catching security risks early, but it’s a slow, manual process that requires deep security expertise. As development speeds up, threat modeling becomes a bottleneck as security teams struggle to keep up—leading to more vulnerabilities going undetected until later stages, when they are harder and more expensive to fix.

How Apiiro Uses LLMs for Risk Detection

Apiiro’s AI-driven approach automates risk detection using two open-source LLMs, working in the two-stage flow sketched below:

  1. Risk classification: A fine-tuned LLM (Microsoft’s Phi) analyzes design documents, including Jira tickets, to classify potential security risks.
  2. Contextual analysis and threat modeling: If a risk is detected, a second LLM (Microsoft’s Orca) categorizes the issue, generates summaries, and suggests mitigation strategies, automating structured security insights.
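
As a rough illustration of that two-stage flow, the sketch below chains a classification prompt and an analysis prompt with the Hugging Face transformers pipeline API. The model IDs, prompts, and dispatch logic are illustrative assumptions, not Apiiro’s production setup, which also constrains outputs to a schema (see the deployment section below).

```python
from transformers import pipeline

# Illustrative stand-ins for Apiiro's fine-tuned checkpoints (an assumption,
# not the actual production models).
risk_classifier = pipeline("text-generation", model="microsoft/phi-2")
threat_analyzer = pipeline("text-generation", model="microsoft/Orca-2-7b")

def assess_design_doc(ticket_text: str) -> dict:
    # Stage 1: does the design change described in the ticket look risky?
    verdict = risk_classifier(
        "Does the following design change introduce a security risk? "
        f"Answer RISK or NO_RISK.\n\n{ticket_text}",
        max_new_tokens=5,
        return_full_text=False,
    )[0]["generated_text"]
    if "NO_RISK" in verdict:
        return {"risk": False}

    # Stage 2: categorize the risk, summarize it, and suggest mitigations.
    analysis = threat_analyzer(
        "Categorize the security risk in this design change, summarize it, "
        f"and suggest mitigations.\n\n{ticket_text}",
        max_new_tokens=256,
        return_full_text=False,
    )[0]["generated_text"]
    return {"risk": True, "analysis": analysis}
```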

With this system, security teams can catch potential risks before the code that introduces them is even written—eliminating costly fixes later.

Overcoming LLM Deployment Challenges

Deploying LLMs for structured security analysis presents unique challenges, from ensuring reliable outputs to optimizing large-scale performance. To overcome these, Apiiro implemented the following:

Schema-based output validation: Using structured-generation frameworks like Outlines to constrain model output to a predefined schema, keeping LLM-generated results consistent and machine-parseable (sketched below).
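
As a rough sketch of what schema-constrained generation can look like with Outlines (using the 0.x-style API; the model ID and schema fields are illustrative assumptions, not Apiiro’s actual schema):

```python
import outlines
from enum import Enum
from pydantic import BaseModel

class RiskCategory(str, Enum):
    injection = "injection"
    authentication = "authentication"
    data_exposure = "data_exposure"
    other = "other"

class RiskFinding(BaseModel):
    is_risk: bool
    category: RiskCategory
    summary: str
    suggested_mitigation: str

# Constrain generation to the RiskFinding JSON schema, so every response parses
# into the same structure instead of arriving as free-form text.
model = outlines.models.transformers("microsoft/Orca-2-7b")
generator = outlines.generate.json(model, RiskFinding)

finding = generator(
    "Analyze this design ticket for security risks and return a structured finding:\n"
    "Add a public endpoint that accepts user-uploaded ZIP files and extracts them."
)
print(finding.category, finding.suggested_mitigation)
```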

Optimized GPU utilization: Leveraging VM-based architectures to improve inference efficiency.

Distributed model serving: Implementing Ray Serve to manage large-scale LLM inference, with Kubernetes integration for cluster deployment (sketched below).
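
A minimal Ray Serve sketch of this pattern might look like the following; the replica count, GPU setting, and stubbed keyword check are placeholders standing in for the real model loading and inference:

```python
from ray import serve
from starlette.requests import Request

# Each replica reserves a GPU and would hold one copy of the model; Ray Serve
# handles request routing and scaling, and runs on Kubernetes via KubeRay.
@serve.deployment(num_replicas=2, ray_actor_options={"num_gpus": 1})
class RiskDetector:
    def __init__(self):
        # Placeholder for loading the fine-tuned LLM once per replica;
        # a trivial keyword check stands in for real inference below.
        self.keywords = ("upload", "external api", "authentication", "pii")

    async def __call__(self, request: Request) -> dict:
        ticket = (await request.json())["ticket"]
        flagged = any(k in ticket.lower() for k in self.keywords)
        return {"risk_detected": flagged}

app = RiskDetector.bind()
# serve.run(app)  # starts the HTTP endpoint on a running Ray cluster
```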

Advanced evaluation techniques: Utilizing LLM-based evaluation methods (e.g., Microsoft’s G-Eval) to assess accuracy and consistency.
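
As a simplified illustration of LLM-as-judge evaluation in the spirit of G-Eval (the judge model, criteria, and scoring scale here are assumptions rather than the exact setup from the talk):

```python
from transformers import pipeline

# Judge model ID is an illustrative choice, not necessarily what Apiiro uses.
judge = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct")

EVAL_PROMPT = """You are evaluating a generated security risk summary.
Criteria: factual consistency with the ticket and completeness of the mitigation advice.
Think step by step, then end with "Score: N" where N is an integer from 1 (poor) to 5 (excellent).

Ticket: {ticket}
Summary: {summary}
Evaluation:"""

def score_summary(ticket: str, summary: str) -> str:
    # Returns the judge's reasoning and final score as raw text.
    out = judge(
        EVAL_PROMPT.format(ticket=ticket, summary=summary),
        max_new_tokens=200,
        return_full_text=False,
    )
    return out[0]["generated_text"].strip()
```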

Scaling Security with LLMs

Apiiro’s LLM-powered risk detection is already live, analyzing design documents and identifying risks in real time. By automating key aspects of threat modeling, Apiiro enables security teams to prevent security issues before they even exist, without slowing down development.

Interested in the full 20-minute talk? Watch the video here.