Build vs Buy for AI-based Security Products
When to be cautious about building agentic products in house
Should I buy an AI product from a vendor, or just build it myself in house?
I’ve been having a lot of conversations with security leaders recently about this.
Many have been underwhelmed (fairly) with a lot of what they’ve seen from security vendors so far. Most AI products they’ve tried have overpromised and underdelivered. They have access to Gemini, Claude, and LangChain and are tempted to just build something themselves, e.g. an AI SOC, AI for vuln management, AI pen testers, etc.
My advice in these scenarios is always the same: pick a narrowly defined use case and go try it out. You’ll be able to build a prototype fast thanks to the tools out there and will learn so much as you do it.
The first problem you’re likely to run into when building in house is accuracy. LLMs are non-deterministic, and when you string them together into agents, results can become unpredictable quickly. Most security use cases require both huge scale (e.g. review every vulnerability in my environment) and high accuracy (e.g. never tell me something isn’t a risk when it is). With enough effort, accuracy can be managed and agents can become extremely reliable. But getting there takes a huge investment of time in reviewing results, iterating on the agent(s), connecting new tools, and driving accuracy upwards.
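To see why chaining matters, here is a rough back-of-the-envelope sketch. It assumes every step in an agent is equally accurate and that errors are independent, which is a simplification (real agents can recover from some mistakes and compound others). The function names and the numbers are illustrative only, not measurements from any real product.

```python
def chain_accuracy(step_accuracy: float, steps: int) -> float:
    """Chance that every step in a chain is right.

    Assumption: each step succeeds independently with the same probability.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_accuracy ** steps


def expected_wrong_results(items: int, step_accuracy: float, steps: int) -> float:
    """Expected number of items with at least one bad step somewhere in the chain."""
    return items * (1.0 - chain_accuracy(step_accuracy, steps))


if __name__ == "__main__":
    # Hypothetical workload: 100,000 findings, each passing through a 5-step agent.
    for accuracy in (0.90, 0.95, 0.99):
        end_to_end = chain_accuracy(accuracy, 5)
        wrong = expected_wrong_results(100_000, accuracy, 5)
        print(f"per-step {accuracy:.0%}: end-to-end {end_to_end:.1%}, ~{wrong:,.0f} wrong")
```

Under these assumptions, a step that is right 95% of the time, chained ten times, is right end to end only about 60% of the time. At security scale, even a 99% step leaves thousands of results that somebody has to catch.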
The less obvious problem people run into when building in house is inference costs. When we ran the first prototype of Maze for one day in a big enterprise’s cloud environment, we calculated it would cost millions of dollars per month to run the product for real. Since then we’ve optimised like crazy, keeping accuracy constant whilst engineering away the costs. Again, it’s possible to manage; it just takes a ton of time to experiment with different approaches, measure costs, and then iterate until they become reasonable. Most in house teams don’t have the time for this.
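If you want to sanity check your own prototype before it reaches production, a simple estimate along these lines is a reasonable starting point. Everything here is hypothetical: the class names, the workload figures, and the per-token prices are placeholders, not Maze’s numbers or any provider’s real pricing. Plug in what you measure from a one-day run and your provider’s current price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentWorkload:
    """Measured (or guessed) shape of the workload. All fields are assumptions."""
    items_per_day: int           # e.g. findings or alerts reviewed per day
    llm_calls_per_item: int      # agent steps, tool calls, retries
    input_tokens_per_call: int   # prompt plus context plus tool output
    output_tokens_per_call: int


@dataclass(frozen=True)
class TokenPricing:
    """Placeholder prices in USD per million tokens, not real quotes."""
    usd_per_million_input: float
    usd_per_million_output: float


def monthly_cost_usd(workload: AgentWorkload, pricing: TokenPricing, days: int = 30) -> float:
    """Estimate monthly inference spend from per-call token counts."""
    calls = workload.items_per_day * workload.llm_calls_per_item * days
    input_cost = calls * workload.input_tokens_per_call / 1_000_000 * pricing.usd_per_million_input
    output_cost = calls * workload.output_tokens_per_call / 1_000_000 * pricing.usd_per_million_output
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical enterprise-scale run; replace with your own measurements.
    workload = AgentWorkload(
        items_per_day=500_000,
        llm_calls_per_item=20,
        input_tokens_per_call=8_000,
        output_tokens_per_call=1_000,
    )
    pricing = TokenPricing(usd_per_million_input=3.0, usd_per_million_output=15.0)
    print(f"Estimated monthly cost: ${monthly_cost_usd(workload, pricing):,.0f}")
```

The estimate multiplies four terms together (items, calls per item, tokens per call, price per token), so trimming any one of them cuts the bill proportionally. That is where most of the optimisation effort tends to go.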
So when should you build in house? If you’re looking at a use case that is relatively low scale and where the margin for error is moderate, building in house can make a lot of sense. A good example here would be reviewing bug bounty reports: the scale is small enough that you can keep costs controlled and have a human in the loop to review and catch errors.
Otherwise, for security use cases that need to operate at large scale and require high accuracy, proceed with caution.
It’s not impossible to build in house, just be ready to spend a lot of time optimising accuracy and cost. Or be ready for an awkward conversation with your CFO...


