The Build vs. Buy Question Nobody Frames Correctly

The conversation usually goes like this. A vendor pitches a polished AI solution. Someone in the room says, "We could probably build this ourselves." Someone else asks what it would cost. Someone else mentions control and customization. Within ten minutes, the meeting has turned into a debate about engineering capacity and licensing models, and nobody has touched the question that actually matters.

[Figure: a visual inversion in which the commodity is lovingly hand-crafted and the differentiator is assembled like fast food. This is the mistake the article describes: building what you should buy, and buying what you should build.]

That question isn't whether you should build or buy. It's whether the problem in front of you is generic or proprietary. Most build-vs-buy decisions go wrong because the framing is wrong from the start.

The Wrong Question

Cost and control sound like reasonable axes. They aren't, at least not as the primary frame.

Cost is misleading because it pretends the build option has a stable price tag. It doesn't. The true cost of the build path is the engineering opportunity cost over a multi-year horizon, plus the ongoing maintenance burden, plus the risk of organizational dependency on whoever wrote the code. Most internal estimates only count the first six weeks, which is why so many "we'll just build it" projects end up costing five times more than the buy option they rejected.
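To make that arithmetic concrete, here is a rough sketch in Python. Every name and number in it (team size, loaded weekly cost, maintenance load, licence fee) is a hypothetical placeholder rather than data from a real project, and opportunity cost and dependency risk are deliberately left out because they are hard to put a number on. The only point is that the initial estimate is a small slice of the multi-year total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildEstimate:
    """Illustrative inputs for a multi-year build cost (all values are assumptions)."""
    engineers: int                               # people on the initial build
    build_weeks: int                             # weeks until the first release
    loaded_weekly_cost: float                    # fully loaded cost per engineer-week
    maintenance_engineer_weeks_per_year: float   # ongoing upkeep after release
    horizon_years: int                           # how long you expect to own the system

    def initial_cost(self) -> float:
        # The number most internal estimates stop at.
        return self.engineers * self.build_weeks * self.loaded_weekly_cost

    def maintenance_cost(self) -> float:
        # Upkeep accumulated over the whole ownership horizon.
        return (self.maintenance_engineer_weeks_per_year
                * self.loaded_weekly_cost
                * self.horizon_years)

    def total_cost(self) -> float:
        return self.initial_cost() + self.maintenance_cost()


def buy_cost(annual_licence: float, horizon_years: int) -> float:
    """Licence spend over the same horizon (integration effort not modelled)."""
    return annual_licence * horizon_years


# Placeholder scenario: three engineers, six weeks, three years of ownership.
estimate = BuildEstimate(
    engineers=3,
    build_weeks=6,
    loaded_weekly_cost=4000.0,
    maintenance_engineer_weeks_per_year=26,
    horizon_years=3,
)
```

With these placeholder inputs, `estimate.initial_cost()` is 72,000 while `estimate.total_cost()` is 384,000, and that is before any opportunity cost is counted. Comparing the second figure, not the first, against `buy_cost(...)` over the same horizon is the fairer comparison.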

Control is misleading because organizations rarely need it for the things they think they do. Control matters where your edge lives. Everywhere else, control is just a tax — you're spending engineering attention defending an internal version of something you could have rented for less than the cost of one engineer's quarter.

When cost and control are the dominant frame, organizations consistently buy what they should have built and build what they should have bought. Both mistakes are expensive in different ways.

The Right Question

The cleaner frame is this: is the problem you're solving generic or proprietary?

A generic problem is one that looks broadly the same across organizations in your space. Email parsing, document classification, common transcription, off-the-shelf NLP, standard recommendation flows. These are problems where your competitors have the same shape of need, the same shape of data, and the same shape of acceptable solution. Your edge doesn't come from solving them better. Your edge comes from somewhere else, and these systems just need to work reliably and cheaply.

A proprietary problem is one that depends on data only you have, judgments only your organization makes, or workflows that don't translate to anyone else's environment. The shape of the answer depends on the shape of your business. A vendor walking in cold would need months just to understand the context, and even then they'd be implementing your team's expertise rather than bringing their own.

Generic problems should almost always be bought. Proprietary problems usually have to be built, at least the parts where the proprietary nature actually lives.

Three Tests for "Generic"

When I'm sitting with a team trying to make this call, I usually run three tests.

Could a vendor sell this same product to ten of your competitors? If the answer is yes, you're looking at a generic problem. The vendor's investment in the product is amortized across that whole market, which is why their version will almost always be better and cheaper than what you can build internally for your single use case.

Is the data largely standardized across your industry? Standard formats, standard schemas, standard meanings. If the data isn't proprietary in a meaningful way, the system processing it usually shouldn't be either. You're not getting an advantage from a custom build; you're just paying to maintain something a third party already maintains.

Does your competitive edge come from the solution itself, or from how it interacts with your proprietary data and processes? This is the test that matters most. If the edge is in the solution, you have to own it. If the edge is in everything around the solution, the solution is a commodity input, and commodity inputs should be bought.

If the answers point to generic, buy. The discipline is to actually buy, not to talk yourself into building because your team finds the problem interesting.

Three Tests for "Proprietary"

The same three tests, run from the other direction.

Does the problem depend on data only you have? If the system's value comes from working with your specific corpus, your specific operational logs, your specific customer interactions, then no vendor can match what an internal team can build. The data is the moat, and the system has to be designed around how the moat actually behaves.

Does the answer depend on workflows, definitions, or judgments specific to your organization? A claim severity classifier that uses your underwriters' judgment patterns isn't the same product as a generic claims classifier. Neither is a recommendation system tuned to your pricing logic, your inventory dynamics, and your specific customer behaviour. These systems encode institutional knowledge. Vendors can't sell that to you because you're the only one who has it.

Would a vendor solving this need three months of deep customization just to understand the situation? If yes, you're not really buying a product. You're paying premium consulting rates to teach a vendor how your business works, and then renting back the system you helped them build. That's a worse deal than building it yourself with a team that actually retains the knowledge.

If the answers point to proprietary, build. And invest properly. Proprietary AI systems built half-heartedly are the most common source of the failed projects we get called in to rescue.
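The two sets of tests can be written down as a simple checklist. The sketch below is one possible encoding, not a formal scoring method: the field names, the equal weighting of the tests, and the "mixed" fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemProfile:
    """Yes/no answers to the six tests (field names are hypothetical)."""
    # Tests for "generic"
    sellable_to_competitors: bool        # could a vendor sell this to ten competitors?
    data_is_standardized: bool           # standard formats, schemas, meanings?
    edge_is_outside_solution: bool       # edge lives around the solution, not in it?
    # Tests for "proprietary"
    depends_on_exclusive_data: bool      # value comes from data only you have?
    encodes_org_specific_judgment: bool  # workflows or judgments unique to you?
    vendor_needs_deep_customization: bool  # months of onboarding before a vendor is useful?


def recommend(p: ProblemProfile) -> str:
    """Return 'buy', 'build', or 'mixed' from a simple count of yes answers.

    Equal weighting is an assumption; in practice the question of where the
    edge lives may deserve more weight than the others.
    """
    generic = sum((p.sellable_to_competitors,
                   p.data_is_standardized,
                   p.edge_is_outside_solution))
    proprietary = sum((p.depends_on_exclusive_data,
                       p.encodes_org_specific_judgment,
                       p.vendor_needs_deep_customization))
    if generic > proprietary:
        return "buy"
    if proprietary > generic:
        return "build"
    return "mixed"  # no clear signal: break the problem into smaller parts


# A generic document classifier versus a claim severity classifier
# built on your underwriters' judgment patterns.
doc_classifier = ProblemProfile(True, True, True, False, False, False)
claim_severity = ProblemProfile(False, False, False, True, True, True)
```

Here `recommend(doc_classifier)` returns `"buy"` and `recommend(claim_severity)` returns `"build"`. The value of writing it down is less the function than the forcing effect: the team has to answer each question explicitly before arguing about cost.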

Where Organizations Get It Wrong

The two failure modes look different but stem from the same framing error.

The first is buying what they should have built. A mid-market firm rolls out an off-the-shelf customer intelligence platform because it was faster and cheaper than building one. Two years later, the platform's outputs don't align with how their sales team actually qualifies opportunities, the vendor won't customize the model logic without an enterprise contract, and the organization quietly stops using the system. They've spent the equivalent of a small engineering team's salary on a product that gave them no advantage.

The second is building what they should have bought. An internal team spends a year building a generic document classification pipeline because nobody wanted to commit to a vendor. The system works adequately. It also ties up three engineers for a full quarter every year in maintenance, while a commercial product that does the same job better is available for a fraction of that loaded cost. The team has built infrastructure that doesn't differentiate the business and now owns it forever.

Both organizations did the work. Both spent the money. Neither got an advantage. The framing produced the wrong answer in both directions.

The Hybrid Case

Most real AI problems aren't purely generic or purely proprietary. They're a stack, and different layers of that stack have different answers.

A retrieval-augmented generation system over your proprietary documents is a classic example. The vector database is generic infrastructure, so buy it. The embedding model is largely generic, so use a strong off-the-shelf one. The retrieval logic and the way you chunk, score, and rerank against your specific corpus is proprietary, so build it. The orchestration and the production wrapper might be either, depending on your scale.

The pattern that works is to be ruthlessly precise about which layer is which. Buy the infrastructure. Build the IP layer. Don't conflate them, and don't let your engineering team's enthusiasm push you into building infrastructure or your CFO's caution push you into buying away your differentiator.
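For the retrieval example, the seam between the bought layers and the built layer might look like the sketch below. It assumes the bought vector database has already returned candidates with similarity scores; the `Candidate` fields, the document types, and the boost and penalty weights are hypothetical stand-ins for whatever your own corpus and domain rules would require.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    similarity: float  # score from the bought retrieval layer
    doc_type: str      # metadata from your own corpus
    age_days: int      # how old the document is


def domain_score(c: Candidate) -> float:
    """Hypothetical in-house scoring rule; the weights are placeholders."""
    type_boost = {"policy": 0.2, "contract": 0.1}.get(c.doc_type, 0.0)
    staleness_penalty = 0.1 if c.age_days > 365 else 0.0
    return c.similarity + type_boost - staleness_penalty


def rerank(
    candidates: Sequence[Candidate],
    score: Callable[[Candidate], float] = domain_score,
    top_k: int = 3,
) -> List[Candidate]:
    """Reorder vendor results by the in-house score and keep the best top_k."""
    return sorted(candidates, key=score, reverse=True)[:top_k]


# Candidates as they might come back from the bought layers.
retrieved = [
    Candidate("a", similarity=0.80, doc_type="memo", age_days=30),
    Candidate("b", similarity=0.70, doc_type="policy", age_days=10),
    Candidate("c", similarity=0.85, doc_type="contract", age_days=400),
]
ranked_ids = [c.doc_id for c in rerank(retrieved)]
```

With these placeholder weights the order changes from the vendor's similarity ranking (c, a, b) to b, c, a. Everything upstream of `rerank` can be swapped for another vendor without touching the part that encodes what your organization knows about its own documents.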

A Better Boardroom Question

The next time the build-vs-buy conversation starts, replace it with a sharper one. "What's our edge here, and where exactly does it live in the stack?"

That question forces the strategic thinking that the cost-and-control frame avoids. It separates the parts of the system where ownership matters from the parts where it doesn't. It produces a clear architecture rather than a compromised middle position. And it tends to expose, very quickly, the cases where the team can't actually articulate where their edge is. That itself is useful information, because you shouldn't be building or buying anything until you can.

Build versus buy isn't an architectural argument. It's a strategic one. The teams that get this right are the ones that know which parts of their stack are differentiators and which parts are infrastructure, and who treat those two categories with the discipline they each deserve.
