Do You Actually Have a Vector Database Problem?
Part of our AIAppliedRight series, in which we reviewed an AI-first architecture and are working towards answering the question “When does this system actually need AI?”
Blog 1 argued that most “AI” features in a content-heavy platform are really lookups, ranking, and classification — and that, in that engagement, embeddings survived the architecture review but the vector database didn’t. This post is about that second decision: when a vector database is the right tool, and when it’s simply the wrong size for the problem.
In the earlier posts we kept repeating one phrase: at this size, you don’t need a vector database. The careful part was “at this size.” That phrase wasn’t a verdict on the technology, and we weren’t rejecting embeddings. We were rejecting the assumption that embeddings automatically imply a vector database. The phrase was a statement about a number, and this post is about that number.
What a vector database is actually for
A vector database does one job very well: it finds the most similar items across a very large collection, fast.
The emphasis is on large. When you have millions of items, comparing a request against all of them on every search becomes too slow, so you need specialized infrastructure that takes clever shortcuts to stay fast at that scale. That is what a vector database is for. It is a solution to the problem of scale.
Which means the first question isn’t “should we use a vector database.” It’s “do we have a scale problem at all?”
The platform didn’t
The system we reviewed had a few thousand products and recipes. At that size, finding similar items is a trivial amount of work: simply comparing the request against every item is fast enough that a customer would never notice the difference. No specialized infrastructure required.
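To make "a trivial amount of work" concrete, here is a minimal sketch of exact (brute-force) similarity search in plain Python. It is not the platform's actual code: the function names `normalize` and `top_k`, the 3,000-item toy catalog, the 64-dimension vectors, and the random stand-in embeddings are all assumptions chosen for illustration. In practice you would likely hold the vectors in a NumPy array and score them with one matrix multiply, but the idea is the same.

```python
import heapq
import math
import random


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, catalog, k=5):
    """Exact search: score the query against every item and keep the best k.

    `catalog` maps item id -> unit-length embedding (hypothetical layout).
    Returns a list of (similarity, item_id) pairs, best first.
    """
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, vec)), item_id)
        for item_id, vec in catalog.items()
    )
    return heapq.nlargest(k, scored)


# Toy stand-in for a small catalog: random vectors instead of real embeddings.
random.seed(0)
DIM = 64  # assumption; real embedding sizes are usually larger
catalog = {
    f"item-{i}": normalize([random.gauss(0, 1) for _ in range(DIM)])
    for i in range(3000)
}

# "Find items similar to item-42": the item itself should come back first.
results = top_k(catalog["item-42"], catalog, k=3)
for score, item_id in results:
    print(f"{item_id}: {score:.3f}")
```

There is no index to build and nothing to keep in sync: the whole "search engine" is a loop over a dictionary. Wrapping `top_k` in a timer on your own data and hardware is the quickest way to see whether this is fast enough for your request path.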
To put it in proportion: vector databases tend to earn their place as collections move from tens of thousands into hundreds of thousands and beyond, particularly when low-latency similarity search becomes important. Tellingly, the vendors who sell them say the same: the common industry guidance is that below roughly ten thousand items, ordinary in-memory search is enough. At a few thousand items, this catalog sat below even that floor, and far below the range where a vector database starts to pay for itself.
The proposed design wasn’t sized to the data the business actually had. It was sized to the examples the technology usually arrives wrapped in.
Buying scale you don’t have isn’t free
Reaching for a vector database early looks like harmless future-proofing. It isn’t.
It’s another vendor and another bill. It’s another moving part that can fail or fall out of sync and has to be monitored. And it adds steps and dependencies to something that, at this size, was a simple, instant operation. You take all of that on to handle a scale you may not reach for years, if ever — while the simpler approach would have been faster, cheaper, and easier to support the entire time.
When we’d build one without hesitation
None of this is an argument against vector databases. It’s an argument for matching the tool to the size of the problem.
If this catalog grew into the hundreds of thousands or millions of items. If similarity search became a constant, high-volume part of every request. If “find me similar things” stopped being instant and started being something customers could feel. Any of those, and the calculation flips — and a vector database becomes exactly the right answer. At that point we’d build one without hesitation and tune it to the workload. That’s the job too.
The discipline isn’t avoiding the tool. It’s knowing the size at which it becomes the right one — and being honest with the client about which side of that line they’re actually on today.
A vector database is a strong answer to the problem of scale. The mistake is reaching for it before you’ve checked whether you actually have a scale problem.
Team Cennest
Notes for the technically curious
These figures are illustrative — actual numbers depend on your hardware, embedding size, and latency target, and the right move is always to benchmark your own setup. But the broad thresholds are well established:
- Under ~10,000 items, with modest query volume, in-memory brute-force search is generally enough; you don’t need a dedicated vector database. A vector database starts to make sense once you have tens of thousands of items and need consistent sub-100ms latency at scale. (Redis, “Vector databases: what you need to know,” 2026)
- Exact (brute-force) search is 100% accurate but scales linearly — it checks the query against every item. It stays trivial on small sets; at around 10 million vectors of ~1,536 dimensions, it becomes too slow for real-time queries. (MachineLearningMastery, “Vector Databases Explained in 3 Levels of Difficulty,” 2026)
- As a concrete anchor, one independent benchmark clocked a single brute-force query over ~1 million vectors at roughly 2,700 milliseconds on a single CPU thread — fine offline, far too slow in a live request path.
- Approximate search trades a little accuracy for a large speed gain at scale: the approximate top 10 out of a million items is nearly identical to the exact top 10, but the search can run on the order of a thousand times faster. That trade only pays off once “check everything” has actually become too slow. (MachineLearningMastery, “The Complete Guide to Vector Databases,” 2025)
- The index isn’t free, either — a common approximate index (HNSW) can use 30–50% more memory than the raw vectors it sits on. (Zilliz / Milvus documentation, 2025)
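The figures in these notes can be turned into a rough back-of-the-envelope sizing check. The sketch below is an estimate under stated assumptions, not a benchmark: it assumes embeddings are stored as 32-bit floats, uses the ~1,536-dimension size quoted above, applies the 30–50% HNSW overhead from the last bullet, and picks 3,000 items as a stand-in for "a few thousand." The function names are hypothetical.

```python
BYTES_PER_FLOAT32 = 4          # assumption: embeddings stored as 32-bit floats
DIMS = 1536                    # embedding size quoted in the notes above
HNSW_OVERHEAD = (0.30, 0.50)   # extra memory range quoted for an HNSW index


def raw_vector_bytes(n_items, dims=DIMS):
    """Memory needed just to hold the vectors."""
    return n_items * dims * BYTES_PER_FLOAT32


def hnsw_total_bytes(n_items, dims=DIMS):
    """Low and high estimates of vectors plus HNSW index overhead."""
    raw = raw_vector_bytes(n_items, dims)
    low, high = HNSW_OVERHEAD
    return raw * (1 + low), raw * (1 + high)


def multiply_adds_per_query(n_items, dims=DIMS):
    """Work for one exact query: one dot product per stored vector."""
    return n_items * dims


sizes = {
    "few-thousand catalog (assumed 3,000)": 3_000,
    "1 million items": 1_000_000,
    "10 million items": 10_000_000,
}

for label, n in sizes.items():
    raw_mb = raw_vector_bytes(n) / 1e6
    low, high = hnsw_total_bytes(n)
    print(
        f"{label}: raw {raw_mb:,.1f} MB, "
        f"with HNSW {low / 1e6:,.1f}-{high / 1e6:,.1f} MB, "
        f"{multiply_adds_per_query(n):,} multiply-adds per exact query"
    )
```

Under these assumptions, a 3,000-item catalog is about 18 MB of raw vectors and a few million multiply-adds per query, while 10 million items is about 61 GB before any index is added. The exact numbers will shift with your embedding size and storage format, but the gap between the two cases is the point.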
A useful neutral starting point for comparing real-world performance is the public ANN Benchmarks project, though results still vary by configuration and hardware.