AIONBD Blog

Vector AI Search at the Edge: Exact vs IVF

This page covers how to choose a Vector AI Search strategy for edge deployments, and how the exact, IVF, and auto modes behave under production constraints.


Vector AI Search Modes

exact: scores every candidate and serves as the stable quality baseline.
ivf: prunes candidates for higher throughput, at some cost in recall.
auto: selects a mode by policy as workload behavior changes.
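The tradeoff between exact and ivf can be illustrated with a small, self-contained sketch. It is not AIONBD's implementation: the helper names (build_ivf, ivf_topk, nprobe, n_lists) are hypothetical, the centroids are sampled rather than trained, and squared Euclidean distance is assumed as the metric.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_topk(query, vectors, k):
    """Exact mode: score every vector, return ids of the k nearest."""
    order = sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))
    return order[:k]

def build_ivf(vectors, n_lists, seed=0):
    """Assign each vector to its nearest centroid (centroids are sampled, not trained)."""
    rng = random.Random(seed)
    centroids = [vectors[i] for i in rng.sample(range(len(vectors)), n_lists)]
    lists = [[] for _ in range(n_lists)]
    for i, v in enumerate(vectors):
        nearest = min(range(n_lists), key=lambda c: sq_dist(v, centroids[c]))
        lists[nearest].append(i)
    return centroids, lists

def ivf_topk(query, vectors, centroids, lists, k, nprobe):
    """IVF mode: score only vectors in the nprobe lists closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k]

def recall(approx, truth):
    """Fraction of the exact results that the approximate search also returned."""
    return len(set(approx) & set(truth)) / len(truth)

# Synthetic data, for illustration only.
rng = random.Random(42)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(500)]
query = [rng.gauss(0, 1) for _ in range(8)]

centroids, lists = build_ivf(vectors, n_lists=16)
truth = exact_topk(query, vectors, k=10)
for nprobe in (1, 4, 16):
    approx = ivf_topk(query, vectors, centroids, lists, k=10, nprobe=nprobe)
    print(f"nprobe={nprobe} recall={recall(approx, truth):.2f}")
```

Probing fewer lists scores fewer candidates, which is where the throughput gain and the recall loss both come from. Probing all 16 lists scans every vector, so the result matches exact mode.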

Batch Query Strategy

For throughput-heavy Vector AI Search workloads, use /search/topk/batch. Keep the single-query endpoints for latency-sensitive, low-volume integration paths.
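A client that feeds /search/topk/batch typically has to split a large query set into request bodies that stay within server limits. The sketch below shows one way to do that. The request schema is an assumption: the field names "queries" and "k", and the helper names encode_batch and build_batch_bodies, are hypothetical and not taken from the AIONBD documentation.

```python
import json

def encode_batch(queries, k):
    """Serialize one batch. The {"queries": ..., "k": ...} shape is hypothetical."""
    return json.dumps({"queries": queries, "k": k}, separators=(",", ":")).encode("utf-8")

def build_batch_bodies(queries, k, max_batch, max_body_bytes):
    """Split queries into JSON bodies that respect a batch-size and a byte budget."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    bodies, current = [], []
    for q in queries:
        if len(encode_batch([q], k)) > max_body_bytes:
            raise ValueError("a single query exceeds the body budget")
        trial = current + [q]
        if len(trial) > max_batch or len(encode_batch(trial, k)) > max_body_bytes:
            # Adding q would break a limit: flush the current batch, start a new one.
            bodies.append(encode_batch(current, k))
            current = [q]
        else:
            current = trial
    if current:
        bodies.append(encode_batch(current, k))
    return bodies

# Five queries with at most two per batch yield batches of 2, 2, and 1.
example_queries = [[0.1, 0.2]] * 5
example_bodies = build_batch_bodies(example_queries, k=3, max_batch=2, max_body_bytes=10_000)
print(len(example_bodies))
```

Each returned body would be sent as one POST to /search/topk/batch. Keeping the byte budget at or below the server's AIONBD_MAX_BODY_BYTES value avoids rejected requests.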

Runtime Guards for Stability

Set AIONBD_MAX_TOPK_LIMIT to control fanout cost.
Set AIONBD_MAX_CONCURRENCY based on p95/p99 behavior.
Set AIONBD_MAX_BODY_BYTES to cap request body size and guard against oversized-payload spikes.
Track fallback and cache metrics to tune IVF behavior.
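One way to derive a value for AIONBD_MAX_CONCURRENCY from p95/p99 behavior is to load-test at several concurrency levels and keep the highest level that stays within a latency budget. The sketch below assumes that approach. The function names, the budgets, and the latency samples are all made up for illustration.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile (pct in 1..100) of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def pick_max_concurrency(runs, p95_budget_ms, p99_budget_ms):
    """Return the highest tested concurrency whose p95 and p99 stay within budget."""
    passing = [
        level
        for level, latencies in runs.items()
        if percentile(latencies, 95) <= p95_budget_ms
        and percentile(latencies, 99) <= p99_budget_ms
    ]
    return max(passing) if passing else None

# Hypothetical load-test results: concurrency level -> request latencies in ms.
runs = {
    8: [10] * 95 + [20] * 5,
    16: [12] * 95 + [40] * 5,
    32: [15] * 90 + [200] * 10,
}

chosen = pick_max_concurrency(runs, p95_budget_ms=25, p99_budget_ms=50)
print(chosen)  # 16: level 32 exceeds both budgets in these samples
```

The chosen level would then be exported as AIONBD_MAX_CONCURRENCY before starting the server. A result of None means no tested level met the budget, so the budget or the test matrix needs revisiting.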

Practical Tuning Workflow

Start with the default safety-first durability profile.
Run benchmark and soak baselines with production-like data shapes.
Tune read path knobs, then write path knobs, then runtime limits.
Roll back quickly if the 5xx ratio or p95/p99 latency materially regresses.
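The rollback rule in the last step can be made mechanical by comparing a post-change snapshot against the baseline. The sketch below is one possible check. The Snapshot type, the function name, and the thresholds (0.5 percentage points of extra 5xx, 20% latency growth) are assumptions for illustration, since the source does not define what counts as a material regression.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    requests: int
    errors_5xx: int
    p95_ms: float
    p99_ms: float

def regression_reasons(baseline, candidate, max_5xx_delta=0.005, max_latency_ratio=1.2):
    """List which signals regressed past the (hypothetical) thresholds."""
    if baseline.requests == 0 or candidate.requests == 0:
        raise ValueError("snapshots must contain at least one request")
    reasons = []
    base_ratio = baseline.errors_5xx / baseline.requests
    cand_ratio = candidate.errors_5xx / candidate.requests
    if cand_ratio - base_ratio > max_5xx_delta:
        reasons.append("5xx ratio")
    if candidate.p95_ms > baseline.p95_ms * max_latency_ratio:
        reasons.append("p95")
    if candidate.p99_ms > baseline.p99_ms * max_latency_ratio:
        reasons.append("p99")
    return reasons

# Made-up numbers: tail latency regresses while errors and p95 hold steady.
baseline = Snapshot(requests=10_000, errors_5xx=10, p95_ms=20.0, p99_ms=45.0)
candidate = Snapshot(requests=10_000, errors_5xx=12, p95_ms=21.0, p99_ms=70.0)

reasons = regression_reasons(baseline, candidate)
print("roll back" if reasons else "keep", reasons)
```

A non-empty list is the signal to revert the most recent knob change before tuning further.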

Reference: performance_tuning.md