Vector Database

Milvus

A database designed to quickly search and find similar items in large collections of unstructured data like images, text, and audio by comparing their numerical representations.

Tags: Milvus, vector database, similarity search, vector embeddings, unstructured data
Created: December 18, 2025

What is Milvus?

Milvus is an open-source, cloud-native vector database purpose-built for scalable, high-performance similarity search on massive unstructured datasets. Developed by Zilliz and released under the Apache 2.0 license, Milvus efficiently stores, indexes, and queries high-dimensional vector embeddings, the numerical representations of data generated by AI and machine learning models.

The platform is designed for elastic scaling from laptop prototyping to enterprise production deployments managing tens of billions of vectors across distributed architectures. Milvus powers applications in semantic search, recommendation systems, retrieval-augmented generation (RAG), computer vision, and anomaly detection—enabling organizations to build AI applications requiring fast, accurate similarity search on unstructured data.

Core Concepts

Vector Embeddings

High-dimensional arrays (e.g., 128, 768, or 4096 dimensions) that encode the semantic or structural information of unstructured data such as text, images, and audio. Generated by embedding models (for example, models from OpenAI, Hugging Face transformers, and other neural networks), embeddings translate complex data into a format suitable for efficient mathematical comparison.

Semantically similar items have embeddings located close together in high-dimensional space, enabling similarity search through distance calculations.
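To make "close together" concrete, here is a minimal sketch that compares toy embeddings with cosine similarity. The three-dimensional vectors are made-up values chosen for illustration, not output from a real embedding model, and the helper name is an assumption.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values, not produced by a real model).
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
truck = [0.1, 0.2, 0.95]

sim_cat_kitten = cosine_similarity(cat, kitten)
sim_cat_truck = cosine_similarity(cat, truck)

print(f"cat vs kitten: {sim_cat_kitten:.3f}")
print(f"cat vs truck:  {sim_cat_truck:.3f}")
```

The related pair scores much higher than the unrelated pair, which is the property similarity search relies on.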

Unstructured Data

Data without a predefined schema or structure, such as free-form text, images, audio, and video. Unlike relational data, unstructured data is difficult to process and analyze with traditional databases. Vector embeddings represent this data as fixed-length vectors, enabling efficient indexing, search, and retrieval.

Similarity Search and ANN

Similarity Search: Finding the items in a dataset that are most similar to a query item, based on a vector distance metric (Euclidean distance, cosine similarity, or inner product).

Approximate Nearest Neighbor (ANN): A family of algorithms that rapidly retrieve items whose vectors are closest to a query vector, trading a small amount of accuracy for significant speed gains. This trade-off is essential for billion-scale datasets.
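As a point of reference, the sketch below implements two of the metrics named above and an exact (brute-force) top-k search. This is the baseline that ANN algorithms approximate; it compares the query against every vector, which is why it does not scale to very large datasets. The function names are illustrative assumptions.

```python
import heapq
import math

def l2_distance(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Inner product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def exact_top_k(query, vectors, k):
    """Scan every vector and return the k closest as (distance, index) pairs."""
    scored = ((l2_distance(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)

vectors = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0]]
top2 = exact_top_k([1.0, 1.0], vectors, k=2)
print(top2)
```

An ANN index aims to return mostly the same neighbors while examining only a small fraction of the vectors.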

Architecture

Microservices-Based Design

Milvus implements a multi-layered microservices architecture with disaggregated storage and compute. The design separates the data plane from the control plane, so each layer can scale independently and be operated flexibly.

Major Components:

Access Layer: Stateless proxies handling client requests and APIs (RESTful, SDKs), validating requests, aggregating results.

Coordinator Services: Orchestrate load balancing, metadata management, system state, DDL/DCL operations, and task scheduling.

Worker Nodes: Stateless executors for search, data insertion, indexing.

  • Streaming Node: Handles real-time data ingestion and streaming consistency
  • Query Node: Loads and queries historical (sealed) data
  • Data Node: Background tasks like compaction and index-building

Object Storage: Persists vector data, indexes, logs. Supports MinIO, AWS S3, Azure Blob.

Meta Storage: Uses etcd for metadata and cluster state.

WAL Storage: Write-Ahead Log for data durability and recovery (Kafka, Pulsar).

Deployment Options

| Mode | Description | Use Case |
| --- | --- | --- |
| Milvus Lite | Python library installed via pip; runs embedded | Prototyping, local dev |
| Standalone | Docker-based single-node deployment | Testing, small production |
| Distributed | Kubernetes-based with horizontal scaling | Enterprise, large-scale |
| Zilliz Cloud | Fully managed SaaS (vendor-claimed 10x performance acceleration) | Production, hassle-free |
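From the client side, these modes differ mainly in what the client connects to. The sketch below is one way to pick connection arguments per mode, assuming the PyMilvus `MilvusClient` constructor's `uri` and `token` parameters. The helper name `connection_kwargs` is hypothetical, the default port 19530 is assumed for a local server, and the cloud endpoint and token shown are placeholders.

```python
def connection_kwargs(mode, endpoint=None, token=None):
    """Return keyword arguments for MilvusClient for a given deployment mode.

    Hypothetical helper for illustration; not part of PyMilvus.
    """
    if mode == "lite":
        # Milvus Lite: a local file path, data is stored in that file.
        return {"uri": "./milvus_demo.db"}
    if mode in ("standalone", "distributed"):
        # Self-hosted server; assumes the default port on localhost if no endpoint is given.
        return {"uri": endpoint or "http://localhost:19530"}
    if mode == "zilliz_cloud":
        # Managed service: both the cluster endpoint and an API token are required.
        if not (endpoint and token):
            raise ValueError("zilliz_cloud mode needs an endpoint and a token")
        return {"uri": endpoint, "token": token}
    raise ValueError(f"unknown mode: {mode}")

lite_args = connection_kwargs("lite")
server_args = connection_kwargs("standalone")
cloud_args = connection_kwargs(
    "zilliz_cloud",
    endpoint="https://example-cluster.invalid",  # placeholder endpoint
    token="YOUR_API_KEY",                        # placeholder token
)

# Usage (requires `pip install pymilvus`):
# from pymilvus import MilvusClient
# client = MilvusClient(**lite_args)
```

Keeping the application code identical across modes and changing only the connection arguments is what makes the laptop-to-cluster path practical.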

Scalability

Horizontal Scaling: Compute and storage scale independently. Stateless microservices allow elastic recovery orchestrated by Kubernetes.

Hardware Optimization: SIMD instruction sets (e.g., AVX-512), GPU acceleration (NVIDIA CUDA, including the CAGRA index), NVMe SSD support.

Billion-Scale Support: Proven stability for datasets with tens of billions of vectors used in production by major enterprises.

Key Features

Supported Data Types

Dense Vectors: float32, float16, int8 arrays (from BERT, CLIP, ResNet).

Sparse Vectors: Efficient for high-dimensional data with many zeros (text search, recommendation).

Binary Vectors: Compact, bit-packed representation for hashing or vision tasks.

Primitives: Integer, float, string, boolean.

JSON/Array/Set: Semi-structured metadata and multi-modal modeling.

Indexing Algorithms

| Algorithm | Description | Use Case |
| --- | --- | --- |
| HNSW | Hierarchical Navigable Small World; graph-based | Versatile, high-dimensional data |
| IVF | Inverted File index; partitions the vector space into clusters | Balanced speed/cost |
| DiskANN | On-disk index for massive datasets | Billions of vectors, SSD |
| Flat | Linear scan for highest precision | Small datasets, evaluation |
| CAGRA | GPU-optimized graph-based index | High-throughput, GPU infra |
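To illustrate how an IVF-style index partitions the vector space, here is a minimal sketch, not Milvus's implementation. It assumes the centroids are already given (a real index learns them, typically with k-means), and the function names are hypothetical. The `nprobe` parameter controls how many partitions are scanned per query.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe partitions whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in lists[c]]
    candidates.sort(key=lambda vid: l2(query, vectors[vid]))
    return candidates[:k]

# Two fixed centroids (assumed; normally learned from the data).
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = [[0.1, 0.2], [0.3, 0.1], [9.8, 10.1], [10.2, 9.9], [4.9, 4.9]]

lists = build_ivf(vectors, centroids)
hits = ivf_search([9.9, 10.0], vectors, centroids, lists, k=2, nprobe=1)
print(lists, hits)
```

With `nprobe=1` the query touches only two of the five vectors. Raising `nprobe` scans more partitions, which improves recall at the cost of speed.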

Key Concepts:

  • Graph-based indexes (HNSW) typically outperform IVF for low-k, high-recall queries
  • IVF is typically the better fit for large top-k queries
  • DiskANN ideal for SSD-backed, billion-scale datasets
  • Quantization (SQ8, PQ) compresses vectors for memory efficiency
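The memory saving from scalar quantization can be shown with a small sketch. This is a simplified illustration of the SQ8 idea (one byte per dimension instead of a 4-byte float32), not the exact encoding any particular index uses; the function names are assumptions.

```python
def sq8_train(vectors):
    """Learn the per-dimension value range from sample vectors."""
    dims = range(len(vectors[0]))
    lo = [min(v[d] for v in vectors) for d in dims]
    hi = [max(v[d] for v in vectors) for d in dims]
    return lo, hi

def sq8_encode(vec, lo, hi):
    """Map each float to an integer code in 0..255 (one byte per dimension)."""
    codes = []
    for x, l, h in zip(vec, lo, hi):
        scale = (h - l) or 1.0          # avoid division by zero for constant dimensions
        code = round((x - l) / scale * 255)
        codes.append(min(255, max(0, code)))  # clamp values outside the trained range
    return bytes(codes)

def sq8_decode(codes, lo, hi):
    """Reconstruct approximate floats from the byte codes."""
    return [l + c / 255 * (h - l) for c, l, h in zip(codes, lo, hi)]

training = [[0.0, -1.0], [1.0, 1.0], [0.5, 0.0]]
lo, hi = sq8_train(training)

original = [0.5, 0.0]
code = sq8_encode(original, lo, hi)
restored = sq8_decode(code, lo, hi)
print(len(code), restored)
```

Each dimension shrinks from four bytes to one, and the reconstruction error per dimension stays within one quantization step of the trained range.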

Search Capabilities

ANN Search: Find top-K vectors most similar to query.

Filtering Search: Combine vector search with metadata filtering (tags, ranges).

Range Search: Retrieve vectors within distance threshold.

Hybrid Search: Use multiple vector fields/modalities in query.

Full-Text Search: BM25-based search for textual fields.

Reranking: Refine initial results with secondary algorithms.

Fetch by ID: Retrieve items by primary key or complex expressions.
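Three of the capabilities above can be sketched in plain Python: filtering search, range search, and the result fusion used in hybrid search and reranking. This is an illustration of the concepts over an in-memory list, not how Milvus executes them. The filter is applied before ranking here for simplicity, and reciprocal rank fusion (RRF) with a constant of 60 is shown as one common fusion method; all names and data are assumptions.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy entities: a vector plus scalar metadata (hypothetical data).
items = [
    {"id": 1, "vector": [0.0, 0.0], "color": "red"},
    {"id": 2, "vector": [0.1, 0.1], "color": "blue"},
    {"id": 3, "vector": [0.2, 0.0], "color": "red"},
    {"id": 4, "vector": [3.0, 3.0], "color": "red"},
]

def filtered_search(query, items, predicate, k):
    """Keep only entities matching the metadata predicate, then rank by distance."""
    candidates = [it for it in items if predicate(it)]
    candidates.sort(key=lambda it: l2(query, it["vector"]))
    return [it["id"] for it in candidates[:k]]

def range_search(query, items, radius):
    """Return every entity whose distance to the query is within the radius."""
    return [it["id"] for it in items if l2(query, it["vector"]) <= radius]

def rrf_fuse(rankings, k=60):
    """Merge several ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

red_hits = filtered_search([0.0, 0.0], items, lambda it: it["color"] == "red", k=2)
near_hits = range_search([0.0, 0.0], items, radius=0.25)
# Fuse two rankings, e.g. one from a dense vector field and one from a sparse field.
fused = rrf_fuse([[1, 2, 3], [2, 4, 1]])
print(red_hits, near_hits, fused)
```

Items that rank well in several lists rise to the top of the fused result, even if no single list ranks them first.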

Data Operations

Collections & Partitions: Organize data hierarchically for efficient access.

Schema Evolution: Update collection schemas without downtime.

CRUD Operations: Insert, update, delete, upsert vectors and metadata.

Batch Processing: Bulk import/export tools.

Multi-Tenancy: Isolation by database, collection, or partition key.

Consistency and Security

Configurable Consistency: Strong, bounded staleness, session, eventual consistency models.

Authentication & RBAC: User authentication, role-based access control, fine-grained permissions.

TLS Encryption: Secure data-in-transit.

Tiered Storage: Hot/cold storage for cost-efficient performance.
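The difference between the consistency levels listed above can be modeled as a freshness requirement: a query may run only once the serving node has caught up to a required timestamp. The sketch below is a simplified model of that idea under stated assumptions, not Milvus's internal code. The level strings, function names, and the staleness window of 5 time units are all hypothetical.

```python
def required_timestamp(level, now, last_write_ts=0, staleness=5):
    """Timestamp the serving node must have reached before answering a query."""
    if level == "strong":
        return now                       # must see every write up to this moment
    if level == "bounded":
        return max(now - staleness, 0)   # may lag by at most `staleness` time units
    if level == "session":
        return last_write_ts             # must see this client's own writes
    if level == "eventual":
        return 0                         # answer immediately with whatever is visible
    raise ValueError(f"unknown consistency level: {level}")

def can_serve(node_ts, level, now, **kwargs):
    """True if a node synced up to node_ts can answer without waiting."""
    return node_ts >= required_timestamp(level, now, **kwargs)

now, node_ts = 100, 97   # the node is 3 time units behind
for level in ("strong", "bounded", "session", "eventual"):
    print(level, can_serve(node_ts, level, now, last_write_ts=98))
```

Stronger levels make queries wait for fresher data, while weaker levels reduce latency by tolerating some lag.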

Integration Ecosystem

SDKs and APIs

Language Support: Python (PyMilvus), Java, Go, Node.js, C#, RESTful API.

AI Framework Integrations: LangChain, LlamaIndex, OpenAI, Hugging Face, DSPy, Haystack, Ragas, MemGPT.

Data Processing: Apache Spark connector for ML pipelines.

Observability: Prometheus and Grafana for monitoring.

Admin Tools: Attu (GUI), Birdwatcher (debugging), Milvus Backup & CDC, Vector Transport Service (VTS, for migration).

Example: Basic Usage with PyMilvus

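from pymilvus import MilvusClient

# Connect to Milvus Lite; a local file path stores all data in that file
client = MilvusClient("milvus_demo.db")

# Create a collection with the quick-setup defaults:
# an integer primary key "id" and a 5-dimensional vector field "vector"
client.create_collection(
    collection_name="demo_collection",
    dimension=5
)

# Insert entities; each entity is a dict keyed by field name
data = [{"id": 0, "vector": [0.1, 0.2, 0.3, 0.4, 0.5]}]
client.insert(collection_name="demo_collection", data=data)

# Perform similarity search and return the single closest match
query_vector = [0.1, 0.2, 0.3, 0.4, 0.5]
results = client.search(
    collection_name="demo_collection",
    data=[query_vector],
    limit=1
)
print(results)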

Use Cases

Retrieval-Augmented Generation (RAG): Connects LLMs to external knowledge bases via vector search, enabling accurate, contextually relevant AI responses grounded in retrieved documents.

Recommendation Systems: Surfaces content, products, ads based on user preference embeddings and item features. Used in e-commerce, streaming, news feeds.

Computer Vision: Image similarity search, object detection, classification using visual embeddings. Enables reverse image search, medical image retrieval, retail visual search.

Natural Language Processing: Semantic search, document clustering, chatbot retrieval using text embeddings. Used for legal document search, contextual chatbots, FAQ systems.

Fraud & Anomaly Detection: Vectorizes transaction patterns or network events for real-time anomaly detection in financial fraud and cybersecurity.

Scientific Research: Molecular similarity search, genomic analysis, materials science applications.
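The RAG flow described at the top of this list (retrieve relevant documents, then ground the model's answer in them) can be sketched without any external service. In this minimal illustration an in-memory list stands in for the vector database, the two-dimensional vectors are made-up stand-ins for real embeddings, and the function names and prompt wording are assumptions. The final call to a language model is left out.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, docs, k):
    """Return the k documents whose vectors are most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vector"]), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Assemble a prompt that grounds the answer in the retrieved passages."""
    context = "\n".join(f"- {p['text']}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical knowledge base with toy embeddings.
docs = [
    {"text": "Milvus stores vector embeddings.", "vector": [0.9, 0.1]},
    {"text": "Bananas are yellow.", "vector": [0.1, 0.9]},
    {"text": "ANN search trades accuracy for speed.", "vector": [0.8, 0.3]},
]

question_vec = [1.0, 0.2]  # stand-in for the embedded question
top = retrieve(question_vec, docs, k=2)
prompt = build_prompt("What does Milvus store?", top)
print(prompt)
```

In a real pipeline the `retrieve` step would be a vector database search and the prompt would be sent to an LLM. The unrelated document never reaches the prompt, which is how retrieval keeps responses grounded.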

Industry Adoption

Organizations using Milvus for diverse AI and analytics workloads include NVIDIA, Salesforce, eBay, Walmart, IBM, Shopee, Tokopedia, AT&T, PayPal, ZipRecruiter, SmartNews, LINE, Bosch, Intuit, Roblox, Compass, OMERS, and New Relic.

Comparison with Other Vector Databases

| Feature | Milvus | Pinecone | Weaviate | Qdrant | Chroma |
| --- | --- | --- | --- | --- | --- |
| Open Source | Yes (Apache) | No (SaaS) | Yes | Yes | Yes |
| Deployment | Self, Cloud, K8s | SaaS | Self, Cloud | Self, Cloud | Self, Cloud |
| Scalability | Excellent | Managed | Good | Good | Limited |
| Index Types | HNSW, IVF, DiskANN, CAGRA | Proprietary | HNSW | HNSW, IVF | HNSW, Annoy |
| Vector Types | Dense, sparse, binary | Dense | Dense | Dense | Dense |
| Metadata Filtering | Advanced | Basic | GraphQL | Advanced | Basic |
| GPU Acceleration | Yes (CUDA, SIMD, AVX) | Some | No | No | No |

Milvus Advantages: Rich index diversity, proven billion-scale performance, open community, broad SDK support, hybrid and multi-modal search, enterprise-grade security.
