vector database

scroll ↓ to Resources

Note

  • Vector databases are better thought of as search engines than as general-purpose databases
  • A vector database indexes and stores vector embeddings for fast similarity search and optimized storage
  • Makes it possible to compare many items semantically at the same time
  • Helps machine learning models remember past data better, making them more useful for search, recommendations, and text generation
  • Currently, modern vector databases ship a Swiss-army-knife toolset covering vector search, data storage, similarity measurement, reranking, etc.
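
To make "similarity measurement" and "vector search" concrete, here is a minimal sketch of exact (brute-force) cosine-similarity search in plain Python. The function names and the toy three-dimensional vectors are assumptions for illustration only; real embeddings have hundreds of dimensions, and a real engine would use an index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_search(query, index, top_k=2):
    """Score every stored vector against the query and return the top_k (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy "embeddings" (hypothetical values, not produced by any real model).
toy_index = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

results = brute_force_search([1.0, 0.0, 0.0], toy_index, top_k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

The full scan costs one comparison per stored vector, which is the part a vector database replaces with an approximate index once the collection grows.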

Vendors

How to choose a vector database

  • see Resources
  • support for role-based access control (RBAC) and multi-tenant isolation
  • support for multiple embeddings per document
  • algorithmic details
    • sparse algorithms beyond BM25: SPLADE
    • automatic switching between ANN and brute force search
      • can also be implemented manually in code: if n_docs < N, switch to full (brute-force) search
  • self-hosted version
  • Costs:
    • free tier
    • 50k / 500k / 1M… vectors with
  • Performance:
  • community and ongoing support (in the case of open-source projects)
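
The manual switch between ANN and brute-force search noted in the list above could look roughly like the sketch below. Everything here is an assumption for illustration: the threshold value, the function names, and the `ann_search` callable, which stands in for whatever approximate index a given vendor exposes.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_search(query, vectors, top_k):
    """Full scan: rank all vector positions by distance to the query."""
    ranked = sorted(range(len(vectors)), key=lambda i: l2_distance(query, vectors[i]))
    return ranked[:top_k]


def search(query, vectors, top_k=3, brute_force_threshold=10_000, ann_search=None):
    """Use exact search for small collections, otherwise delegate to an ANN index.

    `ann_search` is a hypothetical callable (query, top_k) -> list of ids;
    the default threshold is an arbitrary placeholder and should be tuned by measurement.
    """
    n_docs = len(vectors)
    if n_docs < brute_force_threshold or ann_search is None:
        return "exact", exact_search(query, vectors, top_k)
    return "ann", ann_search(query, top_k)


vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
mode, ids = search([0.9, 0.9], vectors, top_k=2)
print(mode, ids)
```

For small collections the full scan is exact and often fast enough, so the approximate index only pays off above some size that depends on dimensionality and hardware.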

Resources


table file.inlinks, filter(file.outlinks, (x) => !contains(string(x), ".jpg") AND !contains(string(x), ".pdf") AND !contains(string(x), ".png")) as "Outlinks" from [[]] and !outgoing([[]])  AND -"Changelog"