What is Haystack?

Haystack is an open-source framework for building search systems that work intelligently over large document collections. Recent advances in NLP have made question answering, retrieval, and summarization practical in real-world settings, and Haystack is designed to bridge research and industry.

  • NLP for Search: Pick components that perform retrieval, question answering, reranking and much more.

  • Latest models: Use transformer-based models (BERT, RoBERTa, MiniLM, DPR) and switch smoothly when new ones are published.

  • Flexible databases: Load data into and query from a range of databases such as Elasticsearch, Milvus, FAISS, SQL and more.

  • Scalability: Scale your system to handle millions of documents and deploy it via a REST API.

  • Domain adaptation: Get all the tooling you need to annotate examples, collect user feedback, evaluate components, and fine-tune models.

Haystack is designed to take your search to the next level. Keyword search is effective and appropriate for many situations, but machine learning has enabled systems to search based on word meaning rather than string matching. As new language processing models are developed, new styles of search become possible. In Haystack, you can create systems that perform:

This is just a small subset of the kinds of systems that can be created in Haystack.
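
To make the contrast between string matching and meaning-based search concrete, here is a minimal sketch in plain Python. It does not use Haystack. The vectors in `TOY_VECTORS` are hand-made values chosen for illustration, and the helper names (`keyword_match`, `cosine_similarity`, `semantic_rank`) are hypothetical. A real system would obtain embeddings from a trained model rather than a lookup table.

```python
import math
from typing import Dict, List

# ASSUMPTION: hand-made 3-dimensional "embeddings", not produced by any model.
# Words with similar meaning are given vectors that point in similar directions.
TOY_VECTORS: Dict[str, List[float]] = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.0],
    "banana": [0.0, 0.1, 0.9],
}


def keyword_match(query: str, document: str) -> bool:
    """String matching: true only if the query appears as a token in the document."""
    return query.lower() in document.lower().split()


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_rank(query: str, documents: List[str]) -> List[str]:
    """Order documents by vector similarity to the query, most similar first."""
    query_vector = TOY_VECTORS[query]
    return sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, TOY_VECTORS[doc]),
        reverse=True,
    )


if __name__ == "__main__":
    # Keyword search misses the synonym because the strings differ.
    print(keyword_match("car", "automobile"))
    # Vector similarity ranks the synonym above the unrelated word.
    print(semantic_rank("car", ["banana", "automobile"]))
```

In this toy setup the keyword check cannot connect "car" with "automobile", while the similarity ranking places "automobile" ahead of "banana" because their vectors point in nearly the same direction.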

How Haystack Works

Haystack is geared towards building search pipelines that are customizable and production ready. You can interact with the components in Haystack at three levels:

  • Nodes
  • Pipelines
  • REST API
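
The relationship between the first two levels can be sketched in plain Python. This is an illustration of the idea only, not Haystack's API: the names `Document`, `KeywordRetrieverNode`, `TopKNode`, and `Pipeline`, and the convention of passing a dictionary between nodes, are all assumptions made for this example. The REST API level, which would expose a pipeline over HTTP, is not shown.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# NOTE: every name below is hypothetical and chosen for illustration;
# none of it is taken from Haystack's actual classes or signatures.


@dataclass
class Document:
    text: str
    score: float = 0.0


class KeywordRetrieverNode:
    """A node: scores each document by how many query terms it contains."""

    def __init__(self, corpus: List[str]) -> None:
        self.corpus = corpus

    def run(self, state: Dict[str, Any]) -> Dict[str, Any]:
        terms = set(state["query"].lower().split())
        hits = []
        for text in self.corpus:
            overlap = len(terms & set(text.lower().split()))
            if overlap > 0:
                hits.append(Document(text=text, score=float(overlap)))
        return {**state, "documents": hits}


class TopKNode:
    """A node: keeps only the k highest-scoring documents."""

    def __init__(self, k: int) -> None:
        self.k = k

    def run(self, state: Dict[str, Any]) -> Dict[str, Any]:
        ranked = sorted(state["documents"], key=lambda d: d.score, reverse=True)
        return {**state, "documents": ranked[: self.k]}


class Pipeline:
    """A pipeline: runs nodes in order, feeding each node's output to the next."""

    def __init__(self) -> None:
        self.nodes: List[Any] = []

    def add_node(self, node: Any) -> "Pipeline":
        self.nodes.append(node)
        return self

    def run(self, query: str) -> Dict[str, Any]:
        state: Dict[str, Any] = {"query": query}
        for node in self.nodes:
            state = node.run(state)
        return state


if __name__ == "__main__":
    corpus = [
        "Haystack builds search pipelines",
        "Elasticsearch stores documents",
        "Pipelines connect nodes",
    ]
    pipeline = Pipeline().add_node(KeywordRetrieverNode(corpus)).add_node(TopKNode(k=1))
    result = pipeline.run("search pipelines")
    print(result["documents"][0].text)
```

Each node does one job and can be swapped independently, while the pipeline only decides the order in which nodes run. That separation is the design idea the three levels point at.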

To find out more, visit our Documentation.

Documentation