Type: object
Schema URL: https://catalog.lintel.tools/schemas/schemastore/haystack-pipeline/_shared/latest--haystack-pipeline-1.10.0.schema.json
Parent schema: haystack-pipeline

Haystack Pipeline YAML file describing the nodes of the pipelines. For more info read the docs at: https://haystack.deepset.ai/components/pipelines#yaml-file-definitions

Properties

components (DeepsetCloudDocumentStoreComponent | ElasticsearchDocumentStoreComponent | FAISSDocumentStoreComponent | GraphDBKnowledgeGraphComponent | InMemoryDocumentStoreComponent | InMemoryKnowledgeGraphComponent | Milvus2DocumentStoreComponent | OpenDistroElasticsearchDocumentStoreComponent | OpenSearchDocumentStoreComponent | PineconeDocumentStoreComponent | SQLDocumentStoreComponent | WeaviateDocumentStoreComponent | AnswerToSpeechComponent | AzureConverterComponent | BM25RetrieverComponent | CrawlerComponent | DensePassageRetrieverComponent | Docs2AnswersComponent | DocumentToSpeechComponent | DocxToTextConverterComponent | ElasticsearchFilterOnlyRetrieverComponent | ElasticsearchRetrieverComponent | EmbeddingRetrieverComponent | EntityExtractorComponent | EvalAnswersComponent | EvalDocumentsComponent | FARMReaderComponent | FileTypeClassifierComponent | FilterRetrieverComponent | ImageToTextConverterComponent | JoinAnswersComponent | JoinDocumentsComponent | MarkdownConverterComponent | MultiModalRetrieverComponent | MultihopEmbeddingRetrieverComponent | OpenAIAnswerGeneratorComponent | PDFToTextConverterComponent | PDFToTextOCRConverterComponent | ParsrConverterComponent | PreProcessorComponent | PseudoLabelGeneratorComponent | QuestionGeneratorComponent | RAGeneratorComponent | RCIReaderComponent | RouteDocumentsComponent | SentenceTransformersRankerComponent | Seq2SeqGeneratorComponent | SklearnQueryClassifierComponent | TableReaderComponent | TableTextRetrieverComponent | Text2SparqlRetrieverComponent | TextConverterComponent | TfidfRetrieverComponent | TikaConverterComponent | TransformersDocumentClassifierComponent | TransformersQueryClassifierComponent | TransformersReaderComponent | TransformersSummarizerComponent | TransformersTranslatorComponent)[] required

Component nodes and their configurations, for later use in the pipelines section. Define all the building blocks for the pipelines here.

pipelines object[] required

Multiple pipelines can be defined using the components from the same YAML file.

version string required

Version of the Haystack Pipeline file.

Constant: "1.10.0"
extras string

Specify this only if the file contains special pipelines (for example, if this is a Ray pipeline).

Values: "ray"

One of

1. object
pipelines
2. object
extras enum required
Values: "ray"
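
The top-level rules above can be sketched in Python as a plain dictionary check. This is only an illustration: `check_top_level` is a hypothetical helper, the empty lists are placeholders for real component and pipeline entries, and it covers just the properties listed in this section rather than the full schema.

```python
# Hypothetical helper: checks only the top-level rules listed above.
REQUIRED_KEYS = ("components", "pipelines", "version")
SCHEMA_VERSION = "1.10.0"  # the constant given for `version`
ALLOWED_EXTRAS = ("ray",)  # the only value listed for `extras`


def check_top_level(config: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f"missing required property: {key}")
    if "version" in config and config["version"] != SCHEMA_VERSION:
        problems.append(f"version must be {SCHEMA_VERSION!r}")
    for key in ("components", "pipelines"):
        if key in config and not isinstance(config[key], list):
            problems.append(f"{key} must be an array")
    if "extras" in config and config["extras"] not in ALLOWED_EXTRAS:
        problems.append("extras must be one of: " + ", ".join(ALLOWED_EXTRAS))
    return problems


# Placeholder document: the empty lists stand in for real entries.
config = {"version": "1.10.0", "components": [], "pipelines": []}
print(check_top_level(config))
```

For real files, validating against the schema at the Schema URL is the more reliable route; this sketch only makes the required/optional split concrete.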

Definitions

AnswerToSpeechComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "AnswerToSpeech"
params object

Each parameter can reference other components defined in the same YAML file.

6 nested properties
audio_params object | null
devices string | string[] | null
generated_audio_dir string
Default: "generated_audio_answers"
format=path
model_name_or_path string
Default: "espnet/kan-bayashi_ljspeech_vits"
progress_bar boolean
Default: true
transformers_params object | null
AzureConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "AzureConverter"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
credential_key string required
endpoint string required
add_page_number boolean
Default: true
following_context_len integer
Default: 3
id_hash_keys string[] | null
merge_multiple_column_headers boolean
Default: true
model_id string
Default: "prebuilt-document"
preceding_context_len integer
Default: 3
save_json boolean
Default: false
valid_languages string[] | null
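
As a small illustration of the `required` markers in the listing above, the sketch below reports which required `AzureConverter` parameters are absent from a component entry. The component name, the endpoint value, and the helper `missing_required` are made up for this example.

```python
# Required parameter names copied from the AzureConverter listing above.
REQUIRED_PARAMS = ("credential_key", "endpoint")


def missing_required(component: dict) -> list:
    """Return the required parameter names that the component does not set."""
    params = component.get("params") or {}
    return [name for name in REQUIRED_PARAMS if name not in params]


# Placeholder entry: the name and endpoint are illustrative values only.
component = {
    "name": "MyAzureConverter",
    "type": "AzureConverter",
    "params": {"endpoint": "https://example.invalid", "save_json": True},
}
print(missing_required(component))
```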
BM25RetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "BM25Retriever"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
document_store string required
all_terms_must_match boolean
Default: false
custom_query string | null
scale_score boolean
Default: true
top_k integer
Default: 10
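
To illustrate how a parameter such as `document_store` can point at another component by name, the sketch below pairs a `BM25Retriever` entry with an `ElasticsearchDocumentStore` entry, written as Python dictionaries that mirror the YAML. The names `MyStore` and `MyRetriever` and the helper `unresolved_references` are hypothetical, and treating only `document_store` as a reference is a simplifying assumption.

```python
# Two component entries as they might appear under `components`.
components = [
    {
        "name": "MyStore",
        "type": "ElasticsearchDocumentStore",
        "params": {"host": "localhost"},
    },
    {
        "name": "MyRetriever",
        "type": "BM25Retriever",
        "params": {"document_store": "MyStore", "top_k": 10},
    },
]


def unresolved_references(components, reference_params=("document_store",)):
    """List (component, parameter, target) triples whose target name is undefined."""
    names = {c["name"] for c in components}
    missing = []
    for c in components:
        params = c.get("params") or {}
        for param in reference_params:
            target = params.get(param)
            if target is not None and target not in names:
                missing.append((c["name"], param, target))
    return missing


print(unresolved_references(components))
```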
CrawlerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "Crawler"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
output_dir string required
crawler_depth integer
Default: 1
crawler_naming_function string | null
Default: null
extract_hidden_text
Default: true
filter_urls array | null
id_hash_keys string[] | null
loading_wait_time integer | null
overwrite_existing_files
Default: true
urls string[] | null
webdriver_options string[] | null
DeepsetCloudDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DeepsetCloudDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

9 nested properties
api_endpoint string | null
api_key string
duplicate_documents string
Default: "overwrite"
embedding_dim integer
Default: 768
index string | null
label_index string
Default: "default"
return_embedding boolean
Default: false
similarity string
Default: "dot_product"
workspace string
Default: "default"
DensePassageRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DensePassageRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

17 nested properties
document_store string required
batch_size integer
Default: 16
devices string | string[] | null
embed_title boolean
Default: true
global_loss_buffer_size integer
Default: 150000
max_seq_len_passage integer
Default: 256
max_seq_len_query integer
Default: 64
model_version string | null
passage_embedding_model string
Default: "facebook/dpr-ctx_encoder-single-nq-base"
progress_bar boolean
Default: true
query_embedding_model string
Default: "facebook/dpr-question_encoder-single-nq-base"
scale_score boolean
Default: true
similarity_function string
Default: "dot_product"
top_k integer
Default: 10
use_auth_token boolean | string | null
use_fast_tokenizers boolean
Default: true
use_gpu boolean
Default: true
Docs2AnswersComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "Docs2Answers"
params object

Each parameter can reference other components defined in the same YAML file.

1 nested property
progress_bar boolean
Default: true
DocumentToSpeechComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DocumentToSpeech"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
audio_params object | null
generated_audio_dir string
Default: "generated_audio_documents"
format=path
model_name_or_path string
Default: "espnet/kan-bayashi_ljspeech_vits"
transformers_params object | null
DocxToTextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DocxToTextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
id_hash_keys string[] | null
progress_bar boolean
Default: true
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
ElasticsearchDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "ElasticsearchDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

33 nested properties
analyzer string
Default: "standard"
api_key string | null
api_key_id string | null
aws4auth
ca_certs string | null
content_field string
Default: "content"
create_index boolean
Default: true
custom_mapping object | null
duplicate_documents string
Default: "overwrite"
embedding_dim integer
Default: 768
embedding_field string
Default: "embedding"
excluded_meta_data array | null
host string | string[]
Default: "localhost"
index string
Default: "document"
index_type string
Default: "flat"
label_index string
Default: "label"
name_field string
Default: "name"
password string
Default: ""
port integer | integer[]
Default: 9200
recreate_index boolean
Default: false
refresh_type string
Default: "wait_for"
return_embedding boolean
Default: false
scheme string
Default: "http"
scroll string
Default: "1d"
search_fields string | array
Default: "content"
similarity string
Default: "dot_product"
skip_missing_embeddings boolean
Default: true
synonym_type string
Default: "synonym"
synonyms array | null
timeout integer
Default: 30
use_system_proxy boolean
Default: false
username string
Default: ""
verify_certs boolean
Default: true
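
None of the `ElasticsearchDocumentStore` parameters above is marked required, so a file only has to name the ones it wants to change. The sketch below overlays user-supplied `params` on a handful of the listed defaults, assuming omitted parameters fall back to those defaults. `ES_DEFAULTS` copies only part of the listing, and `effective_params` and the host name are hypothetical.

```python
# A few defaults copied from the listing above (not the full set of 33).
ES_DEFAULTS = {
    "host": "localhost",
    "port": 9200,
    "scheme": "http",
    "index": "document",
    "similarity": "dot_product",
    "embedding_dim": 768,
    "timeout": 30,
}


def effective_params(user_params: dict) -> dict:
    """Overlay user-supplied params on the defaults; user values win."""
    merged = dict(ES_DEFAULTS)
    merged.update(user_params)
    return merged


# Only two parameters are overridden; the rest keep their listed defaults.
params = effective_params({"host": "search.example.invalid", "similarity": "cosine"})
print(params["scheme"], params["host"], params["port"], params["similarity"])
```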
ElasticsearchFilterOnlyRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "ElasticsearchFilterOnlyRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
document_store string required
all_terms_must_match boolean
Default: false
custom_query string | null
top_k integer
Default: 10
ElasticsearchRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "ElasticsearchRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
document_store string required
all_terms_must_match boolean
Default: false
custom_query string | null
top_k integer
Default: 10
EmbeddingRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "EmbeddingRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

16 nested properties
document_store string required
embedding_model string required
api_key string | null
batch_size integer
Default: 32
devices string | string[] | null
emb_extraction_layer integer
Default: -1
embed_meta_fields string[]
Default:
[]
max_seq_len integer
Default: 512
model_format string | null
model_version string | null
pooling_strategy string
Default: "reduce_mean"
progress_bar boolean
Default: true
scale_score boolean
Default: true
top_k integer
Default: 10
use_auth_token boolean | string | null
use_gpu boolean
Default: true
EntityExtractorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "EntityExtractor"
params object

Each parameter can reference other components defined in the same YAML file.

14 nested properties
add_prefix_space boolean | null
aggregation_strategy string
Default: "first"
Values: "simple" "first" "average" "max"
batch_size integer
Default: 16
devices string | string[] | null
flatten_entities_in_meta_data boolean
Default: false
ignore_labels string[] | null
max_seq_len integer
model_name_or_path string
Default: "elastic/distilbert-base-cased-finetuned-conll03-english"
model_version string | null
num_workers integer
Default: 0
pre_split_text boolean
Default: false
progress_bar boolean
Default: true
use_auth_token boolean | string | null
use_gpu boolean
Default: true
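
The `aggregation_strategy` entry above combines a default with a fixed set of allowed values. A minimal sketch of that rule in Python, using a hypothetical helper named `resolve_aggregation_strategy`:

```python
# Allowed values and default copied from the EntityExtractor listing above.
AGGREGATION_STRATEGIES = ("simple", "first", "average", "max")
DEFAULT_STRATEGY = "first"


def resolve_aggregation_strategy(params: dict) -> str:
    """Return the configured strategy, or the default when it is omitted."""
    value = params.get("aggregation_strategy", DEFAULT_STRATEGY)
    if value not in AGGREGATION_STRATEGIES:
        allowed = ", ".join(AGGREGATION_STRATEGIES)
        raise ValueError(f"aggregation_strategy must be one of: {allowed}")
    return value


print(resolve_aggregation_strategy({}))
print(resolve_aggregation_strategy({"aggregation_strategy": "max"}))
```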
EvalAnswersComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "EvalAnswers"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
debug boolean
Default: false
open_domain boolean
Default: true
sas_model string
skip_incorrect_retrieval boolean
Default: true
EvalDocumentsComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "EvalDocuments"
params object

Each parameter can reference other components defined in the same YAML file.

3 nested properties
debug boolean
Default: false
open_domain boolean
Default: true
top_k integer
Default: 10
FAISSDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "FAISSDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

18 nested properties
duplicate_documents string
Default: "overwrite"
ef_construction integer
Default: 80
ef_search integer
Default: 20
embedding_dim integer
Default: 768
embedding_field string
Default: "embedding"
faiss_config_path string
faiss_index string | null
Default: null
faiss_index_factory_str string
Default: "Flat"
faiss_index_path string
index string
Default: "document"
isolation_level string
n_links integer
Default: 64
progress_bar boolean
Default: true
return_embedding boolean
Default: false
similarity string
Default: "dot_product"
sql_url string
Default: "sqlite:///faiss_document_store.db"
validate_index_sync boolean
Default: true
vector_dim integer
FARMReaderComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "FARMReader"
params object

Each parameter can reference other components defined in the same YAML file.

22 nested properties
model_name_or_path string required
batch_size integer
Default: 50
confidence_threshold number | null
context_window_size integer
Default: 150
devices string | string[] | null
doc_stride integer
Default: 128
duplicate_filtering integer
Default: 0
force_download
Default: false
local_files_only
Default: false
max_seq_len integer
Default: 256
model_version string | null
no_ans_boost number
Default: 0.0
num_processes integer | null
progress_bar boolean
Default: true
proxies Record<string, string>
return_no_answer boolean
Default: false
top_k integer
Default: 10
top_k_per_candidate integer
Default: 3
top_k_per_sample integer
Default: 1
use_auth_token boolean | string | null
use_confidence_scores boolean
Default: true
use_gpu boolean
Default: true
FileTypeClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "FileTypeClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

1 nested property
supported_types string[]
Default:
[
  "txt",
  "pdf",
  "md",
  "docx",
  "html"
]
FilterRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "FilterRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
document_store string required
all_terms_must_match boolean
Default: false
custom_query string | null
scale_score boolean
Default: true
top_k integer
Default: 10
GraphDBKnowledgeGraphComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "GraphDBKnowledgeGraph"
params object

Each parameter can reference other components defined in the same YAML file.

6 nested properties
host string
Default: "localhost"
index string | null
password string
Default: ""
port integer
Default: 7200
prefixes string
Default: ""
username string
Default: ""
ImageToTextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "ImageToTextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

3 nested properties
id_hash_keys string[] | null
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
Default:
[
  "eng"
]
InMemoryDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "InMemoryDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

11 nested properties
devices string | string[] | null
duplicate_documents string
Default: "overwrite"
embedding_dim integer
Default: 768
embedding_field string | null
Default: "embedding"
index string
Default: "document"
label_index string
Default: "label"
progress_bar boolean
Default: true
return_embedding boolean
Default: false
scoring_batch_size integer
Default: 500000
similarity string
Default: "dot_product"
use_gpu boolean
Default: true
InMemoryKnowledgeGraphComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "InMemoryKnowledgeGraph"
params object

Each parameter can reference other components defined in the same YAML file.

1 nested property
index string
Default: "document"
JoinAnswersComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "JoinAnswers"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
join_mode string
Default: "concatenate"
sort_by_score boolean
Default: true
top_k_join integer | null
weights number[] | null
JoinDocumentsComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "JoinDocuments"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
join_mode string
Default: "concatenate"
sort_by_score boolean
Default: true
top_k_join integer | null
weights number[] | null
MarkdownConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "MarkdownConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
id_hash_keys string[] | null
progress_bar boolean
Default: true
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
Milvus2DocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "Milvus2DocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

21 nested properties
connection_pool string
Default: "SingletonThread"
consistency_level integer
Default: 0
custom_fields array | null
duplicate_documents string
Default: "overwrite"
embedding_dim integer
Default: 768
embedding_field string
Default: "embedding"
host string
Default: "localhost"
id_field string
Default: "id"
index string
Default: "document"
index_file_size integer
Default: 1024
index_param object | null
index_type string
Default: "IVF_FLAT"
isolation_level string
port string
Default: "19530"
progress_bar boolean
Default: true
recreate_index boolean
Default: false
return_embedding boolean
Default: false
search_param object | null
similarity string
Default: "dot_product"
sql_url string
Default: "sqlite:///"
vector_dim integer
MultiModalRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "MultiModalRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

14 nested properties
document_embedding_models Record<string, string> required
document_store string required
query_embedding_model string required
batch_size integer
Default: 16
devices string | string[] | null
document_feature_extractors_params Record<string, object>
Default:
{
  "text": {
    "max_length": 256
  }
}
embed_meta_fields string[]
Default:
[
  "name"
]
progress_bar boolean
Default: true
query_feature_extractor_params object
Default:
{
  "max_length": 64
}
query_type string
Default: "text"
scale_score boolean
Default: true
similarity_function string
Default: "dot_product"
top_k integer
Default: 10
use_auth_token boolean | string | null
MultihopEmbeddingRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "MultihopEmbeddingRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

16 nested properties
document_store string required
embedding_model string required
batch_size integer
Default: 32
devices string | string[] | null
emb_extraction_layer integer
Default: -1
embed_meta_fields string[]
Default:
[]
max_seq_len integer
Default: 512
model_format string
Default: "farm"
model_version string | null
num_iterations integer
Default: 2
pooling_strategy string
Default: "reduce_mean"
progress_bar boolean
Default: true
scale_score boolean
Default: true
top_k integer
Default: 10
use_auth_token boolean | string | null
use_gpu boolean
Default: true
OpenAIAnswerGeneratorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "OpenAIAnswerGenerator"
params object

Each parameter can reference other components defined in the same YAML file.

11 nested properties
api_key string required
examples array | null
examples_context string | null
frequency_penalty number
Default: -2.0
max_tokens integer
Default: 13
model string
Default: "text-curie-001"
presence_penalty number
Default: -2.0
progress_bar boolean
Default: true
stop_words array | null
temperature number
Default: 0.2
top_k integer
Default: 5
OpenDistroElasticsearchDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "OpenDistroElasticsearchDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

33 nested properties
analyzer string
Default: "standard"
api_key string | null
api_key_id string | null
aws4auth
ca_certs string | null
content_field string
Default: "content"
create_index boolean
Default: true
custom_mapping object | null
duplicate_documents string
Default: "overwrite"
embedding_dim integer
Default: 768
embedding_field string
Default: "embedding"
excluded_meta_data array | null
host string | string[]
Default: "localhost"
index string
Default: "document"
index_type string
Default: "flat"
label_index string
Default: "label"
name_field string
Default: "name"
password string
Default: "admin"
port integer | integer[]
Default: 9200
recreate_index boolean
Default: false
refresh_type string
Default: "wait_for"
return_embedding boolean
Default: false
scheme string
Default: "https"
scroll string
Default: "1d"
search_fields string | array
Default: "content"
similarity string
Default: "cosine"
skip_missing_embeddings boolean
Default: true
synonym_type string
Default: "synonym"
synonyms array | null
timeout integer
Default: 30
use_system_proxy boolean
Default: false
username string
Default: "admin"
verify_certs boolean
Default: false
OpenSearchDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "OpenSearchDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

34 nested properties
analyzer string
Default: "standard"
api_key string | null
api_key_id string | null
aws4auth
ca_certs string | null
content_field string
Default: "content"
create_index boolean
Default: true
custom_mapping object | null
duplicate_documents string
Default: "overwrite"
embedding_dim integer
Default: 768
embedding_field string
Default: "embedding"
excluded_meta_data array | null
host string | string[]
Default: "localhost"
index string
Default: "document"
index_type string
Default: "flat"
knn_engine string
Default: "nmslib"
label_index string
Default: "label"
name_field string
Default: "name"
password string
Default: "admin"
port integer | integer[]
Default: 9200
recreate_index boolean
Default: false
refresh_type string
Default: "wait_for"
return_embedding boolean
Default: false
scheme string
Default: "https"
scroll string
Default: "1d"
search_fields string | array
Default: "content"
similarity string
Default: "dot_product"
skip_missing_embeddings boolean
Default: true
synonym_type string
Default: "synonym"
synonyms array | null
timeout integer
Default: 30
use_system_proxy boolean
Default: false
username string
Default: "admin"
verify_certs boolean
Default: false
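
The Elasticsearch, OpenDistro, and OpenSearch document stores share most parameter names but not every default, which is easy to miss when switching `type`. The sketch below compares a few defaults copied from the three listings; the `SELECTED_DEFAULTS` table covers only those five parameters, and `differing_defaults` is a hypothetical helper.

```python
# Selected defaults copied from the three document store listings.
SELECTED_DEFAULTS = {
    "ElasticsearchDocumentStore": {
        "username": "", "password": "", "scheme": "http",
        "similarity": "dot_product", "verify_certs": True,
    },
    "OpenDistroElasticsearchDocumentStore": {
        "username": "admin", "password": "admin", "scheme": "https",
        "similarity": "cosine", "verify_certs": False,
    },
    "OpenSearchDocumentStore": {
        "username": "admin", "password": "admin", "scheme": "https",
        "similarity": "dot_product", "verify_certs": False,
    },
}


def differing_defaults(first: str, second: str) -> dict:
    """Map each parameter whose default differs to its (first, second) values."""
    a, b = SELECTED_DEFAULTS[first], SELECTED_DEFAULTS[second]
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


print(differing_defaults("ElasticsearchDocumentStore", "OpenSearchDocumentStore"))
print(differing_defaults("OpenDistroElasticsearchDocumentStore", "OpenSearchDocumentStore"))
```

Of the three, only `OpenSearchDocumentStore` lists `knn_engine`, so it is left out of the comparison table.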
PDFToTextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PDFToTextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
encoding string | null
Default: "UTF-8"
id_hash_keys string[] | null
keep_physical_layout boolean
Default: false
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
PDFToTextOCRConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PDFToTextOCRConverter"
params object

Each parameter can reference other components defined in the same YAML file.

3 nested properties
id_hash_keys string[] | null
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
Default:
[
  "eng"
]
ParsrConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "ParsrConverter"
params object

Each parameter can reference other components defined in the same YAML file.

11 nested properties
add_page_number boolean
Default: true
extractor string
Default: "pdfminer"
Values: "pdfminer" "pdfjs"
following_context_len integer
Default: 3
id_hash_keys string[] | null
parsr_url string
Default: "http://localhost:3001"
preceding_context_len integer
Default: 3
remove_page_footers boolean
Default: false
remove_page_headers boolean
Default: false
remove_table_of_contents boolean
Default: false
table_detection_mode string
Default: "lattice"
Values: "lattice" "stream"
valid_languages string[] | null
PineconeDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PineconeDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

15 nested properties
api_key string required
duplicate_documents string
Default: "overwrite"
embedding_dim integer
Default: 768
embedding_field string
Default: "embedding"
environment string
Default: "us-west1-gcp"
index string
Default: "document"
metadata_config object
Default:
{
  "indexed": []
}
pinecone_index string | null
Default: null
progress_bar boolean
Default: true
recreate_index boolean
Default: false
replicas integer
Default: 1
return_embedding boolean
Default: false
shards integer
Default: 1
similarity string
Default: "cosine"
validate_index_sync boolean
Default: true
PreProcessorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PreProcessor"
params object

Each parameter can reference other components defined in the same YAML file.

13 nested properties
add_page_number boolean
Default: false
clean_empty_lines boolean
Default: true
clean_header_footer boolean
Default: false
clean_whitespace boolean
Default: true
id_hash_keys string[] | null
language string
Default: "en"
progress_bar boolean
Default: true
remove_substrings string[]
Default:
[]
split_by string
Default: "word"
Values: "word" "sentence" "passage"
split_length integer
Default: 200
split_overlap integer
Default: 0
split_respect_sentence_boundary boolean
Default: true
tokenizer_model_folder string | null
PseudoLabelGeneratorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PseudoLabelGenerator"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
question_producer string | object[] required
retriever string required
batch_size integer
Default: 16
cross_encoder_model_name_or_path string
Default: "cross-encoder/ms-marco-MiniLM-L-6-v2"
devices string | string[] | null
max_questions_per_document integer
Default: 3
progress_bar boolean
Default: true
top_k integer
Default: 50
use_auth_token boolean | string | null
use_gpu boolean
Default: true
QuestionGeneratorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "QuestionGenerator"
params object

Each parameter can reference other components defined in the same YAML file.

17 nested properties
batch_size integer
Default: 16
devices string | string[] | null
early_stopping boolean
Default: true
length_penalty number
Default: 1.5
max_length integer
Default: 256
model_name_or_path string
Default: "valhalla/t5-base-e2e-qg"
model_version string | null
no_repeat_ngram_size integer
Default: 3
num_beams integer
Default: 4
num_queries_per_doc integer
Default: 1
progress_bar boolean
Default: true
prompt string
Default: "generate questions:"
sep_token string
Default: "<sep>"
split_length integer
Default: 50
split_overlap integer
Default: 10
use_auth_token boolean | string | null
use_gpu boolean
Default: true
RAGeneratorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "RAGenerator"
params object

Each parameter can reference other components defined in the same YAML file.

14 nested properties
devices string | string[] | null
embed_title boolean
Default: true
generator_type string
Default: "token"
max_length integer
Default: 200
min_length integer
Default: 2
model_name_or_path string
Default: "facebook/rag-token-nq"
model_version string | null
num_beams integer
Default: 2
prefix string | null
progress_bar boolean
Default: true
retriever string | null
Default: null
top_k integer
Default: 2
use_auth_token boolean | string | null
use_gpu boolean
Default: true
RCIReaderComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "RCIReader"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
column_model_name_or_path string
Default: "michaelrglass/albert-base-rci-wikisql-col"
column_model_version string | null
column_tokenizer string | null
max_seq_len integer
Default: 256
row_model_name_or_path string
Default: "michaelrglass/albert-base-rci-wikisql-row"
row_model_version string | null
row_tokenizer string | null
top_k integer
Default: 10
use_auth_token boolean | string | null
use_gpu boolean
Default: true
RouteDocumentsComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "RouteDocuments"
params object

Each parameter can reference other components defined in the same YAML file.

2 nested properties
metadata_values string[] | null
split_by string
Default: "content_type"
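
A minimal sketch of a RouteDocuments entry follows. It assumes, as the parameter names suggest, that split_by may name a metadata field and that metadata_values lists the values to route on. The field name "language", its values, and the component name are hypothetical.

```yaml
components:
  - name: RouteByLanguage   # hypothetical name
    type: RouteDocuments
    params:
      split_by: language    # assumed metadata field; the default is "content_type"
      metadata_values:      # assumed values of that field
        - en
        - de
```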
SQLDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "SQLDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

6 nested properties
check_same_thread boolean
Default: false
duplicate_documents string
Default: "overwrite"
index string
Default: "document"
isolation_level string
label_index string
Default: "label"
url string
Default: "sqlite://"
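
The fragment below sketches a SQLDocumentStore entry that overrides a few defaults. The component name and the database file path are hypothetical. "skip" is assumed to be an accepted duplicate_documents value, since the schema lists only the default.

```yaml
components:
  - name: DocStore                          # hypothetical name
    type: SQLDocumentStore
    params:
      url: sqlite:///haystack_documents.db  # hypothetical file; the default is "sqlite://"
      index: document
      label_index: label
      duplicate_documents: skip             # assumed value; the default is "overwrite"
```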
SentenceTransformersRankerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "SentenceTransformersRanker"
params object

Each parameter can reference other components defined in the same YAML file.

9 nested properties
model_name_or_path string required
batch_size integer
Default: 16
devices string | string[] | null
model_version string | null
progress_bar boolean
Default: true
scale_score boolean
Default: true
top_k integer
Default: 10
use_auth_token boolean | string | null
use_gpu boolean
Default: true
Seq2SeqGeneratorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "Seq2SeqGenerator"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
model_name_or_path string required
devices string | string[] | null
input_converter string | null
Default: null
max_length integer
Default: 200
min_length integer
Default: 2
num_beams integer
Default: 8
progress_bar boolean
Default: true
top_k integer
Default: 1
use_auth_token boolean | string | null
use_gpu boolean
Default: true
SklearnQueryClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "SklearnQueryClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
batch_size integer | null
model_name_or_path string
Default: "https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier/model.pickle"
progress_bar boolean
Default: true
vectorizer_name_or_path string
Default: "https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier/vectorizer.pickle"
TableReaderComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TableReader"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
devices string | string[] | null
max_seq_len integer
Default: 256
model_name_or_path string
Default: "google/tapas-base-finetuned-wtq"
model_version string | null
return_no_answer boolean
Default: false
tokenizer string | null
top_k integer
Default: 10
top_k_per_candidate integer
Default: 3
use_auth_token boolean | string | null
use_gpu boolean
Default: true
TableTextRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TableTextRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

20 nested properties
document_store string required
batch_size integer
Default: 16
devices string | string[] | null
embed_meta_fields string[]
Default:
[
  "name",
  "section_title",
  "caption"
]
global_loss_buffer_size integer
Default: 150000
max_seq_len_passage integer
Default: 256
max_seq_len_query integer
Default: 64
max_seq_len_table integer
Default: 256
model_version string | null
passage_embedding_model string
Default: "deepset/bert-small-mm_retrieval-passage_encoder"
progress_bar boolean
Default: true
query_embedding_model string
Default: "deepset/bert-small-mm_retrieval-question_encoder"
scale_score boolean
Default: true
similarity_function string
Default: "dot_product"
table_embedding_model string
Default: "deepset/bert-small-mm_retrieval-table_encoder"
top_k integer
Default: 10
use_auth_token boolean | string | null
use_fast boolean
Default: true
use_fast_tokenizers boolean
Default: true
use_gpu boolean
Default: true
Text2SparqlRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "Text2SparqlRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
knowledge_graph string required
model_name_or_path string
model_version string | null
top_k integer
Default: 1
use_auth_token boolean | string | null
TextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
id_hash_keys string[] | null
progress_bar boolean
Default: true
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
TfidfRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TfidfRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

3 nested properties
document_store string required
auto_fit
Default: true
top_k integer
Default: 10
TikaConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TikaConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
id_hash_keys string[] | null
remove_numeric_tables boolean
Default: false
tika_url string
Default: "http://localhost:9998/tika"
valid_languages string[] | null
TransformersDocumentClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersDocumentClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

12 nested properties
batch_size integer
Default: 16
classification_field string
devices string | string[] | null
labels string[] | null
model_name_or_path string
Default: "bhadresh-savani/distilbert-base-uncased-emotion"
model_version string | null
progress_bar boolean
Default: true
task string
Default: "text-classification"
tokenizer string | null
top_k integer | null
Default: 1
use_auth_token boolean | string | null
use_gpu boolean
Default: true
TransformersQueryClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersQueryClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
batch_size integer
Default: 16
devices string | string[] | null
labels string[]
Default:
[
  "LABEL_1",
  "LABEL_0"
]
model_name_or_path string
Default: "shahrukhx01/bert-mini-finetune-question-detection"
model_version string | null
progress_bar boolean
Default: true
task string
Default: "text-classification"
tokenizer string | null
use_auth_token boolean | string | null
use_gpu boolean
Default: true
TransformersReaderComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersReader"
params object

Each parameter can reference other components defined in the same YAML file.

13 nested properties
batch_size integer
Default: 16
context_window_size integer
Default: 70
devices string | string[] | null
doc_stride integer
Default: 128
max_seq_len integer
Default: 256
model_name_or_path string
Default: "distilbert-base-uncased-distilled-squad"
model_version string | null
return_no_answers boolean
Default: false
tokenizer string | null
top_k integer
Default: 10
top_k_per_candidate integer
Default: 3
use_auth_token boolean | string | null
use_gpu boolean
Default: true
TransformersSummarizerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersSummarizer"
params object

Each parameter can reference other components defined in the same YAML file.

13 nested properties
batch_size integer
Default: 16
clean_up_tokenization_spaces boolean
Default: true
devices string | string[] | null
generate_single_summary boolean
Default: false
max_length integer
Default: 200
min_length integer
Default: 5
model_name_or_path string
Default: "google/pegasus-xsum"
model_version string | null
progress_bar boolean
Default: true
separator_for_single_summary string
Default: " "
tokenizer string | null
use_auth_token boolean | string | null
use_gpu boolean
Default: true
TransformersTranslatorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersTranslator"
params object

Each parameter can reference other components defined in the same YAML file.

8 nested properties
model_name_or_path string required
clean_up_tokenization_spaces boolean | null
Default: true
devices string | string[] | null
max_seq_len integer | null
progress_bar boolean
Default: true
tokenizer_name string | null
use_auth_token boolean | string | null
use_gpu boolean
Default: true
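
Because model_name_or_path is required and has no default, a TransformersTranslator entry always needs a params block. A minimal sketch, in which the component name and the model identifier are illustrative choices rather than values taken from the schema:

```yaml
components:
  - name: Translator            # hypothetical name
    type: TransformersTranslator
    params:
      model_name_or_path: Helsinki-NLP/opus-mt-en-de  # illustrative model choice
      use_gpu: false            # overrides the default of true
```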
WeaviateDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "WeaviateDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

17 nested properties
content_field string
Default: "content"
custom_schema object | null
duplicate_documents string
Default: "overwrite"
embedding_dim integer
Default: 768
embedding_field string
Default: "embedding"
host string | string[]
Default: "http://localhost"
index string
Default: "Document"
index_type string
Default: "hnsw"
name_field string
Default: "name"
password string
port integer | integer[]
Default: 8080
progress_bar boolean
Default: true
recreate_index boolean
Default: false
return_embedding boolean
Default: false
similarity string
Default: "cosine"
timeout_config array
Default:
[
  5,
  15
]
username string
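
A hedged sketch of a WeaviateDocumentStore entry that overrides the connection settings and keeps the remaining defaults. The component name, host, and timeout values are hypothetical.

```yaml
components:
  - name: DocStore                   # hypothetical name
    type: WeaviateDocumentStore
    params:
      host: http://weaviate.internal # hypothetical host; the default is "http://localhost"
      port: 8080
      index: Document
      embedding_dim: 768
      similarity: cosine
      timeout_config: [10, 30]       # two values, like the default [5, 15]
```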