Type object
Schema URL https://catalog.lintel.tools/schemas/schemastore/haystack-pipeline/_shared/latest--haystack-pipeline-1.23.0.schema.json
Parent schema haystack-pipeline

Haystack Pipeline YAML file describing the nodes of the pipelines. For more information, see the docs at: https://haystack.deepset.ai/components/pipelines#yaml-file-definitions

Properties

version string required

Version of the Haystack Pipeline file.

Constant: "1.23.0"
components DeepsetCloudDocumentStoreComponent | ElasticsearchDocumentStoreComponent | FAISSDocumentStoreComponent | InMemoryDocumentStoreComponent | OpenSearchDocumentStoreComponent | PineconeDocumentStoreComponent | SQLDocumentStoreComponent | WeaviateDocumentStoreComponent | AnswerParserComponent | AzureConverterComponent | BM25RetrieverComponent | BaseOutputParserComponent | CohereRankerComponent | CrawlerComponent | CsvTextConverterComponent | DensePassageRetrieverComponent | DiversityRankerComponent | Docs2AnswersComponent | DocumentMergerComponent | DocxToTextConverterComponent | EmbeddingRetrieverComponent | EntityExtractorComponent | FARMReaderComponent | FileTypeClassifierComponent | FilterRetrieverComponent | ImageToTextConverterComponent | JoinAnswersComponent | JoinDocumentsComponent | JsonConverterComponent | LangdetectDocumentLanguageClassifierComponent | LinkContentFetcherComponent | LostInTheMiddleRankerComponent | MarkdownConverterComponent | MultiModalRetrieverComponent | MultihopEmbeddingRetrieverComponent | PDFToTextConverterComponent | ParsrConverterComponent | PptxConverterComponent | PreProcessorComponent | PromptModelComponent | PromptNodeComponent | PromptTemplateComponent | PseudoLabelGeneratorComponent | QuestionGeneratorComponent | RCIReaderComponent | RecentnessRankerComponent | RouteDocumentsComponent | SentenceTransformersRankerComponent | ShaperComponent | TableReaderComponent | TableTextRetrieverComponent | TextConverterComponent | TfidfRetrieverComponent | TikaConverterComponent | TopPSamplerComponent | TransformersDocumentClassifierComponent | TransformersDocumentLanguageClassifierComponent | TransformersImageToTextComponent | TransformersQueryClassifierComponent | TransformersReaderComponent | TransformersSummarizerComponent | TransformersTranslatorComponent | WebRetrieverComponent | WebSearchComponent | WhisperTranscriberComponent[] required

Component nodes and their configurations, to be used later in the pipelines section. Define all the building blocks for the pipelines here.

pipelines object[] required

Multiple pipelines can be defined using the components from the same YAML file.

extras string

Specify this only if the file contains special pipelines (for example, if this is a Ray pipeline).

Values: "ray"

One of

1. object
pipelines
2. object
extras enum required
Values: "ray"
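
As a rough illustration of how the top-level properties above fit together, a minimal file might look like the sketch below. The component names (DocumentStore, Retriever) and the pipeline name are hypothetical. The shape of each pipelines entry (name, nodes, inputs) is not spelled out in this section and is assumed here from the Haystack 1.x YAML conventions.

```yaml
# Minimal sketch of a pipeline file; names are illustrative only.
version: "1.23.0"              # must equal the schema constant

components:
  - name: DocumentStore        # custom name, referenced by other components
    type: InMemoryDocumentStore
    params:
      use_bm25: true
  - name: Retriever
    type: BM25Retriever
    params:
      document_store: DocumentStore   # reference to the component defined above
      top_k: 10

pipelines:
  # Assumed entry shape (not described in this section).
  - name: query
    nodes:
      - name: Retriever
        inputs: [Query]

# extras: ray                  # only for special pipelines such as Ray
```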

Definitions

DeepsetCloudDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DeepsetCloudDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

11 nested properties
api_key string | null
workspace string
Default: "default"
index string | null
duplicate_documents string
Default: "overwrite"
api_endpoint string | null
similarity string
Default: "dot_product"
return_embedding boolean
Default: false
label_index string
Default: "default"
embedding_dim integer
Default: 768
use_prefiltering boolean
Default: false
search_fields string | array
Default: "content"
ElasticsearchDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "ElasticsearchDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

33 nested properties
host string | string[]
Default: "localhost"
port integer | integer[]
Default: 9200
username string
Default: ""
password string
Default: ""
api_key_id string | null
api_key string | null
aws4auth
index string
Default: "document"
label_index string
Default: "label"
search_fields string | array
Default: "content"
content_field string
Default: "content"
name_field string
Default: "name"
embedding_field string
Default: "embedding"
embedding_dim integer
Default: 768
custom_mapping object | null
excluded_meta_data array | null
analyzer string
Default: "standard"
scheme string
Default: "http"
ca_certs string | null
verify_certs boolean
Default: true
recreate_index boolean
Default: false
create_index boolean
Default: true
refresh_type string
Default: "wait_for"
similarity string
Default: "dot_product"
timeout integer
Default: 300
return_embedding boolean
Default: false
duplicate_documents string
Default: "overwrite"
scroll string
Default: "1d"
skip_missing_embeddings boolean
Default: true
synonyms array | null
synonym_type string
Default: "synonym"
use_system_proxy boolean
Default: false
batch_size integer
Default: 10000
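
A short sketch of an ElasticsearchDocumentStore entry, using only parameters listed above. The host name and the choice of which defaults to override are hypothetical. Any parameter left out falls back to its listed default.

```yaml
components:
  - name: DocumentStore                  # hypothetical component name
    type: ElasticsearchDocumentStore
    params:
      host: es.example.internal          # illustrative host; default is "localhost"
      port: 9200
      scheme: https                      # default is "http"
      verify_certs: true
      index: document
      embedding_dim: 768
      duplicate_documents: overwrite
```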
FAISSDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "FAISSDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

19 nested properties
sql_url string
Default: "sqlite:///faiss_document_store.db"
vector_dim integer | null
embedding_dim integer
Default: 768
faiss_index_factory_str string
Default: "Flat"
faiss_index
return_embedding boolean
Default: false
index string
Default: "document"
similarity string
Default: "dot_product"
embedding_field string
Default: "embedding"
progress_bar boolean
Default: true
duplicate_documents string
Default: "overwrite"
faiss_index_path string | string | null
faiss_config_path string | string | null
isolation_level string | null
n_links integer
Default: 64
ef_search integer
Default: 20
ef_construction integer
Default: 80
validate_index_sync boolean
Default: true
batch_size integer
Default: 10000
InMemoryDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "InMemoryDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

15 nested properties
index string
Default: "document"
label_index string
Default: "label"
embedding_field string | null
Default: "embedding"
embedding_dim integer
Default: 768
return_embedding boolean
Default: false
similarity string
Default: "dot_product"
progress_bar boolean
Default: true
duplicate_documents string
Default: "overwrite"
use_gpu boolean
Default: true
scoring_batch_size integer
Default: 500000
devices string[] | null
use_bm25 boolean
Default: false
bm25_tokenization_regex string
Default: "(?u)\b\w\w+\b"
bm25_algorithm string
Default: "BM25Okapi"
Values: "BM25Okapi" "BM25L" "BM25Plus"
bm25_parameters object | null
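
For local experiments, an InMemoryDocumentStore entry could be sketched as follows. The component name is hypothetical. The bm25_algorithm value is one of the three enumerated above.

```yaml
components:
  - name: InMemoryStore          # hypothetical component name
    type: InMemoryDocumentStore
    params:
      use_bm25: true             # default is false
      bm25_algorithm: BM25Plus   # one of BM25Okapi, BM25L, BM25Plus
      use_gpu: false             # default is true
```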
OpenSearchDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "OpenSearchDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

37 nested properties
scheme string
Default: "https"
username string
Default: "admin"
password string
Default: "admin"
host string | string[]
Default: "localhost"
port integer | integer[]
Default: 9200
api_key_id string | null
api_key string | null
aws4auth
index string
Default: "document"
label_index string
Default: "label"
search_fields string | array
Default: "content"
content_field string
Default: "content"
name_field string
Default: "name"
embedding_field string
Default: "embedding"
embedding_dim integer
Default: 768
custom_mapping object | null
excluded_meta_data array | null
analyzer string
Default: "standard"
ca_certs string | null
verify_certs boolean
Default: false
recreate_index boolean
Default: false
create_index boolean
Default: true
refresh_type string
Default: "wait_for"
similarity string
Default: "dot_product"
timeout integer
Default: 300
return_embedding boolean
Default: false
duplicate_documents string
Default: "overwrite"
index_type string
Default: "flat"
scroll string
Default: "1d"
skip_missing_embeddings boolean
Default: true
synonyms array | null
synonym_type string
Default: "synonym"
use_system_proxy boolean
Default: false
knn_engine string
Default: "nmslib"
knn_parameters object | null
ivf_train_size integer | null
batch_size integer
Default: 10000
PineconeDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PineconeDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

18 nested properties
api_key string required
environment string
Default: "us-west1-gcp"
pinecone_index
embedding_dim integer
Default: 768
pods integer
Default: 1
pod_type string
Default: "p1.x1"
return_embedding boolean
Default: false
index string
Default: "document"
similarity string
Default: "cosine"
replicas integer
Default: 1
shards integer
Default: 1
namespace string | null
embedding_field string
Default: "embedding"
progress_bar boolean
Default: true
duplicate_documents string
Default: "overwrite"
recreate_index boolean
Default: false
metadata_config object | null
validate_index_sync boolean
Default: true
SQLDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "SQLDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

6 nested properties
url string
Default: "sqlite://"
index string
Default: "document"
label_index string
Default: "label"
duplicate_documents string
Default: "overwrite"
check_same_thread boolean
Default: false
isolation_level string | null
WeaviateDocumentStoreComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "WeaviateDocumentStore"
params object

Each parameter can reference other components defined in the same YAML file.

24 nested properties
host string | string[]
Default: "http://localhost"
port integer | integer[]
Default: 8080
timeout_config array
Default:
[
  5,
  15
]
username string | null
password string | null
scope string | null
Default: "offline_access"
api_key string | null
use_embedded boolean
Default: false
embedded_options object | null
additional_headers object | null
index string
Default: "Document"
embedding_dim integer
Default: 768
content_field string
Default: "content"
name_field string
Default: "name"
similarity string
Default: "cosine"
index_type string
Default: "hnsw"
custom_schema object | null
return_embedding boolean
Default: false
embedding_field string
Default: "embedding"
progress_bar boolean
Default: true
duplicate_documents string
Default: "overwrite"
recreate_index boolean
Default: false
replication_factor integer
Default: 1
batch_size integer
Default: 10000
AnswerParserComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "AnswerParser"
params object

Each parameter can reference other components defined in the same YAML file.

2 nested properties
pattern string | null
reference_pattern string | null
AzureConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "AzureConverter"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
endpoint string required
credential_key string required
model_id string
Default: "prebuilt-document"
valid_languages string[] | null
save_json boolean
Default: false
preceding_context_len integer
Default: 3
following_context_len integer
Default: 3
merge_multiple_column_headers boolean
Default: true
id_hash_keys string[] | null
add_page_number boolean
Default: true
BM25RetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "BM25Retriever"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
document_store string | null
Default: null
top_k integer
Default: 10
all_terms_must_match boolean
Default: false
custom_query string | null
scale_score boolean
Default: true
BaseOutputParserComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "BaseOutputParser"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
func string required
outputs string[] required
inputs Record<string, string[] | string>
params object | null
publish_outputs boolean | string[]
Default: true
CohereRankerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "CohereRanker"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
api_key string required
model_name_or_path string required
top_k integer
Default: 10
max_chunks_per_doc integer | null
embed_meta_fields string[] | null
CrawlerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "Crawler"
params object

Each parameter can reference other components defined in the same YAML file.

11 nested properties
urls string[] | null
crawler_depth integer
Default: 1
filter_urls array | null
id_hash_keys string[] | null
extract_hidden_text
Default: true
loading_wait_time integer | null
output_dir string | string | null
overwrite_existing_files
Default: true
file_path_meta_field_name string | null
crawler_naming_function string | null
Default: null
webdriver_options string[] | null
CsvTextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "CsvTextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
progress_bar boolean
Default: true
DensePassageRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DensePassageRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

17 nested properties
document_store string | null
Default: null
query_embedding_model string | string
Default: "facebook/dpr-question_encoder-single-nq-base"
passage_embedding_model string | string
Default: "facebook/dpr-ctx_encoder-single-nq-base"
model_version string | null
max_seq_len_query integer
Default: 64
max_seq_len_passage integer
Default: 256
top_k integer
Default: 10
use_gpu boolean
Default: true
batch_size integer
Default: 16
embed_title boolean
Default: true
use_fast_tokenizers boolean
Default: true
similarity_function string
Default: "dot_product"
global_loss_buffer_size integer
Default: 150000
progress_bar boolean
Default: true
devices string[] | null
use_auth_token boolean | string | null
scale_score boolean
Default: true
DiversityRankerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DiversityRanker"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
model_name_or_path string | string
Default: "all-MiniLM-L6-v2"
top_k integer | null
use_gpu boolean | null
Default: true
devices string[] | null
similarity string
Default: "dot_product"
Values: "dot_product" "cosine"
Docs2AnswersComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "Docs2Answers"
params object

Each parameter can reference other components defined in the same YAML file.

1 nested property
progress_bar boolean
Default: true
DocumentMergerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DocumentMerger"
params object

Each parameter can reference other components defined in the same YAML file.

1 nested property
separator string
Default: " "
DocxToTextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "DocxToTextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
progress_bar boolean
Default: true
EmbeddingRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "EmbeddingRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

23 nested properties
embedding_model string required
document_store string | null
Default: null
model_version string | null
use_gpu boolean
Default: true
batch_size integer
Default: 32
max_seq_len integer
Default: 512
model_format string | null
pooling_strategy string
Default: "reduce_mean"
query_prompt string | null
passage_prompt string | null
emb_extraction_layer integer
Default: -1
top_k integer
Default: 10
progress_bar boolean
Default: true
devices string[] | null
use_auth_token boolean | string | null
scale_score boolean
Default: true
embed_meta_fields string[] | null
api_key string | null
azure_api_version string
Default: "2022-12-01"
azure_base_url string | null
azure_deployment_name string | null
api_base string
Default: "https://api.openai.com/v1"
openai_organization string | null
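
The sketch below shows how an EmbeddingRetriever might reference a document store by name through document_store. The component names and the embedding model are illustrative choices, not values from the schema. The store's embedding_dim is set to 384 on the assumption that the chosen model produces 384-dimensional vectors.

```yaml
components:
  - name: DocumentStore
    type: InMemoryDocumentStore
    params:
      embedding_dim: 384         # assumed to match the model below
  - name: Retriever
    type: EmbeddingRetriever
    params:
      document_store: DocumentStore   # reference to the component above
      embedding_model: sentence-transformers/all-MiniLM-L6-v2   # required; illustrative model
      top_k: 5
```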
EntityExtractorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "EntityExtractor"
params object

Each parameter can reference other components defined in the same YAML file.

14 nested properties
model_name_or_path string
Default: "elastic/distilbert-base-cased-finetuned-conll03-english"
model_version string | null
use_gpu boolean
Default: true
batch_size integer
Default: 16
progress_bar boolean
Default: true
use_auth_token boolean | string | null
devices string[] | null
aggregation_strategy string
Default: "first"
Values: "simple" "first" "average" "max"
add_prefix_space boolean | null
num_workers integer
Default: 0
flatten_entities_in_meta_data boolean
Default: false
max_seq_len integer | null
pre_split_text boolean
Default: false
ignore_labels string[] | null
FARMReaderComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "FARMReader"
params object

Each parameter can reference other components defined in the same YAML file.

24 nested properties
model_name_or_path string required
model_version string | null
context_window_size integer
Default: 150
batch_size integer
Default: 50
use_gpu boolean
Default: true
devices string[] | null
no_ans_boost number
Default: 0.0
return_no_answer boolean
Default: false
top_k integer
Default: 10
top_k_per_candidate integer
Default: 3
top_k_per_sample integer
Default: 1
num_processes integer | null
max_seq_len integer
Default: 256
doc_stride integer
Default: 128
progress_bar boolean
Default: true
duplicate_filtering integer
Default: 0
use_confidence_scores boolean
Default: true
confidence_threshold number | null
proxies Record<string, string>
local_files_only
Default: false
force_download
Default: false
use_auth_token boolean | string | null
max_query_length integer
Default: 64
preprocessing_batch_size integer | null
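
A possible FARMReader entry is sketched below. Only model_name_or_path is required. The model name is an illustrative choice, and the remaining values either repeat the listed defaults or show an override.

```yaml
components:
  - name: Reader                 # hypothetical component name
    type: FARMReader
    params:
      model_name_or_path: deepset/roberta-base-squad2   # required; illustrative model
      top_k: 5                   # default is 10
      return_no_answer: true     # default is false
      max_seq_len: 256
      doc_stride: 128
```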
FileTypeClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "FileTypeClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

2 nested properties
supported_types string[] | null
full_analysis boolean
Default: false
FilterRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "FilterRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
document_store string | null
Default: null
top_k integer
Default: 10
all_terms_must_match boolean
Default: false
custom_query string | null
scale_score boolean
Default: true
ImageToTextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "ImageToTextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

3 nested properties
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
JoinAnswersComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "JoinAnswers"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
join_mode string
Default: "concatenate"
weights number[] | null
top_k_join integer | null
sort_by_score boolean
Default: true
JoinDocumentsComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "JoinDocuments"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
join_mode string
Default: "concatenate"
weights number[] | null
top_k_join integer | null
sort_by_score boolean
Default: true
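
A JoinDocuments node is typically placed after two or more retrievers. A minimal sketch of its component entry, with a hypothetical name and an illustrative top_k_join value, might be:

```yaml
components:
  - name: JoinResults            # hypothetical component name
    type: JoinDocuments
    params:
      join_mode: concatenate     # the listed default
      top_k_join: 10             # illustrative cap on joined documents
      sort_by_score: true
```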
JsonConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "JsonConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
progress_bar boolean
Default: true
LangdetectDocumentLanguageClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "LangdetectDocumentLanguageClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

2 nested properties
route_by_language boolean
Default: true
languages_to_route string[] | null
LinkContentFetcherComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "LinkContentFetcher"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
content_handlers Record<string, string>
processor string | null
Default: null
raise_on_failure boolean | null
Default: false
user_agents string[] | null
retry_attempts integer | null
LostInTheMiddleRankerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "LostInTheMiddleRanker"
params object

Each parameter can reference other components defined in the same YAML file.

2 nested properties
word_count_threshold integer | null
top_k integer | null
MarkdownConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "MarkdownConverter"
params object

Each parameter can reference other components defined in the same YAML file.

7 nested properties
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
progress_bar boolean
Default: true
remove_code_snippets boolean
Default: true
extract_headlines boolean
Default: false
add_frontmatter_to_meta boolean
Default: false
MultiModalRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "MultiModalRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

14 nested properties
document_store string required
query_embedding_model string | string required
document_embedding_models Record<string, string | string> required
query_type string
Default: "text"
query_feature_extractor_params object | null
document_feature_extractors_params Record<string, object>
top_k integer
Default: 10
batch_size integer
Default: 16
embed_meta_fields string[] | null
similarity_function string
Default: "dot_product"
progress_bar boolean
Default: true
devices string[] | null
use_auth_token boolean | string | null
scale_score boolean
Default: true
MultihopEmbeddingRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "MultihopEmbeddingRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

16 nested properties
embedding_model string required
document_store string | null
Default: null
model_version string | null
num_iterations integer
Default: 2
use_gpu boolean
Default: true
batch_size integer
Default: 32
max_seq_len integer
Default: 512
model_format string
Default: "farm"
pooling_strategy string
Default: "reduce_mean"
emb_extraction_layer integer
Default: -1
top_k integer
Default: 10
progress_bar boolean
Default: true
devices string[] | null
use_auth_token boolean | string | null
scale_score boolean
Default: true
embed_meta_fields string[] | null
PDFToTextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PDFToTextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
encoding string | null
Default: "UTF-8"
keep_physical_layout boolean
Default: false
ParsrConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "ParsrConverter"
params object

Each parameter can reference other components defined in the same YAML file.

13 nested properties
parsr_url string
Default: "http://localhost:3001"
extractor string
Default: "pdfminer"
Values: "pdfminer" "pdfjs"
table_detection_mode string
Default: "lattice"
Values: "lattice" "stream"
preceding_context_len integer
Default: 3
following_context_len integer
Default: 3
remove_page_headers boolean
Default: false
remove_page_footers boolean
Default: false
remove_table_of_contents boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
add_page_number boolean
Default: true
extract_headlines boolean
Default: true
timeout number | array
Default: 10.0
PptxConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PptxConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
progress_bar boolean
Default: true
PreProcessorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PreProcessor"
params object

Each parameter can reference other components defined in the same YAML file.

15 nested properties
clean_whitespace boolean
Default: true
clean_header_footer boolean
Default: false
clean_empty_lines boolean
Default: true
remove_substrings string[] | null
split_by string | null
Default: "word"
Values: "token" "word" "sentence" "passage"
split_length integer
Default: 200
split_overlap integer
Default: 0
split_respect_sentence_boundary boolean
Default: true
tokenizer_model_folder string | string | null
tokenizer string | string | null
Default: "tiktoken"
language string
Default: "en"
id_hash_keys string[] | null
progress_bar boolean
Default: true
add_page_number boolean
Default: false
max_chars_check integer
Default: 10000
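
A PreProcessor entry that splits documents into overlapping word windows could be sketched like this. The component name and the overlap value are illustrative. The other values repeat the listed defaults.

```yaml
components:
  - name: Preprocessor           # hypothetical component name
    type: PreProcessor
    params:
      clean_whitespace: true
      clean_empty_lines: true
      split_by: word             # one of token, word, sentence, passage
      split_length: 200
      split_overlap: 20          # default is 0
      split_respect_sentence_boundary: true
```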
PromptModelComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PromptModel"
params object

Each parameter can reference other components defined in the same YAML file.

9 nested properties
model_name_or_path string
Default: "google/flan-t5-base"
max_length integer | null
Default: 100
api_key string | null
timeout number | null
use_auth_token boolean | string | null
use_gpu boolean | null
devices string[] | null
invocation_layer_class string | null
model_kwargs object | null
PromptNodeComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PromptNode"
params object

Each parameter can reference other components defined in the same YAML file.

13 nested properties
model_name_or_path string | string
Default: "google/flan-t5-base"
default_prompt_template string | string | null
output_variable string | null
max_length integer | null
Default: 100
api_key string | null
timeout number | null
use_auth_token boolean | string | null
use_gpu boolean | null
devices string[] | null
stop_words string[] | null
top_k integer
Default: 1
debug boolean | null
Default: false
model_kwargs object | null
PromptTemplateComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PromptTemplate"
params object

Each parameter can reference other components defined in the same YAML file.

2 nested properties
prompt string required
output_parser object | string | null
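A minimal sketch of a PromptTemplate entry follows. Only `prompt` is required inside `params`; the component name and the prompt wording are illustrative assumptions, and `output_parser` is left out because the schema allows it to be absent.

```yaml
components:
  - name: SummaryTemplate   # hypothetical name
    type: PromptTemplate
    params:
      # Illustrative prompt text; {documents} is an assumed template variable.
      prompt: "Summarize the following documents in one paragraph: {documents}"
```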
PseudoLabelGeneratorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "PseudoLabelGenerator"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
question_producer string | object[] required
retriever required
cross_encoder_model_name_or_path string
Default: "cross-encoder/ms-marco-MiniLM-L-6-v2"
max_questions_per_document integer
Default: 3
top_k integer
Default: 50
batch_size integer
Default: 16
progress_bar boolean
Default: true
use_auth_token boolean | string | null
use_gpu boolean
Default: true
devices string[] | null
QuestionGeneratorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "QuestionGenerator"
params object

Each parameter can reference other components defined in the same YAML file.

17 nested properties
model_name_or_path string
Default: "valhalla/t5-base-e2e-qg"
model_version string | null
num_beams integer
Default: 4
max_length integer
Default: 256
no_repeat_ngram_size integer
Default: 3
length_penalty number
Default: 1.5
early_stopping boolean
Default: true
split_length integer
Default: 50
split_overlap integer
Default: 10
use_gpu boolean
Default: true
prompt string
Default: "generate questions:"
num_queries_per_doc integer
Default: 1
sep_token string
Default: "<sep>"
batch_size integer
Default: 16
progress_bar boolean
Default: true
use_auth_token boolean | string | null
devices string[] | null
RCIReaderComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "RCIReader"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
row_model_name_or_path string
Default: "michaelrglass/albert-base-rci-wikisql-row"
column_model_name_or_path string
Default: "michaelrglass/albert-base-rci-wikisql-col"
row_model_version string | null
column_model_version string | null
row_tokenizer string | null
column_tokenizer string | null
use_gpu boolean
Default: true
top_k integer
Default: 10
max_seq_len integer
Default: 256
use_auth_token boolean | string | null
RecentnessRankerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "RecentnessRanker"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
date_meta_field string required
weight number
Default: 0.5
top_k integer | null
ranking_mode string
Default: "reciprocal_rank_fusion"
Values: "reciprocal_rank_fusion" "score"
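To make the RecentnessRanker options concrete, here is a hedged sketch. The metadata key `published_at` and the component name are assumptions; `ranking_mode` must be one of the two listed values.

```yaml
components:
  - name: Recency                   # hypothetical name
    type: RecentnessRanker
    params:
      date_meta_field: published_at # required; assumed name of a date field in document metadata
      weight: 0.5                   # schema default
      top_k: 5                      # optional
      ranking_mode: score           # alternative to the default "reciprocal_rank_fusion"
```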
RouteDocumentsComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "RouteDocuments"
params object

Each parameter can reference other components defined in the same YAML file.

3 nested properties
split_by string
Default: "content_type"
metadata_values string[] | string[][] | null
return_remaining boolean
Default: false
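The two RouteDocuments entries below sketch the difference between the default `split_by` and a metadata-based split. The component names, the `language` metadata field, and its values are hypothetical.

```yaml
components:
  - name: RouteByContentType   # hypothetical name; relies on the defaults
    type: RouteDocuments
    params:
      split_by: content_type   # schema default
  - name: RouteByLanguage      # hypothetical name
    type: RouteDocuments
    params:
      split_by: language       # assumed metadata field
      metadata_values:         # string[] form; string[][] is also allowed by the schema
        - en
        - de
      return_remaining: true   # assumption: keep documents that match neither value
```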
SentenceTransformersRankerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "SentenceTransformersRanker"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
model_name_or_path string required
model_version string | null
top_k integer
Default: 10
use_gpu boolean
Default: true
devices string[] | null
batch_size integer
Default: 16
scale_score boolean
Default: true
progress_bar boolean
Default: true
use_auth_token boolean | string | null
embed_meta_fields string[] | null
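Unlike most components in this list, SentenceTransformersRanker requires `model_name_or_path`. A minimal sketch, assuming a hypothetical component name and a cross-encoder model that appears elsewhere in this schema as a default:

```yaml
components:
  - name: Reranker             # hypothetical name
    type: SentenceTransformersRanker
    params:
      model_name_or_path: cross-encoder/ms-marco-MiniLM-L-6-v2  # required
      top_k: 10                # schema default
      batch_size: 16           # schema default
      scale_score: true        # schema default
      embed_meta_fields:       # optional; "title" is an assumed metadata key
        - title
```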
ShaperComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "Shaper"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
func string required
outputs string[] required
inputs Record<string, string[] | string>
params object | null
publish_outputs boolean | string[]
Default: true
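A Shaper entry needs both `func` and `outputs`. The sketch below uses `value_to_list`, one of the functions Haystack ships for Shaper, and maps its arguments through `inputs`; the component name and the output variable `questions` are illustrative assumptions.

```yaml
components:
  - name: QueryToList          # hypothetical name
    type: Shaper
    params:
      func: value_to_list      # required; name of the shaping function
      inputs:                  # maps function arguments to pipeline variables
        value: query
        target_list: documents
      outputs:                 # required; names for the function's results
        - questions
      publish_outputs: true    # schema default
```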
TableReaderComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TableReader"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
model_name_or_path string
Default: "google/tapas-base-finetuned-wtq"
model_version string | null
tokenizer string | null
use_gpu boolean
Default: true
top_k integer
Default: 10
top_k_per_candidate integer
Default: 3
return_no_answer boolean
Default: false
max_seq_len integer
Default: 256
use_auth_token boolean | string | null
devices string[] | null
TableTextRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TableTextRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

20 nested properties
document_store string | null
Default: null
query_embedding_model string
Default: "deepset/bert-small-mm_retrieval-question_encoder"
passage_embedding_model string
Default: "deepset/bert-small-mm_retrieval-passage_encoder"
table_embedding_model string
Default: "deepset/bert-small-mm_retrieval-table_encoder"
model_version string | null
max_seq_len_query integer
Default: 64
max_seq_len_passage integer
Default: 256
max_seq_len_table integer
Default: 256
top_k integer
Default: 10
use_gpu boolean
Default: true
batch_size integer
Default: 16
embed_meta_fields string[] | null
use_fast_tokenizers boolean
Default: true
similarity_function string
Default: "dot_product"
global_loss_buffer_size integer
Default: 150000
progress_bar boolean
Default: true
devices string[] | null
use_auth_token boolean | string | null
scale_score boolean
Default: true
use_fast boolean
Default: true
TextConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TextConverter"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
progress_bar boolean
Default: true
TfidfRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TfidfRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

3 nested properties
document_store string | null
Default: null
top_k integer
Default: 10
auto_fit
Default: true
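The `document_store` parameter is a string that names another component in the same file. The sketch below assumes a hypothetical `DocumentStore` entry of type `InMemoryDocumentStore` (the type name is inferred from the component list, not from this section).

```yaml
components:
  - name: DocumentStore           # hypothetical name
    type: InMemoryDocumentStore   # assumed type constant
  - name: Retriever               # hypothetical name
    type: TfidfRetriever
    params:
      document_store: DocumentStore # reference by component name
      top_k: 10                     # schema default
      auto_fit: true                # schema default
```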
TikaConverterComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TikaConverter"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
tika_url string
Default: "http://localhost:9998/tika"
remove_numeric_tables boolean
Default: false
valid_languages string[] | null
id_hash_keys string[] | null
timeout number | array
Default: 10.0
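A minimal TikaConverter sketch, assuming a Tika server is reachable at the default URL. The component name and the language list are illustrative; `timeout` is shown in its numeric form.

```yaml
components:
  - name: TikaFileConverter               # hypothetical name
    type: TikaConverter
    params:
      tika_url: http://localhost:9998/tika # schema default
      remove_numeric_tables: false         # schema default
      valid_languages:                     # optional; assumed language codes
        - en
      timeout: 10.0                        # schema default
```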
TopPSamplerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TopPSampler"
params object

Each parameter can reference other components defined in the same YAML file.

6 nested properties
model_name_or_path string
Default: "cross-encoder/ms-marco-MiniLM-L-6-v2"
top_p number | null
Default: 1.0
strict boolean | null
Default: false
score_field string | null
Default: "score"
use_gpu boolean | null
Default: true
devices string[] | null
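The following sketch shows a TopPSampler entry with a non-default `top_p`. The component name and the chosen threshold are assumptions; the remaining values restate the schema defaults.

```yaml
components:
  - name: Sampler              # hypothetical name
    type: TopPSampler
    params:
      model_name_or_path: cross-encoder/ms-marco-MiniLM-L-6-v2  # schema default
      top_p: 0.95              # assumption: a threshold below the default 1.0
      strict: false            # schema default
      score_field: score       # schema default
      use_gpu: true            # schema default
```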
TransformersDocumentClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersDocumentClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

12 nested properties
model_name_or_path string
Default: "bhadresh-savani/distilbert-base-uncased-emotion"
model_version string | null
tokenizer string | null
use_gpu boolean
Default: true
top_k integer | null
Default: 1
task string
Default: "text-classification"
labels string[] | null
batch_size integer
Default: 16
classification_field string | null
progress_bar boolean
Default: true
use_auth_token boolean | string | null
devices string[] | null
TransformersDocumentLanguageClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersDocumentLanguageClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

11 nested properties
route_by_language boolean
Default: true
languages_to_route string[] | null
labels_to_languages_mapping Record<string, string>
model_name_or_path string
Default: "papluca/xlm-roberta-base-language-detection"
model_version string | null
tokenizer string | null
use_gpu boolean
Default: true
batch_size integer
Default: 16
progress_bar boolean
Default: true
use_auth_token boolean | string | null
devices string[] | null
TransformersImageToTextComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersImageToText"
params object

Each parameter can reference other components defined in the same YAML file.

8 nested properties
model_name_or_path string
Default: "Salesforce/blip-image-captioning-base"
model_version string | null
generation_kwargs object | null
use_gpu boolean
Default: true
batch_size integer
Default: 16
progress_bar boolean
Default: true
use_auth_token boolean | string | null
devices string[] | null
TransformersQueryClassifierComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersQueryClassifier"
params object

Each parameter can reference other components defined in the same YAML file.

10 nested properties
model_name_or_path string
Default: "shahrukhx01/bert-mini-finetune-question-detection"
model_version string | null
tokenizer string | null
use_gpu boolean
Default: true
task string
Default: "text-classification"
labels string[] | null
batch_size integer
Default: 16
progress_bar boolean
Default: true
use_auth_token boolean | string | null
devices string[] | null
TransformersReaderComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersReader"
params object

Each parameter can reference other components defined in the same YAML file.

13 nested properties
model_name_or_path string
Default: "distilbert-base-uncased-distilled-squad"
model_version string | null
tokenizer string | null
context_window_size integer
Default: 70
use_gpu boolean
Default: true
top_k integer
Default: 10
top_k_per_candidate integer
Default: 3
return_no_answers boolean
Default: false
max_seq_len integer
Default: 256
doc_stride integer
Default: 128
batch_size integer
Default: 16
use_auth_token boolean | string | null
devices string[] | null
TransformersSummarizerComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersSummarizer"
params object

Each parameter can reference other components defined in the same YAML file.

11 nested properties
model_name_or_path string
Default: "google/pegasus-xsum"
model_version string | null
tokenizer string | null
max_length integer
Default: 200
min_length integer
Default: 5
use_gpu boolean
Default: true
clean_up_tokenization_spaces boolean
Default: true
batch_size integer
Default: 16
progress_bar boolean
Default: true
use_auth_token boolean | string | null
devices string[] | null
TransformersTranslatorComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "TransformersTranslator"
params object

Each parameter can reference other components defined in the same YAML file.

8 nested properties
model_name_or_path string required
tokenizer_name string | null
max_seq_len integer | null
clean_up_tokenization_spaces boolean | null
Default: true
use_gpu boolean
Default: true
progress_bar boolean
Default: true
use_auth_token boolean | string | null
devices string[] | null
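TransformersTranslator has no default model, so `model_name_or_path` must be given. The sketch below assumes a hypothetical component name and uses `Helsinki-NLP/opus-mt-en-de` purely as an example of a translation model identifier.

```yaml
components:
  - name: EnToDeTranslator     # hypothetical name
    type: TransformersTranslator
    params:
      model_name_or_path: Helsinki-NLP/opus-mt-en-de  # required; example model, not mandated by the schema
      clean_up_tokenization_spaces: true              # schema default
      use_gpu: true                                   # schema default
      progress_bar: true                              # schema default
```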
WebRetrieverComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "WebRetriever"
params object

Each parameter can reference other components defined in the same YAML file.

13 nested properties
api_key string required
search_engine_provider string
Default: "SerperDev"
search_engine_kwargs object | null
top_search_results integer | null
Default: 10
top_k integer | null
Default: 5
mode string
Default: "snippets"
Values: "snippets" "raw_documents" "preprocessed_documents"
preprocessor string | null
Default: null
cache_document_store string | null
Default: null
cache_index string | null
cache_headers Record<string, string>
cache_time integer
Default: 86400
allowed_domains string[] | null
link_content_fetcher string | null
Default: null
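A WebRetriever entry requires `api_key` and accepts one of three `mode` values. In this sketch the component name, the key placeholder, and the domain list are assumptions.

```yaml
components:
  - name: WebSnippets                        # hypothetical name
    type: WebRetriever
    params:
      api_key: YOUR_SEARCH_API_KEY           # required; placeholder, supply your own key
      search_engine_provider: SerperDev      # schema default
      top_search_results: 10                 # schema default
      top_k: 5                               # schema default
      mode: snippets                         # or "raw_documents" / "preprocessed_documents"
      cache_time: 86400                      # schema default
      allowed_domains:                       # optional; assumed example domain
        - haystack.deepset.ai
```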
WebSearchComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "WebSearch"
params object

Each parameter can reference other components defined in the same YAML file.

5 nested properties
api_key string required
top_k integer | null
Default: 10
allowed_domains string[] | null
search_engine_provider string
Default: "SerperDev"
search_engine_kwargs object | null
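WebSearch takes a smaller set of parameters than WebRetriever. A minimal sketch, with a hypothetical component name and a placeholder key:

```yaml
components:
  - name: Search                        # hypothetical name
    type: WebSearch
    params:
      api_key: YOUR_SEARCH_API_KEY      # required; placeholder, supply your own key
      top_k: 10                         # schema default
      search_engine_provider: SerperDev # schema default
```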
WhisperTranscriberComponent object
name string required

Custom name for the component. Helpful for visualization and debugging.

type string required

Haystack Class name for the component.

Constant: "WhisperTranscriber"
params object

Each parameter can reference other components defined in the same YAML file.

4 nested properties
api_key string | null
model_name_or_path string
Default: "medium"
Values: "tiny" "small" "medium" "large" "large-v2"
device string | null
api_base string
Default: "https://api.openai.com/v1"
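For WhisperTranscriber, `model_name_or_path` is restricted to the listed values. The sketch below assumes a hypothetical component name and a CPU device string; `api_key` is omitted because the schema allows it to be null.

```yaml
components:
  - name: Transcriber                   # hypothetical name
    type: WhisperTranscriber
    params:
      model_name_or_path: small         # one of: tiny, small, medium, large, large-v2
      device: cpu                       # assumption: device identifier for local inference
      api_base: https://api.openai.com/v1 # schema default
```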