Allowing configuration of the ingestion and retrieval pipelines via a yaml file #3214

jacopo-chevallard · 2024-09-17T08:08:03Z

We want to enable users to configure the ingestion and retrieval pipelines via a YAML file. Users will be able to choose which steps each pipeline includes, as well as the configuration of each step.

Below is an example of a YAML configuration file we are testing:

ingestion_config:
  parser_config:
    megaparse_config:
      strategy: "fast"
      pdf_parser: "unstructured"
    splitter_config:
      chunk_size: 400
      chunk_overlap: 100

retrieval_config:
  workflow_config:
    name: "standard RAG"
    nodes:
      - name: "filter_history"
        edges: ["rewrite"]

      - name: "rewrite"
        edges: ["retrieve"]

      - name: "retrieve"
        edges: ["generate"]

      - name: "generate"
        edges: ["END"]

  # Maximum number of previous conversation iterations
  # to include in the context of the answer
  max_history: 10

  prompt: "my prompt"

  max_files: 20

  reranker_config:
    # The reranker supplier to use
    supplier: "cohere"

    # The model to use for the reranker for the given supplier
    model: "rerank-multilingual-v3.0"

    # Number of chunks returned by the reranker
    top_n: 5

  llm_config:
    # The LLM supplier to use
    supplier: "openai"

    # The model to use for the LLM for the given supplier
    model: "gpt-3.5-turbo-0125"

    max_input_tokens: 2000

    # Maximum number of tokens to pass to the LLM
    # as a context to generate the answer
    max_output_tokens: 2000

    temperature: 0.7
    streaming: true
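To illustrate how the `nodes` / `edges` part of the workflow could be consumed once the file is parsed, here is a minimal sketch in Python. It is not the project's actual implementation: the function name `validate_workflow`, the `END` constant, and the assumption that the first declared node is the entry point are all hypothetical, and the dictionary simply mirrors what a YAML loader would return for the `workflow_config` section above.

```python
from typing import Any

END = "END"  # terminal marker, as used in the example config's edges


def validate_workflow(workflow: dict[str, Any]) -> list[str]:
    """Check that every edge targets a declared node or END, then return
    the node names in the order they are reached from the first node."""
    nodes = workflow["nodes"]
    edges_by_name = {node["name"]: node["edges"] for node in nodes}

    for name, edges in edges_by_name.items():
        for target in edges:
            if target != END and target not in edges_by_name:
                raise ValueError(
                    f"node {name!r} points to unknown node {target!r}"
                )

    # Walk the graph depth-first from the first declared node
    # (assumed here to be the entry point of the workflow).
    order: list[str] = []
    stack = [nodes[0]["name"]]
    while stack:
        current = stack.pop()
        if current == END or current in order:
            continue
        order.append(current)
        stack.extend(reversed(edges_by_name[current]))
    return order


# The workflow_config section, as it would look after YAML parsing.
workflow_config = {
    "name": "standard RAG",
    "nodes": [
        {"name": "filter_history", "edges": ["rewrite"]},
        {"name": "rewrite", "edges": ["retrieve"]},
        {"name": "retrieve", "edges": ["generate"]},
        {"name": "generate", "edges": ["END"]},
    ],
}

print(validate_workflow(workflow_config))
# ['filter_history', 'rewrite', 'retrieve', 'generate']
```

A check like this would catch a misspelled edge target at load time rather than partway through a retrieval run.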


linear · 2024-09-17T08:08:04Z

CORE-204 Allowing configuration of the ingestion and retrieval pipelines via a yaml file

dosubot bot added the "area: backend" label (Related to backend functionality or under the /backend directory) Sep 17, 2024

jacopo-chevallard self-assigned this Sep 19, 2024

jacopo-chevallard closed this as completed Sep 23, 2024
