You will be taken to the RAG Tool details screen, where you can either keep the default settings or continue configuring your tool to tailor it to your needs.
The RAG Tool details screen is split into a header and five tabs:
Header:
- Name
- Description
- System name. System name is used to identify the RAG Tool, e.g. when embedding it into an AI Panel, so it is strongly recommended not to change it after the RAG Tool has been created.
Retrieve tab:
Settings under this tab control how content from defined Knowledge Sources is retrieved and ranked so that the LLM can ground its answers in accurate and relevant content.
- Semantic cache (Upcoming feature): Improve RAG performance and quality by caching successful LLM responses.
- Knowledge Sources: You can connect one or multiple Knowledge Sources that the RAG can use to generate answers to user queries.
Important: it is not recommended to connect Knowledge Sources in multiple languages to one RAG Tool.
- Retrieval parameters:
⟩ The similarity score threshold is the minimum score (0 to 1) needed for a content chunk to be retrieved and sent to the LLM. The higher the value, the stricter the selection criteria. The default is 0.75.
⟩ The number of chunks to select defines how many content chunks matching the similarity score threshold will be sent to the LLM for answer generation. For example, if this value is 5, only the 5 top-scored chunks will be sent to the LLM.
⟩ Context window expansion size indicates how many neighbouring chunks will be appended before and after each selected chunk. This ensures that the LLM does not miss relevant content that did not fit into the selected chunks.
Retrieval parameters are closely related to the chunk size set on the Knowledge Source: for example, it might be necessary to retrieve more chunks if they are smaller.
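How the three retrieval parameters interact can be sketched as follows. This is a minimal illustration, not Magnet AI's actual implementation: the function name `select_chunks`, its parameter names, the input format (one similarity score per chunk, in document order), and the inclusive threshold comparison are all assumptions made for the example.

```python
def select_chunks(scores, threshold=0.75, top_k=5, window=1):
    """Return the indices of chunks to send to the LLM, in document order.

    scores    -- similarity score (0 to 1) per chunk, in document order (hypothetical input)
    threshold -- similarity score threshold (assumed inclusive)
    top_k     -- number of chunks to select
    window    -- context window expansion size (neighbours added on each side)
    """
    # Keep only chunks whose score meets the threshold.
    candidates = [i for i, s in enumerate(scores) if s >= threshold]
    # Of those, keep the top_k highest-scoring chunks.
    selected = sorted(candidates, key=lambda i: scores[i], reverse=True)[:top_k]
    # Add neighbouring chunks around each selected chunk, clamped to valid indices.
    expanded = set()
    for i in selected:
        first = max(0, i - window)
        last = min(len(scores) - 1, i + window)
        expanded.update(range(first, last + 1))
    return sorted(expanded)


scores = [0.2, 0.8, 0.5, 0.9, 0.76, 0.1, 0.95]
print(select_chunks(scores, threshold=0.75, top_k=2, window=1))
```

In this example four chunks pass the threshold, the two best (indices 6 and 3) are selected, and expansion pulls in their neighbours, so chunks 2 to 6 are passed on. Raising the threshold or lowering the number of chunks narrows the context; a larger expansion size widens it.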
Generate tab:
Settings under this tab control the way the LLM generates its answers, ensuring it follows the necessary tone, length, and format of response.
- Prompt template. Controls how the LLM generates an answer to the user query. It is recommended not to edit the default Prompt Template directly. Instead, go to the default Prompt Template, clone it, make the necessary changes, and select the new Prompt Template under the Generate tab in your RAG Tool settings.
Language Tab:
With settings under this tab, you can track in what languages Magnet AI users communicate with the RAG, as well as optimize its performance for multi-lingual use cases.
- Detect question language - if this setting is enabled, the LLM detects the language of user questions. Logs are sent to Azure Application Insights. Language detection can be enabled as an independent feature for monitoring purposes, or as a required part of the Multi-lingual RAG feature.
- Enable multi-lingual RAG - this feature significantly improves RAG performance for multi-lingual use cases, where users can ask questions in languages other than that of the Knowledge Source content. To use this feature, the following settings are required:
⟩ Detect question language must be on
⟩ Enable multi-lingual RAG must be on
⟩ RAG Tool source language: the language of your Knowledge Source content. English is selected by default.
⟩ Translation Prompt Template: the Prompt Template that translates the user’s question into the language of the content and translates the generated answer back into the user’s language.
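The multi-lingual flow described above can be sketched as follows. This is only an illustration under stated assumptions: `answer_multilingual` and the three callables it receives (`detect_language`, `translate`, `run_rag`) are hypothetical placeholders, not Magnet AI functions, and skipping translation when the question is already in the source language is an assumption of this sketch.

```python
def answer_multilingual(question, source_language, detect_language, translate, run_rag):
    """Answer a question that may be in a different language than the content.

    All three callables are hypothetical placeholders supplied by the caller:
    detect_language(text) -> language code
    translate(text, target) -> text translated into the target language
    run_rag(question) -> answer generated in the source language
    """
    # Step 1: detect the language of the user's question.
    user_language = detect_language(question)
    # Assumption: no translation is needed if the languages already match.
    if user_language == source_language:
        return run_rag(question)
    # Step 2: translate the question into the language of the content.
    translated_question = translate(question, target=source_language)
    # Step 3: retrieve and generate in the source language.
    answer = run_rag(translated_question)
    # Step 4: translate the answer back into the user's language.
    return translate(answer, target=user_language)
```

Because retrieval always runs in the source language, the similarity scores are computed between a question and content written in the same language, which is why language detection has to be on for this feature to work.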
Post-process Tab:
Settings in this tab do not modify the LLM response, but rather serve for monitoring purposes by analyzing generated output.
- Answered/not answered check - This check can be enabled for monitoring purposes. It processes the LLM output and logs whether the user’s question was answered.
- Check for hallucinations - makes the LLM check its own response, compare it with retrieved content and report cases of hallucinated output. Please note that it does not rewrite or improve the answer. Currently, this feature is in beta.
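The monitoring-only nature of these checks can be sketched as follows. This is a hedged illustration, not the product's implementation: the `post_process` function, the logger name, and the check interface (a callable taking the question, the answer, and the retrieved chunks) are assumptions made for the example.

```python
import logging

logger = logging.getLogger("rag.postprocess")  # hypothetical logger name


def post_process(question, answer, retrieved_chunks, checks):
    """Run monitoring checks, log their results, and return the answer unchanged.

    checks -- mapping of check name to callable(question, answer, retrieved_chunks)
              returning a result to log (hypothetical interface)
    """
    results = {}
    for name, check in checks.items():
        results[name] = check(question, answer, retrieved_chunks)
        logger.info("post-process check %s: %s", name, results[name])
    # The answer is passed through untouched; checks only observe it.
    return answer, results
```

The key design point is that a check result never feeds back into the answer: a flagged hallucination is reported, not corrected.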
UI Settings Tab:
In this tab, you can control the look and feel of your RAG, e.g. add a header or help first-time users with some sample questions.
- Header configuration - Configure the header and/or subheader that you want to display for your RAG tool. Leave blank to not display any header or subheader.
- User feedback - if enabled, users can like or dislike RAG answers, and their feedback is logged. This feature is not fully supported yet.
- Sample questions - help first-time users formulate their questions by suggesting up to three sample questions.
- Show link titles (Upcoming feature) - Control how links to found documents are displayed.
- Offer to bypass cache (Upcoming feature) - Gives users the option to bypass the semantic cache, if caching is enabled, and run a new search across all documents.