There are multiple ways to run an Evaluation Job.
The basic way is to navigate to Evaluation > Evaluation Jobs, where the list of jobs is stored.
There are also shortcuts to Evaluation on the RAG Tool and Prompt Template details screens. Let’s explore all options.
Run an Evaluation Job from the Job List
Navigate to Evaluation > Evaluation Jobs and click New.
A pop-up will appear. Fill in the Evaluation Job fields:
- Test Set. You must select one Test Set that you will be using to evaluate your AI tools.
- Evaluated RAGs or Evaluated Prompt Templates. Which of these fields appears depends on the type of the selected Test Set. For example, if you have selected a Test Set of type RAG, you will be able to select only RAG Tools, not Prompt Templates. Select one or multiple tools that you want to evaluate; selecting multiple tools lets you compare their performance.
- Number of iterations. 1 by default, max 5. This defines the number of times each evaluation input will be sent to the LLM. This setting can be used to evaluate the consistency of generated output as the same input is sent to the LLM multiple times.
Click Save&Run when you are ready to launch the Evaluation Job.
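The field rules above can be summarized in a short Python sketch. This is only an illustration of the constraints described in this guide: the class, field, and method names are hypothetical, and the guide does not describe any programmatic API for creating Evaluation Jobs.

```python
from dataclasses import dataclass

# Hypothetical model of the pop-up form; all names here are illustrative
# assumptions, not part of the product.
MAX_ITERATIONS = 5  # upper limit stated in this guide


@dataclass
class EvaluationJobForm:
    test_set_type: str       # "RAG" or "Prompt Template"
    tool_types: list[str]    # type of each tool selected for evaluation
    iterations: int = 1      # default stated in this guide

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the form is valid."""
        problems = []
        if not self.tool_types:
            problems.append("select at least one tool")
        # Every evaluated tool must be of the same type as the Test Set.
        if any(t != self.test_set_type for t in self.tool_types):
            problems.append("tool type must match the Test Set type")
        if not 1 <= self.iterations <= MAX_ITERATIONS:
            problems.append("iterations must be between 1 and 5")
        return problems
```

For instance, a form with a RAG Test Set, two RAG Tools, and the default iteration count passes all three checks, while adding a Prompt Template to the same selection would not.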
Run an Evaluation Job from a RAG Tool or a Prompt Template
There is a quick way to launch an Evaluation Job directly from a RAG Tool or a Prompt Template. On the details screen, locate the Evaluate button in the top right corner of the Preview area.
- Click the button. A pop-up with new Evaluation Job details will appear.
- Select the Test Set from the dropdown. If you are launching an Evaluation Job from a RAG Tool, the dropdown lists only Test Sets of type RAG; if you are on a Prompt Template details screen, it lists only Test Sets of type Prompt Template.
The current RAG Tool or Prompt Template is selected by default, but you can select more items of the same type if you want to compare multiple tools.
Number of iterations is 1 by default, but can be increased up to 5 if you want to send each test input multiple times.
- Click Save&Run when you are ready.
- A pop-up with confirmation will open.
Use the View Evaluation Job button to navigate to the job list and view the Job status.