GPT Evaluation
On this page you will learn how to evaluate the performance of your chat agent using GPT Evaluation in Ebbot.
The GPT Evaluation feature lets you test your chat agent by giving it input and running it with a specific EbbotGPT Config. Compare the results across runs to identify which settings, for example in the EbbotGPT Configuration, are most suitable for your use case.
When to use GPT Evaluation
GPT Evaluation is useful when you are trying out a new LLM or have made other changes to the EbbotGPT Configuration or Knowledge. Below are a few use cases where GPT Evaluation is effective.
Testing a new LLM - you want to switch to a newly released GPT model and ensure your existing configuration settings still work as expected.
Changing the persona - you’ve just made changes to the persona to solve a specific issue, and now you want to ensure those changes haven’t negatively affected the output for other questions.
New sources - you’ve added or updated sources in EbbotGPT Knowledge and want to see how those changes affect the chat agent's output.
Evaluation set
An evaluation set is a collection of questions you want to test. You can add questions manually or import them from a CSV file.

Question: The message you want to test with the LLM.
Expected answer: The answer you're expecting from the LLM. This field is optional and can be left empty.
Note! The expected answer is currently not automatically compared to the actual output. Use it as a reference to remind yourself of what answer you expect from the model.
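If you keep your test questions outside Ebbot, a small script can generate the CSV file for import. The sketch below is only an illustration: the column names `question` and `expected_answer` are assumptions, not the documented import format, so check the header the importer expects before using it.

```python
import csv
import io

# Hypothetical column names; verify the header expected by the CSV import.
FIELDNAMES = ["question", "expected_answer"]


def build_evaluation_csv(rows):
    """Return CSV text for (question, expected_answer) pairs.

    The expected answer is optional and may be None or empty.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for question, expected in rows:
        question = question.strip()
        if not question:
            raise ValueError("every row needs a question")
        writer.writerow(
            {"question": question, "expected_answer": (expected or "").strip()}
        )
    return buffer.getvalue()


csv_text = build_evaluation_csv([
    ("What are your opening hours?", "Weekdays 9-17"),
    ("How do I reset my password?", None),  # expected answer left empty
])
```

Writing `csv_text` to a file gives you an evaluation set you can keep under version control and re-import whenever the questions change.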
Runs
A run is the process of testing questions against the LLM using your defined settings. You can compare different runs to see how changes, such as using a different model or persona, affect the answers.
Runs may take some time to complete, depending on the time of day they are started. They are processed with lower priority and placed in a queue to avoid impacting the speed and performance of the LLMs.
Export a run
You can export a run as a CSV file from the Settings view or from the Run view. In the Settings view you can choose to export only selected questions or all questions.
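Exported runs can also be compared outside Ebbot. The following is a minimal sketch that lists the questions whose answers differ between two exported runs. The column names `question` and `answer` are assumptions about the export header, not the documented format, so adjust them to match your actual files.

```python
import csv
import io

# Hypothetical column names; adjust to the header of the actual export.
QUESTION_COL = "question"
ANSWER_COL = "answer"


def load_answers(csv_text):
    """Map each question to its answer from one exported run."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[QUESTION_COL]: row[ANSWER_COL] for row in reader}


def changed_answers(baseline_csv, candidate_csv):
    """Return (question, baseline answer, candidate answer) for every
    question present in both runs whose answers differ."""
    baseline = load_answers(baseline_csv)
    candidate = load_answers(candidate_csv)
    return [
        (q, baseline[q], candidate[q])
        for q in baseline
        if q in candidate and baseline[q].strip() != candidate[q].strip()
    ]


# Inline sample data standing in for two exported files.
run_a = "question,answer\nOpening hours?,Weekdays 9-17\nReturn policy?,30 days\n"
run_b = "question,answer\nOpening hours?,Weekdays 9-17\nReturn policy?,14 days\n"
diffs = changed_answers(run_a, run_b)
```

Because LLM output often varies in wording between runs, an exact string comparison like this will flag many harmless differences. Treat it as a first pass that narrows down which answers to review manually.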


