PeterStuer 15 hours ago

Your question is very general. "A customer support app" can mean many things from a faq to a case management interface.

If you 100% can not tolerate "bad" answers, only use the LLM in the front end to map the user's input onto a set of templated questions with templated answers. In the worst case, the user gets a right answer to the wrong question.

jdlshore 15 hours ago

You can’t (practically) unit test LLM responses, at least not in the traditional sense. Instead, you do runtime validation with a technique called “LLM as judge.”

This involves having another prompt, and possibly another model, evaluate the quality of the first response. Then you write your code to try again in a loop and raise an alert if it keeps failing.

mfalcon 15 hours ago

You don't. You have to separate concers between deterministic and stochastic code input/output. You need evals for the stochastic and mocking when the stochastic output is consumed in the deterministic code.