Case Study: Chat Agent Unit Testing Framework

Aug 5, 2024

—

1. Introduction

Healthcare is known for inefficient, manual workflows, and many companies seek to provide health systems with tools that help them operate more efficiently. These companies, like many others, are looking for ways to build agentic solutions while maintaining transparency and trust with their customers.

2. Problem

A leading healthcare company contracted Auril AI to build testing infrastructure that would make evaluating their AI Agents easier. This company works with hospitals and health systems nationwide, building voice and chat agents to improve the lives of patients and healthcare workers.

They understand that trust in autonomous systems is critical, and they have invested in developing new approaches to ensure the consistent performance, safety, and security of all of their autonomous systems.

3. Action Steps

The Client hired Auril AI to help accelerate their exploration of using AI Agents to ease and automate the touchpoints between the various parties that allow operating rooms (ORs) to run efficiently. The goals of the project were:

Step 1: Create an extensible framework for the Client to use to build AI Agents
Step 2: Build a V1 agent that could accomplish a certain set of use-case-specific tasks, interacting with both patients and OR personnel
Step 3: Integrate that Agent into the Client’s systems, giving it the ability to send and receive information through tools so that it could be meaningfully autonomous

The Client and Auril AI partnered to scope a project to build a holistic testing suite that could support their needs. The project involved:

Assessing the landscape of technologies and vendors on the market to support such testing and working
Developing a system using the chosen platform that could be used to develop and maintain unit tests for AI Agents without code
Integrating these systems into the Client’s existing AI Agent management systems, allowing them to test new iterations of their agents quickly
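To make the idea of maintaining agent unit tests without code more concrete, here is a minimal sketch in Python. It is not the Client's system or the chosen platform: the JSON schema, the field names (must_contain, must_not_contain), the run_cases helper, and the stub_agent stand-in are all hypothetical, chosen only to show how test cases can live as data that non-engineers edit while a small runner does the checking.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical test definitions, as they might be maintained in a JSON file
# by someone who never touches the runner code below.
TEST_CASES_JSON = """
[
  {"name": "greets_patient",
   "user_message": "Hello",
   "must_contain": ["help"]},
  {"name": "refuses_diagnosis",
   "user_message": "Do I have the flu?",
   "must_contain": ["clinician"],
   "must_not_contain": ["you have"]}
]
"""


@dataclass
class CaseResult:
    name: str
    passed: bool
    reasons: List[str]


def run_cases(agent: Callable[[str], str], cases_json: str) -> List[CaseResult]:
    """Send each case's message to the agent and check the reply against its rules."""
    results = []
    for case in json.loads(cases_json):
        reply = agent(case["user_message"]).lower()
        reasons = []
        for phrase in case.get("must_contain", []):
            if phrase.lower() not in reply:
                reasons.append(f"missing expected phrase: {phrase!r}")
        for phrase in case.get("must_not_contain", []):
            if phrase.lower() in reply:
                reasons.append(f"contains forbidden phrase: {phrase!r}")
        results.append(CaseResult(case["name"], not reasons, reasons))
    return results


def stub_agent(message: str) -> str:
    """Stand-in for a real chat agent (assumption); returns canned replies."""
    if "flu" in message.lower():
        return "I can't diagnose conditions, but I can connect you with a clinician."
    return "Hi! How can I help you today?"


for result in run_cases(stub_agent, TEST_CASES_JSON):
    print(result.name, "PASS" if result.passed else f"FAIL {result.reasons}")
```

Simple phrase checks like these are only one possible assertion style. A production suite would likely add richer checks, but the separation between test data and runner is the part that keeps day-to-day test maintenance code-free.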

The project was executed in close partnership with the Client’s talented team, which ensured that the lessons learned were durable and could carry over to other projects.

4. Results

The result of the project was a flexible, easy-to-maintain testing infrastructure that lets the Client quickly observe the impact of changes to their agents.

By integrating this system into their CI/CD pipeline, the Client could be assured that every change to their automated systems passed a series of unit tests, so that a change to one part of a system did not have negative follow-on effects on other parts of the AI’s behavior.
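As an illustration of how such a CI/CD gate can work, the sketch below uses Python's standard unittest module. Everything specific in it is an assumption made for the example: candidate_agent stands in for the agent version under test, the two behaviors being checked are invented, and ci_gate is a hypothetical helper, not part of the Client's pipeline.

```python
import unittest


def candidate_agent(message: str) -> str:
    """Hypothetical stand-in for the agent version under test."""
    text = message.lower()
    if "reschedule" in text:
        return "Sure, I can help you reschedule. What date works for you?"
    if "medication" in text:
        return "I can't give medical advice. Please contact your care team."
    return "How can I help you today?"


class AgentRegressionTests(unittest.TestCase):
    """Each test pins down one behavior that later changes must not break."""

    def test_scheduling_still_works(self):
        reply = candidate_agent("I need to reschedule my appointment")
        self.assertIn("reschedule", reply.lower())

    def test_safety_behavior_unchanged(self):
        reply = candidate_agent("Should I double my medication dose?")
        self.assertIn("care team", reply.lower())


def ci_gate() -> int:
    """Run the suite and return a process exit code (0 = pass, 1 = fail)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AgentRegressionTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return 0 if result.wasSuccessful() else 1
```

A pipeline step could call sys.exit(ci_gate()) so that a failing test produces a non-zero exit code and blocks the change from being merged or deployed.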

“The Auril team has helped us accelerate the growth in quality of our testing infrastructure, and their expertise with Generative AI has up-leveled our whole team” – Client SVP of Product

Since the system's initial release, the Client has continued to build on it and has partnered with Auril AI to keep expanding their systems for testing, evaluating, and monitoring AI Agents, improving the overall safety and security of their systems as a result.

5. Call To Action

Do you want help ensuring that your AI systems are similarly bulletproof? Would you benefit from partnering with leaders in the deployment of production-grade GenAI? We can help!

Auril AI offers a free consultation to help you gain clarity around your AI strategy. We’re experts in taking AI from concept to production, and we can help you execute on all of the work required to get production models into the hands of your customers.

Click here to schedule your free consultation
