Reliability Generative AI @C3 AI

Role & Duration

Lead Product Designer

6 Weeks

Highlights & Stage

Generative AI

Visionary Design

Will ship in March 2025

Team Structure

Leadership: C-Suite Reviews

Peers: 1 Product Designer, 2 PM, 3 DS, 3 SME, 1 Eng Lead, 10+ Eng, 2 Partner Teams

Redesigned Gen AI capabilities to empower reliability engineers with domain-specific insights, guided workflows, and actionable recommendations, transforming alert investigation processes with enhanced trust and efficiency.

Background

Is Gen AI Chat solving all the problems?

Generative AI is changing industries, but general-purpose tools like Gemini, Claude AI, and ChatGPT often don’t meet professional needs for scalability, security, accuracy, and tailored experiences. We set out to close these gaps with Generative AI built for our users' domain, with the goal of improving client satisfaction and growing revenue.

In early 2023, we launched the first version of a Generative AI co-pilot in the Reliability platform to help users answer unresolved questions when they investigate alerts. However, it faced several key challenges:

Response Latency

Slow responses caused by performance issues led to user frustration and abandonment.

Unclear Scope

Users were unsure what questions the AI could answer.

Limited Trust

Users were skeptical of the AI's accuracy, especially in critical scenarios such as alert investigation.

Workflow Integration Gaps

The feature didn’t align with engineers' existing workflows.

Reevaluating the Alert Detail Page

To address these challenges, we decided to reevaluate the Alert Detail Page, a key touchpoint for reliability engineers. Through this process, we aimed to understand their pain points and explore how Generative AI could be integrated more effectively to empower users.

By analyzing the current experience and talking to existing customers, we uncovered critical insights:

Generic Page Layout for All Alerts

The same layout is used for all alerts, forcing users to spend extra time analyzing features and charts to determine relevant actions, which reduces efficiency.

Unclear Information Hierarchy

Key elements like risk scores, contributing features, and failure modes lack logical grouping or emphasis, making navigation and interpretation challenging.

Limited Contextual Guidance

Complex charts and data are presented without explanations or actionable insights, leaving users to interpret technical information independently.

Information Overload

The page displays excessive, unprioritized data, overwhelming users and making it difficult to focus on critical insights.

Design Strategy

Starting with the North Star:

A Vision for Transformative Alert Investigation

When we began rethinking the integration of Generative AI, our ambition was clear: to create a system that would fundamentally transform how reliability engineers investigate alerts. The North Star vision aimed to leverage Generative AI Agents to kickstart a guided, full-page alert investigation flow—a seamless, end-to-end experience where engineers could focus on resolving critical issues with clarity and confidence.

North Star Design Principles

Contextual Relevance

Deliver only what matters. Surface the most relevant data and insights based on the specific alert context to reduce information overload and help users focus on actionable items.

Guided Empowerment

Lead with guidance, empower with flexibility. Provide clear, step-by-step workflows to help users navigate complex investigations while enabling them to refine and adjust as needed.

Clear Actionability

Insights into actions. Translate AI-generated insights into clear, actionable recommendations, ensuring users know exactly what steps to take next.

Trust Through Transparency

Show the "why" behind AI decisions. Explain AI-generated insights with confidence scores, logic, and supporting data to build user trust, especially in high-stakes scenarios.
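As a rough illustration of the transparency principle, an AI-generated insight could carry its confidence score, reasoning, and supporting data alongside the conclusion, so the interface can always show the "why". The sketch below is hypothetical: the `ExplainedInsight` class, its field names, and the sample values are assumptions made for this example, not the product's actual data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExplainedInsight:
    """Hypothetical container pairing an AI conclusion with its justification."""
    summary: str        # the conclusion shown to the engineer
    confidence: float   # assumed to be a value between 0.0 and 1.0
    reasoning: str      # plain-language logic behind the conclusion
    evidence: List[str] = field(default_factory=list)  # supporting data points

    def render(self) -> str:
        """Format the insight so the explanation is always visible with it."""
        lines = [
            f"{self.summary} (confidence: {self.confidence:.0%})",
            f"Why: {self.reasoning}",
        ]
        lines += [f"  - {item}" for item in self.evidence]
        return "\n".join(lines)


# Made-up sample values, for illustration only.
insight = ExplainedInsight(
    summary="Likely failure mode: bearing wear",
    confidence=0.82,
    reasoning="Vibration rose steadily while load stayed constant.",
    evidence=["Vibration trend over the last 7 days", "2 similar past alerts"],
)
print(insight.render())
```

Keeping the explanation in the same object as the conclusion means no insight can be displayed without its supporting context.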

Laying the Groundwork

With the North Star vision defined, we focused on designing an experience that met users where they were but guided them to where they needed to go. Every aspect of the design was informed by key principles: Contextual Relevance, Clear Actionability, Trust Through Transparency, and Guided Empowerment.

Imagining the Future

The blueprint became more than a design goal—it was a transformation roadmap. By starting with a bold vision, we ensured that every iteration of the product moved closer to delivering the ultimate user experience: a platform where Generative AI Agents don’t just assist but become an integral part of the alert investigation process, empowering reliability engineers to make decisions with speed, clarity, and confidence.

Transitioning from Vision to MVP

Prioritizing Impact and Feasibility

While our North Star vision aimed to redefine alert investigation with a fully guided workflow powered by Generative AI Agents, we recognized the need to scale back for our initial implementation. Due to time constraints and the ongoing research required to perfect AI agent capabilities, we shifted our focus to prioritize features that could deliver immediate value while setting the foundation for future expansion.

Focusing on What Matters Most: Gen AI Summary

Through user research and evaluation, it became clear that the Gen AI summary was the single most impactful feature for reliability engineers. Users consistently highlighted their need to quickly understand the most important information—including risk state, potential root causes, and actionable next steps—without sifting through overwhelming data. By prioritizing this feature, we could address the critical pain points of information overload and unclear data hierarchy, delivering a tangible improvement in their workflow.

Simplifying Interactions with Natural Language Filters

In addition to the summary, we decided to simplify the filtering experience with natural language inputs. Engineers often found the existing filter panel cumbersome and time-consuming, particularly when investigating patterns in historical data. By enabling users to create filters and visualizations through intuitive, conversational prompts (e.g., “Show risk scores above 80 for the past week”), we could reduce the barrier to entry and improve usability.
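To make the idea concrete, the sketch below shows how a conversational prompt like the one above might map onto a structured filter. It is a deliberately simple, rule-based stand-in: the shipped feature relies on Generative AI rather than a regular expression, and the `AlertFilter` type, the `parse_filter` function, and the supported phrasing are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class AlertFilter:
    """Hypothetical structured filter produced from a natural language prompt."""
    field: str          # e.g. "risk_scores"
    operator: str       # ">" or "<"
    threshold: float
    window: timedelta   # how far back to look


# Only handles prompts shaped like "Show <field> above|below <number> for the past <unit>".
_PATTERN = re.compile(
    r"show (?P<field>[a-z ]+?) (?P<op>above|below) (?P<value>\d+(?:\.\d+)?)"
    r" for the past (?P<unit>day|week|month)",
    re.IGNORECASE,
)
_OPERATORS = {"above": ">", "below": "<"}
_WINDOWS = {
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),  # approximation chosen for this sketch
}


def parse_filter(prompt: str) -> Optional[AlertFilter]:
    """Return a structured filter, or None when the prompt is not understood."""
    match = _PATTERN.search(prompt)
    if match is None:
        return None
    return AlertFilter(
        field=match["field"].strip().lower().replace(" ", "_"),
        operator=_OPERATORS[match["op"].lower()],
        threshold=float(match["value"]),
        window=_WINDOWS[match["unit"].lower()],
    )


result = parse_filter("Show risk scores above 80 for the past week")
print(result)
```

Returning `None` for unrecognized prompts gives the interface a clear signal to fall back to the existing filter panel or ask the user to rephrase.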

Final MVP

By transitioning from vision to MVP with a focus on immediate value and scalability, we ensured that each feature directly addressed user needs while paving the way for future innovation. This approach allowed us to deliver a high-impact solution within the given constraints while staying aligned with our long-term goals.

Outcome

A Major Win for a Design Team Pushing Boundaries

Presented both the North Star Vision and the MVP to the CEO, securing buy-in and plans to position the solution as a flagship feature at Transform, the next C3 AI annual conference. Delivered presentations to key customers, including Shell and FHR, with FHR escalating their excitement to their CEO. Secured over 5M in revenue by signing longer-term contracts with our biggest customers, including Shell and FHR.

“The AI summaries and actionable recommendations save me hours in diagnosing alerts. I finally feel like I can focus on solving problems rather than sifting through data.”

FHR Engineer

“The guided workflow is exactly what we needed to make sense of complex alerts. This is a game-changer for improving our response time and efficiency.”

Shell Reliability Engineer

Thanks for Stopping By

Evie Xu