AI-Powered Pipeline Recommender for O11ySources
Transforming complex data pipeline creation from a manual process into an intelligent, one-click experience.
Role
Product Manager
Company
Vunet Systems
Date
July 2024
Project Summary
I led the design and delivery of an AI-driven recommendation engine for O11ySources (integrations) that converts raw telemetry samples into validated, deployable streaming pipeline blueprints. The recommender analyzes historical pipeline definitions and plugin configurations, then proposes ready-to-apply pipeline topologies for heterogeneous telemetry (logs, metrics, and traces) from infrastructure sources such as servers, databases, network devices, middleware, and cloud services.
The Problem
Customers had to manually author complex pipeline configurations, choosing block topologies, transform sequences, and detailed plugin parameters. This manual process was:
- Time-Consuming: Required hours of effort and deep product knowledge.
- Error-Prone: Simple syntax mistakes led to deployment failures.
- A Bottleneck: Slowed down data onboarding and delayed time-to-value for customers.
- Expert-Dependent: Often required on-call engineering effort to build and tune.
The Solution
The recommender follows a hybrid architecture combining retrieval, template-driven generation, and lightweight LLM assistance to automatically generate safe and accurate pipeline configurations.
Corpus & Index: Collect and anonymize historical pipeline JSONs to create a knowledge base.
Feature Extraction & Retrieval: Use KNN over embeddings to find the most similar prior configurations for a given data sample.
Template + LLM Generation: Use a template engine to guarantee a valid output structure and an LLM to fill in specific parameter values (such as grok patterns).
Validation & Sandboxing: Statically validate the generated config and perform a dry-run with sample data to ensure correctness before deployment.
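The retrieval and template steps above can be sketched roughly as follows. This is a minimal illustration, not the production implementation: the corpus entries, the character n-gram "embedding" (a stand-in for a learned embedding model), and the function names `embed`, `knn`, and `fill_template` are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical corpus: each entry pairs a sample line with the blocks that handled it.
CORPUS = [
    {"name": "apache_access",
     "sample": '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
     "blocks": ["grok", "date", "geoip"]},
    {"name": "syslog",
     "sample": "Oct 10 13:55:36 host1 sshd[1234]: Accepted password for root",
     "blocks": ["grok", "date"]},
    {"name": "json_app",
     "sample": '{"level": "info", "msg": "started", "ts": 1696945000}',
     "blocks": ["json", "date"]},
]

def embed(text, n=3):
    """Toy embedding (assumption): counts of character n-grams."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def knn(sample, corpus, k=2):
    """Return the k corpus entries most similar to the sample, best first."""
    query = embed(sample)
    scored = [(cosine(query, embed(entry["sample"])), entry) for entry in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def fill_template(neighbor, params):
    """Template step: structure comes from the retrieved config; only
    parameter values (e.g. ones proposed by an LLM) are filled in."""
    return {
        "source": neighbor["name"],
        "blocks": [{"type": b, "params": params.get(b, {})} for b in neighbor["blocks"]],
    }

new_sample = '192.168.1.5 - - [11/Oct/2023:08:12:01 +0000] "POST /login HTTP/1.1" 302 512'
best_score, best_entry = knn(new_sample, CORPUS, k=1)[0]
proposal = fill_template(best_entry, {"grok": {"pattern": "<filled in by LLM>"}})
```

Because the block order is copied from a retrieved configuration, the generated output stays structurally valid even when a filled-in parameter value turns out to be wrong, which is the case the validation step is meant to catch.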
My Role as Product Manager
Vision & Strategy
Defined product vision, scope, acceptance criteria, and KPIs for the AI Recommender.
User Discovery
Led discovery sessions with SREs, support teams, and pilot customers to build the data corpus.
Prioritization
Owned product prioritization for the entire ML pipeline, from data ingestion to the final UI wizard.
Safety & Validation
Designed a continuous evaluation (Evals) framework, validation rules, and rollback flows as the recommender's safety protocols.
Go-To-Market
Led UAT and adoption measurement, and created launch collateral such as in-product onboarding.
Post-Launch Monitoring
Tracked adoption rates and quality metrics from our Evals dashboard to inform future iterations.
Architecture & User Experience
High-Level Architecture
1. Input & Analysis
Sample Data
Feature Extraction
2. Recommendation Engine
Vector Index (Corpus)
KNN Retrieval
Hybrid Generator
3. Validation & Output
Validator
Sandbox Dry-Run
UI Suggestion
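The third stage (Validator and Sandbox Dry-Run) might look something like the sketch below. It is illustrative only: the block types `regex_parse` and `rename`, the config shape, and the `_parse_failure` marker are hypothetical names chosen for this example, not the product's actual plugin set.

```python
import re

KNOWN_BLOCKS = {"regex_parse", "rename"}  # hypothetical plugin names

def validate(config):
    """Static checks only: return a list of error strings (empty means valid)."""
    blocks = config.get("blocks")
    if not isinstance(blocks, list) or not blocks:
        return ["config must contain a non-empty 'blocks' list"]
    errors = []
    for i, block in enumerate(blocks):
        kind = block.get("type")
        params = block.get("params", {})
        if kind not in KNOWN_BLOCKS:
            errors.append(f"block {i}: unknown type {kind!r}")
        elif kind == "regex_parse":
            if "pattern" not in params:
                errors.append(f"block {i}: missing 'pattern'")
                continue
            try:
                re.compile(params["pattern"])
            except re.error as exc:
                errors.append(f"block {i}: bad pattern ({exc})")
    return errors

def dry_run(config, samples):
    """Apply the blocks to sample lines in memory; nothing is deployed."""
    results = []
    for line in samples:
        event = {"message": line}
        for block in config["blocks"]:
            params = block.get("params", {})
            if block["type"] == "regex_parse":
                match = re.search(params["pattern"], event["message"])
                if match is None:
                    event["_parse_failure"] = True  # flag the event, skip later blocks
                    break
                event.update(match.groupdict())
            elif block["type"] == "rename":
                for old, new in params.get("fields", {}).items():
                    if old in event:
                        event[new] = event.pop(old)
        results.append(event)
    return results

config = {"blocks": [
    {"type": "regex_parse",
     "params": {"pattern": r'"(?P<verb>[A-Z]+) \S+ [^"]*" (?P<status>\d{3})'}},
    {"type": "rename", "params": {"fields": {"status": "response"}}},
]}
sample = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'

errors = validate(config)
preview = dry_run(config, [sample, "not an access log line"]) if not errors else []
```

Running the dry-run only after static validation passes keeps malformed patterns from ever touching sample data, and the per-event failure flag gives the UI something concrete to show in a before/after preview.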
Suggestion UI Mockup
AI Recommended Pipeline
For your Apache Access Log sample.
Proposed Pipeline
Transformation Preview
BEFORE
AFTER
{
"response": 200,
"verb": "GET",
"geoip": { "country_name": "USA" },
"user_agent": { "name": "Firefox" }
}
Based on 3 similar historical pipelines.
Impact & Key Metrics
>2 Hours to <5 Mins
Reduction in median time-to-first-pipeline.
~65%
Pilot suggestion acceptance rate without edits.
~90%
Reduction in syntax/logic errors reaching runtime.
Full
Coverage across DB, Network, Middleware & more.
Roadmap & Next Steps
Active Learning
Capture user edits to suggestions to retrain and re-rank proposals over time, creating a self-improving system.
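One possible shape for this feedback loop, sketched under assumptions: the class name `FeedbackReranker`, the blend weight, and the smoothed acceptance rate are illustrative choices, not a committed design. The idea is to blend retrieval similarity with how often each template's suggestions were accepted without edits.

```python
from collections import defaultdict

class FeedbackReranker:
    """Hypothetical re-ranker: mixes similarity with observed acceptance."""

    def __init__(self, weight=0.3):
        self.weight = weight              # share of the score given to feedback
        self.shown = defaultdict(int)     # times a template was suggested
        self.accepted = defaultdict(int)  # times it was accepted without edits

    def record(self, template_id, accepted_without_edits):
        self.shown[template_id] += 1
        if accepted_without_edits:
            self.accepted[template_id] += 1

    def acceptance(self, template_id):
        # Laplace smoothing: a template with no history scores 0.5.
        return (self.accepted[template_id] + 1) / (self.shown[template_id] + 2)

    def rerank(self, candidates):
        """candidates: list of (template_id, similarity) pairs; best first."""
        def score(candidate):
            template_id, similarity = candidate
            return ((1 - self.weight) * similarity
                    + self.weight * self.acceptance(template_id))
        return sorted(candidates, key=score, reverse=True)

reranker = FeedbackReranker()
candidates = [("template_a", 0.80), ("template_b", 0.75)]
before = reranker.rerank(candidates)   # no feedback yet: similarity decides

for _ in range(5):
    reranker.record("template_a", accepted_without_edits=False)
    reranker.record("template_b", accepted_without_edits=True)
after = reranker.rerank(candidates)    # feedback now favors template_b
```

Smoothing keeps a single early rejection from burying a template, while the weight caps how far feedback can override similarity.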
Template Marketplace
Create a library of shareable templates for common telemetry types (e.g., NGINX, MySQL, SNMP).
Performance Tuning
Use AI to analyze pipeline performance data and proactively suggest optimizations, like recommending more efficient plugins or identifying bottlenecks.