It’s designed to be read, executed, and extended — all within 10 minutes.
Introduction — What We’re Building
The Sales Lead Scoring Agent automates the process of identifying and ranking new leads so sales teams can focus on those most likely to convert. It combines KumoRFM’s predictive intelligence with OpenAI’s agentic orchestration, producing a daily, data-driven prioritization that tells Sales Development Representatives (SDRs) exactly who to contact first. Every day, the agent:
- Loads the latest lead data from your CRM or CSV source
- Uses KumoRFM (Kumo Relational Foundation Model) to infer conversion likelihoods — no training required
- Categorizes leads into HIGH, MEDIUM, and LOW priority tiers
- Summarizes key drivers behind each prediction so SDRs know why each lead ranks where it does
This integration allows OpenAI Agents to reason over multi-table business data, invoke KumoRFM tools dynamically, and turn raw relational data into actionable daily insights — seamlessly, reproducibly, and at scale.
Behind the scenes, this walkthrough uses the `kumo-rfm-mcp` server, a Python MCP server that exposes KumoRFM tools (e.g., `predict`, `evaluate`) to various agentic frameworks, including OpenAI, CrewAI, and LangGraph. Please see KumoRFM MCP Server for more details.
System Overview
Architecture Components
The Sales Lead Scoring Agent consists of four key components working together through the Model Context Protocol (MCP):
- 🧠 Agent (GPT-5): The central reasoning engine. It plans actions, interprets instructions, and dynamically invokes MCP tools (e.g., `predict`, `lookup_table_rows`) to analyze and rank sales leads.
- ⚙️ Runner: The execution harness. It initializes the Agent, maintains the session state, and, most importantly, loads lead data from an external source such as S3 or a CRM database. The Runner passes this data into the MCP environment so KumoRFM can interpret it as part of a relational feature graph.
- 🔌 KumoRFM MCP Server: The integration bridge. It exposes the KumoRFM model as a standardized MCP toolset that can be called by any AI agent. Each tool is strongly typed, authenticated, and schema-aware, handling data loading, graph construction, and model inference.
- 🧮 KumoRFM Model: The predictive reasoning engine. A pre-trained Relational Foundation Model (RFM) that performs zero-training inference over multi-table data. Through the MCP interface, it executes predictive queries, evaluates results, and provides explainability, all without custom model training.
End-to-End Flow
Each daily run of the Sales Lead Scoring Agent follows these key steps:
- Initialize the Agent: The Runner launches the Agent and connects it securely to data sources and the KumoRFM MCP Server.
- Fetch Latest Data: Load new or updated leads from CRM or S3 for scoring.
- Inspect Data: Preview schema and structure using `inspect_table_files` to understand available features.
- Build Graph: Use `update_graph_metadata` and `materialize_graph` to form a relational feature graph.
- Predict: Execute a `predict` query to estimate conversion likelihoods; no model training is required.
- Enrich Results: Retrieve key details for top leads via `lookup_table_rows`.
- Rank & Summarize: Categorize leads into priority tiers and provide short explanations.
- Log & Automate: Record outputs and repeat the process automatically for continuous insights.
KumoRFM tools used in this agent:
- `inspect_table_files`: Analyze the structure and preview rows of tabular data.
- `update_graph_metadata`: Define or refresh the relationships among tables.
- `materialize_graph`: Assemble the relational feature graph for inference.
- `predict`: Run predictive queries to generate conversion likelihoods.
- `lookup_table_rows`: Retrieve detailed records for selected entities or leads.
Data & Problem Definition
In this example, we work with the Lead Scoring Dataset, publicly available at: s3://kumo-sdk-public/rfm-datasets/lead_scoring/lead_scoring.csv
This dataset contains about 8,000 historical leads, both converted and unconverted, and serves as the foundation for our sales prioritization model. It is composed of a single table with the following key columns:
| Column | Description |
|---|---|
| `lead_id` | Unique identifier for each lead |
| `contact_date` | Date of first contact |
| `converted` | Binary target (1 = converted, 0 = not converted) |
| `source`, `region`, `industry`, … | Attributes describing each lead |
What We’re Predicting
Our goal is to predict which new leads (added yesterday) are most likely to convert (converted = 1). This enables the sales or marketing team to focus efforts on the highest-potential leads, automatically and intelligently. As part of our daily team sync, the sales agent will:
- Take the leads submitted yesterday (one day before `MEETING_DAY`).
- Generate a ranked list of leads based on their likelihood to convert.
- Present this prioritized list to the team, helping guide outreach efforts efficiently.
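To make the ranking and tiering step concrete, here is a minimal sketch of how predicted probabilities could be turned into a prioritized list. In the walkthrough the agent itself performs this step from its prompt; the thresholds, the `tier_for` and `rank_leads` helpers, and the example scores below are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Tuple

# Hypothetical cut-offs; the source does not specify how tiers are delimited.
HIGH_THRESHOLD = 0.7
MEDIUM_THRESHOLD = 0.4


def tier_for(probability: float) -> str:
    """Map a conversion probability to a priority tier."""
    if probability >= HIGH_THRESHOLD:
        return "HIGH"
    if probability >= MEDIUM_THRESHOLD:
        return "MEDIUM"
    return "LOW"


def rank_leads(scores: Dict[int, float]) -> List[Tuple[int, float, str]]:
    """Sort leads by descending probability and attach a tier to each."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(lead_id, prob, tier_for(prob)) for lead_id, prob in ranked]


# Made-up scores for three leads, purely for illustration.
for lead_id, prob, tier in rank_leads({101: 0.35, 102: 0.82, 103: 0.55}):
    print(f"{tier:<6} lead {lead_id} (p={prob:.2f})")
```

Keeping the thresholds as named constants makes it easy to adjust them once you see how the scores are distributed in your own data.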
Setup
Prerequisites
- Python environment: Create a new environment using `uv` or `pip` with Python ≥ 3.10
- KumoRFM API key: Obtain from https://kumorfm.ai and set it as `export KUMO_API_KEY=<your_api_key>`
- OpenAI API key: Obtain from https://platform.openai.com/api-keys and set it as `export OPENAI_API_KEY=<your_openai_api_key>`
- Internet access: Required for MCP to communicate with KumoRFM services
Install and Import Required Libraries
Before running the example agent notebook, install the following dependencies:
- `kumo-rfm-mcp`: Provides the KumoRFM MCP server and tools to interact with relational graph data.
- `openai-agents`: Framework for building and running agent workflows.
- `fsspec` and `s3fs`: Enable file system access (e.g., loading data from S3).
- `pandas`: Required for tabular data handling.
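Installing these typically amounts to `pip install kumo-rfm-mcp openai-agents fsspec s3fs pandas`. Before going further, a small preflight check like the sketch below can confirm that the packages and API keys are in place. The importable module names `kumo_rfm_mcp` and `agents` are assumptions about how the `kumo-rfm-mcp` and `openai-agents` packages expose themselves, and both helper functions are hypothetical.

```python
import importlib.util
import os
from typing import Iterable, List, Mapping, Optional

# Assumed import names for the packages listed above.
REQUIRED_MODULES = ["kumo_rfm_mcp", "agents", "fsspec", "s3fs", "pandas"]
REQUIRED_ENV_VARS = ["KUMO_API_KEY", "OPENAI_API_KEY"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the variables that are unset or empty (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


print("Missing packages:", missing_modules(REQUIRED_MODULES))
print("Missing environment variables:", missing_env_vars(REQUIRED_ENV_VARS))
```

Both lists should be empty before you run the agent.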
Data Loading: Getting Yesterday’s Leads
Now, let’s define a helper function to load the lead data from S3 and extract the lead IDs from the day before the meeting date. This ensures the agent only scores the most recent leads for prioritization. The function:
- Loads the dataset directly from S3 using `fsspec` and `pandas`.
- Parses the `contact_date` column into a proper datetime format.
- Filters the dataset to include only leads from one day before the given `MEETING_DAY`.
- Returns a clean DataFrame ready to be passed to the agent for prediction.
This function keeps the pipeline dynamic — just update MEETING_DAY, and it will automatically fetch the latest leads for scoring.
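A minimal sketch of such a helper is shown below. It is not the notebook's exact code: the signature, the split into a separate `filter_previous_day` function, and the `anon=True` option (which assumes the bucket allows anonymous reads) are all assumptions made for illustration.

```python
import pandas as pd

LEADS_PATH = "s3://kumo-sdk-public/rfm-datasets/lead_scoring/lead_scoring.csv"


def filter_previous_day(df: pd.DataFrame, meeting_day: str) -> pd.DataFrame:
    """Keep only leads whose contact_date is the day before meeting_day."""
    target = (pd.Timestamp(meeting_day) - pd.Timedelta(days=1)).normalize()
    # Parse to datetime and drop any time-of-day component before comparing.
    contact = pd.to_datetime(df["contact_date"]).dt.normalize()
    return df.loc[contact == target].reset_index(drop=True)


def get_leads_from_previous_day(
    meeting_day: str, path: str = LEADS_PATH
) -> pd.DataFrame:
    """Load the lead table from S3 and return yesterday's leads."""
    # Imported lazily so filter_previous_day stays usable without S3 access.
    import fsspec

    # anon=True is an assumption: it requests unauthenticated access via s3fs.
    with fsspec.open(path, "rb", anon=True) as f:
        df = pd.read_csv(f)
    return filter_previous_day(df, meeting_day)
```

Separating the filter from the loader lets you check the date logic on a small in-memory DataFrame before touching S3.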
Let’s test the get_leads_from_previous_day() function with a real example from our S3 dataset.
Build the Sales Agent
Now that we’ve prepared our dataset and helper functions, we’re ready to create the agent function. The OpenAI Agents SDK makes this process simple and modular, allowing us to combine reasoning, data access, and predictive intelligence in just a few lines of code. Here’s what we’ll do next:
- Define `lead_scoring_agent`, an `Agent` object initialized with the system prompt and connected to KumoRFM MCP tools.
- Fetch the target leads using our `get_leads_from_previous_day()` helper function.
- Compose the request (prompt) for the agent to run predictions and prioritize the leads.
- Execute the agent through the Runner.
Lead Scoring Agent
The Lead Scoring Agent is a specialized sub-agent responsible for predicting which leads are most likely to convert. It works as part of the daily automation pipeline: taking yesterday’s leads, running predictions with KumoRFM, and delivering a prioritized outreach list for the sales development team. The following Agent Prompt defines the role, goals, and workflow of our AI sales assistant.
It tells the model how to think and act, from connecting to KumoRFM and running predictions to generating ranked outreach lists for SDRs. In this scenario, the agent will:
- Use the Lead Scoring dataset hosted on S3
- Run KumoRFM predictive queries to estimate conversion probabilities
- Rank leads into priority tiers (High, Medium, Low)
- Return a clean, actionable list for the daily sales meeting
The prompt is stored in `LEAD_SCORING_AGENT_PROMPT`.
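For illustration only, a prompt covering the responsibilities listed above might look like the sketch below. The wording is hypothetical and is not the notebook's actual prompt, which is why it uses a different constant name; only the tool names, tier labels, and dataset path come from this walkthrough.

```python
# Hypothetical example prompt; adapt the wording to your own sales process.
EXAMPLE_LEAD_SCORING_PROMPT = """
You are a sales lead scoring assistant for a team of SDRs.

Data: the Lead Scoring dataset at
s3://kumo-sdk-public/rfm-datasets/lead_scoring/lead_scoring.csv

Workflow:
1. Inspect the data with inspect_table_files.
2. Build the graph with update_graph_metadata and materialize_graph.
3. Use predict to estimate each requested lead's likelihood to convert.
4. Use lookup_table_rows to fetch details for the top leads.
5. Rank the leads into HIGH, MEDIUM, and LOW priority tiers.

Output: a concise ranked list with one short reason per lead,
suitable for the daily sales meeting.
""".strip()

print(EXAMPLE_LEAD_SCORING_PROMPT)
```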
Sales Agent Function
The following function ties everything together: it initializes the Lead Scoring Agent, retrieves the latest leads, builds the prompt, and runs the prediction workflow end to end. It serves as the main entry point that the Sales Agent uses during each daily sync to generate actionable insights for the team.
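A sketch of such an entry point follows, under stated assumptions: `build_request` and `run_sales_agent` are hypothetical names, the request wording is illustrative, and the already-connected MCP server is passed in by the caller. `Agent` and `Runner.run` come from the OpenAI Agents SDK; the `"gpt-5"` model string mirrors the model named in the architecture overview.

```python
from typing import Any, Sequence


def build_request(lead_ids: Sequence[int], meeting_day: str) -> str:
    """Compose the user request that asks the agent to score specific leads."""
    ids = ", ".join(str(lead_id) for lead_id in lead_ids)
    return (
        f"Today's meeting is on {meeting_day}. "
        f"Score the following {len(lead_ids)} leads contacted the day before: {ids}. "
        "Predict each lead's likelihood to convert, then rank the leads into "
        "HIGH, MEDIUM, and LOW priority tiers with a one-line reason each."
    )


async def run_sales_agent(
    instructions: str,
    lead_ids: Sequence[int],
    meeting_day: str,
    mcp_server: Any,
) -> str:
    """Create the Lead Scoring Agent and run one scoring request end to end."""
    # Imported lazily so build_request can be used without the SDK installed.
    from agents import Agent, Runner

    agent = Agent(
        name="Lead Scoring Agent",
        instructions=instructions,  # the system prompt described earlier
        mcp_servers=[mcp_server],   # a connected KumoRFM MCP server
        model="gpt-5",
    )
    result = await Runner.run(agent, input=build_request(lead_ids, meeting_day))
    return str(result.final_output)
```

Keeping the request builder separate from the agent call makes the prompt easy to inspect and adjust without spending API calls.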
Integrating KumoRFM via MCP
The only remaining step is to initialize the KumoRFM MCP server and provide it to the agent. Once that is done, running `main()` produces the agent's prioritized lead list.
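The wiring could look like the following sketch. It assumes the server can be launched over stdio with `python -m kumo_rfm_mcp.server` and reads `KUMO_API_KEY` from its environment; the `kumo_mcp_params` helper, the placeholder instructions and request text, and the longer timeout value are illustrative choices, not the notebook's exact code.

```python
import os
import sys
from typing import Any, Dict


def kumo_mcp_params(api_key: str) -> Dict[str, Any]:
    """Build stdio launch parameters for the KumoRFM MCP server (assumed module path)."""
    return {
        "command": sys.executable,
        "args": ["-m", "kumo_rfm_mcp.server"],
        "env": {"KUMO_API_KEY": api_key},
    }


async def main() -> None:
    # Imported lazily so the parameter helper works without the SDK installed.
    from agents import Agent, Runner
    from agents.mcp import MCPServerStdio

    params = kumo_mcp_params(os.environ["KUMO_API_KEY"])
    # A longer timeout than the SDK default is an assumption here, chosen
    # because graph construction and prediction calls can take a while.
    async with MCPServerStdio(
        name="KumoRFM", params=params, client_session_timeout_seconds=600
    ) as server:
        agent = Agent(
            name="Lead Scoring Agent",
            instructions="Score and rank sales leads using the KumoRFM tools.",
            mcp_servers=[server],
        )
        result = await Runner.run(agent, input="Score yesterday's leads.")
        print(result.final_output)


# To execute from a script: import asyncio; asyncio.run(main())
# In a notebook, use: await main()
```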
For the full end-to-end code, please see the notebook example.
We’d love to hear from you! ❤️
Found a bug or have a feature request? Submit issues directly on GitHub. Your feedback helps us improve RFM for everyone. Built something cool with RFM? We’d love to see it! Share your project on LinkedIn and tag @kumo. We regularly spotlight projects on our official channels, and yours could be next!