

Reimagining the Diligence Stack with Composable, Controlled AI

Jen Hilibrand - Chief of Staff at Thread AI

December 09, 2025

In every industry, bad decisions are costly. Diligence, the act of investigating and synthesizing data, often requires human feedback as a safeguard to ensure that decisions are rooted in the correct evidence and context.

While the term typically refers to work done in a financial services context, the act of "diligence" has applications across industries and functions.

  • A salesperson may diligence a prospective client before meeting them.
  • A store manager may diligence product inventory performance.
  • A marketer may diligence the performance of an ad campaign.

While the context changes, the shape of the problem remains the same: synthesizing fragmented, structured, and unstructured data into actionable insights.

Diligence Gaps in Public AI Tools

Although these problems take a similar shape, each requires context, often dynamic, that is unique to the domain or objective of the diligence, and whoever drives the diligence relies on a different set of tools and sources depending on that context.

Public Large Language Models (LLMs) like ChatGPT or Gemini have revolutionized simple information retrieval, but they hit critical roadblocks when applied to high-stakes organizational diligence.

These tools and other research point solutions can fail users in a few dimensions:
1. No or Limited Access to Proprietary Data Sources or Platforms

Since public LLMs are trained on open-web data, they remain blind to an organization's secure internal ecosystem, lacking access to the proprietary databases, private documents, and specialized platforms where critical business intelligence lives.

2. Limited Ability to Specify the Diligence Process

Public LLMs function as rigid, generalized reasoning engines that prevent users from configuring a specific diligence workflow or mandating exactly how particular artifacts and data points must be incorporated into the analysis.

These models lack granular controls: they offer no mechanism to assign higher weight to high-fidelity resources, which can be crucial in a diligence process, nor to recognize when information is no longer relevant and remove it from the context.
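
To make this concrete, a controlled diligence pipeline might weight and expire sources explicitly. A minimal sketch, with hypothetical fidelity scores of our own choosing:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Source:
    """A single piece of evidence available to the diligence process."""
    name: str
    content: str
    fidelity: float          # e.g. 1.0 for audited filings, 0.4 for blog posts
    retrieved_at: datetime

def build_context(sources: list[Source], max_age: timedelta) -> list[Source]:
    """Drop stale sources, then order so high-fidelity evidence leads."""
    now = datetime.now()
    fresh = [s for s in sources if now - s.retrieved_at <= max_age]
    return sorted(fresh, key=lambda s: s.fidelity, reverse=True)
```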

3. No Clear Traceability

Public LLMs often function as "black boxes," offering no traceability such as citations, audit trails, or attribution across sources.
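
By contrast, a traceable pipeline attaches provenance to every synthesized claim. A minimal sketch with illustrative types:

```python
from dataclasses import dataclass

@dataclass
class Citation:
    source: str     # e.g. "Form 10-K (2024), Item 7: MD&A"
    url: str
    excerpt: str    # the passage that supports the claim

@dataclass
class Finding:
    claim: str
    citations: list[Citation]   # every synthesized claim carries its evidence
```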

High Stakes Diligence for a Big Four Consulting Firm

We recently partnered with a Big Four consulting firm looking to arm front-office employees with insights on clients. The firm wanted to consolidate public information about an enterprise, effectively automating the creation of a lightweight Public Information Book (PIB) - a brief that would typically take analysts hours to assemble before an important meeting.

This process involves collating information across sources into a consolidated brief. These typically include:

  • SEC filings
  • Equity research reports
  • Earnings and conference call transcripts
  • Relevant news publications
  • Investor presentations
  • And more

We created a Diligence Worker, which is triggered automatically by an upcoming customer calendar meeting. Upon detecting a meeting, the Worker enters a planning and tool-calling loop: searching financial data sources, synthesizing the findings into a report, and pushing it to a "Handoff" state for human review. Once approved, the report is written to SharePoint and emailed to the user.
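
In outline, the Worker's lifecycle looks roughly like the sketch below; the helper names and stub tools are illustrative stand-ins, not Lemma's actual API:

```python
from dataclasses import dataclass

@dataclass
class Step:
    tool: str
    query: str

# Stand-ins for external data sources; in Lemma these are modular Functions.
TOOLS = {
    "sec_filings": lambda q: f"Key 10-K excerpts for {q}",
    "transcripts": lambda q: f"Earnings-call highlights for {q}",
    "news":        lambda q: f"Recent headlines mentioning {q}",
}

def run_diligence_worker(client: str) -> str | None:
    """Plan, call tools, synthesize, then pause for human review (Handoff)."""
    plan = [Step(tool, client) for tool in TOOLS]         # stand-in for LLM planning
    context = [TOOLS[s.tool](s.query) for s in plan]      # tool-calling loop
    report = f"PIB for {client}:\n" + "\n".join(context)  # stand-in for synthesis

    approved = input("Approve and deliver this report? [y/N] ").lower() == "y"
    if approved:
        print("Writing report to SharePoint and emailing the requester...")
        return report
    return None
```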

This transformed a multi-hour analyst task into an automated workflow measured in minutes.

[Figure: The Diligence Worker - supervised reasoning across a series of tools.]

Our approach to solving this problem leaned on core architectural features of our platform:
Dynamic Tool-Use

We moved beyond brittle, linear scripts by leveraging Agentic Tool-Calling. Within Lemma, external capabilities, such as accessing SEC filings, equity research databases, or news feeds, are encapsulated as modular Functions that can be executed via protocols like REST or gRPC. Rather than following a hard-coded path, the Worker utilizes an LLM to reason through the specific diligence request, dynamically deciding which Functions to call and with what parameters.

This allows the system to bridge the gap between reasoning and action, retrieving real-time, proprietary data into the run's Context for synthesis. By treating these API integrations as tools, we transform the workflow from a static process into an intelligent engine capable of navigating the unique complexity of each client profile.
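
A minimal sketch of the idea, assuming a hypothetical registry and using SEC EDGAR's public submissions endpoint as one example tool (the decorator and catalog shapes are our own illustration, not Lemma's actual Function Registry API):

```python
import requests

# Hypothetical registry: each entry exposes an external capability as a tool.
FUNCTION_REGISTRY = {}

def register(name: str, description: str):
    """Decorator that makes a plain callable selectable by the reasoning LLM."""
    def wrap(fn):
        FUNCTION_REGISTRY[name] = {"description": description, "fn": fn}
        return fn
    return wrap

@register("sec_filings", "Fetch the EDGAR filing index for a 10-digit CIK.")
def sec_filings(cik: str) -> dict:
    # SEC EDGAR's public submissions endpoint (CIK must be zero-padded).
    url = f"https://data.sec.gov/submissions/CIK{cik}.json"
    resp = requests.get(url, headers={"User-Agent": "diligence-sketch demo@example.com"})
    return resp.json()

# The Worker hands this catalog to the LLM, which decides what to call and how.
tool_catalog = [
    {"name": name, "description": meta["description"]}
    for name, meta in FUNCTION_REGISTRY.items()
]
```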

Human-in-the-Loop as a Core Part of the Process

In domains characterized by complexity, ambiguity, or significant risk, such as financial services, we may look to augment autonomy with human feedback. In this spirit, we treat Human-in-the-Loop (HITL) as a native architectural component. Lemma's "Handoff" state allows a Worker to pause its execution and await input from an internal stakeholder, placing human oversight at critical junctures in the workflow.

This creates a synergy where the AI acts as a tool for "Decision Augmentation", automating the heavy lifting of data aggregation while deferring to human judgment for the final diligence assessment. Because every Run of a Worker is tracked and its Context is preserved, human decisions can be captured as structured data. This allows organizations to implement Reinforcement Learning from Human Feedback (RLHF), turning every human intervention into a data point that refines the Worker's accuracy over time.
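
As a minimal sketch of how a Handoff decision might be captured (the schema and file-based log are our own illustration, not Lemma's API):

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class ReviewDecision:
    """One human intervention at a Handoff, captured as structured data."""
    run_id: str
    reviewer: str
    approved: bool
    edits: str                  # what the reviewer changed, if anything
    decided_at: str

def record_decision(run_id: str, reviewer: str, approved: bool, edits: str = "") -> None:
    decision = ReviewDecision(
        run_id, reviewer, approved, edits,
        decided_at=datetime.now(timezone.utc).isoformat(),
    )
    # Append to a feedback log that downstream training jobs can consume.
    with open("handoff_feedback.jsonl", "a") as f:
        f.write(json.dumps(asdict(decision)) + "\n")
```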

This approach ensures we are building a collaborative agentic system where teams oversee complex processes rather than being replaced by them.

Composability

We approached diligence as a reusable architectural pattern that leverages our platform's composability and flexibility. By using our Function Registry, which treats API calls and serverless functions as modular, protocol-agnostic building blocks, we can inject different "layers of context" into the same underlying structure.

This means the reasoning engine originally built to generate a financial Public Information Book can be repurposed for marketing attribution or vendor analysis simply by swapping the collection of available tools. The Worker dynamically orchestrates these domain-specific capabilities within a shared Context, allowing enterprises to solve distinct diligence problems across functions without rebuilding the entire workflow from scratch.
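
A minimal sketch of that reuse, with hypothetical stand-in tools in place of real Functions:

```python
# Domain-specific tool collections; the orchestration logic never changes.
FINANCE_TOOLS = {
    "sec_filings": lambda s: f"Filing highlights for {s}",
    "transcripts": lambda s: f"Earnings-call notes for {s}",
}
MARKETING_TOOLS = {
    "ad_metrics":  lambda s: f"Campaign metrics for {s}",
    "attribution": lambda s: f"Attribution-model output for {s}",
}

def diligence(subject: str, tools: dict) -> str:
    """One reasoning engine, parameterized only by the tools it may call."""
    findings = [fn(subject) for fn in tools.values()]
    return f"Brief on {subject}:\n" + "\n".join(findings)

pib = diligence("Acme Corp", FINANCE_TOOLS)              # financial PIB
report = diligence("Spring campaign", MARKETING_TOOLS)   # marketing attribution
```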

Extending beyond Financial Services

Zooming out to the concept of "diligence as a problem," we built this workflow with cross-domain extensibility in mind. This pattern of intaking relevant context, planning, tool-calling, extracting and synthesizing, and then delivering consolidated learnings can be implemented at scale, across domains, and for several teams using Lemma's infrastructure.

This "diligence engine" is an example of pattern reusability - by relying on the composability of Workers and modular Functions, the same fundamental logic applies whether you are performing client diligence in finance, evaluating vendor risk, or summarizing daily retail operations.

[Figure: The reusable Worker Tools can be refitted for diligence across teams.]

Accelerating Time to Value

By building on reusable, agentic patterns and focusing on composability, Thread AI was able to deliver this transformative solution in a matter of days, not weeks or months.

The best outcomes are often achieved through iterative building. By utilizing a highly modular, composable platform like Lemma, which is both model- and tool-agnostic, builders can deliver immediate value while future-proofing their automations.

While composability is essential, it cannot be fully realized in high-stakes, critical environments without native oversight. Lemma provides that oversight both as human feedback and as traceability into agentic actions.

Start building your diligence context engine and gain immediate value by leveraging Thread AI's pre-built critical patterns. Contact our team to learn more.

