
World of DaaS Roundtable Recap: Unifying Data to Power AI

September 29, 2025

At our latest World of DaaS roundtable, data leaders discussed the challenges and opportunities of unifying data to support AI initiatives. The conversation touched on the critical importance of data quality, how to prove ROI, the value of external reference data, the rise of agentic AI workflows, and the need to deliver insights in executive-ready formats.

1. Data Quality Is Still the Bottleneck

The group began by underscoring a foundational truth: without quality data, even the most advanced AI models will underperform. Many organizations underestimate how complex data preparation and validation can be, leading to disappointment when projects don’t deliver. As one participant put it, “garbage in, garbage out is true as ever.”

Several participants returned to the scale of that effort, with one adding, “a lot of work has to be done on the metadata and semantic layer in order for it to prove a value.” Another emphasized that building AI agents requires careful design: “It’s not a 30-minute exercise. It has to be very well thought out.”
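To make the metadata and semantic-layer point concrete, here is a minimal sketch of what a semantic-layer metric definition might look like: a named metric bound to a source table, an aggregation expression, and the dimensions it can be sliced by. All table, column, and metric names are illustrative assumptions, not any participant's actual stack.

```python
from dataclasses import dataclass, field

# Hypothetical semantic-layer metric definition; names and tables are
# illustrative assumptions, not a real schema.
@dataclass
class MetricDefinition:
    name: str
    description: str
    source_table: str
    expression: str                       # aggregation over source_table columns
    dimensions: list[str] = field(default_factory=list)

    def to_sql(self, group_by: list[str]) -> str:
        """Render a simple GROUP BY query for the requested dimensions."""
        unknown = [d for d in group_by if d not in self.dimensions]
        if unknown:
            raise ValueError(f"Unknown dimensions: {unknown}")
        dims = ", ".join(group_by)
        return (
            f"SELECT {dims}, {self.expression} AS {self.name}\n"
            f"FROM {self.source_table}\n"
            f"GROUP BY {dims}"
        )

# Example: a revenue metric sliced by region and month.
revenue = MetricDefinition(
    name="net_revenue",
    description="Revenue net of refunds",
    source_table="analytics.orders",
    expression="SUM(order_total - refund_total)",
    dimensions=["region", "order_month"],
)

print(revenue.to_sql(["region", "order_month"]))
```

The point of the curation work the participants described is that once definitions like this exist, downstream tools (and AI agents) can query a vetted metric instead of guessing at raw tables.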

2. ROI Remains Difficult to Prove

A recurring challenge is turning theoretical data value into tangible, provable ROI. This is particularly difficult with clients who lack analytical maturity, requiring vendors to shift from simply providing data to enabling task-specific, high-impact actions.

One participant observed that many core buyers are “the least sophisticated analytical data users,” requiring vendors to connect directly to task-specific outcomes. Others described prescriptive approaches such as “customizing the semantic layer immediately to prove value,” while another pointed to pricing intelligence as a clear-cut case where ROI is evident: “You can very quickly identify whether a promotion actually brought in customers or whether an incorrect price is turning them away.”
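As a rough illustration of that pricing-intelligence case, the sketch below estimates promotion lift by comparing average daily transactions before and during a promotion window. The dates and counts are invented for illustration; a real analysis would also control for seasonality and other confounders.

```python
from datetime import date

# Hypothetical daily transaction counts; dates and figures are illustrative only.
daily_txns = {
    date(2025, 9, 1): 410, date(2025, 9, 2): 395, date(2025, 9, 3): 402,
    date(2025, 9, 4): 640, date(2025, 9, 5): 655, date(2025, 9, 6): 630,
}
promo_start = date(2025, 9, 4)  # assumed promotion start date

baseline = [v for d, v in daily_txns.items() if d < promo_start]
promo = [v for d, v in daily_txns.items() if d >= promo_start]

baseline_avg = sum(baseline) / len(baseline)
promo_avg = sum(promo) / len(promo)
lift = (promo_avg - baseline_avg) / baseline_avg

# A positive lift suggests the promotion brought in customers;
# a flat or negative one may point to a pricing or targeting problem.
print(f"Baseline avg: {baseline_avg:.0f}, promo avg: {promo_avg:.0f}, lift: {lift:+.1%}")
```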

3. External Reference Data Accelerates Entity Resolution

The conversation then shifted to how external data sources can significantly improve entity resolution, helping organizations more accurately link and enrich their internal records. By anchoring internal data to external reference sets, companies can develop a more complete and reliable view of their customers and entities. One participant noted, “The accuracy of data about who is who and who’s related goes up with reference data.” Others shared use cases around customer attribution and marketing segmentation that provide “tangible benefits for our clients for sure.”
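A minimal sketch of the anchoring idea: match internal customer records against an external reference set, preferring an exact identifier (here a company domain) and falling back to fuzzy name similarity. The records, suffix list, and threshold are illustrative assumptions, not a production matching pipeline.

```python
from difflib import SequenceMatcher

# Illustrative records only; real pipelines match on many more attributes.
internal_records = [
    {"id": "c-101", "name": "Acme Mfg Inc", "domain": "acmemfg.com"},
    {"id": "c-102", "name": "Globex Corporation", "domain": None},
]
reference_set = [
    {"ref_id": "R-9001", "name": "ACME Manufacturing, Inc.", "domain": "acmemfg.com"},
    {"ref_id": "R-9002", "name": "Globex Corp", "domain": "globex.com"},
]

def normalize(name: str) -> str:
    """Lowercase and strip common corporate suffixes before comparing."""
    name = name.lower().replace(",", "").replace(".", "")
    for suffix in (" incorporated", " inc", " corporation", " corp", " mfg", " manufacturing"):
        name = name.replace(suffix, "")
    return name.strip()

def best_match(record, references, threshold=0.8):
    """Anchor an internal record to the closest reference entity, if any."""
    # An exact domain match wins outright when both sides have a domain.
    for ref in references:
        if record["domain"] and record["domain"] == ref["domain"]:
            return ref, 1.0
    # Otherwise fall back to fuzzy name similarity.
    scored = [
        (ref, SequenceMatcher(None, normalize(record["name"]), normalize(ref["name"])).ratio())
        for ref in references
    ]
    ref, score = max(scored, key=lambda pair: pair[1])
    return (ref, score) if score >= threshold else (None, score)

for rec in internal_records:
    ref, score = best_match(rec, reference_set)
    label = ref["ref_id"] if ref else "no match"
    print(f"{rec['id']} -> {label} (score {score:.2f})")
```

The external reference set plays the role the participants described: it supplies the stable identifiers and canonical names that internal systems rarely agree on.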

A contrasting view came from an identity services provider who said their massive data asset is mainly leveraged for external use cases like “investigative risk mitigation and contact enrichment.”

4. Agentic AI Shifts the Focus to Orchestration

Participants drew a distinction between traditional machine learning and the emerging role of agentic AI powered by LLMs. The consensus was that while ML models thrive on clean, centralized data for predictive tasks, LLM-driven systems excel at workflow automation and orchestration. One executive suggested, “A central repository of clean data is very critical for classical ML models, but for LLMs, the value is orchestrating and automating workflows.” They cited text-to-SQL as a practical application.
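As a rough sketch of that text-to-SQL pattern, the example below generates a query from a natural-language question and validates it before execution. The call_llm function is a hardcoded placeholder standing in for a real model call, and the schema and guardrail are illustrative assumptions rather than anyone's actual implementation.

```python
import sqlite3

SCHEMA_HINT = """
Table orders(order_id INTEGER, customer_id INTEGER, region TEXT, order_total REAL)
"""  # Schema summary passed to the model; illustrative only.

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; hardcoded so the sketch runs offline."""
    return "SELECT region, SUM(order_total) AS revenue FROM orders GROUP BY region"

def text_to_sql(question: str) -> str:
    prompt = f"Schema:\n{SCHEMA_HINT}\nWrite one SQL SELECT answering: {question}"
    sql = call_llm(prompt).strip().rstrip(";")
    # Guardrail: only allow read-only, single-statement queries before executing.
    if not sql.lower().startswith("select") or ";" in sql:
        raise ValueError(f"Refusing to run generated SQL: {sql!r}")
    return sql

# Demo against an in-memory database with toy data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders(order_id INTEGER, customer_id INTEGER, region TEXT, order_total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 10, "EMEA", 120.0), (2, 11, "AMER", 80.0), (3, 12, "EMEA", 60.0)],
)

query = text_to_sql("What is revenue by region?")
print(conn.execute(query).fetchall())
```

The orchestration value the executive described sits in the wrapper, not the model: prompting with a schema, constraining what the generated query is allowed to do, and only then touching the data.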

Others stressed the need for human-in-the-loop logic when workflows cross ambiguous domains. “You need agentic flows that include the human in the loop brand by brand,” one participant explained, describing how industry-specific context determines whether terms like “hurricane” are interpreted as products or events.
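A minimal sketch of that human-in-the-loop routing, assuming a per-brand vocabulary: terms the brand context can resolve are classified automatically, and anything ambiguous is escalated for review. The brands, vocabularies, and review hook here are hypothetical.

```python
# Illustrative per-brand vocabularies: in one catalog "hurricane" is a product
# (e.g., a lamp or fan model); elsewhere it is a weather event.
BRAND_CONTEXT = {
    "outdoor_lighting_co": {"hurricane": "product"},
    "insurance_co": {"hurricane": "weather_event"},
}

def ask_human(term: str, brand: str) -> str:
    """Placeholder review step; a real flow would queue this for an analyst."""
    print(f"[review queue] '{term}' for brand '{brand}' needs a human decision")
    return "needs_review"

def classify_term(term: str, brand: str) -> str:
    """Resolve a term using brand context, escalating when it is ambiguous."""
    meaning = BRAND_CONTEXT.get(brand, {}).get(term.lower())
    if meaning is None:
        return ask_human(term, brand)  # human-in-the-loop fallback
    return meaning

print(classify_term("hurricane", "outdoor_lighting_co"))  # -> product
print(classify_term("hurricane", "insurance_co"))         # -> weather_event
print(classify_term("hurricane", "unknown_brand"))        # -> needs_review
```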

5. The Future Belongs to Transparent Providers

The roundtable agreed that the future of this market will not be decided by who can scrape the most websites but by who can do it most responsibly. Compliance, efficiency, and trust are the new differentiators.

“It’s not just about what data you have, but whether buyers can trust how you got it,” one participant summed up.

Key Takeaways

  • Compliance is the strongest differentiator in today’s web scraping market.

  • Scaling scraping is costly and technically challenging. Efficiency is key.

  • Trust and transparency matter as much as coverage.

  • Education is critical since many buyers do not understand the nuances of data sourcing.

  • Long-term winners will be providers who combine scale with transparency and defensibility.

If you are a DaaS executive interested in participating in future roundtables, apply to join our World of DaaS community.
