Redefining Enterprise AI: A Shift Towards Unstructured Data
Enterprise AI is evolving as attention shifts from structured to unstructured data. Enterprises have traditionally relied on structured data, but generative AI has moved the focus to consuming much larger volumes of unstructured data. Because unstructured data lacks a predefined structure, its quality is often unknown, which makes it difficult for enterprises to use with confidence.
Understanding Data Quality in AI Deployments
Data quality covers accuracy, knowledge gaps, duplication, and other issues that affect how useful data is. Quality control tools traditionally used for structured data are now being extended to unstructured data, a step that is essential for deploying enterprise AI successfully.
Anomalo, a data quality platform vendor, has been developing its platform for structured data for several years. The company recently announced an expansion of the platform to better support unstructured data quality monitoring.
“We believe that by eliminating data quality issues, we can accelerate at least 30% of gen AI deployments,” said Elliot Shmukler, co-founder and CEO of Anomalo.
Accelerating AI Projects with Quality Unstructured Data
The key challenges in AI deployments are poor data quality, large data gaps, and enterprise data that is not ready for gen AI consumption. Anomalo believes its unstructured monitoring could accelerate enterprise gen AI projects by as much as a year, because teams can quickly understand, profile, and ultimately curate the data these projects rely on.
In addition to the product update, Anomalo announced a $10 million extension of its Series B funding, bringing the total round up to $82 million.
Unique Challenges of Unstructured Data for AI
Unstructured content poses unique challenges for AI applications. As Shmukler pointed out, unstructured data could contain any type of information, including confidential data. The Anomalo platform addresses these challenges by adding structured metadata to unstructured documents, enabling organizations to better understand and control their data before it reaches AI models.
Key features of the Anomalo software for unstructured data quality include:
- Custom issue definition
- Support for private cloud models
- Metadata tagging
- Redaction (an upcoming feature)
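To make the idea of attaching structured metadata to unstructured documents concrete, here is a minimal Python sketch. It is not Anomalo's implementation or API. The names `DocMetadata` and `tag_documents`, the two regular expressions, and the issue labels are all hypothetical, chosen only to illustrate how each document could be tagged with length, emptiness, duplication, and possible confidential content before it reaches a model.

```python
import hashlib
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only; real detection of
# confidential data would need far more careful rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_LIKE_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class DocMetadata:
    """Structured metadata attached to one unstructured document."""
    doc_id: str
    char_count: int
    is_empty: bool
    is_duplicate: bool
    issues: list[str] = field(default_factory=list)


def tag_documents(docs: dict[str, str]) -> list[DocMetadata]:
    """Return one metadata record per document, in input order."""
    seen_hashes: set[str] = set()
    results: list[DocMetadata] = []
    for doc_id, text in docs.items():
        # Collapse whitespace and lowercase so trivially different
        # copies of the same text hash to the same value.
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        is_duplicate = digest in seen_hashes
        seen_hashes.add(digest)

        issues: list[str] = []
        if EMAIL_RE.search(text):
            issues.append("contains_email")
        if SSN_LIKE_RE.search(text):
            issues.append("contains_ssn_like_number")

        results.append(
            DocMetadata(
                doc_id=doc_id,
                char_count=len(text),
                is_empty=not normalized,
                is_duplicate=is_duplicate,
                issues=issues,
            )
        )
    return results
```

A pipeline built along these lines could then filter on the metadata, for example skipping any record where `is_duplicate` is true or `issues` is non-empty, rather than inspecting raw text at every downstream step.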
Positioning in the Unstructured Data Quality Market
Anomalo isn’t alone in the unstructured data quality market. Several data quality vendors, including Monte Carlo Data, Collibra, and Qlik, offer various forms of unstructured data quality technology. However, Anomalo differentiates itself by not depending on integration with, or monitoring of, the vector databases that hold the data behind a retrieval-augmented generation (RAG) workflow.
“We believe using Anomalo’s unstructured monitoring could accelerate typical gen AI projects in the Enterprise by as much as a year,” Shmukler reiterated.
Source: VentureBeat