
How Tamr Data Products leverage generative AI


When the Data Product is run, Tamr’s Data Product Templates and enrichment framework handle:

  • Building appropriate prompts from the Data Product’s declarative configuration

  • Constructing and executing requests to model inference APIs using Vertex AI

  • Batching, caching, validating, and (when appropriate) retrying API requests

  • Post-processing of the API response to ensure that results respect invariants and conform to the expected schema
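As a rough illustration of the steps above, the following Python sketch builds prompts from a small declarative configuration, deduplicates and batches them, retries responses that fail validation, and returns only results that conform to the expected schema. It is not Tamr's implementation: the configuration fields, the function names, and the `call_model` callable (a stand-in for a real inference API such as one hosted on Vertex AI) are all assumptions made for this example.

```python
import json
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical declarative configuration; field names are illustrative only.
CONFIG = {
    "task": "Classify the company description into one industry.",
    "allowed_labels": ["Finance", "Healthcare", "Retail"],
    "output_field": "industry",
}


def build_prompt(config: dict, text: str) -> str:
    """Turn the declarative configuration plus one input value into a prompt."""
    field = config["output_field"]
    labels = ", ".join(config["allowed_labels"])
    return (
        f"{config['task']}\n"
        f"Allowed labels: {labels}\n"
        f'Respond with JSON of the form {{"{field}": "<label>"}}.\n'
        f"Text: {text}"
    )


def validate(config: dict, raw: str) -> dict:
    """Post-process one response: parse it and enforce the label invariant."""
    field = config["output_field"]
    parsed = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(parsed, dict) or parsed.get(field) not in config["allowed_labels"]:
        raise ValueError(f"response violates schema: {raw!r}")
    return {field: parsed[field]}


def enrich(
    config: dict,
    texts: Iterable[str],
    call_model: Callable[[List[str]], List[str]],  # assumed: prompts in, raw responses out
    batch_size: int = 2,
    max_attempts: int = 3,
) -> List[Optional[dict]]:
    """Batch, cache, validate, and retry; None marks inputs that never validated."""
    prompts = [build_prompt(config, t) for t in texts]
    cache: Dict[str, dict] = {}
    pending = list(dict.fromkeys(prompts))  # identical prompts are sent only once
    for _ in range(max_attempts):
        failed = []
        for start in range(0, len(pending), batch_size):
            batch = pending[start:start + batch_size]
            for prompt, raw in zip(batch, call_model(batch)):
                try:
                    cache[prompt] = validate(config, raw)
                except ValueError:
                    failed.append(prompt)  # retry only the responses that failed
        pending = failed
        if not pending:
            break
    return [cache.get(p) for p in prompts]
```

Passing the model call in as a plain callable keeps the sketch testable without network access; a production system would also need to distinguish transient API errors from invalid outputs and to bound cost per run.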

The attributes produced by the text extraction and classification steps can then be used inside the Data Product as features for clustering entities with Tamr’s ML record matching, or simply exported to downstream systems for consumption by end users.

Considerations and tradeoffs

While the capabilities of AI models like those provided by Google gen AI are undoubtedly impressive, adopting any new technology requires careful consideration of how it will be used in practice.

For example, for some classification use cases it may be computationally more efficient to run a traditional ML model, such as gradient boosted trees, than a foundation model. However, training and modifying such a model over time is a difficult operational task that requires significant ML sophistication. Foundation models like Gemini, on the other hand, are effective few-shot learners, and for many problems simply providing a small number of representative examples is sufficient to achieve good performance. In this case, the longer runtime of the data pipeline may be acceptable, since the cost of modifying and adapting the pipeline in response to feedback is lower than with traditional methods.
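To make the few-shot tradeoff concrete, here is a minimal Python sketch of how a handful of labeled examples can be assembled into a prompt, and how a reviewer's correction can be folded in without retraining anything. The prompt layout and the function names `few_shot_prompt` and `add_feedback` are hypothetical choices for this example, not the format Tamr or Gemini actually uses.

```python
from typing import List, Tuple

Example = Tuple[str, str]  # (input text, label)


def few_shot_prompt(instruction: str, examples: List[Example], query: str) -> str:
    """Assemble an instruction, the labeled examples, and the unlabeled query."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Label: {label}", ""]
    lines += [f"Input: {query}", "Label:"]
    return "\n".join(lines)


def add_feedback(examples: List[Example], text: str, corrected_label: str) -> List[Example]:
    """Return a new example list in which the correction replaces any
    earlier example for the same text. No model training is involved."""
    kept = [(t, label) for t, label in examples if t != text]
    return kept + [(text, corrected_label)]


examples = [
    ("Regional credit union", "Finance"),
    ("Outpatient surgery center", "Healthcare"),
]
# A reviewer flags a misclassified record; adapting the pipeline is one list update.
examples = add_feedback(examples, "Pharmacy chain", "Retail")
prompt = few_shot_prompt("Classify each company into an industry.", examples, "Online bookstore")
```

With a traditional model, the same correction would typically mean adding the record to a training set and retraining; here the adaptation cost is a data edit, paid for with a longer prompt on every request.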

Another important consideration is how much fine-grained control to give end users. Because foundation models can follow instructions written in natural language, one might assume that end users should always be able to augment prompts directly with their own modifications. However, at this stage, devising prompts that perform well for a specific problem domain is a subtle and sometimes counterintuitive task.

Tamr’s Data Products enable users to leverage the power of generative AI without requiring them to become prompt engineers; instead, users state their intention declaratively with data-centric feedback, and rely on prompts that Tamr has carefully crafted for a specific problem or data domain.

Like all ML-based solutions, foundation models still require review and feedback to verify their results. Tamr has always taken a human-guided approach to machine learning, and foundation models are no exception. In some cases, base models might not meet end users’ expectations. Fortunately, Google provides the ability to tune model behavior using reinforcement learning from human feedback (RLHF). This allows Tamr to improve the model’s performance on specific tasks and data domains within the context of a Data Product.

Elevate data management with generative AI integration

The partnership between Google and Tamr represents a significant leap forward in data management and analytics. By combining Tamr’s Data Products with Google’s cutting-edge generative AI such as Gemini, businesses can overcome the challenges of resolving records from disparate data sources and transform their data easily.

Tamr’s turnkey Data Products leverage ML-based mastering models, data cleaning, standardization services, and reference datasets, all while requiring minimal or no code configuration. This simplicity is further enhanced by the hosted SaaS environment, bringing ease of use for customers.

The integration of Google’s generative AI brings unprecedented capabilities. It enables automatic structured data extraction from unstructured text fields and allows users to perform flexible classification tasks efficiently. With the semantic information harnessed from source systems, data can be accurately resolved to real-world entities without complex ETL pipelines or extensive ML model development.

Together, Google and Tamr provide better value by simplifying data management, accelerating time-to-value, and enhancing data-driven insights. Businesses can unlock the full potential of their data, streamline data processing, and make more informed decisions, all while maintaining data quality and integrity.

Google and Tamr’s partnership brings data management and analytics to new heights, helping organizations succeed in an increasingly complex and competitive landscape that’s heavily reliant on data. Learn more about Google Cloud’s open and innovative generative AI partner ecosystem. To get started with Tamr, check out our Partner Advantage listing.

