Case Study 02

Deep Content Metadata Extraction Drives New Revenue & Engagement

Turning a CDP investment into a personalization engine by enriching every page view with AI-extracted content signals.

The Challenge

A major publisher operating dozens of media brands had just made a significant investment in a Customer Data Platform (CDP). The platform was live, but it was only as powerful as the data feeding it. Without rich, structured content signals, the CDP saw only surface-level page views and missed the behavioral context that drives personalization and monetization.

User-generated content like forum posts represented another massive, untapped data source. The publisher knew there was value buried in all of this content, but lacked the tooling to extract and operationalize it at scale.

Our Approach

We proposed a comprehensive metadata enrichment strategy: use AI to systematically extract deep content metadata from every editorial piece and user-generated content item across the portfolio, then pipe that structured data into the CDP with every page view. This would transform each visitor interaction from a simple "page viewed" event into a rich contextual signal, revealing funnel stages, topical interests, and audience segments that had never been identified or monetized.
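
To make the idea concrete, here is a minimal Python sketch of what enriching a page-view event could look like. The field names, the `ContentMetadata` class, and the `enrich_page_view` helper are illustrative assumptions, not the publisher's actual schema or any CDP vendor's API.

```python
from dataclasses import dataclass, asdict


@dataclass
class ContentMetadata:
    """Hypothetical shape of the metadata extracted for one piece of content."""
    topics: list[str]
    entities: list[str]
    sentiment: str
    content_type: str
    funnel_stage: str


def enrich_page_view(event: dict, metadata_by_url: dict[str, ContentMetadata]) -> dict:
    """Return a copy of the page-view event with content metadata attached, if known."""
    enriched = dict(event)
    metadata = metadata_by_url.get(event["url"])
    if metadata is not None:
        # The plain "page viewed" event now carries content context.
        enriched["content"] = asdict(metadata)
    return enriched


# Example data (made up for illustration).
metadata_by_url = {
    "/reviews/acme-laptop": ContentMetadata(
        topics=["Laptops"],
        entities=["Acme"],
        sentiment="positive",
        content_type="review",
        funnel_stage="consideration",
    )
}
event = {"visitor_id": "v-123", "url": "/reviews/acme-laptop"}
enriched_event = enrich_page_view(event, metadata_by_url)
```

In this sketch, a page with no extracted metadata passes through unchanged, so the event stream stays valid even before every page has been processed.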

What We Built

  1. Automated AI-driven metadata extraction pipeline that processes every article across the publisher's full portfolio of brands and pulls structured data including topics, subtopics, entities, sentiment, content type, and funnel stage indicators.
  2. Custom classification layer mapping all extracted metadata into the publisher's proprietary taxonomy and lexicon, ensuring consistency and business relevance across every brand in the portfolio.
  3. Real-time CDP integration pushing enriched metadata into the Customer Data Platform with every page view, enabling granular audience segmentation and trigger-based personalization that was previously impossible.
  4. Purpose-built vector database enabling semantically related content discovery. It powers a "smart related content" widget on each article that surfaces deeply relevant content rather than relying on simple category or tag matching, significantly increasing page views per visit.
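
As a rough illustration of two of the components above, the Python sketch below maps raw extracted topics onto a controlled taxonomy and ranks related articles by cosine similarity between embedding vectors. Everything in it is an assumption made for illustration: the taxonomy entries, the function names, and the three-dimensional toy vectors. A production system would use a real embedding model and a vector database rather than an in-memory dictionary.

```python
import math


def map_to_taxonomy(raw_topics: list[str], taxonomy: dict[str, str]) -> list[str]:
    """Map free-form extracted topics to taxonomy labels, dropping unknowns and duplicates."""
    mapped = []
    for topic in raw_topics:
        label = taxonomy.get(topic.strip().lower())
        if label is not None and label not in mapped:
            mapped.append(label)
    return mapped


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_content(article_id: str, embeddings: dict[str, list[float]], top_k: int = 3) -> list[str]:
    """Return the ids of the top_k articles most similar to article_id."""
    query = embeddings[article_id]
    scored = [
        (cosine_similarity(query, vector), other_id)
        for other_id, vector in embeddings.items()
        if other_id != article_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other_id for _, other_id in scored[:top_k]]


# Hypothetical taxonomy and toy embeddings.
taxonomy = {"notebook": "Laptops", "laptop": "Laptops", "ev": "Electric Vehicles"}
embeddings = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
    "d": [0.5, 0.5, 0.0],
}
```

Ranking by vector similarity rather than shared tags is what lets a widget surface an article that covers the same subject in different words, which tag or category matching would miss.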

The Impact

The CDP finally delivered on its promise. The publisher uncovered audience segments and funnel stages it had not previously considered, opening new monetization pathways. Engagement metrics improved measurably through more intelligent content recirculation, and the enriched data set became a foundation for personalized advertising, subscription targeting, and editorial strategy decisions across the portfolio.
