{"id":122,"date":"2025-02-14T16:23:50","date_gmt":"2025-02-15T00:23:50","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/udm\/?p=122"},"modified":"2025-02-17T12:23:34","modified_gmt":"2025-02-17T20:23:34","slug":"leveraging-the-unified-data-model-a-practical-example-of-data-modeling","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/udm\/leveraging-the-unified-data-model-a-practical-example-of-data-modeling\/","title":{"rendered":"Leveraging the Unified Data Model: A Practical Example of Data Modeling"},"content":{"rendered":"<h1>Introduction<\/h1>\n<p>In today\u2019s data-driven world, businesses need a structured approach to managing foundational data assets. The <strong>Unified Data Model (UDM)<\/strong> provides a scalable and governed framework for modeling key entities while keeping data assets maintainable and extensible.<\/p>\n<p>In this post, we will use a <strong>hypothetical business entity<\/strong> as an example to demonstrate how UDM effectively structures data.<\/p>\n<p>We&#8217;ll model a <strong>Base Profile<\/strong>, an <strong>Extension<\/strong>, and a <strong>Dimension<\/strong> to show how the same data assets can be reused across multiple scenarios.<\/p>\n<p>We will also explore how the UDM approach simplifies data storage, making data easier to query and shortening the time needed to build future scenarios. Finally, we will discuss how validating data at every step helps catch problems sooner and reduces the cost of potential restatements.<\/p>\n<hr \/>\n<h1>Hypothetical Business Scenario: Modeling the &#8220;Game Developer Profile&#8221;<\/h1>\n<p>Imagine we are a <strong>gaming company<\/strong> aiming to better understand our <strong>game developers<\/strong> and the challenges they encounter. 
Our goal is to ground that understanding in data.<\/p>\n<p>Our strategy involves creating a <strong>Game Developer Profile<\/strong> and segmenting the data based on various aspects, such as:<\/p>\n<ul>\n<li><strong>Region<\/strong><\/li>\n<li><strong>Age group<\/strong><\/li>\n<li><strong>Game pricing<\/strong><\/li>\n<li><strong>Customer game count<\/strong><\/li>\n<li><strong>Other relevant developer attributes<\/strong><\/li>\n<\/ul>\n<p>Let\u2019s break down how this data can be structured using <strong>Base Profiles, Extensions, and Dimensions<\/strong> to keep the model clear and easy to implement.<\/p>\n<hr \/>\n<h1>Step 1: Creating the Base Profile<\/h1>\n<p>Let&#8217;s establish a <strong>foundational profile<\/strong> for this use case. A <strong>Profile<\/strong> represents a standard business concept, such as a <strong>user<\/strong> or a <strong>purchase order<\/strong>. Most organizational data assets either define these <strong>profile entities<\/strong> directly or can be linked to them.<\/p>\n<p>Structuring data in this way:<\/p>\n<ul>\n<li><strong>Simplifies data discovery and usage<\/strong><\/li>\n<li><strong>Avoids redundancy and repetitive definitions<\/strong><\/li>\n<li><strong>Provides a scalable foundation for extensions<\/strong><\/li>\n<\/ul>\n<p>In our system, game developers are a fundamental business entity, so we model them as a <strong>Profile in UDM<\/strong>.<\/p>\n<h3>Game Developer Profile Schema<\/h3>\n<table>\n<thead>\n<tr>\n<th>Column Name<\/th>\n<th>Data Type<\/th>\n<th>Nullable<\/th>\n<th>Privacy Category<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DeveloperId<\/td>\n<td>GUID<\/td>\n<td>No<\/td>\n<td>Internal<\/td>\n<td>Unique identifier for each developer<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ul>\n<li><strong>Primary Key<\/strong>: DeveloperId<\/li>\n<li><strong>Team Responsible<\/strong>: Game Analytics Team<\/li>\n<li><strong>Business Context<\/strong>: This dataset tracks all game 
developers across all platforms.<\/li>\n<li><strong>Use Case<\/strong>: This profile lays the foundation for various extensions, such as: \n<ul>\n<li><strong>Developer financial performance analysis<\/strong><\/li>\n<li><strong>Engagement analytics<\/strong><\/li>\n<li><strong>User behavior tracking<\/strong><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Note that we set the data type of <strong>DeveloperId<\/strong> to <strong>GUID<\/strong> to improve performance when joining with other data assets.<\/p>\n<h1>Step 2: Introducing the Developer Core Properties Extension<\/h1>\n<p>Let\u2019s <strong>extend<\/strong> the newly created profile with additional <strong>developer core properties<\/strong>.<\/p>\n<p>An <strong>Extension<\/strong> is a data asset that enhances a <strong>Profile<\/strong> by adding new properties without modifying the base profile definition. Extensions help <strong>capture frequently changing<\/strong> or <strong>event-driven<\/strong> data associated with the base profile.<\/p>\n<p>Here, we introduce an <strong>extension for game developers<\/strong> that holds attributes <strong>that change slowly over time<\/strong>. This approach keeps the core profile <strong>lean and efficient<\/strong>, while allowing extensions to operate <strong>independently<\/strong>. 
The extension helps answer questions like:<\/p>\n<ul>\n<li><em>&#8220;Who is the developer?&#8221;<\/em><\/li>\n<li><em>&#8220;What are their key attributes?&#8221;<\/em><\/li>\n<\/ul>\n<h2>Developer Core Properties Extension Schema<\/h2>\n<table>\n<thead>\n<tr>\n<th>Column Name<\/th>\n<th>Data Type<\/th>\n<th>Nullable<\/th>\n<th>Privacy Category<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DeveloperId<\/td>\n<td>GUID<\/td>\n<td>No<\/td>\n<td>Internal<\/td>\n<td>Unique identifier for each developer<\/td>\n<\/tr>\n<tr>\n<td>DeveloperName<\/td>\n<td>String<\/td>\n<td>No<\/td>\n<td>Public<\/td>\n<td>Name of the game developer<\/td>\n<\/tr>\n<tr>\n<td>FoundedYear<\/td>\n<td>DateTime<\/td>\n<td>Yes<\/td>\n<td>Public<\/td>\n<td>Year the company was founded<\/td>\n<\/tr>\n<tr>\n<td>CountryId<\/td>\n<td>Long<\/td>\n<td>Yes<\/td>\n<td>Public<\/td>\n<td>Foreign key linking to the Country Dimension<\/td>\n<\/tr>\n<tr>\n<td>TotalGamesPublished<\/td>\n<td>Int<\/td>\n<td>Yes<\/td>\n<td>Public<\/td>\n<td>Total number of games published by the developer<\/td>\n<\/tr>\n<tr>\n<td>PrimaryGenre<\/td>\n<td>String<\/td>\n<td>Yes<\/td>\n<td>Public<\/td>\n<td>The main game genre the developer specializes in<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ul>\n<li><strong>Associated Base Profile<\/strong>: Game Developer Profile<\/li>\n<li><strong>Join Cardinality<\/strong>: 1:1<\/li>\n<li><strong>Primary Key<\/strong>: DeveloperId<\/li>\n<li><strong>Responsible Team<\/strong>: Game Analytics Team<\/li>\n<li><strong>Business Scenario<\/strong>: Tracks developer key attributes over time.<\/li>\n<li><strong>Use Case<\/strong>: Provides insights into developer attributes and publishing activity<\/li>\n<\/ul>\n<p><div class=\"alert alert-primary\"><p class=\"alert-divider\"><i class=\"fabric-icon fabric-icon--Info\"><\/i><strong>Note<\/strong><\/p><\/p>\n<ol>\n<li>\n<p>This extension\u2019s <strong>join cardinality<\/strong> with the Game Developer Profile is 
<strong>1:1<\/strong>, meaning each developer has exactly one corresponding row.<\/p>\n<\/li>\n<li>\n<p>The extension includes <strong>CountryId<\/strong>, which links to the <strong>Country Dimension<\/strong> to ensure geographic standardization.<\/p>\n<\/li>\n<\/ol>\n<p><\/div><\/p>\n<h1>Step 3: Introducing the Country Dimension<\/h1>\n<p>Instead of storing <strong>Country<\/strong> as a free-text attribute in our profile, we normalize this data using a <strong>Dimension<\/strong>.<\/p>\n<h2>Why Use a Dimension?<\/h2>\n<ul>\n<li><strong>Ensures consistency<\/strong> across datasets.<\/li>\n<li><strong>Prevents data duplication<\/strong> and redundancy.<\/li>\n<li><strong>Optimizes performance<\/strong> by using <strong>foreign keys instead of raw text values<\/strong>.<\/li>\n<li><strong>Allows easy updates<\/strong> without affecting other datasets.<\/li>\n<\/ul>\n<p>For this use case, we <strong>link the developer\u2019s country<\/strong> to a standardized <strong>Country Dimension<\/strong>, ensuring uniformity.<\/p>\n<h3>Country Dimension Schema<\/h3>\n<table>\n<thead>\n<tr>\n<th>Column Name<\/th>\n<th>Data Type<\/th>\n<th>Nullable<\/th>\n<th>Privacy Category<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>CountryId<\/td>\n<td>Long<\/td>\n<td>No<\/td>\n<td>Internal<\/td>\n<td>Unique identifier for the country (e.g., ISO 3166 code)<\/td>\n<\/tr>\n<tr>\n<td>CountryName<\/td>\n<td>String<\/td>\n<td>No<\/td>\n<td>Public<\/td>\n<td>Full name of the country<\/td>\n<\/tr>\n<tr>\n<td>Region<\/td>\n<td>String<\/td>\n<td>Yes<\/td>\n<td>Public<\/td>\n<td>Geographic region (e.g., North America, Europe)<\/td>\n<\/tr>\n<tr>\n<td>Subregion<\/td>\n<td>String<\/td>\n<td>Yes<\/td>\n<td>Public<\/td>\n<td>More granular geographic grouping (e.g., Western Europe, Southeast Asia)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ul>\n<li><strong>Primary Key<\/strong>: CountryId<\/li>\n<li><strong>Team Responsible<\/strong>: Microsoft Sales Data 
Team<\/li>\n<li><strong>Business Scenario<\/strong>: Provides a single source of truth for geographic data.<\/li>\n<li><strong>Use Case<\/strong>: Used in reporting and analytics for geographic segmentation.<\/li>\n<\/ul>\n<h1>Step 4: Creating an Extension for Revenue Insights<\/h1>\n<p>Instead of adding <strong>revenue-related<\/strong> attributes directly to the <strong>Game Developer Profile<\/strong>, we create an <strong>Extension<\/strong> to store financial data separately.<\/p>\n<h3>Game Developer Revenue Extension Schema<\/h3>\n<table>\n<thead>\n<tr>\n<th>Column Name<\/th>\n<th>Data Type<\/th>\n<th>Nullable<\/th>\n<th>Privacy Category<\/th>\n<th>Description<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>DeveloperId<\/td>\n<td>GUID<\/td>\n<td>No<\/td>\n<td>Internal<\/td>\n<td>Foreign key linking to Game Developer Profile<\/td>\n<\/tr>\n<tr>\n<td>RevenueMonth<\/td>\n<td>String<\/td>\n<td>No<\/td>\n<td>Internal<\/td>\n<td>Reporting month (YYYY-MM)<\/td>\n<\/tr>\n<tr>\n<td>TotalRevenue<\/td>\n<td>Float<\/td>\n<td>Yes<\/td>\n<td>Internal<\/td>\n<td>Total revenue generated by the developer<\/td>\n<\/tr>\n<tr>\n<td>NumberOfTransactions<\/td>\n<td>Int<\/td>\n<td>Yes<\/td>\n<td>Internal<\/td>\n<td>Number of game purchases contributing to revenue<\/td>\n<\/tr>\n<tr>\n<td>Platform<\/td>\n<td>String<\/td>\n<td>Yes<\/td>\n<td>Internal<\/td>\n<td>The platform where revenue was generated (PC, Console, Mobile)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ul>\n<li><strong>Associated Base Profile<\/strong>: Game Developer Profile<\/li>\n<li><strong>Join Cardinality<\/strong>: 1:Many<\/li>\n<li><strong>Responsible Team<\/strong>: Game Analytics Team<\/li>\n<\/ul>\n<h1>Step 5: Querying the Structured Data<\/h1>\n<p><a href=\"http:\/\/devblogs.microsoft.com\/udm\/wp-content\/uploads\/sites\/84\/2025\/02\/LeveragingDM_Table.png\"><img decoding=\"async\" src=\"http:\/\/devblogs.microsoft.com\/udm\/wp-content\/uploads\/sites\/84\/2025\/02\/LeveragingDM_Table.png\" alt=\"Image shows the 
relations between entities under UDM\" class=\"aligncenter\" \/><\/a><\/p>\n<p>Using U-SQL, we can efficiently analyze <strong>top-earning game developers by country<\/strong>:<\/p>\n<pre><code class=\"sql\">\/\/ DeveloperName and CountryId are defined on the Developer Core Properties Extension;\n\/\/ the table name DeveloperCorePropertiesExtension is assumed here.\n@DeveloperRevenue =\n    SELECT d.DeveloperId, p.DeveloperName, c.CountryName, r.RevenueMonth, r.TotalRevenue\n    FROM GameDeveloperProfile AS d\n    INNER JOIN DeveloperCorePropertiesExtension AS p\n    ON d.DeveloperId == p.DeveloperId\n    INNER JOIN GameDeveloperRevenueExtension AS r\n    ON d.DeveloperId == r.DeveloperId\n    INNER JOIN CountryDimension AS c\n    ON p.CountryId == c.CountryId\n    WHERE r.RevenueMonth == \"2025-01\";\n\n\/\/ Write the highest-earning rows first.\nOUTPUT @DeveloperRevenue\nTO \"\/reports\/top_earning_developers_by_country.csv\"\nORDER BY TotalRevenue DESC\nUSING Outputters.Csv();\n<\/code><\/pre>\n<h1>Why use UDM for this?<\/h1>\n<p><a href=\"http:\/\/devblogs.microsoft.com\/udm\/wp-content\/uploads\/sites\/84\/2025\/02\/LeveragingDMPost.png\"><img decoding=\"async\" src=\"http:\/\/devblogs.microsoft.com\/udm\/wp-content\/uploads\/sites\/84\/2025\/02\/LeveragingDMPost.png\" alt=\"The image highlights the entire UDM data management ecosystem\" class=\"aligncenter\" \/><\/a><\/p>\n<p>By structuring our data using UDM principles, we gain:<\/p>\n<ol>\n<li><strong>Scalability<\/strong> \u2013 The <strong>Game Developer Profile<\/strong> remains lean, avoiding unnecessary updates due to frequently changing attributes.<\/li>\n<li><strong>Performance<\/strong> \u2013 Queries are more efficient since extensions allow us to store and access dynamic data separately.<\/li>\n<li><strong>Governance<\/strong> \u2013 Using a <strong>Country Dimension<\/strong> ensures that geographic data is standardized and centrally managed.<\/li>\n<li><strong>Consistency<\/strong> \u2013 Referencing the shared <strong>Country Dimension<\/strong> avoids data duplication and prevents inconsistencies in country names across different datasets.<\/li>\n<li><strong>Easy Maintenance<\/strong> \u2013 Since each extension has its own validations, issues are easy to isolate and fix.<\/li>\n<\/ol>\n<p>Would you structure your business data differently? 
<strong>Share your thoughts in the comments<\/strong>!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction In today\u2019s data-driven world, businesses need a structured approach to managing foundational data assets. The Unified Data Model (UDM) provides a scalable and governed framework for modeling key entities while keeping data assets maintainable and extensible. In this post, we will use a hypothetical business entity as an example to demonstrate how UDM effectively [&hellip;]<\/p>\n","protected":false},"author":171116,"featured_media":126,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-122","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-udm"],"acf":[],"blog_post_summary":"<p>Introduction In today\u2019s data-driven world, businesses need a structured approach to managing foundational data assets. The Unified Data Model (UDM) provides a scalable and governed framework for modeling key entities while keeping data assets maintainable and extensible. 
In this post, we will use a hypothetical business entity as an example to demonstrate how UDM effectively [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/posts\/122","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/users\/171116"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/comments?post=122"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/posts\/122\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/media\/126"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/media?parent=122"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/categories?post=122"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/udm\/wp-json\/wp\/v2\/tags?post=122"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}