Understanding the Data Warehouse Architecture
The Team Foundation reporting warehouse is a traditional data warehouse consisting of a relational database organized in an approximate star schema and an OLAP database built on top of the relational database. The following diagram shows the high-level architecture of the Team Foundation data warehouse and the relationships between the operational stores, the data warehouse, and the team reports.
Each tool or plug-in in Team Foundation use a relational database in Microsoft SQL Server 2005 to store the data used by the tool in its day-to-day operations. This relational database is often referred to as the operational store. The operational stores for Team Foundation include:
- Common structure databases (TfsIntegration and TfsActivityLogging)
- Work item tracking databases (TfsWorkItemTracking and TfsWorkItemTrackingAttachments)
- Source control database (TfsVersionControl)
- Team build database (TfsBuild)
- Team test database (TfsBuild)
You might also have operational stores created for third-party tools.
Like most operational stores, the schema of the relational database is designed and optimized for the online transactional processing of data. As the tool or plug-in performs an activity, it writes the latest information to the operational store. Therefore, data in the operational store is constantly changing and being updated, and all data is current.
Because each tool or plug-in has its own schema requirements and data is stored in the operational store to optimize transactional processing, the purpose of the warehouse adapter is to put the operational data into a form usable by the data warehouse. The warehouse adapter is a managed assembly that extracts the data from the operational store, transforms the data to a standardized format compatible with the warehouse, and writes the transformed data into the warehouse relational database. There is a separate adapter for each operational data store.
The warehouse adapter copies and transforms those data fields specified in either the basic warehouse configuration or in the process template used at the time a new team project is created. If you subsequently change the process template to add or delete which data fields are written to the data warehouse, these changes are detected the next time the adapter is run. The adapter runs periodically with a frequency set by the RunIntervalSeconds property. The default setting for the refresh frequency is 3,600 seconds, so give careful consideration to the appropriate refresh frequency for your installation.
It is important that data is not written from the relational database to the data cube while the relational database is itself being updated from the operational store. To avoid conflicts reading and writing data, the warehouse adapters that push and pull the data are synchronized. After the adapters have completed their calls, the cube is reprocessed.
The Warehouse Relational Database
Each tool describes its contribution to the data warehouse in an XML schema. The schema specifies the fields that are written to the relational database as dimensions, measures, and details. The schema is also mapped directly into the OLAP database.
The data in the warehouse are stored in a set of tables organized in a star schema. The central table of the star schema is called the fact table, and the related tables represent dimensions. Dimensions provide the means for disaggregating reports into smaller parts. A row in a fact table usually contains either the value of a measure or a foreign key reference to a dimension table. The row represents the current state of every item covered by the fact table. For example, the Work Item fact table has one row for every work item stored in Work Item operational store.
A dimension table stores the set of values that exist for a given dimension. Dimensions may be shared between different fact tables and cubes, and they may be referenced by a single fact table or data cube. A Person dimension, for example, will be referenced by the Work Items fact table for Assigned To, Opened By, Resolved By, and Closed By properties, and it will be referenced by the Change Sets fact table for the Checked In By property.
Measures are values taken from the operational data. For example, Lines of Code Changed is a measure that summarizes the number of source code changes in a changeset. Count is a special measure in that it can be implicit, as long as there is one record for every item that is counted. Measure groups combine multiple measures into a tightly coupled group. Lines of Code Changed and Total Lines of Code might participate in a single measure group.
The Warehouse OLAP Cube
Fact tables are a good source of information for reports that show the current state of affairs. However, to report on trends for data that changes over time, you need to duplicate the same data for each of the time increments that you want to report on. For example, to report on daily trends for work items or test results, the warehouse needs to retain the state of every item for each day. This allows the data cube to aggregate the measures by day. The warehouse OLAP data cube aggregates both data from the underlying star schema and time data into multidimensional structures.
Each time the data cube is processed, the data stored in the star schemas in the relational database are pulled into the OLAP cube, aggregated, and stored. The data in the cube is aggregated so that high-level reports, which would otherwise require complex processing using the star schema, are simple select statements. The cube provides a central place to obtain data for reports without having to know the schema for each operational store and without having to access each store separately.
Report Designer Reports
Report Designer is a component of Visual Studio 2005 Team System that allows you to define the Team Foundation data warehouse as a data source and then design a report interactively. Report Designer provides tabbed windows for Data, Layout, and Preview, and you can add datasets to accommodate a new report design idea, or adjust report layout based on preview results. In addition to the Data, Layout, and Preview design surfaces, Report Designer provides query builders, an Expression editor, and wizards to help you place images or step you through the process of creating a simple report.
Team Foundation integrates with Microsoft Office Excel to allow you to use Microsoft Office Excel to manage your project and produce reports. Microsoft Office Excel provides pivot tables and charts for viewing and analyzing multi-dimensional data. You can bind these pivot tables directly to the Team Foundation OLAP cube, so you can interact with the data in the cube.
Security for the Team Foundation data warehouse is defined at the database level, while security for team reports is at the team project level. The Team Foundation Server administrator determines who has access to the data in the data warehouse by granting or revoking permissions on the user’s account. By default, write access to the warehouse is restricted to a service account under which the warehouse service runs. Each tool adapter has write access to the data warehouse because it runs in this security context. Read-only access is granted by the administrator to individual users or groups of users. A user who has permission to view the data in the warehouse for a particular team project has full access to all of the data for that project. However, a user with permission to view the data for one team project cannot automatically view the data from another team project.