September 27th, 2023

Coding with Customer: A Story of Building a Federated Data Catalog with Microsoft Purview

Introduction

As a dev crew of the Industry Solutions Engineering group inside Microsoft, we code with customers and partners. We want to share a great experience we had with a media group in Europe, from both a human and a technical perspective.

The way we operate is to code with our customers so that they can have a deeper understanding of the solution and requirements. This also affords us an opportunity to better understand, day-to-day, the relationship between the code and the business. Another important benefit is that the team we code with can take over to move the project forward after we disengage.

Our goal is to bootstrap the project by spending a few months with customer engineers. For the engineers, it’s a lot easier to move on with something they contributed heavily to than to take over legacy code they might have shaped differently. Coding with our customers also allows us to meet great people. Here is a story that shows how this can go!

Problem statement

The enterprise we’re working with has many subsidiaries (hundreds!). Each subsidiary has data. Some of them have a data catalog; others want to start using one. How can the enterprise get a holistic view of its data portfolio while each subsidiary remains in control of how it manages its own data? Such a view would enable more synergies through data sharing on specific use cases.

Designing

Microsoft’s data catalog tool is Microsoft Purview, which also covers other needs: it’s a family of data governance, risk, and compliance solutions that can help your organization govern, protect, and manage your entire data estate. The data catalog part is based on Apache Atlas.

As a dev crew, we spent some time upskilling on Purview. We are part of Microsoft, but that does not mean we know everything! We needed to better understand how the type system works, how Purview-specific APIs integrate with Apache Atlas APIs, and what search and UI capabilities are enabled when metadata is stored in different ways: managed attributes, custom attributes, and custom or built-in entity types (e.g., Google BigQuery views).
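To make the idea of a custom entity type more concrete, here is a minimal sketch of what an Apache Atlas style type definition payload can look like. It is an illustration under assumptions, not the type model used in this engagement: the type name `subsidiary_dataset`, the attribute names, and the helper `build_entity_typedef` are all hypothetical, and the sketch only builds the JSON document without calling any Purview or Atlas endpoint.

```python
import json


def build_entity_typedef(name, attributes, super_type="DataSet"):
    """Build an Atlas-style typedef payload for one custom entity type.

    `attributes` maps attribute names to Atlas type names such as "string".
    """
    attribute_defs = [
        {
            "name": attr_name,
            "typeName": type_name,
            "isOptional": True,
            "cardinality": "SINGLE",
            "isUnique": False,
            "isIndexable": False,
        }
        for attr_name, type_name in attributes.items()
    ]
    return {
        "entityDefs": [
            {
                "category": "ENTITY",
                "name": name,
                "superTypes": [super_type],
                "attributeDefs": attribute_defs,
            }
        ]
    }


# Hypothetical type and attribute names, for illustration only.
typedef = build_entity_typedef(
    "subsidiary_dataset",
    {"owning_subsidiary": "string", "source_catalog": "string"},
)
print(json.dumps(typedef, indent=2))
```

Deriving the custom type from the built-in `DataSet` super type is one common choice in Atlas; whether a custom type, a custom attribute, or a managed attribute fits best depends on the search and UI behavior you need, which is exactly what we had to investigate.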

At the same time, the vacation period was approaching, and we knew that many people in the one development team (developers from both Microsoft and the customer, scrum master, product owner, program managers) would take vacation in August or early September. July had already started. The way we normally work is to prepare an Architecture Design Session (ADS) and run it for a few days. The goal is to gain a deeper understanding of the customer context, from a technical perspective but also in terms of what we’re trying to achieve, by when, and under which constraints. Another important aspect of the ADS is to sketch together a first architecture diagram of the solution (hence the name). Because of the August vacation period, we decided to hold the session before the end of July, and since we could not be fully ready by then, we called that July meeting an ADS kickoff session!

On our way towards the ADS kickoff session

The Microsoft dev crew (based in Switzerland and France) traveled to the customer premises (in another country in Europe) and we had a productive two-day session. It took place in a big meeting room, with about 20 people present. Some of us in both companies had already spent months together on a previous project, remotely, during the pandemic, and it was a real pleasure to meet in person for the first time! Other people joined remotely, some for the whole duration, others for only an hour or two (e.g., we had a meeting with program managers from the Purview product group).

The great advantage of a two-day meeting is that you need to have dinner in between. That’s a great way to get to know each other much better through informal discussions. It makes Pull Request (PR) reviews a lot less impersonal weeks later.

The two-day meeting was a great starting point for the rest of the design phase. People knew what the common goal was, and what still had to be discovered (spikes) and decided. We had also agreed to record decisions in architecture design record documents submitted as PRs. By mid-September, most of us had had at least one week of vacation (some had up to four, we’re in Europe!) and the design was more advanced. During that period we were also preparing our dev environment, i.e., dev container definition, CI/CD pipelines, working agreements, and so on. Most of that is described in our playbook (aka.ms/cseplaybook), but the details differ with each engagement.

Implementing

We moved to the sprinting phase at the end of September, and we decided to start that new phase by hacking together for a couple of days in Paris. That gave us a chance to have a stand-up where we … stood up! By that time, two new members had joined us from the UK. They are part of another dev team, and we were happy to host them in our project for their onboarding period. Their help also turned out to be key!

Even though the implementation phase had started, we still had detailed design decisions to make, such as Event Hubs vs. Service Bus in our Sync framework. You can see what we eventually decided in the architecture center.

Architecture diagram with Event Hubs and Service Bus

After a good day of work, a team-building activity was organized. It is a good practice for improving collaboration in the team, getting to know each other outside of the usual interactions such as meetings, and having fun while working! We took on the Marshmallow challenge. What is it?

  • You create teams of 4 people.
  • What you have: 20 sticks of spaghetti, 1 meter of tape, 1 meter of string and 1 marshmallow.
  • Goal: Build a sturdy, freestanding structure with the marshmallow on top, in 18 minutes.

After 18 minutes, some structures were standing and some were unfortunately not strong enough, but everyone had a lot of fun, which is what matters most.

A structure built during the Marshmallow challenge

Not everybody could join in person in Paris, but we were in rooms equipped with a Surface Hub, which made the Teams integration a piece of cake! The problem is that it didn’t fully work for the dinner party ;-).

Let’s talk about that dinner. After a hard day’s work, it was time to reward the team with authentic French cuisine. The dinner helped strengthen our bond even further, and it let us return the favor by sharing local delicacies with the customer!

Once everybody had seen the Eiffel Tower, we continued developing remotely for a few months. We had the standard scrum ceremonies: sprint planning, stand-up and parking lot (an additional 15 minutes to discuss any burning topic raised during the stand-up), sprint reviews and demos, and retrospectives. Whenever we needed to, we had Teams meetings for Pull Request reviews or pair programming. We also relied quite a lot on asynchronous communication, through Azure DevOps (discussions in backlog items, comments in pull requests) and Teams (channels, chats).

Keeping it fun is important. The main driver is passion; we like coding! Still, it makes a lot of sense to also have fun check-ins at the beginning of some of our stand-ups. We used an automatic check-in generator like this one every Monday and Friday. It helps people get to know each other and build a trusting relationship.

As we progressed, we built a more sophisticated end-to-end automated test, in addition to the unit tests and the integration tests. The end-to-end test showed us that we had to tune the dynamics of the system so that out-of-order messages could traverse the pipeline smoothly; that was needed to achieve the eventual consistency we were looking for. We made great use of the Visual Studio Code debugger, but also of the observability we had put in place. A distributed system needs that. What we did is described in the architecture center, here.

Observability architecture
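One generic way to tolerate the out-of-order messages mentioned above is to make message handling idempotent and version-aware, so that a stale message is ignored instead of overwriting newer state. The sketch below illustrates that general technique only; it is not the mechanism used in this engagement, and `MetadataEvent`, `CatalogStore`, the `version` field, and `replay` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass(frozen=True)
class MetadataEvent:
    """Hypothetical change event for one catalog entity."""
    entity_id: str
    version: int   # assumed to increase with every change at the source
    payload: dict


@dataclass
class CatalogStore:
    """In-memory stand-in for the target catalog."""
    entities: Dict[str, MetadataEvent] = field(default_factory=dict)

    def apply(self, event: MetadataEvent) -> bool:
        """Store the event unless an equal or newer version is already known.

        Returns True when the store changed, False when the event was stale.
        """
        current = self.entities.get(event.entity_id)
        if current is not None and current.version >= event.version:
            return False
        self.entities[event.entity_id] = event
        return True


def replay(events: Iterable[MetadataEvent]) -> Dict[str, dict]:
    """Apply events in the given order and return the final payloads."""
    store = CatalogStore()
    for event in events:
        store.apply(event)
    return {key: stored.payload for key, stored in store.entities.items()}


events = [
    MetadataEvent("table-1", 1, {"description": "draft"}),
    MetadataEvent("table-1", 2, {"description": "reviewed"}),
    MetadataEvent("table-2", 1, {"description": "new"}),
]
in_order = replay(events)
out_of_order = replay(list(reversed(events)))
print(in_order == out_of_order)
```

Because the final state depends only on the highest version seen per entity, any delivery order converges to the same result, which is the property an end-to-end test for eventual consistency can check.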

Moving forward

We made good progress over the sprints. Still, we had a deadline after which the one team would split back into the Microsoft team and the customer team. We wanted to make sure the project was in good shape before we left. We also met for the handover phase, at the customer premises, for two full days.

Two things happened during the night between day one and day two: we had a one team dinner, and the system deployed itself and triggered the data pipeline successfully. We already had an MVP environment, but we wanted to test what a new deployment would look like, so we configured the CI/CD pipelines to deploy to another MVP environment. The next morning, we found that the new environment had triggered the pipelines and that metadata had been populated in its Purview instance. There were smiles on faces!

We are now back to two different teams. Our Microsoft dev crew is in the release phase, in which we write articles like the ones in the architecture center, among many other things (see Public artifacts created from this engagement). On their side, the customer part of the former one team is moving forward by expanding to other data catalog technologies and bringing more companies into the community.

We are already missing them!

Public artifacts created from this engagement

At the end of an engagement, we package and share our most interesting learnings so that they can benefit the broader Microsoft ecosystem. Here are some of the artifacts we published after this engagement. They also give a better idea of how some of the documentation pages and code samples are built.

Acknowledgements

Members of our dev crew include:

  • Abeeb Amoo
  • Adina Stoll
  • Benjamin Guinebertière
  • Frédéric Le Coquil
  • Gerard Verbrugge
  • Julien Chomarat
  • Julien Corioland
  • Lidia Pomirleanu
  • Madhav Annamraju
  • Mélanie Guittet
  • Raouf Aliouat