Introduction
Our team recently partnered with a customer who was stuck in a frustrating loop. Each new generative AI (GenAI) project required a complete setup from scratch. Imagine the hassle of setting up pipelines (both DevSecOps and LLMOps), managing dependencies, connecting to resources, and configuring tools for Azure resources, all over again for each project. It was like reinventing the wheel every single time.
But the challenges didn’t stop there. They also faced duplication in approval boards and security reviews, which added layers of bureaucracy and delays. Each project came with its own technological stack, leading to a steep learning curve and increased maintenance efforts over time. It was a recipe for inefficiency and frustration.
That’s where our work with the customer came in. By creating a reusable template together, we provided a consistent and efficient way to set up new projects. The template included all the necessary components, from tool configurations to pipeline setups, ensuring that everything was ready to go from day one. The template-starter project further streamlined the process by automating the initial setup, allowing the customer to focus on innovation rather than configuration.
With that approach, the customer could avoid the repetitive tasks and focus on what truly mattered: building and deploying their GenAI projects. This not only saved time but also reduced the risk of errors and improved the overall quality of their work. Plus, it minimised the need for different teams to be involved in the setup process, making the entire workflow more efficient.
Process
Identifying the problem
In our projects, we always kick things off by diving deep into the customer’s needs and the challenges they face (this process was led by our lovely TPM Virahn Walia and TL Avishay Balter). In a brainstorming session, we thoroughly analyse the various users of the product we are about to build, so that we can pinpoint their main pain points and the gains they would get from the solution. Once we have those, we can distil them into the key requirements that need to be addressed and prioritise them.
Challenges
Enterprise organisations face a variety of challenges when developing and managing GenAI projects. First, they are dealing with a technology that is constantly evolving, which means keeping solutions up to date and maintaining them can be quite the task. Then there’s the issue of having to start from scratch for each project, which is not only time-consuming but also highly inefficient. Add to that the dependencies between different teams, which complicates the provisioning of resources to Azure. And let’s not forget the responsibility factor; since GenAI is a new and evolving technology, customers need to be able to trust it.
The nature of GenAI technologies and the lack of standardisation in this space make it hard for enterprises to control technology sprawl and to avoid having each project reinvent everything or go through the entire approval stack, such as responsible AI, security, and other review boards. For example, in our case, the customer had to go through an approval process to deploy resources to Azure for each GenAI project. Although the architecture stayed similar from one project to the next, every project still spent the same time on approvals and deployments. Standardising the architecture saves time and makes it easier to control the technology used.
By understanding these challenges, the different teams involved, and the various users of the product, we can begin to identify the requirements that need to be addressed. The challenges include:
- Constant Evolution of GenAI: Keeping up with the rapid advancements in GenAI technology can be daunting. Ensuring that the solution remains current and effective requires continuous updates and maintenance.
- Code Sharing Between Projects: Starting from scratch for each project leads to duplication of effort and inefficiency. Reusing code and components can significantly streamline the development process.
- Dependent Teams: The interdependencies between different teams can complicate the provisioning of resources and coordination. Efficient collaboration and clear communication are essential to overcome these hurdles. Additionally, removing these dependencies can make the process faster and more efficient.
- Trust and Compliance: As GenAI is a relatively new and evolving technology, building trust and ensuring compliance with industry standards and regulations is crucial. Customers need to have confidence in the reliability and security of the solution.
By addressing these challenges, we can create a more efficient, reliable, and user-friendly environment for developing and managing GenAI projects.
Requirements
To tackle the challenges, we identified several key requirements. First and foremost, we needed a scalable and reusable infrastructure that could be easily maintained and updated. Connecting to Azure resources and configuring tools in Azure had to be a breeze and work out of the box. We also aimed to save time and effort by reducing team dependencies. Finally, it was crucial to have a quick and easy way to get a new project’s infrastructure up and running.
Designing a solution
With all the requirements in mind, we set out to design a solution that would meet the customer’s needs. We realised that we needed to break the project down into smaller, logical pieces that would each be easy to maintain and update. This modular approach allowed us to keep business logic separated from technical logic.
That is why we decided to have a separation between:
- Template – the infrastructure for the GenAI projects.
- Starter – Wizard to configure a new GenAI project.
This design was done by Liam Moat, one of our talented team members.
As shown in the diagram above, we divided the project into four main parts:
Reference Architecture
This section contains the reusable architecture that can be applied to different projects. The architecture can be implemented in various forms, such as with Terraform or an ARM template. The goal is to have an architecture that has been pre-approved by the enterprise’s review processes and that can be used to deploy the infrastructure.
Our reference architecture serves as a blueprint, ensuring consistency and efficiency across projects. By leveraging this architecture, teams can quickly set up their infrastructure and save time and effort on creating the entire setup for each individual project. It’s like having a ready-made recipe that guarantees a successful outcome.
Template
The template is the magic ingredient that contains all the reusable components needed to get your project up and running. Within the template’s code, you’ll find the configuration for the tools, the connections to Azure resources, and the pipeline setups. These components are like the ultimate toolkit, ready to be reused in any project that forks from the template repository.
To ensure that tinkering with the template doesn’t break anything, we’ve created a promptflow project. This project runs the pipelines in Azure Machine Learning (AML) and in a web app (using a Docker container), verifying that the template still works as expected whenever it changes.
The template consists of the following components:
- Pipelines: Workflows that automate the process of building, testing, and deploying the code.
- GenAI Project: A simple project that works with the template and validates it.
- Tools: Configurations for connecting to Azure resources, setting up tools, and managing dependencies.
Pipelines:
Our pipelines were designed with reusability in mind. By organising scripts, GitHub Actions, and Makefiles into folders, we can easily reuse components across different pipelines.
Here’s a breakdown of the main template code pipelines:
- PR Validation Pipeline: This pipeline checks the validity of the code before merging it into the main branch by running tests.
- CI Pipeline: This one builds the code, runs the tests, and creates the Docker image.
- CD Pipeline: This pipeline handles code deployment. We had two deployment targets: either to Azure Machine Learning (AML) or a web app using Docker containers.
Our project included several pipelines, and we made sure that linters, tests (both unit and integration tests were needed in our scenario), and validations were part of the process. For instance, in one pipeline we used a PR title checker action to ensure that our PR titles followed the naming conventions described in the Conventional Commits specification. This made life easier for reviewers, as they could quickly get an idea of the changes being introduced. It also simplified additional automations, such as automated releases, because the next software version could be inferred from the PR titles, which indicate whether a major or minor version update is needed (shout out to Frances Tibble for this brilliant idea).
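To make the PR title idea concrete, here is a minimal Python sketch of how a title can be validated against the Conventional Commits header format and how a version bump can be inferred from a set of merged titles. It is an illustration only, not the action we used: the function names and the list of allowed types are assumptions, and the patch bump for fix titles follows the Conventional Commits specification rather than anything specific to this project.

```python
import re

# Assumed set of allowed types (a commonly used list, not the customer's configuration).
ALLOWED_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# Header shape: type, optional "(scope)", optional "!", then ": description".
TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<description>\S.*)$"
)


def parse_pr_title(title):
    """Return the parsed parts of a PR title, or None if it breaks the convention."""
    match = TITLE_PATTERN.match(title)
    if match is None or match.group("type") not in ALLOWED_TYPES:
        return None
    return match.groupdict()


def infer_bump(titles):
    """Infer the next semantic version bump from a list of merged PR titles."""
    bump = "none"
    for title in titles:
        parts = parse_pr_title(title)
        if parts is None:
            continue  # Invalid titles are expected to be rejected by the PR check.
        if parts["breaking"]:
            return "major"  # A "!" marks a breaking change.
        if parts["type"] == "feat":
            bump = "minor"
        elif parts["type"] == "fix" and bump == "none":
            bump = "patch"
    return bump
```

For example, under these assumptions, a release containing "fix: typo" and "feat: add index" would be treated as a minor update, while a single "feat!: ..." title would make it a major one.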
We’ll dive into the AML pipeline in the next section – GenAI project.
GenAI Project
In our GenAI project, we set up a promptflow project with pipelines running either in Azure Machine Learning (AML) or a web app (or both) to ensure the template works seamlessly. One of our goals was to create a document processing flow that would provide users with an easy way to process documents and create a search index. As the document processing pipeline is only an example, we simply chunked PDF files and created a search index for them.
This diagram was created by Martyna Marcinkowska, one of our talented team members.
As part of this initiative, we developed the following template AML pipelines:
- AML CI Pipeline: This pipeline sets up the environment in AML, creates the search index, and validates that everything is functioning correctly.
- AML CD Pipeline: This one handles creating or updating the compute instance, creating the index, and processing documents (loading, chunking, etc.).
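The chunking step in the document processing flow can be sketched as follows. This is a simplified illustration rather than the code from the template: it assumes the text has already been extracted from the PDF, it splits on a fixed character count with overlap, and the function names and record fields (id, source, content) are hypothetical.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into chunks of at most chunk_size characters that overlap their neighbours."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    chunks = []
    step = chunk_size - overlap  # How far the window moves each time.
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The current chunk already reaches the end of the text.
    return chunks


def to_index_documents(source_name, chunks):
    """Shape chunks as simple records that an indexing step could upload."""
    return [
        {"id": f"{source_name}-{i}", "source": source_name, "content": chunk}
        for i, chunk in enumerate(chunks)
    ]
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which is a common choice when the chunks are later retrieved from a search index.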
Tools
The configuration of the tools needed to run the project is a crucial part of the template. For instance, we had a configuration for connecting to an Azure SQL server. When someone uses the template, they can easily set up their SQL server, connect to it, and integrate their code with it. This is useful when your flow needs to read metadata or stored data.
This seamless configuration ensures that users can hit the ground running without getting bogged down in setup details. It’s like having a wizard by your side, guiding you through the process and making sure everything is configured correctly.
The goal of this setup is to foster cross-team collaboration. By making the configuration process straightforward and reusable, other teams can contribute back to the product’s reusable components and leverage them in future projects with minimal effort. This not only enhances efficiency but also promotes a culture of sharing and continuous improvement.
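As an illustration of what a reusable tool configuration for the Azure SQL example might look like, here is a small Python sketch. It is not the template's actual code: the SqlSettings class, the SQL_SERVER and SQL_DATABASE variable names, and the choice of managed identity authentication are all assumptions made for the example. The connection string keywords themselves are those of the Microsoft ODBC Driver for SQL Server.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlSettings:
    """Hypothetical settings object for an Azure SQL connection."""

    server: str    # Fully qualified server name, supplied by the project.
    database: str
    driver: str = "ODBC Driver 18 for SQL Server"

    @classmethod
    def from_env(cls, env=None):
        """Read settings from environment variables (names are assumptions)."""
        env = os.environ if env is None else env
        missing = [name for name in ("SQL_SERVER", "SQL_DATABASE") if not env.get(name)]
        if missing:
            raise KeyError(f"Missing required settings: {', '.join(missing)}")
        return cls(server=env["SQL_SERVER"], database=env["SQL_DATABASE"])

    def connection_string(self):
        """Build an ODBC connection string; managed identity auth is an assumption."""
        return (
            f"Driver={{{self.driver}}};"
            f"Server=tcp:{self.server},1433;"
            f"Database={self.database};"
            "Encrypt=yes;TrustServerCertificate=no;"
            "Authentication=ActiveDirectoryMsi;"
        )
```

Keeping the settings in one small object like this means a new project only has to supply its own values, and a missing value fails early with a clear message instead of surfacing later as a connection error.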
Starter
We also needed to automate the process of turning the template into a fully-fledged project, so that all projects not only share the upstream codebase and pipelines but also conform to the secure baseline configuration required by the enterprise organisation, such as branch protection and the use of GitHub secrets to store sensitive data. To achieve this, we created another project called template-starter. This project uses the GitHub API, the GitHub CLI, and the GitHub provider for Terraform to create a new project from the template repository and to align it with the desired, compliant configuration.
Here’s how it works: the template-starter receives the new project name and configuration, then it automagically replaces those in the template code and creates a new repository with the new project code that just works out of the box. It’s like having a project kickoff wizard that takes care of all the initial setup, so you can focus on the fun part: coding and building your project.
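The replacement step can be pictured with a short Python sketch. This is a hedged illustration of the idea, not the template-starter implementation (which relies on the GitHub API, the GitHub CLI, and Terraform): the render_template function and the __KEY__ placeholder style are assumptions chosen for the example.

```python
from pathlib import Path


def render_template(template_dir, output_dir, values):
    """Copy template_dir to output_dir, replacing __KEY__ placeholders in text files."""
    template_dir, output_dir = Path(template_dir), Path(output_dir)
    written = []
    for source in sorted(template_dir.rglob("*")):
        if not source.is_file():
            continue
        target = output_dir / source.relative_to(template_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        try:
            content = source.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            # Not UTF-8 text (for example an image): copy the bytes unchanged.
            target.write_bytes(source.read_bytes())
            written.append(target)
            continue
        for key, value in values.items():
            content = content.replace(f"__{key}__", value)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written
```

A call such as render_template("template", "my-project", {"PROJECT_NAME": "my-project"}) would produce a copy of the template tree with every __PROJECT_NAME__ occurrence replaced. A distinctive placeholder style is used here so that shell variables and similar syntax already present in pipeline scripts are left untouched.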
Project
The project is a fork that was created from the template repository using the template-starter project. Because the starter takes care of all the initial setup, including replacing the project name and configuration in the template code, you can focus on building and innovating.
This new project inherits all the reusable components and configurations from the template, ensuring a smooth and efficient start. With everything set up, you can dive straight into development, knowing that the foundational elements are solid and ready to support your work.
How does this all come together?
- The user leverages the template-starter to deploy the infrastructure into Azure. This initial step sets up the necessary Azure infrastructure, including key vaults, service principals, and other configurations.
- The template-starter also creates a new project repository by cloning the template repository.
- The template-starter makes initial, project-specific code augmentations and sets up the configuration the project needs to work with the provisioned Azure infrastructure.
- This new GenAI project will inherit all the reusable components, pipelines, and tools from the template, leveraging the Azure infrastructure configuration. This means that all the foundational elements are already in place, allowing the user to focus on the specifics of their project without worrying about the setup.
Finally, the user can start developing their project with confidence, knowing that the foundational elements are solid and ready to support their work. This streamlined process not only saves time but also ensures consistency and reliability across projects.
Conclusion
The Power of Reusability!
In conclusion, our journey through the template, pipelines, tools, and reference architecture highlights the incredible power of reusability. By leveraging these components, we can streamline our development process, ensure consistency, and focus on innovation. The template-starter project automates the initial setup, allowing us to hit the ground running with each new project and the business to show results faster.
Whether it’s configuring tools, setting up pipelines, or deploying infrastructure, our approach ensures that we have a solid foundation to build upon. It saves our customers valuable time by minimising the need for different teams to be involved in the setup process. Moreover, it also reduces the risk of errors and enhances the overall quality of our projects.
As we continue to refine and expand our reusable components, we look forward to even greater efficiency and success in our future endeavours. Happy coding!
Acknowledgements
As building this was a great team effort, we would like to extend our heartfelt thanks to the amazing wizards who made it all possible:
The feature image was generated using Bing Image Creator. Terms can be found here.