Beyond the hype: Part 1, How trustworthy AI empowers US Government agencies

Kent Cunningham

Large language models (LLMs) have a revolutionary ability to transform mission workflows for US Government agencies, and adoption is now starting to accelerate. As agencies begin to leverage LLMs, they unlock capabilities that promise to streamline processes and drive innovation forward. That momentum also means there is more to be done in the near term to support the deployment of LLM applications.

With this great potential comes great responsibility. From a national security perspective, LLMs are essential for readiness and resilience. Adversaries have already invested heavily in this technology, employing AI to power disruptive campaigns and support innovations to further their national goals. This presents risk to our national interests, from more effective cyberattacks to more persuasive disinformation and propaganda campaigns.

US Government agencies are leveraging the latest LLMs running on Azure, such as those available through the Azure OpenAI Service and Meta Llama 3, to unlock essential tasks including content and code creation, image analysis, and new use cases still being imagined. Achieving this requires a solid partnership between trusted industry providers and users across government—the human decision-makers who become empowered to understand, act, and innovate faster.

Overcoming the obstacles

From a “big picture” perspective, the use of LLMs seems simple enough: ask a natural language question and get a response, whether the request is to summarize hundreds of documents or create code for new applications. In practice, getting past the roadblocks to generative AI adoption means addressing legitimate concerns with a strategy that encompasses four key elements:

Building trust and user acceptance is essential

Engaging users is crucial for generative AI to have an impact on mission effectiveness. This goes beyond training; users need to trust the output and see the AI system as both trustworthy and tangibly beneficial. The key is to highlight how LLMs can transform work by streamlining processes, automating repetitive tasks with a human in the loop, and providing real-time insights.

The more familiar users become with the tools, the more likely they will find ways to embed them into their daily workflows. That’s why Microsoft regularly engages with agencies to help them understand the limitations and capabilities of current LLMs—which helps to build trust in AI-powered solutions. This blog is part of a series on how Microsoft is working with the government to establish a foundation of trust around the Azure OpenAI Service. The series will include advanced concepts for building Copilots and recommended tools and strategies to secure the AI data estate.

Creating a culture of trust in AI

Today we are at a crossroads: both resistance to and enthusiasm for AI depend on the safety, security, and trustworthiness of platforms and tools. Transparency is the solution: being upfront about how generative AI technologies work, from the standpoint of accuracy as well as security and privacy, is essential to gaining users’ trust.

A look at how your data is managed through Azure OpenAI Service highlights Microsoft’s commitment to building AI systems that are safe, secure, and trustworthy by design. Microsoft hosts the OpenAI models in its own Azure environment, and the service does NOT interact with any services operated by OpenAI (e.g. ChatGPT or the OpenAI API). Your data is your data, and keeping your data safe is a fundamental aspect of the Azure OpenAI Service. Your prompts (inputs) and completions (outputs), your embeddings, and your training data:

  • are NOT available to other customers.
  • are NOT available to OpenAI.
  • are NOT used to improve OpenAI models.
  • are NOT used to improve any Microsoft or 3rd party products or services.
  • are NOT used to automatically improve Azure OpenAI models for use in your resource (the models are stateless unless you explicitly fine-tune them with your training data).
  • Your fine-tuned Azure OpenAI models are available exclusively for your use.

Choosing the right technology matters

Adopting generative AI is an organization-wide undertaking. Agencies deploying the latest applications leveraging LLMs must be prepared to meet governance and compliance requirements. At the same time, the number of LLMs available across cloud providers can complicate the choice of both platform and model.

Microsoft’s investments in AI capabilities set our offerings for government apart. Azure OpenAI Service delivers the speed, scalability, and security that modern missions demand. This secure platform enables agency users to take advantage of OpenAI’s industry-leading LLMs and capabilities, along with Copilots that act as smart assistants to accelerate data-heavy and time-consuming processes. Microsoft continues to invest in tools to adopt, extend, and build copilot experiences through reference architectures and templates. 

Use cases are emerging from experimentation

A technology that appears able to do “everything” needs to deliver practical results that move the mission forward. As adoption becomes more prevalent, common use cases such as content generation, summarization, and code generation become more evident.

Advancing this experimentation has highlighted how customers are leveraging Azure AI Search and the Azure OpenAI Service together to extract advanced insights from their data. Those insights stay securely within the customer’s organization. A technique called retrieval augmented generation (RAG) keeps proprietary data secure while taking advantage of Azure OpenAI’s speed and analytical ability.
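The RAG pattern described above can be sketched in a few lines. This is a minimal illustration, not Microsoft's implementation: the toy keyword-overlap retriever stands in for an Azure AI Search query, and the assembled prompt is what would be sent to an Azure OpenAI chat deployment. The document names and scoring logic are illustrative only.

```python
# Minimal sketch of the RAG pattern: retrieve relevant passages, then ground
# the model's prompt in them. In an Azure deployment, retrieve() would be an
# Azure AI Search query and the prompt would go to an Azure OpenAI model.

def retrieve(query: str, documents: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query (toy scorer)."""
    terms = set(query.lower().split())
    scored = sorted(
        documents.items(),
        key=lambda kv: len(terms & set(kv[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble a grounded prompt: instructions, retrieved context, question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Illustrative documents; proprietary data never leaves the retrieval store.
docs = {
    "policy.txt": "Agency data must remain within the accredited boundary.",
    "faq.txt": "Prompts and completions are not used to train the models.",
    "misc.txt": "The cafeteria opens at seven.",
}
question = "Are prompts used to train the models?"
prompt = build_prompt(question, retrieve(question, docs))
```

Because only the retrieved passages enter the prompt, the model answers from the organization's own data without that data being used for training.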


Figure 1. Retrieval Augmented Generation (RAG) utilizing AI Search

Advanced RAG adds a pre-retrieval stage that improves the quality of the data retrieved, using techniques such as query rewriting, and a post-retrieval stage that optimizes the retrieved information and enhances the prompts.
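The two advanced RAG stages can be illustrated with placeholder logic. The acronym table, filler-word list, and term-overlap reranker below are our own simplifications; production systems typically use an LLM to rewrite queries and a trained reranking model to score passages.

```python
# Sketch of the advanced RAG stages. Pre-retrieval: rewrite the raw query
# (expand acronyms, drop filler) so the search step retrieves better passages.
# Post-retrieval: rerank and trim retrieved passages before prompting.

# Illustrative acronym expansions (hypothetical, not a real Azure feature).
ACRONYMS = {"ato": "authority to operate", "rag": "retrieval augmented generation"}

def rewrite_query(query: str) -> str:
    """Pre-retrieval: expand known acronyms and drop filler words."""
    filler = {"please", "kindly", "just"}
    words = [ACRONYMS.get(w.lower(), w) for w in query.split() if w.lower() not in filler]
    return " ".join(words)

def rerank(passages: list[str], query: str, top_k: int = 2) -> list[str]:
    """Post-retrieval: keep the passages sharing the most terms with the query."""
    terms = set(query.lower().split())
    return sorted(
        passages,
        key=lambda p: len(terms & set(p.lower().split())),
        reverse=True,
    )[:top_k]

rewritten = rewrite_query("Please explain RAG")
# The rewritten query goes to the search index; rerank() then filters results.
```

The benefit is that the generation model only ever sees a small, high-relevance slice of the corpus, which improves both answer quality and prompt-size economics.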


Figure 2. Integrating Advanced RAG into your GenAI workflows

The optimal way to build confidence in a particular environment’s ability to solve a specific challenge is almost always to try it. Fortunately, it’s easy to create secure test areas within Azure Government where fresh ideas and “what if?” scenarios can run without disturbing current workflows—potentially uncovering a use case that delivers a huge advantage for national security. 

Streamlining compliance requirements and regulations

President Biden’s Executive Order lays out a vision and requirements for using AI effectively, safely, and fairly, and OMB Memo M-24-10 outlines a structured approach for doing so. On top of already stringent Zero Trust requirements, deploying AI-powered applications for regulated customers requires additional compliance and accreditation processes to ensure quality, safety, and adherence to standards.

Azure OpenAI Service is approved within the FedRAMP High authorization for Azure Commercial, enabling agencies with stringent security and compliance requirements to utilize this industry-leading generative AI service at the unclassified level. Azure OpenAI Service is also now available in Azure Government. Microsoft has submitted Azure OpenAI Service to the Joint Authorization Board (JAB) as a service within the FedRAMP High authorization for Azure Government, and the service will also be submitted for additional authorization at Department of Defense (DoD) Impact Levels (IL) 4 and 5. Microsoft will continue to make advanced AI capabilities available at the highest classification levels in the coming months.

Our use of responsible AI practices and commitment to transparency helps organizations comply with policies meant to protect systems, data, and people. This gives agencies an advantage, as our technologies are built to the stringent requirements to support secure workloads. Learn more about how we are supporting federal agencies in their implementation of the Executive Order.

Building trust in LLMs requires straight talk

The human element is essential for users to accept—and ultimately embrace—any new technology. Deploying LLM capabilities with disjointed frameworks and tools can lead to a lack of transparency about how decisions are derived from a model. Building on other responsible AI (RAI) tools in Azure, the new capabilities will make it easier for developers to monitor and mitigate the challenges associated with deploying generative AI responsibly (hallucinations, jailbreaks, and so on) while helping them improve the safety and quality of AI outputs.

These features are also unique in the market, demonstrating Microsoft’s innovation and investment in responsible AI tooling, and include:

  • Prompt Shields to detect and block prompt injection attacks, including a new model for identifying indirect prompt attacks before they impact your model, available in preview.
  • Groundedness detection to detect and block “hallucinations” in model outputs, coming soon.
  • Safety system messages to steer your model’s behavior toward safe, responsible outputs, coming soon.
  • Safety evaluations to assess an application’s vulnerability to jailbreak attacks and to generating harmful content, available in preview.
  • Risk and safety monitoring to understand what model inputs, outputs, and end users are triggering content filters to inform mitigations, coming soon.
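To make the safety system message concept concrete, here is a hedged sketch of how one might be assembled into a chat request. The wording of the message and the helper function are our own illustration, not Microsoft's published template; the `{"role": ..., "content": ...}` message shape matches the standard chat completions format.

```python
# Illustrative safety system message (wording is ours, not an official
# Microsoft template): guidance prepended to the system prompt to steer
# the model toward safe, grounded outputs.

SAFETY_SYSTEM_MESSAGE = (
    "You must not generate content that could cause harm. "
    "If a request conflicts with these rules, refuse and explain why. "
    "Answer only from the provided context; if the answer is not in the "
    "context, say you do not know rather than guessing."
)

def build_messages(user_prompt: str, task_instructions: str) -> list[dict]:
    """Combine the safety message with task instructions and the user turn."""
    return [
        {"role": "system", "content": f"{SAFETY_SYSTEM_MESSAGE}\n\n{task_instructions}"},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Summarize the incident report.",
                          "You summarize documents for analysts.")
# messages would be sent to the chat completions endpoint, with Prompt
# Shields and content filters applied around the call by the service.
```

Keeping the safety guidance in the system role, separate from user input, is what lets the service's filters and the model distinguish trusted instructions from untrusted content.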

Adopting an appropriate level of human oversight, backed by automated monitoring, ensures the accuracy, relevance, and robustness of LLM applications. On an individual level, users can be trained to verify the output before accepting it as ready for use.

That last point is crucial. Humans must always be in the loop to validate the results of an LLM query and determine what action should be taken. Even when processes are automated, such as a response to a cyberattack, a human needs to be involved to ensure that the right steps are followed. AI should be considered a capable assistant, not a replacement for human insight and decision making.

Trustworthy approaches for deploying Azure OpenAI Service

Microsoft continues to make significant investments to help guard against abuse and unintended harm, which includes, but is not limited to, requiring applicants to show well-defined use cases, ensuring our engineering practices follow the responsible AI approach, building content filters to support customer needs, and providing responsible AI implementation guidance to onboarded customers. Microsoft also takes a principled approach to staying ahead of threat actors.

“Trustworthy AI” means AI that empowers users to innovate with confidence. Microsoft enables organizations across the government to easily implement responsible and trustworthy AI on Azure.

Taking advantage of the Azure OpenAI Service to accelerate mission workflows requires this secure, scalable, agile foundation—leading to breakthroughs that can protect and improve lives, augment readiness, and prepare for the challenges of the future.

Our AI strategy directly supports national objectives, driving innovation, enhancing security, and boosting efficiency. Get the latest news and announcements on AI for government delivered to your inbox at

