{"id":4497,"date":"2025-04-07T08:00:08","date_gmt":"2025-04-07T15:00:08","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/azure-sql\/?p=4497"},"modified":"2025-04-06T20:38:28","modified_gmt":"2025-04-07T03:38:28","slug":"enhancing-search-capabilities-in-sql-server-and-azure-sql-with-hybrid-search-and-rrf-re-ranking","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/azure-sql\/enhancing-search-capabilities-in-sql-server-and-azure-sql-with-hybrid-search-and-rrf-re-ranking\/","title":{"rendered":"Enhancing Search Capabilities in SQL Server and Azure SQL with Hybrid Search and RRF Re-Ranking"},"content":{"rendered":"<p data-start=\"265\" data-end=\"648\">In today&#8217;s data-driven world, delivering precise and contextually relevant search results is critical. SQL Server and Azure SQL Database now enable this through Hybrid Search\u2014a technique that combines traditional full-text search with modern vector similarity search. This allows developers to build intelligent, AI-powered search experiences directly inside the database engine.<\/p>\n<p data-start=\"265\" data-end=\"648\">Funny enough one of the key algorithms needed for implementing Hybrid Search, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Okapi_BM25\">BM25<\/a>, has been included in <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/relational-databases\/search\/limit-search-results-with-rank?view=sql-server-ver16#ranking-of-freetexttable\">SQL Server is almost forever<\/a>, without much fanfare: RAG pattern and similarity search is bringing it to the light.<\/p>\n<h2 data-start=\"650\" data-end=\"675\">What Is Hybrid Search?<\/h2>\n<p data-start=\"677\" data-end=\"727\">Hybrid Search blends two key search paradigms:<\/p>\n<ul data-start=\"729\" data-end=\"1117\">\n<li data-start=\"729\" data-end=\"903\">\n<p data-start=\"731\" data-end=\"903\">Full-Text Search (FTS): Uses the <code>FREETEXTTABLE<\/code> function, which ranks results using the BM25 algorithm. This provides high-quality keyword-based relevance scoring.<\/p>\n<\/li>\n<li data-start=\"907\" data-end=\"1117\">\n<p data-start=\"909\" data-end=\"1117\">Vector Search: Introduced via the new <code>VECTOR<\/code> data type and <code>VECTOR_DISTANCE()<\/code> function, vector search enables semantic matching of embeddings\u2014ideal for scenarios involving natural language queries.<\/p>\n<\/li>\n<\/ul>\n<p data-start=\"1119\" data-end=\"1263\">By combining both, you can return results that are relevant not only because of exact term matches but also because of their underlying meaning.<\/p>\n<h2 data-start=\"1265\" data-end=\"1287\">Re-Ranking with RRF<\/h2>\n<p data-start=\"1289\" data-end=\"1489\">Once you&#8217;ve run both search strategies, you can improve the final result list by applying Reciprocal Rank Fusion (RRF)\u2014a simple, effective method for combining ranked lists from different sources.<\/p>\n<p data-start=\"1491\" data-end=\"1654\">While SQL Server and Azure SQL don&#8217;t offer a native <code>RRF()<\/code> function, RRF is straightforward to implement manually with basic SQL logic. The formula is simple:<\/p>\n<div>\n<div dir=\"ltr\">\n<pre class=\"prettyprint language-default\"><code class=\"language-default\">1 \/ (rank + k)<\/code><\/pre>\n<\/div>\n<\/div>\n<p data-start=\"1683\" data-end=\"1826\">Where rank is the position of the item in its respective list, and <code>k<\/code> is a constant (commonly 60, though it can be tuned for your scenario, I like to set it equal to the <code>k<\/code> of requested results)<\/p>\n<p data-start=\"1828\" data-end=\"2044\">This makes it easy to write RRF logic directly in SQL, without needing any custom functions or extensions. You can assign RRF scores to each document using a SELECT statement, then sort by the final combined score.<\/p>\n<h2 data-start=\"2046\" data-end=\"2070\">Sample Implementation<\/h2>\n<p data-start=\"2072\" data-end=\"2324\">You can find a working example of Hybrid Search with manual RRF in the <a href=\"https:\/\/github.com\/Azure-Samples\/azure-sql-db-openai\/blob\/main\/vector-embeddings\/07-hybrid-search.sql\" target=\"_new\" rel=\"noopener\" data-start=\"2143\" data-end=\"2279\">Azure SQL + OpenAI samples repo<\/a>. Here&#8217;s a simplified version of the process:<\/p>\n<pre class=\"prettyprint language-sql\"><code class=\"language-sql\">-- Keyword (Fulltext) search\r\nSELECT TOP(@k)\r\n  id, \r\n  RANK() OVER (ORDER BY ft_rank DESC) AS rank,\r\nINTO\r\n  #keyword_search\r\nFROM\r\n  (\r\n    SELECT TOP(@k)\r\n      id,\r\n      ftt.[RANK] AS ft_rank,\r\n    FROM \r\n      dbo.wikipedia_articles_embeddings w\r\n    INNER JOIN \r\n      FREETEXTTABLE(dbo.wikipedia_articles_embeddings, *, @q) AS ftt ON w.id = ftt.[KEY]\r\n    ORDER BY\r\n      ft_rank DESC\r\n  ) AS freetext_documents\r\nORDER BY\r\nrank ASC\r\n\r\n-- Semantic search\r\nSELECT TOP(@k)\r\n  id, \r\n  RANK() OVER (ORDER BY cosine_distance) AS rank\r\nINTO\r\n  #semantic_search\r\nFROM (\r\n  SELECT TOP(@k)\r\n    id, \r\n    VECTOR_DISTANCE('cosine', @e, content_vector_ada2) AS cosine_distance\r\n  FROM \r\n    dbo.wikipedia_articles_embeddings w\r\n  ORDER BY\r\n    cosine_distance\r\n) AS similar_documents\r\n\r\n-- Reciprocal Rank Fusion (RRF)\r\nSELECT TOP(@k)\r\n  COALESCE(ss.id, ks.id) AS id,\r\n  ss.rank AS semantic_rank,\r\n  ks.rank AS keyword_rank,\r\n  COALESCE(1.0 \/ (@k + ss.rank), 0.0) +\r\n  COALESCE(1.0 \/ (@k + ks.rank), 0.0) AS score  \r\nFROM\r\n  #semantic_search ss\r\nFULL OUTER JOIN\r\n  #keyword_search ks ON ss.id = ks.id\r\nORDER BY \r\n  score DESC<\/code><\/pre>\n<p data-start=\"2972\" data-end=\"3143\">This approach gives more weight to results that rank highly in either method, ensuring that strong keyword matches or highly semantically similar items surface at the top.<\/p>\n<h2 data-start=\"3145\" data-end=\"3162\">Practical Benefits of Hybrid Search + RRF<\/h2>\n<p>By combining full-text and vector-based search, and applying RRF for re-ranking, you&#8217;re able to build smarter and more flexible search solutions directly within SQL Server or Azure SQL. This approach is not only technically efficient but also developer-friendly and highly adaptable to real-world use cases.<\/p>\n<p>The Reciprocal Rank Fusion (RRF) algorithm was introduced in the 2009 paper <a href=\"https:\/\/plg.uwaterloo.ca\/~gvcormac\/cormacksigir09-rrf.pdf\">Reciprocal Rank Fusion Outperforms Condorcet and Individual Rank Learning Methods by Cormack et al<\/a>. It&#8217;s a simple yet powerful method to merge ranked result lists from different sources, giving each source a fair chance to influence the final ranking.<\/p>\n<p>Here are some key reasons why this pattern is worth adopting:<\/p>\n<ul>\n<li><strong>No External Tools Required<\/strong>: Everything runs within SQL Server or Azure SQL\u2014no need to call out to a vector database or separate search engine.<\/li>\n<li><strong>Simple to Implement<\/strong>: Even without a native RRF function, the scoring logic is minimal and easy to express in SQL.<\/li>\n<li><strong>More Relevant Results<\/strong>: Hybrid search enhances both precision and recall by blending lexical and semantic signals.<\/li>\n<li><strong>Cost-Effective<\/strong>: Leveraging built-in database features reduces architecture complexity and operational costs.<\/li>\n<li><strong>Scalable<\/strong>: Because it all happens within the database engine, performance scales predictably with your data and infrastructure.<\/li>\n<\/ul>\n<h2 data-start=\"3531\" data-end=\"3548\">Final Thoughts<\/h2>\n<p data-start=\"3550\" data-end=\"3860\">Hybrid Search with BM25, vector similarity, and RRF-style re-ranking unlocks a new level of intelligence in your SQL-based search applications. With native support for full-text and vector queries, and easy-to-implement ranking logic, you can build powerful, production-ready search experiences using just SQL.<\/p>\n<p data-start=\"3862\" data-end=\"4117\">Give it a try with the <a href=\"https:\/\/github.com\/Azure-Samples\/azure-sql-db-openai\/blob\/main\/vector-embeddings\/07-hybrid-search.sql\" target=\"_new\" rel=\"noopener\" data-start=\"3885\" data-end=\"4001\">sample code<\/a> and explore how to take your application\u2019s search capabilities to the next level\u2014without ever leaving the database.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In today&#8217;s data-driven world, delivering precise and contextually relevant search results is critical. SQL Server and Azure SQL Database now enable this through Hybrid Search\u2014a technique that combines traditional full-text search with modern vector similarity search. This allows developers to build intelligent, AI-powered search experiences directly inside the database engine. Funny enough one of the [&hellip;]<\/p>\n","protected":false},"author":24720,"featured_media":4508,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[601],"tags":[657,658,591],"class_list":["post-4497","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","tag-rrf","tag-semantic-search","tag-vector-search"],"acf":[],"blog_post_summary":"<p>In today&#8217;s data-driven world, delivering precise and contextually relevant search results is critical. SQL Server and Azure SQL Database now enable this through Hybrid Search\u2014a technique that combines traditional full-text search with modern vector similarity search. This allows developers to build intelligent, AI-powered search experiences directly inside the database engine. Funny enough one of the [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/4497","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/users\/24720"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/comments?post=4497"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/4497\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media\/4508"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media?parent=4497"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/categories?post=4497"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/tags?post=4497"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}