New and improved NuGet Search is here!

Karan Nandwani


It’s been a long time coming, and today we are excited to announce the new and improved search on NuGet.org leveraging Azure Search. We want to start this post with a huge thanks to you, the NuGet community, for providing feedback. We have aggregated all feedback around search result relevance into one mega issue. We used this as the starting point and ensured the most egregious cases were fixed before we launched the side-by-side preview experience a few weeks ago. 70% of you voted that the new search is better! This was one of the key results in deciding to move forward leading up to today’s release.

Try the new search today!

  • Search on NuGet.org
  • Search using the NuGet package manager UI in Visual Studio (or an IDE of your choice)

Ensure your NuGet clients point to the V3 API endpoint for NuGet.org: https://api.nuget.org/v3/index.json
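Clients locate the search service by reading the service index at that URL. As a rough sketch, assuming the index lists resources with `@id` and `@type` fields and that search resources carry a type starting with `SearchQueryService`, a client could pick the endpoint and build a query as shown here. The sample index, its `example.invalid` URLs, and the helper names are illustrative placeholders, not live data or part of any NuGet client.

```python
from urllib.parse import urlencode

SERVICE_INDEX_URL = "https://api.nuget.org/v3/index.json"

# Illustrative stand-in for the parsed service index; the @id values are placeholders.
SAMPLE_INDEX = {
    "version": "3.0.0",
    "resources": [
        {"@id": "https://example.invalid/registration/", "@type": "RegistrationsBaseUrl"},
        {"@id": "https://example.invalid/query", "@type": "SearchQueryService"},
    ],
}


def find_search_endpoint(index: dict) -> str:
    """Return the @id of the first resource whose @type names the search service."""
    for resource in index.get("resources", []):
        if resource.get("@type", "").startswith("SearchQueryService"):
            return resource["@id"]
    raise LookupError("service index lists no SearchQueryService resource")


def build_search_url(endpoint: str, query: str, take: int = 20, prerelease: bool = False) -> str:
    """Build a search URL from the query text and paging/prerelease options."""
    params = {"q": query, "take": take, "prerelease": str(prerelease).lower()}
    return f"{endpoint}?{urlencode(params)}"
```

In a real client, the dictionary passed to `find_search_endpoint` would come from downloading and parsing the JSON document at `SERVICE_INDEX_URL`.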

Key improvements

Improved weighting for popularity – we have adjusted the weight given to package download count so that popular packages show up higher. For example, try searching for “Microsoft.Extensions”.
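To give a feel for what weighting by popularity can mean, here is a minimal sketch. The formula, the `weight` parameter, and the package names are assumptions made for illustration; this is not the scoring that NuGet.org or Azure Search actually uses. It multiplies a text-relevance score by a factor that grows with the logarithm of the download count, so popularity helps without completely drowning out a strong text match.

```python
import math


def popularity_boost(downloads: int, weight: float = 0.5) -> float:
    """Multiplier that grows with the base-10 logarithm of the download count."""
    return 1.0 + weight * math.log10(1 + downloads)


def rank(candidates: list) -> list:
    """Take (package_id, text_score, downloads) tuples and return ids, best first."""
    scored = [
        (text_score * popularity_boost(downloads), package_id)
        for package_id, text_score, downloads in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [package_id for _, package_id in scored]
```

With equal text scores the more downloaded package comes first, while a package with a much stronger text match can still outrank a more popular one.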


Multi-term matching – packages whose ID and other metadata contain all the search terms are ranked higher. For example, try searching for “Entity Framework Core”.
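The idea can be sketched in a few lines. This is only an illustration under simplifying assumptions (plain substring matching over the ID, description, and tags, and hypothetical package records), not the ranking logic running on NuGet.org. Packages that contain every search term sort ahead of those that contain only some of them.

```python
def searchable_text(package: dict) -> str:
    """Join the ID, description, and tags into one lowercase string."""
    parts = [package["id"], package.get("description", "")]
    parts.extend(package.get("tags", []))
    return " ".join(parts).lower()


def rank_by_term_coverage(query: str, packages: list) -> list:
    """Sort packages so that those matching all terms come first, then by hit count."""
    terms = query.lower().split()

    def sort_key(package: dict) -> tuple:
        text = searchable_text(package)
        hits = sum(1 for term in terms if term in text)
        return (hits == len(terms), hits)

    return sorted(packages, key=sort_key, reverse=True)
```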


Improved tokenization – we have improved the way we tokenize search terms, so more search queries now return results. For example, searching for “microsoft.aspnetcore.static” previously returned no results.
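One way to picture why tokenization matters is the sketch below. The splitting rules here (break on punctuation, then also on camel-case boundaries) are an assumption chosen for illustration and are not the analyzer NuGet.org actually uses. With these rules, a package ID such as Microsoft.AspNetCore.StaticFiles yields the token “static”, so the example query can match it even though no ID contains the query verbatim.

```python
import re

_SEPARATORS = re.compile(r"[^A-Za-z0-9]+")
_CAMEL_PIECES = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def tokenize(text: str) -> set:
    """Return lowercase tokens: each punctuation-separated segment plus its camel-case pieces."""
    tokens = set()
    for segment in _SEPARATORS.split(text):
        if not segment:
            continue
        tokens.add(segment.lower())
        tokens.update(piece.lower() for piece in _CAMEL_PIECES.findall(segment))
    return tokens


def matches(query: str, package_id: str) -> bool:
    """True when every query token also appears among the package ID's tokens."""
    return tokenize(query) <= tokenize(package_id)
```

Here `matches("microsoft.aspnetcore.static", "Microsoft.AspNetCore.StaticFiles")` is true, whereas comparing the two strings as whole values would find no match.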


Better default search results – results are now ordered by descending download count. Previously, top results were sorted by NuGet 2.x operations, which became less relevant over time because that signal does not account for traffic from newer clients.


Reduced package publishing time

In the past, we had to decide when to reload the index and download it from blob storage. With Azure Search, the updated package metadata is almost immediately available. This will bring down indexing time significantly, reducing package publishing time.

Behind the scenes

We have re-architected the search infrastructure from the ground up. NuGet.org search is now powered by the Azure Search service. Our previous search implementation was based on Lucene.Net, which is very powerful but not easy to maintain. Azure Search removes the need to interact with Lucene directly and gives us better control over the factors that contribute to the quality and relevance of search results. The new infrastructure also makes us more agile, letting us iterate quickly and continuously on the ranking algorithm to deliver ever-improving results.

What’s next?

In the side-by-side preview, 30% of voters said that either the old results were better or neither set of results was better. We are taking a closer look at these cases to understand how we can return better results for an even larger set of queries. We will continue to tweak the search algorithm, run A/B tests, and improve search relevance.

In addition, we have a healthy pipeline of new features and experiences for NuGet search. At the top of that list is Package Applicability, i.e. the ability to search by TFM (target framework moniker). The re-architecture and the move to Azure Search provide us with a strong foundation to build the next set of productivity-boosting experiences.

Feedback

Use the GitHub issue tracking this experience to provide feedback and report any cases where the new search has regressed or does not behave as expected. You can also reach out to us on Twitter by mentioning @nuget in your tweets.

Karan Nandwani

Program Manager, NuGet


4 comments

  • Thomas Ardal

    First of all, it is awesome to see NuGet search improve. Searching nuget.org has been awfully slow in the past, which is why moving to something like Azure Search seems like a great choice. With that said, I believe that the search can be improved. When searching for exact package names, the right search result is no longer shown on top. For example, if I search for “serilog.sinks.http”, that package is shown as the fifth result. I guess I know why: the search is broken down into terms, and the 4 packages above it are more popular. But, in my opinion, if a search hits an exact package name, that package should always be on top.

    Also, I believe that matches for most-left terms should be more relevant than terms on the right (don’t know if that sentence made sense). Like searching for “elmah.io” returns “system.io” as the second result. Again, I see why since “io” matches and that package is popular. But I cannot think of a situation where a user is making that search to get to the system.io package 🙂

    • Karan Nandwani

      Hi Thomas, thanks for the feedback.

      We noticed that returning the exact match first isn’t always the best result. For example, when searching for “entity”, you probably don’t want the package named “entity” but rather “entityframework”. We have some ideas on how to improve that experience.

      As for “matches for most-left terms should be more relevant than terms on the right”: I understand what you mean, which is why, when searching for elmah.io, a package ID that contains both elmah and io is ranked higher. However, returning only packages that contain all the search terms is not always the right result. Consider “microsoft.aspnetcore.static”. If we didn’t tokenize it the way we do, no results would be returned, since there is no package that has microsoft+aspnetcore+static in the ID.
