New and improved NuGet Search is here!

Karan Nandwani

It’s been a long time coming, and today we are excited to announce the new and improved search on NuGet.org leveraging Azure Search. We want to start this post with a huge thanks to you, the NuGet community, for providing feedback. We have aggregated all feedback around search result relevance into one mega issue. We used this as the starting point and ensured the most egregious cases were fixed before we launched the side-by-side preview experience a few weeks ago. 70% of you voted that the new search is better! This was one of the key results in deciding to move forward leading up to today’s release.

Try the new search today!

  • Search on NuGet.org
  • Search using the NuGet package manager UI in Visual Studio (or an IDE of your choice)

Make sure your NuGet clients point to the V3 API endpoint for NuGet.org: https://api.nuget.org/v3/index.json
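As a rough illustration of what pointing at the V3 endpoint involves, the sketch below looks up a search resource in an already-parsed service index and builds a query URL from it. It assumes the documented V3 service index shape (a `resources` array whose entries carry `@id` and `@type`) and the `q`, `take`, and `prerelease` query parameters. The helper names and the `example.test` URL are hypothetical placeholders, not real NuGet endpoints.

```python
from urllib.parse import urlencode

SERVICE_INDEX_URL = "https://api.nuget.org/v3/index.json"


def find_search_endpoint(service_index: dict) -> str:
    """Return the @id of the first SearchQueryService resource in a parsed index."""
    for resource in service_index.get("resources", []):
        # Types may carry a version suffix, e.g. "SearchQueryService/3.5.0".
        if resource.get("@type", "").startswith("SearchQueryService"):
            return resource["@id"]
    raise LookupError("no SearchQueryService resource in the service index")


def build_search_url(base: str, query: str, take: int = 20,
                     prerelease: bool = False) -> str:
    """Build a search URL with the query string properly encoded."""
    params = {"q": query, "take": take, "prerelease": str(prerelease).lower()}
    return f"{base}?{urlencode(params)}"


# Hypothetical, trimmed-down index; the real one is served at SERVICE_INDEX_URL.
sample_index = {
    "version": "3.0.0",
    "resources": [
        {"@id": "https://example.test/query", "@type": "SearchQueryService"},
    ],
}

print(build_search_url(find_search_endpoint(sample_index), "Entity Framework Core", take=5))
```

To run this against the live service, the index could be fetched with `urllib.request.urlopen(SERVICE_INDEX_URL)` and parsed with `json.load` before being passed to `find_search_endpoint`.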

Key improvements

Improved weighting for popularity – we have adjusted the weight given to package download count so that popular packages show up higher. For example, try searching for “Microsoft.Extensions”.

Multi-term matching – packages whose ID and other metadata contain all the search terms are ranked higher. For example, try searching for “Entity Framework Core”.
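To make the interaction between popularity and multi-term matching concrete, here is a toy scoring sketch. It is not NuGet.org's actual scoring formula. The function name, the bonus factor, and the logarithmic download weight are all assumptions chosen only to show how a package that matches every term can outrank an equally popular package that matches only some of them.

```python
import math


def toy_score(query_terms: set, package_terms: set, downloads: int) -> float:
    """Illustrative relevance score; every constant here is an assumption."""
    matched = len(query_terms & package_terms)
    if matched == 0:
        return 0.0
    coverage = matched / len(query_terms)
    # Hypothetical bonus when every search term is present.
    all_terms_bonus = 2.0 if matched == len(query_terms) else 1.0
    # Log scale so very large download counts do not swamp term matching.
    popularity = math.log10(downloads + 1)
    return coverage * all_terms_bonus * (1.0 + popularity)


query = {"entity", "framework", "core"}
full_match = toy_score(query, {"microsoft", "entity", "framework", "core"}, 1_000_000)
partial_match = toy_score(query, {"entity", "framework"}, 1_000_000)
print(full_match > partial_match)
```

With equal download counts the full match wins, and with equal term coverage the more downloaded package wins. That is the behavior the two points above describe, although the real weights are tuned differently.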

Improved tokenization – we have improved the way we tokenize search terms, so more queries now return results. For example, searching for “microsoft.aspnetcore.static” previously returned no results.
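The effect of tokenization can be sketched as follows. This is a minimal illustration and not the analyzer that Azure Search or NuGet.org actually uses. It assumes tokens are produced by splitting on dots, hyphens, underscores, and whitespace, and then splitting each segment further on camel-case boundaries. `tokenize` and `matches_all_terms` are hypothetical helper names.

```python
import re

_SEPARATORS = re.compile(r"[.\-_\s]+")
_CAMEL_PIECES = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def tokenize(text: str) -> set:
    """Lowercased tokens: each separator-delimited segment plus its camel-case pieces."""
    tokens = set()
    for segment in _SEPARATORS.split(text):
        if not segment:
            continue
        tokens.add(segment.lower())
        tokens.update(piece.lower() for piece in _CAMEL_PIECES.findall(segment))
    return tokens


def matches_all_terms(query: str, package_id: str) -> bool:
    """True when every query token also appears among the package ID's tokens."""
    return tokenize(query) <= tokenize(package_id)


print(sorted(tokenize("Microsoft.AspNetCore.StaticFiles")))
print(matches_all_terms("microsoft.aspnetcore.static", "Microsoft.AspNetCore.StaticFiles"))
```

Under these assumptions, the query term “static” lines up with the “Static” piece of “StaticFiles”, which is one way a query like the one above can start returning results.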

Better default search results – results are now ordered by descending download count. Previously, top results were sorted by NuGet 2.x operations, a signal that became less relevant over time because it does not account for traffic from newer clients.

Reduced package publishing time

In the past, we had to decide when to reload the index and download it from blob storage. With Azure Search, the updated package metadata is almost immediately available. This will bring down indexing time significantly, reducing package publishing time.

Behind the scenes

We have re-architected the search infrastructure from the ground up. NuGet.org search is now powered by the Azure Search service. Our previous search implementation was based on Lucene.Net, which is very powerful but not very easy to maintain. Azure Search removes the need to interact with Lucene directly and gives us better control over the factors that contribute to the quality and relevance of search results. The new infrastructure also makes us more agile and lets us iterate quickly and continuously on the ranking algorithm to deliver ever-improving results.

What’s next?

In the side-by-side preview feedback, 30% of voters said that either the old results were better or that neither set of results was better. We are taking a closer look at these cases to understand how we can return better results for an even larger set of queries. We will continue to tweak the search algorithm, run A/B tests, and improve search relevance.

In addition, we have a healthy pipeline of new features and experiences for NuGet search. At the top of that list is Package Applicability, i.e. the ability to search by TFM (target framework moniker). The re-architecture and the move to Azure Search give us a strong foundation on which to build the next set of productivity-boosting experiences.

Feedback

Use the GitHub issue tracking this experience to provide feedback and to report any cases where the new search has regressed or does not behave as expected. You can also reach out to us on Twitter by mentioning @nuget in your tweets.

Karan Nandwani

Program Manager, NuGet

11 comments

  • Thomas Ardal

    First of all, it is awesome to see NuGet search improve. Searching nuget.org has been awfully slow in the past, so moving to something like Azure Search seems like a great choice. With that said, I believe that the search can be improved. When searching for an exact package name, the right result is no longer shown first. For example, if I search for “serilog.sinks.http”, that package is shown as the fifth result. I guess I know why: the search is broken down into terms, and the 4 packages above it are more popular. But, in my opinion, if a search hits an exact package name, that package should always be on top.

    Also, I believe that matches for the left-most terms should be more relevant than matches for terms on the right (don’t know if that sentence made sense). For example, searching for “elmah.io” returns “system.io” as the second result. Again, I see why, since “io” matches and that package is popular. But I cannot think of a situation where a user is making that search to get to the system.io package 🙂

    • Karan Nandwani

      Hi Thomas, thanks for the feedback.

      We noticed that returning exact matches isn’t always the best result. For example, when searching for “entity”, you probably don’t want the package named “entity” but “entityframework”. We have some ideas on how to improve that experience.

      As for “matches for most-left terms should be more relevant than terms on the right”: I understand what you mean, which is why, when searching for elmah.io, a package ID that contains both elmah and io is ranked higher. Returning only packages that contain all the search terms does not always give the right result. Consider “microsoft.aspnetcore.static”. If we didn’t tokenize it the way we do, no results would be returned, since there is no package that has microsoft+aspnetcore+static in the ID.

      • Raman Jindal

        Karan, I believe Thomas’s point is that exact matches should be given more weightage than matches determined in other ways (like download count, tokenizer etc.). And I think there is no reason not to show the exact match at the top.

      • Raman Jindal

        I agree with Thomas’s view that if a search matches an exact package name, that package should be on top. In the example you mentioned, where “entity” should return “entityframework”, the assumption is that the individual is looking for the more popular package, which may not always be correct.

      • Jon

        Nevertheless, not listing *exact matches* first is an annoying, counter-intuitive time waster. Pull it up and try to find “Microsoft.AspNetCore.Components.Authorization” … it lists over 58,000 matches, and on the website it doesn’t show up until page 7. It’s even harder to deal with in the VStudio GUI because you can’t do convenient things like a browser full-text search to skip all the noise. Frankly, the new search is terrible as it exists now. Regarding your example, the right thing to do would be to list all of those speculative matches after the exact matches.

  • Angel Wang

    We are using nuget list to find the latest available package in our automation, but it returned a list of packages and versions that were not exact matches, and that broke our build automation. Is there any way to force it to return only an exact match?
