{"id":551,"date":"2016-03-21T09:00:06","date_gmt":"2016-03-21T17:00:06","guid":{"rendered":"http:\/\/blogs.msdn.microsoft.com\/pythonengineering\/?p=551"},"modified":"2019-02-17T15:27:10","modified_gmt":"2019-02-17T22:27:10","slug":"text-analytics-of-github-issues","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/python\/text-analytics-of-github-issues\/","title":{"rendered":"What do your users really think? Using Text Analytics to understand GitHub Issue Sentiment"},"content":{"rendered":"<p><a href=\"https:\/\/gallery.cortanaanalytics.com\/Notebook\/Analyzing-GitHub-Issue-Sentiment-using-Text-Analytics-3\" target=\"_blank\"><img decoding=\"async\" class=\"alignnone size-full wp-image-452\" src=\"https:\/\/devblogs.microsoft.com\/python\/wp-content\/uploads\/sites\/12\/2016\/03\/Launch-Notebook-Now-1.png\" alt=\"Launch Notebook Now!\" width=\"338\" height=\"50\" \/><\/a><\/p>\n<p>Ever get the feeling your users aren&#8217;t that happy with your project? We all get those issues that are real downers on our repository. So I thought, let&#8217;s take these issues and make something fun. Using the Text Analytics Service and the WordCloud Python package, we can make some pretty pictures out of otherwise negative comments. I also found it fun to make clouds of the more positive issues.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/12\/2019\/02\/PTVS-Word-Cloud1.png\"><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/python\/wp-content\/uploads\/sites\/12\/2016\/03\/PTVS-Word-Cloud1-1.png\" alt=\"PTVS-Word-Cloud\" width=\"500\" class=\"alignnone size-full wp-image-531\" \/><\/a><\/p>\n<p>Below you will find a few snippets on how to do this yourself. If you want to just run this against your favorite GitHub project you can open a Jupyter notebook using the notebook link above. The below code is shown just to give you an idea of what the notebook does. 
To make it run, you would need to fill in a few additional bits of code.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Step 1: Install some libraries<\/strong><\/p>\n<p>We use a few libraries and should start by installing them. <a href=\"https:\/\/pypi.python.org\/pypi\/CortanaAnalytics\">CortanaAnalytics<\/a> is a small library that wraps requests to Azure Data Market Services. <a href=\"https:\/\/pypi.python.org\/pypi\/PyGithub\">PyGithub<\/a> serves a similar purpose for GitHub. <a href=\"https:\/\/pypi.python.org\/pypi\/wordcloud\">WordCloud<\/a> helps to make pretty pictures.<\/p>\n<pre>pip install <a href=\"https:\/\/pypi.python.org\/pypi\/CortanaAnalytics\">CortanaAnalytics<\/a>\npip install <a href=\"https:\/\/pypi.python.org\/pypi\/PyGithub\">PyGithub<\/a>\npip install <a href=\"https:\/\/pypi.python.org\/pypi\/wordcloud\">wordcloud<\/a><\/pre>\n<p>&nbsp;<\/p>\n<p><strong>Step 2: Get API keys to access Azure Text Analytics and GitHub<\/strong><\/p>\n<p>We need two API keys: one for the Text Analytics Service and one for GitHub.<\/p>\n<p>You can get an account key for the Text Analytics Service by signing up at\u00a0<a href=\"http:\/\/azure.microsoft.com\/en-us\/marketplace\/partners\/amla\/text-analytics\/\">http:\/\/azure.microsoft.com\/en-us\/marketplace\/partners\/amla\/text-analytics<\/a>.\nYou can get\u00a0a GitHub API key by creating a token at\u00a0<a href=\"https:\/\/github.com\/settings\/tokens\">https:\/\/github.com\/settings\/tokens<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Step 3: Get some issues from GitHub<\/strong><\/p>\n<p>Once you have the API keys, you just need to get the GitHub issues.<\/p>\n<pre>\nimport github\n\ng = github.Github(GITHUB_ACCESS_TOKEN)\nr = g.get_repo(GITHUB_REPOSITORY)\nissues = r.get_issues(state='open')\n<\/pre>\n<p><strong>Step 4: Analyse an issue using the Text Analytics Service<\/strong><\/p>\n<p>Once you have the GitHub issues, we can iterate over them, arranging each one into text bits that can be analysed by Text Analytics. 
We batch the sentiment requests for each issue&#8217;s title and body into a single call to cut down on the overall number of requests.<\/p>\n<pre>\nfrom cortanaanalytics.textanalytics import TextAnalytics\ntext_bits_to_analyse = [\n    { 'Id':0, 'Text':issue.title },\n    { 'Id':1, 'Text':issue.body }\n]\n\nta = TextAnalytics(AZURE_PRIMARY_ACCOUNT_KEY)\nsentiments = ta.get_sentiment_batch(text_bits_to_analyse)\n\ntitle_sentiment = sentiments[0]['Score']\nbody_sentiment = sentiments[1]['Score']\n<\/pre>\n<p>&nbsp;<\/p>\n<p><strong>Step 5: Get Key Phrases<\/strong><\/p>\n<p>We can also get key phrases using the same issues we used for sentiment.<\/p>\n<pre>\nkey_phrases = ta.get_key_phrases_batch([{ 'Id':issue.number, 'Text':issue.body }])[0]['KeyPhrases']\n<\/pre>\n<p><strong>Step 6: Generate some pretty pictures of our data using WordCloud<\/strong>\nOnce we have all of that, we can go ahead and make word clouds.<\/p>\n<pre>from wordcloud import WordCloud\nimport matplotlib.pyplot as plt\n\ndef show_wordcloud(frequencies):\n    # words_to_remove is not defined in this snippet; supply your own collection of lowercase words to skip\n    frequencies_cleaned = [x for x in frequencies if x[0].lower() not in words_to_remove]\n    \n    wordcloud = WordCloud(width=1920, height=1080).generate_from_frequencies(frequencies_cleaned)\n    plt.axis(\"off\")\n    plt.imshow(wordcloud)\n\nshow_wordcloud(key_phrases)<\/pre>\n<p>And that&#8217;s it. You can try to reproduce this locally, or run the notebook and experiment in the Azure Notebooks environment.<\/p>\n<p>&nbsp;<\/p>\n<p><a href=\"https:\/\/gallery.cortanaanalytics.com\/Notebook\/Analyzing-GitHub-Issue-Sentiment-using-Text-Analytics-3\" target=\"_blank\"><img decoding=\"async\" class=\"alignnone size-full wp-image-452\" src=\"https:\/\/devblogs.microsoft.com\/python\/wp-content\/uploads\/sites\/12\/2016\/03\/Launch-Notebook-Now-1.png\" alt=\"Launch Notebook Now!\" width=\"338\" height=\"50\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ever get the feeling your users aren&#8217;t that happy with your project? 
We all get those issues that are real downers on our repository. So I thought, let&#8217;s take these issues and make something fun. Using the Text Analytics Service and the WordCloud Python package, we can make some pretty pictures out of otherwise negative [&hellip;]<\/p>\n","protected":false},"author":382,"featured_media":10119,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[4],"tags":[8,11,14],"class_list":["post-551","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-jupyter","tag-azure-notebooks","tag-cortana-analytics","tag-jupyter"],"acf":[],"blog_post_summary":"<p>Ever get the feeling your users aren&#8217;t that happy with your project? We all get those issues that are real downers on our repository. So I thought, let&#8217;s take these issues and make something fun. Using the Text Analytics Service and the WordCloud Python package, we can make some pretty pictures out of otherwise negative 
[&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/posts\/551","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/users\/382"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/comments?post=551"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/posts\/551\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/media\/10119"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/media?parent=551"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/categories?post=551"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/python\/wp-json\/wp\/v2\/tags?post=551"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}