{"id":230355,"date":"2020-08-24T08:00:56","date_gmt":"2020-08-24T15:00:56","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/visualstudio\/?p=230355"},"modified":"2020-08-24T10:39:03","modified_gmt":"2020-08-24T17:39:03","slug":"the-making-of-intellicodes-first-deep-learning-model-a-research-journey","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/visualstudio\/the-making-of-intellicodes-first-deep-learning-model-a-research-journey\/","title":{"rendered":"The making of Visual Studio IntelliCode&#8217;s first deep learning model: a research journey"},"content":{"rendered":"<p><span style=\"font-size: 14pt;\"><strong>Introduction<\/strong><\/span><\/p>\n<p>Since the first <a href=\"https:\/\/visualstudio.microsoft.com\/services\/intellicode\/\" target=\"_blank\" rel=\"noreferrer noopener\">IntelliCode<\/a> code completion model shipped in Visual Studio and Visual Studio Code in 2018, it has become an essential coding assistant for millions of developers around the world. Over the past two years, we have been working tirelessly to enable IntelliCode for more programming languages while researching ways to improve the model\u2019s precision and coverage to deliver an even more satisfying user experience. One of our major research efforts was to bring the latest advancements in deep learning for natural language modeling to programming language modeling. Leveraging technologies like <a href=\"https:\/\/azure.microsoft.com\/en-us\/services\/machine-learning\/\" target=\"_blank\" rel=\"noreferrer noopener\">Azure Machine Learning<\/a> and <a href=\"https:\/\/microsoft.github.io\/onnxruntime\/\" target=\"_blank\" rel=\"noreferrer noopener\">ONNX Runtime<\/a>, we have successfully shipped the first deep learning model to all IntelliCode Python users in Visual Studio Code.<\/p>\n<p><span style=\"font-size: 14pt;\"><strong>The Research Journey<\/strong><\/span><\/p>\n<p>The journey started with a research exploration into applying language modeling techniques from natural language processing to learning Python code. We focused on the current IntelliCode member completion scenario, shown in Figure 1 below.<\/p>\n<p><figure id=\"attachment_230366\" aria-labelledby=\"figcaption_attachment_230366\" class=\"wp-caption aligncenter\" ><img decoding=\"async\" class=\"wp-image-230366 size-full\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img1.png\" alt=\"Image illustrates IntelliCode Completion of tensorflow types\" width=\"870\" height=\"607\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img1.png 870w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img1-300x209.png 300w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img1-768x536.png 768w\" sizes=\"(max-width: 870px) 100vw, 870px\" \/><figcaption id=\"figcaption_attachment_230366\" class=\"wp-caption-text\">Figure 1. Example of member completions powered by IntelliCode for Python in Visual Studio Code<\/figcaption><\/figure><\/p>\n<p>&nbsp;<\/p>\n<p>The fundamental task is to find the most likely member of a type given a code snippet preceding the member invocation. 
In other words, given the original code snippet C, the vocabulary V, and the set of all possible methods M \u2282 V, we would like to determine:<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-230375 size-medium\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img2-300x41.png\" alt=\"Image illustrates the learning task\" width=\"300\" height=\"41\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img2-300x41.png 300w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img2.png 513w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p>\n<p>To find that member, we need to build a model capable of predicting the likelihood of the available members.<\/p>\n<p>Previous state-of-the-art <a href=\"https:\/\/en.wikipedia.org\/wiki\/Recurrent_neural_network\">recurrent neural network<\/a> (RNN) based approaches leveraged only the sequential nature of source code, transferring natural language methods without taking advantage of the unique characteristics of programming language syntax and code semantics. The nature of the code completion problem made <a href=\"https:\/\/en.wikipedia.org\/wiki\/Long_short-term_memory\">long short-term memory<\/a> (LSTM) networks a promising candidate. During data preparation for model training, we leveraged partial abstract syntax trees (ASTs) corresponding to code snippets containing member access expressions and module function invocations, aiming to capture the semantics carried by distant code.<\/p>\n<p>Training deep neural networks is a computationally intensive task that requires high-performance computing clusters. 
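<\/p>\n<p>Before moving on to training, the partial-AST data preparation described above can be made concrete with a small sketch. The function below is a deliberately simplified illustration using Python\u2019s built-in ast module, not the actual IntelliCode pipeline:<\/p>

```python
import ast

def extract_member_accesses(source):
    """Collect (object, member) pairs such as ("math", "sqrt") from source code.

    A highly simplified stand-in for mining member access expressions
    out of abstract syntax trees when preparing training samples.
    """
    tree = ast.parse(source)
    accesses = []
    for node in ast.walk(tree):
        # ast.Attribute represents "<value>.<attr>"; we keep only the simple
        # case where the value is a bare name such as "math" or "self".
        if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            accesses.append((node.value.id, node.attr))
    return accesses

snippet = "import math\nx = math.sqrt(2)\nprint(math.pi)\n"
print(extract_member_accesses(snippet))
```

<p>Pairing each such member access with the code context that precedes it yields the kind of training samples the model learns from.<\/p>\n<p>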
We used a data-parallel distributed training framework, <a href=\"https:\/\/github.com\/horovod\/horovod\">Horovod<\/a>, with the <a href=\"https:\/\/arxiv.org\/abs\/1412.6980\">Adam optimizer<\/a>, keeping a copy of the entire neural model on each worker and processing different mini-batches of the training dataset in parallel lockstep. We utilized <a href=\"https:\/\/azure.microsoft.com\/en-us\/services\/machine-learning\/\" target=\"_blank\" rel=\"noreferrer noopener\">Azure Machine Learning<\/a> for model training and hyperparameter tuning because its on-demand GPU cluster service made it easy to scale up our training 
as needed, and it helped to provision and manage clusters of VMs, schedule jobs, gather results, and handle failures. Table 1 shows the model architectures we tried and their corresponding accuracy and model size.<\/p>\n<p>&nbsp;<\/p>\n<p><figure id=\"attachment_230376\" aria-labelledby=\"figcaption_attachment_230376\" class=\"wp-caption aligncenter\" ><img decoding=\"async\" class=\"wp-image-230376 size-full\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img3.png\" alt=\"Image illustrates the top 5 accuracy and associated model size for the different deep learning model architectures\" width=\"596\" height=\"122\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img3.png 596w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img3-300x61.png 300w\" sizes=\"(max-width: 596px) 100vw, 596px\" \/><figcaption id=\"figcaption_attachment_230376\" class=\"wp-caption-text\">Table 1. 
The top 5 accuracy and associated model size for the different deep learning model architectures<\/figcaption><\/figure><\/p>\n<p>&nbsp;<\/p>\n<p>We chose to productize the predicted-embedding architecture due to its smaller model size and its 20% accuracy improvement over the previous production model during offline model evaluation; model size is critical to production deployability.<\/p>\n<p>The model architecture is shown in Figure 2 below:<\/p>\n<p><figure id=\"attachment_230377\" aria-labelledby=\"figcaption_attachment_230377\" class=\"wp-caption aligncenter\" ><img decoding=\"async\" class=\"wp-image-230377 size-full\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img4.png\" alt=\"Architecture diagram illustrating IntelliCode's deep LSTM model\" width=\"540\" height=\"322\" srcset=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img4.png 540w, https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img4-300x179.png 300w\" sizes=\"(max-width: 540px) 100vw, 540px\" \/><figcaption id=\"figcaption_attachment_230377\" class=\"wp-caption-text\">Figure 2. Architecture diagram illustrating IntelliCode&#8217;s deep LSTM model<\/figcaption><\/figure><\/p>\n<p>&nbsp;<\/p>\n<p>To deploy the LSTM model into production, we had to improve the model\u2019s inference speed and memory footprint to meet edit-time code completion requirements. Our memory budget was about 50 MB, and we needed to keep the mean inference speed under 50 milliseconds. The IntelliCode LSTM model was trained with TensorFlow, and we chose <a href=\"https:\/\/microsoft.github.io\/onnxruntime\/\">ONNX Runtime<\/a> for inferencing to get the best performance. 
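<\/p>\n<p>A key lever for meeting such memory and latency budgets is quantization. As a rough illustration of the underlying idea only (a simplified linear-quantization sketch, not ONNX Runtime\u2019s actual implementation), mapping float32 weights onto 8-bit integers looks like this:<\/p>

```python
# Simplified sketch of linear INT8 quantization: float values are mapped
# to 8-bit integers via a scale (and zero point), cutting storage from
# 32 bits to 8 bits per value. Illustrative only, not ONNX Runtime's code.

def quantize(values, scale, zero_point):
    """Map floats into the signed 8-bit range [-128, 127]."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]

def dequantize(q_values, scale, zero_point):
    """Approximately recover the original floats."""
    return [(q - zero_point) * scale for q in q_values]

weights = [0.5, -1.2, 3.3, 0.0]
scale, zero_point = 0.05, 0
q = quantize(weights, scale, zero_point)
print(q)                                 # compact 8-bit codes
print(dequantize(q, scale, zero_point))  # close to the original weights
```

<p>The small round-trip error in the last line is the source of the accuracy drop that post-training quantization trades for its size and speed gains.<\/p>\n<p>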
ONNX Runtime works with popular deep learning frameworks and makes it easy to integrate into different serving environments by providing APIs covering a variety of languages, including Python, C, C++, C#, Java, and JavaScript \u2013 we used the .NET Core compatible C# APIs to integrate into the <a href=\"https:\/\/github.com\/Microsoft\/python-language-server\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft Python Language Server<\/a>.<\/p>\n<p>Quantization is an effective approach for model size reduction and performance acceleration when the accuracy drop introduced by approximating with low-bit-width numbers is acceptable. With the post-training INT8 quantization provided by ONNX Runtime, the resulting improvement was significant: both memory footprint and inference time were brought down to about a quarter of the pre-quantized values, with an acceptable 3% reduction in model accuracy relative to the original model. You can find details of the model architecture design, hyperparameter tuning, accuracy, and performance in the <a href=\"https:\/\/www.kdd.org\/kdd2019\/accepted-papers\/view\/pythia-ai-assisted-code-completion-system\">research paper<\/a> we published at the 2019 KDD conference.<\/p>\n<p>The final gate of the release to production was online A\/B experimentation comparing the new LSTM model with the previous production model. 
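<\/p>\n<p>As background for the results that follow, the two metrics, top-1 precision and mean reciprocal rank, can be sketched in a few lines (a minimal illustration; the names and data here are made up):<\/p>

```python
# Minimal sketch of the two ranking metrics. For each completion event,
# `ranked` holds the model's suggestion list in ranked order and
# `expected` holds the member the developer actually chose.

def top1_precision(ranked, expected):
    hits = sum(1 for sugg, exp in zip(ranked, expected) if sugg and sugg[0] == exp)
    return hits / len(expected)

def mean_reciprocal_rank(ranked, expected):
    total = 0.0
    for sugg, exp in zip(ranked, expected):
        if exp in sugg:
            total += 1.0 / (sugg.index(exp) + 1)  # ranks are 1-based
    return total / len(expected)

suggestions = [["fit", "predict", "score"], ["sqrt", "pi"], ["close", "read"]]
chosen = ["fit", "pi", "write"]
print(top1_precision(suggestions, chosen))
print(mean_reciprocal_rank(suggestions, chosen))
```

<p>Top-1 precision only credits an exactly right first suggestion, while MRR also rewards correct items that appear lower in the completion list.<\/p>\n<p>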
The online A\/B experimentation results in Table 2 below show about a 25% improvement in top-1 recommendation precision (the precision of the first recommended completion item in the completion list) and a 17% improvement in <a href=\"https:\/\/en.wikipedia.org\/wiki\/Mean_reciprocal_rank\">mean reciprocal rank<\/a> (MRR), which convinced us that the new LSTM model is significantly better than the previous production model.<\/p>\n<p><figure id=\"attachment_230380\" aria-labelledby=\"figcaption_attachment_230380\" class=\"wp-caption aligncenter\" ><img decoding=\"async\" class=\"wp-image-230380 size-full\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2020\/08\/IntelliCode_DL_img5.png\" alt=\"A\/B testing result showing significant metrics' improvements\" width=\"686\" height=\"210\" \/><figcaption id=\"figcaption_attachment_230380\" class=\"wp-caption-text\">Table 2. A\/B testing result showing significant metrics&#8217; improvements<\/figcaption><\/figure><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-size: 14pt;\"><strong>Python developers: Try IntelliCode completions<\/strong><strong> and send us your feedback!<\/strong><\/span><\/p>\n<p>With a great team effort, we completed the staged roll-out of the first deep learning model to all the IntelliCode Python users in Visual Studio Code. In the latest release of the <a href=\"https:\/\/aka.ms\/vsic\/xtn\/vscode\">IntelliCode extension for Visual Studio Code<\/a>, we\u2019ve also integrated ONNX Runtime and the LSTM model to work with the new <a href=\"https:\/\/marketplace.visualstudio.com\/items?itemName=ms-python.vscode-pylance\">Pylance<\/a> extension, which is written entirely in TypeScript. 
If you\u2019re a Python developer, please install the IntelliCode extension and send us your feedback.<\/p>\n<p><span style=\"font-size: 14pt;\"><strong>What\u2019s next?<\/strong><\/span><\/p>\n<p>We are looking forward to shipping deep learning models for member completion for more programming languages in IntelliCode\u2019s coming releases. In the meantime, we are actively working on more advanced <a href=\"https:\/\/en.wikipedia.org\/wiki\/Transformer_(machine_learning_model)\">transformer-based deep learning models<\/a> for even longer code completions.<\/p>\n<p><span style=\"font-size: 14pt;\"><strong>How can you leverage what we\u2019ve learned?<\/strong><\/span><\/p>\n<p>Along this journey, ONNX Runtime and Azure Machine Learning were critical in making these developments possible. You can learn how to leverage them in your own scenarios from these links (<a href=\"https:\/\/microsoft.github.io\/onnxruntime\/\">ONNX Runtime<\/a>, <a href=\"https:\/\/azure.microsoft.com\/en-us\/services\/machine-learning\/\">Azure Machine Learning<\/a>). 
The ONNX and Azure Machine Learning teams will be happy to hear your feedback!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>After\u00a0leveraging technologies like\u00a0Azure Machine Learning\u00a0and\u00a0ONNX Runtime,\u00a0IntelliCode has\u00a0successfully\u00a0shipped\u00a0the first\u00a0deep learning model for all the\u00a0IntelliCode\u00a0Python\u00a0users in Visual Studio\u00a0Code.\u00a0This blogpost gives a detailed account of the journey from research to model deployment.<\/p>\n","protected":false},"author":33576,"featured_media":230358,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[155],"tags":[6790,6789,6784,6786,4469,467,6785,1054,6787,6788,526],"class_list":["post-230355","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visual-studio","tag-azure-machine-learning","tag-azure-ml","tag-data-science","tag-deep-learning","tag-developer","tag-intellicode","tag-intellicode-completions","tag-ml","tag-onnx","tag-onnx-runtime","tag-productivity"],"acf":[],"blog_post_summary":"<p>After\u00a0leveraging technologies like\u00a0Azure Machine Learning\u00a0and\u00a0ONNX Runtime,\u00a0IntelliCode has\u00a0successfully\u00a0shipped\u00a0the first\u00a0deep learning model for all the\u00a0IntelliCode\u00a0Python\u00a0users in Visual Studio\u00a0Code.\u00a0This blogpost gives a detailed account of the journey from research to model 
deployment.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts\/230355","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/users\/33576"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/comments?post=230355"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts\/230355\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/media\/230358"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/media?parent=230355"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/categories?post=230355"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/tags?post=230355"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}