Posts by this author

Jan 18, 2019

Running Parallel Apache Spark Notebook Workloads On Azure Databricks

This article walks through the development of a technique for running Spark jobs in parallel on Azure Databricks. The technique enabled us to reduce the processing times for JetBlue's reporting threefold while keeping the business logic implementation straightforward. The technique can be re-used for any notebook-based Spark workload on Azure Dat...

Big Data
May 17, 2018

Using Otsu’s method to generate data for training of deep learning image segmentation models

In this article, we introduce a technique to rapidly pre-label training data for image segmentation models so that annotators no longer have to painstakingly hand-annotate every pixel of interest in an image. The approach is implemented in Python and OpenCV and is extensible to any image segmentation task that aims to identify a subset of visually d...

Machine Learning
Jan 9, 2018

Deploying a Linux Python web application to Service Fabric via Docker Compose

This article covers how to take a standard Python web service consisting of an application tier, a WSGI server, and an Nginx reverse proxy and deploy it via Linux containers to a Linux cluster managed by Azure Service Fabric, using only simple tooling like Docker Compose.

DevOps, Containers
Dec 5, 2017

Comparing Image-Classification Systems: Custom Vision Service vs. Inception

This story covers how to get started with transfer learning and build image classification models in Python with the Custom Vision Service. We compare the results with the popular TensorFlow-based models Inception and MobileNet.

Machine Learning, Cognitive Services
Nov 20, 2017

Permissively-Licensed Named Entity Recognition on the JVM

The ability to correctly identify entities, such as places, people, and organizations, adds a powerful level of natural language understanding to applications. This post introduces an MIT-licensed one-click deployment to Azure for web services that lets developers get started with a wide range of natural language tasks in 5 minutes or less, by consu...

Machine Learning, Azure App Services
Nov 1, 2017

Building a Custom Spark Connector for Near Real-Time Speech-to-Text Transcription

This post describes in detail the Azure Cognitive Services speech-to-text WebSocket protocol and shows how to implement the protocol in Java. This enables us to transcribe audio to text in near real-time. We then show how to feed the transcribed radio into a pipeline based on Spark Streaming for further analysis, augmentation, and aggregation. The ...

Machine Learning, Cognitive Services