Showing results for Big Data - ISE Developer Blog

Jan 18, 2019

Running Parallel Apache Spark Notebook Workloads On Azure Databricks

Clemens Wolff

This article walks through the development of a technique for running Spark jobs in parallel on Azure Databricks. The technique enabled us to cut the processing time for JetBlue's reporting to a third while keeping the business logic implementation straightforward. The technique can be reused for any notebook-based Spark workload on Azure Dat...

Big Data

Dec 12, 2018

Social Stream Pipeline on Databricks with auto-scaling and CI/CD using Travis

Mor Shemesh

This code story describes CSE's work with ZenCity to create a data pipeline on Azure Databricks supported by a CI/CD pipeline on TravisCI. The aim of the collaboration was to create a pipeline capable of processing a stream of social posts, analyzing them, and identifying trends.

DevOps, Big Data, Azure App Services