Case Study Archives

Introduction: The Gradient Command Line Interface (CLI) is a powerful yet easy-to-use utility to automate the optimization of your Spark jobs from your terminal, command prompt, or automation scripts. Whether you are a Data Engineer, SysDevOps administrator, or just an Apache Spark enthusiast, knowing how to use the Gradient CLI can be incredibly beneficial as

Pete Tamisin
11 Jul 2023

Blog, Case Study

Insider’s engineering blog discusses how they integrated the Sync Gradient API into their Airflow pipelines to continuously monitor and reduce costs.

Jeffrey Chou
09 Apr 2023

Case Study

Here at Sync, we are passionate about optimizing cloud infrastructure for Apache Spark workloads. One question we receive a lot is “Do Graviton instances help lower costs?” For a little background, AWS built its own processors, which promise to be a “major leap” in performance. Specifically for Spark on EMR, AWS published a report

Jeffrey Chou
04 Apr 2023

Blog, Case Study

Here at Sync, we are passionate about optimizing data infrastructure on the cloud, and one common point of confusion we hear from users is which worker instance size is best for their job. Many companies run production data pipelines on Apache Spark on the Elastic MapReduce (EMR) platform on AWS.

Jeffrey Chou
01 Mar 2023

Blog, Case Study

As many previous blog posts have reported, tuning and optimizing the cluster configurations of Apache Spark is a notoriously difficult problem. Especially when a data engineer needs to lower costs or accelerate runtimes on platforms such as EMR or Databricks on AWS, tuning these parameters becomes a high priority. Here at Sync, we will experimentally

Jeffrey Chou
07 Feb 2023

Blog, Case Study

Here at Sync, we are always trying to learn about and optimize complex cloud infrastructure, with the goal of bringing more knowledge to the community. In our previous blog post, we outlined a few high-level strategies companies employ to squeeze more efficiency out of their cloud data platforms. One very popular response from mid-sized to

Jeffrey Chou
20 Jan 2023

Blog, Case Study

Category:
Case Study

How to Use the Gradient CLI Tool to Optimize Databricks / EMR Programmatically

How Insider Reduced their EMR Cost by 25%

Do Graviton instances lower costs for Spark on EMR on AWS?

How does the worker size impact costs for Apache Spark on EMR AWS?

Databricks driver sizing impact on cost and performance

Is Databricks autoscaling cost efficient?