How to Use the Gradient CLI Tool to Optimize Databricks / EMR Programmatically
Introduction: The Gradient Command Line Interface (CLI) is a powerful yet easy-to-use utility for automating the optimization of your Spark jobs from your terminal, command prompt, or automation scripts. Whether you are a Data Engineer, SysDevOps administrator, or just an Apache Spark enthusiast, knowing how to use the Gradient CLI can be incredibly beneficial as
Pete Tamisin
11 Jul 2023
Blog, Case Study
How Insider Reduced their EMR Cost by 25%
Insider’s engineering blog discusses how they integrated the Sync Gradient API into their Airflow pipelines to continuously monitor and reduce costs.
Jeffrey Chou
09 Apr 2023
Case Study
How does the worker size impact costs for Apache Spark on EMR AWS?
Here at Sync, we are passionate about optimizing data infrastructure on the cloud, and one common point of confusion we hear from users is which worker instance size is best for their job. Many companies run production data pipelines on Apache Spark on the Elastic MapReduce (EMR) platform on AWS.
Jeffrey Chou
01 Mar 2023
Blog, Case Study