Developing Gradient Part II
Introduction: Using Gradient in a Workflow. Gradient, the latest product release from Sync Computing, helps customers manage the infrastructure behind their recurring Apache Spark applications. Gradient gives infrastructure recommendations for each job to lower the cost of production jobs while hitting their target SLAs. We’ve been hard at work on this project for a
Sean Gorsky
19 Jun 2023
Blog
Developing Gradient Part I
Introduction: Sync recently introduced Gradient, a tool that helps data engineers manage and optimize their compute infrastructure. The primary facet of Gradient is a Project, which groups a sequence of runs of a Databricks job. After each run, the Spark event log and cluster information are sent to Sync. That accumulated project data is then fed
Sean Gorsky
Blog
Introducing: Gradient for Databricks
Wow, the day is finally here! It’s been a long journey, but we’re so excited to announce our newest product: Gradient for Databricks. Check out our promo video here! The quick pitch: Gradient is a new tool to help data engineers know when and how to optimize and lower their Databricks costs without sacrificing performance.
Jeffrey Chou
Blog
Do Graviton instances lower costs for Spark on EMR on AWS?
Here at Sync we are passionate about optimizing cloud infrastructure for Apache Spark workloads. One question we receive a lot is “Do Graviton instances help lower costs?” For a little background information, AWS built their own processors which promise to be a “major leap” in performance. Specifically for Spark on EMR, AWS published a report
Jeffrey Chou
04 Apr 2023
Blog, Case Study
How poor provisioning of cloud resources can lead to 10X slower Apache Spark jobs
The Situation: Let’s say you’re a data engineer and you want to run your data/ML Spark job on AWS as fast as possible. You want to avoid slow Apache Spark performance. After you’ve written your code to be as efficient as possible, it’s time to deploy to the cloud. Here’s the problem: there are over
Jeffrey Chou
24 Mar 2023
Blog
How does the worker size impact costs for Apache Spark on EMR AWS?
Here at Sync, we are passionate about optimizing data infrastructure on the cloud, and one common point of confusion we hear from users is what worker instance size is best for their job. Many companies run production data pipelines on Apache Spark on the Elastic MapReduce (EMR) platform on AWS.
Jeffrey Chou
01 Mar 2023
Blog, Case Study