
Developing Gradient Part I

Introduction

Sync recently introduced Gradient, a tool that helps data engineers manage and optimize their compute infrastructure. The primary facet of Gradient is a Project, which groups a sequence of runs of a Databricks job. After each run, the Spark event log and cluster information are sent to Sync. That accumulated project data is then fed into our recommendation engine, which returns an optimized cluster configuration so the next run of your job costs less while keeping you on target for your SLA.

If that sounds like a challenging product to develop, you’d be right! In fact, much of the challenge comes from getting the data needed to create the predictive engine, and we thought this story and our strategy for tackling it were worth sharing.

In this blog post we give some insight into the development process of Gradient, with a focus on the internal testing infrastructure that enabled the early development and validation of the product.

The Challenge

In the early days of development, the Gradient team was in a really tight spot. You see, at its heart, Gradient is a data insights product that requires a feedback loop of job runs and recommendations that get implemented in subsequent runs. You need this feedback loop both to assess the performance of the product and to have any hope of improving the quality of recommendations.

However, injecting yourself into a workflow is a tall order for customers who are rightly protective of their data pipelines, especially when the request comes from a young company with an unproven track record (even if our team – being real here – is totally awesome). Consultant-like interactions were mildly successful, but the update cycle was often a week or more, much too slow to make meaningful progress. It was obvious, then, that at the start we needed a solution that could get us a lot of data for many different jobs and many different cluster configurations … and didn’t source from customers.

The Solution

So how did we design, test, and validate our recommendation engine? We built infrastructure that would operate a Gradient Project using our own applications as a source. This system would select a Databricks Job (e.g. a TPC-DS query), create a Project, choose an initial “cold start” cluster configuration, and then cyclically run the job using Gradient’s recommendations to inform subsequent runs. Effectively we became a consumer of our own product. 

We also built an orchestrator that could run these Projects at scale using a variety of hardware configurations and applications. After selecting a few configurations, a team member could spin up hundreds of Projects with data available for analysis the next day.

At the foundation of the whole system is a bank of jobs that ensures we capture a variety of workloads in our testing, mitigating as much bias as possible from our data. And in a stroke of genius we named this testing project the Job Bank.

Figure: Job Bank diagram. A Spark application is selected from the Job Bank vault and an initial hardware configuration is selected. After an initial run of the job, the results are passed into Gradient which then informs the subsequent run. Many instances of this process are orchestrated together so testing can occur at scale.

The Job Bank enabled two key vectors for performance assessment and improvement. 

  1. Having a parent run, its recommendation, and the informed child run allows us to assess the runtime prediction accuracy. Our data science team can then dig into these results, looking for scenarios where we perform better or worse. Naturally, this informs modeling and constraints to improve the quality of our recommendations.
  2. Since all of these jobs are run internally at Sync, we have full access to the cost and usage reports from both AWS and Databricks. Not only does this allow us to validate our cost modeling, but it lets us assess Gradient’s performance using cost-actual data, which can be considered independently from our prediction accuracy. Look out for more details on cost performance in an upcoming blog!

The benefits of having a system like this were apparent almost immediately. It helped us uncover bugs and new edge cases, improved our engineering, and gave us clear direction for improving runtime prediction accuracy. Most importantly, it enabled us to develop the recommendation engine to a state where we’re confident it will provide customers reliable job management and real cost savings.

Some example data generated with the Job Bank is shown below. The general process to generate this data was as follows (a minimal sketch of the loop appears after the list):

  1. Cold-start by selecting a Spark application and running it with an initial hardware configuration.
  2. Submit the resulting run data to Gradient and get a recommendation.
  3. Generate a child run using the recommended hardware configuration. 
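
For the curious, here’s a minimal sketch of that loop in Python. The run_job and get_recommendation callables are hypothetical stand-ins for our internal Databricks and Gradient clients, not an actual public API:

```python
# Minimal sketch of one Job Bank project cycle. `run_job` and
# `get_recommendation` are hypothetical stand-ins for the internal
# Databricks and Gradient clients.
def run_project(run_job, get_recommendation, cold_start_config, n_cycles=3):
    """Cold-start with an initial config, then let each recommendation
    inform the next (child) run."""
    history = []
    config = cold_start_config
    for _ in range(n_cycles):
        run = run_job(config)               # parent run; returns runtime + eventlog
        rec = get_recommendation(run)       # submit eventlog + cluster info to Gradient
        history.append({
            "config": config,
            "actual_runtime_s": run["runtime_s"],
            "predicted_runtime_s": rec["predicted_runtime_s"],
        })
        config = rec["recommended_config"]  # recommendation informs the child run
    return history
```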

The plot shows the runtime prediction error (predicted runtime minus child runtime) for about 20 unique applications with 3-5 cold-starts each. The error is plotted against the change in instance size from the parent to the child.  In this particular visualization, we can see that the prediction error is correlated with how much the instance size changes.  These kinds of insights help us determine where to focus our efforts in improving the algorithm.

Figure: Example runtime prediction accuracy data generated using Job Bank. Each point is the runtime prediction error calculated by differencing a predicted runtime with the actual runtime using the predicted configuration.
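
As a rough illustration of how a metric like this can be computed from the run records (the column names and values below are hypothetical, not our actual data):

```python
import pandas as pd

# Hypothetical parent/child run records; each row is one recommendation cycle.
runs = pd.DataFrame({
    "predicted_runtime_s": [1200, 950, 2100],   # Gradient's predicted child runtime
    "child_runtime_s":     [1100, 1000, 1800],  # actual child runtime
    "instance_size_delta": [1, -1, 2],          # change in instance size, parent -> child
})

# runtime prediction error = predicted runtime - actual child runtime
runs["prediction_error_s"] = runs["predicted_runtime_s"] - runs["child_runtime_s"]

# check whether error grows with the size of the instance change
print(runs.groupby("instance_size_delta")["prediction_error_s"].mean())
```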

Looking Forward

In the world of data, it’s customer data that rules. As our user base grows and we get more feedback from those users, our internal system will become less relevant, and it’s the customer recommendation cycles that will drive future development. After all, the variety of jobs and configurations we might assess in the Job Bank is negligible compared to the domain of jobs that exist out in the wild. We’re eager and excited to see how best we can improve Gradient in the future, with customers at the center.

Continue reading part II: How Gradient saves you money

Introducing: Gradient for Databricks

Wow the day is finally here! It’s been a long journey, but we’re so excited to announce our newest product: Gradient for Databricks.

Check out our promo video here!

The quick pitch

Gradient is a new tool to help data engineers know when and how to optimize and lower their Databricks costs – without sacrificing performance.

For the math geeks out there, the name Gradient comes from the vector calculus operator commonly used in optimization algorithms (e.g. gradient descent).

Over the past 18 months of development we’ve worked with data engineers around the world to understand their frustrations when trying to optimize their Databricks jobs. Some of the top pains we heard were:

  • “I have no idea how to tune Apache Spark”
  • “Tuning is annoying, I’d rather focus on development”
  • “There are too many jobs at my company. Manual tuning does not scale”
  • “But our Databricks costs are through the roof and I need help”

How did companies get here?

We’ve worked with companies around the world who absolutely love using Databricks. So how did so many companies (and their engineers) get to this efficiency pain point? At a high level, the story typically goes like this:

  • “The Honeymoon” phase: We are starting to use Databricks and the engineers love it
  • “The YOLO” phase: We need to build faster, let’s scale up ASAP. Don’t worry about efficiency.
  • “The Tantrum” phase: Uh oh. Everything on Databricks is exploding, especially our costs! Help!

So what did we do?

We wanted to attack the “Tantrum” problem head on. Internally, our data science, engineering, and product teams worked hand in hand with early design partners using our Spark Autotuner to figure out how to deliver a deeply technical solution that was also easy and intuitive. We used all of that feedback on the biggest problems to build Gradient:

| User Problem | What Gradient Does |
| --- | --- |
| I don’t know when, why, or how to optimize my jobs | Gradient continuously monitors your clusters to notify you when a new optimization is detected, estimate the cost/performance impact, and output a JSON configuration file to easily make the change. |
| I use Airflow or Databricks Workflows to orchestrate our jobs; everything I use must integrate easily. | Our new Python libraries and quick-start tutorials for Airflow and Databricks Workflows make it easy to integrate Gradient into your favorite orchestrators. |
| I just want to state my runtime requirements, and then still have my costs lowered | Simply set your ideal max runtime and we’ll configure the cluster to hit your goals at the lowest possible cost. |
| My company wants us to use autoscaling for our jobs clusters | Whether you use autoscaled or fixed clusters, Gradient supports optimizing both (or even switching from one to the other). |
| I have hundreds of Databricks jobs; I need batch importing for optimization to work | Provide your Databricks token, and we’ll do all the heavy lifting of automatically fetching all of your qualified jobs and importing them into Gradient. |
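
To give a feel for the first row of the table, a recommended cluster change might look something like the snippet below. This is only an illustrative shape (a Databricks-style cluster spec expressed as a Python dict, with hypothetical values); the exact fields in Gradient’s JSON output may differ:

```python
# Illustrative recommended cluster configuration; values are hypothetical.
recommended_cluster = {
    "spark_version": "11.3.x-scala2.12",
    "driver_node_type_id": "m5.xlarge",
    "node_type_id": "m5.2xlarge",   # recommended worker instance type
    "num_workers": 12,              # fixed-size cluster ...
    # "autoscale": {"min_workers": 4, "max_workers": 16},  # ... or autoscaled
}
```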

We want to hear from you!

Our early customers made Gradient what it is today, and we want to make sure it’s meeting companies’ needs. We made getting started super easy (you can just log in to the app here). Feel free to browse the docs here. Please let us know how we did via Intercom (in the docs and app).

Do Graviton instances lower costs for Spark on EMR on AWS?

Here at Sync we are passionate about optimizing cloud infrastructure for Apache Spark workloads.  One question we receive a lot is

“Do Graviton instances help lower costs?”  


For a little background, AWS built its own processors, which promise to be a “major leap” in performance.  Specifically for Spark on EMR, AWS published a report claiming that Graviton can help reduce costs by up to 30% and speed up performance by up to 15%.  These are fantastic results, and who doesn’t love a company that builds its own hardware?

As an independent company, here at Sync we of course always want to verify AWS’s claims about the performance of Graviton instances.  So in this blog post we run several experiments with the TPC-DS benchmark, across various driver and worker count configurations on several different instance classes, to see for ourselves how these instances stack up.

The Experiment

The goal of the experiment is to see how Graviton instances perform relative to other popular instances.  There are of course hundreds of instance types, so we selected 10 popular ones to keep this a feasible study.

As for the workload, we selected the fan favorite benchmark, TPC-DS 1TB, with all 98 queries run in series.  This differs from AWS’s study, which looked at individual queries within the benchmark.  We decided to track the total job runtime of all queries since we’re just looking for the high-level “average” performance to see if any interesting trends appear.  Results may of course vary query by query, and your individual code is a complete wildcard.  We make no claim that these results hold generally for all workloads or for your specific workloads.

The details of the experimental sweeps are shown below (a sketch of how one sweep point can be launched follows the list):

  • Workload:  TPC-DS 1TB (queries 1-98 run in series)
  • EMR Version:  6.2.0
  • Instances: [r6g, m5dn, c5, i3, m6g, r5, m5d, m5, c6g, r5d] (r6g, m6g, and c6g are the Graviton instances)
  • Driver Node sizes: *.xlarge, *.2xlarge, *.4xlarge  (* = instance family)
  • Worker Nodes: *.xlarge
  • Number of workers: [5,12,20,32,50]
  • spark.executor.cores: 4
  • Market: on-demand, list pricing
  • Cost data:  True AWS costs extracted from the cost and usage reports, includes both EC2 and EMR fees
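
For a sense of how one point in this sweep could be launched, here’s a sketch using boto3. The roles and naming are EMR defaults, and the TPC-DS benchmark step itself is omitted; treat this as illustrative rather than our exact harness:

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")

def launch_run(family: str, driver_size: str, num_workers: int):
    """Launch one sweep point: driver of family.driver_size, workers of family.xlarge."""
    return emr.run_job_flow(
        Name=f"tpcds-1tb-{family}-{driver_size}-{num_workers}w",
        ReleaseLabel="emr-6.2.0",                  # EMR version from the sweep
        Applications=[{"Name": "Spark"}],
        Instances={
            "InstanceGroups": [
                {"InstanceRole": "MASTER",
                 "InstanceType": f"{family}.{driver_size}", "InstanceCount": 1},
                {"InstanceRole": "CORE",
                 "InstanceType": f"{family}.xlarge", "InstanceCount": num_workers},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )

launch_run("c6g", "2xlarge", 32)   # one of the 10 x 3 x 5 sweep points
```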

The Result

Below is a global view of all of the experiments run showing cost vs. runtime.  Each dot represents a different configuration as described by the list above.  Points that are in the bottom left hand corner edge are ideal as they are both cheaper and faster.

At a high level, we see that the c6g instances (light green dots) were the lowest cost with comparable runtimes, which was interesting to see.  The other two Graviton instances (r6g and m6g) also skewed toward the lower left relative to most of the other instances.

One deviation was the c5 instances, which performed surprisingly well on both the cost and runtime curves.  They were quite similar to the best Graviton instance, the c6g.

To make the information a bit easier to digest, we averaged the runtime and cost data for a clear side-by-side comparison of the different instances.  The salmon-colored bars are the Graviton-enabled instances.

In the graph below, the runtime of the Graviton instances was comparable with the other instances.  The r6g instances were the fastest, although not by much – only about 6.5% faster than m6g.  The one negative standout was the i3 instances, which took around 20% longer to run than all of the other instances.

More variation is seen in the cost breakdown, where the Graviton instances were typically lower cost than their non-Graviton counterparts, some by a wide margin.  What really stole the show were the “c” class instances: the c5 was actually about 10% cheaper than the m6g and r6g Graviton instances.

The global winner was the c6g instance, which was the absolute cheapest.  The gap between the most expensive (i3) and the cheapest (c6g) is significant – a 70% cost difference!

Based on the data above, it’s interesting that the runtime of the Graviton instances was comparable to the non-Graviton instances.  So what, then, caused the huge cost differential?  It seems that at the end of the day, total job cost generally followed the trends of the machines’ list prices.  Let’s look deeper.

The table below shows the instances and their on-demand list prices, in order of lowest to highest cost.  The lowest-priced instance was the Graviton c6g, which corresponds to the study above, where the c6g had the lowest total job cost.

However, there were some exceptions where more expensive instances still had cheaper total job costs:

  1. c5.xlarge – had the 3rd lowest on-demand price, but the 2nd cheapest overall job cost
  2. r6g.xlarge – had the 5th lowest on-demand price, but the 3rd cheapest overall job cost

These two exceptions show that an instance’s list price doesn’t always dictate total job cost trends.  Sometimes the hardware is such a great fit for your job that it overcomes the higher hourly price.

| Instance | On-Demand List Price ($/hr) |
| --- | --- |
| c6g.xlarge | 0.136 |
| m6g.xlarge | 0.154 |
| c5.xlarge | 0.17 |
| m5.xlarge | 0.192 |
| r6g.xlarge | 0.2016 |
| m5d.xlarge | 0.226 |
| r5.xlarge | 0.252 |
| m5dn.xlarge | 0.272 |
| r5d.xlarge | 0.288 |
| i3.xlarge | 0.312 |
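
As a rough back-of-the-envelope model of how list price flows into total job cost (the EMR fee ratio and runtimes below are assumptions for illustration; the real EMR fee is a separate per-instance-hour charge that varies by type):

```python
def job_cost(price_per_hour: float, num_workers: int,
             runtime_hours: float, emr_fee_ratio: float = 0.25) -> float:
    """EC2 cost for driver + workers, plus an approximate EMR management fee."""
    nodes = num_workers + 1                       # workers plus one driver
    ec2 = price_per_hour * nodes * runtime_hours
    return round(ec2 * (1 + emr_fee_ratio), 2)

# Hypothetical runtimes: a cheaper-per-hour instance that runs longer can
# still win (or lose) on total job cost.
print(job_cost(0.136, 32, 1.5))   # c6g.xlarge cluster
print(job_cost(0.312, 32, 1.8))   # i3.xlarge cluster
```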

Conclusion

So at the end of the day, do Graviton instances save you money?  From this study, I’d say that on average their cost/performance numbers were in fact better than those of other popular instances.  However, as we saw above, it is not always true and, like most things we post – it depends.

If you’re able to explore different instance types, I’d definitely recommend trying out Graviton instances, as they look like a pretty solid bet.  

To revisit AWS’s claims of Graviton instances being 30% cheaper and 15% more performant: based on the data above, that is not always true and depends on a lot of cluster parameters.

For example, one thing we’ll note is that the AWS study only used *.2xlarge workers, whereas our study only looked at *.xlarge worker instances.  We also don’t know what Apache Spark configurations they used or whether they matched ours.

At the end of the day, everything depends on your workload and what your job is trying to do.  There is no one-size-fits-all instance for your jobs.  That’s why we built the Apache Spark Gradient to help users easily optimize their Apache Spark configurations and instance types to help hit their cost and runtime needs.

Feel free to try out the Spark Gradient yourself!

How poor provisioning of cloud resources can lead to 10X slower Apache Spark jobs

The Situation

Let’s say you’re a data engineer who wants to run a data/ML Spark job on AWS as fast as possible, avoiding slow Apache Spark performance. After you’ve written your code to be as efficient as possible, it’s time to deploy to the cloud.

Here’s the problem: there are over 600 instance types in AWS (today), and if you add in the various Spark parameters, the number of possible deployment options becomes impossibly large. So inevitably you take a rough guess, or experiment with a few options, pick one that works, and forget about it.

The Impact

It turns out this guessing game could undo all the work put into streamlining your code. The graph below shows the performance of a standard Bayes ML Spark job from the HiBench test suite. Each point on the graph is the result of changing just 2 parameters: (1) which compute instance is used, and (2) the number of nodes.

Clearly we can see the issue here, even with this very simple example. If a user selects just 2 parameters poorly, the runtime could be up to 10X slower or cost twice as much as it should (with little to no performance gain).

Keep in mind that this is a simplified picture, where we have ignored Spark parameters (e.g. memory, executor count) and cloud infrastructure options (e.g. storage volume, network bandwidth) which add even more uncertainty to the problem.

Daily Changes Make it Worse

To add yet another complication, data being processed today could look very different tomorrow. Fluctuations in data size, skew, and even minor modifications to the codebase can lead to crashed or slow jobs if your production infrastructure isn’t adapting to these changing needs.

How Sync Solved the Problem

At Sync, we think this problem should go away. We also think developers shouldn’t waste time running and testing their job on various combinations of configurations. We want developers to get up and running as fast as possible, completely eliminating the guesswork of cloud infrastructure. At its heart, our solution profiles your job, solves a huge optimization problem, and then tells you exactly how to launch to the cloud.

Try the Sync Gradient for Apache Spark yourself.

How does the worker size impact costs for Apache Spark on EMR AWS?

Here at Sync, we are passionate about optimizing data infrastructure on the cloud, and one common point of confusion we hear from users is: what worker instance size is best for their job?

Many companies run production data pipelines with Apache Spark on the Elastic MapReduce (EMR) platform on AWS.  As we’ve discussed in previous blog posts, wherever you run Apache Spark, whether on Databricks or EMR, the infrastructure you run it on can have a huge impact on overall cost and performance.

To make matters even more complex, the infrastructure settings can change depending on your business goals.  Is there a service level agreement (SLA) time requirement?  Do you have a cost target?  What about both?  

One of the key tuning parameters is the instance size your workers run on.  Should you use a few large nodes?  Or perhaps a lot of small nodes?  In this blog post, we take a deep dive into some of these questions utilizing the TPC-DS benchmark.

Before starting, we want to be clear that these results are very specific to the TPC-DS workload.  While it may be nice to generalize, we cannot predict that these trends will hold true for other workloads.  We highly recommend people run their own tests to confirm.  Alternatively, we built Gradient for Apache Spark to help accelerate this process (feel free to check it out yourself!).

With that said, let’s go!

The Experiment

The main question we seek to answer is – “How does the worker size impact cost and performance for Spark EMR jobs?”  Below are the fixed parameters we used when conducting this experiment:

  • EMR Version: 6.2
  • Driver Node: m5.xlarge
  • Driver EBS storage: 32 GB
  • Worker EBS storage: 128 GB 
  • Worker instance family: m5
  • Worker type: Core nodes only
  • Workload: TPC-DS 1TB (Queries 1-98 in series)
  • Cost structure: On-demand, list price (to avoid Spot node variability)
  • Cost data: Extracted from the AWS cost and usage reports, includes both the EC2 fees and the EMR management fees

Fixed Spark settings:

  • spark.executor.cores: 4
  • Number of executors: set to utilize 100% of the cluster, based on the cluster size
  • spark.executor.memory: automatically set based on the number of cores

The fixed Spark settings we selected were meant to mimic safe “default” settings that an average Spark user might select at first.  To explain those parameters a bit more: since we are changing the worker instance size in this study, we kept the number of cores per executor constant at 4.  The other parameters, such as the number of executors and executor memory, are automatically calculated to utilize the machines at 100%.

For example, if a worker has 16 cores, we create 4 executors per worker; if a worker has 32 cores, we create 8 executors.  A quick sketch of this calculation is below.
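
Here’s that sizing arithmetic as a small Python sketch (the 10% memory reserve for OS/overhead is an assumption, not the exact value used in the study):

```python
def executor_settings(worker_cores: int, worker_memory_gb: float,
                      num_workers: int, cores_per_executor: int = 4) -> dict:
    """Mimic the 'default' settings above: fixed cores per executor,
    executors and memory sized to fill each worker."""
    executors_per_worker = worker_cores // cores_per_executor   # e.g. 16 // 4 = 4
    # reserve ~10% of worker memory for OS/overhead (assumed), split the rest evenly
    executor_memory_gb = int(worker_memory_gb * 0.9 / executors_per_worker)
    return {
        "spark.executor.cores": cores_per_executor,
        "num_executors": executors_per_worker * num_workers,
        "spark.executor.memory": f"{executor_memory_gb}g",
    }

# m5.4xlarge workers: 16 vCPUs, 64 GB each -> 4 executors per worker
print(executor_settings(worker_cores=16, worker_memory_gb=64, num_workers=20))
```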

The variables we are sweeping are outlined below:

  • Worker instance type: m5.xlarge, m5.2xlarge, m5.4xlarge
  • Number of workers: 1-50 nodes

Results

The figure below shows the Spark runtime versus the number and type of workers.  The trend here is pretty clear: larger clusters are in fact faster, and the 4xlarge worker size outperformed all the others.  If speed is your goal, selecting larger workers could help.  If one were to pick a best instance based on the graph below, one might draw the conclusion that:

It looks like the 4xlarge is the fastest choice

The figure below shows the true total cost versus the number and type of workers.  On the cost metric, the story almost flips compared to the runtime graph above.  The smallest instance usually outperformed larger instances when it came to lowering costs.  For 20 or more workers, the xlarge instances were cheaper than the other two choices.

If one were to quickly look at the plot below for the “lowest points,” which correspond to the lowest cost, one could draw the conclusion that:

It looks like the 2xlarge and xlarge instance are the lowest cost, depending on the number of workers

However, the real story comes when we merge those two plots and look at cost vs. runtime simultaneously.  In this plot, it is more desirable to be toward the bottom left, where a run is both lower cost and faster.  As the plot below shows, if one were to look at the lowest points, the conclusion to be drawn is:

It looks like 4xlarge instances are the lowest cost choice… what?

What’s going on here is that for a given runtime, there is always a lower cost configuration with the 4xlarge instances.  Put in that perspective, there is little reason to use the xlarge size, as going to larger machines can get you something both faster and cheaper.

The only caveat is that there is a floor to how cheap (and slow) the 4xlarge cluster can get, reached at a worker count of 1.  Meaning, you could get a cheaper cluster with smaller 2xlarge workers, but the runtime becomes quite long and may be unacceptable for real-world applications.

Here’s a general summary of how the “best worker” choice can change depending on your cost and runtime goals:

| Runtime Goal | Cost Goal | Best Worker |
| --- | --- | --- |
| <20,000 seconds | Minimize | 4xlarge |
| <30,000 seconds | Minimize | 2xlarge |
| <A very long time | Minimize | xlarge |

A note on extracting EMR costs

Extracting the actual true costs for individual EMR jobs from AWS billing information is not straightforward.  We had to write custom scripts to scan the low-level cost and usage reports, looking for specific EMR cluster tags.  The exact mechanism for retrieving these costs will probably vary from company to company, as different security permissions may alter how these costs can be extracted.
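
As a rough sketch of what such a script can look like (assuming a cost and usage report exported to Parquet with Athena-style column names; your tag key and schema may differ):

```python
import pandas as pd

cur = pd.read_parquet("cur_export/")   # hypothetical local copy of the CUR

# Filter line items down to one EMR cluster via its resource tag.
job = cur[cur["resource_tags_user_cluster_id"] == "j-ABC123EXAMPLE"]

ec2_cost = job.loc[job["product_product_name"] == "Amazon Elastic Compute Cloud",
                   "line_item_unblended_cost"].sum()
emr_fee = job.loc[job["product_product_name"] == "Amazon Elastic MapReduce",
                  "line_item_unblended_cost"].sum()

print(f"total job cost: {ec2_cost + emr_fee:.2f} (EC2 + EMR management fee)")
```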

If EMR costs are a high priority at your company and you’d like help extracting your true EMR job-level costs, feel free to reach out to us here at Sync – we’d be happy to work together.

Conclusion

The main takeaways here are the following points:

  • It Depends:  Selecting the “best” worker is highly dependent on both your cost and runtime goals.  It’s not straightforward what the best choice is.
  • It really depends:  Even with cost and runtime goals set, the “best” worker will also depend on the code, the data size, the data skew, Spot instance pricing, and availability, to name just a few.
  • Where even are the costs?  Extracting the actual cost per workload is not easy in AWS; it is actually quite painful to capture both the EC2 and EMR management fees.

Of course here at Sync, we’re working on making this problem go away.  This is why we built the Spark Gradient product to help users quickly understand their infrastructure choices given business needs.  

Feel free to check out the Gradient yourself here!

You can also read our other blog posts here which go into other fundamental Spark infrastructure optimization questions.

Databricks driver sizing impact on cost and performance

As many previous blog posts have reported, tuning and optimizing the cluster configurations of Apache Spark is a notoriously difficult problem.  Especially when a data engineer needs to lower costs or accelerate runtimes on platforms such as EMR or Databricks on AWS, tuning these parameters becomes a high priority.  

Here at Sync, we will experimentally explore the impact of driver sizing in the Databricks platform on the TPC-DS 1TB benchmark, to see if we can obtain an understanding of the relationship between the driver instance size and cost/runtime of the job.

Driver node review

For those who may be less familiar with the details of the driver node in Apache Spark, there are many excellent previous blog posts, as well as the official documentation, and we recommend reading those if you’re unfamiliar.  As a quick summary, the driver is an important part of the Apache Spark system and effectively acts as the “brain” of the entire operation.

The driver program runs the main() function, creates the SparkContext, and schedules tasks onto the worker nodes.  Aside from these high-level functions, we’d like to note that the driver node is also used in the execution of some operations, most famously the collect operation and broadcast joins.  During those operations, data is moved to the driver node, and if the driver is not appropriately sized, this can cause a driver-side out-of-memory error that can shut down the entire cluster.
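
A quick PySpark illustration of those two driver-side pressure points:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()
df = spark.range(100_000_000)      # a large distributed DataFrame

# collect() ships every row to the driver; on an undersized driver this is
# a classic way to trigger a driver-side out-of-memory error.
# rows = df.collect()              # risky on large data
preview = df.limit(10).collect()   # bounded alternative

# A broadcast join materializes the small table on the driver before shipping
# it to the executors, so the driver must hold it in memory too.
small = spark.range(1_000).withColumnRenamed("id", "key")
joined = df.join(broadcast(small), df["id"] == small["key"], "left")
```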

As a quick side note, it looks like a ticket has been filed to change this behavior (at least for broadcast joins) in the open source Spark core, so people should be aware that this may change in the future.

How Does Driver Sizing Impact Performance As a Function of the Number Of Workers?

The main experimental question we want to ask is “how does driver sizing impact performance as a function of the number of workers?”  The reason why we want to correlate driver size with the number of workers is that the number of workers is a very important parameter when tuning systems for either cost or runtime goals.  Observing how the driver impacts the worker scaling of the job is a key part of understanding and optimizing a cluster.

Fundamentally, the maximum number of tasks that can be executed in parallel is determined by the number of workers and executors.  Since the driver node is responsible for scheduling these tasks, we wanted to see if the number of workers changes the hardware requirements of the driver.  For example, does scheduling 1 million tasks require a different driver instance type than scheduling 10 tasks?  

Experimental Setup

The technical parameters of the experiment are below:

  • Data Platform:  Databricks
  • Compute type: Jobs (ephemeral cluster, 1 job per cluster)
  • Photon Enabled: No
  • Fixed parameters:  All worker nodes are i3.xlarge, all configs default
  • Sweep parameters:  Driver instance size (r5a.large, r5a.xlarge, r5a.4xlarge), number of workers
  • AWS market:  On-demand (to eliminate Spot fluctuations)
  • Workload: Databricks’ own benchmark on TPC-DS 1TB (all queries run sequentially)

For reference, the 3 drivers span a wide range of hardware: r5a.large (2 vCPUs, 16 GiB memory), r5a.xlarge (4 vCPUs, 32 GiB), and r5a.4xlarge (16 vCPUs, 128 GiB).

The Result

We will break down the results into 3 main plots.  The first, below, looks at runtime vs. number of workers for the 3 different driver types.  In the plot we see that as the number of workers increases, the runtime decreases.  We note that the scaling trend is not linear, and there is a typical “elbow” (we previously published on the general concept of scaling jobs).  We observe that the largest driver, r5a.4xlarge, yielded the fastest performance across all worker counts.

In the plot below we see the cost (DBUs in $) vs. the number of workers.  For the most part, the medium-sized driver, r5a.xlarge, is the most economical, except at the smallest number of workers, where the smallest driver, r5a.large, was the cheapest.

Putting both plots together, we can see the general summary when we plot cost vs. runtime.  The small numbers next to each point show the number of workers.  In general, the ideal points should be toward the bottom left, as that indicates a configuration that is both faster and cheaper.  Points that are higher up or to the right are more expensive and slower.  

Some companies are only concerned about service level agreement (SLA) timelines, and do not actually need the “fastest” possible runtime.  A more useful way to think about the plot below is to ask the question “what is the maximum time you want to spend running this job?”  Once that number is known, you can then select the configuration with the cheapest cost that matches your SLA.  

For example, consider the SLA scenarios below:

1)  SLA of 2500s – If you need your job to be completed in 2,500s or less, then you should select the r5a.4xlarge driver with a worker size of 50.

2)  SLA of 4000s – If you need your job to be completed in 4,000s or less, then you should select the r5a.xlarge driver with a worker size of 20.

3)  SLA of 10,000s – If you need your job to be completed in 10,000s or less, then you should select the r5a.large driver with a worker size of 5.
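
In code, the selection rule is just “cheapest configuration that meets the SLA.” A sketch with illustrative numbers (read off the scenarios above, not the exact measured values):

```python
# Illustrative (driver, workers, runtime, cost) points; not exact measurements.
configs = [
    {"driver": "r5a.4xlarge", "workers": 50, "runtime_s": 2400, "cost_usd": 95.0},
    {"driver": "r5a.xlarge",  "workers": 20, "runtime_s": 3900, "cost_usd": 60.0},
    {"driver": "r5a.large",   "workers": 5,  "runtime_s": 9500, "cost_usd": 30.0},
]

def cheapest_within_sla(configs, sla_seconds):
    """Among configurations that meet the SLA, pick the lowest cost one."""
    feasible = [c for c in configs if c["runtime_s"] <= sla_seconds]
    return min(feasible, key=lambda c: c["cost_usd"]) if feasible else None

print(cheapest_within_sla(configs, 4000))   # -> the r5a.xlarge / 20-worker point
```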

Key Insights

It’s very convenient to see the scaling trend of all 3 drivers plotted in this manner, as there are several key insights gained here:

  1. There is a general “good” optimal driver for TPC-DS 1TB – across the spectrum, it’s clear that r5a.xlarge is a good choice generally as it is usually cheaper and faster than the other driver sizes.  This shows the danger that if your driver is too big or too small, you could be wasting money and time.  
  2. At the extremes, driver size matters for TPC-DS 1TB  – At the wings of either large clusters (50 workers) or small clusters (5 workers) we can see that the best driver selection can swing between all 3 drivers.  
  3. Drivers can be too big – At 12 workers, the r5a.4xlarge performance is slightly faster but significantly more expensive than the other two driver types.  Unless that slight speedup is important, it’s clear to see that if a driver is too large, then the extra cost of the larger driver is not worth the slight speedup.  It’s like buying a Ferrari to just sit in traffic – definitely not worth it (although you will look cool).
  4. Small driver bottleneck – For the small driver curve (r5a.large), we see that the blue line’s elbow occurs at a higher runtime than the middle driver (r5a.xlarge).  This implies that the smaller driver is creating a runtime bottleneck for the entire workload as the cluster becomes larger.  The next section will dive into why.

Root cause analysis for the “small driver bottleneck”

To investigate the cause of the small driver bottleneck, we looked into the Spark event logs to see which values changed as we scaled the number of workers.  In the Spark UI in Databricks, the typical high-level metrics for each task are shown and plotted graphically.  The image below shows an example of a single task broken down into the 7 main metrics:

When we aggregated these values across all tasks, across all the different drivers and worker counts, the numbers were all pretty consistent except for one: “Scheduler Delay.”  For those who may not be familiar, the formal definition from the Databricks Spark UI is shown below:

“Scheduler delay includes time to ship the task from the scheduler to the executor, and time to send the task result from the executor to the scheduler. If scheduler delay is large, consider decreasing the size of tasks or decreasing the size of task results.”

In the graph below, we plot the total aggregated scheduler delay of all tasks for each job configuration vs. the number of workers.  It is expected that the aggregated scheduler delay should increase with a larger number of workers, since there are more tasks.  For example, if there are 100 tasks, each with 1s of scheduler delay, the total aggregated scheduler delay is 100s (even if all 100 tasks executed in parallel and the “wall clock” scheduler delay is only 1s).  Therefore, with 1000 tasks, the total aggregated scheduler delay should increase as well.

Theoretically this should scale roughly linearly with the number of workers for a “healthy” system.  For the “middle” and “large” sized drivers (r5a.xlarge and r5a.4xlarge respectively), we see the expected growth of the scheduler delay.  However, for the “small” r5a.large driver, we see a very non-linear growth of the total aggregated scheduler delay, which contributes to the overall longer job runtime.  This appears to be a large contributor to the “small driver bottleneck” issue.

To understand the formal definition of scheduler delay a bit more deeply, let’s look at the Spark source code in AppStatusUtils.scala.  At a high level, scheduler delay is a simple calculation, as shown in the code below:

schedulerDelay = duration - runTime - deserializeTime - serializeTime - gettingResultTime

To put it in plain terms, scheduler delay is basically a catch-all term: the time a task spends doing something other than executing, serializing data, or getting results.  A further question is which of these terms increases or decreases with a smaller driver.  Maybe duration is increasing, or maybe gettingResultTime is decreasing?
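
To make the bookkeeping concrete, here’s the same calculation transcribed to Python and aggregated across tasks. The flat field names are a simplification of what actually appears in a Spark event log:

```python
# Python transcription of the AppStatusUtils calculation above (times in ms).
def scheduler_delay(task: dict) -> int:
    return (task["duration"]
            - task["executor_run_time"]
            - task["executor_deserialize_time"]
            - task["result_serialization_time"]
            - task["getting_result_time"])

# One illustrative task record, simplified from the event log schema.
tasks = [
    {"duration": 1200, "executor_run_time": 1000, "executor_deserialize_time": 50,
     "result_serialization_time": 20, "getting_result_time": 30},
]

total_delay = sum(scheduler_delay(t) for t in tasks)   # aggregate across all tasks
print(total_delay)   # -> 100 ms of scheduler delay for this single task
```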

If we look at the apples to apples case of 32 workers for the “medium” r5a.xlarge driver and the “small” r5a.large driver, the runtime of the “small” driver was significantly longer.  One could hypothesize that the average duration per task is longer (vs. one of the other terms becoming smaller).  

In summary, our hypothesis here is that by reducing the driver size (number of VCPUs and memory), we are incurring an additional time “tax” on each task by taking, on average, slightly longer to ship a task from the scheduler on the driver to each executor.  

A simple analogy: imagine you’re sitting in bumper-to-bumper traffic on a highway, and all of a sudden every car (a task in Spark) grows 20% longer.  With enough cars, you could be set back miles.

Conclusion

Based on the data described above, the answer to the question above is that inappropriately sized drivers can lead to excess cost and degraded performance as workers scale up and down.  We present a hypothesis that a driver that is “too small,” with too few vCPUs and too little memory, could cause, on average, an increase in task duration via additional overhead in the scheduler delay.

This final conclusion is not terribly new to those familiar with Spark, but we hope that seeing actual data helps create a quantitative understanding of the impact of driver sizing.  There are of course many other ways a poorly sized driver can elongate or even crash a job (as described earlier via the OOM errors).  This analysis was just a deep dive into one observation.

I’d like to put a large caveat here that this analysis was specific to the TPC-DS workload, and it would be difficult to generalize these findings across all workloads.  Although the TPC-DS benchmark is a collection of very common SQL queries, in reality individual code, or things like user defined functions, could throw these conclusions out the window.  The only way to know for sure about your workloads is to run some driver sizing experiments.

As we’ve mentioned many times before, distributed computing is complicated, and optimizing your cluster for your job needs to be done on an individual basis.  That’s why we built the Apache Spark Autotuner for EMR and Databricks on AWS, to help data engineers quickly find the answers they are looking for.