Everything You Need To Know About Azure Databricks Pricing 2024
When trying to determine Databricks pricing, one of the most important aspects to consider is the cost of your cloud provider. This means one of three companies: Microsoft Azure, Amazon AWS, or Google Cloud.
All three cloud service providers are extremely popular, and there’s no definitive answer to which is best. Most companies choose one based on their existing software stack or suite of products. This is a perfectly reasonable way to choose a cloud service provider, and all three work very well with Databricks.
However, there are some small differences in price, user experience, and integration among the three. That’s why we decided to put together a quick guide explaining the exact costs of Azure Databricks.
If you’re looking for a full guide on everything related to Databricks pricing, including how to calculate your compute cost, check out our full guide here. For everyone else, continue on here.
First off, what is a “Cloud Service Provider”?
According to Google, a Cloud Service Provider is “a third-party company that provides scalable computing resources that businesses can access on demand over a network”.
In practical terms, this means storage, computing power, or database access that enterprise companies use over the internet. Amazon Web Services, Google Cloud Platform, Oracle, Alibaba Cloud, IBM Cloud and Microsoft Azure are among the most popular cloud service providers.
In the context of Databricks specifically, your cloud service provider is the processing layer on which the Databricks analytics platform runs. Your virtual machines (VMs), data storage, security and compute costs are all tied to your cloud provider. For this reason, costs differ slightly based on instance type, compute type, virtual machine, and add-ons like security and storage.
Cost of Databricks Azure vs AWS vs Google Cloud
Among the three cloud service providers, AWS generally seems to be the cheapest, while Azure is the most expensive. The difference in cost depends on the type of compute, but an apples-to-apples comparison of Jobs Compute on the Standard tier is $0.07 per DBU-hour for AWS, $0.10 for Google Cloud and $0.15 for Azure Databricks. However, this difference disappears for All-Purpose Compute, with all three providers coming in at $0.40 per DBU-hour.
Part of the reason for the higher cost is that Azure Databricks is considered a Microsoft first-party service, meaning it’s natively integrated with Microsoft and optimized for a host of their products including Power BI, Azure Synapse Analytics and Azure Data Lake Storage. This is a very unusual move and puts Databricks in rarefied air among the third-party companies with which Microsoft has partnered.
Because the Microsoft suite of products is so popular among analysts, Azure often benefits from a network effect: companies are frequently already using its cloud service when they start using Databricks.
This has been a largely successful partnership. Users have reported positively on the ease of access and sharing with Azure Databricks (the portal can be set up and accessed with a single click) and on the included mission-critical tech support, which can itself cost thousands of dollars annually when purchased through Databricks.
How Instances and VMs Affect Azure Databricks Pricing
When it comes to calculating your total cost for Databricks with Azure, the number and type of your instances play a large role.
Instances refer to virtual machines (VMs), which, as the name suggests, are processing hardware allocated to you by your cloud service provider (in this case Azure). The important factors to consider with instances are the type and size of your virtual machine.
For Azure there are several different types of virtual machines that will be referred to under Instance type:
- General Purpose
- Compute Optimized
- Memory Optimized
- Accelerated Computing
- Storage Optimized
These instance types are straightforward, with the name explaining what they are best used for. You will notice families of instances denoted by the first letter in their naming convention, which classifies them as suited to large workloads (R series) or sustained high performance (G series). Instances often list their generation, denoted by a “v” followed by the generation or version number (v1, v2, v3, etc.). Newer-generation instances generally cost more.
Apart from instance type, the instance size determines how much you pay for processing power. The two factors here are the number of CPU cores and total RAM. CPU cores are often listed after the first letter of an instance (for example, E16d has 16 cores). The larger the instance, the more it costs: among instances of the same type and generation, prices range from $0.825 per DBU hour for a 4-core instance up to $18.15 per DBU hour for a 96-core instance.
To get your total cost of Databricks, add your DBU compute price to your monthly instance price.
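As a rough illustration of that sum, here is a minimal Python sketch. The function name, its parameters, and every number in the usage lines are hypothetical placeholders rather than Azure list prices, and the sketch assumes both DBUs and VM time are billed per node per hour.

```python
def estimate_monthly_cost(dbus_per_node_hour, dbu_rate, vm_hourly_price,
                          hours_per_month, node_count=1):
    """Return (dbu_cost, vm_cost, total) in USD for one month of cluster usage.

    All inputs are assumptions supplied by the caller:
      dbus_per_node_hour -- DBUs one node consumes per hour
      dbu_rate           -- USD per DBU-hour for the chosen compute type
      vm_hourly_price    -- USD per hour for one VM (instance)
    """
    node_hours = hours_per_month * node_count
    dbu_cost = dbus_per_node_hour * dbu_rate * node_hours
    vm_cost = vm_hourly_price * node_hours
    return dbu_cost, vm_cost, dbu_cost + vm_cost


# Hypothetical workload: 4 nodes running 200 hours a month on Jobs Compute.
dbu_cost, vm_cost, total = estimate_monthly_cost(
    dbus_per_node_hour=2.0,   # placeholder, depends on the instance
    dbu_rate=0.15,            # Jobs Compute rate quoted in this article
    vm_hourly_price=0.50,     # placeholder VM price
    hours_per_month=200,
    node_count=4,
)
print(f"DBU: ${dbu_cost:,.2f}  VM: ${vm_cost:,.2f}  Total: ${total:,.2f}")
```

Keeping the two components separate in the return value makes it easier to see whether the Databricks charge or the underlying VM charge dominates a given workload.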
Azure Databricks Compute Pricing
Here is a quick breakdown of rates by compute type for the Standard plan in the U.S. Central Zone:
- Jobs Light Compute: $0.07/DBU-hour
- Jobs Compute: $0.15/DBU-hour
- All-Purpose Compute: $0.40/DBU-hour
Here is a breakdown of services only available in the Premium pricing plan in the U.S. Central Zone:
- SQL Compute: $0.22/DBU-hour
- SQL Pro Compute: $0.44/DBU-hour
- Serverless SQL: $0.44/DBU-hour
- Serverless Real-Time Inference: $0.079/DBU-hour
For a full breakdown check out the dedicated Microsoft Azure page on Databricks pricing.
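To see how much the compute type changes the bill for the same amount of work, the rates listed above can be placed in a small lookup table. This is only a sketch: the dictionary and function names are made up for illustration, and the rates are copied from this article, so they may not match current Azure pricing.

```python
# Rates quoted in this article (USD per DBU-hour, U.S. Central Zone).
STANDARD_RATES = {
    "Jobs Light Compute": 0.07,
    "Jobs Compute": 0.15,
    "All-Purpose Compute": 0.40,
}
PREMIUM_ONLY_RATES = {
    "SQL Compute": 0.22,
    "SQL Pro Compute": 0.44,
    "Serverless SQL": 0.44,
    "Serverless Real-Time Inference": 0.079,
}


def dbu_charge(compute_type, dbu_hours):
    """Return the DBU charge in USD for a number of DBU-hours (VM cost excluded)."""
    rates = {**STANDARD_RATES, **PREMIUM_ONLY_RATES}
    if compute_type not in rates:
        raise KeyError(f"No rate listed for compute type: {compute_type!r}")
    return rates[compute_type] * dbu_hours


# Compare the Standard-plan compute types for the same 1,000 DBU-hours.
for name in STANDARD_RATES:
    print(f"{name}: ${dbu_charge(name, 1_000):,.2f}")
```

The charge returned here covers DBUs only; the VM (instance) cost from your cloud provider is billed on top of it.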
Saving with Pre-purchase Plans
One of the big ways you can save on Azure Databricks pricing is through pre-purchase plans, also known as Databricks commit units (DBCUs). In essence, you predict a certain amount of Databricks usage and pay for that amount up front. The incentive for doing this is large savings: up to 37%.
Pre-purchase plans come in 1-year or 3-year terms. The more you buy, both in commit units and in contract length, the more you save. Here’s a breakdown of how much you can save at each level of Databricks pricing.
Databricks commit units (DBCU) | Price (with discount) | Discount | Contract term |
25,000 | $23,500 | 6% | 1 year |
50,000 | $46,000 | 8% | 1 year |
100,000 | $89,000 | 11% | 1 year |
200,000 | $172,000 | 14% | 1 year |
350,000 | $287,000 | 18% | 1 year |
500,000 | $400,000 | 20% | 1 year |
750,000 | $577,500 | 23% | 1 year |
1,000,000 | $730,000 | 27% | 1 year |
1,500,000 | $1,050,000 | 30% | 1 year |
2,000,000 | $1,340,000 | 33% | 1 year |
75,000 | $69,000 | 8% | 3 years |
150,000 | $135,000 | 10% | 3 years |
300,000 | $261,000 | 13% | 3 years |
600,000 | $504,000 | 16% | 3 years |
1,050,000 | $819,000 | 22% | 3 years |
1,500,000 | $1,140,000 | 24% | 3 years |
2,250,000 | $1,642,500 | 27% | 3 years |
3,000,000 | $2,070,000 | 31% | 3 years |
4,500,000 | $2,970,000 | 34% | 3 years |
6,000,000 | $3,780,000 | 37% | 3 years |
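The discount column in the table follows from the other two columns: it is one minus the price divided by the number of commit units. The Python sketch below checks that relationship and picks a tier for a usage forecast. The helper names are hypothetical, and the selection rule (take the largest tier that does not exceed the forecast, so nothing is prepaid and left unused) is a simplifying assumption, not official guidance.

```python
# (commit units, discounted price in USD) pairs copied from the table above.
ONE_YEAR_TIERS = [
    (25_000, 23_500), (50_000, 46_000), (100_000, 89_000),
    (200_000, 172_000), (350_000, 287_000), (500_000, 400_000),
    (750_000, 577_500), (1_000_000, 730_000), (1_500_000, 1_050_000),
    (2_000_000, 1_340_000),
]
THREE_YEAR_TIERS = [
    (75_000, 69_000), (150_000, 135_000), (300_000, 261_000),
    (600_000, 504_000), (1_050_000, 819_000), (1_500_000, 1_140_000),
    (2_250_000, 1_642_500), (3_000_000, 2_070_000), (4_500_000, 2_970_000),
    (6_000_000, 3_780_000),
]


def discount_pct(dbcu, price):
    """Discount implied by a tier, as a whole-number percentage."""
    return round((1 - price / dbcu) * 100)


def largest_tier_within(forecast_dbcu, tiers):
    """Largest tier whose commit units do not exceed the forecast, or None."""
    eligible = [tier for tier in tiers if tier[0] <= forecast_dbcu]
    return max(eligible, default=None)


# Hypothetical forecast: about 120,000 commit units of usage over one year.
tier = largest_tier_within(120_000, ONE_YEAR_TIERS)
if tier is not None:
    dbcu, price = tier
    print(f"Tier: {dbcu:,} DBCU for ${price:,} ({discount_pct(dbcu, price)}% off)")
```

Usage above the chosen tier would be billed at pay-as-you-go rates under this assumption, so a forecast that sits just below the next tier is worth pricing both ways.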
Enhanced Security & Compliance Add-on
For Premium-tier Azure customers processing regulated data, Azure Databricks offers enhanced security and controls for their compliance needs. The add-on is priced at 10% of list price, added to the Azure Databricks product spend in a selected workspace. Read more about the security and compliance add-on here.
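In arithmetic terms, the add-on is a 10% uplift on a workspace's list-price Databricks spend. A minimal Python sketch follows; the function name and the sample figure are hypothetical, and the calculation assumes the uplift applies only to the Databricks product spend, not to the underlying VM charges.

```python
def spend_with_security_addon(list_price_spend, addon_rate=0.10):
    """Return workspace Databricks spend in USD including the add-on.

    list_price_spend -- Databricks product spend at list price (assumed input)
    addon_rate       -- 10% as stated in this article
    """
    return list_price_spend * (1 + addon_rate)


# Hypothetical workspace with $5,000 of monthly Databricks spend at list price.
print(f"${spend_with_security_addon(5_000):,.2f}")
```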
Conclusion: Is Azure Databricks Worth It?
If you’re a company that really values the Microsoft suite and enjoys working with Azure, then Azure Databricks is 100% worth it. It’s highly integrated, and has a tremendous support team and UI experience.
If you are not yet invested in Azure or the Microsoft suite of products, it may not be worth the premium to run Databricks on Azure. If you’re looking for the cheapest cloud service provider on which to run Databricks, your best bet is likely Amazon AWS.
Again, all three major cloud service providers are popular, and it’s really hard to go wrong with any of them. Make sure to weigh cost and integration with your existing software when deciding which is best for you.