Snowflake Unistore and Hybrid Tables Explained

Snowflake’s new Unistore workload and Hybrid Tables represent a seismic shift in database technology.  For the past 40 years, data has been stored in independent data silos with transactional data (also known as Online Transaction Processing systems or OLTP) used to support operational workloads and analytic (Operational Data Stores, Data Warehouses or Data Lake) solutions to support Online Analytic Processing (OLAP) workloads.

With Snowflake Unistore, both of these workloads can be seamlessly managed on a single cloud platform.

Snowflake has now delivered a single unified platform (Unistore), which supports both transaction and analytic workloads at massive scale.

What problem are we trying to solve? 

It may not be obvious, but the transactional workloads have wildly different characteristics to analytic processing and these are summarised below:

Comparison of Transactional and Analytic Workloads

Whereas transactional systems typically handle very short, fast updates to single rows with transaction completion averaging 50 milliseconds, most analytic queries process billions of rows taking seconds to minutes to complete.

The diagram below illustrates the typical workload profile on an transactional system.  It shows a fairly repeatable pattern with a higher rate of transactions during the working day with a dip around lunch-time until the evening batch processes start.

Compare the above workload with an typical analytic workload in which there's no pattern to usage at all aside from a wild swings as huge analytic queries are submitted and successfully completed.

The diagram below illustrates the resulting system architecture in which operational systems are kept on entirely separate hardware platforms from analytic queries to avoid contention between these hugely different workloads. 

On the left hand side you have the operational (transactional) systems which run the business while on the right the data warehouse (analytical) systems support business intelligence, ad-hoc reporting and data scientists.

While this architecture has dominated the IT landscape for 40 years, it does have a number of significant drawbacks including:

  • Complexity:  As a result of the separation of transactional and analytical platforms, we need to stitch together multiple disparate technologies from different vendors.

  • Extract Transform and Load:  We need to perform periodic extracts, loads and data transformationto produce useful insights from the analytic systems.  These data pipelines are often complex, fragile and expensive to maintain.

  • Latency:  Despite the best efforts to produce near real-time data loading and automated pipelines, there is an in-built latency involved in extracting and loading data to the analytics platform.

  • Data Silos:  Are the result of workload separation which often leads to further complexity in blending operational and analytic data.

  • Difficulty closing the loop:  There's an increasing need to close the loop between operational and analytic solutions.  For example, retailers can use analytic results to identify cross-sell and up-sell opportunities based upon machine learning models but these need to be quickly acted upon by operational systems.  This becomes increasingly difficult with independently deployed solutions.

To help address some of these concerns, some architects implemented an Operational Data Store (ODS) between the transactional and analytic platforms and while this partly addresses the latency issue, it adds further to system complexity and overall cost.

The Snowflake Solution - Unistore

 Snowflake was originally built to deliver a scalable solution to analytic workloads, but this capability has now been extended to transactional workloads with Unistore.

 Hybrid tables delivers an entirely new database processing engine built from the ground up to support both transactional and analytic workloads at scale.  These build upon the existing Snowflake framework and include additional features including:

  •  Low latency lookups:  The ability to execute sub-second single row fetches on using a unique key

  • Primary Key Constraints:  Primary key constraints are enforced

  • Foreign Key Constraints:  To enforce referential integrity

  • Unique and Non-Unique Indexes: To provide alternate access to data

  • Row Level Locking:  Support for transactions with row level locking and Read Committed isolation level

The SQL code below illustrates the code needed to create a hybrid table:

create hybrid table customer (
  customer_id 			integer primary key,
  full_name 			varchar(255),
  email 			varchar(255),
  customer_info 		variant
);

Of course, because it’s Snowflake you will also get a huge list of additional features including:

Unistore and Hybrid Tables Explained

A hybrid analytic and transactional workload

The ability to host transactional and analytic workloads on a single database platform would by itself be a significant achievement, but Snowflake has gone further.  As data is loaded into Snowflake hybrid tables, the data is stored to support fast single-row transactions.  However, the same data is also stored in columnar format which can be up to 50 times faster at analytic queries. 

This approach is often referred to as  Hybrid Transactional and Analytical Processing (HTAP) and was first described in 2009 by Dr. Hasso Plattner is illustrated in the diagram below:

 

The underlying Unistore technology is based upon the ability to manage both workloads, so data written using the transaction processing engine is also available via the analytic engine.  This means the same platform can be used to support both workloads without the challenges associated with Extract, Transform and Load operations.

What do customers think?

Adobe has been using Hybrid Tables in support of it’s cross-channel marketing campaign. They said:

"Running Adobe Campaign on Snowflake has enabled us to offer unparalleled speed and scale to our customers, who can now leverage our best-in-class cross-channel campaign management functionality with performance that can't be matched," - Nick Hall. (Head of Adobe Campaign).

A new target architecture

With the new Snowflake Unistore technology, transactional systems can be deployed directly upon Snowflake without the need to extract and load to a separate platform.  This leads to a huge architectural simplification and avoids the need to support multiple platforms and skill sets.  Furthermore, data loaded into Snowflake hybrid tables is seamlessly available for analysis which avoids the need to support both a transactional and operational data stores or an ODS.

The diagram below illustrates the new target architecture.

Snowflake Unistore Architecture

The Benefits of using Unistore and Hybrid Tables

The benefits of using Hybrid Tables include

  • A single, consistently enforced set of data governance and security controls including Tagging and Dynamic Masking.

  • All the existing Snowflake benefits including Time Travel, Zero Copy Clones and Data Sharing.

  • Cloud performance at scale including the ability to dynamically resize the warehouse and scale up to larger workloads.

  • Finally because Hybrid Tables also deliver excellent performance for analytic type queries from a columnar data store without the need to extract from from a transactional system, and load and process it on an analytics platform. This means whenever Snowflake executes a query, it uses either the row store or column store - depending which is most efficient.

Conclusion 

For the past 40 years, database workloads have been separately deployed on transactional or analytic platforms to avoid contention and balance the needs of these two diametrically opposed solutions.  Finally with Snowflake Unistore and hybrid tables, we can achieve transactional query performance in the milliseconds  range with concurrency of up to 10,000 queries per second on the same analytic platform used to query billions of rows.

Clearly, this is a seismic shift in database industry and will help cement Snowflake's leadership in delivering innovative solutions.

Notice Anything Missing?

No annoying pop-ups or adverts. No bull, just facts, insights and opinions. Sign up below and I will ping you a mail when new content is available. I will never spam you or abuse your trust. Alternatively, you can leave a comment below.

Disclaimer: The opinions expressed on this site are entirely my own, and will not necessarily reflect those of my employer.

John A. Ryan

John A. Ryan

Previous
Previous

Top 10 Strategies for Snowflake Cost Optimization

Next
Next

What is Snowflake Query Acceleration Service?