Designing Snowflake Role Based Access Solutions

Lock.png

This article describes a design pattern to deliver a simple but scalable and extensible role-based access control architecture for Snowflake. It removes a huge amount of the complexity involved in delivering an RBAC design and sets the standard for how RBAC should be deployed on Snowflake.

Three-Part Series on RBAC

This article is part of a series of three, including:

  1. Introducing Role Based Access Control, an introduction to RBAC concepts

  2. Snowflake System Defined Roles - Best Practice, which describes Best Practices for RBAC design

  3. This Article - How to architect an RBAC solution.

    Together, they describe the underlying RBAC concepts and best practices needed to deploy a robust but simple Role-Based Access architecture.

A Quick Summary of RBAC Concepts

The diagram below illustrates a typical RBAC scenario that illustrates the nature of granting access to database objects and the difference between ownership of a role and an object.

Snowflake Cloud Data Platform

The diagram above illustrates a number of important RBAC concepts which may be summarised under the headings of:

  • Object Ownership: The PROD_DATA_ADMIN role owns the Database, Schema and Tables and can therefore Insert, Update and delete from the tables and grant access to other roles. Effectively this role manages the objects.

  • Role Ownership: PROD_ROLE_ADMIN owns the PROD_DATA_ADMIN role. Despite the fact it owns the role it has no access to the underlying data which would require a grant. It could however, grant the PROD_DATA_ADMIN role to other users or roles (including itself). Effectively this role manages a role.

  • Inheritance: PROD_DATA_ADMIN has been granted the PROD_USER role. This means it inherits all of the underlying grants and effectively also owns the underlying Database, Schema and Tables. This means RICK has the exact same privileges on the database as LEON.

  • Object Grants: The PROD_READ role has been granted USAGE privilege on the Database and Schema and SELECT on the tables. This means RACHEL can query the data, but unlike LEON or RICK, cannot alter the table structure.

It is worth pausing to fully understand these concepts, as they form the underlying RBAC principles that we will use to implement the solution.

Simplifying Grants

Albert Einstein is quoted as saying “Everything Should be a Simple as Possible, and no Simpler” and while this should be a key tenet of any design, it is especially so for Role-Based Access design where there is almost unlimited scope for complexity. To understand the need fo simplicity, fast forward five years and consider the database landscape for your project. It’s likely you’ll have over 5,000 tables and data accessed by over 20 roles accessed by thousands of users.

Consider the number of grants alone needed to support this system. As every table is potentially accessible using SELECT, INSERT, UPDATE and DELETE access, that means you will need between 5,000 and 20,000 grants per role.

In the very worst case, you might need up to 400,000 individual grants. It’s almost impossible to visualise that number of privileges, let alone verify they are correct. Clearly we need a simple solution.

The SQL statement below shows the way to simply object level grants.

grant select, insert, update, delete, references
   on all tables
   in schema main
   to role PROD_ANALYST;
   

Using this single statement we can reduce the number of individual grants by a factor of 20 and means you can be 100% sure of the access granted to every user with the PROD_ANALYST role.

Future Grants

We can further simplify the process of granting access to new objects with Future Grants. The number of tables in a given schema will never remain static and additional tables and views will be added over time. To ensure any newly added tables are accessible in exactly the same way, Snowflake introduced the concept of Future Grants.

grant select, insert, update, delete, references
   on FUTURE tables
   in schema main
   to role PROD_ANALYST;
   

The SQL above shows the syntax to create a rule whereby any additional tables created in the same schema automatically triggers he creation of the same grants to the PROD_ANALYST role.

Using this feature means once the future grants are created, they are automatically applied whenever new objects are created.

Controlling Grants with Managed Access Schemas

As indicated above, whenever a role has ownership of an object they have full control over it, including the ability to grant access to others.
The diagram below illustrates a common situation whereby the database and schemai s owned by a central administrator (SYSADMIN), while the DATA_ANALYST role has ownership on the Tables and is allowed to create and modify tables in the schema.

130 - Users allowed to grant access.png

The above scenario has a major drawback, in that the owner of the table can grant access to any other role. This means there’s a potential risk that LEON could grant SELECT on the tables to anybody else on the account which is a potential security risk.
The diagram below illustrates a solution in which the schema is marked as a managed access schema.

140 - managed access schemas.png

In the above scenario, despite the fact the PROD_DATA_ANALYST role has created and therefore owns tables in the MAIN schema, only the schema owner (SYSADMIN) can grant access to others. In this case, only the central admin (RACHEL) can grant access to the database, schema and table to the PROD_USER role. The SQL needed to create or modify a managed access schema is listed below.

create schema main
    with managed access;

alter schema main
    with managed access;

Role Based Access Architecture

As indicated above, it’s absolutely vital to simplify the the Role Based Access solution and this can be achieved by organising roles in a sequence of layers. The diagram below illustrates the overall RBAC architecture.

Snowflake Role Based Access Architecture

Snowflake Role Based Access Architecture

The diagram above illustrates the overall multi-layer RBAC architecture which consistent of the following layers.

  • Database Objects: At the lowest level is one or more databases containing schemas and database objects which must be secured.

  • Security Roles: These are the first layer in role-based access and provide a predefined set of access controls. Each role defines (and simplifies) the access to a given database and schema and provides Read, Write, Execute or Full access to the underlying schema objects. These roles use Future Grants to automatically provide Select, Insert, Update, Delete or Execute access to underlying data. Full Access grants ownership access, and in the above example, the PROD_DATA_SCIENTIST role can create or drop tables in the SANDBOX schema.

  • Functional Roles: Are the second layer of role-based access and these roles are granted to individual users to perform a task. Unlike the other roles in the account, these are named to indicate the task assigned the individual, for example PROD_BI_ANALYST or DEV_ENGINEER.

  • Environment Admin Roles: Are the first layer of account control, and these give the ability to delegate admin responsibility for an Environment. A Snowflake Environment is not a physical object, but an arbitrary collection of Roles, Virtual Warehouses and Databases which are managed as a single unit. This could be used, (for example), to provide access to a UAT, TEST or PROD environment. The roles at this level are:

    • PROD_SECURITY_ADMIN: Which is used to create and own all the roles in the environment. This role performs a similar role to USERADMIN but for a given environment. This role has the privilege to CREATE ROLES, but can only manage the roles it owns within it’s own environment.

    • PROD_SYSADMIN: Which is used to create and own the Database and Schemas in the environment. This role performs a similar role to SYSADMIN but for a given environment. This role has the privilege to CREATE DATABASEs.

  • Account Admin Roles: Unlike the Environment Admin roles which are limited to a single environment, these roles have control over every Virtual Warehouse, Role, Database and Schema in the entire Snowflake account. As defined in the RBAC Best Practices, these roles inherit privileges from the underlying Environment Admin Roles which maintains a consistent RBAC hierarchy.

RBAC Environment Administration

As indicated above, one of the main challenges faced by Snowflake Administrators is the mounting complexity involved in managing an ever expanding system. The only way to realistically scale the solution is automate the building and replication of both data and the associated RBAC hierarchy. While Snowflake provides cloning for rapidly producing data copies, there is no corresponding equivalent to replicate an RBAC hierarchy.

The diagram below illustrates how two or more environments can be created using a well designed RBAC architecture, in this case the PROD and a TEST environments.

Snowflake Elasticity

The diagram above illustrates how the multi-layer RBAC architecture can be scaled up to support multiple independent environments. While the USERADMIN and SYSADMIN roles have control of the entire account, the environment admin roles for each environment only have control of their own environment and cannot impact other areas in the system.

Notice the strictly defined naming convention for Roles and Databases which must also extend to Virtual Warehouse names. Each role is prefixed with a short code to indicate the environment, (eg. PROD or TEST), and all roles in the environment are owned by the XXX_SECURITY_ADMIN role. In addition, a set of Security Roles are automatically created for each schema consisting of: XXX_DATABASE_SCHEMA_TYPE. Where the TYPE is one of READ,WRITE, EXECUTE or FULL and DATABASE and SCHEMA identify the corresponding Database and Schema they are granted access to.

Because a strict naming convention has been observed, it is possible (indeed almost certainly necessary) to create a stored procedure to both clone a schema and replicate the associated RBAC hierarchy.

While it may not be obvious at this point, the entire solution can (and has been) scripted and this means the entire task of deploying and replicating RBAC can be automated reducing the risk of manual mistakes. It also means that once 99% of the RBAC solution is scripted, the only remaining effort is to decide which access to give to each Functional Role. For example, the SQL below grants access to a number of functional roles and therefore users.

grant role PROD_EDW_SANDBOX_FULL
   to role PROD_DATA_SCIENTIST;
   
grant role PROD_EDW_MAIN_READ
   to role PROD_DATA_SCIENTIST;
   
grant role PROD_EDW_MAIN_READ
   to role PROD_BI_ANALYST;

Conclusion

Even the smallest analytics application or Data Warehouse will eventually grow and administrators risk being overwhelmed by the complexity of grants and ownership. It’s a situation I have witnessed at many Snowflake customer deployments.

If you find yourself fighting a similar challenge with Snowflake, you need to take control and deploy a simple but scalable Role Based Access architecture.
If you are lucky enough to be reading this before deploying the production system, you are strongly advised to prepare the groundwork with an scalable solution based upon the advice in these articles.

During the past two years, I have personally helped over fifty customers understand and deploy role-based access solutions and not one of them asked for a more complex solution. The fact is, RBAC is almost entirely flexible and has an unlimited capacity for complexity. It’s only by designing out the complexity to deliver a robust, architected solution that you’ll be able to manage.

Three-Part Series on RBAC

This article is part of a series of three, including:

  1. Introducing Role Based Access Control, an introduction to RBAC concepts

  2. Snowflake System Defined Roles - Best Practice, which describes Best Practices for RBAC design

  3. This Article - How to architect an RBAC solution.

Notice Anything Missing?

No annoying pop-ups or adverts. No bull, just facts, insights and opinions. Sign up below and I will ping you a mail when new content is available. I will never spam you or abuse your trust. Alternatively, you can leave a comment below.

Disclaimer: The opinions expressed on this site are entirely my own, and will not necessarily reflect those of my employer.

John A. Ryan

John A. Ryan

Previous
Previous

What is Snowflake Dynamic Data Masking ?

Next
Next

Snowflake Accounts, Databases and Schemas: Best Practices