The Holy Grail: Securely Use Real Production Data in Development with Snowflake & ALTR 

Snowflake Cloning and ALTR’s FPE enable secure, production-like data access for developers without compromising security or compliance.

In the world of data management, agility and security often feel like opposing forces. On one hand, developers need rapid, high-quality access to production-like data to build and test effectively. On the other hand, security teams must enforce strict data protection policies, especially when dealing with sensitive information. Historically, balancing these needs has been difficult and expensive. But by combining Snowflake's Replication Groups and Zero-Copy Cloning with ALTR's Format-Preserving Encryption (FPE), companies can achieve both agility and security in ways that were previously impossible.

Cloning: A Game-Changer for Development 

One of the most transformative features in Snowflake is Zero-Copy Cloning, which allows for the creation of virtual copies of datasets. Unlike traditional database copies, these clones require no additional storage when they are created because they are metadata-based: a clone references the same underlying data as its source, and new storage is consumed only as the clone or the source changes. This means organizations can create multiple clones, whether for development, testing, or pre-production, without the overhead of maintaining separate copies or syncing data manually.

For teams moving from traditional on-premises environments, the benefits of Cloning are striking. Typically, organizations have separate production, development, and/or test environments. Each requires complex synchronization of both code and data. This leads to slow, labor-intensive processes where development is often based on outdated or incomplete data. With Snowflake's Cloning, this problem disappears. Developers can work on an exact copy of production as it stood when the clone was taken, and can re-clone whenever they need fresher data, which keeps environments consistent while keeping the process fast and cost-effective.

Key benefits of Cloning include: 

  • Speed – Creating a clone is nearly instantaneous, as it depends only on the number of micro-partitions involved. 
  • Zero Additional Storage Costs – Clones share the same underlying storage as their source, so a new clone adds no data footprint; storage grows only as data in the clone or the source is changed.
  • Personalized Developer Environments – Developers can each have their own clone without interfering with one another. 
  • No Need for Dedicated Servers – No infrastructure provisioning is required, further reducing overhead. 
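
As a rough illustration of personalized developer environments, the sketch below generates one Snowflake CREATE DATABASE ... CLONE statement per developer. The database name PROD_DB, the developer names, and the _DEV_ naming convention are assumptions made for this example only; the script just builds the SQL text and does not connect to Snowflake.

```python
import re

# Conservative pattern for unquoted identifiers, used to keep the
# generated SQL free of unexpected characters.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def clone_statement(source_db: str, developer: str) -> str:
    """Build a zero-copy clone statement for one developer's database."""
    # Hypothetical naming convention: <SOURCE>_DEV_<DEVELOPER>
    target_db = f"{source_db}_DEV_{developer}".upper()
    for name in (source_db, target_db):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"CREATE DATABASE {target_db} CLONE {source_db};"


# One clone per developer; each statement would be run by a role that is
# allowed to create databases and read the source.
statements = [clone_statement("PROD_DB", dev) for dev in ("alice", "bob")]
for statement in statements:
    print(statement)
```

Because each clone is its own database, one developer dropping or altering tables in their copy does not affect anyone else's.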

Limitations 

In a single-account deployment, cloning by itself is often the ideal solution. However, many organizations face InfoSec constraints that require production and non-production data to be kept in separate accounts. In such cases, cloning alone isn't feasible: a clone can only be created from data that physically exists in the same account, so the non-production account first needs its own physical copy to clone from.

Replication Groups can offer a solution, though it involves full refreshes. This approach transfers a complete physical copy of all data and code from the production database to the non-production account. However, it becomes less cost-effective as an organization scales.
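
For readers who have not set up a replication group before, the following sketch prints the kind of statements involved on each side. The group name, database name, and account identifiers (in organization.account form) are placeholders invented for this example, and a real setup would likely need additional options such as a replication schedule; treat it as an outline rather than a complete procedure.

```python
def replication_statements(group: str, database: str,
                           source_account: str, target_account: str):
    """Return (source_side, target_side) lists of SQL statements.

    Account identifiers are expected as 'organization.account'.
    All names passed in are illustrative placeholders.
    """
    source_side = [
        f"CREATE REPLICATION GROUP {group}\n"
        f"  OBJECT_TYPES = DATABASES\n"
        f"  ALLOWED_DATABASES = {database}\n"
        f"  ALLOWED_ACCOUNTS = {target_account};"
    ]
    target_side = [
        # Creates the secondary (replica) group in the non-production account.
        f"CREATE REPLICATION GROUP {group} AS REPLICA OF {source_account}.{group};",
        # Pulls the data across; rerun whenever a refresh is wanted.
        f"ALTER REPLICATION GROUP {group} REFRESH;",
    ]
    return source_side, target_side


source_side, target_side = replication_statements(
    "PROD_TO_DEV_RG", "PROD_DB", "myorg.prod_account", "myorg.dev_account"
)
print("-- run in the production account")
print("\n".join(source_side))
print("-- run in the non-production account")
print("\n".join(target_side))
```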

The Security Challenge: Why Traditional Masking Isn’t Enough 

Despite these advantages, many organizations face a critical challenge when using Cloning: protecting Personally Identifiable Information (PII), Protected Health Information (PHI), or Payment Card Information (PCI).  

While data masking is great for protecting sensitive data in production, it’s ineffective in development. Developers need access to the full format of data to properly test for all edge cases, and because they require object ownership, they can disable masking policies—exposing raw data in the process. 

The traditional approach to securing sensitive information in development environments therefore involves physically replacing sensitive data with non-sensitive replacement data before pushing it downstream. This ensures that developers don’t have access to real PII, but it comes with significant downsides: 

  • High Costs – Full refreshes from production to non-production environments are expensive and time-consuming, especially when the data needs to be replicated to a separate non-prod account.
  • Loss of Agility – Dev environments often lag behind production because data replication is a slow, manual process. 

This is where ALTR’s Format-Preserving Encryption (FPE), available now as a Native App in the Snowflake Marketplace, provides a breakthrough solution.  

How ALTR FPE Enables Secure, Scalable Cloning 

ALTR’s FPE turns traditional development data protection techniques, such as masking and synthetic data, on their head. Instead of altering or obscuring data after it’s been loaded, ALTR protects sensitive information at the source. Because data is encrypted before it is stored in the production environment, it remains readable in production, where decryption is permitted, while being inherently protected and safe for development use. Here’s how it works:

  • Raw data enters Snowflake in its unencrypted form and is then processed through ALTR’s FPE before being stored in production databases. 
  • Encrypted data is fully usable – Unlike traditional encryption, which makes data unreadable, FPE preserves the format of the original data. This means developers can still perform queries and operations on the data without needing decryption. Cardinality is also preserved, ensuring accurate performance benchmarks. Additionally, since the process is deterministic, joins will still work. 
  • Clones remain secure – When production databases are cloned for development, the data remains encrypted. Developers can work with current data without exposing PII, ensuring compliance with InfoSec policies. 
  • Decryption is restricted to production – No matter what a developer does—whether pulling down a clone or disabling masking policies—the data remains encrypted outside of production. 
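
To make the properties in the list above concrete (format preserved, deterministic, reversible only with the key), here is a deliberately simplified toy cipher over digit strings. It is not ALTR's implementation and not a standardized FPE mode; the function names, the key, the round count, and the sample value are all invented for illustration, and the construction should not be used to protect real data.

```python
import hashlib
import hmac

DIGITS = "0123456789"
ROUNDS = 8  # arbitrary for this toy; not a security recommendation


def _round_value(key: bytes, round_no: int, value: int, modulus: int) -> int:
    """Keyed pseudo-random number in [0, modulus), derived with HMAC-SHA256."""
    digest = hmac.new(key, f"{round_no}:{value}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _feistel(key: bytes, digits: str, decrypt: bool) -> str:
    """Alternating Feistel network over a decimal digit string."""
    u = len(digits) // 2
    v = len(digits) - u
    mod_a, mod_b = 10 ** u, 10 ** v
    a, b = int(digits[:u]), int(digits[u:])
    sign = -1 if decrypt else 1
    order = reversed(range(ROUNDS)) if decrypt else range(ROUNDS)
    for i in order:
        if i % 2 == 0:
            a = (a + sign * _round_value(key, i, b, mod_a)) % mod_a
        else:
            b = (b + sign * _round_value(key, i, a, mod_b)) % mod_b
    # Zero-pad so the output has exactly as many digits as the input.
    return f"{a:0{u}d}{b:0{v}d}"


def _transform(key: bytes, text: str, decrypt: bool) -> str:
    """Transform only the digits; separators such as '-' stay in place."""
    digits = "".join(ch for ch in text if ch in DIGITS)
    if len(digits) < 2:
        raise ValueError("need at least two digits")
    replaced = iter(_feistel(key, digits, decrypt))
    return "".join(next(replaced) if ch in DIGITS else ch for ch in text)


def toy_fpe_encrypt(key: bytes, text: str) -> str:
    return _transform(key, text, decrypt=False)


def toy_fpe_decrypt(key: bytes, text: str) -> str:
    return _transform(key, text, decrypt=True)


key = b"demo-key-not-for-real-use"   # hypothetical key
ssn = "123-45-6789"                  # made-up sample value
token = toy_fpe_encrypt(key, ssn)
print(token)                         # same NNN-NN-NNNN shape as the input
print(toy_fpe_decrypt(key, token))   # recovers the original with the key
```

Because the same input and key always produce the same token, two tables encrypted with the same key can still be joined on the encrypted column, and because the mapping is one-to-one, the number of distinct values is unchanged.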

The Result: Agility Without Compromise 

By combining Snowflake Cloning with ALTR’s FPE, organizations no longer have to choose between agility and security. This approach provides: 

  • Guaranteed Data Freshness – Developers always work with current data without the delays of manual replication. 
  • Lower Costs – Incremental replication of encrypted data eliminates expensive full refreshes. 
  • Scalability – This method works seamlessly across multiple environments without added complexity. 
  • Security & Compliance – Even in non-production environments, PII remains protected at all times. 
  • Worry-Free Lower Trust Environments – Securely clone data without compromising trust, even in less secure environments. 

Snowflake replication costs in a region (excluding storage) are approximately 40 credits for a 10 TB database. Refreshing it every two weeks in line with sprint cycles would total over 1,000 credits per year (26 refreshes at 40 credits each is 1,040 credits). However, enabling incremental refreshes could significantly reduce this cost: depending on the amount of churn in the database, a reduction in compute requirements of over 90% is quite possible.
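
The arithmetic behind these figures can be checked with a few lines of Python. The 40-credit and two-week inputs come from the paragraph above; the 90% figure is used here as a round number for the "over 90%" scenario, and the function names are ours.

```python
def annual_refresh_credits(credits_per_refresh: int, weeks_between_refreshes: int) -> int:
    """Credits per year for full refreshes on a fixed cadence (52-week year)."""
    refreshes_per_year = 52 // weeks_between_refreshes
    return credits_per_refresh * refreshes_per_year


def after_reduction(credits: int, reduction_pct: int) -> float:
    """Credits remaining after a percentage reduction in compute."""
    return credits * (100 - reduction_pct) / 100


full = annual_refresh_credits(40, 2)      # 26 refreshes x 40 credits = 1040
incremental = after_reduction(full, 90)   # 1040 x 10 / 100 = 104.0

print(f"Full refreshes:        {full} credits/year")
print(f"With a 90% reduction:  {incremental:.0f} credits/year")
```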

Wrapping Up

For organizations using Snowflake, Cloning is a transformative feature that eliminates many traditional pain points in data management. When combined with ALTR’s Format-Preserving Encryption, it offers a best-of-both-worlds solution—allowing teams to move fast without compromising security. By embracing this approach, businesses can keep their data agile, secure, and simple to manage, paving the way for a more efficient and compliant development workflow.