ALTR Blog

The latest trends and best practices related to data governance, protection, and privacy.

Ask a company which role or team is ultimately responsible for ensuring data protection or data security, and they often cannot give a single, clear answer. Even if the organization has a Chief Data Officer or designated data protection officers, responsibility is typically distributed across various functions under the CTO, the CIO, the risk or compliance team, and the CISO, with input from business units, data scientists, business analysts, product developers, and marketers.

While it might sound nice to say “Data security is everybody’s job,” in practice this scenario commonly leads to an ambiguous, inefficient mess — and serious security gaps.

The High Stakes of Enterprise Software

Even when software is not their primary business, virtually every enterprise today relies heavily on it, purchasing or building applications to improve processes. Examples abound: Insurance companies build mobile apps so policyholders can file claims and adjusters can fill out damage reports. Big retailers and shippers write massive logistical programs to manage complex supply chains. Many types of companies create their own forecasting software. And almost every enterprise tasks solution architects or other application owners with implementing major third-party packages for many corporate functions.

Of course, software vendors are even more heavily engaged in this work, and tensions abound. The CTO wants software that makes the company’s IP portfolio more valuable, product and marketing teams want apps that are better and cheaper, the CISO wants the product to be more secure, and so on. Application owners and the developers who work with them can be pulled in different directions as they try to create and manage highly functional apps. In this setting, security and governance concerns can easily fall by the wayside — affecting not just the vendor itself, but all of their clients as well.

Data Protection Is Critically Important, but Orphaned in Many Organizations

All of these organizations rely heavily on the data that flows into and out of enterprise applications. The good news is that these apps function as superhighways for the flow of data, bringing huge benefits in terms of productivity.

But the benefits also come with real risks. Now more than ever, business apps handle many kinds of data coming in at all hours from all over the map, and then pass that data to any user with the right credentials. In many cases, unfortunately, this includes exposing sensitive data to employees who don't need it for their jobs, or who shouldn't be permitted to see it at all. Giving so many people access to that much data creates serious potential hazards even with traditional cybersecurity measures in place, as a glance at the past decade of headlines about corporate data breaches makes obvious.

When responsibility for data protection is this fragmented, no single team has real ownership, much less the empowerment, to carry out the task. The result? Data security falls between the cracks.

What’s the Answer for Better Data Protection?

There is an answer to this dilemma: put the responsibility for data protection in the hands of the application owners who create or manage the applications that use the data, and empower them — and the development teams that work with them — accordingly. Such empowerment implies removing organizational roadblocks and using appropriate technology to handle the burdens of data protection. This quickly improves data security and compliance, but it also boosts innovation and competitiveness over the longer term.

ALTR co-founder and CTO James Beecham recently led a discussion of these issues in a Data Protection World Forum webinar, “Data Protection Is Everyone's Job, so It's No One's Job.” He was joined by Jeff Sanchez, a managing director at Protiviti who draws on his nearly thirty years of industry experience as he leads that firm’s global Data Security and Privacy solutions. During the session, these experts explained exactly how organizations can empower application owners and development teams with solutions that enable them to quickly incorporate security and compliance at the same time — and at the code level.

Access the webinar now so you can find out how this approach not only protects the organization from data breaches and compliance failures, but also enables personnel across many functions to improve innovation and competitiveness.


We've partnered with Aptitive, a Chicago-based data and analytics consultancy, to drive innovation in the cloud for our customers. The first order of business was to chat with their CTO, Fred Bliss, about data-driven enterprises. Fred's experience helping enterprise organizations develop custom applications, data pipelines, and analytical data platforms makes him an expert on best practices for organizations that want to become data-driven. In preparation for our upcoming webinar with Aptitive on February 17, we had a few questions for Fred, and his answers are certainly worth sharing.

As you have done a number of analytics and data-driven projects for many businesses, what is a common theme you see within the projects and what does "data-driven" mean to you?

"Data-driven" is both a buzzword and a reality. When you hear that someone wants to become data-driven, you have to look past the technology angle and into the people and process side of the business. An organization could build the greatest, most innovative analytics technology in the world, but if it's driven entirely by IT in a "build it and they will come" approach, it will likely fail due to poor adoption. When I see data-driven done right, it's creating a vision and plan collaboratively with both business and IT leaders, getting a business executive to sponsor and spearhead the effort, and adopting the platform in their day-to-day operations. When a customer of ours started driving every executive meeting using dashboards and data to make decisions, instead of "gut feels", we knew a change had happened in the organization at the top, and it paved the way for the rest of the organization to quickly follow suit.

What are one or two critical aspects of these engagements or projects that make them successful for a business?

1. Sponsorship from a key business leader.

2. Connecting technology and development efforts directly to a business case that carries a big ROI.

The days of spending a year (or more) building the "everything" enterprise data warehouse are over - start small, and start by providing real insights that drive action. Dashboards full of "that's interesting" metrics are just that - interesting - but so what? We need data that goes beyond telling us what happened and instead points business leaders to where they need to focus their initiatives to take appropriate action.

On the flip side, what do you see during a project that makes you think “this isn’t going to work”?

100% IT-driven projects. As much as I love building a beautiful back-end architecture, if it's not solving the pain of a business or delivering a new opportunity that they never had before, it's not going to compel anyone to do anything differently. While some IT-driven projects make sense (for cost reduction, whether in technology cost or the opportunity cost of maintaining a complex system), the real adoption comes when the business is able to do things more quickly than they could before.

In your experience, when a project puts data security first in designing and building the output, what does that alleviate in the short and long term?

It significantly reduces the future risk of managing security in an analytics environment. While basic security is a no-brainer (SSO, authentication, row-level security, etc.), when you have a complex organization, effectively designing a security model that can scale takes time, thought, and long-term planning. An organization's data is its most valuable asset. A security-first approach ensures that as the company grows and scales its data initiatives, security scales with them rather than holding them back from opportunities.

A massive thank you to Fred from the whole ALTR team for taking the time to share some of his knowledge with us and our readers. For more insights from Fred, as well as our own CTO, James Beecham, tune in to our webinar on February 17th: The Hidden ROI - Taking a Security-First Approach with Cloud Data Platforms.

In preparation for a recent webinar, I chatted with both the other presenters to get deeper insight into the webinar topic “A Security-First Approach to Re-Platforming Data in the Cloud” and to give you an idea of what you can expect to learn from our On-Demand webinar.

This post features Omer Singer, the Head of Cyber Security Strategy from Snowflake, another industry expert with years of experience in both cybersecurity and cloud data warehouses.  

As I’m sure you already know, Snowflake enables organizations to easily and securely access, integrate, and analyze data with near-infinite scalability. The rapid adoption of solutions like Snowflake is the main reason we are discussing what it looks like to re-platform data in the cloud. Who better to give us valuable insight and lessons learned than one of their own?  

Here's some of what Omer had to say:

1. What's a common mistake you see companies make when re-platforming data in the cloud/Snowflake?

Companies are used to thinking about network security architectures, and they've been adapting that approach to cloud security. When it comes to data security, there is still a tendency to start with a flat architecture that relies too heavily on authentication as the exclusive security control. In fact, Snowflake has granular RBAC capabilities that can restrict data access to the right people. While Snowflake has automated nearly all the onerous management tasks, access control is one of those things that each org needs to tailor for itself. Because it's on the customer to manage who has access to what datasets, it's also on the customer to monitor for account compromise and abuse, no different from monitoring infrastructure cloud activity. That's something that security teams are becoming more aware of recently, and I'm glad that they have solutions like ALTR that can help them be successful.
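
(As a concrete illustration of the tailored access Omer describes: in Snowflake, granular RBAC comes down to explicit grants like the ones below. The role, object, and user names are hypothetical.)

    -- Grant only the objects this role actually needs - no flat "everything" access.
    CREATE ROLE analyst;
    GRANT USAGE ON DATABASE analytics TO ROLE analyst;
    GRANT USAGE ON SCHEMA analytics.sales TO ROLE analyst;
    GRANT SELECT ON TABLE analytics.sales.orders TO ROLE analyst;
    GRANT ROLE analyst TO USER jdoe;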

2. What are the benefits of a Security Data Lake? Why is it important?

Anyone that's paying attention to the headlines can tell that cybersecurity has yet to achieve its objective of companies operating online with assurance. Instead, there are justified concerns around the risk companies take on by adopting new technologies and becoming more interconnected. The answer is not to avoid progress but to see these powerful new technologies as opportunities for achieving radically better cybersecurity. The cloud, with its bottomless storage and nearly unlimited compute resources, can become an enabler for big progress in areas like threat detection and identity management. A security data lake is really just a concept that says "Infosec is joining the rest of the company on a cloud data platform". At Snowflake, we're starting to see the impact of this movement at our customers and it's very exciting.

3. What do you hope the audience takes away from this webinar?

It's not like security teams have an overabundance of time on their hands but I'm hoping that they use this webinar as an opportunity to revisit their data security strategy. Just like everyone in the audience has learned in the past that using the public cloud doesn't mean not worrying about cloud security, I hope there's a similar aha! moment around the increasingly critical cloud data platform. I also hope that the examples we'll be sharing about combining ALTR insights with other datasets to catch compromised accounts and insider threats will get attendees fired up about doing security analytics themselves.  

End of Q&A.

Between Omer (Snowflake) and Lou (Q2), our webinar was illuminating and thought-provoking for everyone involved. Watch it on demand now.

Data and security professionals alike are feeling the pressure to make data accessible yet secure. The rapid adoption of cloud data platforms like Snowflake enables organizations to make faster business decisions because of how easy it has become to analyze and share large quantities of data.  

But how has this impacted organizations' overall risk exposure and data security strategies?

During our latest webinar, “A Security-First Approach to Re-Platforming Cloud Data”, we explored that question and discussed best practices for mitigating that risk. I was joined by Omer Singer, Snowflake’s Head of Cyber Security Strategy, as well as Lou Senko, Chief Availability Officer at Q2.

First, let’s level set on what we mean by “Security-First.”  Typically, security-related projects are brought in at the eleventh hour with a corresponding eleventh hour budget. This makes the task of adding data security into a product or application stressful and cumbersome. A security-first approach means that security is part of the discussion from day 1, built into the strategy, not an add-on.  

That’s why I am so excited that Lou Senko joined us for this discussion. Lou understands the importance of security-first and he’s experienced the challenges and benefits firsthand. Lou and his team are responsible for the availability, performance, and quality of the services that Q2 delivers, including security and regulatory compliance.

In preparation for the webinar, I had the privilege of getting some time with Lou for a quick Q&A session that gives a sneak peek into what you can expect to learn about during the live event.  

1. Why did Q2 choose Snowflake and what benefits were you looking for/problem were you looking to solve?

Q2 went through the ‘digitization’ of our back-office internal IT in 2013-2014, moving our entire in-house application portfolio over to best-of-breed SaaS solutions. This gave Q2 an opportunity to re-imagine our internal IT staffing – shifting away from looking after machines toward more Business Analysts, Reporting Analysts, and experts in these SaaS applications – working with the business to drive more value from the insights these applications offer. At first, we were building our own data warehouse, but we ended up stepping back from that.

With Snowflake we get all of these new capabilities without adding complexity from an infrastructure perspective. We had run into the typical issues with large data – finding ways to keep it performant, serving both ETL and OLTP usage models, and it was a big lift to manage and secure it. Snowflake’s simplicity has allowed us to focus resources on additional projects and accelerate our ability to innovate.

2. For others in the same position, what are the biggest risks to look out for when re-platforming your data in the cloud?  

Security. We pulled all the data out of these highly secure, highly trusted SaaS applications and then plunked it all into a single database. Before, a bad actor would need to figure out how to hack NetSuite, Salesforce, Workday, etc. Now they just have to hack this one database to take it all. That makes the data warehouse a very rich target.

3. What do you hope the audience takes away from this webinar?  

First, I hope they can learn from our experiences so they don’t waste time unnecessarily. I want them to better understand the risks too. The business needs the insights from the data – so you must deliver – but pulling it out of your vendors’ applications removes all that security.

Overall, I’m excited to show them how combining Snowflake with ALTR can enable them to optimize their benefits and minimize their risk.  

End of Q&A

As you can see, you’re in for a great discussion if you check out the on-demand webinar. You will learn best practices from Lou around scaling and making data available to all consumers who need access without putting burdens on your operations teams.

Alongside Lou, we were also joined by Omer Singer of Snowflake, who shared how modern businesses are enhancing threat detection capabilities with best-of-breed tools like Snowflake to cut the cost and time of security incident and event management. Check out the Q&A we had with Omer here.

In preparation for our upcoming webinar, Simplifying Data Governance Through Automation, I chatted with OneTrust’s Sam Gillespie to get a preview of his thoughts and insights. Sam is an Offering Manager at OneTrust with years of experience supporting and guiding clients through implementation.  

As more companies than ever are using data to make better business decisions, they’re also contending with growing privacy, governance, and risk obligations. OneTrust unifies data governance under one platform, streamlining business processes and ensuring privacy and security practices are built in. The integration between OneTrust and ALTR further simplifies this by automating the enforcement of governance policy.  

Let’s hear what Sam thinks about how data use and data regulations are creating challenges for companies:  

1. Can you talk a little bit about how increasing privacy regulations are making it more challenging than ever to utilize data? How is this affecting teams from privacy and legal to data and security?

I think we all expected a GDPR “domino effect.” However, the scale and pace at which new privacy laws are appearing has taken many by surprise. In practically every corner of the globe, new or proposed privacy laws require organizations to better understand, process, and protect personal data. Although many of them share principles and foundations, no two laws are the same. Even within the U.S., each law differs in its scope, requirements, and definitions of personal data. Yet data is becoming ever more central to the growth and future of most organizations, so privacy and security teams must meet this ever-growing, complex regulatory landscape without hindering business use of data where it can be avoided. This is a huge and complex task, and it’s a core reason why most organizations, big and small, are turning to technology for help.

2. What is the problem with enforcing governance policy today?

Most companies I speak to sit too far toward one end of the scale or the other. Some lock down their data to protect it but then force their people through an often complex process to get access – if teams are even aware the data exists – causing slowdowns in day-to-day operations and business use of the data. Others leave access to all data wide open all the time, risking violations of policies and, in extreme cases, the law. Governing your data is therefore a tricky balancing act of business enablement and compliance. It can be done, though, and the benefits are felt throughout the organization. But you need the right tools and processes in place to facilitate this.

3.  Are there ways to make enforcement of data governance policy more effective?

This is where technology is going to play a vital role. The sheer amount of data that most organizations have, coupled with the ever-increasing ways in which it can be used, means that governing its use manually is impractical. However, technology is not going to solve all your problems. You also need to make sure that the right tool is embedded within your current processes and operations for it to be used effectively. Also, it has to be scalable and easy to implement—it’s only going to effectively do its job if it works and is used!  

4. What do you hope the audience takes away from this webinar?

I hope the audience takes away that help is out there! Tools like ALTR and OneTrust can really help organizations meet their privacy and security obligations in a way that actually works! Great technologies that complement each other and ultimately help customers solve a business challenge: what’s not to love?!

Thanks to Sam for sharing his thoughts. In the webinar, you can hear more and see how the OneTrust + ALTR integration makes automated data governance easy. Watch on demand now.  

What is tokenization of data? 

Not long ago, a handful of “tokens” was as good as gold at the local arcade. It meant a chance to master Skee-Ball or prove yourself a pinball wizard by getting your initials on the leaderboard. But what’s “tokenization of data”? It’s kind of the same thing, except instead of exchanging money for tokens, you exchange sensitive data. It’s a data security alternative to encryption or anonymization. By substituting a “token” with no intrinsic value for sensitive data that does have value – such as social security numbers or birth dates, usually quite a lot of it – companies can keep the original data safe in a secure vault while moving tokenized data throughout the business.
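
As a minimal sketch of the vault idea (all table and column names here are hypothetical), the mapping between tokens and real values can live in a single locked-down table, while every other table carries only the token:

    -- The vault: the only place real values live. Access to this schema
    -- is restricted to the tokenization service itself.
    CREATE TABLE vault.token_map (
        token VARCHAR PRIMARY KEY,  -- random surrogate with no intrinsic value
        ssn   VARCHAR NOT NULL      -- the original sensitive value
    );

    -- Everywhere else, only the token travels.
    CREATE TABLE app.customers (
        customer_id INTEGER,
        name        VARCHAR,
        ssn_token   VARCHAR         -- points back to vault.token_map.token
    );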

And today, one of the places just about every business is moving data is the cloud. Companies may be using cloud storage to replace legacy onsite hardware or to consolidate data from across the business to enable BI and analysis without affecting the performance of operational systems. To get the most out of this analysis, companies often need to include sensitive data.

Tokenization of data is ideal for sensitive data security in the cloud data warehouse environment for at least 3 reasons:  

#1 Tokens have no mathematical relationship to the original data, which means unlike encrypted data, they can’t be broken or returned to their original form.

While many of us might think encryption is one of the strongest ways to protect stored data, it has a few weaknesses, including this big one: the encrypted information is simply a version of the original plain text data, scrambled by math. If a hacker gets their hands on a set of encrypted data and the key, they essentially have the source data. That means breaches of sensitive PII, even encrypted PII, require reporting under state data privacy laws. Tokenization, on the other hand, replaces the plain text data with a completely unrelated “token” that has no value if breached. Unlike encryption, there is no mathematical formula or “key” to unlock the data – the real data remains secure in a token vault.

#2 Tokens can be made to match the relationships and distinctness of the original data so that meta-analysis can be performed on tokenized data.

When one of the main goals of moving data to the cloud is to make it available for analytics, tokenizing the data delivers a distinct advantage: actions such as counts of new users, lookups of users in specific locations, and joins of data for the same user from multiple systems can be done on the secure, tokenized data. Analysts can gain insight and find high-level trends without requiring access to the plain text sensitive data. Standard encrypted data, on the other hand, must be decrypted to operate on, and once the data is decrypted there’s no guarantee it will be deleted and not be forgotten, unsecured, in the user’s download folder. As companies seek to comply with data privacy regulations, demonstrating to auditors that access to raw PII is as limited as possible is also a huge bonus. Tokenization allows you to feed tokenized data directly from Snowflake into whatever application needs it, without requiring data to be unencrypted and potentially inadvertently exposed to privileged users.
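
For instance, if tokenization is deterministic (the same value always yields the same token), counts and joins run directly on tokens. A sketch using the hypothetical tables above, plus an equally hypothetical app.orders table:

    -- Join two systems on the token itself; no plain text SSN is exposed.
    SELECT c.ssn_token, COUNT(o.order_id) AS orders
    FROM app.customers c
    JOIN app.orders o ON o.ssn_token = c.ssn_token
    GROUP BY c.ssn_token;

    -- Distinctness is preserved too: one person, one token.
    SELECT COUNT(DISTINCT ssn_token) AS unique_customers
    FROM app.customers;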

#3 Tokens maintain a connection to the original data, so analysis can be drilled down to the individual as needed.

Anonymized data is a security alternative that removes the personally identifiable information by grouping data into ranges. It can keep sensitive data safe while still allowing for high-level analysis. For example, you may group customers by age range or general location, removing the specific birth date or address. Analysts can derive some insights from this, but if they wish to change the cut or focus in, for example looking at users aged 20 to 25 versus 20 to 30, there’s no ability to do so. Anonymized data is limited by the original parameters which might not provide enough granularity or flexibility. And once the data has been analyzed, if a user wants to send a marketing offer to the group of customers, they can’t, because there’s no relationship to the original, individual PII.
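
Tokenization keeps that link. Continuing the hypothetical schema above, re-identifying a segment for that marketing offer is a controlled join back through the vault, run only under a role permitted to see plain text:

    -- Runs under a tightly restricted role; everyone else sees only tokens.
    SELECT v.ssn, c.name
    FROM app.customers c
    JOIN vault.token_map v ON v.token = c.ssn_token
    WHERE c.age BETWEEN 20 AND 25;  -- hypothetical attribute stored alongside the token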

Tokenization of data essentially provides the best of both worlds: the strong at-rest protection of encryption and the analysis opportunity provided by anonymization. It delivers tough security for sensitive data while allowing flexibility to utilize the data down to the individual. Tokenization allows companies to unlock the value of sensitive data in the cloud.  

Get a tokenization of data demo

Snowflake makes moving your data to the cloud quick and easy. But hosting sensitive data in Snowflake can add a layer of complexity, with governance and security concerns that can slow your project down or even grind it to a halt.

Ask yourself these questions to find out if ALTR can help control and secure your sensitive data in Snowflake, moving your data projects along and getting to Snowflake value more quickly:  


1. Do you know which Snowflake data is sensitive or regulated?  

When adding new data or new databases, it can be difficult to really know what you're bringing in. If you have hundreds or thousands of databases or millions of rows and columns of data, how can you know which have sensitive data? Column names? Not so fast – some databases use indecipherable codes as column names and there’s no guarantee the column name matches the data in the column. Manual review? Sure, if you don’t want to do anything else for the rest of your life. But if you don’t know which data is sensitive or regulated, how can you safeguard it? 
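
Pattern-based scanning can give you a head start, but even that is work. A crude first pass in Snowflake SQL might test a cryptically named column for SSN-shaped values (the table and column names are hypothetical, and real classification tooling goes much further):

    -- Does COL_17 actually hold Social Security numbers?
    SELECT COUNT(*) AS ssn_like_rows
    FROM analytics.raw.customer_extract
    WHERE col_17 REGEXP '\\d{3}-\\d{2}-\\d{4}';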


2. Is your Snowflake data classified into types like Social Security numbers and email addresses?

Once you find all that sensitive data (wherever it is), the next step is to get it categorized and tagged so that you can automatically apply the appropriate privacy and access policies. Maybe only HR should have access to employee Social Security numbers, or maybe only marketing actually needs customer dates of birth. Are you going to manually grant or deny access to those columns or rows for each user? What a headache. If you’re doing this, keep reading.


3. Can you easily see who has accessed your most sensitive Snowflake data in the last 2 weeks? When and how much?

After you have that data classified and tagged, you want to be able to see who accesses it – when and how much. If you don’t have this in an easy-to-understand dashboard, it makes comprehending normal data usage very difficult and identifying anomalies and outliers just about impossible. Not to mention, complying with audit requests for regulated data access becomes another onerous, time-consuming task.  
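
Without a dashboard, answering this means assembling it by hand. In Snowflake, for example, it takes a query like this against the ACCESS_HISTORY view (available on Enterprise Edition; the table name below is hypothetical):

    -- Who read the PII table in the last 14 days, and when?
    SELECT ah.user_name,
           ah.query_start_time,
           obj.value:"objectName"::STRING AS object_name
    FROM snowflake.account_usage.access_history ah,
         LATERAL FLATTEN(input => ah.base_objects_accessed) obj
    WHERE obj.value:"objectName"::STRING = 'ANALYTICS.PII.CUSTOMERS'
      AND ah.query_start_time >= DATEADD('day', -14, CURRENT_TIMESTAMP());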


4. Can you create column and row-level access controls on sensitive Snowflake data without using SnowSQL or other code?  

Developers do love coding, but is writing (and worse, updating!) access controls the best use of your time in Snowflake? What do you do when new data is added? Or new users? What if you want to hand off maintenance to someone on the governance team or a line-of-business data owner because they set the policy? Do they have to learn to code SQL? Unlikely.
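
For reference, this is roughly the SnowSQL that has to be written, attached, and maintained by hand for a single row-level rule (names are illustrative):

    -- One policy, one table, one column - multiply by every rule you need.
    CREATE OR REPLACE ROW ACCESS POLICY hr_rows_only
        AS (region VARCHAR) RETURNS BOOLEAN ->
        CURRENT_ROLE() = 'HR_ADMIN' OR region = 'PUBLIC';

    ALTER TABLE employees ADD ROW ACCESS POLICY hr_rows_only ON (region);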


5. Can you grant access to new users or new data in less than 10 minutes in Snowflake?  

As more and more companies strive to become “data-driven” and more and more people across the company understand the value data analytics can provide, requests for data access can explode. While in the past it may have been one or two requests a week, it can easily grow to hundreds or thousands in the largest companies. Granting data access to new users can quickly become a full time job. Is that what you signed up for?  

6. Can you apply sensitive data masking in Snowflake with just a few clicks?

Snowflake provides powerful native capabilities for masking data, but the catch is that you need to know SQL to use them. This means coders whose time would be better spent on other, higher-value activities end up writing and rewriting masking policies. Wouldn’t it be nice to have an interface over those native features that you can activate and update in just a few clicks? (Hint: ALTR has this.)
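
For comparison, here is what one hand-written native masking policy looks like (names are illustrative, and every change means editing and re-applying SQL like this):

    -- Show full SSNs to HR only; everyone else sees the last four digits.
    CREATE OR REPLACE MASKING POLICY mask_ssn AS (val STRING) RETURNS STRING ->
        CASE WHEN CURRENT_ROLE() IN ('HR_ADMIN') THEN val
             ELSE 'XXX-XX-' || RIGHT(val, 4)
        END;

    ALTER TABLE employees MODIFY COLUMN ssn SET MASKING POLICY mask_ssn;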


7. Can you limit the amount of sensitive Snowflake data someone can download per minute, per hour or per day?

Privileged users and credentialed access present a significant risk to PII, PHI and PCI. Even if you trust the user, credentials can be lost, stolen or hacked. And even trusted employees can become disgruntled or sense an opportunity. It’s better to assume that credentials are always compromised and put a system in place that ensures even the most privileged user can’t do too much damage.    
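
An inline control can enforce this in real time; for after-the-fact monitoring, a sketch like the following over Snowflake’s query history works (the threshold is illustrative, and ACCOUNT_USAGE views lag behind real time, so this detects rather than prevents):

    -- Which users pulled an unusually large number of rows in the last day?
    SELECT user_name, SUM(rows_produced) AS rows_last_day
    FROM snowflake.account_usage.query_history
    WHERE start_time >= DATEADD('day', -1, CURRENT_TIMESTAMP())
    GROUP BY user_name
    HAVING SUM(rows_produced) > 1000000;  -- tune to your own baseline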

8. Can you see and control the data Tableau users are accessing?

Snowflake and Tableau make a powerful combination for data analytics. Instead of granting users access to Snowflake directly, many organizations choose to offer data through a business intelligence tool like Tableau. That means setting up a Tableau user account for each person, but to save time and maintenance, teams sometimes use a single Tableau service account to access Snowflake. From a data governance perspective, that means admins can’t see who is accessing what data without some additional help.

9. Can you prove that data governance policies are working correctly?  

When your data governance is part of a compliance program focused on meeting regulatory requirements, it’s not enough to set up the policies. You have to prove that they’re working correctly and no one who shouldn’t have access did access the data. Can you prove this, without spending hours or weeks sifting through data access logs? 


10. Do you spend less than 2 hours per week monitoring data access, revising access policies and adjusting data controls?  

Why should you have to pull all this info together manually by scanning through text access logs or updating SQL access policies line by line? These activities are critical to governing and securing your data in Snowflake, but they’re really just administrative tasks. Is this why you became a DBA or data engineer? Probably not. What you really want to do is pull in new data streams, offer up new analytics opportunities, and find ways to extract more value from your data. Why not do that instead?

Bonus question: Can you set up all of this in less than 60 minutes, for free?  

If you answered “No” to even one of these questions, ALTR’s Free plan can help you today. Unlike legacy data governance solutions that take 6 months to implement and cost at least 6 figures to start, we built our solution to simplify data control and security for busy data architects, engineers and DBAs so you can focus on getting value from data without disruption. In just 10 minutes you can control 10 columns of sensitive data in Snowflake and get visibility into all your data usage with role-based heat maps and analytics dashboards!

No code, no complexity and no credit card required. Get started now...  

The heightened regulatory environment kicked off by GDPR and CCPA in 2018, the unrelenting PII data breaches, and the acceleration of data centralization across enterprises into the cloud have created enormous new risks for companies. The importance of “data governance” has skyrocketed as a result, transforming what was once a wonky business process into a hot topic and an even hotter technology category. But some software companies are self-servingly narrowing the definition in a way that leaves customers’ sensitive data exposed.

Data Governance is more than discovery and classification

There are a few different definitions of the concept floating around. The Data Governance Institute defines it as “a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.” The Data Management Association (DAMA) International defines it as “planning, oversight, and control over management of data and the use of data and data-related sources.”

Both are pretty broad and very focused on process. Technologies can help at various stages, though, and we’re seeing software companies stake their claims as “data governance” solutions. Unfortunately, too many are narrowly focused on the initial steps of the data governance process—data discovery and classification—and act as if that’s enough. It’s not even close to being enough.

A data card catalog tells you about the data but doesn’t govern it  

Data discovery and classification are critical first steps of the data governance process. You must know where sensitive data is and what it is before you can be prepared to govern or secure it. Creating a “card catalog” for data that puts all the metadata about that data at a user’s fingertips, like many solutions do, is extremely useful. But if you’ve ever used a card catalog you know that the card tells you about the book, but it’s not the book itself. For valuable books, you may have to take the card to the librarian to retrieve the book for you. Or if the book is part of a rare books collection, it may even be locked away in a vault. The card catalog itself is a read-only reference that does nothing to make sure the most valuable books are protected.

It doesn’t stop the librarian from accessing sensitive data themselves or using their credentials to get into the locked rare books room. And if the librarian loses their credentials or has them stolen, there’s no way to stop a thief from taking off with irreplaceable texts.

It’s similar with data governance tools. Knowing where the data is and providing information around it are necessary pre-conditions, but they’re not governance.  

You’ve classified your data; now shouldn’t you control and protect it?

Software solutions that say they provide “data governance” but don’t go further than data discovery and classification leave their customers asking, “What next?” It’s a little like this video. Shouldn’t a vendor help you with a full solution instead of just identifying the problem? Recently some have taken one next step into access controls and masking, but the way they’ve implemented this may cause additional pain for users. If the solution requires data to be cloned and copied into a proprietary database in order to be protected, it leaves the original data exposed. Or if users have to write SQL code access controls, it puts a burden on DBAs. These are not full solutions.

For complete data governance, policy enforcement needs to be automated, require no code to implement, and focus on the original data (not a copy) to ensure only the people who should have access do. Companies then need visibility into who is consuming the data, when, and how much, in a visualization tool that makes it easy to see both baseline activity and out-of-the-norm spikes. Finally, in order to truly protect data, companies need a solution that takes the next crucial step into real data security that limits the potential damage of credentialed access threats. This means consumption limits and thresholds where abnormal utilization triggers an alert and access can be halted in real-time, and the power to limit data theft by tokenizing the most critical and valuable data.

Complete data governance includes control and security

All of these steps—data intelligence, discovery, classification, access and consumption control, tokenization—are necessary to proper data governance. To faithfully live up to the responsibility created by collecting and storing sensitive data, companies need a solution like ALTR’s complete Data Governance to keep sensitive data safe.

In the mad rush to become the next winner of the “data-driven” race, many companies have focused on enabling data access across the business. More and more data are used, viewed and consumed throughout the company by more and more groups and more and more individuals. But sometimes a critical piece of information is missing: who’s consuming what data, where, when, why, and how much. Without this data observability, companies are flying blind.

When you don’t know what “normal” is, everything is “abnormal”. And how can you effectively run a business that way?

Being able to observe data consumption and truly comprehending how data is being utilized brings tremendous value to an enterprise. When you have the full context around how your sensitive PII, PHI and PCI data is consumed by whom, in what roles, at what time of day, on what day of week, in what patterns across time, you can start to understand what “normal” looks like. When you have this level of operational data visibility, you can compare normal to abnormal and begin to drive valuable insights. You can create custom views and reports that show how the data is being consumed for groups across the enterprise. And you can start to make better decisions for your business.
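
In Snowflake, a first cut at “normal” can be sketched straight from query history, for example average consumption by role and hour of day over the past month (a rough sketch; purpose-built observability goes much further):

    -- Baseline: average rows returned per role, per hour of day, last 30 days.
    SELECT role_name,
           HOUR(start_time) AS hour_of_day,
           AVG(rows_produced) AS avg_rows
    FROM snowflake.account_usage.query_history
    WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
    GROUP BY role_name, HOUR(start_time)
    ORDER BY role_name, hour_of_day;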


Get data insights and take action

For “risk” executives – in other words, people who look at risk to the business at a high level and care about what happens to the brand – you can create visualizations that show how often data that is covered by regulatory or privacy requirements is accessed and by whom. For execs who have to sign privacy and compliance attestations you can tailor visualizations that give them clear information they need in order to feel confident signing their name.  

Compliance and privacy teams

These groups may discover anomalies like DBAs looking at Social Security numbers or developers accessing customer PII. They can then set up more stringent governance policies to correct that access. They can also create purpose-based access controls based on normal usage. For example, they can set controls that allow access within those normal constraints, while any access outside them is flagged and an alert is sent in real time to the security team.

Data management teams

These teams may want to look at the value the company is getting out of its data consumption. They will want to ensure that people who legitimately need access to data get as much as they legitimately need without any friction. Comprehensive observability into how the company is using data, what roles or lines of business are using what data, and how they’re using it can help data teams optimize data consumption to deliver the most value.  

If data sharing and monetization are part of your business, observability over data consumption is key. You can start to see which data and which data sets are valuable to which customer roles, segregated by company size, location, industry and more. For example, one of the companies I’m working with has seen that midsize companies are interested in a specific data set, but enterprise size companies look at a completely different data set. Knowing this could lead to data product optimizations like carving out the enterprise class data and potentially charging a higher price.

Easy data observability with ALTR + Snowflake


ALTR provides several features with Snowflake that make data observability easy. You can classify which columns contain sensitive data and focus observation only on those, so you’re not drowning in an unnecessary flood of “abnormal” alerts. You can step back and look at a data usage heatmap that gives you an overview of consumption by individual so you can see normal usage over time and easily spot and dig into any outliers. And our Tableau user governance capability allows you to collect individual consumption information on end users even when they’re using a shared service account.  

I believe observability is one of the most valuable things we do — everything else we effect, evoke or instrument is driven by knowing who is consuming what data. It’s a powerful foundation for our complete Data Governance solution and can be a powerful tool to drive insights and action for your company.

If you’d like to see our data usage analytics in action, request a demo!

What is SQL, and why is it important?

SQL: Structured Query Language.  

In this blog, we’re going to explain why SQL is so important without getting too technical. It’s the query language for relational databases from Oracle, Microsoft, IBM, Snowflake, and others, the databases that primarily store and process sensitive information like personally identifiable information (PII), protected health information (PHI), and PCI payment card data.

Let’s take Snowflake (which, incidentally, has its own version of SQL) as an example. It offers one of the best examples of a secure data environment: SSO, 2FA, RBAC, secure views, you name it. That makes it difficult to misuse data access... but not impossible. All of those security features are entirely dependent upon, and trusting of, the identity of the user. If someone can present the correct sequence of bytes over the internet to Snowflake, they can pretend to be someone else.

In a world where you can hardly trust your food to be delivered with integrity, how can you trust a solution that depends entirely upon the validity of the user? If someone will steal your lunch, then someone with access to your data will certainly steal, or get targeted by criminals, for a number of reasons far more lucrative than your $3 taco.

What do we do about it?

Extend the idea of Zero Trust into the SQL layer, of course! For an in-depth look at Zero Trust, you can check out our webinar with Forrester Analyst Heidi Shey (Forrester coined the term Zero Trust, for reference). But for our purposes here, let’s just define it as “never trust, always verify”. In other words, each time someone wants access to data, confirm they should be able to get it. Think of it like an ATM - you walk up and verify your identity in order to withdraw cash. Even after your identity is verified, you can still only take out a certain amount. If you go across town to a different ATM, you’ve got to verify your identity again to request money. And once you’ve reached your daily withdrawal limit, you’re done. It doesn’t matter that it’s actually you asking for the money; the bank assumes it isn’t.

Verifying a user’s identity means a lot of things other than just 2FA, SSO, and RBAC. If someone’s credentials get stolen, you’d think it’s virtually impossible to know, right? Nope.  

Every time a user wants to access data – regardless of their identity, role or title – you should check their previous use of data (per minute, hour, day, week, etc.) and other factors like what device or application they’re coming from. If it doesn’t match up with typical user patterns, then you know there’s a problem.  

Regardless of what information the user is trying to request, if the rate of data consumption breaks a limit that the business deems appropriate, then the user (whoever they are) should not get the data. Period. Even if their title is CEO, they should not be able to access all the PII in the table in one query. Why? Because more access = more risk.  
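
A minimal sketch of that check in Snowflake SQL, comparing each user’s last hour against their own 30-day hourly average (the 10x multiplier is illustrative; a real risk engine evaluates this inline, per query, rather than after the fact):

    WITH hourly AS (
        SELECT user_name,
               DATE_TRUNC('hour', start_time) AS hr,
               SUM(rows_produced) AS rows_in_hour
        FROM snowflake.account_usage.query_history
        WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
        GROUP BY user_name, DATE_TRUNC('hour', start_time)
    ),
    baseline AS (
        SELECT user_name, AVG(rows_in_hour) AS avg_hourly_rows
        FROM hourly
        GROUP BY user_name
    ),
    last_hour AS (
        SELECT user_name, SUM(rows_produced) AS rows_now
        FROM snowflake.account_usage.query_history
        WHERE start_time >= DATEADD('hour', -1, CURRENT_TIMESTAMP())
        GROUP BY user_name
    )
    -- Flag anyone far above their own normal, regardless of role or title.
    SELECT l.user_name, l.rows_now, b.avg_hourly_rows
    FROM last_hour l
    JOIN baseline b ON b.user_name = l.user_name
    WHERE l.rows_now > 10 * b.avg_hourly_rows;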

Having the ability to not only know when too much data is being queried but also being able to stop it in real time is game-changing and will effectively subdue the threat of credentialed access breaches.

Why is Snowflake a great example case?

Snowflake Cloud Data Warehouse has shown the world that separating the compute of data from the storage of data is the best path forward for scaling data workloads. Snowflake also has an extensive security policy that should make any CISO/CIO comfortable moving data there.

So why would Snowflake need Zero Trust at the SQL layer given the statement above? It comes down to the shared customer responsibility model that comes with using IaaS, PaaS, or, like Snowflake, SaaS. As you can see below, with any SaaS provider the customer still has two very important problems left to solve: identity and data access.

[Figure: Gartner Identity + Data shared responsibility matrix]

Okta, for example, does a great job of solving for the identity portion of the matrix. And Snowflake has done everything possible to help with the data consumption side (I would say that using Snowflake is the safest way today to store and access data), but there is still this last remaining “what if” out there: what if someone steals my credentials? Or decides they want to do something malicious? The insider threat to data is very difficult for any organization to handle on their own.

What does Snowflake + Zero Trust SQL look like?

It starts by enhancing your Snowflake account with a solution that can detect and respond to abnormal data consumption in real-time. This will give your organization complete control over each user’s data consumption, regardless of access point (due to cloud integration).

This means that every time an authorized user requests data within Snowflake, they get evaluated and verified by a Zero Trust risk engine (like the ATM example). If abnormal consumption is detected based on the policies of the governing risk engine, then you can cut off access in real-time. The best part is that because it is integrated into Snowflake, users don’t see any changes to their day-to-day, your Snowflake bill won’t increase, and your security team can finally stop credentialed breaches and SQL injection attacks for good.

The 2023 Verizon Data Breach Investigations Report is out and once again, people are a key security weakness. "74% of all breaches include the human element through Error, Privilege Misuse, Use of stolen credentials or Social Engineering," according to the report.

So, why haven’t we (“we” meaning all of us who care about protecting data) stopped the credentialed access threat? Do we not realize it’s happening (despite Verizon’s annual reminder), do we not see it as a priority, or is there another cause? And, if we’re not going to stop credential theft, how can we make sure data is secure despite the danger?

It is Possible to Stop Credential Theft, but People Continue to be the Problem

Usernames and passwords have been with us for as long as there have been logins to digital accounts. It’s actually not a bad way to secure access. The problem is mostly people.

There are 3 easy ways we can stop or slow down credential theft:  

  1. Use better passwords
  2. Stop falling for phishing emails  
  3. Use Two Factor Authentication (2FA)

You’ll notice that these all require users to make an extra effort. Some of the best passwords are essentially a long, completely random string of characters – that also turn out to be almost impossible for a normal person to remember. Unfortunately, easy-to-remember passwords are also easy to hack. And if people do use a strong password, they reuse it, diminishing its effectiveness. A Google survey found 65% of people reuse the same password across multiple sites. The same Google survey found that only 24% of people use a password manager.  

Almost the opposite is true for phishing emails. It may not be that people aren’t careful, but that they respond too quickly and do as they’re asked. Phishing emails are now more customized than ever, utilizing info shared on social networks or purporting to come from high-level company execs. When an employee receives an urgent email demanding they log into a seemingly familiar tool with their company username and password to carry out a CEO request, too many people just do it to stay in the CEO’s good graces.

Multi-factor authentication (MFA) has its own struggles. At the RSA 2020 security conference, Microsoft said that more than 99.9 percent of Microsoft enterprise accounts invaded by hackers didn’t use MFA, and that only 11% of enterprise accounts had MFA enabled. There are several reasons a company might not utilize MFA: it’s not seen as a priority, quality and security vary between solutions, and employees push back. Sixty-three percent of respondents in a security survey said they experience resistance from employees who don’t want to download and use a mobile app for MFA. Fifty-nine percent are implementing 2FA/MFA in stages due to employees’ reluctance to change their behavior.

Assume Credentials are Compromised, and Your Data is at Risk

Knowing that the easiest and best ways to stop credentialed access threats are undermined by people being people, we’re simply better off assuming all credentials are compromised. Stolen credentials are most dangerous when an account that gets through the front door has access to the entire house, including the kitchen sink. Instead of treating the network as having one front door with one lock, we need to require authorization to enter each room of the house. This is Forrester’s “Zero Trust” security model – no single login, identity, or device is trusted enough to be given unlimited access.

This is especially important as more data moves outside the traditional corporate security perimeter and into the cloud, where anyone with the right username and password can log in. While cloud vendors do deliver enterprise class security against cyber threats, credentialed access is their biggest weakness. It’s nearly impossible for a SaaS-hosted database to know if an authorized user should really have access or not. Identity access and data management are up to the companies utilizing the cloud platform.  

Omer Singer, Head of Cyber Security Strategy at Snowflake, explains why it’s important to take a shared responsibility approach to protecting your data in the cloud.

That means companies need a tool which doesn’t just try (and often fail) to stop threats at the door. You need a cloud-based data access control solution like ALTR that never believes a user is who their credentials say they are. Every time an apparently “authorized” user requests data in the platform, the request is evaluated and verified against the data governance policies in place. If abnormal consumption is detected, then access can be cut off in real-time. Even seemingly authorized users aren’t allowed to take whatever they want.

A Solvable Problem

The U.S. has a long history of solving big problems – we came together during WWII to ramp up wartime production of military supplies and equipment, and more recently, we helped fund the miraculous creation of the COVID-19 vaccine in less than a year. It’s a little baffling that we continue to allow the credentialed access threat to harm our industries and damage data security. It’s a solvable problem that I hope we start taking seriously. And, I hope we see a different threat at the top of next year’s Verizon report.  

To learn more about how ALTR solves this problem, watch our webinar with Snowflake and Q2: A Security-First Approach to Re-Platforming Data in the Cloud

Pete Martin is an ALTR co-founder and Director of Product Marketing, and James Beecham is a co-founder and ALTR’s Chief Technology Officer. Since establishing ALTR with their other partners, the two have been immersed in the world of data, data governance, and data security. After listening to their passionate exchanges about the shifting industry, the exploding ecosystem, and where they see it all going, we decided to invite an audience to join their conversations every Friday as part of our new LinkedIn Live program: The Data Planet.

In our premiere episode last Friday, the guys covered topics including:  

  • How technology advances and cultural changes have brought us into this new Age of Data
  • Why data is now both a liability and an asset
  • How the shift to a data-driven world and the sudden, urgent need to protect data is like being on the Amazing Race
  • Whether protecting data on-prem and in the cloud is actually one question or two. And if we’re all planning to attend the “cloud party”
  • Finally, what they’re looking forward to hearing about at this week’s Snowflake Summit (hint: product announcements around secure data sharing and the NBC-Universal use case)  

In future episodes the guys will also host industry partners and leaders as guests. Take a look, and we hope you can join us weekly to discuss the rapidly shifting data landscape. Attend our next episode here.  
