ALTR Blog

The latest trends and best practices related to data governance, protection, and privacy.
BLOG SPOTLIGHT

Data Security for Generative AI: Where Do We Even Begin?

Navigating the chaos of data security in the age of GenAI—let’s break down what needs to happen next.


If we learned anything at Snowflake Summit (and we did – a lot!) it’s that the data governance space is as confusing as it is frenzied. Nearly every company is at some stage of moving to capitalize on cloud data analytics, while the regulatory environment around data continues to increase the urgency for privacy and security. Every single data governance, control, and security session we saw was completely packed, indicating that many companies are now ready to focus on protecting sensitive data. Yet some of the options in the market are misnamed, confusing, and frustrating for buyers. There are many focused providers, and even adjacent software markets like data catalogs are starting to offer basic governance features. Also, Snowflake itself continues to roll out very powerful, albeit very manual-to-implement, governance features.

We’re hoping to not only clear up some of the FUD around data governance solutions, but also set the bar for how easy, functional and cost-effective data governance can and should be.  

A new paradigm for controlling data access and security at scale

Last week we announced our new policy automation engine, which combines governance features like access control and dynamic data masking with security controls like data usage limiting and tokenization. It leverages metadata like Snowflake Object Tagging and is implemented and managed without code. With this new data governance solution, we’ve maintained our commitment to cloud-native delivery that supports best-in-category time to value and zero cost of ownership beyond our very reasonable per-user, per-month subscription.

For ALTR, this is the realization of our vision: data privacy and security driven by the people who can best accomplish it – the people who know the data – with disparate tools assembled into a single engine and a single point of view across the enterprise.  

Built on a flexible, secure cloud foundation that leverages and automates Snowflake’s own features

ALTR is a true cloud-native SaaS offering that can be added to Snowflake using Partner Connect or a Snowflake Native App in just minutes. It integrates seamlessly with Snowflake without the need to install and maintain a proxy or other agent. Our microservices infrastructure takes full advantage of the scalability and resilience of the cloud, offering extremely high availability with multi-region support by default, because your data governance solution simply cannot go down.

Importantly, our service is built the same way as Snowflake itself and leverages Snowflake’s native features whenever possible. Those powerful features all require writing SnowSQL to implement; we automate them so that you don’t have to scale that work yourself and can go completely no-code.  
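
To make that concrete, here is a minimal sketch (not ALTR-generated code) of the kind of manual SnowSQL this automation replaces – a masking policy attached to a Snowflake object tag, with all database, schema, role, and table names purely illustrative:

    -- Create a tag and a masking policy, then bind them so any tagged column is masked.
    CREATE TAG IF NOT EXISTS governance.tags.pii;

    CREATE MASKING POLICY IF NOT EXISTS governance.policies.mask_pii
      AS (val STRING) RETURNS STRING ->
      CASE
        WHEN CURRENT_ROLE() IN ('DATA_STEWARD') THEN val  -- placeholder role
        ELSE '*** MASKED ***'
      END;

    -- Tag-based masking: every column carrying the tag inherits the policy.
    ALTER TAG governance.tags.pii SET MASKING POLICY governance.policies.mask_pii;

    -- Tag a column; repeat (and maintain) for every sensitive column in every table.
    ALTER TABLE analytics.customers MODIFY COLUMN email SET TAG governance.tags.pii = 'email';

Multiply that last statement by every sensitive column, schema change, and new data source, and the value of automating it becomes clear.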

We have also been SOC 2 Type 2 and PCI DSS Level 1 certified for years, and we maintain a highly disciplined security culture in our technical teams. Across various data sources we offer multiple integration types, including Cloud-to-Cloud Integration, Smart Database Drivers, and Proxy solutions. They all connect to and use the same ALTR cloud service.  

A data governance solution to both scale and safeguard

All of this comes together in an easy-to-use solution that delivers combined data governance and security for thousands of users across on-premises and cloud data storage. Automating access policy can free up months of person-hours per year otherwise spent writing and maintaining policy, in the same way that ETL/ELT providers automating data pipelines saved data teams huge amounts of time in provisioning data.  

In addition, unlike all the other providers in the data governance space, the ALTR solution moves beyond traditional data access policy tools like RBAC and dynamic masking into data security functionality like data usage limits and tokenization. We feel that your policy around data should contemplate your credentialed users and also extend to use cases where credentials might have been compromised or privileged access is an issue. For us, data policy is about both control and protection, and those policies should be developed and enforced with both in mind.

The future is bright – for data governance solution buyers

We’ll continue to extend our solution by deepening our policy engine’s capabilities with new policy types, expanding our support for a greater variety of data sources and data integrations, and building out more seamless integrations with like-minded players in the data ecosystem (such as more ELT and Catalog providers). All of this is driven and directed by a growing community of customers who are innovating in data and showing us where they need us the most. As the technology space moves forward, the options available to those still searching for a data governance solution will come into focus and the best choice will be clear.  

It was amazing to soak up the data ecosystem at Snowflake Summit a few weeks ago, but until we meet again, one way to stay up to date on the latest industry topics is by watching what our partners are talking about...virtually. Below are some of the most interesting recent blog posts…  

Matillion + Snowflake: Increase Your Speed to Insight with Data Modernization

Today’s insight is tomorrow’s old news, so getting there quickly is critical. This blog post shares how data modernization preparation through automation allows enterprises to spend less time getting data ready for the ETL process and more time finding those nuggets of knowledge. Snowflake and Matillion get users there faster. (Snowflake also clearly sees the power of the combination...)

BigID: Utah Pushes Privacy Legislation to the Forefront

Another state, another privacy act. While a new federal privacy regulation is making the rounds, the Utah legislature unanimously passed the Utah Consumer Privacy Act (S.B. 227) on March 3, 2022 — one day before it adjourned for its 64th session. The Governor will have twenty days to decide whether to sign, not sign, or veto the bill. If signed, this would become the fourth state privacy act in the United States following California, Virginia, and Colorado. BigID explains how they can help customers comply with the new regulation – and all state, federal and international privacy laws.  

Alation: How Fifth Third Bank Democratizes Data Access via a Data Mesh with Alation and Snowflake

Despite being one of the largest and best-known US consumer financial services institutions, Fifth Third Bank did not have the proper structure to support its data growth needs. Alation stepped in to help the bank democratize data usage by creating a more self-serve structure – delivering the right people the right data to do their jobs. We heard a lot about the idea of “data mesh” at Snowflake Summit so it’s interesting to see it in the wild.  

Snowflake: A CDO’s Field Guide to Finding Value in Data

The rise of the Chief Data Officer (CDO) was one of our top 2022 predictions, and it looks like Snowflake is seeing it come true. CDOs are taking the lead on modernization, creating the data value chain and delivering data-driven insights that require collaboration across the organization. This blog post explains how the other “D” in “CDO” is “diplomacy” and provides some tips for demonstrating the value of data and its impact across the business.  

After Summit, we’re curious to see the next hot industry topics! Don’t worry, we’ll keep you posted.  

Q2 was a busy and inspiring quarter for ALTR. Between day-to-day operations, our bi-annual GTM kickoff at our Florida office, and multiple events attended, the past three months were anything but dull.

Notably, many members of our team enjoyed a full week in Las Vegas, NV at Snowflake Summit 2023. We had the great honor of sharing a speaking session with Matillion, co-hosting two after-hours events with Passerelle and with Matillion, and enjoying a week of learning as much as we could from our partners, customers, and friends in the data ecosystem. We’ve compiled the big announcements our partners made this year and some of our highlights of the event below.

Snowflake - Industry Recap

Snowflake shared a Snowflake Summit recap blog post, in which they dive into the data landscape and data forecast across multiple business sectors: Financial Services; Retail & Consumer Packaged Goods; Healthcare & Life Sciences; Manufacturing; Telecom; and Advertising, Media, and Entertainment.

They write, “As we continue to revolutionize the way businesses operate, allowing [data users] to solve their most pressing problems and drive revenue through the Data Cloud, the insights, expertise, and experiences we offer at Summit have continued to grow. And this year was no different! Powered by our latest product offerings and the ways in which we’re enabling companies to leverage AI as a competitive advantage, this year’s event was electrifying, with tons of fantastic product demos, customer sessions, keynotes, and more. […] This recap offers the high points of how the Data Cloud is reshaping the data landscape for a large number of industries.”

Read More from Snowflake here

Exciting News in the Data Ecosystem

Matillion Announces the End of Slow, Fragmented, Expensive Data Pipelines

Matillion launched a new productivity platform for data teams this year at Snowflake Summit. The platform is built to empower the full data team – including day-to-day business users – to move, transform, and orchestrate data pipelines, regardless of technical knowledge.

“Matillion makes data work more productive by empowering the entire data team – coders and non-coders alike – to move, transform, and orchestrate data pipelines faster. Its Data Productivity Cloud empowers the whole team to deliver quality data at a speed and scale that matches the business’s data ambitions.”

Read More About the Data Productivity Cloud here

Alation Launches Connected Sheets in Snowflake

Alation announced a new way to access Snowflake data directly from Excel and Sheets using their new product offering: Alation Connected Sheets. This offering from Alation is another step in the right direction of streamlining data productivity using the platforms business users rely on most frequently.

“With Alation Connected Sheets for Snowflake, business users can now instantly find, understand, and trust the data they need – without leaving their spreadsheet. This product enables business users to access Snowflake source data directly from the spreadsheets in which they work without the need to understand SQL or rely on central data teams. Alation Connected Sheets also enable data governance teams to set access policies on which users can access various data objects.”

Read More from Alation here

Tacos, Margaritas, and Good Data

Passerelle’s Data Oasis

One highlight of Snowflake Summit 2023 was the pair of after-hours events ALTR had the honor of co-hosting. Alongside Passerelle, Talend, SqlDBM, Equifax and GrowthLoop, we enjoyed a fajita bar, custom ALTR-itas, and fantastic conversations with our partners in the data ecosystem. Nearly every stage of the data lifecycle was represented at this event, giving attendees a full view of data productivity and data security.  

Carolyn Fernald, the Marketing and Event Coordinator behind Data Oasis, wrote, “We hosted a really fun event with Talend, SqlDBM, ALTR, Equifax, and GrowthLoop (Flywheel) — the room was abuzz while folks completed a scavenger hunt and learned about new offerings from the partners.”

Thanks for hosting a fantastic event, Passerelle!

Matillion’s Fiesta in the Clouds

The other after-hours event ALTR co-hosted was Fiesta in the Clouds alongside Matillion, Amazon Web Services, Dataiku, Deloitte, and ThoughtSpot. This event took over both floors of the Chayo Restaurant on the LINQ Promenade in Vegas, offering attendees the opportunity to connect over good data, good tacos, and great margaritas. Each of the co-hosts showed up ready to make it a great time for hosts and attendees alike!

Kathy O’Neil, Director, Customer and Partner Programs at Matillion, wrote, “HUGE THANK YOU to the 900+ people that joined Matillion for Tuesday night’s Fiesta in the Clouds at Snowflake Summit and to our sponsors, Amazon Web Services (AWS), ALTR, Dataiku, Deloitte, ThoughtSpot - it was an incredible evening!”

We are already counting down the days to Snowflake Summit 2024 in San Francisco where ALTR will be a Blue Square sponsor!

The famous motorcycle stunt rider Evel Knievel, who holds the world record for the most broken bones in his lifetime, once said, “I did everything by the seat of my pants. That's why I got hurt so much.” He was talking about daring feats, like jumping his red-white-and-blue motorcycle across the Snake River Canyon. But he could have been talking about business intelligence (BI) and data security.    

Performing daredevil feats takes careful planning and a team. Knievel did everything when he started, from setting up the jumps to writing his promotional press releases. But it was a painful process and took time and energy away from his primary focus, performing. So he recruited a team who could handle myriad tasks while he focused on executing the stunt. 

The same goes for unlocking business intelligence to have a complete 360° view of your organization. You may have been initially successful with data analysis using BI and data platforms themselves, like Tableau and Snowflake. That’s fine for business data, but what about regulated data? Not including sensitive information like personally identifiable information (PII) leaves essential data on the table and obscures the view of your business operations.  

Attempting BI projects in the cloud without a holistic approach for BI governance is like trying to jump a motorbike across a canyon all by yourself. You need a methodical process to seamlessly integrate BI, data governance, and the data warehouse.  

Here are four steps to a successful analytics governance strategy:  

  1. Implement a data control and protection solution that integrates with your cloud data warehouse. The governance solution should employ contextual info provided by the BI tool to distinguish users from each other. Sending through information on the specific BI user making the request simplifies things. The database admin needs only to configure and manage a single, shared BI service account, yet gain per-user visibility and governance as though every data end-user had their own account.  
  2. Set up the appropriate policies in the governance tool. You should be able to apply policies that restrict access at a granular level – by user role or by database row – and then audit every instance (see the sketch after this list). BI admins gain flexibility and controls to dial in security policies for sensitive data. 
  3. Use governance tool / BI tool integration that allows you to a) split out the service account access by individual users and b) place thresholds/rate limit policies by individual users or roles. Operators can adjust the policy thresholds to optimize collaboration while preventing data theft or accidental exposure.  
  4. Set up alerts to collaboration tools like email and Slack to give you a heads up when a user’s access is out of compliance. As part of the auditing trail, you want to know exactly when and who is trying to access sensitive data and if any patterns are abnormal.  
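
For context, here is a hedged sketch of the kind of row-level restriction step 2 describes, using Snowflake’s native row access policies; a governance tool would generate and manage this for you, and every name below is illustrative:

    -- Map roles to the regions whose rows they may see, then enforce it with a row access policy.
    CREATE ROW ACCESS POLICY IF NOT EXISTS governance.policies.region_rows
      AS (region STRING) RETURNS BOOLEAN ->
        CURRENT_ROLE() = 'BI_ADMIN'            -- placeholder admin role sees every row
        OR EXISTS (
             SELECT 1
             FROM governance.mapping.role_regions m
             WHERE m.role_name = CURRENT_ROLE()
               AND m.region = region
           );

    ALTER TABLE analytics.transactions
      ADD ROW ACCESS POLICY governance.policies.region_rows ON (region);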

 BI and analytics governance creates accountability and enables access to secure and trusted sensitive content for users, so you don’t have to fly by the seat of your pants to deliver a complete view of your organization to business leaders. 

Want to jump ahead? Consider ALTR, Tableau and Snowflake Together

Read our eBook to determine the best BI governance strategy for unlocking business value.  

If you’ve already selected Tableau and Snowflake, keep in mind that ALTR has developed a unique solution that employs contextual info provided by Tableau to distinguish users and allows you to apply governance policies to the data in Snowflake:    

  1. With a simple, one-time configuration of a SQL database variable in Tableau Server, the service account that Tableau uses to connect to Snowflake can send through information on which Tableau user is making the request. 
  2. ALTR can then apply governance and security policy on that user as it would on any individual Snowflake account. 
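
To illustrate the idea (the exact variable name ALTR reads is set during configuration, so treat everything here as an assumed sketch), Tableau’s Initial SQL can pass the signed-in Tableau user to Snowflake as a session variable, which policy logic can then reference instead of the shared service account:

    -- In the Tableau data source's "Initial SQL" for the Snowflake connection:
    SET TABLEAU_USER = [TableauServerUser];

    -- Snowflake-side policy expressions can then branch on that session variable, e.g.:
    --   CASE WHEN GETVARIABLE('TABLEAU_USER') IN ('jsmith', 'akumar') THEN val
    --        ELSE '*** MASKED ***' END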

ALTR is the only Snowflake provider delivering this high level of integration to solve a vast BI security headache.  

The combination of Tableau, Snowflake, and the award-winning ALTR SaaS platform delivers more value from your data in minutes while helping you avoid the headache of managing thousands of user accounts.  

See how it works

Today, most companies understand the significant benefits of the "Age of Data." Fortunately, an ecosystem of technologies has sprouted up to help them take advantage of these new opportunities. But for many companies, building a comprehensive modern data ecosystem to deliver data value from the available offerings can be very confusing and challenging. Ironically, some technologies that have made specific segments easier and faster have made data governance and protection appear more complex. 

Big Data and the Modern Data Ecosystem

"Use data to make decisions? What a wild concept!" This thought was common in the 2000s. Unfortunately, IT groups didn't understand the value of data – they treated it like money in the bank. They thought data would gain value when stored in a database. So they prevented people from using it, especially in its granular form. But there is no compound interest on data that is locked up. The food in your freezer is a better analogy. It's in your best interest to use it. Otherwise, the food will go bad. Data is the same – you must use, update, and refresh it, or else it loses value. 

Over the past several years, we've better understood how to maximize data's value. With this has come a modern data ecosystem of disruptive technologies that enable and speed up the process, simplifying complicated tasks and reducing the cost and complexity of completing them. 

But looking at the entire modern data ecosystem, it isn't easy to make sense of it all. For example, if you try to organize companies into a technology stack, it's more like "52-card pickup" – no two cards will fall precisely on each other, because very few companies present the same offering, and very few cards line up side by side to offer perfectly complementary technologies. This is one of the biggest challenges of trying to integrate best-of-breed offerings. The integration is challenging, and the interstitial spots are difficult to manage.

If we look at Matt Turck's data ecosystem diagrams from 2012 to 2020, we see a noticeable trend of increasing complexity – both in the number of companies and in their categorization. It isn't easy to parse, even for those of us in the industry. While the organization is done well, pursuing a taxonomy of the analytics industry is not productive. Some technologies are mis-categorized or misrepresented, and some companies should be listed in multiple spots. Unsurprisingly, companies attempting to build their modern stack might be at a loss. No one knows or understands the entire data ecosystem because it's massive. These diagrams have value as a loosely organized catalog but should be taken with a grain of salt.

A Saner, but Still Legacy Approach to the Modern Data Ecosystem

Another way to look at the data ecosystem – one based more on the data lifecycle – is the "unified data infrastructure architecture" developed by Andreessen Horowitz (a16z). It starts with data sources on the left, moves through ingestion/transformation, storage, historical processing, and predictive processing, and ends with output on the right. At the bottom are data quality, performance, and governance functions pervasive throughout the stack. This model is similar to the linear pipeline architectures of legacy systems.  

As with the previous model, many of today's modern data companies don't fit neatly into a single section. Most span two adjacent spaces; others surround "storage" – for example, having both ETL and visualization capabilities – to give a discontinuous value proposition. 


1) Data Sources

On the left side of the modern data ecosystem, data sources are obvious but worth discussing in detail. They are the transactional databases, applications, application data and other data sources mentioned in Big Data infographics and presentations over the past decade. The main takeaway is the three V's of Big Data: Volume, Velocity and Variety. Those Big Data factors had a meaningful impact on the Modern Data Ecosystem because traditional platforms could not handle all three V's. Within a given enterprise, data sources are constantly evolving.

2) Ingestion and Transformation

Ingestion and transformation are a bit more convoluted. You can break this down into traditional ETL or newer ELT platforms, programming languages for the promise of ultimate flexibility, and event and real-time data streaming. The ETL/ELT space has seen innovation driven by the need to handle semi-structured and JSON data without lossy transformations. There are many solutions in this area today because of the variety of data and use cases. Solutions capitalize on ease of use, efficiency, or flexibility – and I would argue you cannot get all three in a single tool. Because data sources are dynamic, ingestion and transformation technologies must follow suit.
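
As a simple illustration of the ELT pattern (all object names here are placeholders, not any vendor's actual schema), raw JSON can be landed as-is in Snowflake and flattened afterward in SQL, so nothing is discarded on the way in:

    -- Land raw JSON untouched, then shape it after loading.
    CREATE TABLE IF NOT EXISTS raw.events (payload VARIANT);

    COPY INTO raw.events
      FROM @raw.event_stage
      FILE_FORMAT = (TYPE = 'JSON');

    -- Transform downstream while the original documents stay intact.
    SELECT
      payload:customer.email::STRING     AS email,
      payload:order.total::NUMBER(10, 2) AS order_total
    FROM raw.events;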

3) Data Storage 

Storage has recently been a center of innovation in the modern data ecosystem due to the need to meet capacity requirements. Traditionally, databases were designed with computing and storage tightly coupled. As a result, the entire system would have to come down if any upgrades were required, and managing capacity was difficult and expensive. Today, innovations are quickly stemming from new cloud-based data warehouses like Snowflake, which has separated compute from storage to allow for improved elasticity and scalability. Snowflake is an interesting and challenging case to categorize. It is a data warehouse, but its Data Marketplace can also be a data source. Furthermore, Snowflake is becoming a transformation engine as ELT gains traction and Snowpark gains capabilities. While there are many solutions in the EDW, data lake, and data lakehouse industries, the critical disruptions are cheap infinite storage and elastic and flexible compute capabilities.  
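
Because compute is decoupled from storage, adjusting capacity is a one-statement operation with no downtime or data movement – a quick, hedged illustration (the warehouse name is made up):

    -- Scale compute up for a heavy workload, then back down when it subsides.
    ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'XLARGE';
    ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'XSMALL';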

4) Business Intelligence and Data Science

The a16z model breaks down into the Historical, Predictive and Output categories. In my opinion, many software companies in this area occupy multiple categories, if not all three, making these groupings only academic. Challenged to develop a better way to make sense of an incredibly dynamic industry, I gave up and oversimplified. I reduced this to database clients and focused on just two types: Business Intelligence (BI) and Data Science. You can consider BI the historical category, Data Science the predictive category and pretend that each has built-in "Output." Both have created challenges for the data governance space with their ease of use and pervasiveness.

BI has also come a long way in the past 15 years. Legacy BI platforms required extensive data modeling and semantic layers to harmonize how the data was viewed and overcome the performance issues of slower OLAP databases. Since a few people centrally managed these old platforms, the data was easier to control. In addition, users only had access to aggregated data that was updated infrequently. As a result, the analyses provided in those days were far less sensitive than today. In the modern data ecosystem, BI brought a sea change. The average office worker can create analyses and reports, the data is more granular (when was the last time you hit an OLAP cube?), and the information is approaching real-time. It is now commonplace for a data-savvy enterprise to get reports updated every 15 minutes. Today, teams across the enterprise can see their performance metrics on current data and make fast changes in behavior and effectiveness.

While Data Science has been around for a long time, the idea of democratizing it has started to gain traction over the past few years. I use the DS term in the general sense of statistical and mathematical methods focusing on complex prediction and classification beyond basic rules-based calculations. These new platforms increased the accessibility of analyzing data in more sophisticated ways without worrying about standing up the compute infrastructure or coding complexity. "Citizen data scientists" (using the term in the most general sense possible) are people who know their domain and have a foundational understanding of what data science algorithms can do, but lack the time, skill, or inclination to deal with the coding and the infrastructure. Unfortunately, this movement also increased the risk of exposure to sensitive data. For example, analysis of PII may be necessary to predict consumer churn, lifetime value or detailed customer segmentation. Still, I argue it doesn't have to be analyzed in raw or plain-text form. 

Data tokenization – which allows for modeling while keeping data secure – can reduce that risk. For example, users don't need to know who the people are in order to run a cluster analysis and group them, so they never need exposure to sensitive granular data. Furthermore, with a deterministic tokenization technology, the tokens are consistent yet undecipherable, enabling database joins even when the sensitive fields are used as keys.  
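
A rough sketch of why determinism matters (purely illustrative: a salted hash stands in for a real vaulted token here; true tokenization is reversible through the vault, which a hash is not, but the analytic property is the same – identical inputs produce identical tokens):

    -- Replace the sensitive key with a deterministic surrogate in each table.
    CREATE TABLE analytics.customers_tokenized AS
    SELECT SHA2(ssn || '<per-tenant-salt>') AS customer_token, segment, region
    FROM raw.customers;

    CREATE TABLE analytics.orders_tokenized AS
    SELECT SHA2(ssn || '<per-tenant-salt>') AS customer_token, order_total
    FROM raw.orders;

    -- Joins and grouping still work, and no one ever sees a raw SSN.
    SELECT c.segment, COUNT(*) AS customers, AVG(o.order_total) AS avg_order
    FROM analytics.customers_tokenized c
    JOIN analytics.orders_tokenized o USING (customer_token)
    GROUP BY c.segment;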

Call it digital transformation, the democratization of data or self-service analytics, the aesthetic for Historical, Predictive, Output – or BI and Data Science – is making the modern data ecosystem more approachable for the domain experts in the business. This also dramatically reduces the reliance on IT outside of the storage tier. However, the dynamics of what data can do require users to iterate, and iterating is painful when multiple teams, processes, and technology get in the way.

Modern Data Ecosystem Today: A Lighter Lift Leads to Other Issues

When looking back at where we came from before the Modern Data Ecosystem, it seems like a different world. Getting data and making sense of it was a heavy lift. Because it was so hard, there was an incentive to “do it once,” driving large-scale projects with intense requirements and specifications. Today the lift is significantly lighter – customers call it quick wins; vendors call it land and expand. Ease of use and user experience have become table stakes, and the business user is now empowered to make their own data-driven decisions. 

But despite all the innovations by the enterprise software industry, some areas have been left behind. Like a car that got an engine upgrade but still has its original steering and wheels, the partial increase in performance creates increased risk. 

The production data & analytics pipeline that will remain unchanged for years is dead. Not only that, but there is also no universal data pipeline either. Flexible, easy to create data pipelines and analytics are key to business success.  

All the progress over the past decade or so has left some issues in its wake:

A single, stable, and linear data pipeline is dead

I once heard a friend of mine, a doctor, say, “If there are many therapies for a single ailment, chances are none of the therapies are good.” I used to draw a parallel between that adage and data pipelines, but I realized the analogy falls short: there are many distinct data pipeline problems to solve, each with its own needs. Data pipelines can be provisional or production, speedy or slow, structured or unstructured. The problem space where data ingestion and pipelines live is massive, to the point where having a purposeful choice is the best play. Taking the 52-card pickup reference from earlier – you need a good hand to play with, and in this game, you can choose your own cards.  

Lift and shift to the cloud can exacerbate as many problems as it solves

One of the obvious methods of enterprise architecture modernization is moving to the cloud. Taking it server by server and workload by workload – aka “lift and shift” – has seemed to be the fastest and most efficient way to get there. I would agree that it is, but it can inhibit progress if done in a strictly mechanical way. Looking back at the data ecosystem models, new analytics companies are solving problems in many ways. So the lift and shift method has a weakness that can cause one of two problems. Taking the old on-prem application and porting it to the cloud forfeits the infrastructure advantages the cloud has to offer (scale and elasticity being the most common). It also inhibits any improvements from a functional architecture perspective – calculations may be performed in the BI or Data Science layer when they could perform better or be more flexible elsewhere. The one we will focus on later is governance from an authorization or protection point of view.

Governance needs to move away from the application layer

I touched on this in the last paragraph, but it deserves to be called out on its own. Traditionally, data governance was a last-mile issue. Determining who had the correct privileges to see a report or an analysis happened at the moment it was minted. In my opinion this is a throwback from the printed-report era, because you were literally delivering the report to an inbox or an admin, which had its own inherent governance. The exact processes didn't carry forward, but the expectations did. This bore an era of software where rudimentary controls were deployed at the database, but the sophisticated controls were in the application. Fast forward to the present day. Analytics and reporting are closer to a BYOD cell phone strategy than a monolithic single platform for reports. This creates an artificial zero-sum game where analytic choice is at odds with governance. The reality is companies need to be analytically nimble to stay competitive. Knowledge workers want to help their company stay competitive and will do whatever they can to do so. The combination of easy-to-use, impactful software lets users follow the path of least resistance to get their jobs done.  

How the Modern Data Ecosystem Broke Traditional Governance

While pockets of the data ecosystem have been advancing quickly, data governance has fallen behind when it comes to true innovation. With traditional software, governance was easy because everything else was hard: the data changed slowly, the output was static. The teams executing all this technology were small, highly trained, and usually located in a single office. They had to be because the processes were hard and confusing, and the technology was very difficult to implement and use. The side effect was that data governance was easier to do because data moved slower, was in fewer places and fewer people had access. The modern data ecosystem eliminated each of these challenges one by one. Governance became the legacy process and technology that couldn’t keep up with the speed of modern data. 

Does governance need to be hard or partially effective? I would argue not, if you apply it at the right part of the process: the data layer. Determining risk exposure is important. That is where the analytics metadata management platforms have done so well. The modern data ecosystem enabled the creation of a wealth of data, analytics, and content. Management from a governance perspective provides a view into where the high-risk content is.  Now we need an easy-to-use platform to execute on a governance strategy. 

Want to find out how you can protect sensitive data in your modern data stack? Get a demo

The journey to derive value from data is long, requiring data infrastructure, analysts, scientists, and data consumption management processes, among other things. Even as data operations teams move down this path, growing pains occur: as soon as the team makes progress, more people demand more data. Problems may arise quickly or develop gradually over time. Some strategies can help. A data team must recognize that these time-wasting issues are real and have a plan to tackle them.  

1. Data Access without Automation

When you create a data catalog or establish a procedure for users to locate and request data, administering access becomes difficult. In conventional data architectures, accessing sensitive data is frequently a complex procedure involving much manual work. For example, creating and updating user accounts for several services can be time-consuming. 

Put another way, no plan for data governance survives contact with users. So even if you build your data infrastructure on a legacy data governance model, you will be busy granting access to it. For instance, one global firm I spoke with had developed a data pipeline to move customer information from an on-premises system to their cloud data warehouse. They provided self-service access, but the demand was so significant that they devoted the following three months to granting access to that system. 

Solution:

  • No-code approaches allow you to quickly access or block access to a data set within a cloud data warehouse, associate that policy with specific users, and apply various masking techniques within minutes. 
  • You can also see your users, their roles, and the sensitive data they're accessing. 
  • You can then identify areas where you can apply access policy and make it easy to create an audit trail for governance. 
  • More mature organizations struggling with thousands of data users may already have a data catalog solution. Integrating a control and protection solution with the data catalog allows you to create the policies, manage them in the catalog, and automatically enforce them in connected databases. 

2. Manual Data Migration

Once you have established your initial cloud data warehouse and data-consumption schema, you will want to import additional data sets. However, manual data-migration approaches may slow you down and restrict your ability to gain insight from multiple sources. You can gain efficiency by refining your migration approach and tools instead: 

Solution: 

  • Implement an ETL SaaS platform to eliminate manual discovery and migration tasks. It simplifies connection to multiple data sources, collects data from various sites, converts source data into tabular formats to make it easier to perform analytics, and moves it to the cloud warehouse.  
  • Use a schema manipulation tool like dbt, which transforms data directly in the cloud data warehouse. 
  • Follow a three-zone pattern for migration—raw, staging, and production. 
  • Maintain existing access and masking policies even as you add or move data or change the schema in the cloud data platform. For example, every time an email address moves around and gets copied by an automated piece of software, you must apply masking policies. In addition, you'll have to create an auditable trail every time you move data for governance. 
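
To make the last point concrete, here is a hedged sketch of the manual step this automation removes: every time a sensitive column lands in a new zone, the masking policy has to be reattached by hand (all names below are placeholders):

    -- Reapply the masking policy wherever the email column is copied.
    ALTER TABLE staging.customers MODIFY COLUMN email
      SET MASKING POLICY governance.policies.mask_pii;

    ALTER TABLE production.customers MODIFY COLUMN email
      SET MASKING POLICY governance.policies.mask_pii;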

3. Complicated Governance Auditing  

With more people accessing data, you should set up a data governance framework to guarantee all that data's protection, compliance, and privacy. When data teams strive to identify the location of data, they frequently look at query or access logs and build charts and graphs to determine who has accessed it. When a big data footprint has many users interacting with it, you should not waste time applying role-based access or creating reports manually. 

Solution: To scale auditing, you should simplify it. Doing so will allow you to: 

  • Visualize and track access to sensitive data across your organization. Have an alerting system to let you know who, where, and how your data is accessed. 
  • Keep access and masking policies in lockstep with changing schema. 
  • Understand if data access is normal or out of normal thresholds. 
  • Create and automate thresholds that block access or allow access with alerts based on rules you can apply quickly. 
  • Automate classification and reporting to show granular relationships, such as the same user role accessing different data columns. 
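
For a sense of scale, here is roughly what the manual version of that audit work looks like when done by hand against Snowflake's built-in logs – the kind of repetitive querying the steps above are meant to automate away (the table filter is a placeholder):

    -- Who touched anything customer-related in the last seven days?
    SELECT user_name, role_name, query_text, start_time
    FROM snowflake.account_usage.query_history
    WHERE query_text ILIKE '%customers%'
      AND start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
    ORDER BY start_time DESC;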

Should this even be your job?   

The most significant time sink is data engineers and DBAs handling data control and protection. Is this even logical? Because you're the ones moving data from one place to the next and you know how to write the SQL code most tools require to grant and restrict access, the job has fallen to data teams. 

But is that the best use of your time and talents? Wouldn't it make more sense for the teams whose jobs focus on data governance and security to be able to manage data governance and security? With the proper no-code control and protection solution, you could transfer these tasks to other teams – invite them to implement the policies, pick which data to mask, download audit trails, and set up alerts. Then, once you get all that off your plate, you can move on to what you were trained to do: extract value from data. 

Financial Services was one of the first industries to recognize data's significant potential to understand customers better, improve services and streamline operations. However, the financial services industry faces unique data challenges despite data’s apparent benefits. Financial information is highly sensitive; as such, data governance in financial services is driven by strict regulations to protect it. This has led to conflicting forces: the urge to hold sensitive data in highly secure, on-premises databases owned and managed by the organization’s IT and security teams versus the drive to move critical data to a centralized cloud.

Until now, misconceptions around compromised data protection and privacy and the ability to achieve regulatory compliance have deterred many financial institutions from cloud adoption. However, with the cloud’s undeniable cost, scalability and flexibility benefits, the question for financial institutions is not whether to migrate to the cloud or not but how to transition to the cloud securely. 

Virtual Fireside Chat: Secure Cloud Migration for Financial Institutions - Best practices for navigating the cloud migration challenges Financial Institutions face, with TDECU, Matillion, Wavicle and Snowflake

ALTR and Snowflake make data governance in financial services quick and easy. We help Financial Institutions comply with the strictest financial regulations while also making data available across the business. We also enable firms of all sizes to follow the Cloud Data Management Capabilities (CDMC) framework for protecting sensitive data in a cloud/hybrid cloud environment. ALTR, which seamlessly integrates with Snowflake, automates the configuration and enforcement of access, data masking, usage limits, and tokenization policy so financial institutions can spend less time worrying about data and more time using data.

1. Personalize Customer Experiences with a 360 View   

Serving financial customers is about customizing their experience, leveraging a complete view of the customer on a single platform to power analysts, data scientists and applications. Inevitably PII and PFI will be used to provide services and products that lead to customer growth. ALTR helps you automate and scale access to and protection of your customers’ data without having to write, test, and maintain intricate policies. 

  • By extending ALTR’s security and governance layer to the newly migrated Snowflake Data Cloud, financial institutions can feel comfortable with analysts and data scientists accessing data in Snowflake. 
  • Without ALTR, financial institutions would have to manually replicate the highest levels of protection using Snowflake’s code-based governance and security features. ALTR automates these tasks away and streamlines data protection wherever it needs to go within the organization.
Case Study: Security Becomes a Business Enabler at this Leading Mortgage Firm - An enterprise-wide data governance and security solution from ALTR lets data users focus on delivering value

2. Automate Enterprise-wide Data Governance

Moving to the Data Cloud can help financial institutions meet regulatory requirements while sharing live access to data through a single platform. With ALTR – a PCI DSS Level 1 service provider that is SOC 2 Type 2 certified and highly available across multiple regions – on the Snowflake Data Cloud, you can automate away policy enforcement headaches and make enterprise-wide data governance in financial services, once just a goal, a reality. 

  • Multiple integration methods allow ALTR to protect customer data from creation, through on-premises and cloud data warehouses, to multiple banking applications – from start to finish, ALTR covers the data. 
  • A single pane of glass across all the touch points that store or process sensitive data means there is always a single view of truth about data access and security. 
  • SaaS-based delivery means data can flow in and out of the company walls, from the data center to Snowflake, without any hiccup in service or protection.

Case Study: TDECU Takes a Data-Driven Approach to Supporting Its Members’ Financial Journeys - Seeking to increase data reliability and accelerate data-driven decision-making, TDECU began its cloud migration with Snowflake. It turned to ALTR to protect member data and ensure regulatory compliance.

3. Monitor Security Controls in Real-Time

Financial institutions can monitor compliance against enterprise security controls like NIST, ISO, CIS or others, using Snowflake with ALTR. Security teams can send ALTR’s rich data usage and event information back into Snowflake to provide the backbone for security analytics and auditing. 

“Through their native cloud integration with Snowflake’s platform, ALTR’s approach to providing visibility into data activity in Snowflake’s Data Cloud provides a solution for customers who need to defend against security threats,” said Omer Singer, Head of Cybersecurity Strategy, Snowflake. 

  • A leading mortgage firm loads all alerts and signals to Snowflake Security Data Lake – regardless of source – enabling the security team to bring all enterprise data stores under surveillance.   
  • According to the firm’s CISO, “With ALTR’s integrations and scalability, I'm not just solving this problem on Snowflake. We're starting with and building on Snowflake, but we’re embracing a solution that can solve this problem across the enterprise for us.”   
Blog: Moving to the Cloud Doesn't Have to Be Daunting for Small and Mid-size Financial Institutions - Financial Institutions can easily and safely move their enterprise data warehouse to Snowflake + ALTR

Wrapping Up  

For any industry, migrating to the cloud can seem daunting. This is especially true in financial services, where extensive regulations have created the illusion that cloud deployments are challenging and risky. ALTR helps remove this risk with automated enterprise-wide data governance that allows financial institutions of all sizes to control and protect regulated financial data in the cloud as securely as in the data center. 

See how ALTR can help improve your financial services data governance.

When one of my relatives started participating in a research study run by a major west coast university, it got me thinking about the personal data higher education organizations gather and administer. For this study, he shared all his previous medical history; handed over personal data including name, address, and social security number; and now goes in once a week to have his vitals read and recorded. This study includes hundreds of people and is just one research project in one department in a huge organization.  

Universities are really like large conglomerates that can include healthcare facilities, scientific and medical research laboratories as well as businesses that market to potential students for recruiting and alumni for fundraising. This makes universities a substantial target for bad actors looking for rich sources of data to steal. In April 2021, Gizmodo revealed that several leading US universities had been affected by a major hack, with some sensitive data revealed online in an attempt to pressure ransom payments. The COVID-19 pandemic brought increased threats to higher ed which are negatively impacting school credit ratings, according to Moody’s.  

And because universities span across multiple industries, they must comply with multiple relevant laws including PII and HIPAA privacy protections in addition to specific regulations around student data laid out in the Family Educational Rights and Privacy Act (FERPA). FERPA puts protections around “student education records” including grades, transcripts, class lists, student course schedules, and student financial information.  

This all makes data privacy more complicated, but even more crucial for higher education organizations.

New insights, better student outcomes

The good news is that all this data is worth the risk. It’s leading to medical innovations like the new drug trial my relative is participating in. It’s also leading to better outcomes for students. After seeing research from other universities on the power of momentum in student success, Michigan State University analyzed 16 years of student data and found a similar result. Students who attempted 15 or more credits in their first semesters saw six-year graduation rates close to 88 percent compared to the university’s overall 78 percent graduation rate. This led the school to launch a campaign encouraging students to attempt 15 credit hours to graduate faster and save money.

Arizona State University, Georgia State and other leading universities developed adaptive learning technology that continually assesses students and gives instructors feedback on how they can adjust to class or individual student needs. This contributed to a 20% increase in completion for 5,000 algebra students. The University of Texas at Austin is doing something similar to Michigan State by using more than 10 years of demographic and academic data to determine the likelihood a student will graduate in four years. Students identified as at-risk are offered peer mentoring, academic support and learning through small affinity groups.

Unified data governance across the university data ecosystem

As higher ed organizations strive to optimize use of data, they’re migrating workloads to cloud data platforms like Snowflake to optimize financial performance and provide streamlined analytics access. They’re making data available to increasing numbers of users; collecting and sharing sensitive data including PII and PHI, while complying with relevant regulations; and distributing data across multiple tools for data classification and business intelligence.  

This would be difficult enough with the siloed processes some universities face. Trying to coordinate this across multiple departments, conflicting architectures, and legacy tools can be almost impossible. In addition, giving more and more users access to more data creates more and more risk, exposure, attack surfaces, and potential credential threats. “All or nothing” access doesn’t work at scale. In order to ensure that they’re meeting regulatory requirements, universities must have visibility into who is accessing what data, when and how much, have the power to control that access, and document that access on command.  

ALTR can help by unifying data governance across the data ecosystem. ALTR can help you know, control and protect all your data, no matter where it lives. We do this by providing discovery, visibility, management, and security capabilities that enable organizations to identify sensitive data, see who’s accessing it, control how much is being consumed, and take action on anomalies or policy violations. And ALTR works across all the major cloud databases, business intelligence, ETL, and tools such as Snowflake, AWS, ThoughtSpot, Matillion, OneTrust, Collibra, and more.

Don’t let data privacy or protection concerns become a roadblock to the essential research and insights universities provide. Protect data throughout your environment and stay compliant with all the regulations with unified data governance from ALTR.

Interested in how ALTR can help simplify data governance and protect your sensitive data? Request a demo!

After many in-depth, heated conversations around the ALTR office about the dynamics of the data governance and security industries, in 2021 we decided it was time to jump into the online conversation with both feet! January is a great time to look back at our top data governance blog posts for the year, and there are some clear trends in what we covered: we feel passionately that data governance without security just isn’t enough, that knowing your data means knowing how your data is being used and why, and that Snowflake + ALTR delivers an easy yet powerful solution to the data governance and security problem.

And judging by our top data governance blog posts for the year, we can tell those topics resonated with you…  

Data Governance requires visibility

The Hidden Power of Data Consumption Observability  

Seeing who’s consuming what data, where, when, why and how much lets you make better decisions for your business.  

Do You Know What Your Tableau Users Are Doing in Snowflake?

If you’re using a Tableau service account to let your users access Snowflake data, you may not be able to see what data users are accessing or govern them individually. ALTR can help.  

Why “Why?” is the Most Important Question in Governing Data Access

When you know why data is being used, you can more easily create purpose-based access control policies that custom-fit your organization, are simpler to automate, and reduce risk.  

Data Governance also has to include control and security

The Shifting Lines Between Data Governance and Data Security

Automated policy implementation lets data governance teams control data access, allowing for greater effectiveness and efficiency across the process.

No Matter What You Call It, Data Governance Must Control and Protect Sensitive Data

The Forrester Wave™: Data Governance Solutions report shines a light on the Data Intelligence/Governance space.

Thinking About Data Governance Without Data Security? Think Again!

Traditional data governance without security puts your company and your data at risk. A complete data governance solution must include security.

ALTR + Snowflake = an easy and powerful data control and protection solution

Go Further, Faster with Snowflake and ALTR

Boost your value from Snowflake more quickly by embracing ALTR's complete data control and protection right from the start

Why DIY Data Governance When You Could DI-With ALTR for Free?

What if controlling and protecting your data could be free, easy and in some cases, more powerful? See why doing it with ALTR Free is better.

Data Governance blog topics coming in 2022

This year we plan to dive deeper into where the data governance industry is falling short, why data privacy and protection will continue to become more and more critical, and how you can make sure you’re prepared for what’s coming down the pike in 2022.  

Sign up for our newsletter to get the latest first!


Consolidating your business data in cloud data warehouses is a smart move that unlocks innovation and value. All your data in one place makes it easier to connect the dots in ways that were impossible or unimaginable before. For instance, a retail chain can optimize sales projections by analyzing weather patterns, or a logistics company can more accurately predict costs by accounting for the salaries of all the people involved in a shipment.  The key to making a project like this successful is to overcome the cloud migration challenges that can pop up along the way.

Sensitive PII: Cloud Migration Challenge and Opportunity

Getting those new data-fed insights is a process that starts with moving the data to a consolidated cloud data warehouse like Snowflake. An extract, transform, and load (ETL) migration technology partner simplifies moving or loading the data from each of your company’s locations into a cloud data warehouse to make it analytics-ready in no time. Migrating data is what these companies do best. Data governance and sensitive data security are not their priority, however, which is a tremendous concern when the most sensitive data is often the most valuable – both to the business and to bad actors. That makes sensitive data migration one of the biggest cloud migration challenges. Confidential information like customer PII, which includes email, home addresses, or social security numbers, can be extremely useful to analytics. For example, it can help marketing teams know where, when and to whom they should target a specific offer if they can determine which age, sex, and location are most likely to buy. However, if breached, customer PII can create significant legal and reputational risk.  

The need for high levels of data protection and secure access can cause significant tradeoffs in data usability and sharing, which adds risk and complicates matters for analytics teams. Even the built-in security and governance capabilities of data warehouses require a level of database coding expertise that is costly to implement and time-consuming to manage at scale. Distributed enterprises need a thoughtful yet simpler approach to protecting data in the cloud that keeps information airtight and doesn’t slow down access and progress. 

Overcoming Cloud Migration Challenges with Integrated Data Security

Before you migrate data to the cloud, let’s understand how cloud migration data security can help overcome your cloud migration challenges - and why some solutions fit better than others for your specific business needs. We all know that we must protect sensitive data in order to comply with appropriate regulations and maintain the trust of our customers. What is not always clear is if the same standards for storing and protecting data on-premises also apply to data in the cloud. 

These requirements include using NIST-approved security standards for at-rest data protection. At a minimum, we must ensure there’s not a single door for hackers to get through, known as a single-party threat. If data can be de-obfuscated by just one person, the protection method doesn’t count. For example, simply reversing a medical record is not enough. Encryption meets this requirement because you need both the encrypted data and the key to unlock it in order to access the original data.  

For data in the cloud, however, you need to rethink tooling and management decisions. Let’s look at methods for data obfuscation including encryption, but through a cloud lens. You’ll quickly run into several issues:  

  • You can’t expect to connect your on-prem key management solution to a cloud data warehouse like Snowflake and have it work at scale
  • Someone who gets the key can decrypt all the data stored in your centralized data warehouse
  • You also need fail-safes to prevent users with privileged access, like the Snowflake administrator, from being able to view the data if they’re not supposed to

To avoid these access and encryption issues, some security methods rely on transforming data through “one-way” techniques like hashing before storing the hash in the cloud. Hash codes ensure privacy while still letting users know what the dataset comprises – for example, social security numbers. However, an authorized user who needs the real social security number won’t be able to retrieve it, because once hashed, the data cannot be recovered in the cloud database.  

Even anonymization techniques, such as storing the data as a range, limit the application of data. You might not need an individual anonymized data point today, but you may very well need it later. Your business may depend on allowing some authorized users to have access to the original data, while ensuring it is meaningless and opaque to everyone else.  

If analytics is the goal of your sensitive data migration, then the preferred security solution is tokenization for its ability to balance data protection and sharing.

4 Major Benefits of Tokenization for Cloud Migration

When it comes to solving security-related cloud migration challenges, tokenization has all the obfuscation benefits of encryption, hashing, and anonymization, while providing much greater analytics usability. Let’s look at the advantages in more detail. 

  1. Tokenization replaces plain-text data with a completely unrelated token that has no value if breached. There’s no mathematical formula or key; the real data remains secure in a token vault (a minimal sketch of this vault model follows the list).    
  2. You can perform high-level analysis just as you could on the real data, without having access to the real thing. In contrast, you have limited analytics capability on anonymized data and none on hashed or encrypted data. With the right tokenization solution, you can feed tokenized data directly from the warehouse into any application without decrypting it and inadvertently exposing it to privileged users.   
  3. Retaining the connection to the original data enables more granular analytics than anonymization. Anonymized data is hamstrung by the original parameters, such as age ranges, which might not provide enough granularity or flexibility for future applications. With tokenized data, analysts can create fresh slices of cloud data as needed, down to the original, individual PII. 
  4. Tokenization combines the analytics opportunity of anonymization with the strong at-rest protection of encryption. Look for approaches that limit the amount of previously masked data that can revert to its original form (de-tokenization) and that issue notifications and alerts for de-tokenized data, so you can ensure only approved users get the data. 
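As a rough mental model only (not ALTR’s implementation), the Python sketch below shows the vault idea: the warehouse keeps only random tokens, the vault keeps the mapping, and every de-tokenization request is authorized and logged. All names and structures here are assumptions for illustration.

```python
import secrets

class TokenVault:
    """Toy token vault: random tokens go to the warehouse, real values stay here."""

    def __init__(self):
        self._vault = {}        # token -> original value
        self._reverse = {}      # original value -> token (same value, same token)
        self.audit_log = []     # every de-tokenization attempt, allowed or denied

    def tokenize(self, value: str) -> str:
        if value in self._reverse:               # deterministic, so joins and group-bys still work
            return self._reverse[value]
        token = f"tok_{secrets.token_hex(8)}"    # random: no formula or key links it to the value
        self._vault[token] = value
        self._reverse[value] = token
        return token

    def detokenize(self, token: str, user: str, authorized_users: set) -> str:
        allowed = user in authorized_users
        self.audit_log.append((user, token, "allowed" if allowed else "denied"))
        if not allowed:
            raise PermissionError(f"{user} is not approved to de-tokenize this data")
        return self._vault[token]

vault = TokenVault()
stored_in_warehouse = vault.tokenize("123-45-6789")

# Analysts can count, join, and group on the token without ever seeing the SSN;
# only approved users can reverse it, and every attempt lands in the audit log.
print(vault.detokenize(stored_in_warehouse, user="billing_app",
                       authorized_users={"billing_app"}))
```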

Embedding Tokenization in Your Cloud Migration Data Pipeline

One of the best approaches to solving your sensitive data cloud migration challenge is to embed data security and governance right into your migration pipeline. ALTR has partnered with ETL leader Matillion to do just that. ALTR's open-source integration for Matillion ETL lets you tokenize data through Matillion so that it's protected in the flow of your cloud migration. The ALTR shared job automatically tokenizes sensitive columns that have been loaded into Snowflake and applies policy to them.


Wrapping Up

Given the volume of data being generated and collected, enterprises are looking for ways to scale data storage. Migrating their data to the cloud is a popular solution, as it not only solves data volume problems, but also offers numerous advantages. While it would be nice to flip a switch and be in the cloud, it’s not that simple! Moving to the cloud requires a strategy, and a big part of that strategy is data security. Tokenization solves one of the biggest cloud migration challenges: sensitive data migration. It delivers tough protection for sensitive data while preserving the flexibility to use the data down to the individual level, so companies can unlock the value of their cloud data quickly and securely.  

Organizations are moving more data to the cloud than ever before. According to a 2019 Deloitte study of more than 500 IT leaders, security and data protection is the top driver for moving data to the cloud, moving ahead of traditional reasons like reduced costs and improved performance. With cyber security threats continuing to increase in volume and effectiveness, it makes sense that companies would look to cloud providers and online data warehouses for expert, enterprise-level security and protection.  

But while some companies are comfortable with this shift, others feel like it’s too much too fast. Having been used to maintaining data within their own data centers, with their own security solutions and policies, IT and security managers can feel that moving both data and security to a third party puts too much outside their control. Various security providers have stepped in with products that deliver that control, many using a cloud proxy server. While at first glance this may seem like an ideal solution, it’s really a step back from the benefits of moving to the cloud. Let's look at when it makes sense to use a cloud proxy for data security and when it does not.

A cloud proxy may seem like a good compromise, but it comes with limitations

Vendor-provided cloud proxy security solutions do have some advantages: they can allow you to set up your own policies and maintain custom control over who accesses data, when and how. They can also allow you to centralize control across a wide variety of cloud data stores because they’re not tied to a specific platform or API. But along with this level of control comes additional work for your team.  

You’ll have to worry about deploying, maintaining, upgrading and scaling this additional component in your infrastructure – responsibilities you’ve tried to avoid by moving to the cloud. If you choose a standalone cloud proxy you’ll run into the privacy issues of sending data through a third party. You’ll also need to modify all your applications to go through the cloud proxy. And if you have applications that don’t go through the proxy, you can’t see what users are doing with your data, let alone stop them. The applications that do go through the proxy may run into issues when cloud platforms make configuration changes you’re unaware of, forcing downtime. Many of the largest cloud applications and data platforms, like Microsoft Office 365, discourage proxy use.  

Control of your data, without owning the infrastructure: the future of cloud platforms

The good news is that today you have options that didn’t exist even a few years ago. As more and more applications, workloads, and data move to the cloud, more and more supporting infrastructure is moving as well. Leading cloud apps and platform providers are building in ways for you to have the same kind of hands-on control over your data you had when it lived on your infrastructure or even via a cloud proxy security solution – because it is your data. Salesforce, for example, allows users to disguise and tokenize emails in their platform using a third-party service, making them visible only at the point of sending a mass email via marketing automation software. The leading SaaS providers understand they can gain competitive advantage by helping users who were not as comfortable with the move to the cloud get comfortable.  


Snowflake is a leader in this space as well, seeing themselves as part of that larger cloud ecosystem. Snowflake provides an extensible platform that allows you to choose to run the platform’s powerful security tools or integrate third-party security solutions like ALTR that sit beside the data, instead of between the data and your users like a proxy would. You get all the benefits of a cloud solution – scalability, stability, low maintenance – and all the advantages of running your own security – the ability to mask data so it’s invisible to Snowflake, for example. This allows you to maintain that sense of “checks and balances.”  

Today’s best platforms understand that they’re part of a cloud infrastructure and data ecosystem, and they’re allowing other products to plug and play natively rather than using a cloud proxy to provide those features. The future of sensitive data in the cloud is integrations like this, and cloud data stores will continue to launch features that allow users to control their data more intimately.  

The benefits of a cloud platform with control of your sensitive data. That’s where the future is.

Want to see how ALTR integrates with OneTrust and Snowflake? Check out our on-demand webinar, "Simplifying Data Governance (and Security) through Automation." Click here to learn more!


Imagine you’re in the midst of a brand-new data governance project. You’ve done your data discovery and classification – you know where your sensitive data is. Now you just have to write the policies that determine who gets access to it. But how should you approach setting up those permissions? This blog will walk you through considerations for RBAC vs ABAC vs PBAC. If you already know what those mean, great. If not, don't worry, we'll explain.

Let's Start with RBAC or Role-Based Access Control

Starting with marketing, you think, “Marketing sends emails to customers, so they need access to customer information.” It just makes sense, right? This would be “role-based access control” (RBAC) – users get access to data based on their spot in the company org chart.  

But if you start to dig deeper, it gets more complicated – does the VP of brand need customer email access? Does a copywriter? Unless you’re actually in marketing – or product development or finance – it can be difficult to know what roles actually need access to specific sensitive data in order to do their jobs. Does everyone in finance need access to all the financial information? Does a UX Designer in the product team need access to all the technical product specs? People on the same teams can have very different jobs and need very different tools and access. And, if people have access to sensitive data that’s not essential to their jobs, it just increases the risk of potential theft or loss.  
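In code, RBAC boils down to something like the sketch below; the roles, columns, and grants are hypothetical, chosen only to mirror the marketing example:

```python
# Hypothetical RBAC: access is decided purely by where someone sits on the org chart.
ROLE_GRANTS = {
    "marketing":   {"customer_email", "customer_city"},
    "finance":     {"invoice_amount", "payment_method"},
    "engineering": set(),
}

def can_read(role: str, column: str) -> bool:
    return column in ROLE_GRANTS.get(role, set())

# Every marketer -- VP of brand, copywriter, ops specialist -- gets the same answer,
# whether or not their actual job requires customer email addresses.
print(can_read("marketing", "customer_email"))   # True for the whole department
```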


How could you find out who needs what? Do a survey or reach out to each individual to ask? Or work with the department head, who may not even know exactly what each person needs? That can get really time-consuming. Let's take a look at some other approaches - RBAC vs PBAC vs ABAC - and don't worry, we'll explain each.

Asking “Why” Leads to "PBAC" or Purpose-Based Access Control

You could go ahead and set up your role-based access controls – give access to the marketing team – and then see what happens. See who uses what data, when and how much. And then you can ask the most important question: why? You can get very targeted about why a user is accessing certain data at certain times. Maybe the Marketing Ops Specialist is accessing 3,000 customer emails on Thursdays to send out a coupon for the weekend. Not only does watching consumption allow you to narrow in on the data being accessed, but it also acts as a source of truth. Instead of relying on people to accurately tell you what data they need, you can see what data they’re actually using.  

Once you understand the reason the data is being accessed and agree that this is an acceptable business usage, you can put policies in place that only allow this data to be accessed by this user at this time in this amount. We call this purpose-based access control. Your policies are no longer about whether someone is “on the marketing team”, but rather “this person needs to send emails for the marketing team.”  

PBAC vs "ABAC" or Attribute-Based Access Control

Purpose-based access control might be considered a form of “attribute-based access control” (ABAC) where the attribute is simply “purpose.” Focusing just on purpose has a few advantages over full-fledged ABAC, though: it’s significantly less complicated, and it’s much more accurate. Instead of setting up and maintaining thousands of attributes about the user, the data, the activity, and the context, you’re just focusing on the one that really matters: why the data is needed. And since it’s based on real consumption information, there is no guessing who might need what, when, or where. Instead of trying to write rules for a limitless number of contingencies, purpose-based access control narrows in on actual data usage and need.  


This leads to another key advantage: if you know “why” someone needs the data, you also know “how much” they need. Going back to our marketing email example, you can look at the pattern of consumption over the last few weeks and find that they generally only send out about 3,000 emails each time. It would be out of the ordinary to send 10,000. So, you can set up your “this person sends marketing emails” purpose-based policy to have a limit of 3,000 at a time, only on Thursdays. If they need to do something out of the norm, they can ask for that permission. Setting thresholds like this protects privacy and reduces the potential for unintentional data leaks or credentialed access theft.  
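Here is a minimal sketch of what such a purpose-based grant could look like, using hypothetical user and purpose names drawn from the marketing example above; it illustrates the idea, not ALTR’s policy engine:

```python
from datetime import date

# Hypothetical purpose-based grant: this user may pull customer emails
# *to send the weekly coupon*, capped at 3,000 rows, Thursdays only.
PURPOSE_GRANTS = {
    ("mkt_ops_01", "send_weekly_coupon_email"): {
        "column": "customer_email",
        "max_rows": 3000,
        "allowed_weekdays": {3},   # Monday=0, so 3 = Thursday
    },
}

def authorize(user: str, purpose: str, column: str, rows: int, today: date) -> bool:
    grant = PURPOSE_GRANTS.get((user, purpose))
    if grant is None or grant["column"] != column:
        return False
    if today.weekday() not in grant["allowed_weekdays"]:
        return False
    return rows <= grant["max_rows"]       # out-of-norm requests need a fresh approval

# 3,000 emails on a Thursday: allowed. 10,000 on a Monday: denied and flagged for review.
print(authorize("mkt_ops_01", "send_weekly_coupon_email", "customer_email", 3000, date(2024, 6, 6)))
print(authorize("mkt_ops_01", "send_weekly_coupon_email", "customer_email", 10000, date(2024, 6, 3)))
```

Notice that the policy is keyed on the purpose, not the org chart, which is what makes the threshold and schedule easy to express.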

Knowing “why” also puts you better in line with privacy regulations that require companies to disclose not only what data they store on people, but also for what purpose. Having that information at your fingertips makes it easier to comply with the law.  

Finally, because it’s based on just one variable – does this person need to do this activity – it’s much simpler to automate approvals and access when a new person joins the team or if the activity shifts to another team member.  

Build Purpose-Fit Policies by Watching Consumption from the Start

Now imagine you had included watching consumption as part of your data discovery and classification project – as you were finding sensitive data, you were also finding out how it was being used. Maybe marketing isn’t the only group using marketing data. You would have a true-to-life map of data usage throughout the company, you could build your purpose-based access controls right the first time based on real need, and you would spend less time adjusting policies, making exceptions, and tightening access.  

Purpose-based access control is as simple as RBAC in that it focuses on just one variable, yet it can be a better fit for your org because it’s based on how data is actually used, it’s simpler to automate, and it reduces risk by limiting access to a narrow amount of data only to those who need it for a specific purpose.

Now, the next question is, “Why aren’t you doing this already?”  

For our suggestions in building out your data governance ecosystem, check out our latest eBook, 5 Steps to Secure Cloud Data Governance.

Get the latest from ALTR
Subscribe below to stay up to date with our team, upcoming events, new feature releases, and more.