Simplify attribute-based and fine-grained data access control
Enjoy the cost savings and flexibility you want from Amazon EMR and the sensitive data protection you need from Okera.
Okera helps the world’s largest organizations analyze big data safely, securely, and responsibly.
Struggling to use big data responsibly at the scale and velocity required to innovate?
Okera nScale co-locates on your Amazon EMR cluster, so no matter how big your data lake, or how many compute nodes spin up, Okera hums along to protect every query.
Big Data Presents Unique Security Challenges
The separation of storage and compute is one of the most consequential innovations in modern computing, but it introduces a data security gap. Without an integrated database, where do you define data access controls? Nowhere? Everywhere?
Ambiguity is risk. It holds back companies that want to migrate sensitive data workloads to the cloud but are unsure whether they can comply with data security and privacy regulations.
Solve Your Hardest Data Access Governance Problem
With Okera, you can have it all: the agility of cloud computing, cost benefits of separation of storage and compute, collaboration with non-technical data stakeholders to accelerate compliance with data privacy regulations, and better security at lower effort.
- Advanced yet simple-to-manage data access management
- Centralized IT control with the ability to delegate authority and accountability to business, security, and privacy stakeholders
- Powerfully simple row-level security dramatically reduces policy complexity
- Fine-grained access control (FGAC) down to the column, row, and cell level
- Attribute-based access control (ABAC) reduces errors and enables economy of scale
With Okera, data policies are separate from data compute, which is separate from data storage. Create and manage platform-agnostic policies in Okera, configure EMR to bootstrap the nScale enforcement fleet, and you’re done!
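To make the ABAC and FGAC ideas above concrete, here is a minimal Python sketch of a tag-driven policy check. It is not Okera's policy language or API. The `Policy` class, the `pii` tag, the `pii_reader` attribute, and the region rule are all hypothetical names chosen for illustration. The point is that one rule written against an attribute (a tag) covers every column carrying that tag, while a row-level filter and cell-level masking are applied per user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical ABAC rule: columns tagged `column_tag` are masked
    unless the user holds `required_attr`."""
    column_tag: str
    required_attr: str


def mask(value):
    """Cell-level masking: keep only the last four characters."""
    s = str(value)
    return "*" * max(len(s) - 4, 0) + s[-4:]


def apply_policies(rows, column_tags, user, policies):
    """Return only the rows and cell values this user may see."""
    visible = []
    for row in rows:
        # Row-level rule (assumed for this sketch): own region only.
        if row["region"] != user["region"]:
            continue
        cleaned = {}
        for col, value in row.items():
            tags = column_tags.get(col, set())
            blocked = any(
                p.column_tag in tags and p.required_attr not in user["attrs"]
                for p in policies
            )
            cleaned[col] = mask(value) if blocked else value
        visible.append(cleaned)
    return visible


# Illustrative data: one tag on one column, one policy for the tag.
ROWS = [
    {"name": "Ana", "ssn": "123-45-6789", "region": "EU"},
    {"name": "Bob", "ssn": "987-65-4321", "region": "US"},
]
COLUMN_TAGS = {"ssn": {"pii"}}
POLICIES = [Policy(column_tag="pii", required_attr="pii_reader")]

analyst = {"region": "EU", "attrs": set()}
print(apply_policies(ROWS, COLUMN_TAGS, analyst, POLICIES))
# [{'name': 'Ana', 'ssn': '*******6789', 'region': 'EU'}]
```

Tagging a new column as `pii` would bring it under the same rule without writing another policy, which is the economy of scale that attribute-based control aims for.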
Okera nScale for Amazon EMR
Okera nScale is a distributed data policy enforcement fleet that runs on Amazon EMR. It is a data security control layer that operates between your S3 data lake and popular compute frameworks such as Spark, Hive, and Presto.
- Transparent to users: seamlessly supports the most demanding EMR workloads
- Bootstraps with your cluster: ideal solution for ephemeral workloads and those with unpredictable scaling requirements
- Broad compatibility and version support: enables more business use cases on a cost-effective and powerful platform
Zero Trust in Practice
With Okera, you can implement zero trust: simply deny EMR clusters all access to S3. No more managing complex IAM roles for each cluster or reconciling user roles.
Fewer configuration requirements mean fewer opportunities for error.
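As a rough illustration of what "deny all access to S3" could look like, the Python sketch below builds a standard IAM policy document with a single explicit deny. The statement ID is a hypothetical name, and the page does not say which role such a policy would be attached to. A real cluster may also still need narrowly scoped access for things like logs or bootstrap scripts, which this sketch does not cover.

```python
import json


def deny_all_s3_policy():
    """Build an IAM policy document that explicitly denies every S3 action.

    Assumption: this would be attached to the role used by the EMR
    cluster nodes; the role itself is not defined here.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllS3ForClusterNodes",  # hypothetical name
                "Effect": "Deny",
                "Action": "s3:*",
                "Resource": "*",
            }
        ],
    }


print(json.dumps(deny_all_s3_policy(), indent=2))
```

Because an explicit deny overrides any allow in IAM policy evaluation, one statement like this takes the place of per-cluster allow lists.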
Secure Data Access Isolation
Your compute engine (Spark, Hive, Presto) receives user queries and, through a lightweight plugin, calls the off-cluster Okera policy engine for authorization.
The Okera policy engine vends temporary credentials to nScale, not to the compute engine. nScale processes are co-located on-node with Spark, Hive, or Presto workers. User code, including custom UDFs, never touches the data lake.
Data access is delegated to Okera nScale so it can securely retrieve data from specific S3 buckets. Within this isolated process, nScale applies data authorization policies such as dynamic row-level filters, column hiding, tokenization, and masking.
nScale then streams cleaned, authorized data to the compute workers for analytics and business logic processing.
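The flow described in this section can be sketched in a few lines of Python. This is a toy model, not Okera's implementation: `PolicyEngine`, `EnforcementWorker`, `Grant`, the in-memory `DATA_LAKE`, and the country-based row filter are all hypothetical stand-ins. What it tries to show is the isolation property: the policy engine hands credentials to the enforcement worker, and the compute side only ever holds an opaque ticket and the cleaned rows.

```python
import secrets
from dataclasses import dataclass

# Stand-in for the S3 data lake: dataset name -> rows.
DATA_LAKE = {
    "sales": [
        {"customer": "Ana", "card": "4111111111111111", "country": "DE"},
        {"customer": "Bob", "card": "5500000000000004", "country": "US"},
    ]
}


@dataclass(frozen=True)
class Grant:
    dataset: str
    credentials: str            # stand-in for temporary credentials
    allowed_country: str        # dynamic row-level filter
    hidden_columns: frozenset   # columns removed before streaming


class EnforcementWorker:
    """On-node process: the only component that holds credentials."""

    def __init__(self):
        self._grants = {}

    def accept(self, ticket, grant):
        self._grants[ticket] = grant

    def stream(self, ticket):
        grant = self._grants.pop(ticket)  # each ticket is single-use
        for row in DATA_LAKE[grant.dataset]:
            if row["country"] == grant.allowed_country:
                yield {k: v for k, v in row.items()
                       if k not in grant.hidden_columns}


class PolicyEngine:
    """Off-cluster authorizer (hypothetical interface)."""

    def authorize(self, user, dataset, worker):
        if dataset not in user["datasets"]:
            raise PermissionError(f"{user['name']} may not read {dataset}")
        grant = Grant(dataset, secrets.token_hex(16),
                      user["country"], frozenset({"card"}))
        ticket = secrets.token_hex(8)
        worker.accept(ticket, grant)   # credentials go to the worker...
        return ticket                  # ...the caller gets a ticket only


def run_query(user, dataset, engine, worker):
    """Compute-engine side: sees authorized rows, never credentials."""
    ticket = engine.authorize(user, dataset, worker)
    return list(worker.stream(ticket))
```

With a user such as `{"name": "eve", "datasets": {"sales"}, "country": "DE"}`, `run_query` returns only the German row with the `card` column removed, and a user without access to `sales` gets a `PermissionError` before any data is read.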
Co-Location for Extreme Elasticity and Performance
Okera nScale co-location provides the elastic scalability needed for big data environments.
Simply bootstrap nScale so that it loads on each node as your Amazon EMR cluster scales up and terminates along with nodes that scale down. nScale remains in perfect sync on each node for exceptional performance and to support extreme elasticity.
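For readers unfamiliar with EMR bootstrapping, the sketch below builds a bootstrap action entry in the shape expected by the `BootstrapActions` parameter of the boto3 EMR client's `run_job_flow` call. The script path, action name, and `--policy-endpoint` argument are hypothetical placeholders, not Okera's actual installer or flags. EMR runs bootstrap actions on each node as it is provisioned, including nodes added later by scaling, which is what lets an on-node agent follow the cluster up and down.

```python
def bootstrap_action(script_path, policy_endpoint):
    """Return one EMR bootstrap action entry.

    Both arguments are placeholders for this sketch: `script_path` is an
    S3 location of an install script, and `policy_endpoint` is the
    address the on-node agent would use to reach the policy engine.
    """
    return {
        "Name": "install-enforcement-agent",  # hypothetical name
        "ScriptBootstrapAction": {
            "Path": script_path,
            "Args": ["--policy-endpoint", policy_endpoint],  # hypothetical flag
        },
    }


# Would be passed as BootstrapActions=... when creating the cluster.
BOOTSTRAP_ACTIONS = [
    bootstrap_action(
        "s3://example-bucket/bootstrap/install-agent.sh",
        "https://policy.example.internal:8443",
    )
]
```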
Cost Savings & Reduced Attack Surface
Instead of replicating data into multiple security zones, with Okera you can maintain a single authoritative version of your data.
You pay less because you reduce redundancy and operating costs.
You also minimize risk because fewer data copies mean a smaller attack surface and less opportunity for data to get into the wrong hands.
Compare Okera nScale with EMR Record Server
Okera is an AWS Advanced Technology Partner.
Okera nScale and Amazon EMR Record Server address the problem of secure data access at scale.
Both use a distributed enforcement fleet that is purpose-built to enforce data policies. The fleet receives temporary credentials to retrieve data from S3 buckets, then pre-processes the data for security before sending cleaned data to the compute engine.
See how Okera nScale and Amazon EMR Record Server are different.