Data Lake Security

Protect sensitive data at scale and gain business agility

As new users and workloads are onboarded to the data lake, security and governance become more of a priority and, in many cases, a hindrance to the data scientists and analysts seeking to leverage data for competitive advantage and business innovation. The costs of failing to protect sensitive data are high and can include regulatory penalties, reputational damage, and even a direct loss of customers.

Okera enables data platform teams to provide secure data access at scale, meet enterprise governance requirements, and enable self-service data analytics. Together, these increase usage and adoption of the data lake and help ensure its success within the enterprise.

Automatically discover and tag sensitive data

Register metadata from multiple sources, such as object storage and relational databases, for full data lake visibility. Sensitive information such as credit card numbers or IP addresses is automatically tagged as the data lands in the lake, and data stewards can also create custom tagging rules. Okera can also leverage metadata from external sources, whether business metadata or technical metadata, for a full, unified view.
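To make the idea of rule-based tagging concrete, here is a minimal Python sketch. It is not Okera's implementation or API: the `TAG_RULES` dictionary, the `tag_columns` function, the regular expressions, and the 80% match threshold are all hypothetical choices made for illustration. It samples rows from a dataset and tags a column when most of its values match a rule.

```python
import re

# Hypothetical tagging rules (tag name -> pattern); illustrative only,
# not Okera's built-in rules.
TAG_RULES = {
    "credit_card": re.compile(r"^\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}$"),
    "ip_address": re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$"),
}


def tag_columns(sample_rows, rules=TAG_RULES, threshold=0.8):
    """Return {column: set of tags} for columns whose sampled values
    match a rule at least `threshold` of the time."""
    tags = {}
    columns = sample_rows[0].keys() if sample_rows else []
    for col in columns:
        values = [str(row[col]) for row in sample_rows if row.get(col) is not None]
        if not values:
            continue
        for tag, pattern in rules.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= threshold:
                tags.setdefault(col, set()).add(tag)
    return tags


sample = [
    {"name": "Ann", "card": "4111 1111 1111 1111", "ip": "10.0.0.1"},
    {"name": "Bob", "card": "5500-0000-0000-0004", "ip": "192.168.1.20"},
]
column_tags = tag_columns(sample)
```

A custom rule, in this sketch, is just another entry in the rules dictionary passed to `tag_columns`.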

Dynamically enforce fine-grained access control

Eliminate the need to create multiple copies of a single dataset in order to control access for different use cases. Okera enforces data access policies dynamically at run-time, so each user sees only the data they are authorized to view. Access restrictions can be applied at the file, column, row, and even cell level, and Okera supports a variety of de-identification types, including masking, redaction, tokenization, and even differential privacy.
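The following Python sketch illustrates the general idea of applying a policy at read time instead of maintaining one copy of the data per audience. It is an assumption-laden illustration, not Okera's policy engine: the `POLICIES` structure, the role names, the `apply_policy` function, and the `mask_card` helper are all hypothetical.

```python
def mask_card(value):
    """Keep the last four characters and replace the rest with 'X'."""
    s = str(value)
    return "X" * max(len(s) - 4, 0) + s[-4:]


# Hypothetical per-role policies: a row filter, per-column transforms,
# and columns that are hidden entirely.
POLICIES = {
    "analyst": {
        "row_filter": lambda row: row["region"] == "EU",
        "column_transforms": {"card": mask_card},
        "hidden_columns": {"ssn"},
    },
    "steward": {
        "row_filter": lambda row: True,
        "column_transforms": {},
        "hidden_columns": set(),
    },
}


def apply_policy(rows, role, policies=POLICIES):
    """Return the view of `rows` that `role` is allowed to see.
    The stored rows are never modified or copied per role."""
    policy = policies[role]
    result = []
    for row in rows:
        if not policy["row_filter"](row):
            continue  # row-level restriction
        visible = {}
        for col, value in row.items():
            if col in policy["hidden_columns"]:
                continue  # column-level restriction
            transform = policy["column_transforms"].get(col)
            visible[col] = transform(value) if transform else value
        result.append(visible)
    return result


rows = [
    {"region": "EU", "card": "4111111111111111", "ssn": "123-45-6789"},
    {"region": "US", "card": "5500000000000004", "ssn": "987-65-4321"},
]
analyst_view = apply_policy(rows, "analyst")
steward_view = apply_policy(rows, "steward")
```

Both roles read the same underlying rows; only the view computed at query time differs.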

Gain security without compromising on scalability

Okera is cloud-native, fully containerized, and horizontally scalable like a distributed data analytics engine. It supports policy enforcement across analytical workloads involving petabytes of data and multiple concurrent users while avoiding bottlenecks. Based on the complexity of the policy and the capability of the analytics engine being used, Okera chooses the optimal path of enforcement in order to maintain the balance between security and performance.

Audit the data lake to maintain least privilege

Users often have access to data that they don’t actually need to do their jobs, which can pose a significant security risk. Okera’s Inactivity Reporting feature compares the answers to “Who can access this data?” and “Who is actually accessing this data?” Data stewards can use these insights to remove inactive users from the relevant role or LDAP group, so that the enterprise can maintain the principle of least privilege.
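Conceptually, this kind of report is a set difference between granted access and observed access. The Python sketch below shows that idea under stated assumptions: the `inactive_users` function, the shape of the grant and audit-log data, and the 90-day window are hypothetical and do not describe how Okera's feature is built.

```python
from datetime import datetime, timedelta


def inactive_users(granted, access_log, now, window_days=90):
    """Return {dataset: users who hold access but have not used it
    within the last `window_days` days}.

    granted:    {dataset: set of users allowed to read it}
    access_log: iterable of (user, dataset, timestamp) audit records
    """
    cutoff = now - timedelta(days=window_days)
    active = {}
    for user, dataset, timestamp in access_log:
        if timestamp >= cutoff:
            active.setdefault(dataset, set()).add(user)
    return {
        dataset: users - active.get(dataset, set())
        for dataset, users in granted.items()
    }


now = datetime(2024, 1, 1)
granted = {"sales": {"ann", "bob", "cy"}}
access_log = [
    ("ann", "sales", now - timedelta(days=5)),    # recent access
    ("bob", "sales", now - timedelta(days=200)),  # outside the window
]
report = inactive_users(granted, access_log, now)
```

Here `bob` and `cy` would be candidates for removal from the role that grants access to `sales`, since neither has read it within the window.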