Apache Ranger is an open-source project for providing data access control in a Hadoop ecosystem. It is (now merged with Cloudera as) a complete solution for effecting data governance and access controls in the cloud.
- Implementing a data lake in the cloud on S3
- Need to consider access control for their use cases
- Need a governance model to support big data processing, analytics, and ML
Apache Ranger’s Comprehensive Access Control System for Several Hadoop Components
Cloud data lakes provide lines of business a broad platform for analytics and machine learning. The goal is to get insights from data that will inform business decisions and drive value for customers. The platform teams that support data lakes want to enable more adoption, which means more lines of business, more product and solution partners, auditors, and regulators. They all will need access to the same data, but in a form that suits their roles and responsibilities.