Automating Data Sharing Agreements: Collibra and Okera

What Are Data Sharing Agreements?

When data sets originate from a company’s own data sources, it is critical to understand which data sharing agreements and policies apply, in order to mitigate data-related security risks. For every piece of information that falls under regulations like GDPR, companies must have rules or agreements on how to process it securely, or risk fines and reputational damage.

Data sharing agreements establish the arrangement between data producers and data consumers on how data sets can be used, including terms and conditions. For example, sales growth data sets may be made available to the Risk team for generating internal reports only.

Policies are statements of intent, implemented by a set of rules and applied to a data set. The detailed data security rules, sometimes called safeguards, define access in the most granular detail and are usually applied to each data element inside a data set (e.g. which department or which role can access sensitive information).

In large companies, analytical efficiency, or “time spent finding datasets,” is one of the key indicators companies strive to improve, with minimal data-related risk, of course. The question then becomes, “how do we make our data available to data consumers and keep it protected at the same time?”

Okera and Collibra, both leaders in their respective spaces, efficiently solve this particular issue with their joint solution: Collibra is used as a governance platform, and Okera provides secure and dynamic data authorization.

Data governance and data access work as a single smooth process in which data consumers combine Data Elements into a Data Set and request access to the data source without exposing the company to any data-related risks.

From the “analytical efficiency” point of view, Collibra provides a rich toolset to minimize the time spent finding, understanding and trusting information within your company. Okera automates the provisioning of secure access, enforcing security rules (safeguards) on the fly, on top of a single source of truth data set, which minimizes storage costs.

Collibra’s Machine Learning (ML) module can recognize metadata sets (Data Elements like Columns, Tables, Schemas) ingested from a data source and suggest classification tags. For example, columns like “First Name” and “Last Name” can be tagged as “PII: name” and “PII: last name.” After Data Elements are tagged in Collibra, they can be brought into Okera via a bidirectional sync between the two platforms and used in a “PII” Safeguard Policy.
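
To make the tagging idea concrete, here is a minimal Python sketch of suggesting classification tags from column names. It is only an illustration: it uses hand-written keyword rules rather than machine learning, and it does not call any Collibra or Okera API. The rule list and the `suggest_tags` function are hypothetical names invented for this example.

```python
import re

# Hypothetical keyword rules, for illustration only. A real classifier
# would be learned from data rather than hand-written like this.
TAG_RULES = [
    (re.compile(r"first[\s_]*name", re.IGNORECASE), "PII: name"),
    (re.compile(r"last[\s_]*name", re.IGNORECASE), "PII: last name"),
]


def suggest_tags(column_names):
    """Return {column: [suggested tags]} for columns that match a rule."""
    suggestions = {}
    for column in column_names:
        tags = [tag for pattern, tag in TAG_RULES if pattern.search(column)]
        if tags:
            suggestions[column] = tags
    return suggestions


columns = ["First Name", "Last Name", "order_total"]
print(suggest_tags(columns))
# {'First Name': ['PII: name'], 'Last Name': ['PII: last name']}
```

Columns that match no rule, such as `order_total` here, simply get no suggestion, which leaves the final tagging decision to a data steward.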

Example Use Case: Pharmaceutical R&D Teams

The Situation: A pharmaceutical company wants to accelerate time to market for a new vaccine product.

The Problem: It can take years to get from the start of Research and Development (R&D) to vaccine production. The ability to leverage data for research has been identified as a critical component. In the current state of processes, R&D users cannot efficiently discover patient data or request access to patient data sources.

The Solution: To improve understanding of the patient data, the Collibra Platform is used to catalog external patient metadata sets from 50 different data sources. Okera’s dynamic data authorization is used to minimize data exposure risks and automate data access provisioning when R&D users access the data in real time.

The Data Sharing Agreement (“Data Sharing Agreement between R&D users and patient data sources”) is created and approved by the Governance Council and Data Source Owners. The Governance Council documents more granular aspects of the Data Sharing Agreement with Safeguard assets (e.g. “PII must be tokenized/hidden from R&D users”). Assuming all the metadata from the patient data source is ingested into Collibra, the ML Module can classify data elements by suggesting that columns or file fields contain information like Name, Email and Address. Collibra Tags are created (“PII:Personal Information”, “PII:Contact Information”) and synchronized with Okera, which uses them to manage and enforce Security Policies.

As a result, each time R&D users query the patient data source, the real Patient First Name, Patient Email Address, and Living Address will be tokenized (or masked, redacted, etc.) on the fly in accordance with the governance policy. There will be no change to the actual data source and no duplicate views created.
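
The query-time behavior described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Okera implements enforcement: the column names, the `rnd_user` role, the tag-to-column mapping, and the `tokenize` and `apply_safeguards` helpers are all hypothetical, and the truncated SHA-256 hash is only a stand-in for a real tokenization service.

```python
import hashlib

# Hypothetical tag assignments, as if synchronized from a data catalog.
COLUMN_TAGS = {
    "patient_first_name": "PII:Personal Information",
    "patient_email": "PII:Contact Information",
    "living_address": "PII:Contact Information",
}

# Hypothetical safeguard: tags that must be tokenized for each role.
TOKENIZE_FOR_ROLE = {
    "rnd_user": {"PII:Personal Information", "PII:Contact Information"},
}


def tokenize(value):
    """Stand-in tokenizer: a truncated, unsalted hash (not production-grade)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return "tok_" + digest[:12]


def apply_safeguards(rows, role):
    """Yield copies of rows with restricted columns tokenized for the role.

    The source rows are never modified, mirroring the idea that the
    underlying data source stays unchanged.
    """
    restricted = TOKENIZE_FOR_ROLE.get(role, set())
    for row in rows:
        yield {
            column: tokenize(value) if COLUMN_TAGS.get(column) in restricted else value
            for column, value in row.items()
        }


source_rows = [
    {"patient_first_name": "Ada", "patient_email": "ada@example.com", "diagnosis_code": "J10"},
]
for row in apply_safeguards(source_rows, role="rnd_user"):
    print(row)
```

Because the policy is evaluated per query and per role, the same source rows can serve different audiences: a role with no entry in the safeguard mapping would see the original values, while `rnd_user` sees tokens in every tagged column.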

Benefits and Conclusion

In addition to automating the data provisioning process, companies gain other significant benefits by implementing the “Data Shopping Experience” process using Collibra and Okera, including:

Minimizing storage capacity and data delivery time
Reducing data exposure risks
Reducing time spent on data access provisioning
Lowering administration costs
Centralizing data access

For more information, please check out the 5-minute demo of a Data Shopping Experience in action.

If you’d like to learn more about enabling a “shopping for data” experience that will have your analysts finding and accessing new data within minutes, schedule your consultation with an Okera expert.
