Use Case Description
This Playbook describes how to perform Data Profiling using ActiveNav Cloud by discovering the objects within unstructured data repositories and applying Feature Extraction rules to identify items of interest.
A data profile gives you insight into key characteristics of your repositories that can inform and drive information management activities. You can focus on intrinsic properties of objects, such as their size, type, and age, to understand more about how unstructured data is used in your organization.
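To make the idea of a profile built from intrinsic properties concrete, the following is a minimal sketch in Python. It is not how ActiveNav Cloud works internally and does not use any ActiveNav API. The `FileRecord` type, the age bands, and the `build_profile` function are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePath


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical minimal description of one discovered object."""
    path: str
    size_bytes: int
    modified: datetime


def age_bucket(modified: datetime, now: datetime) -> str:
    """Place an object into a coarse age band (bands are illustrative)."""
    age_days = (now - modified).days
    if age_days < 365:
        return "<1y"
    if age_days < 3 * 365:
        return "1-3y"
    if age_days < 7 * 365:
        return "3-7y"
    return ">7y"


def build_profile(records, now):
    """Aggregate object count and total bytes by (file type, age band)."""
    profile = defaultdict(lambda: {"count": 0, "bytes": 0})
    for rec in records:
        # Use the lower-cased extension as a simple stand-in for file type.
        ext = PurePath(rec.path).suffix.lower() or "(none)"
        key = (ext, age_bucket(rec.modified, now))
        profile[key]["count"] += 1
        profile[key]["bytes"] += rec.size_bytes
    return dict(profile)
```

Grouping by type and age band in this way shows, for example, how much storage is taken up by documents that have not been modified for more than seven years. That is the kind of question a data profile is intended to answer.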
The visualization tools within ActiveNav Cloud allow users to assess the overall makeup of their data, focusing on elements such as data age, type, and size. Retaining dark data that no longer has value to the organization represents cost and risk, because its contents are unknown.
The review capabilities of the platform can be used to involve users in identifying data that is no longer needed and can safely be removed. After the review process, validated findings can be exported and remediated according to the preferred approach of the business.
After you perform the initial discovery and act on the results, refreshing the discovered data on a periodic schedule makes it easy to monitor the status of the data catalog and supports the maintenance of the desired data profile.
Use Case Flow
Our recommended project flow for the Data Profiling use case is built upon the core steps outlined in the following sections.
There is no one-size-fits-all approach. The size of your project, the user group that will be involved, and the type of repositories you target are all factors that will influence the best way to follow the overall process in your organization.
You may have already begun to use ActiveNav Cloud, in which case you can select specific elements of the Playbook to guide your use of the platform.
Regardless of the pattern you choose, we recommend paying close attention to the details outlined in the Preparation phase to ensure that you have all necessary elements in place.
Linear approach for smaller projects
For smaller-scale projects that target a small number of repositories or areas of the business, you may be able to complete the majority of preparation, planning, and Collector deployment activities before beginning your discovery process.
You can then gradually work through the Discover > Review > Extend Scope process to build your unstructured data catalog and assess results.
Iterative approach for larger projects
For larger projects, it may not be reasonable to prepare everything before you begin your discovery activities; doing so would delay the point at which you begin to see results and learn about your data.
In this case we recommend taking a pragmatic, iterative approach:
- Prepare sufficiently to discover your highest priority locations: deploy Collectors, acquire credentials, and plan Business Unit associations.
- Begin the Discover > Review process for these locations.
- Continue to prepare further locations while discovery and cleanup are under way for earlier locations.
- Repeat these steps to steadily build out your unstructured data catalog in parallel with review activities.
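The overlap between discovery and preparation in the iterative approach can be sketched as a simple wave plan. This is only an illustrative planning aid written in Python. The `plan_waves` function, the `priority` field, and the batch size are assumptions and are not part of ActiveNav Cloud.

```python
def plan_waves(locations, batch_size=1):
    """Group prioritized locations into waves.

    In each wave, one batch goes through Discover > Review while the
    next batch is being prepared (Collectors, credentials, Business
    Unit associations). The first batch must be prepared up front.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Lower priority number means the location is handled earlier.
    ordered = sorted(locations, key=lambda loc: loc["priority"])
    batches = [ordered[i:i + batch_size]
               for i in range(0, len(ordered), batch_size)]
    waves = []
    for index, batch in enumerate(batches):
        next_batch = batches[index + 1] if index + 1 < len(batches) else []
        waves.append({
            "discover_and_review": [loc["name"] for loc in batch],
            "prepare_in_parallel": [loc["name"] for loc in next_batch],
        })
    return waves
```

A larger `batch_size` moves the plan closer to the linear approach, because more preparation is completed before each round of discovery begins.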
1. Preparation
Activity Description |
Preparing the key elements of your data discovery project will help ensure that the initial deployment of ActiveNav Cloud runs as smoothly as possible. Taking the time to identify locations, prepare credentials, and engage with key users will allow you to achieve results as quickly as possible. |
Goals |
A project plan is in place to allow the initial deployment of ActiveNav Cloud to run smoothly. |
Participants |
Project Sponsor, IT team, Project / Program Manager |
Pre-requisites |
|
Outputs |
|
2. Service Deployment and Configuration
Activity Description |
This stage outlines the work required to progress your project plan by deploying on-premises connectors, configuring them, and validating their operation. |
Goals |
|
Participants |
Project Manager, IT Team, Application Administrator |
Pre-requisites |
|
Outputs |
|
3. First Discovery
Activity Description |
This stage may represent the first full-scale discovery of the project, or the first discovery for a specific repository type. It will enable you to understand:
|
Goals |
Begin to understand the characteristics of your data; act on initial findings |
Participants |
Application Administrator, Data Analysts |
Pre-requisites |
|
Outputs |
Data has been discovered and locations assigned to Business Units for review |
4. Profile Review
Activity Description |
This stage covers reviewing the data profile characteristics and using the resulting insights to make informed remediation decisions. It will enable you to understand:
|
Goals |
Understand the shape of your data both globally and by ownership, and identify areas where data can be remediated |
Participants |
Analyst, Business Unit data owners |
Pre-requisites |
|
Outputs |
A manifest of responsive files or containers that require an action has been exported from ActiveNav Cloud |
5. Remediation Workflow
Activity Description |
Once Data Analysts have completed the review activities that are appropriate for the organization, or for specific Business Units, the findings must be remediated. There are a number of options to consider, depending on the scale of the project and any existing business processes. |
Goals |
Reduce overheads by creating an efficient workflow, and capture appropriate audit records. |
Participants |
Project / Program Manager, IT Team |
Pre-requisites |
|
Outputs |
|
6. Extend Discovery Scope
Activity Description |
Once the discovery and review process has been established, it should be scaled out to extend the discovered catalog of unstructured data to address further locations and repositories. |
Goals |
Address the entire contracted data scope and build a complete view of sensitive data |
Participants |
Application Administrator, Data Analysts, IT team |
Pre-requisites |
|
Outputs |
Prevalence and management of sensitive data understood. |
7. Monitoring and Response
Activity Description |
Once you have built a catalog of your unstructured data with ActiveNav Cloud and addressed the findings of the initial discovery, you will enter a monitoring phase. At this point, regular re-discovery of Data Sources helps ensure that content that does not comply with the desired data profile does not build up again in your unstructured data. |
Goals |
|
Participants |
Project Manager, Application Administrator, Data Analysts, IT team |
Pre-requisites |
Initial discovery and review have been performed on the Data Sources to be monitored. |
Outputs |
Up-to-date visibility of data profile characteristics |
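As an illustration of the monitoring phase, the sketch below compares two profile snapshots taken at different times and flags categories that have grown beyond a tolerance. It assumes each snapshot is a plain mapping from a category label (for example an age band) to total bytes. The `detect_regrowth` function and the 10% default tolerance are hypothetical and do not describe ActiveNav Cloud behavior.

```python
def detect_regrowth(baseline, current, tolerance=0.10):
    """Return categories whose size grew by more than `tolerance`.

    `baseline` and `current` map a category label to total bytes.
    Categories absent from the baseline are treated as starting at 0,
    so any new content in them is reported.
    """
    findings = []
    for category, now_bytes in current.items():
        before = baseline.get(category, 0)
        limit = before * (1 + tolerance)
        if now_bytes > limit:
            findings.append((category, before, now_bytes))
    # Largest absolute growth first, so reviewers see the biggest change.
    return sorted(findings, key=lambda f: f[2] - f[1], reverse=True)
```

Running a comparison like this after each scheduled re-discovery gives a short list of categories to revisit, rather than requiring a full review of the profile each time.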