Enterprise Data Preparation

Instructor Led | Data Catalog | 2 Days | Version 10.5.1

Course Overview

Gain the knowledge and skills required to transform, cleanse, and enrich data in the data lake through a self-service, Excel-like spreadsheet interface. Using Informatica Enterprise Data Preparation (EDP), publish the prepared data and share knowledge with the rest of the community so that the data can be analyzed with third-party BI or analytic tools. Applicable to version 10.5.1.

Objectives

After successfully completing this course, students should be able to:

  • Understand the architecture and components of EDP
  • Configure the EDP application services
  • Prepare the Model Repository Service for use with EDP
  • Configure a Hive resource to represent the data lake
  • Search the catalog for data that resides in and outside the data lake
  • Discover the lineage and relationships between data that resides in different enterprise systems
  • Create and add data assets to the project
  • Explore data in the worksheets
  • Apply the data preparation steps to sample, combine, cleanse, transform, and mask the data
  • Publish the prepared data to the data lake
  • Save the recipe as a mapping
  • Operationalize the mapping
  • Schedule and monitor publication
  • Share the published data with other collaborators
  • Analyze the data using third-party BI or analytic tools

Target Audience

  • Data Analyst
  • Business User

Prerequisites

  • None

Agenda
Module 1: Enterprise Data Preparation Overview
  • Major Business Challenges
  • Overview of Enterprise Data Preparation
  • Overview of Data Discovery and Analysis Process
  • Architecture & Components
  • Enterprise Data Preparation Concepts
    • Data Lake
    • Data Asset
    • Projects, Worksheets, and Recipes
    • Data Publication
    • Data Visualization
  • User Tasks Overview
  • Lab: Getting Started
Module 2: EDP Administration Overview and Tasks
  • Role of an Administrator in EDP
  • Tasks of an Administrator in EDP
  • Tools used by the Administrator in EDP
  • Lab: Configure EDP Service
Module 3: Discover Data
  • Search for Data Assets
  • Data Asset Views
    • Overview View
    • Data Preview
    • Lineage and Impact View
    • Relationships
  • Lab: Understand the Home Page and View Assets
Module 4: Work with Data Assets
  • Upload a Data Asset
    • Uploading a File Directly to the Data Lake
    • Uploading a File with CLAIRE® Discovered Structure
    • Uploading a File with User Defined Structure
  • Import a Data Asset
  • Export a Data Asset
  • Download a Data Asset
  • Delete Data Assets
  • Create and Manage Projects
  • Add Assets to the Project
  • Lab: Upload and Download a Data Asset
  • Lab: Import and Delete a Data Asset
  • Lab: Create a Project
Module 5: Prepare Data
  • Data Sampling Overview
    • Sampling Table Data
    • Applying Filters
    • Sampling JSON Files
  • Explore Data in the Worksheet
    • Worksheet Overview
    • Column Overview
  • Data Blending
    • Joining Worksheets
    • Merging Worksheets
    • Using Lookup
  • Categorize Data
  • Using Formulas
  • Using Recipes
  • Summarize Data
    • Aggregate Data
    • Pivot Operation
  • Using Rules
    • Passive Rules
    • Active Rules
  • Using Window Functions
  • Apply One Hot Encoding
  • Lab: Sample and Filter Data
  • Lab: Explore Data in the Worksheets
  • Lab: Blend Data in the Worksheets
  • Lab: Categorize Column Data
  • Lab: Using Formulas
  • Lab: Transformation of Data
  • Lab: Pivot Data
  • Lab: Aggregate Data
  • Lab: Apply Rules
  • Lab: Apply Window Functions
  • Lab: One Hot Encoding
Module 6: Data Publication and Scheduling
  • Data Publication Overview
  • Scheduling Overview
  • Saving a Recipe as a Mapping
  • Operationalize a Mapping
  • Lab: Publish Prepared Worksheets
  • Lab: Schedule a Publication of a Worksheet
  • Lab: Save a Recipe as a Mapping
  • Lab: Operationalize a Mapping
Module 7: Data Visualization
  • Data Visualization Overview
  • Prerequisites Overview
  • Create and Share a Notebook
  • Lab: Create an Apache Zeppelin Notebook for Visualization


