Data Engineering Integration for Administrators

Instructor-Led | Data Engineering | 3 Day | 10.4 

Data Engineering Integration: Administration

Course Overview

Applicable to users of software version 10.4. Set up a live Data Engineering Integration (DEI) environment by performing administrative tasks such as Hadoop integration, Databricks integration, security setup, monitoring, and performance tuning. Learn to integrate the Informatica domain with the Hadoop and Databricks ecosystems, using Hadoop’s high-speed processing capability and the Databricks cloud analytics platform to process very large data sets.


Objectives

After successfully completing this course, students should be able to:

  • Prepare and list the steps for installation and configuration of DEI 10.4
  • List the steps to enable Kerberos on the Domain
  • List the steps to upgrade DEI from the previous versions to 10.4
  • Create a Cluster Configuration Object for Hadoop integration
  • Set up Informatica security, including authentication and authorization mechanisms
  • Tune the performance of the system
  • Monitor, view and troubleshoot DEI logs
  • Monitor using REST APIs and log aggregator

Target Audience

  • Administrator 

Prerequisites

  • None

Agenda

Module 1: Introduction to Data Engineering Integration Administration

  • Data Engineering and the role of DEI in the big data ecosystem
  • DEI Components
  • DEI architecture
  • Roles and responsibilities of Informatica DEI Administrator
  • DEI engines: Blaze, Spark, and Databricks
  • DEI 10.4 features

Module 2: Data Engineering Integration 10.4 Installation and Configuration

  • Basic setup for installation
  • Plan the Installation Components
  • Steps to install the DEI product
  • Steps to create and configure Application Services
  • Steps to install the Developer client
  • Steps to uninstall Informatica Server
  • Demo: DEI 10.4 installation

Module 3: Enable Kerberos Authentication on the Domain

  • Kerberos concepts
  • Kerberos protocol authentication steps
  • Single-realm and cross-realm Kerberos authentication
  • Prepare to enable Kerberos Authentication on the Domain

Module 4: Upgrade Data Engineering Integration to 10.4

  • Informatica upgrade overview
  • Informatica upgrade support
  • Steps involved in the upgrade process
  • Steps to upgrade DEI 10.2.2 Server to 10.4
  • Steps to upgrade DEI Developer client from 10.2.2 to 10.4
  • Demo: Upgrade DEI Server and DEI Developer Client from 10.2.2 to 10.4

Module 5: Hadoop Integration

  • Cluster Integration overview
  • Data Engineering Integration Component Architecture
  • Prerequisites for Hadoop integration
  • HDP integration tasks
  • Create a Cluster Configuration
  • Integration with Hadoop
  • Lab: Create Cluster Configuration Object
  • Lab: Explore Cluster Configuration Views
  • Lab: Cluster Configuration Privileges and Permissions

Module 6: Security Overview

  • DEI security
  • Security aspects
  • Authentication overview
  • Authorization overview

Module 7: Kerberos Authentication and Ranger Authorization

  • Kerberos Authentication
  • Ranger Authorization
  • Pre-steps to run mappings in a Kerberos-Enabled Hadoop Environment
  • Run mappings on a cluster with Kerberos authentication and Ranger authorization
  • Lab: Execute Pre-steps for Running Mappings in a Kerberos-Enabled Hadoop Environment
  • Lab: Run Mappings in a Kerberos-Enabled Hadoop Environment

Module 8: Operating System Profiles

  • Operating System profiles for Data Integration Service
  • Operating System profile components
  • Configure system permissions for the Operating System profile users
  • Enable the Data Integration Service to use Operating System profiles
  • Execute a mapping using OS profiles
  • Lab: Execute a mapping using OS profiles

Module 9: HDFS and Fine-Grained Authorization

  • Authorization
  • HDFS permissions
  • Fine-Grained authorization
  • Lab: Access Directories with HDFS Permissions
  • Lab: Run a Mapping with HDFS Permissions
  • Lab: Restrict Ranger Permissions for Hive Tables and Columns
  • Lab: Run a Mapping with Fine-Grained Authorization

Module 10: Data Engineering Recovery

  • DIS processing overview
  • DIS Queuing
  • Execution Pools
  • Data Engineering recovery
  • Monitor recovered jobs
  • Lab: Recover DIS and execute a Mapping using Data Engineering Recovery

Module 11: DEI Performance Tuning

  • DEI Deployment types
  • Sizing recommendations
  • Hadoop cluster Hardware tuning
  • Tune Spark performance
  • infacmd autotune commands
  • Lab: Tune DIS and MRS using infacmd Autotune command

Module 12: Monitoring and Troubleshooting

  • Hadoop Environment Logs
  • Spark Engine Monitoring
  • Blaze Engine Monitoring
  • Log Aggregation
  • Customer pain points and solutions
  • Lab: Monitor a Mapping using Log Aggregator

Module 13: Databricks Overview

  • Databricks overview
  • Steps to configure Databricks
  • Databricks clusters
  • Notebooks, Jobs, and Data
  • Delta Lake

Module 14: Databricks Integration

  • Databricks Integration
  • Components of the Informatica and the Databricks environments
  • Run-time process on the Databricks Spark Engine
  • Databricks Integration Task Flow
  • Prerequisites for Databricks integration
  • Demo: Set up a Databricks connection
  • Demo: Run a mapping with Databricks Spark engine

 

 