IBM InfoSphere DataStage Online Training
Advanced DataStage Course Content
1. Installation Run-Through of Server and Parallel Jobs and Deployment
● Run through the installation process
● Start the Information Server
● Identify the components of Information Server that need to be installed
● Describe what a deployment domain consists of
● Describe different domain deployment options
2. Business Glossary
● Understand and identify sources of metadata
● Differentiate between business and technical metadata
● Understand metadata as an asset and potential liability
● Understand and apply features and functions of the IBM Information Server Business Glossaries
3. DataStage Utilities
● Analyze the performance of a job
● Estimate the resources needed by a job
● Use all the options of orchadmin
4. MQ Stage
● Connect to an MQ queue manager
● Read messages from an MQ queue
● Write messages to an MQ queue
● Retrieve the message payload
● Specify header information to retrieve from a message
5. Metadata in the Parallel Framework
● Explain schemas
● Create schemas
● Explain Runtime Column Propagation (RCP)
● Turn RCP on and off
● Build a job that reads data from a sequential file using a schema
● Build a shared container
6. Repository Functions
● Perform a simple Find
● Perform an Advanced Find
● Perform an impact analysis
● Compare the differences between two Table Definitions
● Compare the differences between two jobs
● Describe the parallel job compilation process
● Explain OSH
● Explain the Score
7. Solution Development Jobs
● List and describe the Warehouse jobs
● Understand the stages and techniques used in the Warehouse jobs
8. Run Jobs from the Command Line
● Create a job in Designer, run it, then run the same job from the command line
● Run all the jobs that are developed from the command line
● Examine the OSH
● Examine the OSH Dump
● Examine the Dump Score
● Examine performance by setting environment variables
● Simulate complete jobs using different environment variables and observe the behaviour
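The command-line runs in this module can be sketched as follows. This is a minimal Python sketch, assuming the `dsjob` client is on the PATH. The project name `dstage1`, the job name `LoadWarehouse`, and the helper `build_dsjob_run` are hypothetical, and environment variables such as `$APT_DUMP_SCORE` are assumed to have been added to the job as parameters. The code only builds the argument list; it does not run anything.

```python
from typing import Dict, List


def build_dsjob_run(project: str, job: str, params: Dict[str, str]) -> List[str]:
    """Build the argument list for a `dsjob -run` call (hypothetical helper)."""
    # -jobstatus makes dsjob wait for the job and return its finishing status.
    cmd = ["dsjob", "-run", "-jobstatus"]
    for name, value in params.items():
        # Each job parameter is passed as -param name=value.
        cmd += ["-param", f"{name}={value}"]
    # Project and job names come last.
    cmd += [project, job]
    return cmd


# Hypothetical project, job, and parameter values, for illustration only.
command = build_dsjob_run(
    "dstage1",
    "LoadWarehouse",
    {"$APT_DUMP_SCORE": "True", "$APT_DISABLE_COMBINATION": "True"},
)
print(" ".join(command))
```

On a machine with the DataStage engine installed, the resulting list could be passed to `subprocess.run(command)` to start the job.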
9. OSH
● OSH Flow
● OSH architecture
• Difference between job and OSH
• Conversion of OSH to Process Manager
• Process Manager Architecture
• Dump Score
• Dump score with combination disabled and enabled
• How OSH works and how to use it
Job Flow Architecture
Compile time and Runtime Architecture – 8
Under Unix
Set up server – 8
Install client on Windows (2000 and XP) – 1
Running jobs from command line with all dependencies – 6
Multiservers – RPC – 3
Difference between multi-instance and non-multi-instance scenarios – 2
Major issues on client – 1
Basic commands required to work with DataStage – 2
Identify and attach process – 1
Complete stack for simple job from OSH and from DSJOB – 1
uv commands – 2
JOB MONITORING – 8
• see the behaviour with APT_MONITOR_SIZE
• see the behaviour with APT_MONITOR_TIME
• see the job monitoring values in Director
• see how row distribution happens – equal or unequal
• how to run the jobmon from command line
• how to use jobmon ports and run the jobmon on different ports
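Whether rows are distributed equally or unequally depends on the partitioning method and on the data. The following is an illustrative Python sketch, not DataStage code: it counts rows per partition for a round-robin assignment and for a modulus-style assignment on an integer key. The key values, the partition count of four, and the function names are made up for illustration.

```python
from collections import Counter
from typing import List


def round_robin_counts(keys: List[int], partitions: int) -> List[int]:
    """Assign row i to partition i % partitions, ignoring the key value."""
    counts = Counter(i % partitions for i in range(len(keys)))
    return [counts[p] for p in range(partitions)]


def modulus_counts(keys: List[int], partitions: int) -> List[int]:
    """Assign each row to partition key % partitions, so equal keys stay together."""
    counts = Counter(k % partitions for k in keys)
    return [counts[p] for p in range(partitions)]


# Hypothetical key column in which one value (10) dominates.
keys = [10, 10, 10, 10, 20, 30, 40, 10]
print(round_robin_counts(keys, 4))  # [2, 2, 2, 2]
print(modulus_counts(keys, 4))      # [2, 0, 6, 0]
```

Round robin spreads the eight rows evenly, while the key-based assignment keeps equal keys together and leaves two partitions empty for this particular data.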
Sample test cases
316 sample jobs with real time scenarios in mind
Debugging
At Compile Level
At Runtime Level
• simple jobs troubleshooting
• complex jobs troubleshooting
• debug issues with peek
• debug issues with copy
• troubleshoot issues with OSH
• debug issues using OSH PIDs from the command line
• troubleshoot issues with RT_STATUS
• troubleshoot issues with RT_LOGS
• troubleshoot hang and crash issues for a given job
• identify defunct processes for a given job and apply workarounds to resolve them
Performance Measurement and Tuning
• Measure parallel job performance using performance measurement tools such as resource estimation, top, etc.
• Identify the bottlenecks for a given job or set of jobs
• Tune using Environment Variables
• apply optimization techniques using APT_DISABLE_COMBINATION and other environment variables
• Tune using Buffer Settings
• Apply Server side tunables
• Apply DS Engine side tunables
• With cleanup activities – like purge settings
• With RT_LOG Settings
• With UV Commands or from the client
• Run jobs or sequencers in parallel while handling dependencies in the most efficient way
• Avoid client-to-server network issues by using shell scripting, etc.
• Apply database tunables (if a given job uses a database)
• Check disk usage and pools
• Change/optimize all the configuration files for all the jobs to avoid resource crunch issues
• Optimize all OS level parameters
• Check all project level settings which are applied to all the jobs
• Change/optimize all jobmon settings and relevant java settings
• Selection of proper partitioning technique based on the business need
Custom stage
Build op
Wrapped stages
What is new in 8.5?
Introduction: Overview of what’s new and what’s the same.
Deployment
What gets deployed
Possible deployment configurations
New Suite Installer
Administering DataStage
Information Server web console
What’s New in DataStage Administrator?
What’s new in Designer?
Transformer Stage Enhancements
Parameter sets
Null processing
Loop processing
Group processing
Lookup stage enhancements
Range Lookups
Repository Functions
Flexible folder organization
Repository search
DataStage component export enhancements
Impact analysis
Job and Table Definition differences
Working with Relational Data
Connector stages
Data Connection objects
Reject links
Multiple input links
SQL Builder
Complex Flat File Stage
SCD stage
XML stage
Intersecting with other IS tools: Just slides
Shared metadata Repository
FastTrack: Point them to FastTrack course
Metadata Workbench: Point them to MWB course