Tuesday, July 31, 2012

Domains+Nodes+Service Manager+Application services

Domains+Nodes+Service Manager+Application services -> Informatica PC

Nodes
When you install PowerCenter Services on a machine, you add the machine to the domain as a node. You can add multiple nodes to a domain. Each node in the domain runs a Service Manager that manages domain operations on that node. The operations that the Service Manager performs depend on the type of node. A node can be a gateway node or a worker node. You can subscribe to alerts to receive notification about node events such as node failure or a master gateway election.

Gateway Nodes
A gateway node is any node you configure to serve as a gateway for the domain. One node acts as the gateway at any given time. That node is called the master gateway. A gateway node can run application services, and it can serve as a master gateway node. The master gateway node is the entry point to the domain.

The Service Manager on the master gateway node performs all domain operations on the master gateway node. The Service Manager running on other gateway nodes performs limited domain operations on those nodes.

You can configure more than one node to serve as a gateway. If the master gateway node becomes unavailable, the Service Managers on the other gateway nodes elect another master gateway node. If you configure only one node to serve as the gateway and that node becomes unavailable, the domain cannot accept service requests.
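The election behaviour described above can be sketched in a few lines of Python. This is only an illustration: the `Node` class, the node names, and the rule that the lowest node name wins are all assumptions made for the example, since the post does not say how the Service Managers actually pick the new master gateway.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    is_gateway: bool        # configured to serve as a gateway
    available: bool = True


def elect_master_gateway(nodes):
    """Return the name of the master gateway, or None if no gateway node is up.

    Assumption: the lowest node name wins. The real election rule is not
    described in this post.
    """
    candidates = sorted(n.name for n in nodes if n.is_gateway and n.available)
    return candidates[0] if candidates else None


domain = [
    Node("node01", is_gateway=True),
    Node("node02", is_gateway=True),
    Node("node03", is_gateway=False),  # worker node, never a candidate
]

master = elect_master_gateway(domain)      # node01
domain[0].available = False                # the master gateway goes down
new_master = elect_master_gateway(domain)  # node02 takes over
domain[1].available = False                # no gateway node is left
no_master = elect_master_gateway(domain)   # None: the domain cannot accept requests
```

The last call shows why a single gateway node is a single point of failure: once no gateway node is available, nothing is left to elect.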

Worker Nodes
A worker node is any node not configured to serve as a gateway. A worker node can run application services, but it cannot serve as a gateway. The Service Manager performs limited domain operations on a worker node.

Domains Overview
PowerCenter has a service-oriented architecture that provides the ability to scale services and share resources across multiple machines. High availability functionality helps minimize service downtime due to unexpected failures or scheduled maintenance in the PowerCenter environment.

The PowerCenter domain is the fundamental administrative unit in PowerCenter. The domain supports the administration of the distributed services. A domain is a collection of nodes and services that you can group in folders based on administration ownership.

A node is the logical representation of a machine in a domain. One node in the domain acts as a gateway to receive service requests from clients and route them to the appropriate service and node. Services and processes run on nodes in a domain. The availability of a service or process on a node depends on how you configure the service and the node. For more information, see Nodes.

Services for the domain include the Service Manager and a set of application services:

Service Manager. A service that manages all domain operations. It runs the application services and performs domain functions on each node in the domain. Some domain functions include authentication, authorization, and logging. For more information, see Service Manager.

Application services. Services that represent PowerCenter server-based functionality, such as the Repository Service and the Integration Service. The application services that run on a node depend on the way you configure the services. For more information, see Application Services.

The Service Manager and application services control PowerCenter security. The Service Manager manages users and groups that can log in to PowerCenter applications and authenticates the users who log in to PowerCenter applications. The Service Manager and application services authorize user requests from PowerCenter applications. For more information, see Security.

The PowerCenter Administration Console consolidates the administrative tasks for domain objects such as services, nodes, licenses, and grids and for users, groups, and roles. You manage the domain and the security of the domain through the Administration Console.

To use the SSL protocol to transfer data securely between the Administration Console and the Service Manager, configure HTTPS for all nodes in the domain. You can configure HTTPS when you install PowerCenter or by using infasetup commands. The Administration Console uses the HTTPS port to communicate with the Service Manager. The gateway and worker node port numbers you configure for communication with the Service Manager remain the same. Application services and PowerCenter Client applications communicate with the Service Manager using the gateway or worker node port.

If you have the high availability option, you can scale services and eliminate single points of failure for services. Services can continue running despite temporary network or hardware failures.

Service Manager

The Service Manager is a service that manages all domain operations. It runs within Informatica Services. It runs as a service on Windows and as a daemon on UNIX. When you start Informatica Services, you start the Service Manager. The Service Manager runs on each node. If the Service Manager is not running, the node is not available.
The Service Manager runs on all nodes in the domain to support the application services and the domain:

Application service support: - The Service Manager on each node starts application services configured to run on that node. It starts and stops services and service processes based on requests from clients. It also directs service requests to application services. The Service Manager uses TCP/IP to communicate with the application services.
Domain support: - The Service Manager performs functions on each node to support the domain. The functions that the Service Manager performs on a node depend on the type of node. For example, the Service Manager running on the master gateway node performs all domain functions on that node. The Service Manager running on any other node performs some domain functions on that node.

Domain functions performed by the Service Manager: -
·        Alerts
·        Authentication
·        Authorization
·        Domain Configuration
·        Node Configuration
·        Licensing
·        Logging
·        User Management

Application Services

Application services represent PowerCenter server-based functionality. Application services include the Repository Service, Integration Service, Reporting Service, Metadata Manager Service, Web Services Hub, SAP BW Service, and Reference Table Manager Service. When you configure an application service, you designate the node where it runs.
You can also create a grid to run on multiple nodes and assign an Integration Service to run on a grid. When you run a workflow on the grid, the Integration Service distributes workflow tasks across nodes of the grid.
When you install PowerCenter Services, the installation program installs the following application services:
Integration Service
Repository Service
Reporting Service
Metadata Manager Service
SAP BW Service
Web Services Hub
Reference Table Manager Service


When you configure an application service, you designate a node to run the service process. When a service process runs, the Service Manager assigns a port number from the port numbers assigned to the node.
The service process is the runtime representation of a service running on a node. The service type determines how many service processes can run at a time. For example, the Integration Service can run multiple service processes at a time when you run it on a grid.
If you have the high availability option, you can run a service on multiple nodes. Designate the primary node to run the service. All other nodes are backup nodes for the service. If the primary node is not available, the service runs on a backup node. You can subscribe to alerts to receive notification in the event of a service process failover.
If you do not have the high availability option, configure a service to run on one node. If you assign multiple nodes, the service will not start.
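The primary and backup node rules above can be written down as a small decision function. This is a minimal sketch, not how the Service Manager is implemented: the function name, its parameters, and the idea that backup nodes are tried in the order listed are assumptions for the example.

```python
def choose_node(primary, backups, available, high_availability):
    """Pick the node that should run the service process.

    Returns a node name, or None when the service cannot start.
    """
    if not high_availability:
        # Without the option, a service assigned to multiple nodes will not start.
        if backups:
            return None
        return primary if primary in available else None
    if primary in available:
        return primary
    # Assumption: backup nodes are tried in the order they are listed.
    for node in backups:
        if node in available:
            return node
    return None


on_primary = choose_node("node01", ["node02", "node03"], {"node01", "node02"}, True)
failed_over = choose_node("node01", ["node02", "node03"], {"node03"}, True)
misconfigured = choose_node("node01", ["node02"], {"node01", "node02"}, False)
```

Here `on_primary` is the primary node, `failed_over` is the first backup that is still up, and `misconfigured` is `None` because multiple nodes were assigned without the high availability option.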

Integration Service

The Integration Service runs sessions and workflows. When you configure the Integration Service, you can specify where you want it to run:
On a Node: - If you do not have the high availability option, you can configure the service to run on one node.
On a Grid: - When you configure the service to run on a grid, it can run on multiple nodes at a time. The Integration Service dispatches tasks to available nodes assigned to the grid. If you do not have the high availability option, the task fails if any service process or node becomes unavailable. If you have the high availability option, failover and recovery are available if a service process or node becomes unavailable.
On multiple nodes: - If you have the high availability option, you can configure the service to run on multiple nodes. By default, it runs on the primary node. If the primary node is not available, it runs on a backup node. If the service process fails or the node becomes unavailable, the service fails over to another node.
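To make the grid option a little more concrete, here is a rough sketch of dispatching tasks to the available nodes of a grid. The round-robin order, the function name, and the session and node names are assumptions for the example. The post only says that tasks are dispatched to available nodes, not how they are chosen, and the sketch covers the initial dispatch only, not failover or recovery.

```python
from itertools import cycle


def dispatch(tasks, grid_nodes, available):
    """Assign each task to an available grid node in round-robin order."""
    usable = [n for n in grid_nodes if n in available]
    if not usable:
        raise RuntimeError("no available node in the grid")
    # zip stops when the tasks run out, so cycling over the nodes is safe.
    return {task: node for task, node in zip(tasks, cycle(usable))}


plan = dispatch(
    ["s_load_customers", "s_load_orders", "s_load_items"],
    grid_nodes=["node01", "node02", "node03"],
    available={"node01", "node03"},   # node02 is down, so it gets no tasks
)
```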

Repository Service

The Repository Service manages the repository. It retrieves, inserts, and updates metadata in the repository database tables. If the service process fails or the node becomes unavailable, the service fails.
If you have the high availability option, you can configure the service to run on primary and backup nodes. By default, the service process runs on the primary node. If the service process fails, a new process starts on the same node. If the node becomes unavailable, a service process starts on one of the backup nodes.

Reporting Service

The Reporting Service is an application service that runs the Data Analyzer application in a PowerCenter domain. You log in to Data Analyzer to create and run reports on data in a relational database or to run the following PowerCenter reports: PowerCenter Repository Reports, Data Profiling Reports, or Metadata Manager Reports. You can also run other reports within your organization.
The Reporting Service is not a highly available service. However, you can run multiple Reporting Services on the same node.
Configure a Reporting Service for each data source you want to run reports against. If you want a single Reporting Service to point to different data sources, create the data sources in Data Analyzer.

Metadata Manager Service

The Metadata Manager Service is an application service that runs the Metadata Manager application and manages connections between the Metadata Manager components.
Use Metadata Manager to browse and analyze metadata from disparate source repositories. You can load, browse, and analyze metadata from application, business intelligence, data integration, data modelling, and relational metadata sources.
You can configure the Metadata Manager Service to run on only one node. The Metadata Manager Service is not a highly available service. However, you can run multiple Metadata Manager Services on the same node.

SAP BW Service

The SAP BW Service listens for RFC requests from SAP NetWeaver BI and initiates workflows to extract from or load to SAP NetWeaver BI. The SAP BW Service is not highly available. You can configure it to run on one node.

Web Services Hub

The Web Services Hub receives requests from web service clients and exposes PowerCenter workflows as services. The Web Services Hub does not run an associated service process. It runs within the Service Manager.

Reference Table Manager Service

The Reference Table Manager Service is an application service that runs the Reference Table Manager application in a PowerCenter domain. Use the Reference Table Manager application to manage reference tables that contain reference data.
The Reference Table Manager Service is not highly available. You can configure it to run on one node.




SOURCE:
informatica-power-center.blogspot.com


Sunday, July 29, 2012

MDM Intro


Why is MDM in so much glory?
-------------------------------------------
In the last few years, companies with big product lines have realized that ERP and data warehouses do not help much with issues such as data redundancy, duplication, and inaccurate or inconsistent data. Data warehousing is a simple and straightforward solution, but it takes a lot more to manage business data, the data that flows into the business and across its different lines of business. It takes many more data rules, process rules, and trust rules to consume that data efficiently. And once you have done all this and prepared something of value, those records need to flow back into the business, to the lines of business from which they were originally collected.
MDM is a state-of-the-art design. It is a much bigger concept than data warehousing. You could say that it is the reverse of data warehousing, at least in one way [I'm sure... :)]. The difference between data warehousing and MDM is a big topic to discuss, and I am not going to fight over it here.
We keep history in a data warehouse, but in an MDM hub we discard history and keep only a single copy of each fact.

What is Master Data Management?
--------------------------------------------------
MDM is a tool that removes duplicates and creates an authoritative source of master data.
Master data are the products, accounts and parties for which the business transactions are completed.
The root problem stems from business unit and product line segmentation, in which the same customer is serviced by different product lines, with redundant data being entered about the customer (aka party in the role of customer) and account in order to process the transaction.
The redundancy of party and account data is compounded in the front to back office life cycle, where the authoritative single source for the party, account and product data is needed but is often once again redundantly entered or augmented.
MDM has the objective of providing processes for collecting, aggregating, matching, consolidating, and quality-assuring, persisting and distributing such data throughout an organization to ensure consistency and control in the ongoing maintenance and application use of this information.


What is master data management exactly? It is a combination of processes and technologies that helps us manage such data in a better way.

Processes: - Data Stewards/governance groups, business rules.

Technologies: - Master data repository, data integration tools, data quality tools, and many other applications.

Predecessors of MDM: - I would say that before 2003, this kind of data management was handled by custom-built applications [processes, spreadsheets] and by things like data dictionaries. Today, MDM is generally taken to be the term that encompasses all of these approaches, regardless of the domain (customer, product, sales, etc.). It also depends on the design approach: an MDM solution can be built for a single domain, such as customer, or across all the domains.


Vendors: - Numerous vendors compete in the MDM space, and they all claim that their tool has all the functionality needed to support MDM. But what are the basic criteria by which we can decide for ourselves which tool to go for? In my opinion, if I had to select a tool for an MDM solution, I would look at the parameters below.

-> Governance support -- The creation, update, and retirement of master data definitions come down to business processes. The tool should be able to keep track of all data flows through the various stages.

-> Business rule deployment -- Business rules are what drive the update of master data. They can themselves be stored in a repository and may include:
   · Rules around business processes
   · Derived data
   · Business hierarchies

-> Data rules implementation -- Data rules are different from business rules. They could include the following:
   · Validation rules, e.g. a value should be an integer or a character.
   · Dependency rules, e.g. if the oil well type is exploration, then the field can be 'DRY' or 'Success'.
   · Matching rules [to identify potential duplicates]
   · Naming standards or internationalization of names or attributes
   · Stats about the data
                                               
These are the different ways of applying data rules. The primary goal of all of the above is that data should be accurate, correct, current, complete, and relevant. There are a number of tools that provide data quality capabilities. But before choosing one, we need to know the discrepancies in the data, so that we can decide which data rules we need.
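To show what validation, dependency, and matching rules look like in practice, here is a small sketch built on the oil well example. The field names (`depth_m`, `well_type`, `status`), the sample records, and the very naive matching key are all made up for illustration; a real MDM tool would use much richer matching than this.

```python
def validate(record):
    """Return a list of rule violations for one oil well record."""
    errors = []
    # Validation rule: the depth must be an integer.
    if not isinstance(record.get("depth_m"), int):
        errors.append("depth_m must be an integer")
    # Dependency rule: an exploration well can only be 'DRY' or 'Success'.
    if record.get("well_type") == "exploration" and record.get("status") not in ("DRY", "Success"):
        errors.append("exploration well status must be 'DRY' or 'Success'")
    return errors


def match_key(name):
    """Matching rule: normalise a name so that likely duplicates share a key."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def potential_duplicates(records):
    """Group the ids of records whose normalised names collide."""
    groups = {}
    for rec in records:
        groups.setdefault(match_key(rec["name"]), []).append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


wells = [
    {"id": 1, "name": "North Sea A-1", "well_type": "exploration", "status": "DRY", "depth_m": 3200},
    {"id": 2, "name": "north sea a1", "well_type": "exploration", "status": "Producing", "depth_m": "3200"},
    {"id": 3, "name": "Gulf B-7", "well_type": "development", "status": "Producing", "depth_m": 2800},
]

violations = {w["id"]: validate(w) for w in wells}
duplicates = potential_duplicates(wells)
```

Record 2 breaks both the validation rule and the dependency rule, and records 1 and 2 come out as potential duplicates because their names normalise to the same key.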

-> Data provision -- This aspect of functionality within an MDM application covers how master data is accessed by people. This includes:
   · Reporting
   · Search
   · Browsing
   · Security [different roles for different levels of access]
   · Publication for open access
-> Data profiling capabilities: - This is also one of the major criteria we should look for in a tool. Data profiling gives us a clear picture of the data's statistics, such as the number of NULL values, MAX, MIN, UNIQUE counts, and many more. These stats help us at the next stage, cleansing the data.
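The kind of stats a profiling tool reports for a single column can be sketched in plain Python. The function name and the sample column are assumptions for the example, and `None` stands in for a NULL value.

```python
def profile(values):
    """Return basic statistics for one column; None stands in for NULL."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "unique": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


stats = profile([10, None, 7, 10, None, 42])
```

Numbers like a high NULL count or an unexpected MIN or MAX are exactly the discrepancies that tell us which cleansing rules are needed.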

-> Master data storage: - At the heart of many MDM projects is a repository or database in which golden-copy master data is held. It is important to understand why a data warehouse is not a true MDM repository, and vice versa. A data warehouse, because its purpose is to produce reliable information, should have only 'clean' consolidated data stored within it. An MDM repository, by contrast, should be able to hold incomplete data and track it through the stages of becoming golden, ideally retaining an audit trail of the steps involved.
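Tracking a record through its stages with an audit trail could look something like the sketch below. The stage names, the `MasterRecord` class, and the steward user name are all hypothetical; they are only there to illustrate the idea and are not taken from any particular MDM product.

```python
from dataclasses import dataclass, field

# Hypothetical stage names, in the order a record moves through them.
STAGES = ["incoming", "validated", "matched", "golden"]


@dataclass
class MasterRecord:
    key: str
    data: dict
    stage: str = "incoming"
    audit: list = field(default_factory=list)

    def promote(self, user):
        """Move the record to the next stage and log the step in the audit trail."""
        idx = STAGES.index(self.stage)
        if idx == len(STAGES) - 1:
            raise ValueError("record is already golden")
        previous, self.stage = self.stage, STAGES[idx + 1]
        self.audit.append((previous, self.stage, user))


rec = MasterRecord("CUST-1", {"name": "Acme"})   # incomplete, not yet golden
rec.promote("steward1")
rec.promote("steward1")
```

After two promotions the record is at the "matched" stage, and `rec.audit` holds one entry per step, which is the part a data warehouse would normally not keep.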

-> Data movement and synchronization: - This is about how data comes into the hub and how we maintain the whole publish-and-subscribe policy among all the connected applications, be it ERP, downstream applications, or legacy applications.
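A bare-bones version of the publish-and-subscribe idea is sketched below. The `Hub` class, the "customer" domain, and the two subscriber lists standing in for an ERP and a legacy application are assumptions for the example; real hubs add queuing, retries, and security on top of this.

```python
from collections import defaultdict


class Hub:
    """A toy MDM hub that pushes golden records to subscribed applications."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, domain, callback):
        """Register an application callback for one master data domain."""
        self.subscribers[domain].append(callback)

    def publish(self, domain, record):
        """Send the record to every subscriber; return how many were notified."""
        for callback in self.subscribers[domain]:
            callback(record)
        return len(self.subscribers[domain])


hub = Hub()
erp_inbox, legacy_inbox = [], []          # stand-ins for connected applications
hub.subscribe("customer", erp_inbox.append)
hub.subscribe("customer", legacy_inbox.append)

notified = hub.publish("customer", {"id": "CUST-1", "name": "Acme"})
```

Both inboxes end up with the same single copy of the customer record, which is the point of keeping the connected applications in sync from one hub.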




Tuesday, July 24, 2012

Absurd Interviewer

Last weekend, I went for an interview at some XYZ company. The selection process there was so absurd, it was like they were taking LKG admissions. They called 10 guys at a time, gave one scenario to each guy, and asked them to finish it. The vacancy was for guys with 4-5 years of experience. What the heck, is this the way to handle experienced guys, and how can one question be the criterion for selection or rejection? On top of that, the guy who was handling all this was simply a dumbhead, not answering any questions about whether a solution was right or not. I am not even sure whether that guy had any domain knowledge. So funny....!!!!!!!!!