Jose Luiz Berg


Biography

Jose Luiz Berg is a long-time project manager and systems architect working with Enterprise Application Integration (EAI). In the past few years, Jose has focused his work on implementing Service-Oriented Architecture (SOA) for large Brazilian telecommunications companies. He graduated in computer networks, but also has extensive experience as a programmer in commercial programming languages over the last 25 years. Jose believes that SOA is one of the most important advances in software development in recent decades, as it involves not only a change in the way we work, but also a significant change in how companies see themselves and their IT resources. This advance also carries a risk: many companies are being convinced by bad software vendors that SOA is only about creating Web services, without focusing on what it really stands for, and so they do not realize that this is an important part of history in the making.


The Integration Between EAI and SOA - Part I

Published: April 17, 2011 • SOA Magazine Issue XLIX

Introduction

This article presents the relationship between Service-Oriented Architecture (SOA) and Enterprise Application Integration (EAI). Some people have recently said that EAI is dead and has been replaced by SOA, creating some confusion and reinforcing a culture of "forgetting all the past". This is causing a lot of problems and pushing SOA implementations toward complete failure. To establish this relationship, we are going to start by telling the story of EAI, establishing its main principles and goals, showing how SOA combines with EAI, and how the two could join to create a new and advanced architecture.


History of Systems Integration

In the beginning was the mainframe. In these "super" computers, the environment was completely controlled. Since the hardware was proprietary, all software was developed by the equipment manufacturer or by the internal development team of each company. At that time, there was only database integration, with many systems (usually developed in the same language) accessing data in the databases (sorted sequential files). There was no strict concept of a system, only collections of individual programs that shared the same execution environment.

In the late 1970s, a few personal computers began to appear; however, they were very expensive and limited. Finally, in 1981, IBM introduced the first PC, which brought a silent revolution that went unnoticed by most people at that time. The hardware was no longer proprietary and could be manufactured by anyone. Likewise, the software could be developed and purchased from any company, and users began to choose which of them they preferred to use. That choice was the seed of all systems integration, because within a single company several different systems could now be used, one for each area. It made no sense for the same information to be retyped into each system, so the systems needed to communicate somehow to allow the information to be synchronized.

In the 1990s we saw ERP systems, which aimed to solve the problem of data synchronization. Their modular architecture provided separate modules for each business area sharing the same database, allowing data to be exchanged between them. The problem with this approach was that no supplier had all the modules a company required, and there was no compatibility between modules from different vendors. Also, the modules did not always meet the needs of each business area, which, when it opted to use another system, returned to the same problem of having to synchronize information.



Figure 1 – Spaghetti Problem

It became obvious that in a large company some strategy would be required for systems to exchange information, to avoid repetitive work and re-entering data, and to ensure that all systems used the most up-to-date information. Initially this strategy was to exchange files on floppy disks and, with the introduction of local networks, through file sharing. At this time relational databases also began to emerge, using client-server architecture and facilitating the sharing of data. With the evolution of networks and the adoption of TCP/IP, integrations using sockets began to appear, allowing higher speed when sending online information.

At that time, systems integration was called "middleware": software that runs between other software, allowing communication between them. With the exponential growth in integration needs, the complexity of middleware systems grew just as fast. This characterized the so-called "spaghetti" problem: the exponential growth in the complexity of the integration as the number of participating systems increases.

Around 1998 a new concept began to appear: the use of an integration bus. With this concept, instead of each system connecting directly to the other systems, all systems are connected to the bus, and the bus has the intelligence to decide which systems will provide or receive each piece of information. With this new concept, the "spaghetti" problem was resolved, because the inclusion of a new system on the bus no longer had a major impact on the overall integration complexity. Alongside this concept, specific tools for integration began to appear. These tools provided the bus and also features such as data mapping, persistence, state machines, and connectors to allow communication using various technologies such as databases, sockets, FTP, HTTP, and special connectors for the major ERP and CRM systems.



Figure 2 – Integration Bus

Almost at the same time came Microsoft's standard for distributed components, DCOM (Distributed Component Object Model), which allowed a program to execute a function on a remote computer without the need for an external agent to perform the communication, because the support was built into the operating system itself. DCOM was based on the fusion of COM (Component Object Model) with RPC (Remote Procedure Call), which was already used on UNIX and Java systems for RMI or CORBA calls but had not taken off, due to the lack of standardization in the UNIX world, where every operating system flavor used slightly different and sometimes incompatible versions. The use of COM/DCOM was successful and allowed the development of many systems and applications. However, it only worked between computers running Windows, while most servers and enterprise systems at the time ran UNIX. With this limitation in mind, Microsoft and IBM sought a way to provide the same functionality across heterogeneous systems. The solution was to use HTTP as the data transport layer, since with the growth of the Internet this service was standardized and available on all platforms, together with the emerging XML standard for defining the data being exchanged. This is how the Web services standards were created.

Today, Web services are the most commonly adopted standard for systems integration, and the technology in fashion is service-oriented architecture (SOA). The purpose of this document is not to conceptualize SOA, but to show the relationship between EAI and this architecture, and to present the points where the two meet: despite some statements in the market that SOA has "buried" EAI, it is still alive and more active than ever, even if hidden under some other layer of "SOA tools". If you want more information on SOA, search for OASIS, SOA Magazine, the SOA Manifesto, The Open Group Architecture Framework, or, at the very end of the list, see my presentation called "SOA Facts & Actions".


Integration Concepts

An EAI system has some principles and goals. The main concept is the use of an integration bus that controls the communication; without the bus, it would be considered only middleware. In addition to that, there are other concepts that are also important to review:


Low Coupling

The concept of coupling in EAI can be defined as the complexity induced in the whole integration by changes in one single component. Any change to a system participating in the integration should generate little or no change in the other components. Ideally, the coupling is as small as possible, and the goal of EAI is to allow the complete abstraction of the systems. This would allow a system that serves a certain business area to be replaced without affecting the other components, changing only its communication with the bus.
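As a minimal sketch of this idea (the names here are hypothetical, not taken from any specific EAI product), the bus can depend only on an abstract endpoint contract, so that replacing a system means rewriting one adapter and nothing else:

// Hypothetical canonical message owned by the bus.
record CanonicalOrder(String id, String customerId) {}

// The bus depends only on this contract, never on a concrete system's API.
interface SystemEndpoint {
    void deliver(CanonicalOrder order); // bus -> system
}

// Replacing the billing system means writing a new endpoint; the bus and
// every other participating system remain untouched.
class LegacyBillingEndpoint implements SystemEndpoint {
    @Override
    public void deliver(CanonicalOrder order) {
        // Translate the canonical order into the legacy system's native
        // format and push it through that system's own API or database.
        System.out.println("billing order " + order.id());
    }
}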


Data Model

Usually an integration involves several data models, because each system has its own vision of each piece of information. One of the primary functions of EAI is to provide this conversion, i.e., to convert the data from the source model to the destination model and vice versa. However, if we used the source system's model for the data passing through the bus, we would be violating the first concept and inducing coupling, because a change in the source system's data model would then impact the integration of the other systems. Therefore, the ideal is for the bus to have its own data model and to map the data of each system to this internal model.

In reality there are several other reasons to use a bus data model, although it generates more work than using the systems' own models. If you look at a simple interface that sends data from system A to system B, with a bus model we have two data conversions (System A Model -> Bus Model -> System B Model), while using the systems' models directly there is only one (System A Model -> System B Model). Some of the other reasons are discussed in the later chapter specific to the data model. For now, it is important to understand that this is one of the pillars of integration.



Figure 3 – Data Models: Simple interface data conversions
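A minimal sketch of those two conversions, with hypothetical record types standing in for the real system and bus models:

import java.time.LocalDate;

// Hypothetical models: each system has its own view of a customer,
// and the bus owns a canonical model independent of both.
record SystemACustomer(String id, String firstName, String lastName, String since) {}
record BusCustomer(String id, String fullName, LocalDate customerSince) {}
record SystemBCustomer(String code, String name) {}

class CustomerMapper {
    // A -> Bus: the only mapping that changes if System A changes.
    static BusCustomer fromSystemA(SystemACustomer a) {
        return new BusCustomer(a.id(),
                a.firstName() + " " + a.lastName(),
                LocalDate.parse(a.since())); // assumes ISO yyyy-MM-dd dates
    }

    // Bus -> B: the only mapping that changes if System B changes.
    static SystemBCustomer toSystemB(BusCustomer c) {
        return new SystemBCustomer(c.id(), c.fullName());
    }
}

The extra conversion is the price paid for isolation: adding a system C that also consumes customer data requires one new mapping to the bus model, not one new mapping per existing system.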


Separation of Functions

The goal of integration is to allow communication between systems, not to steal their functionality. This involves converting technologies, formats, and data domains, and validating and rejecting requests that do not comply with the standards. The problem is that the increasing resources offered by modern EAI tools make it possible to implement all kinds of functions, and often features end up being created that should live in the integrated systems, not in the integration tool. Sometimes it is very difficult to identify the difference, because the boundaries are blurred. One of the important parameters for verifying this limit is the ownership of the data. The basic rule is that EAI generates no information; it only transforms information generated by the integrated systems. If your implementation is generating information, you should suspect that it is exceeding its functions.
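A small, hypothetical illustration of that rule: the first transformation derives its output entirely from the input, while the second would make the bus the origin of new business information, a sign the rule has been violated.

import java.math.BigDecimal;

final class BusTransforms {
    // Acceptable: the output is derived entirely from data the source
    // system sent; the bus creates no new business information.
    static String normalizedPhone(String raw) {
        return raw.replaceAll("[^0-9+]", "");
    }

    // Suspect: a credit limit is new business information. If the bus
    // computes it, the bus becomes its system of record; this rule
    // belongs inside an integrated system, not in the integration.
    static BigDecimal creditLimitFor(String customerId) {
        throw new UnsupportedOperationException(
                "business data must originate in an integrated system");
    }
}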


Granularity

One of the most important settings for EAI is the granularity of each interface, because it defines the unit of work and the performance of the interface. It is often said that EAI systems should not perform batch processing, and indeed ETL systems are much more efficient for this type of integration. However, that does not mean EAI cannot be used, only that there must be specific interfaces prepared for this type of processing. The problem typically occurs when someone has the idea of reading the batch file and publishing each record to an existing online interface to be processed individually. It may seem simple, but it will kill your online processing.

The most important concept is that the granularity of an interface should always be constant and compatible with the type of process, i.e., if your process begins with a batch interface, it should remain batch until the end of processing.
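A sketch of the difference, with hypothetical endpoint names; the point is that the file stays a bulk unit of work instead of being exploded onto the per-request interface:

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical bulk-capable endpoint on the batch interface.
interface BulkEndpoint {
    void send(List<String> records);
}

class NightlyBillingLoad {
    static void process(Path file, BulkEndpoint bulk) throws Exception {
        List<String> records = Files.readAllLines(file);
        // Keep batch granularity end to end: hand the records over in
        // large chunks to a bulk endpoint (or to an ETL tool).
        int chunkSize = 1000;
        for (int i = 0; i < records.size(); i += chunkSize) {
            bulk.send(records.subList(i, Math.min(i + chunkSize, records.size())));
        }
        // Anti-pattern (avoid): replaying every record through the online
        // interface, flooding it with thousands of individual requests
        // and starving real-time traffic:
        // for (String r : records) onlineInterface.publish(r);
    }
}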


Level of Cooperation

In an integration, the level of cooperation indicates how a system relates to the integration bus, i.e., how it exchanges information with the outside world. We can define three distinct levels of cooperation:

  • Intrusive - The system is a closed package that has no programming or integration API. In this case, it is usual to read/write directly to the system's database; the system does not notice that this is happening. This type of integration must be built with care, as it can generate inconsistencies with the native data.
  • Cooperative - Although the system has closed functionality, it provides an API for integration and some mechanism to receive and make calls to external systems. In this case, the integration is easier, but limited to the functionality provided by the system and the technology available.
  • Programmable - Most modern systems have some programming or scripting language that allows them to receive and make calls using various customized technologies. This is the optimal level for integration: the system comes with "out-of-the-box" features, and you can add others if required.

Complexity of the Problem vs. the Solution

In systems integration there is no magic: if you have a complex problem, its solution is complex; if the problem is simple, its solution is simple. Whenever someone tells you there is a simple solution to a complex problem, re-evaluate the problem and look at it from a different perspective: either you have overestimated the problem, or you are being deceived. This seems like a simple concept, but many projects have failed because they forgot it and relied on quick solutions.

With these concepts in mind, and knowing more about the history, principles, and goals of systems integration, let's now begin to understand the components and separate the functions so that we can have an efficient integration, with low coupling and good maintainability.


Integration Layers

The main function of EAI is to allow communication between systems. To do that, the bus also needs to "know" what information must be sent to or received from each system, and when. This knowledge is not part of the interface itself, because a system only needs to know how to send its information to the bus. This leads to the separation of processing into two distinct layers:


Communications Layer

By communication we mean all the tasks required to enable the exchange of information with the outside world. This communication can be one-way or two-way, synchronous or asynchronous, local or remote, and it can use one technology or a set of them. The main components of this layer are:

  • Adapters - These are the agents responsible for the "physical" communication. They can use various technologies and protocols; the most common are HTTP(S), FTP, files, database clients, TCP/IP sockets, SOAP, REST, CORBA, COM/DCOM, RMI, etc. In general, data is exchanged in the "physical" format of the external system.
  • Converters - These are the components responsible for converting data from the external format to a format more "friendly" to the bus, so it can be processed easily, quickly, and uniformly. The format depends on the bus, but most modern systems typically use XML.
  • Processors - These are required in complex integrations and are generally used when an external system has a communication logic that needs to be implemented: the bus may need to wait for several events to compose a request, break a request into various more specific events, or perform other tasks required for communication with the system. Remember, however, that we are not running business rules at this point, only the specific rules that allow communication with a system. Processors generally use state diagrams to govern their functioning, and may have persistence mechanisms to hold the information until it is processed.
  • Mappers - Their function is to convert data from the system's model to the internal model of the bus, i.e., to change the "logical" format. This can involve concatenating attributes, using conversion tables, copying data from one attribute to another, performing calculations and processing routines, validating and flagging information, and numerous other functions, never forgetting that the bus cannot create any information, only transform what was received. Mappers work on demand from the processors, and the mapped data is returned to them for processing.
  • Routers - Once the information has been received, converted, processed, and mapped, the next step is running the business rules, so the information must be sent to the correct component in the business layer. This is the role of routing: the router receives information from the external systems and forwards it to the correct internal component, or receives information from an internal component and forwards it to the external system. Routers function as elements of abstraction, so that the communication and business layers do not need to know each other to exchange information, thereby reducing the coupling of the system. (A sketch of this whole pipeline follows Figure 4.)


Figure 4 – Communication Layers
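The following minimal sketch (all names hypothetical, not tied to any EAI product) shows how these components chain together for an inbound message; the processor step is omitted for brevity:

import org.w3c.dom.Document;

// Hypothetical canonical message in the bus's internal model.
record BusMessage(String type, String payload) {}

interface Adapter   { byte[] receive(); }                    // "physical" I/O: FTP, HTTP, socket...
interface Converter { Document toBusFormat(byte[] raw); }    // external format -> bus format (XML)
interface Mapper    { BusMessage toBusModel(Document doc); } // system model -> canonical model
interface Router    { void forward(BusMessage message); }    // hand-off to the business layer

class InboundFlow {
    private final Adapter adapter;
    private final Converter converter;
    private final Mapper mapper;
    private final Router router;

    InboundFlow(Adapter adapter, Converter converter, Mapper mapper, Router router) {
        this.adapter = adapter;
        this.converter = converter;
        this.mapper = mapper;
        this.router = router;
    }

    void pump() {
        byte[] raw = adapter.receive();            // 1. physical communication
        Document doc = converter.toBusFormat(raw); // 2. "physical" format conversion
        BusMessage msg = mapper.toBusModel(doc);   // 3. "logical" mapping to the bus model
        router.forward(msg);                       // 4. route to the business layer
    }
}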


Business Layer

When information reaches the business layer, it is already in the format and data model of the bus, so this layer's function is to determine what should be done with each request. It executes what can be called the "business rules" of the bus. To allow this intelligence, the following components are required:

  • Processors - This is where the business logic is executed. As in the communication layer, processors usually consist of state diagrams and persistence mechanisms to store the information; they may also have integrated workflow systems and user-interaction support for entering data and tracking the progress of tasks. A programming or scripting language usually allows complex tasks to be implemented. However, you should be very careful to keep all processing independent of the systems involved, in order to ensure independence and low coupling. Any processing that depends on a particular system should be done in that system's communication layer.
  • Mappers - Usually the information needs to be mapped in order to run the business requests. This implies copying data, composing requests and decomposing them into smaller ones, validating domains to match information, and countless other activities. It is important to remember that the data has already arrived in the correct "physical" and "logical" models: we are only converting it to match the required business rules, and the mapping output must remain in the same model. Mappers work on demand from the processors, and the mapped data is returned to them for processing.
  • Routers - Just as in the communication layer, we need a router as an element of isolation between the communication and business layers. Although this initially seems redundant, it introduces the concept of a platform. Imagine a scenario where your company has three different billing systems, each with its own customers. Your business rule says that once a purchase order has been completed, the request must be sent to billing; however, the process does not know which of the billing systems owns the customer. If this decision were taken inside the business process, you would have high coupling between the processes and the systems, which is undesirable. Using this router, the business process only knows to send the request to the billing platform, and the router identifies the correct system and forwards the request to the router of the underlying communication layer (see the sketch after Figure 5).

Figure 5 – Business Layer
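A minimal sketch of that billing scenario, with hypothetical types; note that the routing key (here, the customer's region) comes from the message itself, never from the business process:

import java.util.Map;

// Hypothetical canonical request and endpoint contract.
record BillingRequest(String orderId, String customerRegion) {}

interface BillingEndpoint {
    void deliver(BillingRequest request); // entry into one system's communication layer
}

class BillingPlatformRouter {
    private final Map<String, BillingEndpoint> systemsByRegion;

    BillingPlatformRouter(Map<String, BillingEndpoint> systemsByRegion) {
        this.systemsByRegion = systemsByRegion;
    }

    // The business process calls only this method ("send to billing");
    // the choice among the three billing systems is made here.
    void route(BillingRequest request) {
        BillingEndpoint billing = systemsByRegion.get(request.customerRegion());
        if (billing == null) {
            throw new IllegalStateException(
                    "no billing system for region " + request.customerRegion());
        }
        billing.deliver(request);
    }
}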

The separation of processing into these two layers addresses one of the main goals of EAI, which is independence from the specific systems: since the business rules are separated from the communication, a system could, in theory, be replaced by a similar one by changing only its communication layer. It is clear that a change of this scale always involves changing business processes as well, but this way the change can be made with the lowest possible impact on the company.


Conclusion

The next article will continue by going through the data model, the business process model, the relationship between EAI and SOA, and the conclusion.