Johan Kumps

Biography

Johan Kumps is a SOASchool Certified SOA Architect - Analyst at RealDolmen, a Belgian consultancy organisation based near Brussels. For several years Johan has been involved in the service-oriented analysis and design phases of several Service-Oriented Architectures for both the Flemish and the Belgian Federal Governments.

In this role Johan designed the Federal Service Bus (FSB) platform for the Belgian Federal Government using Oracle technology. The FSB is an implementation of the Enterprise Service Bus pattern that enables different government divisions to exchange information using SOA concepts. Currently Johan fulfils the roles of SOA Architect and SOA Analyst in several SOA projects. Thanks to his lasting interest in new technologies and in taking his work to the next level, Johan came across Semantic Web technologies a few years ago. Once he realized that this new approach to the Web was becoming more and more mature, he started his own investigation and quickly found colleagues who had already been investing time in the matter. No further convincing was necessary: Johan needed to sink his teeth into the combination of semantic technologies and service-oriented principles. As a committer on the JBoss ESB project, he started implementing Semantic SOA features in the JBoss ESB platform.


Semantics Enabling Next Generation SOA – Part I

Published: July 26th, 2012 • Service Technology Magazine Issue LXIV
 

Abstract: This two-part article series discusses how creating and maintaining service-based architectures can be a significant challenge and a considerable investment. IT staff must carry out all of the tasks associated with the discovery, composition and invocation of services. Coping with millions of services solely through human effort is not feasible, even before environmental and context changes are taken into account. There should therefore be an approach that, through innovative automation, turns traditional service-oriented architecture into a more dynamic and more flexible pillar of an enterprise architecture.


Introduction

Service-Oriented Architecture (SOA) is a style of software architecture that advocates reusable and intrinsically interoperable units of logic referred to as services. This means that a business application is now just another service composition. In current SOAs, the detection and usability analysis of suitable services for a specific client application is limited to manual human intervention.

This human intervention, performed by the project's architect or analyst, mainly focuses on the selection, negotiation, constraint validation and decision-making steps in the business process, which impacts the flexibility and agility of the solution architecture. Current automation approaches rely heavily on syntactical representations, restricting collaboration to parties that have agreed on a standard beforehand. Supporting dynamic integration and process automation on an intra- and inter-enterprise level requires addressing heterogeneity in integration, dynamic constraint validation and runtime agreement negotiations.

In recent years, a lot of research has been done on the Semantic Web as described by Tim Berners-Lee. Semantic Web technologies can help overcome the challenges we face today in traditional SOA projects. Realizing service-orientation's full potential will require integrating SOA and Semantic Web technologies. In this article I will explore how and where Semantic Web technologies can be adopted to lift traditional SOA to the next level and form a solid foundation for flexible business solutions and innovative pervasive computing platforms.


The Service-Orientation Principles

SOA originally brought eight, nowadays well-known, principles and influences to software engineering, supported by and applied in a distinct approach to the analysis, design and implementation phases. Within the context of this article, I'll focus on the Service Discoverability, Service Composability and Service Loose Coupling principles. For each of them we aim to maximize the use of machine-readable metadata describing the available information and resources, enabling the automation of a substantial portion of the work related to applying the SOA principles mentioned above. Let me take the Service Discoverability principle as an example to make my point. Using the currently available tools (UDDI registry/repository, search engines, ...), it is very easy to get lost in a myriad of services, or to discover irrelevant ones, since only a keyword-based search facility is available. For instance, if we search for a service using the ‘hotel' keyword, we will find all kinds of information mixed with service descriptions. Below I've listed some of the possibilities:

  • Tourist information in the neighborhood of hotels
  • Websites or travel agencies providing hotel booking services
  • Schools providing courses for hotel personnel
  • ...

In the best case we find the wanted service. In most cases, however, the precision and recall of the query are low, requiring a human to filter out the usable portion; or we don't find the requested service at all because none of the available services mention the ‘hotel' keyword, leading to a less flexible, less dynamic solution.


The Semantic Paradigm

The Semantic Web (Web 3.0) vision was conceived by Tim Berners-Lee, who defined the Semantic Web as "a web of data that can be processed directly and indirectly by machines". The main goal of Web 3.0 is to define a system that enables machines to understand the meaning of the data being provided, searched for, shared and exchanged. Such an understanding requires that the information sources be semantically structured. While the evolution from Web 1.0 to Web 2.0 increased the participation rate of end-users, by enabling them to interact and collaborate with each other in a social media dialogue as creators of information or content, the transition to Web 3.0 has more to do with a technology shift. This transition implies that we move from a web of documents to a web of semantically structured data. Web 3.0 gives meaning to the contents of documents and the links between them: content in terms of "entities" such as people, locations and phone numbers, and "relationships" between those entities, such as person A "lives at" address B and "has car" C, which is "insured by" insurance company D.

In contrast with documents (PDFs, text documents, spreadsheets), data can be interpreted by software agents. To make sure data can be used by machines, the semantics of this data should be available in a machine-understandable manner. This is where semantic technologies come into play for knowledge management. These technologies are built on syntaxes which use the Uniform Resource Identifier (URI) to represent data in triple-based structures using the Resource Description Framework (RDF). RDF is a framework for representing information about resources in a directed graph formed by triples, commonly expressed using an XML syntax. Each triple (a knowledge statement about the real world) is a complete and unique fact made up of a subject, a predicate and an object. The subject is the resource described by the statement. The predicate is the property of the subject, identified uniquely by a URI. The object is the value of the subject's property.

Triples can be represented graphically as a directed labeled graph, removing any possibility of confusion over which element is the subject and which is the object. By convention the subject is shown as an oval, the predicate as an arc and the object as a box. This convention is depicted in the diagram below, representing the statement: "The book has the title The Semantic Web".



Figure 1 – Example of a basic triple
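
To make the structure tangible, here is a minimal sketch that builds the same statement with the Apache Jena framework (introduced later in this article). It assumes a recent Apache Jena release; the namespace and resource names are hypothetical, chosen purely for illustration.

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;

public class TripleExample {
    public static void main(String[] args) {
        // Hypothetical namespace, used only for this example.
        String ns = "http://example.org/books#";

        Model model = ModelFactory.createDefaultModel();

        Resource book = model.createResource(ns + "book1");  // subject
        Property title = model.createProperty(ns, "title");  // predicate
        book.addProperty(title, "The Semantic Web");         // object (a literal)

        // Print the one-statement graph; the subject-predicate-object
        // structure of Figure 1 is directly visible in the output.
        model.write(System.out, "TURTLE");
    }
}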


A knowledge base describing a specific domain is a collection of RDF triples stored in an ontology. An ontology is thus a formal representation of domain concepts and the relationships between them. Next to capturing domain knowledge in a standard, machine-interpretable manner, ontologies can be used to reason about the model and data, revealing new insights in the form of additional triples that were not explicitly modeled. Subsumption, equivalence and disjointness are some examples of relationships between ontology concepts that could be discovered by the reasoning engine. This kind of information is also called inferred knowledge, which is one of the supporting principles of the semantic discovery process discussed later in this article.

To allow standardized description of subjects, objects, the relations between them and other ontological constructs, RDF Schema (RDFS) was created. RDFS is designed to be a simple data-typing model for RDF. Using RDFS it is possible to model statements like: Car is a type of MotorizedVehicle, and MotorizedVehicle is a subtype of Vehicle.
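
The sketch below models this small hierarchy in Jena and wraps it with the built-in RDFS reasoner, so that the subclass statements yield the kind of inferred knowledge mentioned above. The vehicle namespace is again hypothetical.

import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

public class RdfsExample {
    public static void main(String[] args) {
        String ns = "http://example.org/vehicles#";  // hypothetical namespace

        Model model = ModelFactory.createDefaultModel();
        Resource vehicle = model.createResource(ns + "Vehicle");
        Resource motorized = model.createResource(ns + "MotorizedVehicle");
        Resource car = model.createResource(ns + "Car");

        // Class hierarchy: Car is a MotorizedVehicle, which is a Vehicle.
        motorized.addProperty(RDFS.subClassOf, vehicle);
        car.addProperty(RDFS.subClassOf, motorized);

        // An instance that is only declared to be a Car.
        Resource myCar = model.createResource(ns + "myCar");
        myCar.addProperty(RDF.type, car);

        // The RDFS reasoner infers the triple (myCar rdf:type Vehicle),
        // which was never asserted explicitly.
        InfModel inf = ModelFactory.createRDFSModel(model);
        System.out.println(inf.contains(myCar, RDF.type, vehicle));  // true
    }
}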

More detailed ontologies can be created with the Web Ontology Language (OWL). OWL is a language derived from description logics and offers more constructs than RDFS. Like RDFS, it is syntactically embedded into RDF, and it provides additional standardized vocabulary (equivalentClass, intersectionOf, inverseOf, ...) that is useful when modeling domain knowledge.

For querying RDF data as well as RDFS and OWL ontologies, a SQL-like language called the SPARQL Protocol and RDF Query Language (SPARQL) can be used.
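
As a brief illustration, the following sketch runs such a query with Jena's query API against the small book model from the earlier example; the ex: prefix is the same hypothetical namespace.

import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class SparqlExample {
    public static void main(String[] args) {
        // Rebuild the tiny book model from the earlier sketch.
        Model model = ModelFactory.createDefaultModel();
        String ns = "http://example.org/books#";
        model.createResource(ns + "book1")
             .addProperty(model.createProperty(ns, "title"), "The Semantic Web");

        // Select every resource that has a title, SQL-style.
        String sparql =
            "PREFIX ex: <http://example.org/books#> " +
            "SELECT ?book ?title WHERE { ?book ex:title ?title }";

        Query query = QueryFactory.create(sparql);
        try (QueryExecution qexec = QueryExecutionFactory.create(query, model)) {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                System.out.println(row.getResource("book") + " -> "
                        + row.getLiteral("title"));
            }
        }
    }
}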


When the Semantic Web meets SOA

The foundation of semantically enabled solutions is the domain ontology. The main goal of an ontology is to share a common understanding of the structure of descriptive domain information among people, among software agents, and between people and information systems, next to enabling reuse of domain knowledge, introducing standards and increasing interoperability. Some examples:

  • If I say "house" and you say "maison", how do we know we mean the same thing?
  • If I say "vehicle", how does the system know whether this includes buses, cars, trains, ...?

By making domain assumptions explicit in a standardized manner using ontologies, we achieve interoperability between formerly disparate systems and between people speaking different languages, as well as interaction between people and information systems. People can use their own terminology or language to express their needs while the system is able to interpret the request and link it to the services the system provides.
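
The sketch below illustrates one way such a terminology bridge can be expressed, using the owl:equivalentClass construct mentioned earlier. The English and French vocabularies are hypothetical, and Jena's OWL rule reasoner does the linking: an individual described only in French terms is also found under the English concept.

import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;

public class EquivalenceExample {
    public static void main(String[] args) {
        String en = "http://example.org/en#";  // hypothetical English vocabulary
        String fr = "http://example.org/fr#";  // hypothetical French vocabulary

        // Ontology model backed by Jena's OWL rule reasoner.
        OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_RULE_INF);
        OntClass house = model.createClass(en + "House");
        OntClass maison = model.createClass(fr + "Maison");

        // The explicit bridge: both classes denote the same concept.
        house.addEquivalentClass(maison);

        // An individual asserted only against the French vocabulary...
        Resource home = model.createResource(fr + "myHome");
        home.addProperty(RDF.type, maison);

        // ...is also recognized under the English concept by the reasoner.
        System.out.println(model.contains(home, RDF.type, house));  // true
    }
}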

Regarding Web Services, or services in general, semantics can be used to give service components (functional description, inputs, outputs) a meaning. The integration of semantics also allows for the definition of new elements like preconditions and effects (post-conditions), which are not covered by syntactic descriptions. Preconditions are logical conditions that need to be fulfilled before the service can be executed. Effects describe changes in the functional context of the service after the service has been executed. Formerly this kind of information was stored in documents made available along with the WSDL contract, but unfortunately it was not interpretable by software agents. Together with inputs and outputs, preconditions and effects form the service profile.

It is necessary to have a model which can be used as a knowledge base. The most commonly used knowledge base format is the ontology. More specifically, semantic service contracts can be modeled using a standard ontology language, OWL-S (formerly DAML-S). OWL-S is an upper ontology for services and will be used to capture semantic information about services in a service inventory. The goal of OWL-S is to support automatic service discovery, invocation, composition and interoperation.

OWL-S models three essential types of knowledge about a service:

  • The service profile tells "what the service does" in a way that is suitable for a service-seeking consumer (or a software agent acting on behalf of a service-seeking consumer) to determine whether the service meets its needs. The profile models the inputs, outputs, pre-conditions and effects (IOPEs) of a service. The inputs and outputs in the profile refer to concepts in a published ontology.
  • The service model tells a consumer how to use the service, by detailing the semantic content of requests (inputs) and responses (outputs), the pre-conditions of the service, quality of service parameters, and the possible effects the service has on its environment.
  • A service grounding specifies the details of how a software agent can access the service. Typically a grounding will specify a communication protocol, message formats, and other service-specific details such as port numbers used in contacting the service. The grounding also specifies unambiguously the way of exchanging data for each input or output specified in the ServiceModel.


Figure 2 – The top level of the OWL-S ontology


Thanks to the OWL-S grounding, it is possible to ground a semantic service to a Web Service or to any other type of service (EJB, POJO, ...), maximizing the level of loose coupling between the consumer and the service technology.
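
To give an impression of what an OWL-S description looks like as plain RDF, here is a hedged sketch that assembles the skeleton of Figure 2 with Jena. It assumes the OWL-S 1.1 namespaces; the service and concept URIs are hypothetical, and a real description would also type the parameters and attach a process model and grounding.

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;

public class OwlsSkeleton {
    // Assumed OWL-S 1.1 namespaces.
    static final String SERVICE = "http://www.daml.org/services/owl-s/1.1/Service.owl#";
    static final String PROFILE = "http://www.daml.org/services/owl-s/1.1/Profile.owl#";
    static final String NS = "http://example.org/services#";  // hypothetical

    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();

        Resource svc = m.createResource(NS + "FlightBookingService")
                .addProperty(RDF.type, m.createResource(SERVICE + "Service"));
        Resource profile = m.createResource(NS + "FlightBookingProfile")
                .addProperty(RDF.type, m.createResource(PROFILE + "Profile"));

        // service:presents links the service to its profile ("what it does").
        svc.addProperty(m.createProperty(SERVICE, "presents"), profile);

        // Inputs and outputs point to concepts in a hypothetical domain ontology.
        profile.addProperty(m.createProperty(PROFILE, "hasInput"),
                m.createResource(NS + "DepartureCity"));
        profile.addProperty(m.createProperty(PROFILE, "hasOutput"),
                m.createResource(NS + "BookingConfirmation"));

        m.write(System.out, "TURTLE");
    }
}

Serializing this model produces ordinary RDF triples, which a matchmaker can query and reason over just like any other semantic data.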


Service Discoverability

According to Thomas Erl, the Service Discoverability principle can be defined as follows: "Services are supplemented with communicative metadata by which they can be effectively discovered and interpreted". In other words, service discovery is the process of evaluating a consumer goal or request and returning a set of compatible services capable of fulfilling this goal. First the consumer needs to formulate a service request in order to find appropriate services in the registry. For a system to be able to automate the discovery process, the request should define a query based on semantic concepts instead of keywords. This way it is possible to define the goal more precisely and to derive relationships between semantic concepts defined in the request and a service offer. The service query also allows the consumer to define pre- and post-conditions for a service.

The service provider is responsible for the service advertisements, describing a service offer with respect to functional capabilities, non-functional aspects, etc. The complete semantic service lifecycle model is depicted in the diagram below:



Figure 3 – The semantic service lifecycle.


The matching component is crucial in the semantic discovery process. It takes the service request and matches it with the available service advertisements. The actual matching is a pairwise comparison of a service advertisement and a service request. More specifically, the matching engine calculates the similarity between the concepts in the service request and those used in the service advertisements using an ontology reasoner, such as the one provided by the Jena framework (http://jena.apache.org/).

The OWL-S service profile contains enough information for a matchmaker to determine whether a service satisfies the requirements of a consumer. Several matchmaking algorithms rely on the matching of inputs and outputs of the service profiles. Four matching degrees can be identified:

  • Exact
    In case an output of an advertised service is a concept equivalent to a requested output, or the advertised output is an immediate superclass of the requested output, an exact match is considered between these concepts.
  • Plugin
    If an advertised output subsumes a requested output, the relation between these concepts is weaker than in the exact match, since the subsumption is inferred by the reasoner. Consequently the matchmaker component will infer whether the advertised output can be plugged in where the required output is expected.
  • Subsume
    The advertised output is a subclass of (i.e., is subsumed by) the requested output.
  • Fail
    We consider to have a fail match between concepts in case none of the above conditions are true.

These degrees are ranked as follows:

Exact > Plugin > Subsume > Fail

This means that exact is a more desirable match than plugin, etc.
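
By way of illustration, the sketch below computes this pairwise degree on top of Jena's ontology API. It is deliberately simplified: the travel concepts are hypothetical, only asserted or inferred equivalence counts as Exact, and real matchmakers refine these checks considerably.

import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;

public class MatchingDegrees {
    enum Degree { EXACT, PLUGIN, SUBSUME, FAIL }

    // Pairwise comparison of an advertised concept and a requested concept.
    static Degree match(OntClass advertised, OntClass requested) {
        if (advertised.equals(requested) || advertised.hasEquivalentClass(requested)) {
            return Degree.EXACT;    // same concept
        }
        if (requested.hasSuperClass(advertised)) {
            return Degree.PLUGIN;   // advertised subsumes requested
        }
        if (advertised.hasSuperClass(requested)) {
            return Degree.SUBSUME;  // advertised is more specific than requested
        }
        return Degree.FAIL;
    }

    public static void main(String[] args) {
        String ns = "http://example.org/travel#";  // hypothetical ontology
        OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
        OntClass booking = m.createClass(ns + "BookingService");
        OntClass flight = m.createClass(ns + "FlightBookingService");
        flight.addSuperClass(booking);

        System.out.println(match(booking, flight));  // PLUGIN
        System.out.println(match(flight, booking));  // SUBSUME
    }
}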

The combination of four matching methods leads to an even more fine-grained ranking scheme. We identify profile hierarchy, input/output type, precondition/effect and QoS constraint matching. Two of them, profile hierarchy and input/output matching, are based on the matching degree calculation described above.

Within the profile hierarchy matching phase, all advertised services presenting a profile that at least subsumes the concepts expressed in the consumer request are considered services potentially fulfilling the consumer's goal.

The input/output parameter matching searches for services that produce an output required as an input of a given service. A matching service is a service that can correctly perform a task if all input concepts defined in the advertisement are satisfied by the consumer and all output concepts defined in the request are satisfied by the advertisement. Next to exact matching, matching is also considered successful when an output type is more specific than (i.e., is subsumed by) a required input type.

A precondition of a service specifies a condition that needs to be satisfied before the service logic can be performed successfully. In contrast with preconditions, post-conditions or effects specify conditions that hold as a result of a successful execution of the service. The matching engine looks for services that result in an effect that can fulfill the preconditions mentioned in the service request. A service is considered a match if all preconditions defined in the advertisement are satisfied by the consumer; in other words, a service can fulfill a consumer's goal if all effects defined in the request are satisfied by the advertisement.

Next to the above-mentioned data (input/output) and functional (preconditions/effects) semantics, we can also identify non-functional semantics referring to quality of service (QoS) or general requirements and constraints. QoS constraint matching will be discussed within the context of service composition in the second part of this series.

Let us now look at an example of how a request is matched with advertised services using profile hierarchy and IOPE matching. Suppose that the user's requirement is "to book a commercial flight from Brussels to London on the 23rd of September 2012". The profile hierarchy matching engine searches for concepts subsuming the CommercialFlightProduct. Based on the domain knowledge base, the concept FlightBookingService is obtained. The system then considers services that are instances of this concept, i.e. RyanAirFlightBookingService, BritishAirwaysFlightBookingService and BrusselsAirlinesFlightBookingService.

Next, the engine performs an input/output matching. All services that provide an output that matches a required input are retrieved. Based on the domain knowledge base, the engine determines that only the BritishAirwaysFlightBookingService can perform the booking using the given departureCity (Brussels) and arrivalCity (London) on the departureDate (23rd September 2012). The third phase is the evaluation of pre- and post-conditions. The engine searches for services that result in an effect that can fulfill the requested precondition.

Both the RyanAirFlightBooking and BrusselsAirlinesFlightBooking services require a flight number and a departure date as input parameters. Because the consumer is not able to provide a flight number, neither of these services can fulfill the consumer's goal on its own. This is where service composition comes into play, which is discussed in the second part of this series.


Conclusion

The second article in this two-part series covers Service Composability and QoS-awareness, and presents the conclusion to the series.


Acknowledgements

I wish to thank my colleagues Jan De Bo and Steve Van Den Buys who reviewed and critiqued my article.