Jose Luiz Berg is a long-term project manager and systems architect specializing in Enterprise Application Integration (EAI). In recent years, Jose has focused his work on implementing Service-Oriented Architecture (SOA) for large Brazilian telecommunications companies. He graduated in computer networks, but also has extensive experience as a programmer in commercial programming languages over the last 25 years. Jose believes that SOA is one of the most important advances in software development in recent decades, as it involves not only a change in the way we work, but also a significant change in how companies see themselves and their IT resources. This advancement carries a risk: many companies are being convinced by bad software vendors that SOA is only about creating Web services, without focusing on what it really stands for. By doing so, they fail to realize that this is an important part of history in the making.
Security and Identity Management Applied to SOA - Part I Published: August 1, 2014 • Service Technology Magazine Issue LXXXV
In recent years, security threats against corporate systems have been increasing exponentially, both in quantity and in complexity. Despite the efforts of security system vendors to keep pace with this growth, the war is being lost day by day, not so much because of a lack of quality in network controls, but mainly because of developers' lack of knowledge on the subject, which results in insecure systems, especially those developed internally for corporate use.
Recently we have seen an explosion in the adoption of identity management systems, and in the use of SSO (single sign-on) services to facilitate the management of credentials, whether on a company's own infrastructure or in the cloud. The adoption of this technology by itself already improves the security of applications, because authentication is a critical service that becomes centralized and controlled by specialized systems, following well-known security best practices. However, even with this improvement, there are numerous other security issues that still need to be addressed to ensure the security of applications developed in house, especially when dealing with Web Services.
This document aims to present concepts of security and identity management applied to systems development, notably the construction of Web Services, and to help reduce programmers' lack of knowledge of the procedures and standards that allow applications to perform operations safely, with the lowest possible impact on the development and operation process.
For an application to be considered absolutely safe, it could not be connected to a network, and it should run on a server without external drives or USB ports, used by a single highly skilled operator with access to the physical facilities through biometric controls. However, such an application would not have much practical utility (except for Hollywood scriptwriters). So developing a secure application always involves a series of trade-offs between security and functionality, i.e. relaxing controlled aspects of security to allow the features required by the business requirements.
However, the current scenario of systems development is not exactly that: it is common for developers to know little or nothing about secure development, and to assume that security will be provided by a system external to the application, such as network firewalls, application firewalls, identity management, IPS, IDS, development frameworks, or any other "magic" solution that will solve all their problems. Although these products can help, if the application is not designed with security as one of its business requirements, they can do very little. According to expert assessments, today around 75% of applications developed within companies themselves have very serious security breaches, to the point of being among the first points of attack when a hacker tries to gain control of a company's servers.
With the use of Web Services, the situation becomes even more complicated, because the practical effect of their use is to break down the barriers between applications, making everything work like one great application, where each part performs its specialized tasks and calls on the others. In this context, it is very common for applications to require authentication and perform security controls through ACLs when the user is accessing the interface, but to abandon all controls when the same routine runs as a service, normally executing under a privileged user or without authentication, without even registering which user requested the execution. Even worse, these routines are usually exactly the most critical ones, because if they were not, no one would spend resources to wrap them into services.
To begin to better understand these problems, and how to solve them, we first need to introduce the main concepts of security involved in application development, particularly in building services.
The Main Concepts of Security
Splitting security concepts into logical components makes them easier to understand and, therefore, easier to deploy. These logical components are confidentiality, integrity, non-repudiation, authentication, authorization, privacy, and audit readiness. These are the requirements that we need to keep in mind when designing a secure system; if any of them is neglected, security holes appear.
The data in a networked system is always either in transit or at rest. In the world of information security, the term confidentiality is used to state that data in transit between two communicating parties should not be available to third parties who have access to the transmission media. There are two general approaches to confidentiality: one is to use a private connection between the two parties, such as a dedicated line or a virtual private network (VPN). The other, used when data is sent through an untrusted network such as the Internet, is encryption.
In the public's mind (and most programmers'), encryption is usually seen as synonymous with security. This public awareness of cryptography is due in part to the success of the secure sockets layer (SSL) in business-to-consumer (B2C) e-commerce. One of the features of SSL is the encryption of data that passes between a browser and the server. The lock icon in browsers provides a graphic reminder that encryption is occurring (it also indicates that the site has been authenticated), telling users that snoops cannot read the data being transmitted. Encryption is undoubtedly important, both for confidentiality and as the basis of other security technologies such as digital signatures. However, it is not some kind of magic powder added to a system to make it automatically safe. Remember, it is just one of the security components. A Web-based system is not "safe" simply because it uses SSL for confidentiality; the other security requirements also need to be addressed, as we will see below.
Never think of security and encryption as synonyms. Strong security depends not only on encryption technologies, but also on content filtering, care, and good sense. Adding encryption does not make a system secure; it only ensures that third parties cannot see data during its transfer to and from the server. Even then, if the certificates are compromised, or the encryption algorithms are old and weak, the encryption can still be broken.
In the field of information security, integrity has a special meaning: it does not mean that information cannot be tampered with, but that if it is, the tampering can be detected. In an untrusted network, it is usually impossible to guarantee that data is tamper-proof while in transit to its destination. Therefore, checking whether there has been any tampering is the best we can do to ensure integrity. To accomplish this task, we rely on mathematical algorithms known as hash algorithms. A hash algorithm receives a block of data as input and produces a much smaller block of data as output. This output is sometimes called a "summary" (or hash) of the data. The summary relates to the original data as follows: if even a single character is changed and the hash algorithm is run again, the resulting summary will be completely different from the original, so by comparing the two hashes we can identify any tampering of the data.
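As a quick illustration of tamper detection, here is a minimal sketch in Python using the standard library's SHA-256; the message text is, of course, hypothetical:

```python
import hashlib

def digest(data: bytes) -> str:
    # SHA-256 produces a fixed-size 32-byte summary of any input
    return hashlib.sha256(data).hexdigest()

original = b"Transfer $100 to account 1234"
tampered = b"Transfer $900 to account 1234"

# A single changed character yields a completely different hash,
# so comparing digests reveals any tampering.
print(digest(original) == digest(original))  # True: data unchanged
print(digest(original) == digest(tampered))  # False: tampering detected
```

Note that the digest has the same size whether the input is one line or one gigabyte, which is what makes hashing cheap enough to run on every message.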
A hashing algorithm alone cannot ensure integrity. Imagine a scenario where a hash is appended to a message before it is sent to a recipient. An attacker who intercepts the message could alter the data, compute a new hash, and append it to the changed message. The change would not be detected, and integrity would be broken. To ensure the integrity of messages, we need another feature: asymmetric encryption. Explaining the whole encryption process is beyond the scope of this document, so we will present only its main characteristics, to facilitate understanding:
Modern encryption algorithms always use pairs of complementary cryptographic keys, usually called public and private keys. There is a lot of advanced math involved, but the main idea is that a text encrypted with one key of the pair can be decrypted only with its complementary key. In fact, there is no technical difference between the two keys, except that one is kept secret (the private key) and the other is publicly disclosed (the public key). We can therefore encrypt a text with our private key, and anyone who holds the public key can decrypt the message. By itself this would not help much, because if there is a public key that can decrypt the data, anyone could do so. However, if we encrypt using the recipient's public key, we ensure that only he can decrypt the message. There is yet another possibility, which is to encrypt multiple times: if user A wants to send a private message to user B, he can encrypt the message with user B's public key and then re-encrypt it with his own private key. To read the message, the receiver must use A's public key (which, after all, is public) and his own private key. As only user B has access to his private key, we ensure that only user B can read the message. Besides guaranteeing the confidentiality of the message, this gives us another advantage: it ensures that the message was sent by user A (or at least by someone with access to his private key). This additional advantage will be important down the road when we start talking about non-repudiation.
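The complementary-key behavior can be sketched with a deliberately tiny "toy" RSA key pair in Python. The numbers below are illustrative only; real systems use keys of 2048 bits or more with proper padding schemes, and should never use values this small:

```python
# Toy RSA with tiny primes, purely to illustrate that the two keys
# of a pair are mathematically complementary.
p, q = 61, 53
n = p * q                 # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                    # public exponent  -> public key  (e, n)
d = pow(e, -1, phi)       # private exponent -> private key (d, n)

message = 65  # in this toy model, a message is a number smaller than n

# Encrypt with the public key: only the private key can decrypt.
cipher = pow(message, e, n)
assert pow(cipher, d, n) == message

# Encrypt with the private key: anyone holding the public key
# can recover the message, proving who produced it.
signed = pow(message, d, n)
assert pow(signed, e, n) == message
```

Either key undoes what the other did, which is exactly the property the three usage patterns below rely on.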
The encryption process can therefore be used in three different ways: if we use only the recipient's public key, we ensure that only he can read the message; if we use our own private key, we guarantee that we sent the message; and if we use both keys, we guarantee both things at the same time. So why not always use both keys? As we said before, security is a matter of trade-offs: the math required to encrypt and decrypt data consumes CPU and slows the delivery of messages, so we should use only the features each case requires, ensuring security while using the least amount of system resources.
This encryption process can be used to encrypt the hash calculated over the data. Thus, if the data is changed during transmission, the change will be detected at the destination, ensuring integrity. We could simply encrypt the entire message, but that would be costly for the system. We said earlier that the hash is small and has a fixed size, which means that encrypting it takes a relatively small and constant time, while encrypting the data takes time proportional to the size of the message. Therefore, we must always stick to the requirements: if we need confidentiality, we encrypt the entire message (or only its sensitive parts, as we shall see later); if we need only integrity, we encrypt only the hash.
The inclusion of an encrypted hash in a message is called a "digital signature". It is similar to sending a printed, signed document authenticated by a notary. The digital signature is even safer than the printed one, because any information tampered with in the document will cause it to be invalidated and rejected upon receipt.
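A minimal sketch of sign-and-verify in Python, combining a hash with a toy RSA key pair. The tiny key and the `% n` reduction of the digest are illustrative only; real signature schemes use keys whose modulus comfortably exceeds the hash size:

```python
import hashlib

# Toy RSA key pair (illustrative tiny numbers, never for real use)
p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)
e = 17                    # public exponent
d = pow(e, -1, phi)       # private exponent

def sign(message: bytes) -> int:
    # Hash first, then encrypt only the small fixed-size digest
    # with the private key (far cheaper than encrypting the message).
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(h, d, n)

def verify(message: bytes, signature: int) -> bool:
    # Anyone holding the public key can recompute the hash and
    # compare it with the decrypted signature.
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(signature, e, n) == h

msg = b"pay invoice #42"
sig = sign(msg)
print(verify(msg, sig))  # True
```

A tampered message would almost certainly fail verification, since its hash would no longer match the one recovered from the signature.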
As we saw earlier, if we use our private key to encrypt the hash, in addition to ensuring the integrity of the message, we guarantee that we created it. This is a very important feature, especially when dealing with Web Services that may be executed from outside your network, where we have no control over authentication or audit services to verify who requested the operation.
Non-repudiation, then, is simply the guarantee that the user who sent the message is the holder of the corresponding private key; since we can be sure it was really he who sent the message, he cannot repudiate its authorship.
Authentication is one of the main functions of security: to control access to system resources, we first need to identify who is requesting access. Most people confuse the authentication process with typing a username and password, but in reality authentication consists of the unambiguous identification of the party requesting access. Such identification can occur in several ways, usually characterized in three groups: "what I know" (passwords and PINs), "what I have" (tokens, smart cards, and digital certificates), and "what I am" (biometrics).
As we can see, there are numerous forms of authentication, and the more modern and secure trend is to combine several of them, drawing on two or three of these groups, especially when running sensitive system routines. Another interesting point is that each type of authentication offers different security guarantees, so an application may accept login and password for general access, but require biometric authentication for a specific financial routine. This implies that the authentication system needs to tell the application what kind of authentication was performed, so the application can define which resources the user may access with that authentication, and request a more secure one if necessary.
Another relevant point is that even in the traditional login-and-password process, the login only identifies the user; validation is done using the password. So today, in many applications, the concept of login has been replaced by any information that identifies the user, such as e-mail, national ID, phone number, or mother's name. This gives the user less to remember, facilitating access.
One of the important features of modern security systems is not storing passwords. Normally, when you change your password, the system calculates and stores only its hash. When you provide your password for authentication, the system calculates the hash and compares it with the stored one, granting access only if they are equal. Thus, it is impossible to break into the system and steal passwords, because the system itself stores none. And if you forget your password, the administrator can change it, but there is no way to recover the old one.
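A minimal sketch of this scheme in Python, using the standard library's PBKDF2. The random salt defeats precomputed hash tables, and the iteration count slows brute-force attempts; the values shown are common but illustrative choices:

```python
import hashlib
import hmac
import os

def store_password(password: str) -> tuple[bytes, bytes]:
    # Store only a salted, deliberately slow hash, never the password.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(candidate, stored)

salt, stored = store_password("s3cret")
print(check_password("s3cret", salt, stored))  # True
print(check_password("guess", salt, stored))   # False
```

Note that nothing recoverable is kept: even with the salt and the stored digest in hand, the original password cannot be read back, only re-tested.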
Authorization is a security requirement closely related to authentication, because it acts on the credentials of the authenticated user. However, it is important to define the difference between them: while authentication is about "who you are", authorization is about "what you are allowed to do". Just because a user was authenticated does not mean that he is authorized to perform any operation. Authorization systems allow an administrator to manage an access control policy for resources, which is stored in a corporate directory. This information about user rights is provided to applications and validated against the security requirements of each routine, defining which routines should be allowed for each user.
With this definition, we can separate authorization into two stages: establishing the privileges required to execute each routine, and validating that the user possesses those privileges before executing it. These two stages preferably execute at different points and moments, being named respectively the PDP (policy decision point) and the PEP (policy enforcement point).
In regular applications, validation is done when the session is established: all needed information is stored in the session and the menu is rendered containing only the routines the user is authorized to execute. With the introduction of services, however, this practice needs to change, because when a service executes there is no interactive session and no menu is used. Therefore, the authorization check should be performed at the beginning of each routine, especially the critical ones. Of course, the menu still needs to be presented to regular application users, but each routine must revalidate authorization, even though the same validation was performed when building the menu. This practice is extremely important with the increasing use of Web Services and the modularization of systems, where a single call may chain several service routines, often without the knowledge of those responsible for building the original routines. If authorization is not verified in the routine itself, this opens up space for its unauthorized execution.
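A minimal Python sketch of per-routine enforcement; the in-memory role registry and the routine names are hypothetical, standing in for a real identity provider and real business routines:

```python
import functools

# Hypothetical role registry: in a real system, roles would come
# from an identity provider or corporate directory, not a dict.
USER_ROLES = {"alice": {"finance", "reports"}, "bob": {"reports"}}

class AuthorizationError(Exception):
    pass

def requires_role(role):
    """Policy enforcement point: re-check authorization inside the
    routine itself, not only when rendering the menu."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            if role not in USER_ROLES.get(user, set()):
                raise AuthorizationError(f"{user} lacks role {role!r}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator

@requires_role("finance")
def pay_invoice(user, invoice_id):
    # The check above runs on every call, whether the routine was
    # reached from a menu, another service, or a chained call.
    return f"invoice {invoice_id} paid by {user}"

print(pay_invoice("alice", 42))  # invoice 42 paid by alice
# pay_invoice("bob", 42) would raise AuthorizationError
```

Because the check travels with the routine, it still fires when the routine is invoked as a service, with no session or menu involved.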
Another very important point in the construction of applications, especially when using application servers, is the definition and use of ACLs. On these systems, the PDPs are built through ACLs, or access control lists, usually consisting of an XML file written by programmers that defines the roles needed to perform each routine. Although this practice normally does not pose any threat, additional care must be taken when amending these ACLs, avoiding typing mistakes and ensuring that no external user has rights even to read these XML files.
Privacy is a requirement that might not come immediately to mind when discussing information security, and the concept is sometimes confused with confidentiality. We have already seen that confidentiality is the requirement that data in transit not be available to eavesdroppers. Privacy belongs to the rights of the owner of the data. Many privacy protection laws are now in effect around the world, demanding that private data not be used without prior consent. Many reported Internet security violations are violations of privacy. When credit card data is stolen on the Internet, it is rarely a violation of confidentiality at the data transport level, because encryption is almost universally used for data in transit on the Internet. However, if sensitive data is stored in a database without any protection (such as encryption), and the system is invaded and the data copied, we have a violation of privacy, because the data is disclosed without the permission of its owners. Web Services offer a new way to access information and, therefore, a new way of violating privacy if security rules are not properly applied.
Availability is another requirement that is not obviously a security requirement. However, if critical information is not available when needed, the result can be catastrophic for the business. Like any Web service, security services require availability. For example, a certificate revocation list is used for non-repudiation; if it is unavailable, the non-repudiation feature is lost.
Another important point is that availability is not defined simply as having access to the resource or not, but as having access in time to meet business needs. A service that takes too long to run also violates the availability requirement. As we noted at the beginning of this article, security is always a trade-off, and the availability required of the service is one of the fundamental components in that evaluation.
Another aspect often underestimated by developers is auditing, so let us first clarify the concept: auditing does not mean simply generating log files with everything that happens and then sweeping those logs to find the desired operations. An accurate audit means defining which operations need to be recorded and which information should be stored, creating an audit trail, preferably on a medium different from the log records used to find errors in the system.
The introduction of services has created a bigger problem for auditing, because a single business transaction may involve services executed in different applications, making it often necessary to collect information from various trails and order it by time of occurrence to reconstruct the sequence of operations performed.
However, the generation of audit trails (or even logs) must be done with great care, because sensitive data often ends up being recorded in these files. Any beginner hacker knows that finding and reading log files is one of the easiest ways to break the security of a system. Access to the file itself usually means a violation of privacy, and clear-text passwords are often found in these files. Another common problem is the wrong configuration of file access permissions, allowing a hacker to execute any operation and afterwards clean up the logs, covering his tracks.
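A minimal Python sketch of writing a structured audit record that redacts sensitive fields before anything reaches disk; the field names and operations here are hypothetical:

```python
import datetime
import json

# Hypothetical list of field names that must never appear in a trail.
SENSITIVE = {"password", "card_number"}

def audit_record(user: str, operation: str, params: dict) -> str:
    # Redact sensitive values before serialization, so even a stolen
    # audit trail reveals no secrets.
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user": user,
        "operation": operation,
        "params": {k: ("***" if k in SENSITIVE else v)
                   for k, v in params.items()},
    }
    return json.dumps(record)

entry = audit_record("alice", "payment",
                     {"card_number": "4111-...", "amount": 10})
print(entry)  # card_number appears only as "***"
```

Keeping the trail structured (one JSON record per operation) also makes it practical to merge and order trails from several applications by timestamp, as described above.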