(SPECS) TLS enforcement
This document is a draft!
This document was created in March 2014, thus some of the information within could be out-of-date!
Discussions of "cloud security" touch on many topics: coding mistakes, complex vulnerabilities, distributed attacks, new cryptographic primitives, the analysis of old ones, debates about the trade-off between the value of data and the cost of securing it, and so on. However, very few of these discussions focus on server-side TLS (Transport Layer Security) usage [TLS].
Probably one of the main reasons why server-side TLS is so sparsely covered is that the technology is taken for granted and believed to be easy to get right. In fact, if one studies the available documentation for some of the most popular HTTP servers and proxies that also offer TLS termination (such as [nginx-SSL], [HTTPD-SSL], [LigHTTPd-SSL], [HAProxy-conf]), one gets a very good idea of the configuration knobs and their syntax, but almost no information on how those knobs should be set to comply with a desired policy; two noteworthy exceptions are [SSL-Labs-BP] and [OWASP].
The consequence is that the services offered by many institutions and enterprises (including online banking and e-commerce) are improperly secured, due to misconfiguration or a misunderstanding of the performance versus security trade-offs. For example, according to [SSL-Labs], via their [SSL-Pulse] service, the following are a few telling numbers about the deployment of TLS in March 2014:
- ~75% of all the surveyed sites are insecure;
- ~30% allow clients to use ciphers weaker than 128 bits;
- ~25% still allow the usage of SSLv2.0, while only ~30% implement TLSv1.1 or TLSv1.2;
- only ~5% support forward secrecy with modern browsers;
- meanwhile more than 50% don't support forward secrecy at all;
It would be useful (and not only for SPECS users) to implement a set of tools and services that help the operator alleviate some of the issues mentioned above, namely the proper configuration of server-side TLS. These solutions can range from tools that assess the quality of the security offered by a particular service deployment (which would be part of the monitoring platform), through expert systems which, given a set of policies, provide the user with a matching protocol configuration (independent of the implementation used), up to concrete implementations of TLS termination services. However, any solution that we provide must expose a simplified set of concepts, as opposed to the current approach of exposing every knob of the TLS protocol and its extensions, thus preventing operators from, proverbially, "shooting themselves in the foot".
Therefore we can state that the primary goal is to provide the operator with a coherent and orthogonal set of concepts that completely hides the underlying technicalities and thus results in secure deployments. This is based on the observation that anyone who wants to tweak a certain hidden knob should know about all the other knobs and their interactions, and is therefore an expert in the domain who doesn't need such an aid. As a consequence the resulting solutions are (to our best effort) secure by default, and misconfiguration is impossible (i.e., secure by design).
From that primary goal we can derive secondary ones: identifying what we shall call "capabilities"; aggregating them into "policies", for which we shall provide a model; and following up with proper implementation and documentation.
After finding the main use-cases, we must identify a set of capabilities, each representing a fine-grained, but well-defined and coherent, set of requirements that can be translated into concrete configuration instances. For example, one such capability can be "forward-secrecy" [ForwardSecrecy-1], which implies the usage of certain cipher suites, but also disabling certain other features, possibly breaking backward compatibility with older software. One can easily observe that such capabilities can overlap or even conflict with each other, thus the purpose of the developed tools is to advise users on the implications of their choices. Furthermore, keeping in mind the primary goal of eliminating the possibility of misconfiguration, many (if not all) of these capabilities should have no parameters themselves. For example, instead of providing a capability called "encryption-level", parametrized by the level (say "high" through "low"), and one called "interoperability" (similarly parametrized), and letting the user choose and mix them, we should rather create a few capabilities that evoke their actual use-case, such as "confidentiality", "interoperability" and "performance", because most likely the requirements of these are incompatible (e.g., older clients don't support better algorithms).
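The idea of parameterless capabilities aggregated into policies, with the tooling warning about conflicts, could be modelled roughly as follows. This is only a minimal sketch: the `Capability` members, the `CONFLICTS` table and the `check_policy` helper are hypothetical names introduced for illustration, and the listed conflicts merely restate the examples given above.

```python
from dataclasses import dataclass
from enum import Enum


class Capability(Enum):
    """Parameterless capabilities; the set of names is an illustrative assumption."""
    CONFIDENTIALITY = "confidentiality"
    FORWARD_SECRECY = "forward-secrecy"
    INTEROPERABILITY = "interoperability"
    PERFORMANCE = "performance"


# Hypothetical conflict table: pairs of capabilities that are unlikely to be
# satisfiable together, each with the reason shown to the operator.
CONFLICTS = {
    frozenset({Capability.FORWARD_SECRECY, Capability.INTEROPERABILITY}):
        "forward secrecy may break compatibility with older software",
    frozenset({Capability.CONFIDENTIALITY, Capability.INTEROPERABILITY}):
        "older clients don't support better algorithms",
}


@dataclass(frozen=True)
class Policy:
    """A named aggregation of capabilities."""
    name: str
    capabilities: frozenset


def check_policy(policy):
    """Return one warning per conflicting capability pair found in the policy."""
    warnings = []
    for pair, reason in CONFLICTS.items():
        if pair <= policy.capabilities:  # both capabilities were requested
            names = " + ".join(sorted(c.value for c in pair))
            warnings.append(f"{names}: {reason}")
    return sorted(warnings)
```

Because the capabilities carry no parameters, the only decision left to the operator is which ones to request; the tool can then advise on the implications instead of rejecting individual knob values.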
Documenting the resulting solutions and providing best practices should be as important as the actual implementation. Furthermore, we could also touch on ancillary topics such as the proper life-cycle of cryptographic material, implementation vulnerability assessment, and configuration validation (and even enforcement) of related technologies.
In order to meet the goals stated in the previous section, we must provide at least the following functionalities, always keeping in mind that although we handle these aspects, they shouldn't be exposed to the operator unless strictly required.
It must be noted that some of these functionalities will be embodied as custom tools or as extensions to existing solutions, while others recur throughout all of them (such as cryptographic material life-cycle, policy enforcement or vulnerability assessment). Therefore below we list only the crosscutting concerns, possibly hinting at their usage, but each solution will certainly involve a mixture.
Starting with the easiest task, we should provide the user with tools that enable:
- creation of CSR (Certificate Signing Request) for a set of specific use-cases, such as server or client, taking into account the best practices to derive other attributes, such as expiration, etc.;
- in case client certificates are required, the proper configuration of servers to check the CRL (Certificate Revocation List) and OCSP (Online Certificate Status Protocol) responder of the issuing CA (Certificate Authority);
- in case client certificates are required, but only custom CAs are accepted, the management of their certificates independently of the trusted root certificates provided by the OS;
- if needed, a simplified local CA, for issuing certificates used internally inside the application, most likely between its components, or when authenticating clients based on certificates issued by the application itself; the focus, as opposed to other CA solutions, is on unsupervised automation, thus balancing this property with the requirement of security;
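As an illustration of the first item, a tool could derive the CSR attributes from the use-case alone and hand them to an existing utility. The sketch below only builds an `openssl req` command line and does not run it; the `PROFILES` table, its values and the `csr_command` helper are hypothetical, and the `-addext` option assumes a reasonably recent OpenSSL (1.1.1 or later).

```python
import shlex

# Hypothetical per-use-case profiles; the values are illustrative only and
# are not recommendations taken from this document.
PROFILES = {
    "server": {"key": "rsa:2048", "eku": "serverAuth"},
    "client": {"key": "rsa:2048", "eku": "clientAuth"},
}


def csr_command(use_case, common_name, basename):
    """Build (but do not execute) an openssl command that creates a key and a CSR."""
    profile = PROFILES[use_case]  # raises KeyError for an unknown use-case
    return [
        "openssl", "req", "-new",
        "-newkey", profile["key"],
        "-nodes",  # key is written unencrypted; see the life-cycle discussion
        "-sha256",
        "-keyout", f"{basename}.key",
        "-out", f"{basename}.csr",
        "-subj", f"/CN={common_name}",
        "-addext", f"extendedKeyUsage={profile['eku']}",
    ]


if __name__ == "__main__":
    # Print the command so that it can be reviewed before being run.
    print(shlex.join(csr_command("server", "www.example.com", "www")))
```

The operator chooses only "server" or "client"; every other attribute follows from the profile, which is where best practices would be encoded.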
Cryptographic material life-cycle
In computer security a system is said to be only as secure as its weakest link, and in the case of TLS the weakest link can in fact be the private key itself; the best solution would therefore be to use HSMs (Hardware Security Modules). However, "in the cloud" (leaving aside the many layers of virtualization, each adding potential attack vectors) such a solution is infeasible on a large scale due to the costs it incurs. For example, one could imagine using Amazon's CloudHSM [AWS-HSM] for a couple of large EC2 VMs playing the role of TLS terminators; but using one for each internal VM (or even service) would be prohibitive.
Therefore, given the fine balance between cost and security, we can accept a lower level of security and manage the cryptographic material (i.e., the private keys) in software. However, we must find better alternatives than the currently accepted solution of storing private keys as plain, unencrypted files; and although the automation can't be eliminated, we can strive to reduce the window of opportunity during which an attacker can obtain the unsecured cryptographic material.
TLS cipher suites and main parameters (policy enforcement)
Although the selection of cipher suites (which algorithms are used for tasks such as authentication, encryption, digesting, etc.) and the applicable standards (such as FIPS or PCI) can be treated as separate topics, they go hand in hand, because a certain cipher suite may be mandated (or forbidden) by the standard with which the service must comply. Besides the cipher suite, the standards also specify acceptable values for other parameters, although these are less visible.
Thus, what we have called in the goals section "capabilities" and "policies" must now be translated into concrete values for a particular implementation.
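To make this translation step concrete, the following sketch renders a set of capability names into nginx directives. The directive names (`ssl_protocols`, `ssl_ciphers`, `ssl_prefer_server_ciphers`) are real nginx settings, but the `NGINX_RULES` table, the chosen values and the `render_nginx` helper are assumptions made purely for illustration and not a vetted policy.

```python
# Hypothetical mapping from capability names to nginx directives.
# The values are examples only and have not been validated against any standard.
NGINX_RULES = {
    "forward-secrecy": {
        "ssl_ciphers": "EECDH+AESGCM:EDH+AESGCM",
        "ssl_prefer_server_ciphers": "on",
    },
    "confidentiality": {
        "ssl_protocols": "TLSv1.2",
    },
}


def render_nginx(capabilities):
    """Merge the directives implied by each capability into a config fragment.

    Raises ValueError when two capabilities demand different values for the
    same directive, so a conflicting policy never reaches the server.
    """
    directives = {}
    for capability in sorted(capabilities):
        for key, value in NGINX_RULES[capability].items():
            if directives.get(key, value) != value:
                raise ValueError(f"conflicting values for {key}")
            directives[key] = value
    return "\n".join(f"{key} {value};" for key, value in sorted(directives.items()))
```

A second table of the same shape could target another implementation (for instance HAProxy), which is what keeps the policy itself independent of the software used.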
Other TLS parameters and extensions
Although not strictly related to security, there are a few aspects that must be taken into account when configuring certain services, and for which we should provide the necessary tooling:
- SNI (Server Name Indication), which allows a certain service to have different identities (such as virtual hosts in case of web servers) [TLS-extensions];
- ALPN (Application Layer Protocol Negotiation) / NPN (Next Protocol Negotiation), which allows a certain service to provide multiple, possibly incompatible, protocols (such as HTTP, SPDY, or the emerging HTTP/2.0, in case of web servers);
- TLS tickets or session caching, enhancing performance at the expense of security;
- TLS compression, whose enablement opens the service to certain exploits (such as [CRIME]);
- TLS renegotiation, which again opens attack vectors (such as [TLS-renegotiation]);
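As a rough illustration of how the parameters listed above map onto one concrete implementation, the sketch below configures a server-side context with Python's standard `ssl` module. The function names and the default ALPN list are assumptions; loading the certificate and private key (`load_cert_chain`) is deliberately omitted.

```python
import ssl


def hardened_server_context(alpn_protocols=("h2", "http/1.1")):
    """Create a server context with the risky features above switched off."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.options |= ssl.OP_NO_COMPRESSION  # TLS compression enables CRIME-style attacks
    ctx.options |= ssl.OP_NO_TICKET       # trade session-ticket performance for security
    if hasattr(ssl, "OP_NO_RENEGOTIATION"):
        # Only available when the underlying OpenSSL supports it.
        ctx.options |= ssl.OP_NO_RENEGOTIATION
    ctx.set_alpn_protocols(list(alpn_protocols))
    return ctx


def install_sni(ctx, contexts_by_host):
    """Select a per-host context (one identity per virtual host) via SNI."""
    def pick(ssl_socket, server_name, _default_context):
        chosen = contexts_by_host.get(server_name)
        if chosen is not None:
            ssl_socket.context = chosen  # continue the handshake with this identity
        # Returning None lets the handshake proceed.
    ctx.sni_callback = pick
```

A deployment would call `load_cert_chain` on each context before serving, and `contexts_by_host` would hold one such context per virtual host.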
In addition to the actual TLS protocol and its extensions, there are a couple of related technologies which enhance security; as in the previous case, we should provide the necessary tooling:
- HSTS (HTTP Strict Transport Security), which allows a service to alert its clients that all communications (for a certain virtual host and within a certain time frame) must use TLS [HSTS];
- HTTP secure cookies, preventing certain cookies, especially those related to application authentication, from being sent over insecure channels;
- TLS Certificate Status Request (commonly known as OCSP stapling);
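The first two items come down to response headers that the tooling could emit on behalf of the application. The following is a minimal sketch using only Python's standard library; the `security_headers` helper, the cookie name `session` and the one-year `max-age` default are illustrative assumptions.

```python
from http.cookies import SimpleCookie


def security_headers(session_id, hsts_max_age=31536000):
    """Return (name, value) header pairs enforcing HSTS and a secure session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True    # never sent over plain HTTP
    cookie["session"]["httponly"] = True  # not readable from page scripts
    return [
        # Tells the client to use TLS for this host for the given number of seconds.
        ("Strict-Transport-Security", f"max-age={hsts_max_age}"),
        ("Set-Cookie", cookie["session"].OutputString()),
    ]
```

The HSTS header is only meaningful on responses that were themselves delivered over TLS, so a terminator would add it only on its HTTPS listeners.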
Despite the good will and best efforts of the developers (third party or from the SPECS consortium), software defects do exist, and manifest themselves either as service unavailability (in the best case) or as service compromise (in the worst case). Therefore we should augment the provided tools and services with a module which periodically assesses the known vulnerabilities of the system.
However, active techniques (such as actually trying to attack the system) and passive ones (such as monitoring for possible attacks), possibly implemented as complementary SPECS enforcement or monitoring modules, are not widely deployed or are only periodically enabled, because they incur a heavy performance penalty. Instead we would take a much simpler and more focused route: periodically checking the running configuration against a database of known vulnerabilities (such as [CVE]) and alerting the operator of any matches.
The running configuration could be composed of:
- the actual software libraries and tools that comprise the solution;
- the employed TLS versions and extensions;
- the particular TLS cipher suite or parameters used;
- other information easily available to the solution (such as whether HTTP compression is used);
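The periodic check could be as simple as a lookup of each element of the running configuration in a list of advisories. In the sketch below the `Advisory` record, the component names and the `EXAMPLE-` identifiers are placeholders invented for illustration; they are not real [CVE] entries, and a real module would load its records from such a database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    identifier: str      # placeholder identifier, not a real CVE number
    component: str       # element of the running configuration it applies to
    affected: frozenset  # affected versions or feature states


# Hypothetical advisory records used only to exercise the matching logic.
ADVISORIES = [
    Advisory("EXAMPLE-0001", "tls-library", frozenset({"1.0.0", "1.0.1"})),
    Advisory("EXAMPLE-0002", "tls-compression", frozenset({"enabled"})),
]


def match_advisories(running_config, advisories):
    """Return the identifiers of all advisories matching the running configuration."""
    hits = []
    for advisory in advisories:
        # A component missing from the configuration yields None and never matches.
        if running_config.get(advisory.component) in advisory.affected:
            hits.append(advisory.identifier)
    return hits
```

Each returned identifier would then be turned into an alert for the operator, without any active probing of the service.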
As hinted in the previous sections, our purpose is not to reimplement any of the TLS protocols or extensions, but merely to properly combine and configure existing proven solutions. Therefore the following existing technologies are the best candidates to fulfill our goals:
- for X.509 certificate handling, and any other specialized tool, both [OpenSSL] and [GnuTLS] seem viable libraries;
- for TLS termination (independent of the application protocol) there are multiple candidates:
  - well established ones, such as [HAProxy] or [stunnel];
  - new developments or proofs of concept, geared towards performance, such as [bud] or [stud];
- for HTTPS termination (i.e., handling the TLS aspects and then reverse proxying the request to plain HTTP servers), although practically any production-ready HTTP server supports it, there are only a few highlights:
  - pure HTTPS terminators, such as [HAProxy] or [Pound];
  - usual HTTP servers, featuring HTTPS reverse proxying capabilities, such as [nginx];