This document is part of the MODAClouds deliverable D6.1, thus superseded by it.
This document was created in March 2013, so its contents may now be outdated.
Although many of the surveyed (and other existing) solutions are production-ready --- or, even better, backed by powerful companies in the IT sector --- and offer many features, we must focus our effort on determining whether they are a good match for MODAClouds' requirements, described in a later section. Such a goal implies two separate conditions:
- first of all, they should be suitable for our industrial partners' case study applications; this in turn implies matching the supported programming languages, the palette of available resources and middleware, and, not least, the security requirements;
- and, in order to fulfill our project's goal, they must provide a certain flexibility, so that our run-time environment can integrate with them and provide enhanced services and support for the user's application.
Therefore, we are especially interested in the following aspects:
One of the categories mentioned at the beginning of the section, which broadly describes the purpose of the solution and the range of features it offers.
- PaaS --- fully integrated solution that abstracts away all low-level details of the deployment and execution;
- application execution --- suitable only for application execution, meaning that it doesn't manage the host environment it runs in, like operating system, machine, etc.; (classical examples would be Tomcat and derivatives, Ruby on Rails, etc.;)
- application deployment --- as above suitable only for application deployment, thus implying that the environment be provided by other means; (classical examples would be package managers, Capistrano for Ruby, etc.;)
- server deployment --- suitable to deploy the entire host environment, possibly even including the application deployment, but it will still require an application execution solution; (classical examples would be Chef or Puppet;)
- task automation --- low-level tools that, if required, would allow us to quickly implement our own solution fitting one of the above categories; (classical examples would be Ant for Java, Fabric for Python, etc.;)
- library --- the described solution is actually a library to be used inside our programs; here we include also platforms or frameworks, which although more complex than libraries, are still used only to develop applications;
- service --- stand-alone services which on their own don't provide direct benefits, but which are either used as dependencies of our environment or, if integrated, would add value to it and thus to our users; (for example database systems, various middleware, logging or monitoring systems, SaaS, etc.;)
- standard --- although not a ready-to-use solution, this could be a protocol, data format, set of guidelines or other kind of specification that could prove useful to implement or follow ourselves;
In short, how mature, or production-ready, is the solution? Does it have a supportive community built around it?
- emerging --- usually either a very popular solution, or one backed by a large company, but not yet reaching or surpassing the beta status;
- prototype --- maybe not the best solution to adopt, but it could have important features that we could leverage or re-implement;
- legacy --- although not a choice for most new developments, it could prove important to address, because it either has a large deployment base, or it is mandated by one of the case studies;
- application domain
What would be the main flavor of targeted applications?
- web applications;
- map-reduce applications;
- generic compute-, data-, or network-intensive applications;
- application architecture
Broadly matching a targeted application architecture.
- 2-tier applications --- monolithic applications that, besides the data storage or communication layer, have a single layer handling all the concerns, from user interface to logic;
- n-tier applications --- SOA-inspired applications where parts of the application are clearly identified as independent layers, and deployed accordingly;
- application restrictions
What constraints would the application (and part of our run-time environment) be subjected to?
- none --- the application is able to use all the features of the targeted programming language and the targeted framework, including full control over the run-time environment; moreover the application is able to interact with other OS artifacts (like file-system, processes, sockets, etc.); (e.g. Amazon Beanstalk;)
- container --- as in the case of no restrictions, except that interactions with the run-time or the OS are limited;
- limited --- the application is able to use only some features of the targeted language or framework, and most likely interactions with the run-time and the OS are limited (e.g. native libraries are forbidden, file-system access is restricted, etc.); (e.g. Google App Engine;)
- programming languages
- (self explanatory)
- programming frameworks
- Some solutions target, or at least focus on, a particular framework (such as Servlets for Google App Engine's Java environment, or Capistrano, which is tightly focused on Ruby on Rails deployment). Thus it would prove useful to know in advance which are the officially sanctioned or preferred frameworks.
How can scalability be achieved?
- automatic scalability --- based on user defined policies the platform is able to provision and commit new computing resources; (i.e. the platform decides and executes;)
- manual scalability --- the user is able to control via a high-level UI or CLI the amount of provisioned and committed computing resources; (i.e. the operator decides, the platform executes;) (this implies that the platform is able to provision new resources by itself;)
- passive scalability --- the platform itself is able to scale if computing resources are manually provided by the operator himself; (i.e. the operator decides and executes, the platform only takes notice and reacts;) (this implies that the platform is not able to provision resources by itself;)
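The three scalability modes differ only in who decides and who executes. As a minimal sketch (the `Actor` and `ScalabilityMode` names and the `platform_can_provision` helper are assumptions made for illustration, not part of any surveyed solution), this distinction could be encoded as:

```python
from dataclasses import dataclass
from enum import Enum


class Actor(Enum):
    PLATFORM = "platform"
    OPERATOR = "operator"


@dataclass(frozen=True)
class ScalabilityMode:
    name: str
    decides: Actor   # who decides that capacity must change
    executes: Actor  # who provisions and commits the resources


# The three modes, following the "decides / executes" remarks in the list above.
MODES = {
    "automatic": ScalabilityMode("automatic", Actor.PLATFORM, Actor.PLATFORM),
    "manual": ScalabilityMode("manual", Actor.OPERATOR, Actor.PLATFORM),
    "passive": ScalabilityMode("passive", Actor.OPERATOR, Actor.OPERATOR),
}


def platform_can_provision(mode: ScalabilityMode) -> bool:
    """True when the platform itself is able to provision new resources."""
    return mode.executes is Actor.PLATFORM
```

Under this encoding, automatic and manual scalability both imply a platform that can provision resources by itself, whereas passive scalability does not.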
- session affinity
Usually PaaS solutions offer HTTP request routers (or dispatchers); how do they load-balance clients between the multiple available service instances? (How each client is identified depends on the internals of the PaaS, and could range from the source IP address to cookies.)
- transparent --- the solution provides automatic session replication between multiple instances (most likely through a shared database);
- sticky-sessions --- all the requests originating from the same client are routed to the same instance;
- non-deterministic --- (self-explanatory);
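To illustrate the difference between sticky and non-deterministic routing, the following is a minimal sketch. It assumes the client is identified by an opaque string (for example a source IP address or a cookie value) and that the instance list is fixed; the function names are hypothetical, and real routers typically use more robust schemes (such as consistent hashing or routing cookies) so that assignments survive changes in the instance set.

```python
import hashlib
import random
from typing import Sequence


def route_sticky(client_id: str, instances: Sequence[str]) -> str:
    """Sticky sessions: the same client identifier always maps to the same instance."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


def route_non_deterministic(instances: Sequence[str], rng: random.Random) -> str:
    """Non-deterministic: any instance may serve any request."""
    return rng.choice(instances)


instances = ["instance-a", "instance-b", "instance-c"]
first = route_sticky("203.0.113.7", instances)
second = route_sticky("203.0.113.7", instances)
```

With sticky routing, `first` and `second` always name the same instance, so session state can live in that instance's memory. With non-deterministic routing the application must either be stateless or rely on transparent session replication.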
How can we programmatically interact with the proposed solution?
- WS (Web Service) --- the interaction can be made through HTTP calls (either SOAP+WSDL or RESTful); (this implies that there is a public specification of such calls, or that they are easily reverse-engineered);
- WUI (Web User Interface) --- although this interface is provided remotely through HTTP, it's suitable for human operators and can't be easily consumed by an automated tool;
- CLI (Command Line Interface) --- there are command line tools that interact with the solution (most likely through HTTP or some form of RPC); (this implies that the input / output formats are easily parsed by another tool and, as above, that a specification is available);
- CUI (Console User Interface) --- the provided command line tools are not suitable for being invoked by other tools, because for example the input / output are human-centric and difficult to parse;
- API (Application Programming Interface) --- the solution also provides a library that abstracts one of the previous interaction methods;
How would we be able to use the proposed solution?
- hosted --- the proper meaning of the term PaaS;
- deployable (closed-source) --- available for deployment in a private cloud, but the code is closed-source;
- deployable (open-source) --- available for deployment in a private cloud, and the code is available as open-source, thus enabling modifications;
- simulated --- there is an option to deploy locally a similar solution for development and debugging purposes;
If a developer uses a particular solution, how easy is it for them to move to another solution having the same role?
- locked -- to move to a different solution would require massive rewriting of the application;
- portable -- possible with minor updates to the application;
- out-of-the-box -- the solution uses existing standards thus portability is guaranteed;
- Especially in the case of PaaS's, what additional resources or services (such as databases, middlewares, etc.) are available and managed directly by the solution, and thus integrated with the application life-cycle?
- monitoring coverage
Especially in the case of PaaS's, how much do the monitoring facilities cover and expose to the operator?
- none -- the solution provides no monitoring options (except maybe the listing of running processes or logging, etc.);
- basic -- the usual information, comprising CPU, memory and disk usage;
- extensive -- it provides many other metrics than the ones above;
- monitoring level
From which perspective, or at which level of the software and infrastructure stack, are the metrics provided?
- application -- the data is collected from within the application itself; (for example by using NewRelic, etc.;)
- container -- the data is collected from within the VM or the container; it could refer to the VM or the container itself or the whole running application;
- hypervisor -- the data is collected by the virtualization solution;
- fabric -- the data is collected at the infrastructure layer; (for example raw disks, load balancers, routers, switches, etc.);
- monitoring interface
- What technique --- such as standard, API, library, etc. --- is used to expose the monitoring information to the operator?
- resource providers
- Most PaaS solutions do not have their own hardware resources, but are instead built on top of publicly accessible IaaS providers. Thus, if the user needs services not offered by the PaaS itself, they could use that IaaS to host the missing functionality themselves.
This characteristic pertains mainly to PaaS or PaaS-like solutions, and tries to assess whether multiple applications can share the same instance of the PaaS.
- single application --- the entire PaaS instance is dedicated to only one application; (some deployable PaaS's fit into this category;)
- single organization --- the PaaS is able to host multiple independent applications, but they should belong to the same organization, mainly because the security model is restricted, or the scheduling model implies a fair behaviour; (almost all other deployable PaaS's fit into this category;)
- multiple organizations --- the PaaS is shared between multiple parties, each possibly with multiple applications; (all hosted PaaS's fit in this category;)
- resource sharing
This characteristic pertains mainly to PaaS or PaaS-like solutions, and tries to assess how the application's components or services are mapped onto the provisioned VMs.
- 1:1 --- each component or service (from each application where applicable) is deployed on its own VM; such a usage pattern would better fit heavy-weight applications, that have few component or service types, featuring constantly high load; thus an instance wouldn't interfere with another, through shared resource consumption;
- n:1 --- more than one component or service (potentially from different applications in case of multi-tenancy) can be deployed on the same VM, thus sharing its resources; this usage pattern would allow cost savings, especially in development or initial deployments, until the product gains traction and increased load, where a 1:1 pattern would prove more efficient;
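The two mapping patterns can be sketched as simple placement functions. This is only an illustration: the component names and the `per_vm` packing limit are assumptions, and a real scheduler would also weigh memory, CPU and isolation constraints.

```python
from typing import List, Sequence


def place_one_to_one(components: Sequence[str]) -> List[List[str]]:
    """1:1 - every component or service gets its own VM."""
    return [[component] for component in components]


def place_n_to_one(components: Sequence[str], per_vm: int) -> List[List[str]]:
    """n:1 - pack up to `per_vm` components or services onto each VM."""
    if per_vm < 1:
        raise ValueError("per_vm must be at least 1")
    items = list(components)
    return [items[i:i + per_vm] for i in range(0, len(items), per_vm)]


components = ["web", "api", "worker", "cache", "db"]
dedicated = place_one_to_one(components)   # 5 VMs, one component each
shared = place_n_to_one(components, 2)     # 3 VMs, at most two components each
```

For these five hypothetical components the 1:1 pattern needs five VMs while the n:1 pattern with two components per VM needs three, which is where the cost savings for development or early deployments come from.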
- Most of the solutions impose quantitative limitations (such as memory, bandwidth, storage, etc.) on the running applications, which could be of interest especially in determining the suitability for our case studies.
We should observe that not all of these properties or capabilities apply to all the surveyed solutions.
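One possible way to record a surveyed solution against these aspects, while allowing for properties that do not apply, is a record with optional fields. The sketch below is an assumption made for illustration: the field names, the selection of aspects, and the `example-tool` entry are all hypothetical and do not describe any real solution.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SurveyEntry:
    name: str
    category: str                            # e.g. "PaaS", "task automation"
    maturity: Optional[str] = None           # None means "not applicable"
    scalability: Optional[str] = None        # "automatic", "manual" or "passive"
    session_affinity: Optional[str] = None   # only meaningful with a request router
    interfaces: Tuple[str, ...] = ()         # any of "WS", "WUI", "CLI", "CUI", "API"


def applicable_aspects(entry: SurveyEntry) -> List[str]:
    """Names of the aspects that were actually filled in for this entry."""
    return [f.name for f in fields(entry) if getattr(entry, f.name) not in (None, ())]


# A hypothetical task-automation tool: scalability and session affinity do not apply.
entry = SurveyEntry(name="example-tool", category="task automation", interfaces=("CLI",))
filled = applicable_aspects(entry)
```

Here `filled` contains only `name`, `category` and `interfaces`, reflecting that the remaining aspects were left out as not applicable rather than given placeholder values.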