(MODAClouds) Google App Engine

Important

This document is part of the MODAClouds deliverable D6.1, and is thus superseded by it.

Important

This document was created in March 2013, thus its contents might now be outdated.

Overview

Google App Engine (GAE) is one of the first commercial PaaS offerings, and from a technical point of view it is the closest to the PaaS philosophy: it completely relieves programmers of any concern about the infrastructure, software or hardware, on which their application runs. This comes at the price of flexibility, however, because the developer has to keep to a very strict set of rules.

A GAE application is mainly composed of:

request handlers [GAE-1]
These are the "normal" instances available in GAE, towards which requests are routed; they must conform to the Java Servlet API (or its equivalent in other languages). These handlers must also obey some strict limitations on response time, available hardware resources, etc.
backend handlers [GAE-3]
From a programmer's point of view they are identical to the "normal" handlers, except that some limitations are relaxed (for example they may run background threads and have more memory available); moreover the billing model is similar to that of a classical IaaS provider.
services
Various services already provided by Google, which are exposed to applications via dedicated APIs in the targeted language.
versions
Each application can be deployed and re-deployed multiple times, each mapping to a particular version, which is individually accessible.
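
As a rough illustration of what a request handler looks like, the following minimal sketch follows the standard Python WSGI calling convention. The name `application` and the response body are illustrative assumptions; nothing here is specific to GAE, and a real deployment would also need the platform's own configuration.

```python
def application(environ, start_response):
    """Minimal WSGI request handler: one HTTP request in, one response out."""
    # PATH_INFO is the standard WSGI key holding the request path.
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from " + path + "\n").encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # A WSGI application returns an iterable of byte strings.
    return [body]
```

The handler keeps no state between calls, which fits a platform where requests may be routed to any instance.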

Characteristics

type: PaaS
suitability: production
application domain: web applications
application architecture: n-tier applications
application restrictions: limited
programming languages: Java, Python, Go
programming frameworks: Java Servlets, Python WSGI
scalability: automatic scalability
session affinity: non-deterministic [GAE-1]
interaction: CLI, WUI, WS
availability: hosted
portability: locked
services: object store, memcache, mail, HTTP fetching, user management, XMPP
monitoring coverage: extensive
monitoring level: application
resource providers: Google
multi-tenancy: multiple organizations
resource-sharing: n:1

Limitations

Access limitations:

  • sockets are completely disallowed, both for listening and for connecting; the only way to communicate with the outside is through the APIs that Google provides (for example there are APIs for HTTP resource fetching and for email sending); [GAE-1]
  • the interaction with the clients must be confined to HTTP only; [GAE-1]
  • the application can't modify the local file system; [GAE-1]
  • (for request or task handlers) threads can't outlive the request; moreover the threading API is specific to GAE; [GAE-1]
  • most system-related or native calls (like JNI) and interpreter-related calls are forbidden; [GAE-1]

Quantitative limitations:

  • the maximum size of an HTTP request or response is 32 MiB; [GAE-1] [GAE-2]
  • each HTTP request must be resolved in at most 60 seconds; [GAE-1]
  • there is a maximum of 50 threads per request; [GAE-1]
  • the maximum size of a file in the package is 32 MiB; [GAE-1] [GAE-2]
  • the maximum memory available for a normal instance is 128 MiB; [GAE-3]
  • the maximum memory available for a backend instance is 1 GiB; [GAE-3]
  • other quotas are high enough, especially for paid applications, that even a medium-sized [1] application shouldn't need to worry about them; [GAE-2]
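
To make the figures above concrete, here is a minimal sketch that checks a planned request against three of the listed limits. The function name `check_request_plan` and its parameters are hypothetical helpers, not part of any GAE API, and the constants simply restate the March 2013 values from the list.

```python
MIB = 1024 * 1024

# Limits restated from the list above (March 2013 values; they may have changed).
MAX_HTTP_PAYLOAD_BYTES = 32 * MIB
MAX_REQUEST_SECONDS = 60
MAX_THREADS_PER_REQUEST = 50


def check_request_plan(payload_bytes, expected_seconds, threads):
    """Return a list of violated limits (empty if the plan fits)."""
    problems = []
    if payload_bytes > MAX_HTTP_PAYLOAD_BYTES:
        problems.append("payload exceeds 32 MiB")
    if expected_seconds > MAX_REQUEST_SECONDS:
        problems.append("request exceeds 60 seconds")
    if threads > MAX_THREADS_PER_REQUEST:
        problems.append("more than 50 threads")
    return problems
```

For example, `check_request_plan(40 * MIB, 5, 10)` reports only the payload violation, which suggests splitting the upload into smaller parts.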

The lists above cover only the most important limitations; beyond them GAE has a very complex quota and QoS model: [GAE-2]

  • on the temporal axis there are both daily quotas and per-minute quotas on resource usage;
  • on the cost axis there are the billable quotas, on a daily basis, that ensure the application's operational budget is limited;
  • and the safety quotas, that ensure no application depletes available resources;

[1] We define a medium-sized application as one handling a few hundred requests per second.
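
The interplay between the daily and per-minute quotas described above can be sketched as follows. This is a toy model under stated assumptions: the class `QuotaTracker`, its fixed (non-sliding) time windows, and the numeric limits are all hypothetical, and the source does not describe how GAE actually accounts for or resets its quotas.

```python
class QuotaTracker:
    """Tracks one resource against a per-minute and a daily quota."""

    def __init__(self, per_minute_limit, daily_limit):
        self.per_minute_limit = per_minute_limit
        self.daily_limit = daily_limit
        self._minute = None      # index of the current minute window
        self._day = None         # index of the current day window
        self._minute_used = 0
        self._day_used = 0

    def try_consume(self, amount, now_seconds):
        """Consume `amount` units if both quotas allow it; return success."""
        minute = int(now_seconds // 60)
        day = int(now_seconds // 86400)
        # Assumption: usage counters reset at fixed window boundaries.
        if minute != self._minute:
            self._minute, self._minute_used = minute, 0
        if day != self._day:
            self._day, self._day_used = day, 0
        if (self._minute_used + amount > self.per_minute_limit
                or self._day_used + amount > self.daily_limit):
            return False
        self._minute_used += amount
        self._day_used += amount
        return True
```

A request is admitted only when it fits both windows, so a burst can be refused by the per-minute quota even when most of the daily quota is still unused.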

Notes

Although GAE has serious limitations in terms of development flexibility, it compensates with tight integration with other Google products (especially Google Accounts, GMail, Google Drive, etc.), allowing the developer to leverage those additional services and easily integrate them in their applications. As such GAE would be a prime candidate for hosting an application based on Google's services.

Another limiting aspect, present in most other PaaS's, is the way in which the application's components communicate; in this case there is only one proper solution, that of task queues, accessed through a GAE-specific API. These in turn map to HTTP requests, because even the backend handlers are only accessible via HTTP. The other solution for exchanging information is a shared data store, but this is less effective because it requires frequent polling. On the other hand, thanks to the integration of the various offered services, task queue operations are atomic and can take part in transactions that also span data store accesses.
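
The idea that queued tasks end up as HTTP requests to a handler can be sketched with a toy in-memory queue. Everything here is hypothetical (the class `TaskQueue`, its methods, and the URL path used below); it does not reproduce the GAE task queue API, and it ignores retries, persistence, and transactions.

```python
from collections import deque


class TaskQueue:
    """Toy in-memory queue; each task is a (url_path, payload) pair."""

    def __init__(self):
        self._tasks = deque()

    def add(self, url_path, payload):
        """Called by the producing component to enqueue work."""
        self._tasks.append((url_path, payload))

    def dispatch_all(self, handlers):
        """Deliver each task, in order, to the handler mapped to its path.

        This stands in for the platform issuing one HTTP request per task.
        """
        results = []
        while self._tasks:
            url_path, payload = self._tasks.popleft()
            results.append(handlers[url_path](payload))
        return results
```

The producer only knows a URL path and a payload, so the consumer can be any handler reachable over HTTP, including a backend.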

It seems that GAE is tuned towards applications with short response times, because the automatic scaling feature is available only for those applications where "most" requests are served in under a second [GAE-1]. Moreover the backend handlers are not automatically scaled; however the "dynamic" backends are automatically "woken" when needed, and then "disabled" when idle [GAE-3].

An interesting feature of GAE is the support for multiple application versions, easily accessible by altering the URL's host name [GAE-1]; thus an older, un-updated client is able to keep using an older variant of the service. Another interesting feature is the availability of the SPDY protocol (a replacement for HTTP), which makes GAE the only PaaS currently supporting it.
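
Selecting a version through the host name could look like the sketch below. The host layout `<version>.<application>.appspot.com`, the helper name `versioned_host`, and the sample identifiers are assumptions made for illustration; the exact naming scheme should be checked against [GAE-1].

```python
def versioned_host(version, app_id, domain="appspot.com"):
    """Build a host name that addresses one deployed version of an application.

    Assumed layout: <version>.<app_id>.<domain>.
    """
    return "{}.{}.{}".format(version, app_id, domain)


# An older client keeps talking to version "v1" while newer ones use "v2".
old_client_url = "http://" + versioned_host("v1", "example-app") + "/api"
new_client_url = "http://" + versioned_host("v2", "example-app") + "/api"
```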

Regarding data access, GAE provides its own set of data stores tailored for scalable applications; in the case of the Java environment the developer can choose either JDO- or JPA-compatible interfaces, together with a limited SQL-like query language, or a set of low-level interfaces that interact directly with the data store and its native data model [GAE-4]. Regarding additional services, GAE again provides a wide variety of services integrated by Google [GAE-5].

However if the developer needs a data store, middleware or service that is not part of Google's offering, the only solution is to host it outside GAE (for example inside Google Compute Engine) and access it via HTTP-based requests, which unfortunately rules out most database systems and middleware.

MODAClouds integration

Because GAE is a very peculiar PaaS (its feature set is not matched by other PaaS's, although there are prototypes or projects cloning it), it stands out as a unique development and deployment target; coupled with its access restrictions, it will require more work on our part to integrate than other hosted PaaS's.

On the other hand the monitoring capabilities of GAE are very fine-grained, surpassing those of the other hosted PaaS's, making it a good fit for the monitoring platform. However it lacks the ability to directly control the scaling of the normal instances, making it unsuitable for the self-adaptation platform.

References

[GAE-1] GAE Documentation --- The Java Servlet Environment
[GAE-2] GAE Documentation --- Quotas
[GAE-3] GAE Documentation --- Backends and Java API Overview
[GAE-4] GAE Documentation --- Datastore Overview
[GAE-5] GAE Documentation --- Java Service APIs