Note from 2025: This blog was originally written in 2011 as a series of short posts describing different architectural features of cloud-native systems. I’ve decided to leave these posts in-line, in the form I provided them to our publication team, rather than break them up. Please don’t get confused when I refer to the “next blog”–I’m talking about the next section in this document, not a separate post.
AWS first introduced the key components of its cloud architecture (the storage and compute virtualization components S3 and EC2) in 2006. When this blog was written in 2011, the cloud paradigm was still unfamiliar to many. In fact, lots of companies with virtualized data centers referred to their infrastructure as a “cloud” and thought they were already there! I wrote this series, in part, to disambiguate the public cloud from a virtualized data center, and to explain the advantages of cloud-native architectures.
Interesting to me, looking back on it from 2025, is how well these principles prefigure the microservices paradigm. In fact, the term “microservices” was coined right around the time this was originally written. As the microservices paradigm emerged, its principles seemed very natural to me–and ideally suited to cloud-native systems.
Location anonymity
What is “The Cloud”? The central characteristic of the cloud is that the assets required to deliver a unit of functionality—computation, data access, networking devices, etc.—are in unspecified and potentially dynamically changing physical locations. This is, of course, where the “cloud” metaphor comes from—in physical clouds, knowing the location of each individual water droplet is not necessary to understand the behavior of the cloud as a whole. For many years a cloud symbol has been used on network diagrams to denote hidden and largely irrelevant complexity—for example, that symbol has been used for the internet or, before that, the public switched telephone network (PSTN).
Though their physical locations may be unknown, functional units deployed in the cloud are addressable, usually though not always via a URL (like a web page address), and programmatically via paradigms such as REST and protocols such as SOAP. However, these assets cannot be addressed the way conventional physical resources (traditionally a hostname or a hardware-specific IP address) would be in a corporate network, except perhaps as aliases for an anonymous system. Leveraging this “location anonymity” is actually what drives some of the most powerful aspects of the cloud paradigm.
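To make this concrete, here is a minimal sketch, using only Python’s standard library and a hypothetical service URL, of a client invoking a cloud-hosted component through a stable, location-anonymous address. The client neither knows nor cares which physical machine actually answers:

```python
import json
import urllib.request

# Hypothetical endpoint: a stable URL fronting a pool of anonymous instances.
# DNS and load balancing decide which physical machine serves each request.
SERVICE_URL = "https://api.example.com/v1/orders/42"

def fetch_order() -> dict:
    # A plain REST-style GET: no hard-coded hostname or IP address of any
    # particular server, only the service's logical address.
    with urllib.request.urlopen(SERVICE_URL, timeout=10) as response:
        return json.loads(response.read())

if __name__ == "__main__":
    print(fetch_order())
```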
It would be easy to read “location anonymity” only as “cloud means ignorance of location”, but a more useful characterization is “components of a well-designed cloud architecture shouldn’t assume or care about having a specific location”. In other words, a cloud application or service is best architected in such a way that each of its constituent components can tolerate wide variations in the proximity of the other components and resources it needs, and in the physical location of the clients it serves. In this sense, cloud is the software equivalent of geographic globalization, where the physical location of human and other resources may be widely dispersed, with modern business practices and technologies making that location increasingly irrelevant.
In addition, not depending on the physical location of components is a key driver of cloud “elasticity”: Since a given system is anonymous, the function performed by that system can easily be realized as a “pool” of identical systems instantiated dynamically from the same boot image, front-ended by a “load-balancer”. The pool can grow or shrink in size depending on the workload characteristics of the application, allowing for a very high degree of scalability during peak loads, and economy when demand is small, since fewer resources are then deployed. In a physically-tied system, deploying more systems can take weeks if not months; even in a virtualized data center this process can involve considerable manual configuration effort. With the cloud, elasticity is intrinsic to the paradigm and, when properly implemented, takes only minutes.
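As an illustrative sketch of this elasticity (the function, thresholds, and capacities below are my own invention, not any vendor’s actual API), the pool-sizing decision can be as simple as a calculation over the current load, precisely because every instance is anonymous and interchangeable:

```python
import math

# Hypothetical autoscaling policy: size the pool from observed load.
# Thresholds and capacities here are illustrative, not vendor defaults.
REQUESTS_PER_INSTANCE = 500   # load one pooled instance can absorb
MIN_INSTANCES = 2             # keep some redundancy even when idle
MAX_INSTANCES = 100           # cost ceiling

def desired_pool_size(requests_per_second: float) -> int:
    """Return how many identical instances the pool should contain."""
    needed = math.ceil(requests_per_second / REQUESTS_PER_INSTANCE)
    return max(MIN_INSTANCES, min(MAX_INSTANCES, needed))

# Because every instance boots from the same image and carries no
# special identity, growing or shrinking the pool is just a number.
print(desired_pool_size(75.0))     # -> 2 (minimum floor applies)
print(desired_pool_size(12000.0))  # -> 24
```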
Independence of system components from physical location necessitates that our cloud-based applications follow some important architectural principles if they are to be robust, dynamically scalable, and high-performing. These principles, in turn, imply what I would consider a “functional” definition of the cloud: I would claim that systems that violate these cloud architectural principles are less cloud-oriented than systems that follow them—and vice versa.
To define the cloud, then, we will look at the unique opportunities and challenges it poses architecturally. This is much like trying to understand a physical material—like, say, unreinforced clay bricks—by the opportunities and challenges that constructing a building from them would pose. Physical masonry bricks and mortar, for example, cannot be used to construct long unsupported overhangs unless they are formed into an arch. Bricks and mortar are strong in compression—i.e. they can carry heavy loads—but not in bending; when bent, they tend to fracture. If you were to see a building that violates that architectural principle by, say, having a 100 foot by 100 foot unsupported brick overhang, you could with confidence infer that this portion of the building is not, in fact, structurally made out of traditional masonry bricks and mortar. It may look like bricks, but that part of the building must be made of some other material—reinforced concrete, for example—with a brick façade. That is, of course, unless it’s engineered to collapse!
Similarly, if you see an application that violates the cloud architectural principles we propose in the following blogs, I assert that it is not a fully cloud-based application. It may look like one or be called one, but it is not a “full” cloud application.
Toward an architecturally-based definition of The Cloud: Architectural Principles 1 & 2
Let’s look at the first two of the eight key principles GlobalLogic’s architecture team has found helpful in designing effective cloud systems:
- The underlying architecture should be distributed. Although cloud computing and distributed architecture are in theory orthogonal, we have found that distributing the application architecture is a very important step toward capitalizing on the full benefits of cloud computing. Organizations taking their first step in moving an existing enterprise application unmodified into the cloud will quickly see the advantages of re-architecting their system in a distributed manner if they wish to fully gain the benefits of the cloud, for the reasons we outline in subsequent principles. For example, having all components access a single, massive relational database—while common in enterprise applications—is not an architecture that will lend itself to exploiting the advantages of the cloud as readily as a more decentralized system.
- Each component of a cloud-based application must be highly tolerant of variable degrees of latency between messages sent to or received from every other component. This is because when invoked, one resource might be deployed on a VM residing on a physical machine somewhere, say, on the West Coast of the United States while a resource it communicates with is located in India. When called another time, the resources might be co-located on the same subnet in the same data center in Europe. Called yet another time, they may be in, say, Argentina and Ukraine. That’s an extreme example, but it makes the point that in a fully cloud-enabled application you don’t have full control over where your components are. Even more to the point, you should not have to care. Components should not rely on their communication being synchronous, or even on the latency between components being low and constant, since resource location is dynamic. To be sure, some commercial cloud offerings let you specify an affinity for a particular data center or even for the same equipment rack—but applications that require such approaches are, in my view, using a workaround and are not architected to take full advantage of this new paradigm. (A sketch of latency-tolerant communication follows this list.)
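Here is a minimal sketch of what such latency tolerance can look like in practice, assuming a hypothetical service URL: each attempt is bounded by a timeout and retried with backoff, so the caller works whether the peer is on the same subnet or on another continent:

```python
import time
import urllib.request

# Hypothetical peer; its physical location may vary from call to call.
SERVICE_URL = "https://inventory.example.com/v1/stock/widget-7"

def call_with_tolerance(url: str, attempts: int = 4, timeout_s: float = 2.0):
    """Call a component that may be one rack away or one continent away."""
    for attempt in range(attempts):
        try:
            # Bound every attempt: never block forever on a slow peer.
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                return resp.read()
        except OSError:
            # Timeout or transient network error: back off and retry,
            # since the peer's location (and latency) may differ per call.
            time.sleep(0.5 * (2 ** attempt))
    raise RuntimeError("peer unreachable within latency budget")
```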
In the next blog in this series, we will discuss principles related to state and data consistency.
Toward an architecturally-based definition of The Cloud: Architectural Principles 3 & 4
In the previous blog in this series, we discussed the first two of the eight principles GlobalLogic’s architecture team uses in designing effective cloud systems:
- The underlying architecture should be distributed.
- Each component of a cloud-based application must be highly tolerant of variable degrees of latency between messages sent to or received from every other component.
The next two principles relate to state and data consistency:
- Cloud components should be as stateless as possible, with state maintained in a shared, distributed data store or cache. The dynamic “elasticity” of a cloud application is a key benefit, since instances of a given component can be created or destroyed to respond to the load at any given time. To take maximum advantage of this, the individual components in a pool must be as interchangeable as possible—in particular, they should ideally not carry state. Whatever state a component must carry should be maintained in a separate data service so it can be shared between component instances. This can be done via a distributed data store or caching mechanism, and can provide redundancy for fault tolerance as well as support for elastic scalability. (A minimal sketch of a stateless component follows this list.)
- The system should tolerate eventual data consistency rather than enforcing immediate data consistency. Because distributed components operate asynchronously and with variable latencies, enforcing instantaneous global data updates can lead to race conditions, bottlenecks, retries, and general loss of system efficiency. Cloud systems are better architected to support local updates which propagate through the system over time and are consolidated eventually, as opposed to having a single monolithic data repository which is guaranteed complete and accurate at all times. While this may not be possible or desirable in every situation, note that even bank ATM systems are generally architected to enforce eventual, rather than immediate, consistency of your account balance; this approach is not the heresy it might seem to be. And “eventually” need not be a long time—the time to achieve consistency is commonly seconds, but may be shorter or longer depending on the needs of the system.
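To illustrate the statelessness principle (referenced in the first item above), here is a minimal sketch in which the handler carries no state of its own. SharedStore is a toy, in-process stand-in for a distributed cache or data service, so its API is hypothetical:

```python
# Minimal sketch: stateless request handler with state externalized to a
# shared store. In production this would be a network-backed distributed
# cache or data service, not a local object.

class SharedStore:
    """Toy in-process stand-in for a distributed cache."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value):
        self._data[key] = value

store = SharedStore()

def handle_add_to_cart(session_id: str, item: str) -> list:
    # The handler holds nothing between calls: all session state
    # round-trips through the shared store, so any instance in the
    # pool (including one created a second ago) can serve this user.
    cart = store.get(session_id) or []
    cart.append(item)
    store.set(session_id, cart)
    return cart

print(handle_add_to_cart("sess-123", "book"))    # ['book']
print(handle_add_to_cart("sess-123", "teapot"))  # ['book', 'teapot']
```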
Toward an architecturally-based definition of The Cloud: Architectural Principles 5 & 6
In previous blogs in this series, we discussed four of the eight principles GlobalLogic’s architecture team uses in designing effective cloud systems:
- The underlying architecture should be distributed.
- Each component of a cloud-based application must be highly tolerant of variable degrees of latency between messages sent to or received from every other component.
- Cloud components should be as stateless as possible, with state maintained in a shared, distributed data store or cache.
- The system should tolerate eventual data consistency rather than enforcing immediate data consistency.
The next two principles relate to the physical locality of data to computational logic:
- Minimize data movement by moving the code toward the data or, better yet, act only on the data closest to each compute node. Because of the distributed components in a cloud application, having a centralized data source—such as a massive database—diminishes the efficiency of the system. This is because of the time it takes to propagate data to the distributed components, and because of those components trying to access or modify the same co-located pool of data with their variable latencies. To improve processing parallelism, it is better to move away from a physically centralized data store entirely, and toward a distributed one that enables each processing component to act against its local data. Those intermediate results may then be consolidated at a higher level. This is how Google search, for example, is able to achieve such high levels of performance against massive amounts of data (the entire internet, in their case)—data is distributed across many compute nodes. In such a “MapReduce” operation, each of the “mapper nodes” searches only the relatively small amount of data contained in its local attached storage, with the results being consolidated globally at the reducer nodes (a minimal sketch follows this list). In applications where it is not feasible to distribute the data in this way, it is better to move the compute nodes as close as possible to the data store, to minimize the impact of data traffic and latency. This location bias is definitely a workaround that makes the application less “cloud” oriented, but it is a workable paradigm where legacy data stores, for example, must be incorporated into a new cloud application.
- Asynchronously redistribute data in the background. Elasticity is a fundamental feature of the cloud paradigm, so systems must be architected to tolerate the fact that machines may come and go in a planned or an unplanned fashion. Because application state must be retained somewhere—hopefully in a shared and distributed data service rather than in each individual component—the systems which carry that state must themselves be tolerant of being created or destroyed. In other words, when the pool of stateful components grows or shrinks, data has to be redistributed across surviving or newly instantiated nodes. Also, new data coming into the system must be transferred to the components that will use it; this is true even in a MapReduce operation. Our preferred way to mitigate the performance and other impacts of these data movements is to implement them as “background” tasks done asynchronously from the processing tasks. The processing components that depend upon the data must then be architected to tolerate temporary data inconsistencies, but this has other benefits, as we discussed above.
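To make the data-locality idea from principle 5 concrete (as referenced in the list above), here is a minimal, single-process sketch of the map/consolidate pattern: each “mapper” counts only the data in its own local shard, and a “reducer” merges the small intermediate results. The shard contents are invented for illustration:

```python
from collections import Counter

# Hypothetical local shards: in a real deployment, each shard lives on a
# mapper node's locally attached storage; only small results ever move.
shards = [
    ["cloud", "data", "cloud"],
    ["data", "latency"],
    ["cloud", "cache", "cache"],
]

def map_phase(local_docs: list) -> Counter:
    # Runs where the data lives: counts words in this shard only.
    return Counter(local_docs)

def reduce_phase(partials: list) -> Counter:
    # Consolidates the small per-shard results at a higher level.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

partial_counts = [map_phase(shard) for shard in shards]  # move code, not data
print(reduce_phase(partial_counts))
# -> Counter({'cloud': 3, 'data': 2, 'cache': 2, 'latency': 1})
```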
Toward an architecturally-based definition of The Cloud: Architectural Principles 7 & 8
In previous blogs in this series, we have now covered six of the eight key cloud architecture principles GlobalLogic’s architecture team uses to design effective cloud systems:
- The underlying architecture should be distributed.
- Each component of a cloud-based application must be highly tolerant of variable degrees of latency between messages sent to or received from every other component.
- Cloud components should be as stateless as possible, with state maintained in a shared, distributed data store or cache.
- The system should tolerate eventual data consistency rather than enforcing immediate data consistency.
- Minimize data movement by moving the code toward the data or, better yet, act only on the data closest to each compute node.
- Asynchronously redistribute data in the background.
The final two principles deal with data movement:
- As you grow and shrink the pool of components, minimize data movement. Moving data is expensive in a cloud system since, in general, data must be assumed to at least sometimes travel at WAN speeds. In particular, the need for a “real-time” data transfer should be avoided due to its potential performance impact on the system. In systems where the data is distributed—for example, in the distributed web search scenario we mentioned previously—spawning or removing a stateful component requires that other stateful components surrender or acquire the data being managed by the new or deleted component. For fault tolerance, it is generally better to have data redundancy built into a distributed system, such that either accidentally or intentionally removing a stateful component instance does not necessitate an immediate data transfer to ensure accurate results. By building the system to tolerate a degree of redundancy—which you can sometimes do, for example, by filtering out redundant results at a higher level of the system—data may be cached or lazily transferred asynchronously around the system to take advantage of the addition, or compensate for the removal, of a stateful component.
- Maximize the use of cache. Once your system is designed to tolerate data inconsistency issues, you can leverage cache more effectively. Use of a distributed cache is an important mechanism to support the separation of concerns that results in stateless components, asynchronous data distribution, and eventual consistency, as outlined above. Some excellent commercial and open-source alternatives are readily available. Memory-based architectures with terabytes of reliable and in some cases non-volatile storage are becoming a reality for cloud applications, and cache-centric architectures that follow the principles described above will be well positioned to exploit these new developments. (A sketch of a common caching pattern follows this list.)
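As a sketch of the cache-aside pattern implied above (the store, keys, and freshness window are hypothetical, and a real system would use a distributed cache such as memcached rather than a local dict): read from the cache first, fall back to the slower, possibly distant data store on a miss, and populate the cache for subsequent readers:

```python
import time

CACHE_TTL_S = 30.0  # hypothetical freshness window; tune per your
                    # tolerance for staleness (eventual consistency)

cache = {}  # stand-in for a distributed cache shared by all instances

def slow_database_read(key: str) -> str:
    # Stand-in for a read that may cross a WAN to reach the data store.
    return f"value-for-{key}"

def cached_read(key: str) -> str:
    entry = cache.get(key)
    if entry is not None:
        value, fetched_at = entry
        if time.monotonic() - fetched_at < CACHE_TTL_S:
            return value  # cache hit: no trip to the distant store
    value = slow_database_read(key)          # cache miss: go to the source
    cache[key] = (value, time.monotonic())   # populate for later readers
    return value

print(cached_read("user:42"))  # miss: reads the store
print(cached_read("user:42"))  # hit: served from cache
```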
Now that we’ve outlined a set of architectural principles behind good cloud design, how can we apply them to form a functional definition of a “cloud-based system”? In the next blog in this series, we will take an example and see.
Toward an architecturally-based definition of The Cloud: Is a virtualized data center a cloud?
In preceding blogs in this series, we have discussed the eight architectural principles we keep in mind when designing a new large-scale cloud application. These are:
- The underlying architecture should be distributed.
- Each component of a cloud-based application must be highly tolerant of variable degrees of latency between messages sent to or received from every other component.
- Cloud components should be as stateless as possible, with state maintained in a shared, distributed data store or cache.
- The system should tolerate eventual data consistency rather than enforcing immediate data consistency.
- Minimize data movement by moving the code toward the data or, better yet, act only on the data closest to each compute node.
- Asynchronously redistribute data in the background.
- As you grow and shrink the pool of components, minimize data movement.
- Maximize the use of cache.
I would now argue, coming back to the physical building analogy I used earlier in this series, that applications and/or “platforms” which follow these principles are more cloud-oriented than those that do not—and I think now you should see why. But what about the infrastructure on which these systems are deployed? Can we use these architectural principles to say anything about the physical infrastructure itself?
Let us ask the question: Is a single virtualized data center a “cloud”? In light of our discussion above, I would re-frame that question to ask: “Does a virtualized data center provide support for cloud-based applications?” I would argue that while there is nothing to stop a cloud-based application from being deployed in a virtualized data center, there is also nothing about a single virtualized data center that requires or takes full advantage of the unique characteristics of cloud-based applications. So my short answer would be: No, a single virtualized data center is not a cloud. It does not impose the conditions of variable latency, location anonymity, or other characteristics for which cloud applications are architected.
Obviously at some scale, a virtualized data center does become a cloud. But where does the tipping point occur? We will discuss this in our next and final blog in this series.
Toward an architecturally-based definition of The Cloud: Conclusion
In the previous blog in this series we asked ourselves: At what point does a “virtualized data center” become a “private cloud”? This is clearly at least in part a question of scale, but where is the dividing line? I would say that, based on the conditions that need to be met by cloud applications as discussed in the architectural principles above, to be considered a “virtual private cloud” the following conditions should be met:
- There are multiple interconnected virtual private data centers with significant geographic separation between them (on multiple continents, for example), or applications are designed to scale elastically to utilize both the virtual private data center(s) and a publicly hosted cloud in a “hybrid cloud” model
- Provisioning of new systems which host application components is entirely automatic and in response to load
- No human intervention is required to provision a new system or to deploy an application component on it; or to de-commission an existing system
- Deployed systems and components can migrate freely or algorithmically between the distributed data centers in response to load, with and (at least ideally) without human intervention
Unless the above conditions are true, there is really no need to architect an application that follows all the above cloud architecture principles; a more conventional approach would be equally suitable—unless an eventual migration to a “true” cloud environment is envisioned. I would not consider applications that do not follow the architecture principles above to be as cloud-enabled as those that do, however. I would also not consider a deployment environment to be a cloud if it does not meet at least the “virtual private cloud” criteria listed above—though that is a high bar, I admit.
Perhaps a more useful concept in this transitional time is a “degree of cloudiness”, rather than a simple “Cloud” / “Not Cloud” dichotomy. In reality, few existing applications or cloud infrastructures meet all the above criteria. This is where I believe the Cloud is headed, however, and it is in embodying these principles that we will see the real benefits of this new paradigm.
That being said, there is nothing wrong per se with not being a cloud or a “full” cloud application! There is plenty of scope for non- or semi-cloud applications and infrastructures, both within organizations and as commercial offerings. When an application or an infrastructure is not a “cloud” or is a “partial cloud”, I think we should admit it without apology, rather than trying to stretch the definition of “cloud” to cover it. A definition is not meaningful until it’s clear what it does not apply to. If everything is a cloud or a cloud application, then the term loses any meaning. And I think the concept of “The Cloud” is way too useful for us to let it just drift away!
My warm thanks to the GlobalLogic architecture team for their insights and for stimulating discussions on “The Cloud” and related topics.