
Rethinking the way we build on the cloud: Part 2: Environments on the Cloud


The newly launched Mingle SaaS offering runs entirely on the AWS cloud. As discussed in our earlier blog on Layering the Cloud, because there is no existing system that we have to modify or integrate with, we've got the freedom to design the architecture from scratch. This has led us to rethink the role of environments in our development and deployment process.

What’s wrong with the traditional approach to environments?

In traditional data-center-based applications there are usually a small, fixed number of environments into which the application is deployed. For example, there might be a production environment, a staging environment where candidate builds are deployed before they go into production, a test environment where new work is verified, and a development environment where developers can run new code as part of a full system.
The availability and nature of these environments are strongly constrained by the availability of hardware and infrastructural systems for them to run on. We would like all our environments to be as similar to production as possible, but physical hardware and traditional infrastructural systems like databases are expensive and slow to provision.
In practice this means that there is a sliding scale of realism in the environments as you go from development to production. Development environments tend to be smaller than production (a load-balanced application may have only a single server when there are a dozen in production), they often use alternatives for some components (they may have a different database server), and they frequently share components (fileservers, databases, monitoring systems) that are dedicated in production. Similar differences exist for test and even staging environments, although of course the hope is that environments become more realistic the closer they are to production.
This variation between environments causes several problems. The most obvious problem is that some bugs are found further down the pipeline, rather than being identified by developers as they are working on the code; this increases the cost of fixing the bugs. Another problem is that supporting the variation in environments increases the complexity of the code. And, more subtly, we end up making decisions that don't cause outright bugs but which cause our architectures not to be optimized for the real deployment environment, because developers are divorced from the reality of running the system in that environment.
The inevitable restriction on the number of environments also causes problems. Availability of environments is yet another dependency to be juggled, which can cause delays or lead us to skip testing that we would like to do. Maintenance of the environments, such as cleaning up after running stress tests, also takes time.

How have we approached environments in the cloud?

We have found that building a system that runs entirely on the cloud has enabled us to reconsider our use of environments and ensure that we don't fall foul of any of these problems.
Stay tuned for our next blog, where we discuss the principles we used to optimize our use of environments in the cloud, such as ad-hoc environments, shared-nothing environments and cookie-cutter environments.
