
Integrity in the Cloud

1. All communication (SSH, and HTTP(S) using HTTP Signature) is verified as accurate and legitimate.
2. For all data I upload to my CSP, I can attach a signature (MAC) to verify whether the contents have changed.
3. For all data I upload to my CSP, I am sure it will be the same in the future.
To have mutual auditability, I now want to expand that list to include the following:
4. For all data I upload to my CSP, I can ensure I have the latest revision of my data. Currently, this can't happen without much work on either the customer's or the CSP's end.
5. I can ensure I am charged the correct amount by my CSP for data at rest and data in motion.
But what about data fingerprints on my data?
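
The MAC check from the list above can be sketched in a few lines of Python. This is only an illustration using the standard library's HMAC-SHA256; the key, the sample data and the function names `sign` and `verify` are assumptions made for the example and are not part of any CSP's API.

```python
import hashlib
import hmac


def sign(key: bytes, data: bytes) -> str:
    """Return a hex HMAC-SHA256 tag computed over data with the shared key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(key: bytes, data: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, data), tag)


# Hypothetical key and payload, for illustration only.
key = b"example-secret-key"
original = b"quarterly-report-v1"
tag = sign(key, original)

print(verify(key, original, tag))                # unchanged data
print(verify(key, b"quarterly-report-v2", tag))  # changed data
```

Note that this check depends on keeping `key` secret, which is exactly the key management burden the keyless approach described next tries to avoid.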

Keyless Signature Infrastructure (KSI) is a system built from MACs, high-resolution timestamping, hash chains and hash trees. KSI provides proof of integrity, proof of time and proof of signing authority. It is considered keyless because it rests entirely on the formal properties of hash functions rather than on cryptographic keys; essentially, key management is thrown out of the window because it is not needed.
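
To make the hash tree building block concrete, here is a minimal sketch of a Merkle tree with an inclusion proof, written with Python's `hashlib` only. It is not Guardtime's implementation or API: the function names, the SHA-256 choice and the rule of duplicating the last node on odd levels are all assumptions made for the example.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest, used for both leaves and inner nodes in this sketch."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index].

    proof is a list of (sibling_hash, sibling_is_left) pairs, bottom-up.
    """
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node
        sibling = index ^ 1          # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_inclusion(leaf, proof, root):
    """Rebuild the path from leaf to root using only hashes, no keys."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Hypothetical documents signed in the same round.
docs = [b"object-a", b"object-b", b"object-c"]
root, proof = merkle_root_and_proof(docs, 2)
print(verify_inclusion(docs[2], proof, root))
print(verify_inclusion(b"object-c-tampered", proof, root))
```

Anyone holding the root can run `verify_inclusion` without a secret, which is the sense in which verification needs no key.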

The best part of KSI is that, by the nature of the technology, data is tamper-evident, signatures never expire, and it can be used by the user, the CSP and an arbitrary 3rd party (e.g. an auditor) to verify both the order of events and that an event hasn't been altered.

So if a CSP has KSI technology integrated into their APIs and portions of their subsystems like storage, then not only will each data object have its own signature, but all future revisions of the data will have one as well, ensuring the order of operations (i.e., that this revision came before that one). So now we have #1 through #4 accomplished. How about #5? To accomplish #5, a CSP will have to extend KSI into every piece of their architecture.
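
The ordering of revisions can be illustrated with a simple hash chain, where each revision's digest covers the digest of the one before it. This is a sketch under stated assumptions rather than how any CSP or KSI actually stores revisions; `GENESIS`, `link`, `chain_revisions` and `is_latest` are hypothetical names.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for an object's first revision


def link(prev_digest: str, content: bytes) -> str:
    """Digest of one revision, bound to the digest of its predecessor."""
    content_hash = hashlib.sha256(content).digest()
    return hashlib.sha256(prev_digest.encode() + content_hash).hexdigest()


def chain_revisions(revisions):
    """Return the running chain digests, one per revision, in order."""
    digests = []
    prev = GENESIS
    for content in revisions:
        prev = link(prev, content)
        digests.append(prev)
    return digests


def is_latest(revisions, published_head: str) -> bool:
    """True if this exact revision history ends at the published head digest."""
    digests = chain_revisions(revisions)
    return bool(digests) and digests[-1] == published_head


# Hypothetical revision history of one object.
history = [b"draft", b"draft with edits", b"final"]
head = chain_revisions(history)[-1]

print(is_latest(history, head))       # full history in the right order
print(is_latest(history[:2], head))   # a stale copy missing the last revision
```

Because every digest depends on all earlier ones, a customer holding only `head` can detect a missing, reordered or altered revision.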

Let’s say that as a CSP, I have incorporated Guardtime into my hypervisor rsyslog and BSM. I can tell that every log event and every event that is audited via BSM occurred in the order in which it was logged, that it actually occurred (no tampering), and who performed it. I also built all APIs to use Guardtime to log all events (either directly using node-guardtime or using rsyslog). What does that get me?
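
As a rough illustration of what such a log buys, the sketch below keeps an append-only list of events in which each entry records who acted, what they did, and the digest of the previous entry. It uses only `hashlib` and `json`; it does not call Guardtime, node-guardtime or rsyslog, and the field names and helper functions are assumptions made for the example.

```python
import hashlib
import json

START = "0" * 64  # assumed digest preceding the first entry
FIELDS = ("seq", "actor", "action", "prev")


def entry_digest(body: dict) -> str:
    """Digest over the canonical JSON form of an entry's fields."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_event(log: list, actor: str, action: str) -> None:
    """Append an event that is bound to the previous entry's digest."""
    prev = log[-1]["digest"] if log else START
    body = {"seq": len(log), "actor": actor, "action": action, "prev": prev}
    log.append({**body, "digest": entry_digest(body)})


def first_bad_entry(log: list):
    """Return the index of the first inconsistent entry, or None if all check out."""
    prev = START
    for i, entry in enumerate(log):
        body = {k: entry[k] for k in FIELDS}
        if entry["seq"] != i or entry["prev"] != prev or entry["digest"] != entry_digest(body):
            return i
        prev = entry["digest"]
    return None


# Hypothetical events from a CSP administrator and a user.
log = []
append_event(log, "csp-admin", "migrate vm-17")
append_event(log, "user-42", "upload object-a")
append_event(log, "user-42", "delete object-a")
print(first_bad_entry(log))

log[1]["action"] = "upload object-b"  # simulate after-the-fact tampering
print(first_bad_entry(log))
```

A chain held by a single party can be recomputed wholesale by that party, so a real deployment would also need to anchor the latest digest somewhere outside the log keeper's control.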

I now have a system where every event (CSP administrative or user) is independently verifiable by the CSP, the user and any 3rd party. I can show these events to the user at any time to prove that we did X and they did Y, and at any point where there’s disagreement, we can go to any 3rd party arbitrator for verification. Also, as a CSP, I can now provide all account details for a user with 100% certainty, which eliminates overcharging and undercharging by the CSP by providing the most accurate billing information.

Mutual auditability was thought to be an unattainable goal just a few years ago. Today, we are closer than ever to realizing it. Among previous incarnations of mutual auditability in cloud computing, there was a DARPA project based on Attribution-based Architectures, in which no entity in the system could forge its actions, providing a nearly leak-proof system. The details of the system were not widely published. In the public cloud age, however, such a system did not exist until now. When you combine the latest rsyslog, BSM, and Guardtime, the public cloud can provide a higher level of security than anything that can be done in-house.
