Integrity in the Cloud

1. All communication, whether SSH or HTTP(S) using HTTP Signature, is verified as accurate and legitimate.
2. For all data I upload to my CSP, I can attach a signature (MAC) to verify whether the contents have changed.
3. For all data I upload to my CSP, I am sure it will be the same in the future.
To have mutual auditability, I now want to expand that list to include the following:
4. For all data I upload to my CSP, I can ensure I have the latest revision of my data. Currently, this can't happen without a lot of work on either the customer's or the CSP's end.
5. I can ensure I am charged the correct amount by my CSP for data at rest and data in motion.
But what about data fingerprints on my data?

Keyless Signature Infrastructure (KSI) is a system based on the building blocks of MACs, high-resolution timestamping, hash chains and hash trees. KSI provides proof of integrity, proof of time and proof of signing authority. It is considered keyless because it is based entirely on the formal properties of hash functions and not on cryptographic keys; essentially, key management is thrown out of the window because it's not needed.
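To make the hash tree and hash chain building blocks concrete, here is a minimal sketch in Python. It is not Guardtime's implementation or API; the function names (`merkle_root_and_proof`, `verify`) and the choice of SHA-256 are my own assumptions for illustration. Many data items are hashed into a single root, and each item gets a short hash chain that proves it belongs under that root.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest of data (hash function chosen for illustration)."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index]; leaves must be non-empty.

    proof is a list of (sibling_hash, sibling_is_left) pairs: the hash
    chain that leads from the chosen leaf up to the root.
    """
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by repeating the last node
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from leaf to root using only hash functions."""
    digest = h(leaf)
    for sibling, sibling_is_left in proof:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root


if __name__ == "__main__":
    docs = [b"invoice.pdf contents", b"backup.tar contents", b"notes.txt contents"]
    root, proof = merkle_root_and_proof(docs, 1)
    print(verify(docs[1], proof, root))        # the untouched item checks out
    print(verify(b"tampered", proof, root))    # a changed item does not
```

Anyone holding the root can run `verify`, and no secret key is involved at any step, which is the property the "keyless" name refers to.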

The best part of KSI is that, by the nature of the technology, data is tamper-evident, signatures never expire, and it can be used by the user, the CSP and an arbitrary third party (e.g., an auditor) to verify both the order of events and that an event hasn't been altered.

So if a CSP has this technology integrated into their APIs and portions of their subsystems like storage, then not only will each data object have its own signature, but all future revisions of the data will have one as well, ensuring the order of operations (i.e., that this revision came before that one). So now we have #1 - #4 accomplished; how about #5? To accomplish #5, a CSP will have to extend KSI into every piece of their architecture.
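The revision ordering described above can be sketched as a simple chain of records, where each revision commits to the one before it. This is an illustration only, not how Guardtime or any CSP actually stores revisions; the record fields (`rev`, `content_hash`, `prev_hash`) and the function names are hypothetical. The customer keeps the hash of the newest record (the head) and uses it to check both the order of revisions and whether the copy they were given is the latest.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Hash a revision record in a canonical (sorted-key) JSON form."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_revision(history: list, content: bytes) -> dict:
    """Add a revision that commits to the previous record's hash."""
    record = {
        "rev": len(history) + 1,
        "content_hash": hashlib.sha256(content).hexdigest(),
        "prev_hash": record_hash(history[-1]) if history else None,
    }
    history.append(record)
    return record


def verify_history(history: list, expected_head: str) -> bool:
    """Check numbering, every back-link, and that the chain ends at the
    head hash the verifier recorded independently."""
    prev = None
    for number, record in enumerate(history, start=1):
        if record["rev"] != number or record["prev_hash"] != prev:
            return False
        prev = record_hash(record)
    return prev == expected_head


def is_latest(history: list, content: bytes) -> bool:
    """True if content matches the newest revision in the history."""
    digest = hashlib.sha256(content).hexdigest()
    return bool(history) and history[-1]["content_hash"] == digest


if __name__ == "__main__":
    history = []
    for body in (b"draft", b"draft with edits", b"final"):
        append_revision(history, body)
    head = record_hash(history[-1])  # kept by the customer
    print(verify_history(history, head), is_latest(history, b"final"))
```

Because the head is held outside the CSP, dropping the newest revision or rewriting an older one changes the recomputed chain and the check fails.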

Let's say that as a CSP, I have incorporated Guardtime into my hypervisor rsyslog and BSM. I can tell that every log event and every event that is audited via BSM occurred in the order in which it was logged, that it actually occurred (no tampering), and who performed it. I also built all APIs to use Guardtime to log all events (either directly using node-guardtime or using rsyslog). What does that get me?

I now have a system where every event (CSP administrative or user) is independently verifiable by the CSP, the user and any third party. I can show these events to the user at any time to prove to them that we did X and they did Y, and at any point where there's disagreement, we can go to any third-party arbitrator for verification. Also, as a CSP, I can now provide all account details for a user with 100% certainty, which eliminates overcharging and undercharging by the CSP by providing the most accurate billing information.
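Once both sides trust the same event log, checking a bill becomes plain arithmetic that the user, the CSP or an arbitrator can repeat. The sketch below assumes the events have already been verified; the event shape (`kind`, `quantity`), the rates and the function names are all hypothetical and are not taken from any real CSP's price list.

```python
from decimal import Decimal

# Hypothetical rates for illustration only.
RATES = {
    "storage_gb_hour": Decimal("0.0001"),  # data at rest
    "transfer_gb": Decimal("0.09"),        # data in motion
}


def compute_charges(events, rates=RATES) -> Decimal:
    """Sum the charges implied by a list of already-verified usage events."""
    total = Decimal("0")
    for event in events:
        total += rates[event["kind"]] * Decimal(str(event["quantity"]))
    return total


def reconcile(invoice_total: Decimal, events, rates=RATES):
    """Return (expected_total, difference); a positive difference means
    the invoice is higher than the log supports."""
    expected = compute_charges(events, rates)
    return expected, invoice_total - expected


if __name__ == "__main__":
    events = [
        {"kind": "storage_gb_hour", "quantity": 100 * 720},  # 100 GB for 720 hours
        {"kind": "transfer_gb", "quantity": 10},
    ]
    expected, difference = reconcile(Decimal("8.50"), events)
    print(expected, difference)
```

`Decimal` is used instead of floats so that every party recomputing the bill arrives at exactly the same figure.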

Mutual auditability was thought to be an unattainable goal just a few years ago. Today, we are closer than ever to realizing it. Among previous attempts at mutual auditability in cloud computing, there was a DARPA project based on Attribution-based Architectures, in which no entity in the system could forge its actions, providing a nearly leak-proof system. The details of the system were not widely published. In the public cloud age, however, such a system didn't exist, until now. When you combine the latest rsyslog, BSM, and Guardtime, the public cloud can provide a higher level of security than anything that can be done in-house.
