
Cloud Security

1. Confidentiality

If you back up your data to the cloud, your Cloud Service Provider (CSP) shouldn’t be able to see the data you store. “They can’t guarantee that!” you say. “I have to encrypt the data before it leaves my machine.” That’s all well and good, until you rely on your CSP for this service, as with the Server-Side Encryption available in AWS S3, where they manage your keys. How do you ensure those keys are managed so that an administrator can’t decrypt your data?
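One way to keep control is to encrypt on the client before anything touches the provider's storage, so the CSP only ever holds ciphertext. Here is a minimal sketch in Python using the cryptography package's Fernet together with boto3; the bucket name, object key, and backup file are hypothetical, and in practice you still need a real key-management story for the key itself.

    import boto3
    from cryptography.fernet import Fernet

    # Hypothetical bucket, object key, and local file -- for illustration only.
    BUCKET = "my-backup-bucket"
    OBJECT_KEY = "backups/2024-01-01.tar.gz.enc"

    # Generate a key that never leaves your machine (store it safely yourself).
    key = Fernet.generate_key()
    fernet = Fernet(key)

    # Encrypt locally, then upload only the ciphertext.
    with open("backup.tar.gz", "rb") as f:
        ciphertext = fernet.encrypt(f.read())

    boto3.client("s3").put_object(Bucket=BUCKET, Key=OBJECT_KEY, Body=ciphertext)

The point isn't this particular library -- it's that the key is generated and held on your side, so an S3 administrator who can read the object still can't read your data.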

2. Integrity

How do you know that your CSP didn’t log into your machines when you weren’t looking? Well, unfortunately, in most cases you don’t. Sure, you can check your logs, but the really good hackers (and even the script kiddies with good tools) can remove log entries and modify timestamps. You need a way to verify that you’re the only one logging into your machines. To expand on this, customers typically manage the images and snapshots they run in the cloud. How do you ensure that no one has tampered with those images since they were made? Take the marketplace where companies buy and sell images to other users -- how can the buyer be sure the image they are purchasing doesn’t contain malware, viruses, or other malevolent software, whether or not the seller knows it’s there?
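One common building block here is a cryptographic digest: record a hash of the image at the moment it is built, keep that value somewhere the CSP can't silently rewrite, and re-check it before you launch. A minimal sketch in Python with hashlib; the image file name and the expected digest are placeholders for illustration.

    import hashlib

    def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
        """Stream a file and return its SHA-256 hex digest."""
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Digest recorded out of band when the image was built -- hypothetical value.
    EXPECTED = "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"

    if sha256_of("my-machine-image.img") != EXPECTED:
        raise RuntimeError("Image digest mismatch -- possible tampering")

This doesn't tell you who changed the image, only that it changed, which is why the recorded digests themselves need to live outside the provider's reach.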

3. Availability

This can plague any CSP, as AWS saw when a misconfigured application file took out an entire availability zone, as happened to EBS on April 20, 2011. The mitigation is to run across multiple availability zones: when one zone fails, another is ready to take over, so availability isn’t lost.
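On AWS, spreading a workload across zones is largely a matter of where you place instances. A rough sketch with boto3, using a hypothetical AMI ID, that launches one small instance into each available zone in a region:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Hypothetical AMI ID -- for illustration only.
    AMI_ID = "ami-0123456789abcdef0"

    # Find every zone currently available in the region.
    zones = [z["ZoneName"]
             for z in ec2.describe_availability_zones()["AvailabilityZones"]
             if z["State"] == "available"]

    # Place one instance in each zone so a single-zone failure doesn't take us out.
    for zone in zones:
        ec2.run_instances(
            ImageId=AMI_ID,
            InstanceType="t3.micro",
            MinCount=1,
            MaxCount=1,
            Placement={"AvailabilityZone": zone},
        )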

4. Mutual Auditability

This is the holy grail of cloud security, and one that many providers are not even aware of. What it means is that as an administrator, you can prove to the user that their actions are their actions, and that you didn’t perform any actions yourself. And as a user, I can verify that my actions are my actions -- I can see that I’m the one who did them. More importantly, an auditor (or other third party) can go in and determine which actions were completed by which parties.
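On AWS, for example, a third party with read access to CloudTrail can attribute API calls to the identity that made them. A small sketch with boto3, assuming CloudTrail is enabled in the account; the IAM user name "alice" is hypothetical:

    import boto3
    from datetime import datetime, timedelta

    cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

    # Ask CloudTrail which API calls a given identity made over the last week.
    response = cloudtrail.lookup_events(
        LookupAttributes=[{"AttributeKey": "Username", "AttributeValue": "alice"}],
        StartTime=datetime.utcnow() - timedelta(days=7),
        EndTime=datetime.utcnow(),
    )

    for event in response["Events"]:
        print(event["EventTime"], event.get("Username"), event["EventName"])

This attributes API-level actions; whether it also covers everything the provider's own administrators can do out of band is exactly the open question raised above.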
For many people, these concerns aren't at the forefront of their minds -- but there are all kinds of advantages to this level of insight. 
