From Cloud Security Alert to Open Source Bugfix

Perhaps you’ve seen vulnerability reports in your CI/CD pipeline or in tools like npm. Cloud infrastructure has these too, and I was surprised to get one. Naturally, I had to investigate to see where I went wrong… (and of course mitigate the problem).

The security alert

It all started a few weeks ago when I received an email from a colleague with the following content.

Secure transfer to storage accounts should be enabled
Subscription ID: devrel-berndverst-demo-test
Resource: /subscriptions/XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXXX/resourcegroups/kubeflowrelease/providers/microsoft.storage/storageaccounts/fairingXXXXXXXXXXXXXXXXX
1-Click mitigation link

Note: several similar results have been omitted

This was my gut reaction:

😱This cannot be. What did I do wrong? Did I mess something up? 😰

My colleague had used Azure Security Center to generate a report of security issues. Since I had never received such a message before (note that I could have set up email alerts in Azure Security Center myself), I was a bit skeptical and first verified the authenticity of the email: the email domain, the Sender Policy Framework (SPF) check, and the DomainKeys Identified Mail (DKIM) signature all indicated that the email was legitimate.
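For the curious: receiving mail servers usually record their SPF and DKIM verdicts in an Authentication-Results header, which you can read from the raw message source. Here is a minimal sketch using only Python's standard library. The function name and the sample addresses are made up for illustration, and a real check should only trust the header added by your own mail provider.

```python
import re
from email import message_from_string


def authentication_verdicts(raw_message):
    """Collect spf/dkim/dmarc verdicts from Authentication-Results headers."""
    message = message_from_string(raw_message)
    verdicts = {}
    for header in message.get_all("Authentication-Results", []):
        # Each result looks like "spf=pass" or "dkim=fail".
        for method, verdict in re.findall(
            r"\b(spf|dkim|dmarc)=(\w+)", header, flags=re.IGNORECASE
        ):
            # Keep the first verdict seen for each method.
            verdicts.setdefault(method.lower(), verdict.lower())
    return verdicts


# Invented sample message, not the alert email from this post.
sample = (
    "Authentication-Results: mx.example.com;\n"
    " spf=pass smtp.mailfrom=sender.example;\n"
    " dkim=pass header.d=sender.example\n"
    "From: alerts@sender.example\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
verdicts = authentication_verdicts(sample)
```

Anything other than "pass" for both methods would have been a reason to stop and not click the link.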

The 1-Click mitigation link (which I verified went to the Azure web portal domain) took me directly to the recommendations section in the Azure Security Center for my Azure account and jumped to the relevant entries. The Security Center recommendations documentation also had more information on the alert I had received.

What I did wrong (supposedly)

Azure Blob Storage accounts should be configured to serve traffic only over HTTPS.

The 1-step mitigation in Azure Security Center resolved the issue, but I could also have followed these instructions in the docs.

🤔 Why would I ever not want secure blob transfers? This does not sound like me.

Taking a closer look

I noticed that all affected blob storage accounts had programmatically generated names. Furthermore they all resided in my Azure resource group kubeflowrelease and contained the string fairing in the account names.
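This kind of triage can also be scripted. The sketch below assumes each account object exposes `name` and `enable_https_traffic_only` attributes, which is how I understand the models returned by `storage_accounts.list_by_resource_group` in azure-mgmt-storage. It uses stand-in objects with invented names so that it runs without Azure credentials.

```python
from types import SimpleNamespace


def find_insecure_accounts(accounts, name_fragment=""):
    """Return names of accounts that do not enforce HTTPS-only traffic."""
    return [
        account.name
        for account in accounts
        if name_fragment in account.name
        # None (setting not reported) is treated as insecure to be safe.
        and not account.enable_https_traffic_only
    ]


# Stand-in data; the account names are invented for illustration.
sample_accounts = [
    SimpleNamespace(name="fairing0001", enable_https_traffic_only=False),
    SimpleNamespace(name="fairing0002", enable_https_traffic_only=None),
    SimpleNamespace(name="mydata", enable_https_traffic_only=True),
]
flagged = find_insecure_accounts(sample_accounts, name_fragment="fairing")
```

With a real client, the list returned for the kubeflowrelease resource group could be passed in place of `sample_accounts`.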

Days earlier, I had ported an end-to-end Kubeflow tutorial that uses the MNIST training set to Azure. In the process, of course, I deployed Kubeflow to my Kubernetes cluster and went through the tutorial I wrote.

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.

The culprit: Kubeflow

💡Kubeflow somehow created the storage accounts in question

The Kubeflow Fairing helper library created the storage account without forcing secure transport.

How to correctly create secure storage accounts

Looking at the relevant documentation I discovered something very interesting:

By default, the Secure transfer required property is enabled when you create a storage account in Azure portal. However, it is disabled when you create a storage account with SDK.

In other words, the SDKs default to exactly what Azure Security Center had alerted me about: creating insecure storage accounts.

❓Kubeflow Fairing is written in Python. I wonder how to create secure storage accounts with the Azure Python SDK.

The Python SDK sample code at github.com/Azure-Samples/storage-python-manage does not explicitly set the secure option. The sample code uses the insecure default [Edit: it turns out that this depends on the SDK version used]. I made this pull request to fix this.
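To make the idea concrete, here is a minimal sketch of creating a storage account with the secure option set explicitly. It is not the code from the pull request. It assumes the azure-mgmt-storage package and an already authenticated `StorageManagementClient` passed in as `storage_client`; the SKU and kind are illustrative choices, and the function name is my own.

```python
def create_secure_storage_account(storage_client, resource_group, account_name, location):
    """Create a storage account that only accepts HTTPS traffic."""
    # Imported lazily so this module loads even without the SDK installed.
    from azure.mgmt.storage.models import Sku, StorageAccountCreateParameters

    parameters = StorageAccountCreateParameters(
        sku=Sku(name="Standard_LRS"),  # illustrative choice
        kind="StorageV2",  # illustrative choice
        location=location,
        # Set explicitly rather than relying on the API version's default.
        enable_https_traffic_only=True,
    )
    # begin_create returns a poller; older SDK releases name this method create().
    poller = storage_client.storage_accounts.begin_create(
        resource_group, account_name, parameters
    )
    return poller.result()
```

Setting `enable_https_traffic_only` explicitly means the result no longer depends on which SDK or API version happens to be installed.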

Fixing the sample code led to several unrelated issues with the tests, all of which I fixed:

  • The Azure Python SDK version was not locked and the API output had changed, no longer matching the mocked responses
  • Travis was running tests for a version of Python that was no longer supported
  • Travis was not running tests for the current Python versions, 3.7 and 3.8

It took me hours, but Azure-Samples/storage-python-manage#11 fixed all of the above.

What did Kubeflow Fairing do?

Taking a closer look, we can see that Kubeflow Fairing basically copied the Azure Python sample code.

I would have done the same. Good thing that is fixed now!

I made this pull request to fix the issue in Kubeflow Fairing.

Success?

  • ✅ Kubeflow Fairing now creates secure storage accounts.
  • ✅ Anyone finding the Azure Python SDK samples for storage account creation will create secure accounts.

Solution:
A new version of the Azure Storage Resource Provider API should default to creating secure storage accounts if the supportsHttpsTrafficOnly parameter has not been provided.

Anyone who updates the Azure CLI or Storage SDK will automatically inherit secure defaults. Those who have been relying on the insecure behavior (this should be few people) will experience a breaking change, but they of course are not required to upgrade SDKs or API versions and can certainly explicitly set the parameter.

Exactly what I suggested here has in fact been the behavior since API version 2019-04-01. All current SDKs use this API version or a newer one, which defaults to the secure setting.
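The versioned default can be pictured with a toy model. This is only a sketch of the behavior described above, not Azure's implementation; the function names are mine, and the request body shape reflects my understanding of the storage account create call, with `supportsHttpsTrafficOnly` nested under `properties`.

```python
# First API version that defaults to secure transfer, per the text above.
SECURE_DEFAULT_SINCE = "2019-04-01"


def effective_https_only(api_version, supports_https_traffic_only=None):
    """Return the setting an account would end up with in this toy model."""
    if supports_https_traffic_only is not None:
        # An explicit value always wins, whatever the API version.
        return supports_https_traffic_only
    # API versions are ISO dates, so comparing the first ten characters
    # as strings orders them chronologically (ignores "-preview" suffixes).
    return api_version[:10] >= SECURE_DEFAULT_SINCE


def storage_account_request_body(location, sku="Standard_LRS", kind="StorageV2"):
    """Build a create request body that does not rely on any default."""
    return {
        "sku": {"name": sku},
        "kind": kind,
        "location": location,
        "properties": {"supportsHttpsTrafficOnly": True},
    }
```

Callers who pass the parameter explicitly get the same result on every API version, which is why the explicit setting is still worth keeping in sample code.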

So why the Kubeflow Fairing issue after all?

Kubeflow Fairing was using the deprecated azure library. Instead, it should have used azure-mgmt-storage.

The edits to the Azure Python sample documentation weren’t strictly necessary. But at least I fixed the tests.

We must go deeper