Enforce Compute Sustainability through Policies

Published by Vanessa Kantner

Authored by Florian Stoeber and Vanessa Kantner


Using the public cloud has a significant impact on the environment. The public cloud is just another company’s servers running in a data centre. As of mid-2022, these data centres are responsible for 2% of the world’s total greenhouse gas (GHG) emissions[1]. This share is expected to rise over the coming years because of the almost exponential growth in demand for public cloud services.

In this blog post, we share an idea, and a possible implementation, for centrally governing your organization’s cloud carbon emissions by using policies in your cloud environment.

How to reduce GHG emissions of the public cloud

Cloud providers can reduce that impact by designing efficient facilities or improving server utilization. The first depends on the power source and the location of the data centres: using renewable energy or building a data centre in a cold location reduces emissions. The latter is achieved by exploiting virtualization techniques[2]. Both depend heavily on the region of the data centres.

The three most prominent cloud providers, Amazon Web Services (AWS), Google Cloud Platform (GCP) and Microsoft Azure, are actively working on the necessary improvements. GCP has already reached carbon neutrality and 100% renewable energy, while AWS and Azure both list this among their short-term goals for 2025. New data centres are being built in cold locations, which is the main reason Finland has ten data centres operated by four different providers[3]. In addition, there is work on reducing carbon emissions by using AI to improve cooling[4] or by submerging the data centre in the sea[5].

Nevertheless, demand for the public cloud is rising. Compared to 2021, public cloud spending is expected to increase by up to 46% by 2023[6]. This growing demand leads to the continuous construction of additional data centres, which offsets part of the improvements the providers make to their physical data centres. This is why public cloud customers are responsible for making their own cloud use carbon efficient. Many of the measures used to reduce carbon emissions are the same as those used to reduce costs, such as shutting down idle resources or environments, or rightsizing compute instances. Adopting a FinOps practice for cost reduction will therefore lead almost immediately to a reduction of your cloud carbon footprint.

Focus on the largest consumer: Compute

Classic compute resources, such as AWS’s EC2, Azure’s VMs or GCP’s Compute Engine, are not only the oldest but also the most prominent public cloud services, and thus usually the baseline for sophisticated architectures. These compute resources can be placed in a specific region of the world. Typically, regions are picked based on latency, price, compliance needs or just out of habit. But this placement can also be based on the carbon emissions of a specific region.

GCP is a pioneer in this field, providing its customers with the data needed to make an informed decision about a low-carbon placement of their compute instances. GCP shares information about the grid carbon intensity of each region. Based on this, it calculates the average percentage of carbon-free energy in that location. As a result, it shows its customers “Low CO2” regions.[7]

This metric is displayed during daily work with GCP Compute, e.g., when creating a new Compute Engine instance:

Low CO2 metric when creating a new instance in GCP

To enforce the use of such Low CO2 regions throughout the company, it is usually not enough to state this as a requirement or best practice. And what happens to the instances that are already up and running?

How to prevent the creation of new instances in high-carbon regions

To enforce using low carbon regions throughout the whole organization for all future Compute Engine instances, GCP’s Organization Policies can be used. In general, these policies support the centralized control of how an organization’s resources can be used.

To allow only low-carbon regions, the existing organization policy “Resource Location Restriction”, which defines the set of locations where location-based GCP resources can be created, can be limited to these low-carbon locations.

Enforce low-carbon locations with organizational policy in GCP
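
The allowed values for such a policy could be generated from a region list instead of being clicked together in the console. The following is a minimal sketch, not the authors' setup: the region list is copied from the Cloud Custodian filters further down in this post, the constraint name is the one shown in the error message below, and the "in:<region>-locations" value-group naming as well as the helper name build_location_policy are assumptions you should verify against the GCP documentation.

```python
import json

# Low-carbon regions, copied from the Cloud Custodian filters in this post.
LOW_CARBON_REGIONS = [
    "northamerica-northeast1",
    "northamerica-northeast2",
    "southamerica-east1",
    "us-central1",
    "us-west1",
    "europe-north1",
    "europe-southwest1",
    "europe-west1",
    "europe-west6",
    "europe-west9",
]


def build_location_policy(regions):
    """Return a list-policy document that only allows the given regions.

    Hypothetical helper for illustration; the document layout follows the
    list-policy format with a `constraint` and `listPolicy.allowedValues`.
    """
    return {
        "constraint": "constraints/gcp.resourceLocations",
        "listPolicy": {
            # Assumption: each region is addressed through its
            # "in:<region>-locations" value group.
            "allowedValues": [f"in:{region}-locations" for region in regions],
        },
    }


policy = build_location_policy(LOW_CARBON_REGIONS)
print(json.dumps(policy, indent=2))
```

Written to a file, the printed document should be usable with `gcloud resource-manager org-policies set-policy` on the organization, folder or project level. Keeping the region list in one place makes it easier to update the policy when the set of low-carbon regions changes.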

This results in only low-carbon regions being displayed when setting up new instances within the console.

Creating a new instance with low-carbon organizational policy applied

It also applies to new instances created through Infrastructure as Code or the command line: creating an instance in a forbidden region returns an error.

gcloud compute instances create example-instance --image-family=rhel-8 --image-project=rhel-cloud --zone=europe-west2-a

ERROR: (gcloud.compute.instances.create) Could not fetch resource:
 - Location ZONE:europe-west2-a violates constraint constraints/gcp.resourceLocations on the resource projects/liquid-florian-stoeber-668/zones/europe-west2-a/instances/example-instance.

How to stop existing instances

To apply the use of low-carbon regions to running instances, or to notify the owners that they need to adapt their instances accordingly, the open-source tool Cloud Custodian[8] can be used.

One option is to address all instances with a single policy and delete the offending ones immediately. The following example policy exempts instances that carry the “highcarbon” label: these instances are still allowed to use high-carbon regions. All other instances outside the listed low-carbon regions are deleted.

policies:
  - name: delete-instance-in-high-carbon-zone
    description: |
      Deletes all instances in high carbon zones
    resource: gcp.instance
    mode:
      type: gcp-audit
      methods:
        - beta.compute.instances.insert  # got this value from the audit logs
        - v1.compute.instances.insert    # to have the future value in place already
    filters:
      - and:
        - "tag:highcarbon": absent
        - not:
          - type: value
            key: "zone"
            op: regex
            value: "(.*northamerica-northeast1.*|.*northamerica-northeast2.*|.*southamerica-east1.*|.*us-central1.*|.*us-west1.*|.*europe-north1.*|.*europe-southwest1.*|.*europe-west1.*|.*europe-west6.*|.*europe-west9.*)"
    actions:
      - type: delete

As an alternative to the rather drastic approach of deleting instances right away, the instances could be marked for another operation with “mark-for-op”. Cloud Custodian will add an additional tag to recognize the instance in a future run. This enables the FinOps team to notify the owner of the instance and ask the owner to migrate the workload to a low-carbon region before the instance is deleted manually.

policies:
  - name: mark-notify-for-stop-instance-in-high-carbon-zone
    description: |
      Marks all instances in high carbon zones to stop in 7 days and notify
    resource: gcp.instance
    mode:
      type: gcp-audit
      methods:
        - beta.compute.instances.insert  # got this value from the audit logs
        - v1.compute.instances.insert    # to have the future value in place already
    filters:
      - and:
        - "tag:highcarbon": absent
        - not:
          - type: value
            key: "zone"
            op: regex
            value: "(.*northamerica-northeast1.*|.*northamerica-northeast2.*|.*southamerica-east1.*|.*us-central1.*|.*us-west1.*|.*europe-north1.*|.*europe-southwest1.*|.*europe-west1.*|.*europe-west6.*|.*europe-west9.*)"
    actions:
      - type: mark-for-op
        op: stop
        days: 7
      - type: notify
        template: default
        priority_header: '2'
        subject: Your instance is utilizing a high-carbon region
        to:
          - owner@example.com
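        transport:
          # assumption: for GCP resources, Cloud Custodian notifies via Pub/Sub rather than SQS;
          # replace the placeholder topic with your own
          type: pubsub
          topic: projects/<your-project>/topics/<your-topic>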
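
Both policies rely on the same regular expression to recognize low-carbon zones. The sketch below shows how that pattern behaves and compares it with an exact region comparison. It assumes that the "zone" field holds a full zone URL and that the regex operator matches from the start of the string, as Python's re.match does. The helper names and the example project are hypothetical.

```python
import re

# Low-carbon regions, copied from the policy filters above.
LOW_CARBON_REGIONS = [
    "northamerica-northeast1",
    "northamerica-northeast2",
    "southamerica-east1",
    "us-central1",
    "us-west1",
    "europe-north1",
    "europe-southwest1",
    "europe-west1",
    "europe-west6",
    "europe-west9",
]

# Same shape as the pattern in the policies: "(.*region1.*|.*region2.*|...)".
POLICY_PATTERN = "(" + "|".join(f".*{r}.*" for r in LOW_CARBON_REGIONS) + ")"


def matches_policy_regex(zone_url):
    """Return True if the zone matches the loose pattern used in the policies."""
    return re.match(POLICY_PATTERN, zone_url) is not None


def region_of(zone_url):
    """Derive the region from a zone URL or a bare zone name."""
    zone = zone_url.rsplit("/", 1)[-1]  # ".../zones/europe-west1-b" -> "europe-west1-b"
    return zone.rsplit("-", 1)[0]       # "europe-west1-b" -> "europe-west1"


def is_low_carbon(zone_url):
    """Return True only if the zone's region is exactly one of the listed regions."""
    return region_of(zone_url) in LOW_CARBON_REGIONS


# Hypothetical project name; only the zone part matters here.
BASE = "https://www.googleapis.com/compute/v1/projects/example-project/zones/"

for zone in ("europe-west1-b", "europe-west2-a", "europe-west10-a"):
    url = BASE + zone
    print(zone, matches_policy_regex(url), is_low_carbon(url))
```

Because the pattern is a substring match, a zone such as europe-west10-a also matches ".*europe-west1.*", while the exact comparison rejects it. If the region list grows, a stricter pattern or an explicit list of zones may therefore be the safer choice.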

Conclusion

Currently, GCP is the only public cloud provider that offers this level of transparency on the carbon emissions of its regions. The use of these low-emission regions can be enforced via GCP organization policies for new instances, or via third-party policy tools like Cloud Custodian for running instances. On closer inspection, only a few regions in Europe and the Americas currently qualify as low-carbon for GCP Compute Engine, which depends heavily on the energy mix of each region. Enforcing low-carbon regions within an organization might therefore not be advisable for workloads tied to other regions or with specific latency or pricing requirements. For a sustainable future, this means that consumers must take cloud carbon emissions into account alongside latency, price and compliance when placing an instance.


References

[1] https://www.independent.co.uk/climate-change/news/global-warming-data-centres-to-consume-three-times-as-much-energy-in-next-decade-experts-warn-a6830086.html

[2] https://www.springerprofessional.de/en/towards-greener-applications-enabling-sustainable-aware-cloud-na/23122802

[3] https://www.datacenters.com/locations/finland

[4] https://www.deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-by-40

[5] https://news.microsoft.com/innovation-stories/project-natick-underwater-datacenter/

[6] https://www.gartner.com/en/newsroom/press-releases/2022-04-19-gartner-forecasts-worldwide-public-cloud-end-user-spending-to-reach-nearly-500-billion-in-2022#:~:text=Worldwide%20end%2Duser%20spending%20on,to%20reach%20nearly%20%24600%20billion.

[7] https://cloud.google.com/sustainability/region-carbon

[8] https://cloudcustodian.io/