FinOps + Policy-as-Code
FinOps brings financial accountability to the variable spend model of cloud, enabling distributed teams to make business trade-offs between speed, cost, and quality.
FinOps definition from Cloud FinOps by J.R. Storment and Mike Fuller
tl;dr: Writing FinOps-guided governance policies will help with your Cloud Cost Optimization.
In this post, we explain the concepts of FinOps and Policy-as-Code in simple terms.
What is FinOps
The term FinOps is a portmanteau of Finance and DevOps, often expanded as Financial Operations, and is nowadays used as a synonym for Cloud Financial Management or Cloud Cost Management. Because the variable spend model of the cloud changes how financial controlling and procurement work, a cross-functional discipline was needed to get the most out of the cloud.
FinOps has the potential to fulfill this need and has already been adopted by some large companies. At its core, FinOps can be defined as an operational framework and cultural shift that brings technology, finance, and business together to drive financial accountability and accelerate business value realization through cloud transformation.
Benefits of FinOps
FinOps is a way for teams to manage their cloud costs in which everyone takes ownership of their cloud usage, supported by a central best-practices group. Cross-functional teams work together to enable faster delivery, while at the same time gaining more financial and operational control.
No longer is a siloed procurement team identifying costs and signing off on them. Instead, a cross-functional FinOps team adopts a definitive series of procurement best practices, enabling them to pull together technology, business, and finance in order to optimize cloud vendor management, rate, and discounting. With FinOps, each operational team (workload, service, product owner) can access the near-real-time data they need to influence their spend and help them make intelligent decisions that ultimately result in efficient cloud costs balanced against the speed/performance and quality/availability of services.
What is Policy-as-Code
The definition and benefits of Policy-as-Code come from the cloud-native world. Let's start with a similar concept, Infrastructure-as-Code (IaC). IaC is the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools.
Applying the same logic, Policy-as-Code is the idea of writing code in a high-level language to manage and automate policies. The language depends on the policy engine, which takes a query input, some data, and a policy, and produces a query result. For a policy engine such as the open-source Open Policy Agent (OPA), the policies are expressed in a declarative language called Rego. Alternatives to OPA include HashiCorp's Sentinel and Kyverno.
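To make the "query input, data, and policy in, query result out" flow concrete, here is a minimal sketch in Python. It is not how OPA or Rego work internally; the function names, the `allowed_machine_types` data key, and the machine types are hypothetical and only illustrate the shape of the evaluation.

```python
from typing import Any, Callable

# Assumption: a policy is modelled as a plain function of (query input, data)
# that returns a list of violation messages. Real engines use their own language.
Policy = Callable[[dict[str, Any], dict[str, Any]], list[str]]


def deny_unlisted_machine_type(query_input: dict[str, Any], data: dict[str, Any]) -> list[str]:
    """Return a violation if the requested machine type is not in the allowed list."""
    machine_type = query_input.get("machine_type")
    if machine_type not in data["allowed_machine_types"]:
        return [f"machine type {machine_type!r} is not in the allowed list"]
    return []


def evaluate(policy: Policy, query_input: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
    """Toy policy engine: combine query input, data, and policy into a query result."""
    violations = policy(query_input, data)
    return {"allow": not violations, "violations": violations}


# Hypothetical data document maintained by a FinOps team.
data = {"allowed_machine_types": ["e2-small", "e2-medium"]}
result = evaluate(deny_unlisted_machine_type, {"machine_type": "n2-highmem-64"}, data)
print(result)
```

Keeping the data (the allowed list) separate from the policy logic means the list can change without touching the rule itself, which is the same separation the policy engines mentioned above rely on.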
Benefits of Policy-as-Code
Policy-as-Code has many benefits and use cases, especially when it comes to cloud cost restrictions. A few of them are listed below:
- Cloud Infrastructure Provisioning: Write fine-grained policies that enforce rules on cloud resources such as VM machine types, disk storage, network, and firewall settings, for example by requiring mandatory tags. This can be accomplished with HashiCorp Sentinel policies evaluated against your Terraform runs, or with Cloud Custodian.
- Kubernetes Control: Implement restricted access control policies. Whenever human users and/or Kubernetes service accounts make REST API requests against Kubernetes resources such as pods, nodes, and services, the requests go through the policy engine (e.g. OPA Gatekeeper), which checks whether the API call is authorized. The same approach can enforce rules for your ingress/egress network traffic.
- Automated Compliance Check: Codify your organization's agreed and well-established guidelines, best practices, and conventions as rules in your CI/CD pipelines. This automates frequent tasks and improves efficiency, which eventually reduces maintenance costs, enhances overall security, and reduces the attack surface.
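The mandatory-tags use case can be sketched as a small check that could run in a CI/CD pipeline. This is an illustrative Python sketch, not Sentinel or Cloud Custodian syntax; the required tag names and the resource dictionaries are assumptions made for the example.

```python
# Hypothetical tag policy: every resource must carry these non-empty tags.
REQUIRED_TAGS = {"cost-center", "owner", "environment"}


def missing_tags(resource: dict) -> set[str]:
    """Return the required tags that are absent or empty on a resource."""
    tags = resource.get("tags") or {}
    present = {key for key, value in tags.items() if value}
    return REQUIRED_TAGS - present


def check_plan(resources: list[dict]) -> dict[str, list[str]]:
    """Map each non-compliant resource name to its sorted list of missing tags."""
    report = {}
    for resource in resources:
        missing = missing_tags(resource)
        if missing:
            report[resource["name"]] = sorted(missing)
    return report


# Hypothetical planned resources, e.g. extracted from an infrastructure plan.
plan = [
    {"name": "vm-web", "tags": {"cost-center": "1001", "owner": "team-a", "environment": "prod"}},
    {"name": "vm-batch", "tags": {"environment": "dev"}},
]
print(check_plan(plan))
```

A pipeline step could fail the build whenever the report is non-empty, so untagged and therefore unattributable spend never reaches the cloud account.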
If your workload is being operated and orchestrated using Kubernetes, then you probably have been looking for ways to control what end-users can do on the cluster and ways to ensure that clusters are in compliance with company policies. These policies may be there to meet governance and legal requirements or to enforce best practices and organizational conventions. With Kubernetes, how do you ensure compliance without sacrificing development agility and operational independence?
For example, you can enforce policies like:
- All images must be from approved repositories
- All ingress hostnames must be globally unique
- All pods must have resource limits
- All namespaces must have a label that lists a point-of-contact
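Two of the rules above (approved image repositories and mandatory resource limits) can be sketched as a plain Python check over a pod manifest. In a real cluster such rules would typically be written for a policy engine like OPA Gatekeeper; the registry prefix and the function name here are hypothetical.

```python
# Assumption: a single approved registry prefix, chosen only for illustration.
APPROVED_REGISTRIES = ("registry.example.com/",)


def pod_violations(pod: dict) -> list[str]:
    """Collect policy violations for every container in a pod manifest."""
    violations = []
    for container in pod.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")
        image = container.get("image", "")
        # Rule: all images must be from approved repositories.
        if not image.startswith(APPROVED_REGISTRIES):
            violations.append(f"container {name}: image {image!r} is not from an approved repository")
        # Rule: all pods must have resource limits (checked here for cpu and memory).
        limits = (container.get("resources") or {}).get("limits") or {}
        for resource in ("cpu", "memory"):
            if resource not in limits:
                violations.append(f"container {name}: missing {resource} limit")
    return violations


pod = {
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "docker.io/library/nginx:1.25",
                "resources": {"limits": {"cpu": "500m"}},
            }
        ]
    }
}
for violation in pod_violations(pod):
    print(violation)
```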
To enforce these rules, you can use a framework that plugs these restrictions into Kubernetes through admission controllers. Admission controller webhooks run in two phases:
- Mutating Webhook: A mutating admission webhook may mutate your Kubernetes objects (for example by adding labels or annotations to your deployments or namespaces) and is defined by creating a MutatingWebhookConfiguration object in Kubernetes. An example is shown below:
    apiVersion: admissionregistration.k8s.io/v1
    kind: MutatingWebhookConfiguration
    ...
    webhooks:
    - name: my-webhook.example.com
      objectSelector:
        matchLabels:
          foo: bar
      rules:
      - operations: ["CREATE"]
        apiGroups: ["*"]
        apiVersions: ["*"]
        resources: ["*"]
        scope: "*"
- Validating Webhook: A validating admission webhook is executed after the mutation phase and, unlike a mutating webhook, does not mutate objects. It is defined by creating a ValidatingWebhookConfiguration object in Kubernetes. An example is shown below:
    apiVersion: admissionregistration.k8s.io/v1
    kind: ValidatingWebhookConfiguration
    ...
    webhooks:
    - name: my-webhook.example.com
      namespaceSelector:
        matchExpressions:
        - key: environment
          operator: In
          values: ["prod","staging"]
      rules:
      - operations: ["CREATE"]
        apiGroups: ["*"]
        apiVersions: ["*"]
        resources: ["*"]
        scope: "Namespaced"
    ...
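The configurations above only register webhooks; the webhook service itself has to answer each request with an AdmissionReview response. The following Python sketch shows what the two phases might return, assuming a hypothetical `cost-center` label policy. The handler names and the label are illustrative, and the HTTP server and TLS setup a real webhook needs are left out.

```python
import base64
import json


def review(response: dict) -> dict:
    """Wrap a response in the AdmissionReview envelope expected by the API server."""
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}


def mutate(request: dict) -> dict:
    """Mutating phase: add a default `cost-center` label when it is missing."""
    # Assumption: the object has a metadata section, as Kubernetes objects do.
    labels = request["object"].get("metadata", {}).get("labels")
    patch = []
    if labels is None:
        patch.append({"op": "add", "path": "/metadata/labels", "value": {"cost-center": "unassigned"}})
    elif "cost-center" not in labels:
        patch.append({"op": "add", "path": "/metadata/labels/cost-center", "value": "unassigned"})
    response = {"uid": request["uid"], "allowed": True}
    if patch:
        # The patch is a JSON Patch document, base64-encoded in the response.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return review(response)


def validate(request: dict) -> dict:
    """Validating phase: reject objects that still lack the `cost-center` label."""
    labels = request["object"].get("metadata", {}).get("labels") or {}
    allowed = "cost-center" in labels
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {"code": 403, "message": "label 'cost-center' is required"}
    return review(response)


request = {"uid": "1234", "object": {"metadata": {"name": "demo"}}}
print(json.dumps(mutate(request), indent=2))
print(json.dumps(validate(request), indent=2))
```

Because validation runs after mutation, a validating webhook sees the object with any defaults the mutating phase has already applied.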
FinOps as Kubernetes Governance Framework
Governance, as described by CNCF, is the ability of Ops teams to verify and enforce certain rules across departments, groups, or the entire organization. In the Kubernetes context, that means enforcing rules across Kubernetes clusters as well as applications running in those clusters.
There are two governance dimensions. First, policy scope, meaning where a specific rule should be applied, enforced, or verified. Secondly, policy targets, relating to what should be enforced and verified.
The scope may be specified in terms of organizational units (departments, teams, groups, users), technical units (cloud provider, datacenter, region, group of clusters, namespaces, label selectors, etc.), or both. Scope definition capabilities may also range from static lists to dynamic rules.
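The two dimensions can be modelled as a small data structure: a scope that decides where a policy applies and a target that states what is enforced. The sketch below is one possible Python representation under stated assumptions; the class names, fields, and example policies are hypothetical and not taken from any CNCF specification.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Where a policy applies. An empty field means no restriction on that dimension."""
    teams: frozenset[str] = frozenset()        # organizational units
    namespaces: frozenset[str] = frozenset()   # technical units
    match_labels: dict[str, str] = field(default_factory=dict)  # label selector

    def matches(self, workload: dict) -> bool:
        if self.teams and workload.get("team") not in self.teams:
            return False
        if self.namespaces and workload.get("namespace") not in self.namespaces:
            return False
        labels = workload.get("labels", {})
        return all(labels.get(key) == value for key, value in self.match_labels.items())


@dataclass
class Policy:
    name: str
    scope: Scope   # where the rule is applied
    target: str    # what is enforced or verified


def policies_for(workload: dict, policies: list[Policy]) -> list[str]:
    """Return the names of all policies whose scope covers the workload."""
    return [policy.name for policy in policies if policy.scope.matches(workload)]


policies = [
    Policy("require-limits", Scope(namespaces=frozenset({"prod"})), "pods must set resource limits"),
    Policy(
        "backend-budget",
        Scope(teams=frozenset({"payments"}), match_labels={"tier": "backend"}),
        "workloads must carry a cost-center label",
    ),
]
workload = {"team": "payments", "namespace": "prod", "labels": {"tier": "backend"}}
print(policies_for(workload, policies))
```

This sketch uses static lists for the scope; dynamic rules could replace the sets with predicates evaluated at request time.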
As explained earlier, once the security aspect of the cloud environment has been taken care of by using OPA Gatekeeper with fine-grained security policies, FinOps comes into play. This framework specializes in cost management and control of cloud resources, using tools such as our partner Apptio Cloudability. Liquid Reply can help not only to explain the FinOps principles and lifecycle, but also to lower your organization's overall cloud costs and to build a sustainable organization that runs and maintains FinOps.