Unit Costs in Action: Strategies for Effective Kubernetes Cost Allocation
Published on 11 January 2024 by Vanessa Kantner
Explore effective Kubernetes cost allocation through FinOps principles like Shared and Unit Costs. This approach needs no specialized allocation tools: it relies on consistent tagging and manually calculated metrics for client-specific allocation, ensuring transparent resource utilization and cost management in the cloud.
Authors: Vanessa Kantner, Ferenc Domröse
The Quest for Fair Kubernetes Cost Allocation
Allocating costs in Kubernetes can be a challenging task, especially for infrastructure platform teams in larger organizations. These teams seek to accurately charge internal clients for platform usage, a task involving both direct customer usage costs and the fair distribution of shared resources across clusters. The key challenge is to develop a usage-based cost accounting system that is both fair and accurate.
To illustrate our approach, consider an infrastructure platform team operating multiple AWS accounts with resources across various Kubernetes clusters. In this scenario, there are costs directly linked to client usage, as well as additional resources, both within and outside the Kubernetes clusters, that need to be equitably allocated among the clients.
The key question remains: How can this team efficiently introduce a customer-centric cost accounting system for their Kubernetes platform?
Fundamentals: Understanding Shared and Unit Costs in FinOps
The resolution to this complex problem lies in embracing FinOps capabilities, specifically Managing Shared Costs and Measuring Unit Costs. Understanding these concepts is crucial:
- Shared Costs: These are cloud costs shared by multiple teams or departments. Allocation methodologies include proportional, fixed, and even-split, all striving for transparent and equitable cost distribution.¹
- Unit Costs: In FinOps, Unit Costs break cloud spend down into the marginal cost of one unit of business value, such as cost per transaction. The journey through unit economics is typically divided into three phases:
  - Crawl: Teams begin with a cloud-only cost approach per client, laying the groundwork for future, more refined cost allocation strategies.
  - Walk: Teams expand the scope to encompass SaaS and license-based expenses, integrating tools like Datadog and fostering collaboration with product-focused teams.
  - Run: Teams develop comprehensive cost models per client that cover Cloud, SaaS, hybrid infrastructure, and personnel costs. This phase reflects an advanced understanding of resource utilization.²
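The three Shared Cost allocation methods named above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, team names, and amounts are hypothetical, and the proportional variant assumes that each team's direct cost is the weighting basis.

```python
def even_split(shared_cost, teams):
    """Split a shared cost equally across all teams."""
    return {team: shared_cost / len(teams) for team in teams}


def fixed_split(shared_cost, fixed_shares):
    """Split a shared cost by pre-agreed fractions that must sum to 1."""
    if abs(sum(fixed_shares.values()) - 1.0) > 1e-9:
        raise ValueError("fixed shares must sum to 1")
    return {team: shared_cost * share for team, share in fixed_shares.items()}


def proportional_split(shared_cost, direct_costs):
    """Split a shared cost in proportion to each team's direct costs (assumed basis)."""
    total = sum(direct_costs.values())
    if total == 0:
        raise ValueError("total direct cost must be greater than zero")
    return {team: shared_cost * cost / total for team, cost in direct_costs.items()}


# Hypothetical example values, not taken from the article's data.
shared = 900.0
even = even_split(shared, ["team-a", "team-b", "team-c"])
fixed = fixed_split(shared, {"team-a": 0.5, "team-b": 0.3, "team-c": 0.2})
proportional = proportional_split(
    shared, {"team-a": 600.0, "team-b": 300.0, "team-c": 100.0}
)
print(even)
print(fixed)
print(proportional)
```

Whichever method is chosen, the allocated amounts should add up to the original shared cost, which is a useful sanity check after every allocation run.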
Crafting a Solution
Not every IT organization can deploy specialized tools that allocate and distribute container costs at fine granularity based on usage, so alternative approaches are needed. Our solution shows how manual cost distribution can be implemented on a robust data foundation. In our use case, the customer provided this data through Cloudability, a third-party cloud cost management tool made by Apptio.
The cornerstone of this approach is a well-implemented tagging strategy in your cloud environment. This ensures that each internal customer is associated with distinct namespaces and that Cloudability can attribute the data accordingly. Shared resources and resources outside the clusters are also allocated using consistent tagging. More information on tagging strategies is available in this blog post, and on cluster namespaces here.
The usage data obtained per namespace can be further broken down into different usage metrics. The field “Item Description” indicates the corresponding cost metric, such as:
- Compute Running Hours
- Data Transfer GB/month
- Storage GB/month
The quantity used per metric, measured in units such as running hours or gigabytes, is displayed in the column “Usage Quantity”, whereas the corresponding costs are shown under “Cost (Total)”.
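As a rough sketch of how such an export could be prepared for further calculation, the following Python snippet sums usage and cost per namespace and cost metric. The column names “Item Description”, “Usage Quantity”, and “Cost (Total)” come from the text above; the “Namespace” column name, the CSV layout, and all sample values are assumptions for illustration and may not match an actual Cloudability export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample export; real exports may use a different layout.
SAMPLE_EXPORT = """Namespace,Item Description,Usage Quantity,Cost (Total)
client-a,Compute Running Hours,100,50.00
client-a,Storage GB/month,200,20.00
client-b,Compute Running Hours,300,150.00
client-b,Compute Running Hours,100,50.00
client-b,Storage GB/month,50,5.00
"""


def aggregate_usage(csv_text):
    """Sum usage quantity and cost per (namespace, cost metric) pair."""
    totals = defaultdict(lambda: {"usage": 0.0, "cost": 0.0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["Namespace"], row["Item Description"])
        totals[key]["usage"] += float(row["Usage Quantity"])
        totals[key]["cost"] += float(row["Cost (Total)"])
    return dict(totals)


aggregated = aggregate_usage(SAMPLE_EXPORT)
for (namespace, metric), values in sorted(aggregated.items()):
    print(namespace, metric, values["usage"], values["cost"])
```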
Each namespace, therefore, has three usage-dependent cost metrics (compute running hours, data transfer GB/month, storage GB/month). We exported these metrics and calculated each customer's proportion of the total usage per metric, as indicated in Figure 2. For each namespace, dividing its summed usage of a cost metric by the total usage of that metric across all namespaces yields the respective customer's share of that metric.
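The share calculation described above can be expressed as a short Python sketch. The function name, namespace names, and usage figures are hypothetical and serve only to show the arithmetic: usage of a namespace for one metric divided by the total usage of that metric.

```python
from collections import defaultdict


def usage_shares(usage):
    """Return each namespace's fraction of total usage, per cost metric.

    usage maps (namespace, metric) to the summed usage quantity.
    """
    totals = defaultdict(float)
    for (_, metric), quantity in usage.items():
        totals[metric] += quantity
    return {
        (namespace, metric): quantity / totals[metric]
        for (namespace, metric), quantity in usage.items()
        if totals[metric] > 0  # skip metrics with no recorded usage
    }


# Hypothetical usage figures for two namespaces.
usage = {
    ("client-a", "Compute Running Hours"): 100.0,
    ("client-b", "Compute Running Hours"): 300.0,
    ("client-a", "Storage GB/month"): 200.0,
    ("client-b", "Storage GB/month"): 50.0,
    ("client-a", "Data Transfer GB/month"): 10.0,
    ("client-b", "Data Transfer GB/month"): 30.0,
}
shares = usage_shares(usage)
for key, share in sorted(shares.items()):
    print(key, round(share, 4))
```

Within each metric, the shares of all namespaces add up to 1, so every unit of usage is attributed to exactly one customer.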
Shared Costs are segmented into the same categories and then distributed according to each customer's usage share. This method presumes that Shared Costs scale with usage; where that holds, this form of allocation is more precise. The assumption needs to be validated for each specific use case.
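A minimal sketch of this distribution step, assuming the usage shares per namespace and metric are already known: each category of Shared Costs is multiplied by the customer's share of the matching metric, and the results are summed per customer. All names and amounts below are hypothetical.

```python
def distribute_shared_costs(shared_costs, shares):
    """Allocate shared costs per metric category by usage share.

    shared_costs maps metric -> shared cost in that category.
    shares maps (namespace, metric) -> fraction of total usage.
    """
    allocation = {}
    for (namespace, metric), share in shares.items():
        allocation.setdefault(namespace, 0.0)
        allocation[namespace] += shared_costs.get(metric, 0.0) * share
    return allocation


# Hypothetical usage shares and shared cost totals per category.
shares = {
    ("client-a", "Compute Running Hours"): 0.25,
    ("client-b", "Compute Running Hours"): 0.75,
    ("client-a", "Storage GB/month"): 0.8,
    ("client-b", "Storage GB/month"): 0.2,
}
shared_costs = {"Compute Running Hours": 1000.0, "Storage GB/month": 100.0}

allocation = distribute_shared_costs(shared_costs, shares)
print(allocation)
```

In this example, client-a carries a quarter of the shared compute cost but most of the shared storage cost, which is exactly the differentiation a single flat percentage would hide.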
For comparison, we also distributed the costs based on cost proportions, as shown in the rightmost column in Figure 4, and compared the result with the Unit Cost distribution. As shown in the column “Cost Difference Unit Costs vs. Proportional Cost Distribution” (marked in red in Figure 4), there are significant cost differences. We assume this difference arises because the Unit Cost distribution is more precise than the cost-proportional method.
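The comparison can be illustrated with a small Python sketch. It assumes that the cost-proportional method splits the total Shared Costs by each customer's share of direct costs; the article does not spell out the exact basis, so treat this as one plausible reading. All figures are hypothetical and do not reproduce Figure 4.

```python
def cost_proportional_allocation(total_shared_cost, direct_costs):
    """Split the total shared cost by each client's share of direct costs."""
    total_direct = sum(direct_costs.values())
    return {
        client: total_shared_cost * cost / total_direct
        for client, cost in direct_costs.items()
    }


def allocation_difference(usage_based, cost_based):
    """Per-client difference: usage-based minus cost-proportional allocation."""
    return {
        client: usage_based[client] - cost_based.get(client, 0.0)
        for client in usage_based
    }


# Hypothetical inputs: a usage-based allocation and direct costs per client.
usage_based = {"client-a": 330.0, "client-b": 770.0}
direct_costs = {"client-a": 400.0, "client-b": 600.0}

cost_based = cost_proportional_allocation(sum(usage_based.values()), direct_costs)
difference = allocation_difference(usage_based, cost_based)
print(cost_based)
print(difference)
```

Because both methods distribute the same total, the per-client differences cancel out to zero overall; the comparison only shows how the burden shifts between customers.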
Conclusion and Outlook
In adopting this methodology, we've successfully attributed cloud costs to individual clients on a usage basis, aligning with an advanced 'Crawl' phase in Unit Cost considerations. This approach provides a solid foundation for future phases.
Our next steps to reach the 'Walk' phase involve a deep dive into the specific use cases of internal customers on the platform, analyzing additional incurred costs, and identifying suitable business metrics for a Unit Cost analysis. This could include assessing further SaaS costs, licensing fees, support fees, and their integration with pure cloud costs.
Through this journey, we demonstrate that effective Kubernetes cost allocation is achievable, even in the absence of specialized tools. The assumed gain in precision from usage-proportional allocation needs to be weighed against the potentially higher operational effort.
By strategically applying the principles of Shared and Unit Costs, organizations can navigate the complexities of cloud-native environments, ensuring fair and transparent cost allocation while improving their financial management and their understanding of resource utilization.