Blog
LLMOps, a new aspect of platform engineering
Large language models have long since left the experimental playground. They are increasingly embedded in enterprise-grade software to power customer support flows and knowledge systems, act as developer tooling, and support mission-critical applications. This widespread adoption of LLMs brings new challenges around scalability, reliability, observability, cost efficiency, and security. To tackle these, a new flavor of operations is emerging: "large language model operations", or "LLMOps" for short. In this post, we explore what LLMOps is, what it comprises, and sketch a high-level architecture of an LLMOps platform.
How vCluster Solves The Multi-Tenancy Compliance Dilemma
Compliance frameworks like ISO 27001, SOC 2, and PCI-DSS demand strict isolation in multi-tenant infrastructure. But in Kubernetes, achieving that isolation has traditionally meant choosing between expensive cluster sprawl and audit-risky namespace separation. vCluster changes that equation entirely.
Beyond the Model: The Hardware Fundamentals That Define Your AI Strategy
How does model size translate into real hardware requirements?
In this post, we break down the fundamentals every tech professional should know: LLM sizes (what they are and what each size class is suited for), memory demand (how to estimate it quickly and reliably), hardware choices, and the VRAM bottleneck during inference.
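As a taste of the memory-estimation part, here is a minimal sketch using a common rule of thumb: weights take parameter count times bytes per parameter, plus extra headroom for KV cache and activations. The function name, the default fp16 precision, and the 20% overhead factor are illustrative assumptions, not the exact method from the post.

```python
def estimate_vram_gb(params_billions: float,
                     bytes_per_param: float = 2.0,  # fp16/bf16; use 1.0 for int8, 0.5 for 4-bit
                     overhead: float = 1.2) -> float:
    """Rough inference VRAM estimate in GB.

    weights = params (in billions) * bytes per parameter  -> GB
    overhead adds headroom for KV cache, activations, and framework buffers.
    """
    weights_gb = params_billions * bytes_per_param
    return weights_gb * overhead

# A 7B-parameter model in fp16: 7 * 2 GB of weights, ~20% overhead
print(f"{estimate_vram_gb(7):.1f} GB")
```

For quick capacity planning this is usually enough to tell whether a model fits a single consumer GPU or needs datacenter hardware; long context windows can push the KV cache well beyond the flat 20% assumed here.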
Dagger: CI/CD as Code and Agentic AI enabler
CI/CD pipelines are supposed to help developers ship better code, faster. In practice, they often do the opposite. Developers still need to write separate scripts to build and test applications locally. Environment configurations bloat pipelines with opaque, hard-to-reuse YAML. And as workflows expand beyond CI/CD to integrate agentic AI, traditional tools start to show their limits. Dagger was created to address exactly these problems.
One Prometheus to Rule Them All: Multi-Tenancy Kubernetes with Centralized Monitoring and vCluster Private Nodes
Discover how platform teams can implement centralized metrics for multi-tenant Kubernetes using vCluster. This article walks through observability patterns for both regular vClusters and private-node vClusters, showing how a centralized Prometheus and Grafana stack can serve many isolated tenant clusters, laying the foundation for scalable, production-ready multi-tenant observability.
Isolated GPU Nodes on Demand: Implementing vCluster Auto Nodes for AI Training on GKE
Learn how to provision isolated GPU nodes on demand for multi-tenant AI training on GKE. This tutorial implements vCluster Auto Nodes with Private Nodes, giving each tenant dedicated Compute Engine VMs that spin up automatically and terminate when workloads complete. Cost-efficient GPU isolation without managing separate clusters.