7 Platform Engineering Lessons from Kubermatic on Scaling Kubernetes

What began as late-night scripts and broken clusters has become the control plane of the internet. Kubernetes transformed chaos into order, and now the task is scaling that order across clouds, teams, and even AI workloads.

Kubernetes evolved from a side project in 2015 to the underpinning of modern application infrastructure, powering everything from scrappy start-ups to large global enterprises. Along the way, it has changed not just how we deploy software, but also how teams organize, collaborate, and think about technology.

Twain Taylor, editor at Software Plaza, sat down with Sebastian Scheele, CEO of Kubermatic, to discuss the early days of Kubernetes, key lessons from scaling platforms, and how organizations can simplify operations. In this article, we explore seven platform engineering lessons from Kubermatic on scaling Kubernetes.

1. Embrace curiosity and timing

Sebastian Scheele's story illustrates the combination of curiosity and timing. Starting in traditional enterprise software at SAP, he worked on ERP and HANA databases. When he noticed the rise of open source technologies like Node.js, Docker, and Kubernetes, he made a career leap that seemed risky at the time: he left his stable role to join the open infrastructure movement.

That curiosity led to the creation of Kubermatic. In 2016, Kubernetes was not yet the enterprise juggernaut it is today. Teams could barely spin up a cluster, let alone run one in production. Scheele and his co-founder saw an opportunity. They believed Kubernetes would become the new operating system of the cloud, and they wanted to ease the adoption curve for others.

The takeaway here is that platform engineering is not about the tool itself. It is about timing, a willingness to experiment, and a drive to make other people's lives easier.

2. Solve the early pain points first

In the early Kubernetes days, teams spent weeks just trying to get a cluster up and running. Managed services were rare: GKE was Google's only offering at the time, and most organizations either did not want to depend on it or could not afford it. Setting up networking, storage, and security was a manual puzzle, with scant guidance and even fewer best practices.

Kubermatic focused on solving these pain points first. Their tooling provisioned clusters consistently, removed repetitive scripting, and created reproducible environments. This foundational work let customers get clusters up and running without reinventing the wheel every time.

The broader lesson is simple. Before you can start solving for higher-order innovation, you need to take care of the basics. The platforms that minimize toil are the ones that teams will trust.

3. Prepare for day two from the start

Launching a cluster is just the beginning. The real challenge begins the next day, when your teams must keep the cluster running: patching vulnerabilities, managing workloads, and upgrading every few months.

As Kubernetes matured, the Day Two challenges became more pressing: never-ending networking headaches, the persistent storage problem, enforcing security policies on workloads, and release management. On top of that, every new Kubernetes release forced organizations to rethink how to stay current while keeping production up and running.

Kubermatic's approach was to bake Day Two operations into the platform. Automatic reconciliation ensures that failed nodes come back online. Policy enforcement keeps workloads compliant across projects. Built-in monitoring and backup complete the picture, reducing business risk in the event of downtime.
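The reconciliation idea behind "failed nodes come back online" can be sketched in a few lines of Python. This is a deliberately simplified toy, not Kubermatic's actual controller code: the in-memory node map and state strings are invented stand-ins for real cluster state that a controller would read from a cloud API.

```python
# Toy reconciliation loop: converge the actual node set toward the desired
# spec. All names and states here are illustrative assumptions.

def reconcile(desired_count: int, nodes: dict[str, str]) -> dict[str, str]:
    """Return a node map with failed nodes dropped and the count restored."""
    # Keep only nodes reported as healthy.
    healthy = {name: state for name, state in nodes.items() if state == "ready"}
    # Provision replacements until the desired count is met again.
    i = 0
    while len(healthy) < desired_count:
        name = f"node-{i}"
        if name not in healthy:
            healthy[name] = "ready"  # stand-in for "create VM, join cluster"
        i += 1
    return healthy

nodes = {"node-a": "ready", "node-b": "failed", "node-c": "ready"}
print(reconcile(3, nodes))  # node-b is gone, a fresh replacement joins
```

The essential property is that the loop is idempotent: run it again on its own output and nothing changes, which is what lets a platform repeat it forever without drift.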

For platform engineering, the lesson is this: do not design your platform around installation alone; design it around upgrades and long-term maintenance. Teams that skip this win on day one, only to watch routine tasks turn into battles by day two.

4. Use Kubernetes to manage Kubernetes clusters

Kubermatic's most audacious move was to use Kubernetes to manage other Kubernetes clusters. They built controllers and operators that provision, upgrade, and manage cluster lifecycles across multiple environments.

This concept may sound recursive, but it works. By using the Kubernetes API as a universal language, Kubermatic created a self-service experience for developers, who could request a cluster with a few simple inputs. Meanwhile, the platform handled the heavy lifting of VM provisioning, networking, and scaling.
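The "Kubernetes managing Kubernetes" pattern boils down to treating each requested cluster as a declarative object that a controller drives through provisioning phases. The sketch below is a hypothetical illustration, not Kubermatic's real API; the phase names and request shape are assumptions.

```python
# Toy operator: advance each requested cluster through provisioning phases,
# one reconcile pass at a time. Phase names are illustrative.

PHASES = ["provision-vms", "configure-network", "install-control-plane", "ready"]

def step(cluster: dict) -> dict:
    """One controller pass: move the cluster forward a single phase."""
    if "phase" not in cluster:
        return {**cluster, "phase": PHASES[0]}
    idx = PHASES.index(cluster["phase"])
    return {**cluster, "phase": PHASES[min(idx + 1, len(PHASES) - 1)]}

def run_to_ready(cluster: dict) -> dict:
    """Keep reconciling until the cluster reaches the 'ready' phase."""
    while cluster.get("phase") != "ready":
        cluster = step(cluster)
    return cluster

request = {"name": "team-a-dev", "version": "1.30"}  # the developer's whole input
print(run_to_ready(request)["phase"])  # ready
```

Note that the developer supplies only a name and a version; everything else is the controller's job, which is exactly the division of labor the article describes.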

The lesson here is that platform engineering is built by standing on the shoulders of giants. Instead of inventing new abstractions, reuse the powerful primitives that already exist in Kubernetes. By doubling down on the API model, you meet operators and developers where they already are and create a better experience for both.

5. Build an API-driven platform

At the heart of Kubermatic’s platform is an API-first design. Each resource, whether a cluster or a database, is exposed as a Custom Resource Definition (CRD). Developers interact with a few simple parameters, such as version or size, without worrying about the many hidden and fixed configuration parameters behind them.
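The CRD idea, exposing a handful of parameters while the platform owns the rest, can be illustrated with a small defaulting-and-validation step. The field names, sizes, and defaults below are invented for illustration and are not Kubermatic's actual schema.

```python
# Toy "CRD" handling: the user supplies only version and size; the platform
# validates the spec and merges in its own fixed settings. Values are made up.

ALLOWED_FIELDS = {"version", "size"}
SIZES = {"small": 3, "large": 9}  # hypothetical size -> node count mapping

PLATFORM_DEFAULTS = {
    "cni": "cilium",             # hidden, fixed by the platform team
    "pod_cidr": "10.244.0.0/16",
    "audit_logging": True,
}

def expand_spec(user_spec: dict) -> dict:
    """Validate the user-facing spec and merge platform-owned defaults."""
    unknown = set(user_spec) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if user_spec.get("size") not in SIZES:
        raise ValueError("size must be one of: " + ", ".join(SIZES))
    return {
        **PLATFORM_DEFAULTS,
        "version": user_spec.get("version", "1.30"),
        "nodes": SIZES[user_spec["size"]],
    }

print(expand_spec({"version": "1.29", "size": "small"}))
```

Rejecting unknown fields is the key design choice: it keeps the developer-facing contract small and leaves the platform team free to change everything behind it.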

This mirrors the approach taken by hyperscalers like AWS and Google Cloud. The API becomes the single interface for provisioning and managing resources while still supporting separation of concerns. The platform teams own the complex business logic, and the developers use a simplified service.

The lesson for platform engineering is that a strong API layer is not optional. It is the backbone of scale. APIs let you create a unified workflow, automate repetitive tasks, and empower others to innovate on your platform without waiting for central approval.

6. Support the next wave of workloads

Kubernetes has evolved beyond supporting only stateless web applications. With the rise of AI and machine learning, infrastructure requirements have hit new limits: AI workloads demand GPU scheduling, high-throughput networking, and high-volume storage, and they introduce operational complexity that traditional stateless applications never had to consider.

Scheele believes the future of Kubernetes lies in extending it into these new domains of computing, rather than creating yet another orchestrator just for AI. Kubermatic's Kubernetes management solution accordingly provides GPU support, storage integrations, and networking within the platform.

The lesson for platform engineering is forward-looking. Anticipate what the next generation of workloads will need and build extensibility into your platform. That way, you can foster innovation without starting from scratch.

7. Focus on abstraction and self-service

The key takeaway from Kubermatic is perhaps abstraction. Developers should not have to learn every networking policy or storage detail in order to create business value. All they need is a secure, standardized environment where they can deploy code quickly and safely.

Kubermatic delivers that through a self-service console and API. Teams can create clusters, enforce policies, manage upgrades, and back up data in a couple of clicks. Multi-tenancy ensures isolation, while centralized management scales to hundreds or even thousands of clusters.
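At its core, multi-tenant isolation of this kind means scoping every API call to the caller's tenant. A minimal sketch, with an invented in-memory store standing in for a real backend:

```python
# Toy multi-tenant cluster store: every read is scoped to the caller's
# tenant. The store layout and tenant names are invented for illustration.

store = {
    ("team-a", "dev"): {"version": "1.30"},
    ("team-a", "prod"): {"version": "1.29"},
    ("team-b", "dev"): {"version": "1.30"},
}

def list_clusters(tenant: str) -> list[str]:
    """Return only the clusters owned by this tenant: isolation by scoping."""
    return sorted(name for (owner, name) in store if owner == tenant)

print(list_clusters("team-a"))  # ['dev', 'prod']
print(list_clusters("team-b"))  # ['dev']
```

Because the tenant filter sits inside the API itself rather than in each client, no caller can ever enumerate another team's clusters, no matter how the console or CLI is used.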

This aligns with the broader philosophy of platform engineering. The aim is to give developers the right level of abstraction, hiding unnecessary complexity yet allowing the expert to dive deeper if they need to. This balance reduces friction, speeds delivery, and limits lock-in.

The road ahead

Kubernetes has risen from a chaotic experiment to the foundation of modern platforms, but the work is not done. Serverless technologies offer an even further layer of abstraction, yet many teams eventually outgrow serverless and return to Kubernetes for its flexibility. Meanwhile, AI workloads are demanding new levels of performance and automation.

Kubermatic believes that platforms must evolve alongside these changes. By embracing open source, building API-centric systems, and establishing self-service at scale, they help organizations untangle Kubernetes complexity while preparing for whatever comes next.

Kubernetes is no longer just a container orchestrator. It is the framework of modern infrastructure, and a sound platform philosophy carries K8s and its workloads into the future.

This blog is based on a podcast with Twain Taylor, editor at Software Plaza, and Sebastian Scheele, CEO of Kubermatic. They discussed Kubernetes' early days, lessons from building platforms at scale, and how organizations can simplify operations. Watch the complete podcast here to dive deeper into their conversation.
