Kubernetes tutorial for dummies

Back in 2019, as a non-native English speaker, I had a bad time trying to understand what the hell Kubernetes is. Here's how the official docs describe it:

Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.

The name Kubernetes originates from Greek, meaning helmsman or pilot. K8s as an abbreviation results from counting the eight letters between the "K" and the "s". Google open-sourced the Kubernetes project in 2014. Kubernetes combines over 15 years of Google's experience running production workloads at scale with best-of-breed ideas and practices from the community.

Sorry, all I understood is that it's Greek and made by Google.

I will break it down for you, so you don't have to spend the same amount of time I did to get started.

Be prepared: I'm going to encourage you to run the commands on your own.

🐳 (Dummy) Intro to Docker containers

You need to understand containers first. Kubernetes will be tough to grasp if you don't get them. In case you already do, please, have a cookie 🍪 and skip to the next section.

Containers are like small operating systems with all the needed executables. (Keep going, you're about to see examples.)

How small? It can be as small as Alpine Linux with ~3 MB, or a bit bigger like Ubuntu, which has ~80 MB.

Let's make executables simpler to digest first. The browser/app you're using to read this article is an executable, the ls command you run to list directories in the terminal is an executable, the python command itself is another executable. Got it? Executables are any program you use for any purpose.

Take some more examples: we can have containers running Python, PostgreSQL, or even Go without installing them on your machine.

Great. If you understand that running containers (you can run multiple) is somewhat like executing different programs in different virtual machines, that's enough for you to move on.

🖼️ What are container images?

Oh yeah, I need to explain it a bit as well.

Think about images as different versions of base containers. Consider that when you download Ubuntu Desktop from their official website, you are prompted to download an ISO (image), and you can select several versions (20.04, 21.04).

You pick the version that pleases you, and you install it. If you install the same image on many computers, all of them start in the same state. By "the same state" I mean the same animal wallpaper, the same set of programs, Firefox as your default browser, GNOME as the GUI, etc.

This initial state is defined by the image.

What each individual user will do on the computer will make them different. One will install Chrome, the other one will change the wallpaper, etc. They started the same, but are different now in settings/executables. So containers change, images don't.

For the sake of this example:

The ISO you download is the Docker image. The computer you installed the ISO on is the container. The version (e.g. 20.04) is known as the Docker tag.
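If you prefer code to analogies, here's a tiny Python sketch of the same idea. It is only a toy model: the `Image` and `Container` classes and the program names are made up for illustration and have nothing to do with how Docker is really implemented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Read-only starting state, like the ISO in the analogy."""
    name: str
    tag: str
    programs: frozenset


@dataclass
class Container:
    """A running instance that starts from an image and may change later."""
    image: Image
    programs: set = field(init=False)

    def __post_init__(self):
        # Every container begins with its own copy of the image's state.
        self.programs = set(self.image.programs)


ubuntu = Image(name="ubuntu", tag="20.04", programs=frozenset({"bash", "ls"}))

first = Container(ubuntu)
second = Container(ubuntu)
first.programs.add("chrome")  # one "user" installs something

print(sorted(first.programs))   # ['bash', 'chrome', 'ls']
print(sorted(second.programs))  # ['bash', 'ls']
print(sorted(ubuntu.programs))  # ['bash', 'ls']
```

Both containers started identical, one of them changed, and the image stayed exactly as it was.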

I hope this rough explanation made you understand enough to move on. Of course, it's not the best explanation ever, so I wonder... Would you be interested in a dedicated post about Docker? If yes, please, get in touch and let me know.

🎻 How does Kubernetes relate to Containers?

Ok, now let's start talking Greek: Kubernetes.

Kubernetes is just a tool that runs and manages your containers, and that's mostly it. I personally enjoy the term container orchestrator, it's fancy, and makes you sound smart.

[Image: me telling my friends I know Kubernetes]

By orchestra I imagine something like this:

Photo by Andrea Zanenga on Unsplash

Pretty, right? Notice how each person has a role in this picture. Each musician plays only their own instrument. Maybe the maestro doesn't even know how to play the violin, but he knows how many violins the song requires. He might ask to increase the number of cellos or to change the tone of the vocals.

Whatever, I know shit about music. Picture this now:

Photo by... Me I guess?

If you understand this, the next part will get easier and easier. We're just going to discuss "HOW" the maestro (aka Kubernetes) coordinates with the musicians (aka containers).

Oh, and we're done with the music comparisons. Let's get more realistic now.

⛵ Kubernetes objects and `kubectl`

kubectl is the magic command that allows you to interact with a Kubernetes cluster.
Kubernetes has several ways of coordinating containers. They're called "objects", and such objects can be represented in yaml.

I'm about to share with you the most common objects I had to use to deploy systems in production. Bear in mind there are even more types than the ones you're about to see.

🦴 Kubernetes `yaml` anatomy

I know you want to see some action, but before getting into the Kubernetes objects themselves, let's just understand how we can define them.

✨ Benefits

In case you aren't familiar with yaml files, I can tell you they're great because they allow you to:

Version them with your favorite VCS (which I know is git)

You can roll back to a previous version, and you can easily check the latest modifications.

Review modifications before applying them to the production cluster

This is a killer feature. Your cluster management is now (somewhat) auditable! Changes and authors are documented, and you can rely on Pull Requests to ensure infrastructure modifications are always reviewed before getting into production.

💠 Format

Most of the objects match the following format:

kind: Pod  # What kind of object it is

apiVersion: v1  # version of the object schema

metadata:  # Data bound to the Kubernetes object
  name: object-name  # an identifier to the Kubernetes object
  labels:
    key: value

spec: {}  # Definition of the object parameters
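If yaml looks alien to you, it may help to see that it's just nested key/value data. Here's a rough Python equivalent of the same skeleton, purely as an illustration (nothing here talks to a cluster, and the values are the same placeholders used above):

```python
import json

# The same four top-level keys as the yaml skeleton, as a plain dict.
manifest = {
    "kind": "Pod",
    "apiVersion": "v1",
    "metadata": {
        "name": "object-name",
        "labels": {"key": "value"},
    },
    "spec": {},
}

# Printing it as JSON shows the same structure in yet another notation.
print(json.dumps(manifest, indent=2))
print(list(manifest))  # ['kind', 'apiVersion', 'metadata', 'spec']
```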

kind

It represents what object you're creating. Kubernetes has many. In this article you're going to learn about: Pod, Deployment, and Service.

Don't worry about what they represent right now, keep the focus on the yaml structure!

apiVersion

This field just tells the cluster what schema version you're using. You can find out the latest apiVersion by running kubectl explain <object-kind>.

For our case, we can run kubectl explain pod and get something like:

KIND:     Pod
VERSION:  v1   <==  API version

DESCRIPTION:
     Pod is a collection of containers that can run on a host. This resource is
     created by clients and scheduled onto hosts.

FIELDS:
   apiVersion   <string>
     APIVersion defines the versioned schema of this representation of an
     object. Servers should convert recognized schemas to the latest internal
     value, and may reject unrecognized values. More info:
     https://git.k8s.io/community/contributors/devel/api-conventions.md#resources

Based on this output, we know that v1 is the latest apiVersion for pod.

metadata

metadata:
  name: object-name
  labels:
    key: value

Metadata contains some data that is bound to the Kubernetes object.

It might sound simple, but it's quite powerful. The name is required to identify that object, and labels can be used to handle complex filters! For example, you can categorize several objects as layer: api and decide to restart or scale all of them based on that label.
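To make the "filter by label" idea concrete, here's a minimal Python sketch. The object names and the `matches` helper are hypothetical, and this only models simple equality matching. On a real cluster, I believe the equivalent is something like kubectl get pods -l layer=api.

```python
def matches(labels, selector):
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical objects, each carrying its own labels.
objects = [
    {"name": "users-api", "labels": {"layer": "api", "app": "users"}},
    {"name": "orders-api", "labels": {"layer": "api", "app": "orders"}},
    {"name": "ghost", "labels": {"layer": "cms", "app": "ghost"}},
]

api_objects = [
    obj["name"] for obj in objects if matches(obj["labels"], {"layer": "api"})
]
print(api_objects)  # ['users-api', 'orders-api']
```

Extra labels on an object don't hurt: only the keys mentioned in the selector are compared.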

Real examples coming soon!

📝 Kubernetes Summary so far

Ok, we're done with the boring theory part. So far you should understand:

Kubernetes orchestrates (or manages) containers;
You can define how orchestration (or management) works as Kubernetes objects;
You interact with the Kubernetes cluster through a tool named kubectl;
Objects can be defined through yaml files;

You know what it's time for?

For me, it's coffee ☕, but for you, it's to get more acquainted with what kind of objects you might create with Kubernetes.

We will deploy (locally) instances of the Ghost CMS (which powers this blog).

🧑‍💻🚀 I want to code!

You might be like me. Reading is just not enough, and you want to run some code yourself (great choice by the way).

Be my guest! You need to have Docker installed on your machine, and you're also going to need minikube. Minikube allows you to run a local Kubernetes cluster on your computer.

Make sure to run minikube start. If it works, you're ready to go.

👾 Pods

(Here's another fancy word for your new hipster DevOps vocab!)

Pods are the smallest unit you can manage, and they can run one or more containers.

Such pods can be scaled up or down ↕️ as you desire, for example, you can run 2 pods with one ghost container each (resulting in 2x ghost containers). Actually, let's do it together.

[Image: Kubernetes, pods, and containers]

Consider the following yaml:

kind: Pod

apiVersion: v1

metadata:
  name: ghost-pod-object  # <-- Verbose for demo purposes
  labels:
    app: ghost
    layer: cms
    blog: guilatrova.dev    # <-- Adding any arbitrary label I want
    article: kubernetes-simple
    twitter: guilatrova

spec:  # This section's details are new and specific to pods!
  containers:  # Note the plural: "containerS"
    - name: ghost
      image: ghost  # Taken from Docker Hub https://hub.docker.com/_/ghost
      ports:
        - containerPort: 2368  # Default Ghost port, note it's a list of ports
      env:  # Define env variables
        - name: url
          value: http://localhost:2368

Save it as ghost-pod.yaml, then apply it by running:

❯ kubectl apply -f ghost-pod.yaml
pod/ghost-pod-object created

Now I can see what I created with the get and describe commands:

❯ kubectl get pod
NAME               READY   STATUS    RESTARTS   AGE
ghost-pod-object   1/1     Running   0          5m21s

Coooool, right? You just created a pod and it's running a Ghost container (maybe you should start your blog!).

Note that the verbose name we set is what we use to refer to that object.

❯ kubectl describe pod ghost-pod-object
Name:         ghost-pod-object
Namespace:    default
Priority:     0
Node:         minikube/192.168.49.2
Start Time:   Tue, 10 Aug 2021 06:02:49 -0300
Labels:       app=ghost
              article=kubernetes-simple
              blog=guilatrova.dev
              layer=cms
              twitter=guilatrova
...

🔄 Deployments

Did you enjoy defining and running your pods? Do you feel like a Kubernetes expert already? Great, because now it's time to crush all your dreams.

I never had to create a single pod in production, and I don't think you should either.

Don't worry, there will still be pods. We're just not creating them ourselves anymore, we're going to delegate it to the Deployment object.

Deployments are responsible for managing, creating, updating, raising, feeding, educating, putting to sleep, and anything else moms would do, but for pods.

[Image: Kubernetes, deployments, pods, and containers]

So, as you can see, the "pod" concept is valuable! You still need to understand what containers and pods are. Stick with me!

I'll delete my orphan pod from past examples by running:

❯ kubectl delete -f ghost-pod.yaml
pod "ghost-pod-object" deleted

Let's do better now, let's use deployments to manage pods for us!

You're about to see that labels are not just cool tags, they matter! Take extra time to analyze the yaml and understand how each label (1) and (2) relates to other values, I'll do my best to comment on it for you!

kind: Deployment

apiVersion: apps/v1

metadata:
  name: ghost-deployment
  labels:
    app: ghost  # This is repeated for consistency, and has no relation to (1)

spec:
  replicas: 2  # How many pods do you want?

  selector:  # How can I find the pods I should manage?
    matchLabels:
      app: ghost  # (1) This value is used to tell Deployment "which pods are mine"
      blog: guilatrova.dev  # (2) Another key:value, same behavior as (1) above

  template:  # How should I create the pods?
    metadata:  # Metadata for the POD
      labels:
        app: ghost # (1) Created pod matches selector
        blog: guilatrova.dev # (2) Created pod matches selector
        layer: cms
        article: kubernetes-simple
        twitter: guilatrova

    spec:  # The pod definition, reused from our previous example
      containers:
        - name: ghost
          image: ghost
          ports:
            - containerPort: 2368
          env:
            - name: url
              value: http://localhost:2368
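One detail worth checking before you apply: as far as I know, Kubernetes rejects a Deployment whose selector doesn't match the labels in its own pod template. Here's a small Python sketch of that rule. The dict is a trimmed copy of the yaml above, and `selector_matches_template` is a name I made up, not a Kubernetes API.

```python
import copy

# Trimmed-down version of the Deployment above: only the parts that matter here.
deployment = {
    "spec": {
        "selector": {"matchLabels": {"app": "ghost", "blog": "guilatrova.dev"}},
        "template": {
            "metadata": {
                "labels": {"app": "ghost", "blog": "guilatrova.dev", "layer": "cms"}
            }
        },
    }
}


def selector_matches_template(dep):
    """Every selector label must also be set, with the same value, on the pod template."""
    selector = dep["spec"]["selector"]["matchLabels"]
    pod_labels = dep["spec"]["template"]["metadata"]["labels"]
    # dict item views behave like sets, so "<=" means "is a subset of".
    return selector.items() <= pod_labels.items()


# Same deployment, but the selector now asks for a label value the pods won't have.
broken = copy.deepcopy(deployment)
broken["spec"]["selector"]["matchLabels"]["blog"] = "another.blog"

print(selector_matches_template(deployment))  # True
print(selector_matches_template(broken))      # False
```

The template may carry extra labels (like layer: cms), but it can't be missing any label the selector asks for.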

What are you waiting for? Save it as ghost-deployment.yaml and apply it!

❯ kubectl apply -f ghost-deployment.yaml
deployment.apps/ghost-deployment created

Let's observe the deployments!

❯ kubectl get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
ghost-deployment   2/2     2            2           20s

That's it! As we defined replicas: 2, we can see the 2/2. Well, why don't we check the pods again?

❯ kubectl get pod
NAME                                READY   STATUS    RESTARTS   AGE
ghost-deployment-5c69cf47bf-jz7jf   1/1     Running   0          29s
ghost-deployment-5c69cf47bf-n4g9w   1/1     Running   0          29s

Yes!! I knew it! Inside of you, something was telling you that spending all that time reading about pod stuff wouldn't be wasted. You're right!

Well, let's scale it! Make it 4 by changing replicas to 4 in your yaml and applying it again.

❯ kubectl apply -f ghost-deployment.yaml
deployment.apps/ghost-deployment configured

❯ kubectl get deploy
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
ghost-deployment   2/4     4            2           2m32s

❯ kubectl get deploy
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
ghost-deployment   4/4     4            4           2m44s

❯ kubectl get pod
NAME                                READY   STATUS    RESTARTS   AGE
ghost-deployment-5c69cf47bf-4kzcc   1/1     Running   0          32s
ghost-deployment-5c69cf47bf-g82v9   1/1     Running   0          32s
ghost-deployment-5c69cf47bf-jz7jf   1/1     Running   0          3m2s
ghost-deployment-5c69cf47bf-n4g9w   1/1     Running   0          3m2s

How easy was that? I want to show yet another trick!

You can also use kubectl commands right away without any yaml, try this:

❯ kubectl scale deploy/ghost-deployment --replicas=1
deployment.apps/ghost-deployment scaled

Now it's your turn, check if it worked! 😄
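Whether you edit the yaml or use kubectl scale, the idea is the same: you state how many pods you want, and the Deployment adds or removes pods until reality matches. Here's a toy Python sketch of that loop. The `reconcile` function and the pod names are invented for illustration, and it doesn't model which pod a real cluster would choose to delete.

```python
import itertools


def reconcile(desired_replicas, running_pods, new_names):
    """Return the pod list after creating or deleting pods to reach the desired count."""
    pods = list(running_pods)
    while len(pods) < desired_replicas:
        pods.append(next(new_names))  # too few pods: create one
    while len(pods) > desired_replicas:
        pods.pop()  # too many pods: delete one
    return pods


# Endless supply of made-up pod names: ghost-1, ghost-2, ...
names = (f"ghost-{n}" for n in itertools.count(1))

pods = reconcile(2, [], names)
print(pods)  # ['ghost-1', 'ghost-2']

pods = reconcile(4, pods, names)
print(pods)  # ['ghost-1', 'ghost-2', 'ghost-3', 'ghost-4']

pods = reconcile(1, pods, names)
print(pods)  # ['ghost-1']
```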

🛑 Again: One pod may contain many containers

Although a pod can have all the containers you might need, the Kubernetes docs describe "one container per pod" as the most common setup.

For example, you notice an increased amount of traffic to your new blog - YOU'RE A HIT! You want to increase the number of pods to better serve all your readers. If we had added a Postgres container to the same pod, it would be harder to manage the scaling.

What if you just want to scale the ghost container and not the database?

[Image: Deployment scaling]

So, respect it as a rule of thumb: "One container per pod".
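The arithmetic behind that rule of thumb fits in a few lines of Python. The replica counts are made up, and `running_containers` is a hypothetical helper that just counts containers:

```python
from collections import Counter


def running_containers(deployments):
    """deployments: list of (replicas, [container names inside each pod])."""
    total = Counter()
    for replicas, containers in deployments:
        for name in containers:
            total[name] += replicas  # every pod replica runs every container in it
    return dict(total)


# Ghost and Postgres in the same pod, scaled to 4 pods.
together = running_containers([(4, ["ghost", "postgres"])])

# One container per pod: scale Ghost to 4 and leave Postgres at 1.
separate = running_containers([(4, ["ghost"]), (1, ["postgres"])])

print(together)  # {'ghost': 4, 'postgres': 4}
print(separate)  # {'ghost': 4, 'postgres': 1}
```

With both containers in one pod, you can't get more Ghost without also getting more databases.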

🔌 Services

I don't know if you realized it yet, but we didn't even try to access the Ghost pod we got running.

Even if you try to connect to the specified url, it won't work. That's because deployments are responsible for pod management, not exposure.

Please, meet the service object. Services are responsible for exposing your pods.

Here is the yaml for you. Save it, then apply it with kubectl apply -f, just like you did for the deployment:

kind: Service

apiVersion: v1

metadata:
  name: ghost-service

spec:
  selector:  # How can I find the pods I should point to?
    app: ghost
    # We can add more labels here, like we did for deployment

  ports:
    - protocol: TCP
      port: 2368  # Incoming port
      targetPort: 2368  # Container port

At this point, you should be already familiar with kind, apiVersion, and metadata.

Inside spec we can find a selector doing the same as deployment: it specifies which pods it should care about based on the labels set for each pod.

For ports, as you can imagine, it means: whatever comes in on port: 2368 will be forwarded to a matching pod on targetPort: 2368. I kept them the same for simplicity.

The port attribute could be different, but the targetPort needs to match the containerPort defined in our pods.
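Here's a rough Python sketch of how selector, port, and targetPort fit together. I deliberately set port to 80 (my own assumption, not what the yaml above uses) to show that it may differ from targetPort. The pod names and the `backends` helper are invented, and a real Service picks one backend per connection instead of listing them all.

```python
def backends(service, pods):
    """List 'pod:port' targets for every pod whose labels satisfy the service selector."""
    selector = service["selector"]
    return [
        f'{pod["name"]}:{service["targetPort"]}'
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Clients would talk to port 80; traffic lands on the container's port 2368.
service = {"selector": {"app": "ghost"}, "port": 80, "targetPort": 2368}

pods = [
    {"name": "ghost-a", "labels": {"app": "ghost", "layer": "cms"}},
    {"name": "ghost-b", "labels": {"app": "ghost", "layer": "cms"}},
    {"name": "postgres-a", "labels": {"app": "postgres"}},
]

print(backends(service, pods))  # ['ghost-a:2368', 'ghost-b:2368']
```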

[Image: Kubernetes, service, deployments, pods, and containers]

Well, let's check kubectl get service:

❯ kubectl get service
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
ghost-service   ClusterIP   10.110.225.118   <none>        2368/TCP   63m
kubernetes      ClusterIP   10.96.0.1        <none>        443/TCP    2d

You should notice there's a default service named kubernetes running, and your ghost-service as well!

In order to access it locally, you need to forward a localhost port. Try running minikube service ghost-service:

❯ minikube service ghost-service
|-----------|---------------|-------------|--------------|
| NAMESPACE |     NAME      | TARGET PORT |     URL      |
|-----------|---------------|-------------|--------------|
| default   | ghost-service |             | No node port |
|-----------|---------------|-------------|--------------|
😿  service default/ghost-service has no node port
🏃  Starting tunnel for service ghost-service.
|-----------|---------------|-------------|------------------------|
| NAMESPACE |     NAME      | TARGET PORT |          URL           |
|-----------|---------------|-------------|------------------------|
| default   | ghost-service |             | http://127.0.0.1:50741 |
|-----------|---------------|-------------|------------------------|
🎉  Opening service default/ghost-service in default browser...
❗  Because you are using a Docker driver on darwin, the terminal needs to be open to run it.

It should open a browser tab with a new Ghost instance.

And that's it, from your browser you accessed Kubernetes pods created by a deployment through a service!

There are a lot of other objects and tools you might want to learn, like ConfigMap, Jobs, Kustomize, etc. I might write about them someday, so subscribe to the blog newsletter and you won't miss out! Follow me for more bite-sized tips about programming, books, and more.
