πŸ“¦ Topic 352: Container Virtualization


🧠 352.1 Container Virtualization Concepts

\virtualization-container

Weight: 7

Description: Candidates should understand the concept of container virtualization. This includes understanding the Linux components used to implement container virtualization as well as using standard Linux tools to troubleshoot these components.

Key Knowledge Areas:

  • Understand the concepts of system and application containers

  • Understand and analyze kernel namespaces

  • Understand and analyze control groups

  • Understand and analyze capabilities

  • Understand the role of seccomp, SELinux and AppArmor for container virtualization

  • Understand how LXC and Docker leverage namespaces, cgroups, capabilities, seccomp and MAC

  • Understand the principle of runc

  • Understand the principle of CRI-O and containerd

  • Awareness of the OCI runtime and image specifications

  • Awareness of the Kubernetes Container Runtime Interface (CRI)

  • Awareness of podman, buildah and skopeo

  • Awareness of other container virtualization approaches in Linux and other free operating systems, such as rkt, OpenVZ, systemd-nspawn or BSD Jails


πŸ“‹ 352.1 Cited Objects


🧠 Understanding Containers

\container

Containers are a lightweight virtualization technology that packages applications along with their required dependencies β€” code, libraries, environment variables, and configuration files β€” into isolated, portable, and reproducible units.

In simple terms: a container is a self-contained box that runs your application the same way, anywhere.

πŸ’‘ What Is a Container?

Unlike virtual machines (VMs), containers do not virtualize hardware. Instead, they virtualize the operating system. Containers share the Linux kernel with the host, but each one operates in a fully isolated user space.

πŸ“Œ Containers vs Virtual Machines:

| Feature | Containers | Virtual Machines |
|---|---|---|
| OS kernel | Shared with host | Each VM has its own OS |
| Startup time | Fast (seconds or less) | Slow (minutes) |
| Image size | Lightweight (MBs) | Heavy (GBs) |
| Resource efficiency | High | Lower |
| Isolation mechanism | Kernel features (namespaces) | Hypervisor |

πŸ”‘ Key Characteristics of Containers

πŸ”Ή Lightweight: Share the host OS kernel, reducing overhead and enabling fast startup.

πŸ”Ή Portable: Run consistently across different environments (dev, staging, prod, cloud, on-prem).

πŸ”Ή Isolated: Use namespaces for process, network, and filesystem isolation.

πŸ”Ή Efficient: Enable higher density and better resource utilization than traditional VMs.

πŸ”Ή Scalable: Perfect fit for microservices and cloud-native architecture.

🧱 Types of Containers

  1. System containers

    • Designed to run an entire OS; they resemble virtual machines.

    • Support multiple processes and system services (init, syslog).

    • Ideal for legacy or monolithic applications.

    • Example: LXC, libvirt-lxc.

  2. Application containers

    • Designed to run a single process.

    • Stateless, ephemeral, and horizontally scalable.

    • Used widely in modern DevOps and Kubernetes environments.

    • Example: Docker, containerd, CRI-O.

πŸš€ Popular Container Runtimes

| Runtime | Description |
|---|---|
| Docker | Most widely adopted CLI/daemon for building and running containers. |
| containerd | Lightweight runtime powering Docker and Kubernetes. |
| CRI-O | Kubernetes-native runtime for OCI containers. |
| LXC | Traditional Linux system containers, closer to a full OS. |
| rkt | Security-focused runtime (deprecated). |

πŸ” container Internals and Security Elements

Component
Role

Namespaces

Isolate processes, users, mounts, networks.

cgroups

Control and limit resource usage (CPU, memory, IO).

Capabilities

Fine-grained privilege control inside containers.

seccomp

Restricts allowed syscalls to reduce attack surface.

AppArmor / SELinux

Mandatory Access Control enforcement at kernel level.


🧠 Understanding chroot - Change Root Directory in Unix/Linux

\chroot

What is chroot?

chroot (short for change root) is a system call and command on Unix-like operating systems that changes the apparent root directory (/) for the current running process and its children. This creates an isolated environment, commonly referred to as a chroot jail.

🧱 Purpose and Use Cases

  • πŸ”’ Isolate applications for security (jailing).

  • πŸ§ͺ Create testing environments without impacting the rest of the system.

  • πŸ› οΈ System recovery (e.g., boot into LiveCD and chroot into installed system).

  • πŸ“¦ Building software packages in a controlled environment.

πŸ“ Minimum Required Structure

The chroot environment must have its own essential files and structure:

Use ldd to identify required libraries:
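A minimal sketch, assuming a jail at /srv/jail with bash as its only program (library paths vary by distribution):

```bash
# Create the jail skeleton and copy the binary
mkdir -p /srv/jail/bin
cp /bin/bash /srv/jail/bin/

# ldd lists the shared libraries the binary needs
ldd /bin/bash

# Copy each library (and the dynamic loader) into the jail, preserving paths
for lib in $(ldd /bin/bash | awk '/\//{print $(NF-1)}'); do
  mkdir -p "/srv/jail$(dirname "$lib")"
  cp "$lib" "/srv/jail$lib"
done

# Enter the jail
sudo chroot /srv/jail /bin/bash
```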

🚨 Limitations and Security Considerations

  • chroot is not a security boundary like containers or VMs.

  • A privileged user (root) inside the jail can potentially break out.

  • No isolation of process namespaces, devices, or kernel-level resources.

For stronger isolation, consider alternatives like:

  • Linux containers (LXC, Docker)

  • Virtual machines (KVM, QEMU)

  • Kernel namespaces and cgroups

πŸ§ͺ Test chroot with debootstrap
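A minimal sketch on a Debian/Ubuntu host (suite and target directory are illustrative):

```bash
sudo apt install debootstrap
# Bootstrap a minimal Debian system into a directory
sudo debootstrap stable /srv/debian-root http://deb.debian.org/debian
# Enter it with chroot
sudo chroot /srv/debian-root /bin/bash
cat /etc/os-release   # inside, the Debian tree looks like the whole system
```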

πŸ§ͺ Lab chroot

Use this script for the lab: chroot.sh


🧠 Understanding Linux Namespaces

\linux-namespaces

Namespaces are a core Linux kernel feature that enable process-level isolation. They create separate "views" of global system resources β€” such as process IDs, networking, filesystems, and users β€” so that each process group believes it is running in its own system.

In simple terms: namespaces trick a process into thinking it owns the machine, even though it's just sharing it.

This is the foundation for container isolation.

πŸ” What Do Namespaces Isolate?

Each namespace type isolates a specific system resource. Together, they make up the sandbox that a container operates in:

| Namespace | Isolates... | Real-world example |
|---|---|---|
| PID | Process IDs | Processes inside a container see a different PID space |
| Mount | Filesystem mount points | Each container sees its own root filesystem |
| Network | Network stack | Containers have isolated IPs, interfaces, and routes |
| UTS | Hostname and domain name | Each container sets its own hostname |
| IPC | Shared memory and semaphores | Prevents inter-process communication between containers |
| User | User and group IDs | Enables fake root (UID 0) inside the container |
| Cgroup (v2) | Control group membership | Ties into resource controls like CPU and memory limits |

πŸ§ͺ Visual Analogy

\linux-namespaces

Imagine a shared office building:

  • All tenants share the same foundation (Linux kernel).

  • Each company has its own office (namespace): different locks, furniture, phone lines, and company name.

  • To each tenant, it feels like their own building.

That's exactly how containers experience the system β€” isolated, yet efficient.

πŸ”§ How Containers Use Namespaces

When you run a container (e.g., with Docker or Podman), the runtime creates a new set of namespaces:
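As an illustration, a plain docker run is enough to see this (assuming Docker and the alpine image are available):

```bash
docker run --rm -it alpine sh
# inside the container:
ps aux      # the shell shows up as PID 1 (new PID namespace)
ip addr     # the container's own interfaces (new network namespace)
ls /        # a container-specific root filesystem (new mount namespace)
```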

This command gives the process:

  • A new PID namespace β†’ it's process 1 inside the container.

  • A new network namespace β†’ its own virtual Ethernet.

  • A mount namespace β†’ a container-specific root filesystem.

  • Other namespaces depending on configuration (user, IPC, etc.)

The result: a lightweight, isolated runtime environment that behaves like a separate system.

βš™οΈ Complementary Kernel Features

Namespaces hide resources from containers. But to control how much they can use and what they can do, we need additional mechanisms:

πŸ”© Cgroups (Control Groups)

Cgroups allow the kernel to limit, prioritize, and monitor resource usage across process groups.

| Resource | Use case examples |
|---|---|
| CPU | Limit CPU time per container |
| Memory | Cap RAM usage |
| Disk I/O | Throttle read/write operations |
| Network (v2) | Bandwidth restrictions |

πŸ›‘οΈ Prevents the "noisy neighbor" problem by stopping one container from consuming all system resources.

🧱 Capabilities

Traditional Linux uses a binary privilege model: root (UID 0) can do everything, everyone else is limited. Capabilities split that all-or-nothing root privilege into discrete units that can be granted or dropped individually:

| Capability | Allows... |
|---|---|
| CAP_NET_BIND_SERVICE | Binding to privileged ports (e.g., 80, 443) |
| CAP_SYS_ADMIN | A powerful catch-all for system admin tasks |
| CAP_KILL | Sending signals to arbitrary processes |

By dropping unnecessary capabilities, containers can run with only what they need β€” reducing risk.

πŸ” Security Mechanisms

Used in conjunction with namespaces and cgroups to lock down what a containerized process can do:

| Feature | Description |
|---|---|
| seccomp | Whitelist or block Linux system calls (syscalls) |
| AppArmor | Apply per-application security profiles |
| SELinux | Enforce Mandatory Access Control with tight system policies |

🧠 Summary for Beginners

βœ… Namespaces isolate what a container can see
βœ… Cgroups control what it can use
βœ… Capabilities and security modules define what it can do

Together, these kernel features form the technical backbone of container isolation β€” enabling high-density, secure, and efficient application deployment without full VMs.

πŸ§ͺ Lab Namespaces

Use this script for the lab: namespace.sh


🧩 Understanding Cgroups (Control Groups)

\cgroups

πŸ“Œ Definition

Control Groups (cgroups) are a Linux kernel feature, introduced in 2007, that lets you limit, account for, and isolate the resource usage (CPU, memory, disk I/O, etc.) of groups of processes.

cgroups are heavily used by low-level container runtimes such as runc and crun, and leveraged by container engines like Docker, Podman, and LXC to enforce resource boundaries and provide isolation between containers.

Namespaces isolate, cgroups control.

Namespaces create separate environments for processes (like PID, network, or mounts), while cgroups limit and monitor resource usage (CPU, memory, I/O) for those processes.

βš™οΈ Key Capabilities

| Feature | Description |
|---|---|
| Resource limiting | Impose limits on how much of a resource a group can use |
| Prioritization | Allocate more CPU/I/O priority to some groups over others |
| Accounting | Track usage of resources per group |
| Control | Suspend, resume, or kill processes in bulk |
| Isolation | Prevent resource starvation between groups |

πŸ“¦ Subsystems (Controllers)

cgroups operate through controllers, each responsible for managing one type of resource:

| Subsystem | Description |
|---|---|
| cpu | Controls CPU scheduling |
| cpuacct | Generates CPU usage reports |
| memory | Limits and accounts memory usage |
| blkio | Limits block device I/O |
| devices | Controls access to devices |
| freezer | Suspends/resumes execution of tasks |
| net_cls | Tags packets for traffic shaping |
| ns | Manages namespace access (rare) |

πŸ“‚ Filesystem Layout

cgroups are exposed through the virtual filesystem under /sys/fs/cgroup.

Depending on the version:

  • cgroups v1: separate hierarchies for each controller (e.g., memory, cpu, etc.)

  • cgroups v2: unified hierarchy under a single mount point

Mounted under /sys/fs/cgroup.

Typical cgroups v1 hierarchy: one subtree per controller (e.g., /sys/fs/cgroup/memory, /sys/fs/cgroup/cpu).

In cgroups v2, all resources are managed under a single unified tree (e.g., /sys/fs/cgroup/<group>/).

πŸ§ͺ Common Usage (v1 and v2 examples)

v1 – Create and assign memory limit:
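A minimal sketch, assuming a v1 memory controller mounted at /sys/fs/cgroup/memory:

```bash
mkdir /sys/fs/cgroup/memory/demo
# Cap the group at 64 MB
echo $((64 * 1024 * 1024)) > /sys/fs/cgroup/memory/demo/memory.limit_in_bytes
# Move the current shell into the group
echo $$ > /sys/fs/cgroup/memory/demo/tasks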

v2 – Unified hierarchy:
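The same idea on a v2 unified hierarchy (controllers may already be enabled on systemd hosts):

```bash
# Enable the memory controller for child groups
echo "+memory" > /sys/fs/cgroup/cgroup.subtree_control
mkdir /sys/fs/cgroup/demo
echo 64M > /sys/fs/cgroup/demo/memory.max
echo $$ > /sys/fs/cgroup/demo/cgroup.procs
```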

🧭 Process & Group Inspection

| Command | Description |
|---|---|
| cat /proc/self/cgroup | Shows current cgroup membership |
| cat /proc/PID/cgroup | cgroup of another process |
| cat /proc/PID/status | Memory and cgroup info |
| ps -o pid,cmd,cgroup | Show process-to-cgroup mapping |

πŸ“¦ Usage in Containers

Container engines like Docker, Podman, and containerd delegate resource control to cgroups (via runc or crun), allowing:

  • Per-container CPU and memory limits

  • Fine-grained control over blkio and devices

  • Real-time resource accounting

Docker example:
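A sketch with standard docker run flags:

```bash
# Limit the container to 512 MB of RAM and one CPU
docker run -d --name limited --memory 512m --cpus 1 nginx
# Verify the applied limits
docker inspect limited --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}'
```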

Behind the scenes, this creates cgroup rules for memory and CPU limits for the container process.

🧠 Concepts Summary

| Concept | Explanation |
|---|---|
| Controllers | Modules like cpu, memory, blkio, etc. apply limits and rules |
| Tasks | PIDs (processes) assigned to the control group |
| Hierarchy | cgroups are structured in a parent-child tree |
| Delegation | systemd and user services may manage subtrees of cgroups |

πŸ§ͺ Lab Cgroups

Use this script for the lab: cgroups.sh


πŸ›‘οΈ Understanding Capabilities

❓ What Are Linux Capabilities?

Traditionally in Linux, the root user has unrestricted access to the system. Linux capabilities were introduced to break down these all-powerful privileges into smaller, discrete permissions, allowing processes to perform specific privileged operations without requiring full root access.

This enhances system security by enforcing the principle of least privilege.

πŸ” Capability
πŸ“‹ Description

CAP_CHOWN

Change file owner regardless of permissions

CAP_NET_BIND_SERVICE

Bind to ports below 1024 (e.g., 80, 443)

CAP_SYS_TIME

Set system clock

CAP_SYS_ADMIN

⚠️ Very powerful – includes mount, BPF, and more

CAP_NET_RAW

Use raw sockets (e.g., ping, traceroute)

CAP_SYS_PTRACE

Trace other processes (debugging)

CAP_KILL

Send signals to any process

CAP_DAC_OVERRIDE

Modify files and directories without permission

CAP_SETUID

Change user ID (UID) of the process

CAP_NET_ADMIN

Manage network interfaces, routing, etc.

πŸ” Some Linux Capabilities Types

Capability Type
Description

CapInh (Inherited)

Capabilities inherited from the parent process.

CapPrm (Permitted)

Capabilities that the process is allowed to have.

CapEff (Effective)

Capabilities that the process is currently using.

CapBnd (Bounding)

Restricts the maximum set of effective capabilities a process can obtain.

CapAmb (Ambient)

Allows a process to explicitly define its own effective capabilities.

πŸ“¦ Capabilities in Containers and Pods

Containers typically do not run as full root; instead, they receive a limited default set of capabilities, depending on the runtime.

Capabilities can be added or dropped in Kubernetes using the securityContext.

πŸ“„ Kubernetes example:
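A minimal sketch of a Pod spec (names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cap-demo
spec:
  containers:
    - name: app
      image: nginx
      securityContext:
        capabilities:
          drop: ["ALL"]                # start with zero privileges
          add: ["NET_BIND_SERVICE"]    # re-add only what is needed
```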

πŸ” This ensures the container starts with zero privileges and receives only what is needed.

πŸ§ͺ Lab Capabilities

Use this script for the lab: capabilities.sh

πŸ›‘οΈ Seccomp (Secure Computing Mode)

What is it?

  • A Linux kernel feature for restricting which syscalls (system calls) a process can use.

  • Commonly used in containers (like Docker), browsers, sandboxes, etc.

How does it work?

  • A process enables a seccomp profile/filter.

  • The kernel blocks, logs, or kills the process if it tries forbidden syscalls.

  • Filters are written in BPF (Berkeley Packet Filter) format.

Quick commands
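A few representative commands (the profile file name is illustrative):

```bash
# 0 = disabled, 1 = strict, 2 = filter
grep Seccomp /proc/self/status
# Run a container with a custom seccomp profile
docker run --rm --security-opt seccomp=profile.json alpine sh
# Run without any seccomp filtering (debugging only)
docker run --rm --security-opt seccomp=unconfined alpine sh
```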

Tools

🦺 AppArmor

What is it?

  • A Mandatory Access Control (MAC) system for restricting what specific programs can access.

  • Profiles are text-based, path-oriented, easy to read and edit.

How does it work?

  • Each binary can have a profile that defines its allowed files, network, and capabilitiesβ€”even as root!

  • Easy to switch between complain, enforce, and disabled modes.

Quick commands:
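Typical commands from the apparmor-utils package (the profile name is illustrative):

```bash
aa-status                                  # loaded profiles and their modes
aa-complain /etc/apparmor.d/usr.bin.foo    # log violations without blocking
aa-enforce  /etc/apparmor.d/usr.bin.foo    # enforce the profile again
apparmor_parser -r /etc/apparmor.d/usr.bin.foo   # reload after editing
```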

Tools:

aa-genprof, aa-logprof for generating/updating profiles

Logs: AppArmor denials appear in the kernel log (dmesg) and typically in /var/log/syslog or the audit log, tagged apparmor="DENIED".

πŸ”’ SELinux (Security-Enhanced Linux)

What is it?

  • A very powerful MAC system for controlling access to everything: files, processes, users, ports, networks, and more.

  • Uses labels (contexts) and detailed policies.

How does it work?

  • Everything (process, file, port, etc.) gets a security context.

  • Kernel checks every action against policy rules.

Quick commands:
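Representative commands:

```bash
getenforce             # Enforcing, Permissive or Disabled
sudo setenforce 0      # switch to permissive until reboot
ls -Z /var/www         # file security contexts
ps -eZ | grep nginx    # process contexts
sestatus               # overall SELinux status
```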

Tools:

  • audit2allow, semanage, chcon (for managing policies/labels)

  • Logs: /var/log/audit/audit.log

  • Policies: /etc/selinux/

πŸ“‹ Summary Table for Common Security Systems

| System | Focus | Complexity | Policy Location | Typical Use |
|---|---|---|---|---|
| seccomp | Kernel syscalls | Medium | Per-process (via code/config) | Docker, sandboxes |
| AppArmor | Per-program access | Easy | /etc/apparmor.d/ | Ubuntu, Snap, SUSE |
| SELinux | Full-system MAC | Advanced | /etc/selinux/ + labels | RHEL, Fedora, CentOS |

πŸ—‚οΈ Linux container Isolation & Security Comparison

Technology
Purpose / What It Does
Main Differences
Example in containers

chroot 🏠

Changes the apparent root directory for a process. Isolates filesystem.

Simple filesystem isolation; doesnot restrict resources, privileges, or system calls.

Docker uses chroot internally for building minimal images, but not for strong isolation.

cgroups πŸ“Š

Controls and limits resource usage (CPU, memory, disk I/O, etc.) per group of processes.

Kernel feature; fine-grained resource control, not isolation.

Docker and Kubernetes use cgroups to limit CPU/mem per container/pod.

namespaces 🌐

Isolate system resources: PID, mount, UTS, network, user, IPC, time.

Kernel feature; provides different kinds of isolation.

Each container runs in its own set of namespaces (PID, net, mount, etc).

capabilities πŸ›‘οΈ

Split root privileges into fine-grained units (e.g., net_admin, sys_admin).

More granular than all-or-nothing root/non-root; can drop or grant specific privileges.

Docker containers usually run with reduced capabilities (drop dangerous ones).

seccomp 🧱

Filter/restrict which syscalls a process can make (whitelisting/blacklisting).

Very focused: blocks kernel syscalls; cannot block all actions.

Docker’s default profile blocks dangerous syscalls (e.g.,ptrace, mount).

AppArmor 🐧

Mandatory Access Control (MAC) framework: restricts programs' file/network access via profiles.

Profile-based, easier to manage than SELinux; less fine-grained in some cases.

Ubuntu-based containers often use AppArmor for container process profiles.

SELinux πŸ”’

More complex MAC framework, label-based, very fine-grained. Can confine users, processes, and files.

More powerful and complex than AppArmor; enforced on Fedora/RHEL/CentOS.

On OpenShift/Kubernetes with RHEL, SELinux labels are used to keep pods separate.

Summary

  • chroot: Basic isolation, no resource/security guarantees.

  • cgroups: Resource control, not isolation.

  • namespaces: Isolate "views" of kernel resources.

  • capabilities: Fine-tune process privileges.

  • seccomp: Restrict system call surface.

  • AppArmor/SELinux: Limit what processes can touch, even as root (MAC).

🧩 OCI, runc, containerd, CRI, CRI-O β€” What They Are in the Container Ecosystem

Overview and Roles

  • OCI (Open Container Initiative) πŸ›οΈ

    A foundation creating open standards for container images and runtimes.

    Defines how images are formatted, stored, and how containers are started/stopped (runtime spec).

  • runc βš™οΈ

    A universal, low-level, lightweight CLI tool that can run containers according to the OCI runtime specification.

    β€œThe engine” that turns an image + configuration into an actual running Linux container.

  • containerd πŸ‹οΈ

    A core container runtime daemon for managing the complete container lifecycle: pulling images, managing storage, running containers (calls runc), networking plugins, etc.

    Used by Docker, Kubernetes, nerdctl, and other tools as their main container runtime backend.

  • CRI (Container Runtime Interface) πŸ”Œ

    A Kubernetes-specific gRPC API to connect Kubernetes with container runtimes.

    Not used outside Kubernetes, but enables K8s to talk to containerd, CRI-O, etc.

  • CRI-O πŸ₯€

    A lightweight, Kubernetes-focused runtime that only runs OCI containers, using runc under the hood.

    Mostly used in Kubernetes, but demonstrates how to build a minimal container runtime focused on open standards.

🏷️ Comparison Table: OCI, runc, containerd, CRI, CRI-O

| Component | What Is It? | Who Uses It? | Example Usage |
|---|---|---|---|
| πŸ›οΈ OCI | Standards/specifications | Docker, Podman, CRI-O, containerd, runc | Ensures images/containers are compatible across tools |
| βš™οΈ runc | Container runtime (CLI) | containerd, CRI-O, Docker, Podman | Directly running a container from a bundle (e.g., runc run) |
| πŸ‹οΈ containerd | Container runtime daemon | Docker, Kubernetes, nerdctl | Handles pulling images, managing storage/network; starts containers via runc |
| πŸ”Œ CRI | K8s runtime interface (API) | Kubernetes only | Lets kubelet talk to containerd/CRI-O |
| πŸ₯€ CRI-O | Lightweight container runtime for K8s | Kubernetes, OpenShift | Used as the K8s container engine |


πŸ› οΈ Practical Examples (General container World)

  • Building images:

    Any tool (Docker, Podman, Buildah) can produce images following the OCI Image Spec so they’re compatible everywhere.

  • Running containers:

    Both Podman and Docker ultimately use runc (via containerd or directly) to create containers.

  • Managing many containers:

    containerd can be used on its own (via ctr or nerdctl) or as a backend for Docker and Kubernetes.

  • Plug-and-play runtimes:

    Thanks to OCI, you could swap runc for another OCI-compliant runtime (like Kata Containers for VMs, or gVisor for sandboxing) without changing how you build or manage images.


🚒 Typical Stack

  • Docker: User CLI β†’ containerd β†’ runc

  • Podman: User CLI β†’ runc

  • Kubernetes: kubelet (CRI) β†’ containerd or CRI-O β†’ runc


🧠 Summary

  • OCI = Common language for images/runtimes (standards/specs)

  • runc = Actual tool that creates and manages container processes

  • containerd = Full-featured daemon that manages images, containers, lifecycle

  • CRI = Only for Kubernetes, to make runtimes pluggable

  • CRI-O = Lightweight runtime focused on Kubernetes, built on OCI standards and runc

🧩 Diagram: Container Ecosystem


πŸ§ͺ Lab runc

For the runc lab, you can use this script: runc.sh

πŸ§ͺ Lab containerd

For the containerd lab, you can use this script: containerd.sh


πŸš€ Podman, Buildah, Skopeo, OpenVZ, crun & Kata Containers – Fast Track


🐳 Podman

  • What is it? A container manager compatible with the Docker CLI, but daemonless and able to run rootless.

  • Use: Create, run, stop, and inspect containers and pods.

  • Highlights: No central daemon, safer for multi-user, integrates with systemd.


πŸ“¦ Buildah

  • What is it? Tool to build and manipulate container images (OCI/Docker) without a daemon.

  • Use: Building images in CI/CD pipelines or scripting.

  • Highlights: Lightweight, rootless support, used by Podman under the hood.


πŸ”­ Skopeo

  • What is it? Utility to inspect, copy, and move container images between registries without pulling or running them.

  • Use: Move images, check signatures and metadata.

  • Highlights: No daemon, ideal for automation and security.


🏒 OpenVZ

  • What is it? A container-based virtualization solution for Linux (pre-dating modern container tools).

  • Use: Lightweight VPS (virtual private servers) sharing the same kernel.

  • Highlights: Very efficient, but less isolated than a VM (shares the host kernel).


⚑ crun

  • What is it? Ultra-fast, minimal OCI runtime for containers, written in C (not Go).

  • Use: Executes containers with minimal overhead.

  • Highlights: Faster and lighter than runc, default for Podman on some systems.


πŸ›‘οΈ Kata containers

  • What is it? Open source project combining containers and VMs: each container runs in a lightweight micro-VM.

  • Use: Strong isolation for sensitive workloads or multi-tenant environments.

  • Highlights: VM-grade security, near-container performance.


πŸ“Š Comparison Table

| Project | Category | Isolation | Daemon? | Main Use | Rootless | Notes |
|---|---|---|---|---|---|---|
| Podman | Orchestration | Container | No | Manage containers | Yes | Docker-like CLI |
| Buildah | Build | N/A | No | Build images | Yes | For CI/CD, no container run |
| Skopeo | Image transfer | N/A | No | Move/check images | Yes | No container execution |
| OpenVZ | Virtualization | Container/VPS | Yes | Lightweight VPS | No | Kernel shared, legacy tech |
| crun | OCI runtime | Container | No | Fast container runtime | Yes | Faster than runc |
| Kata Containers | Runtime/VM | MicroVM per container | No | Strong isolation | Yes | VM-level security |


β˜‘οΈ Quick Recap

  • Podman: Modern, daemonless Docker alternative.

  • Buildah: Build images, doesn't run containers.

  • Skopeo: Moves/inspects images, never runs them.

  • OpenVZ: Legacy container-based VPS.

  • crun: Super fast, lightweight OCI runtime.

  • Kata: containers with VM-level isolation.

πŸ› οΈ 352.1 Important Commands

πŸ”— unshare

πŸ” lsns

πŸšͺ nsenter
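A combined sketch of these three tools (the target PID is illustrative):

```bash
# Start a shell in new UTS, PID and mount namespaces
sudo unshare --uts --pid --mount --fork /bin/bash
# List PID namespaces on the host
lsns -t pid
# Run a command inside another process's network namespace
sudo nsenter -t <PID> -n ip addr
```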

🌐 ip

πŸ“Š stat

πŸ› οΈ systemctl and systemd

πŸ—οΈ cgcreate

🏷️ cgclassify

πŸ›‘οΈ pscap - List Process Capabilities

πŸ›‘οΈ getcap /usr/bin/tcpdump

πŸ›‘οΈ setcap cap_net_raw=ep /usr/bin/tcpdump

πŸ›‘οΈ check capabilities by process

πŸ›‘οΈ capsh - capability shell wrapper

🦺 AppArmor - kernel enhancement to confine programs to a limited set of resources

πŸ”’ SELinux - Security-Enhanced Linux

βš™οΈ runc


(back to sub topic 352.1)

(back to topic 352)

(back to top)


πŸ“¦ 352.2 LXC

Weight: 6

Description: Candidates should be able to use system containers using LXC and LXD. The version of LXC covered is 3.0 or higher.

Key Knowledge Areas:

  • Understand the architecture of LXC and LXD

  • Manage LXC containers based on existing images using LXD, including networking and storage

  • Configure LXC container properties

  • Limit LXC container resource usage

  • Use LXD profiles

  • Understand LXC images

  • Awareness of traditional LXC tools

πŸ“‹ 352.2 Cited Objects

🧩 LXC & LXD – The Linux System Containers Suite


πŸ“¦ LXC (Linux Containers)

  • What is it?

    The core userspace toolset for managing application and system containers on Linux. Think of LXC as "chroot on steroids" – it provides lightweight process isolation using kernel features (namespaces, cgroups, AppArmor, seccomp, etc).

  • Use:

    • Run full Linux distributions as containers (not just single apps).

    • Useful for testing, legacy apps, or simulating servers.

  • Highlights:

    • CLI-focused: lxc-create, lxc-start, lxc-attach, etc.

    • Fine-grained control over container resources.

    • No daemon – runs per-container processes.

  • Best for:

    Linux experts who want total control and β€œbare-metal” feel for containers.

πŸ§ͺ Lab LXC

For the LXC lab, you can use this script: lxc.sh


🌐 LXD

  • What is it?

    LXD is a next-generation container and VM manager built on top of LXC. It offers a powerful but user-friendly experience for managing containers and virtual machines via REST API, CLI, or even a Web UI.

  • Use:

    • Manage system containers and virtual machines at scale.

    • Networked β€œcontainer as a service” with easy orchestration.

  • Highlights:

    • REST API: manage containers/VMs over the network.

    • Images: Instant deployment of many Linux distros.

    • Snapshots, storage pools, clustering, live migration.

    • Supports running unprivileged containers by default.

    • CLI: lxc launch, lxc exec, lxc snapshot, etc. (Yes, same prefix as LXC, but different backend!)

  • Best for:

    DevOps, sysadmins, cloud-native setups, lab environments.

πŸ“ LXD Storage: Feature Table (per backend)

Feature
dir
zfs
btrfs
lvm/lvmthin
ceph/cephfs

Snapshots

❌

βœ…

βœ…

βœ…

βœ…

Thin Provisioning

❌

βœ…

βœ…

βœ… (lvmthin)

βœ…

Resizing

❌

βœ…

βœ…

βœ…

βœ…

Quotas

❌

βœ…

βœ…

βœ… (lvmthin)

βœ…

Live Migration

❌

βœ…

βœ…

βœ…

βœ…

Deduplication

❌

βœ…

❌

❌

βœ… (Ceph)

Compression

❌

βœ…

βœ…

❌

βœ… (Ceph)

Encryption

❌

βœ…

❌

βœ… (LUKS)

βœ…

Cluster/Remote

❌

❌

❌

❌

βœ…

Best Use Case

Dev

Labs/Prod

Labs/Prod

Labs/Prod

Clusters, Enterprise

πŸ” Quick LXD Storage Summary

  • Storage Pools: Abstracts the backendβ€”multiple pools, different drivers per pool.

  • Available Drivers: dir, zfs, btrfs, lvm, lvmthin, ceph, cephfs (more via plugins).

  • Custom Volumes: Create, mount, unmount for containers/VMs.

  • Snapshots & Clones: Native, fast, supports backup/restore, copy-on-write migration.

  • Quotas & Resize: Easy live management for pools, containers, or volumes.

  • Live Migration: Move containers/VMs across hosts without downtime.

  • Security: Built-in encryption (ZFS, LVM, Ceph), ACLs, backup/restore, etc.

  • Enterprise-ready: Suits clustered and high-availability setups.


πŸ“Š LXC vs LXD Comparison Table

| Feature | 🏷️ LXC | 🌐 LXD |
|---|---|---|
| Type | Low-level userspace container manager | High-level manager (containers + VMs) |
| Interface | CLI only | REST API, CLI, Web UI |
| Daemon? | No (runs as processes) | Yes (central daemon/service) |
| Orchestration | Manual, scriptable | Built-in clustering & API |
| Images | Template-based | Full image repository, many OSes |
| Snapshots | Manual | Native, integrated |
| VM support | No | Yes (QEMU/KVM) |
| Use case | Fine-grained control, "bare-metal" | Scalable, user-friendly, multi-host |
| Security | Can be unprivileged, but DIY | Default unprivileged, more isolation |
| Best for | Linux pros, advanced scripting | DevOps, cloud, teams, self-service |


β˜‘οΈ Quick Recap

  • LXC = The low-level building blocks. Power and flexibility for container purists.

  • LXD = Modern, API-driven, scalable platform on top of LXC for easy container and VM management (single node or clusters).

πŸ—ƒοΈ LXC vs LXD - Storage Support (Summary)

Feature

LXC

LXD

Storage Backends

Local filesystem (default only)

dir(filesystem), zfs , btrfs , lvm , ceph , cephfs ,lvmthin

Storage Pools

❌ (just local paths, no native pools)

βœ… Multiple storage pools, each with different drivers

Snapshots

Manual/FS dependent

βœ… Native, fast, automatic, scheduled, consistent snapshots

Thin Provisioning

❌ (not supported natively)

βœ… Supported in ZFS, Btrfs, LVM thin, Ceph

Quotas

❌

βœ… Supported per container/volume (in ZFS, Btrfs, Ceph, LVMthin)

Live Migration

Limited

βœ… Live storage migration between hosts, copy-on-write

Encryption

❌

βœ… (ZFS, LVM, Ceph)

Custom Volumes

❌

βœ… Create, attach/detach custom storage volumes for containers/VMs

Remote Storage

❌

βœ… Ceph, CephFS, NFS, SMB support

Filesystem Features

Host dependent

ZFS: dedup, compress, snapshots, send/receive, cache, quotas. LVM: thin, snapshots, etc.

Resizing

Manual (via host)

βœ… Volumes and pools can be resized live

Storage Drivers

Basic/local only

Extensible plugins, multiple enterprise-ready drivers

πŸ“Š Final Storage Comparison Table

| | LXC | LXD |
|---|---|---|
| Storage backend | Local only | dir, zfs, btrfs, lvm, lvmthin, ceph, cephfs |
| Storage pools | ❌ | βœ… Multiple, independent, hot-pluggable |
| Snapshots | Limited/manual | βœ… Fast, automatic, consistent |
| Thin provisioning | ❌ | βœ… (ZFS, Btrfs, LVMthin, Ceph) |
| Quotas | ❌ | βœ… |
| Resizing | Manual | βœ… |
| Remote storage | ❌ | βœ… (Ceph, NFS, SMB) |
| Custom volumes | ❌ | βœ… |
| Cluster ready | ❌ | βœ… |
| Enterprise | No | Yes: HA, backup, migration, security, production ready |

🌐 LXC vs LXD - Network Support (Summary)

| Feature | LXC | LXD |
|---|---|---|
| Network types | bridge, veth, macvlan, phys, vlan | bridge, ovn, macvlan, sriov, physical, vlan, fan, tunnels |
| Managed networks | ❌ Manual (host config) | βœ… Natively managed via API/CLI, easy to create and edit |
| Network API | ❌ CLI commands only | βœ… REST API, CLI, integration with external tools |
| Bridge support | βœ… Manual | βœ… Automatic and advanced (L2, Open vSwitch, native bridge) |
| NAT & DHCP | ❌ Manual (iptables/dnsmasq) | βœ… Integrated NAT, DHCP, DNS, per-network configurable |
| DNS | ❌ Manual | βœ… Integrated DNS, custom domains, systemd-resolved integration |
| IPv6 | βœ… (manual, limited) | βœ… Full support: auto, DHCPv6, NAT6, routing |
| VLAN | βœ… (manual, host) | βœ… Native VLANs, easy configuration |
| SR-IOV | ❌ | βœ… Native support |
| Network ACLs | ❌ | βœ… ACLs, forwards, zones, peerings, firewall rules |
| Clustering | ❌ | βœ… Replicated and managed networks in clusters |
| Attach/detach | Manual (host) | βœ… CLI/API, hotplug, easy for containers/VMs |
| Security | Manual (host) | βœ… Isolation, firewall, ACL, firewalld integration, per-network rules |
| Custom routes | Manual | βœ… Custom routes support, multiple gateways |
| Network profiles | ❌ | βœ… Reusable network profiles |
| Monitoring | Manual | βœ… Status, IPAM, logs, detailed info via CLI/API |
| Enterprise | No | Yes: multi-tenant, ACL, clustering, cloud integration |

πŸ“Š Final Network Comparison Table

| | LXC | LXD |
|---|---|---|
| Network types | bridge, veth, vlan | bridge, ovn, macvlan, sriov, physical, vlan, fan, tunnels |
| Managed | ❌ | βœ… |
| NAT/DHCP/DNS | Manual | βœ… Integrated |
| VLAN | Manual | βœ… |
| SR-IOV | ❌ | βœ… |
| API | ❌ | βœ… |
| Clustering | ❌ | βœ… |
| Security/ACL | Manual | βœ… |
| Profiles | ❌ | βœ… |
| Enterprise | No | Yes |

πŸ§ͺ Lab LXD

For the LXD lab, you can use this script: lxd.sh

πŸ› οΈ 352.2 Important Commands

πŸ“¦ lxc

🌐 lxd
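A short sketch of everyday LXD usage (image alias and names are illustrative):

```bash
lxc launch ubuntu:22.04 web               # create and start a container
lxc list                                  # show containers and their IPs
lxc exec web -- bash                      # shell into the container
lxc config set web limits.memory=512MiB   # apply a resource limit
lxc snapshot web snap0                    # take a snapshot
lxc delete --force web                    # clean up
```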

(back to sub topic 352.2)

(back to topic 352)

(back to top)


🐳 352.3 Docker

\docker-architecture
\docker-runtime

Weight: 9

Description: Candidates should be able to manage Docker nodes and Docker containers. This includes understanding the architecture of Docker as well as how Docker interacts with the node's Linux system.

Key Knowledge Areas:

  • Understand the architecture and components of Docker

  • Manage Docker containers by using images from a Docker registry

  • Understand and manage images and volumes for Docker containers

  • Understand and manage logging for Docker containers

  • Understand and manage networking for Docker

  • Use Dockerfiles to create container images

  • Run a Docker registry using the registry Docker image

πŸ“‹ 352.3 Cited Objects

πŸ“– Definition

Docker is an open-source container platform that allows developers and operators to package applications and their dependencies into containers.

These containers ensure consistency across environments, speed up deployments, and reduce infrastructure complexity.


πŸ”‘ Key Concepts

  • πŸ“¦ Container β†’ Lightweight, isolated runtime sharing the host kernel.

  • πŸ–ΌοΈ Image β†’ Read-only template containing the app and dependencies.

  • βš™οΈ Docker Engine (dockerd) β†’ Daemon managing containers, images, and volumes.

  • ⌨️ Docker CLI β†’ Command-line tool (docker) communicating with the daemon.

  • ☁️ Docker Hub β†’ Default registry for storing and distributing images.


πŸš€ Advantages

  • ⚑ Lightweight & Fast β†’ Much faster than virtual machines.

  • 🌍 Portability β†’ Runs anywhere Docker is supported.

  • πŸ› οΈ Rich Ecosystem β†’ Compose, Swarm, Hub, Desktop UI, registries.

  • πŸ”„ DevOps Friendly β†’ CI/CD integration and IaC alignment.


πŸ“‘ Docker Registries

  • ☁️ Docker Hub β†’ Default, public registry.

  • 🏒 Private Registries β†’ Harbor, Artifactory, GitHub Container Registry.

  • πŸ”’ Use docker login to authenticate, push, and pull images.


Docker Images

\docker-images
  • Concept: immutable package with app, dependencies, and metadata.

  • Layers and cache: each Dockerfile instruction becomes a reusable layer; builds and pulls share layers.

  • Naming: registry/namespace/repo:tag (e.g., docker.io/library/nginx:1.27).

  • Digest: use @sha256:... to pin exact content (good for production).

  • Image vs container: image is read-only; container is an instance with an ephemeral write layer.

  • Basic commands: docker image ls, docker pull, docker run, docker inspect, docker history, docker tag, docker push, docker rmi, docker image prune -a, docker save/docker load.

  • Best practices: minimal base (alpine/distroless), multi-stage builds, pin versions/tags, run as non-root USER.

Docker Image Layers

In this example, I demonstrate Docker image layers.

The first image starts from an alpine base and adds one layer.

The second starts from my-base-image:1.0 and adds two layers, producing a new image named acme/my-final-image:1.0.

\docker-image-layers

Docker image Copy-on-Write (CoW)

In this example, I demonstrate Docker image copy-on-write (CoW) behavior.

Create 5 containers from the same image.

See the size of the containers.

To demonstrate this, run the following command to write the word 'hello' to a file on the container's writable layer in containers my_container_1, my_container_2, and my_container_3:

Check the size of the containers again.
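A sketch of these steps (image and container names are illustrative):

```bash
# 1. Create 5 containers from the same image
for i in 1 2 3 4 5; do
  docker run -d --name my_container_$i nginx
done

# 2. See the size of the containers: the image is shared; the writable layer is ~0 B
docker ps --size

# 3. Write a file in the writable layer of three of them
for i in 1 2 3; do
  docker exec my_container_$i sh -c 'echo hello > /out.txt'
done

# 4. Check again: only the modified containers grew
docker ps --size
```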

\docker-image-cow

🐳 Dockerfile Image Instructions and Layers

πŸ“Š Table: Instruction vs. Layer Generation

| Instruction | Creates a Filesystem Layer? | Notes |
|---|---|---|
| FROM | ❌ No | Sets the base image; underlying layers come from it. |
| RUN | βœ… Yes | Executes filesystem changes; adds content that persists. |
| COPY | βœ… Yes | Adds files from the build context into the image filesystem. |
| ADD | βœ… Yes | Similar to COPY, with additional features (URLs, tar extraction). |
| LABEL | ❌ No | Only adds metadata; doesn't change filesystem content. |
| ENV | ❌ No | Defines environment variables; stored as metadata. |
| ARG | ❌ No | Build-time only; does not affect the final image unless used later. |
| WORKDIR | ❌ No | Changes the working directory; metadata only. |
| USER | ❌ No | Sets the user; metadata only. |
| EXPOSE | ❌ No | Declares exposed port(s); metadata only. |
| ENTRYPOINT | ❌ No | Defines how the container starts; metadata configuration. |
| CMD | ❌ No | Default command or args; metadata only. |
| VOLUME | βœ… Yes / Partial | Declares mount points; metadata plus volumes at runtime; has filesystem implications. |
| HEALTHCHECK | ❌ No | Defines health check config; stored as metadata. |
| STOPSIGNAL | ❌ No | Defines the signal used to stop the container; metadata only. |
| SHELL | ❌ No | Changes the shell for later RUN; metadata only. |
| ONBUILD | ❌ No | Triggers for future builds; metadata only. |

πŸ”Ž Key Insights

  • Every Dockerfile instruction adds an entry to the image history, but only filesystem-modifying instructions produce real filesystem layers; metadata changes (CMD, EXPOSE, LABEL, etc.) are stored as part of the final image configuration.

  • Heavyweight layers come from instructions that modify the filesystem (RUN, COPY, ADD).

  • Lightweight metadata entries come from instructions like ENV, CMD, LABEL.

  • ARG is special: it exists only at build time and is discarded in the final image unless used in other instructions.

  • To minimize image size:

    • Combine multiple RUN commands into one.

    • Use .dockerignore to avoid copying unnecessary files.

    • Order instructions to maximize Docker's build cache efficiency.


🐳 Dockerfile

A Dockerfile is a declarative text file that contains a sequence of instructions to build a Docker image. It's the blueprint for creating reproducible, portable, and automated containerized environments.

✨ Key Concepts

| Concept | Description |
|---|---|
| πŸ“œ Declarative script | A simple text file with line-by-line instructions for assembling an image. |
| 🧱 Layered architecture | Each instruction in a Dockerfile creates a new layer in the image. Layers are stacked and read-only. |
| ⚑ Build cache | Docker caches the result of each layer. If a layer and its dependencies haven't changed, Docker reuses the cached layer, making builds significantly faster. |
| πŸ“¦ Build context | The set of files at a specified PATH or URL that are sent to the Docker daemon during a build. Use a .dockerignore file to exclude unnecessary files. |
| πŸ—οΈ Multi-stage builds | A powerful feature that allows multiple FROM instructions in a single Dockerfile. It separates build-time dependencies from runtime dependencies, resulting in smaller and more secure production images. |


πŸ“ Core Instructions

The following table summarizes the most common Dockerfile instructions.

| Instruction | Purpose | Example |
|---|---|---|
| 🏁 FROM | Specifies the base image for subsequent instructions. Must be the first instruction. | FROM ubuntu:22.04 |
| 🏷️ LABEL | Adds metadata to an image as key-value pairs. | LABEL version="1.0" maintainer="me@example.com" |
| πŸƒ RUN | Executes commands in a new layer on top of the current image and commits the result. | RUN apt-get update && apt-get install -y nginx |
| πŸš€ CMD | Provides defaults for an executing container. There can be only one CMD. | CMD ["nginx", "-g", "daemon off;"] |
| πŸšͺ ENTRYPOINT | Configures a container to run as an executable. | ENTRYPOINT ["/usr/sbin/nginx"] |
| 🌐 EXPOSE | Informs Docker that the container listens on the specified network ports at runtime. | EXPOSE 80 |
| 🌳 ENV | Sets environment variables. | ENV APP_VERSION=1.0 |
| πŸ“‚ COPY | Copies new files or directories from the build context into the image filesystem. | COPY ./app /app |
| πŸ”— ADD | Similar to COPY, but with extra features like remote URL support and tar extraction. | ADD http://example.com/big.tar.xz /usr/src |
| πŸ‘€ USER | Sets the user name (or UID) and optionally the group (or GID) used when running the image. | USER appuser |
| πŸ“ WORKDIR | Sets the working directory for subsequent RUN, CMD, ENTRYPOINT, COPY, and ADD instructions. | WORKDIR /app |
| πŸ’Ύ VOLUME | Creates a mount point and marks it as holding externally mounted volumes. | VOLUME /var/lib/mysql |
| πŸ—οΈ ONBUILD | Adds a trigger instruction executed later, when the image is used as the base for another build. | ONBUILD COPY . /app/src |
| πŸ’Š HEALTHCHECK | Tells Docker how to test a container to check that it is still working. | HEALTHCHECK --interval=5m --timeout=3s CMD curl -f http://localhost/ \|\| exit 1 |
| 🐚 SHELL | Overrides the default shell used for the shell form of commands. | SHELL ["/bin/bash", "-c"] |


⭐ Best Practices for Writing Dockerfiles

Following best practices is crucial for creating efficient, secure, and maintainable images.

| Guideline | Description |
|---|---|
| 🀏 Keep it small | Start with a minimal base image (like alpine or distroless). Don't install unnecessary packages; this reduces size and attack surface. |
| ♻️ Leverage the build cache | Order Dockerfile instructions from least to most frequently changing. Place COPY and ADD as late as possible to avoid cache invalidation. |
| πŸ—οΈ Use multi-stage builds | Separate the build environment from the runtime environment. This dramatically reduces the final image size by excluding build tools and dependencies. |
| 🚫 Use .dockerignore | Exclude files not needed for the build (e.g., .git, node_modules, local test scripts) to keep the build context small and avoid sending sensitive data. |
| πŸ“¦ Combine RUN instructions | Chain related commands with && to create a single layer. For example, combine apt-get update with apt-get install and clean up afterwards (rm -rf /var/lib/apt/lists/*). |
| πŸ“Œ Pin versions | Pin versions for base images (ubuntu:22.04) and packages (nginx=1.21.6-1~bullseye) to ensure reproducible builds and avoid unexpected changes. |
| πŸ‘€ Run as non-root | Create a dedicated user and group with RUN useradd, and use the USER instruction to switch to it. This avoids running containers with root privileges. |
| πŸš€ CMD vs ENTRYPOINT | Use ENTRYPOINT for the main executable of the image and CMD for default arguments. This makes the image behave like a binary. |
| πŸ’¬ Sort multi-line arguments | Sort multi-line arguments alphanumerically (e.g., in a long RUN apt-get install command) to keep the Dockerfile easy to read and maintain. |
| πŸ“ Be explicit | Use COPY instead of ADD when the extra magic of ADD (tar extraction, URL fetching) is not needed. It's more transparent. |

Dockerfile example
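A minimal multi-stage sketch (a Go app; module layout and names are illustrative):

```dockerfile
# Build stage: full toolchain, discarded afterwards
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Runtime stage: small pinned base, non-root user
FROM alpine:3.20
RUN adduser -D appuser
USER appuser
COPY --from=build /out/app /usr/local/bin/app
EXPOSE 8080
ENTRYPOINT ["app"]
```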


Docker + containerd + shim + runc Architecture

\Docker shim architecture example

πŸ”Ή Main Components

  • Docker CLI / Docker Daemon (dockerd)

    The docker command communicates with the Docker daemon, which orchestrates container lifecycle, images, networks, and volumes.

  • containerd

    A high-level container runtime that manages the entire container lifecycle: pulling images, managing storage, networking, and execution.

  • containerd-shim

    • Acts as the parent process of each container once runc has done its job.

    • Keeps stdin/stdout/stderr streams open, even if Docker or containerd restarts (so docker logs / kubectl logs still work).

    • Collects the container exit code and reports it back to the manager.

    • Prevents containers from becoming orphans if the daemon crashes or is restarted.

  • runc

    A low-level runtime (OCI-compliant) that creates containers using Linux namespaces and cgroups.

    After launching the container, runc exits, and containerd-shim takes over as the parent process.


πŸ”Ή Execution Flow

  1. User runs docker run ... β†’ the Docker daemon is called.

  2. The Docker daemon delegates to containerd.

  3. containerd spawns runc, which sets up the container.

  4. Once the container starts, runc exits.

  5. containerd-shim remains as the container's parent process, handling logging and exit codes.


πŸ”Ή Benefits of the Shim Layer

  • Resilience β†’ Containers continue running even if dockerd or containerd crash or restart.

  • Logging β†’ Maintains container log streams for docker logs or kubectl logs.

  • Isolation β†’ Each container has its own shim, simplifying lifecycle management.

  • Standards Compliance β†’ Works with the OCI runtime spec, ensuring compatibility.

βš–οΈ Docker vs. containerd

| πŸ”Ή Feature / Component | 🐳 Docker (dockerd) | πŸ‹ containerd |
|---|---|---|
| Scope | Full platform (build, CLI, UI, Hub) | Core container runtime only |
| API | High-level Docker API | Low-level CRI/runtime API |
| Built upon | Uses containerd internally | Standalone runtime |
| Features | Build, Compose, Swarm, Hub, Desktop | Image lifecycle, pull/run, runtime |
| Use cases | Dev workflows, local testing | Kubernetes, production runtimes |
| Footprint | Heavier, more tooling | Lightweight, efficient |
| Ecosystem | Rich developer tools | CNCF project, Kubernetes default |

Docker Storage

🧱 Core Concepts

πŸ” Focus
Details

Union FS

Read-only image layers + the container's writable layer form a union filesystem; removing the container drops ephemeral changes.

Data Root

Storage drivers persist data under /var/lib/docker/<driver>/; inspect the active driver via docker info --format "{{.Driver}}".

Persistence

Move stateful data to volumes (persistent), bind mounts (host path), or tmpfs mounts (in-memory, ephemeral) to survive container recreation or optimize performance.

βš™οΈ Storage Drivers

| Driver | When to use | Notes |
|---|---|---|
| overlay2 | Default on modern Linux kernels. | Fast copy-on-write; the backing filesystem must support d_type. |
| fuse-overlayfs | Rootless or user-namespace deployments. | Adds a thin FUSE layer; enables non-root workflows. |
| btrfs / zfs | Need native snapshots, quotas, compression. | Provision dedicated pools and use platform tooling for management. |
| devicemapper (direct-lvm) / aufs | Legacy setups only. | Maintenance mode; plan migrations to overlay2. |
| windowsfilter | Windows container images. | Use LCOW/WSL 2 to expose overlay2 for Linux workloads on Windows hosts. |

🧭 Selecting the Driver

  • Confirm kernel modules (modprobe overlay) and filesystem prerequisites before switching drivers.

  • Match driver features to workloads: many small layers favor overlay2; filesystem-level snapshots may justify btrfs or zfs.

  • Stick to provider defaults on Docker Desktop, EKS, GKE, etc., to stay within support boundaries.

  • Keep /var/lib/docker on reliable, low-latency storageβ€”copy-on-write drivers amplify slow disks.

For testing storage drivers, use the script: docker-storage-driver.sh.

πŸ“¦ Docker Storage Types

Volumes:

  • Managed by Docker, located outside the container's writable layer (/var/lib/docker/volumes).

  • Persist after container removal, can be shared between containers.

  • Used for data that must survive the container lifecycle.

  • Examples:

    • Create volume: docker volume create data

    • Use volume: docker run -v data:/app/data ...

Bind Mounts:

  • Mount a host directory/file directly into the container.

  • Useful for development, code sync, or accessing existing host data.

  • Less portable (absolute paths, host permissions).

  • Examples:

    • docker run -v /home/user/app:/app ...

    • docker run --mount type=bind,source=/data,target=/app/data ...

Tmpfs Mounts:

  • In-memory mount (RAM), does not persist after container stops or restarts.

  • Ideal for temporary data, caches, or sensitive information.

  • Nothing is written to disk, maximum performance.

  • Examples:

    • docker run --mount type=tmpfs,target=/tmp/cache ...

    • docker run --tmpfs /tmp/cache ...

Quick summary:

| Type | Persistence | Location | Portability | Typical use |
|---|---|---|---|---|
| Volume | Yes | Docker | High | App data, databases |
| Bind mount | Optional | Host | Low | Dev, integration |
| tmpfs | No | RAM | High | Cache, ephemeral |

πŸ› οΈ Storage Types Usage examples

βœ… Docker Storage Best practices

For testing storage volumes, use the script: docker-storage-volumes.sh.

Docker Networking

🌐 Core Concepts

πŸ” Focus
Details

User-defined networks

Create isolated topologies (docker network create) and attach/detach containers on demand with docker network connect or the --network flag.

Network stack sharing

By default each container gets its own namespace; --network container:<id> reuses another container's stack but disables flags like --publish, --dns, and --hostname.

Embedded DNS

Docker injects an internal DNS server per network; container names and --network-alias entries resolve automatically and fall back to the host resolver for external lookups.

Gateway priority

When a container joins multiple networks, Docker selects the default route via the highest --gw-priority; override IPs with --ip / --ip6 for deterministic addressing.

🚍 Default Drivers

| Driver | Use when | Highlights |
|---|---|---|
| bridge | Standalone workloads on a single host need simple east-west traffic. | Default bridge network ships with Docker; create user-defined bridges for DNS, isolation, and per-project scoping. |
| host | You need native host networking with zero isolation. | Shares the host stack; no port mapping needed; ideal for high-throughput or port-dynamic workloads. |
| overlay | Services must span multiple Docker hosts or Swarm nodes. | VXLAN-backed; requires the Swarm control plane (or an external KV store) to coordinate networks across engines. |
| macvlan | Containers must appear as physical devices on the LAN. | Assigns unique MAC/IP pairs from the parent interface; great for legacy integrations or strict VLAN segmentation. |
| ipvlan | The underlay restricts MAC addresses but permits L3 routing. | Provides per-container IPv4/IPv6 without extra MACs; supports L2 (ipvlan -l2) and L3 modes with VLAN tagging. |
| none | Full isolation is required. | Removes the network stack entirely; manual namespace wiring only (not supported for Swarm services). |
| Plugins | Built-in drivers fall short of SDN or vendor needs. | Install third-party network plugins from the Docker ecosystem to integrate with specialized fabrics. |

πŸ•ΉοΈ Working with Networks

  • Scope infrastructure with user-defined bridges for app components, overlays for distributed stacks, or L2-style macvlan/ipvlan for direct LAN presence.

  • Combine frontend/backend networks per container; set --internal on a bridge to block egress while still allowing service-to-service traffic.

  • Inspect connectivity with docker network ls, docker network inspect, and docker exec <ctr> ip addr to validate namespace wiring.

  • Clean up unused networks regularly with docker network prune to avoid stale subnets and orphaned config.

🚦 Published Ports & Access

  • Bridge networks keep ports private unless you publish them with -p / --publish; include 127.0.0.1: (or [::1]: for IPv6) to restrict exposure to the host only.

  • Port publishing is not required for container-to-container access on the same user-defined bridgeβ€”DNS and the internal IP suffice.

  • Overlay and macvlan drivers bypass the userland proxy; plan upstream firewalls or routing accordingly.

πŸ” Addressing & DNS

  • IPv4 is enabled by default on new networks; add --ipv6 to provision dual-stack ranges and use --ip / --ip6 to pin addresses.

  • Each join operation can supply extra identities via --alias; Docker advertises them through the embedded DNS service.

  • Override resolvers per container using --dns, --dns-search, or --dns-option, or import extra hosts with --add-host.

  • Containers inherit a curated /etc/hosts; host-level entries are not synced automatically.

πŸ› οΈ Docker Network Usage examples

βœ… Docker Network Best practices

  • Model network boundaries early; document which containers share bridges, overlays, or macvlan segments.

  • Use --internal or firewalls to block unintended egress, and prefer network-level isolation over ad-hoc port publishing.

  • When mixing drivers, verify default routes (ip route) to ensure the correct gateway won the --gw-priority.

  • Monitor subnet allocation conflicts when multiple hosts create networks; explicitly set --subnet / --gateway for predictable CIDRs.

  • Cross-check the official docs for updates: the Networking overview and Network drivers pages.

For testing Docker networking, use the script: docker-network.sh.

🐳 Docker Registry

πŸ“˜ What is a Docker Registry?

A Docker Registry is a stateless, highly scalable server-side application that stores and lets you distribute Docker images. It's the central place where you can push your images after building them and pull them to run on other machines.

Key Concepts

  • Registry: The storage system that contains repositories of images. Examples: Docker Hub, AWS ECR, a self-hosted registry.

  • Repository: A collection of related Docker images, often different versions of the same application or service (e.g., the nginx repository).

  • Tag: A label applied to an image within a repository to identify a specific version (e.g., 1.27, latest).

  • Image Name: The full name of an image follows the format: [registry-host]/[username-or-org]/[repository]:[tag].

    • If registry-host is omitted, it defaults to Docker Hub (docker.io).

    • If tag is omitted, it defaults to latest.

Types of Registries

  1. Public Registries:

    • Docker Hub: The default and largest public registry.

    • Quay.io: Another popular public and private registry by Red Hat.

    • GitHub Container Registry (GHCR): Integrated with GitHub repositories and Actions.

  2. Private Registries:

    • Self-Hosted:

      • Docker Registry Image: A simple, official image to run your own basic registry.

      • Harbor: An enterprise-grade open-source registry with security scanning, access control, and replication.

      • JFrog Artifactory: A universal artifact manager that supports Docker images.

    • Cloud-Hosted:

      • Amazon Elastic Container Registry (ECR)

      • Google Artifact Registry (formerly GCR)

      • Azure Container Registry (ACR)

Running a Local Registry

You can easily run a private registry locally for testing or development using Docker's official registry image. The commands for each step are sketched after the list.

  1. Start the local registry container:

    This starts a registry listening on localhost:5000.

  2. Tag an image to point to the local registry: Before you can push an image to this registry, you need to tag it with the registry's host and port.

  3. Push the image to the local registry:

  4. Pull the image from the local registry: You can now pull this image on any machine that can access localhost:5000.

  5. Access the registry API: You can interact with the registry using its HTTP API. For example, to list repositories:
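A sketch of the five steps (tags and names are illustrative):

```bash
# 1. Start the local registry on port 5000
docker run -d -p 5000:5000 --name registry registry:2

# 2. Tag an image for the local registry
docker pull alpine
docker tag alpine localhost:5000/alpine:demo

# 3. Push it
docker push localhost:5000/alpine:demo

# 4. Pull it back
docker pull localhost:5000/alpine:demo

# 5. List repositories through the HTTP API
curl http://localhost:5000/v2/_catalog
```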

πŸš€ Core Commands

| Command | Description | Example |
|---|---|---|
| docker login | Log in to a Docker registry. | docker login myregistry.example.com |
| docker logout | Log out from a Docker registry. | docker logout |
| docker pull | Pull an image or a repository from a registry. | docker pull ubuntu:22.04 |
| docker push | Push an image or a repository to a registry. | docker push myregistry.com/myapp:1.0 |
| docker search | Search Docker Hub for images. | docker search nginx |

For testing a Docker registry, use the script: docker-registry-lab.sh.

πŸ› οΈ 352.3 Important Commands

🐳 docker

(back to sub topic 352.3)

(back to topic 352)

(back to top)


πŸ—‚οΈ 352.4 container Orchestration Platforms

Weight: 3

Description: Candidates should understand the importance of container orchestration and the key concepts Docker Swarm and Kubernetes provide to implement container orchestration.

Key Knowledge Areas:

  • Understand the relevance of container orchestration

  • Understand the key concepts of Docker Compose and Docker Swarm

  • Understand the key concepts of Kubernetes and Helm

  • Awareness of OpenShift, Rancher and Mesosphere DC/OS

(back to sub topic 352.4)

(back to topic 352)

(back to top)


🧩 Docker Compose

πŸ“˜ Docker Compose Command Reference

Docker Compose is a tool for defining and managing multi-container Docker applications using a YAML file (docker-compose.yml).

Below is a structured table of the main commands and their purposes.

πŸ“Š Table: Docker Compose Commands

| Command | Purpose | Example |
|---|---|---|
| ▢️ **docker compose up** | Build, (re)create, start, and attach to containers defined in docker-compose.yml. | docker compose up -d |
| ⏹️ **docker compose down** | Stop and remove containers, networks, volumes, and images created by up. | docker compose down --volumes |
| πŸ”„ **docker compose restart** | Restart running services. | docker compose restart web |
| 🟒 **docker compose start** | Start existing containers without recreating them. | docker compose start db |
| πŸ”΄ **docker compose stop** | Stop running containers without removing them. | docker compose stop db |
| 🧹 **docker compose rm** | Remove stopped service containers. | docker compose rm -f |
| πŸ—οΈ **docker compose build** | Build or rebuild service images. | docker compose build web |
| πŸ“₯ **docker compose pull** | Pull service images from a registry. | docker compose pull redis |
| πŸ“€ **docker compose push** | Push service images to a registry. | docker compose push api |
| πŸ“„ **docker compose config** | Validate and view the Compose file. | docker compose config |
| πŸ“‹ **docker compose ps** | List containers managed by Compose. | docker compose ps |
| πŸ“Š **docker compose top** | Display running processes of containers. | docker compose top |
| πŸ“œ **docker compose logs** | View output logs from services. | docker compose logs -f api |
| πŸ” **docker compose exec** | Run a command in a running service container. | docker compose exec db psql -U postgres |
| 🐚 **docker compose run** | Run one-off commands in a new container. | docker compose run web sh |
| πŸ”§ **Override files** | Use -f to specify multiple Compose files (overrides). | docker compose -f docker-compose.yml -f docker-compose.override.yml up |
| 🌐 Networking | Networks are auto-created; they can be declared explicitly in YAML. | docker network ls |
| πŸ“¦ Volumes | Manage persistent data; can be declared in YAML and used across services. | docker volume ls |

πŸ”‘ Key Notes

  • up vs start: up builds/recreates containers; start only starts existing ones.

  • run vs exec: run launches a new container; exec runs a command inside an existing one.

  • Config validation: always run docker compose config to check for syntax errors.

  • Detached mode: use -d to run services in the background.

πŸ“„ docker-compose.yml
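
The original example file is not reproduced here; below is a minimal sketch consistent with the explanation that follows. The service names (web, api, db), the db-data volume, and the ./html bind mount come from that explanation; the specific images, the ./api build context, and the app-net network name are illustrative assumptions.

```yaml
services:
  web:
    image: nginx:alpine                    # static front end (illustrative image)
    ports:
      - "8080:80"                          # host 8080 -> container 80
    volumes:
      - ./html:/usr/share/nginx/html       # bind mount for static content
    networks:
      - app-net

  api:
    build: ./api                           # hypothetical build context with a Dockerfile
    depends_on:
      - db                                 # start db first (does not wait for readiness)
    networks:
      - app-net

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example           # illustrative only
    volumes:
      - db-data:/var/lib/postgresql/data   # named volume for persistent DB data
    networks:
      - app-net

volumes:
  db-data:

networks:
  app-net:
```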

πŸ”Ž Explanation

  • services: Defines the containers (web, api, db) that make up the app.

  • ports: Maps host ports to container ports (8080:80).

  • volumes:

    • Named volume (db-data) for persistent DB data.

    • Bind mount (./html:/usr/share/nginx/html) to serve static content.

  • build: Allows building a custom image from a Dockerfile.

  • depends_on: Controls service startup order (api starts after db); note that it does not wait for the dependency to be ready.

  • networks: Defines an isolated virtual network for communication between services.

πŸš€ Usage

Start the stack in detached mode, as shown below.
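
For example, assuming the sketch above is saved as docker-compose.yml in the current directory:

```bash
# Start all services in the background (detached mode)
docker compose up -d

# Check status and follow logs
docker compose ps
docker compose logs -f

# Tear everything down, including named volumes
docker compose down --volumes
```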

For testing Docker Compose, use the example services in the apps directory.

🌐 Docker Swarm

\swarm-nodes Swarm architecture with manager and worker nodes

\swarm-services Swarm services with multiple replicas

Docker Swarm is Docker's native orchestration tool that allows you to manage a cluster of Docker hosts as a single virtual system. It facilitates the deployment, management, and scaling of containerized applications across multiple machines.

Docker Swarm Key Concepts

| Concept | Description |
| --- | --- |
| 🌐 Swarm | A cluster of Docker hosts running in swarm mode. |
| πŸ€– Node | A Docker host participating in the swarm. Nodes can be either managers or workers. |
| πŸ‘‘ Manager Node | Responsible for managing the swarm's state, scheduling tasks, and maintaining the desired state of the cluster. |
| πŸ‘· Worker Node | Executes tasks assigned by manager nodes, running the actual containers. |
| πŸš€ Service | An abstract definition of a computational resource (e.g., an Nginx web server) that can be scaled and updated independently. |
| πŸ“ Task | A running container that is part of a service. |

✨ Main characteristics

| Feature | Description |
| --- | --- |
| ⬆️ High availability | Distributes services across multiple nodes, ensuring applications remain available even if a node fails. |
| βš–οΈ Scalability | Easily scale services up or down to handle varying workloads. |
| πŸ”„ Load balancing | Built-in load balancing distributes requests evenly among service replicas. |
| πŸš€ Rolling updates | Perform updates to services with zero downtime. |
| 😊 Ease of use | Integrated directly into Docker Engine, making it relatively simple to set up and manage compared to other orchestrators. |

🐳 Swarm Management Commands

| Command | Description | Example |
| --- | --- | --- |
| πŸ‘‘ `docker swarm init` | Initializes a new swarm on the current node, making it the manager. | `docker swarm init --advertise-addr 192.168.1.10` |
| πŸ‘· `docker swarm join` | Joins a node to an existing swarm as a worker or manager. | `docker swarm join --token <TOKEN> 192.168.1.10:2377` |
| πŸ‘‹ `docker swarm leave` | Removes the current node from the swarm. | `docker swarm leave --force` |
| πŸ“œ `docker node ls` | Lists all nodes in the swarm. | `docker node ls` |

πŸš€ Service Management Commands

| Command | Description | Example |
| --- | --- | --- |
| ✨ `docker service create` | Creates a new service in the swarm. | `docker service create --name web -p 80:80 --replicas 3 nginx` |
| βš–οΈ `docker service scale` | Scales one or more replicated services. | `docker service scale web=5` |
| πŸ”„ `docker service update` | Updates a service's configuration. | `docker service update --image nginx:latest web` |
| πŸ—‘οΈ `docker service rm` | Removes a service from the swarm. | `docker service rm web` |
| πŸ“œ `docker service ls` | Lists all services in the swarm. | `docker service ls` |
| πŸ“ `docker service ps` | Lists the tasks of one or more services. | `docker service ps web` |
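
A minimal end-to-end sketch combining the commands from both tables (the address, service name, and join token are illustrative):

```bash
# On the first node: initialize the swarm; this prints join tokens
docker swarm init --advertise-addr 192.168.1.10

# On each additional node: join as a worker using the token from `init`
docker swarm join --token <TOKEN> 192.168.1.10:2377

# Back on the manager: run three nginx replicas published on port 80
docker service create --name web -p 80:80 --replicas 3 nginx

# Scale up and watch task placement across nodes
docker service scale web=5
docker service ps web
```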

For testing Docker Swarm, use the script docker-swarm.sh.

☸️ Kubernetes

Kubernetes, also known as K8s, is an open-source platform for automating the deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery.

πŸ›οΈ Kubernetes Architecture

\Kubernetes Architecture

A Kubernetes cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node. The worker node(s) host the Pods which are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster.

✈️ Control Plane Components

The control plane's components make global decisions about the cluster (for example, scheduling), as well as detecting and responding to cluster events.

| Component | Description |
| --- | --- |
| kube-apiserver | Exposes the Kubernetes API. The API server is the front end for the Kubernetes control plane. |
| etcd | Consistent and highly-available key-value store used as Kubernetes' backing store for all cluster data. |
| kube-scheduler | Watches for newly created Pods with no assigned node, and selects a node for them to run on. |
| kube-controller-manager | Runs controller processes. Logically, each controller is a separate process, but to reduce complexity they are all compiled into a single binary and run in a single process. |
| cloud-controller-manager | Embeds cloud-specific control logic. It lets you link your cluster into your cloud provider's API, separating the components that interact with that cloud platform from those that only interact with your cluster. |

πŸ‘· Node Components

Node components run on every node, maintaining running pods and providing the Kubernetes runtime environment.

| Component | Description |
| --- | --- |
| kubelet | An agent that runs on each node in the cluster. It makes sure that containers are running in a Pod. |
| kube-proxy | A network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept. |
| Container runtime | The software that is responsible for running containers. Kubernetes supports several container runtimes: Docker, containerd, CRI-O, and any other implementation of the Kubernetes CRI (Container Runtime Interface). |
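
On many clusters (kubeadm-based ones, for example) the control plane components themselves run as Pods in the kube-system namespace; a quick way to inspect them and the nodes, assuming a configured kubectl:

```bash
# Control plane and node add-on components running as Pods
# (exact layout varies by distribution)
kubectl get pods -n kube-system -o wide

# Node status as reported by each node's kubelet
kubectl get nodes
```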

πŸ“¦ Kubernetes Objects

Kubernetes objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster.

| Object | Description |
| --- | --- |
| Pod | The smallest and simplest unit in the Kubernetes object model that you create or deploy. A Pod represents a set of running containers on your cluster. |
| Service | An abstract way to expose an application running on a set of Pods as a network service. |
| Volume | A directory containing data, accessible to the containers in a Pod. |
| Namespace | A way to divide cluster resources between multiple users. |
| Deployment | Provides declarative updates for Pods and ReplicaSets. You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. |
| ReplicaSet | Ensures that a specified number of pod replicas are running at any given time. |
| StatefulSet | Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods. |
| DaemonSet | Ensures that all (or some) Nodes run a copy of a Pod. |
| Job | Creates one or more Pods and ensures that a specified number of them successfully terminate. |
| CronJob | Creates Jobs on a time-based schedule. |
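
A short sketch of creating some of these objects with kubectl (assumes a working cluster and kubectl context; the name web is illustrative):

```bash
# Create a Deployment that manages a ReplicaSet of three nginx Pods
kubectl create deployment web --image=nginx --replicas=3

# Expose the Deployment inside the cluster as a Service on port 80
kubectl expose deployment web --port=80

# Inspect the resulting objects
kubectl get deployments,replicasets,pods,services
```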

⎈ Helm

Helm is a package manager for Kubernetes.

It helps you manage Kubernetes applications β€” Helm Charts help you define, install, and upgrade even the most complex Kubernetes application.

🎯 Key Concepts

| Concept | Description |
| --- | --- |
| Chart | A Helm package. It contains all of the resource definitions necessary to run an application, tool, or service inside of a Kubernetes cluster. |
| Repository | A place where charts can be collected and shared. |
| Release | An instance of a chart running in a Kubernetes cluster. One chart can often be installed many times into the same cluster; each time it is installed, a new release is created. |

πŸš€ Helm Core Commands

| Command | Description | Example |
| --- | --- | --- |
| `helm search` | Search for charts in a repository. | `helm search repo stable` |
| `helm install` | Install a chart. | `helm install my-release stable/mysql` |
| `helm upgrade` | Upgrade a release. | `helm upgrade my-release stable/mysql` |
| `helm uninstall` | Uninstall a release. | `helm uninstall my-release` |
| `helm list` | List releases. | `helm list` |
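
A typical Helm 3 workflow tying these commands together (the bitnami repository and the release name my-release are illustrative):

```bash
# Add a chart repository and refresh the local index
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Search for a chart, install it as a release, inspect, and remove it
helm search repo bitnami/mysql
helm install my-release bitnami/mysql
helm list
helm uninstall my-release
```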

πŸ—οΈ Other Orchestration Platforms

OpenShift

OpenShift is a family of containerization software products developed by Red Hat. Its flagship product is the OpenShift Container Platform β€” an on-premises platform as a service built around Docker containers orchestrated and managed by Kubernetes on a foundation of Red Hat Enterprise Linux.

Rancher

Rancher is a complete software stack for teams adopting containers. It addresses the operational and security challenges of managing multiple Kubernetes clusters across any infrastructure, while providing DevOps teams with integrated tools for running containerized workloads.

Mesosphere DC/OS

Mesosphere DC/OS (the Datacenter Operating System) is a distributed operating system based on the Apache Mesos distributed systems kernel. It can manage multiple machines in a datacenter or cloud as if they were a single computer, providing a highly elastic and scalable way of deploying applications, services, and big data infrastructure on shared resources.

πŸ› οΈ 352.4 Important Commands

