CSE221 - lec11: Virtualization: Container-based OS & Firecracker

date
Nov 8, 2024
tags
System

11 - Virtualization: Container-based OS & Firecracker

Container-based Operating System Virtualization: A Scalable, High-performance Alternative to Hypervisors

Terms clarification

  • PlanetLab:
    • a distributed computing platform for distributed systems research
    • resource constrained
  • Docker: a tool for building container images and running containers from them
  • Kubernetes: an orchestration system that deploys and manages containers across a cluster
  • VServer (Linux-VServer): the container system evaluated in the paper; it influenced how Linux supports containers today
  • VM: the paper uses "VM" for both hardware-virtualized machines and containers.

Container-based Virtualization

  • Each container is just a group of ordinary processes running on the host OS, logically separated from the others, with the illusion that it is the only container running on that OS.

Benefits of Containers (compared with hardware virtualization)

  • less overhead; more efficient (due to more sharing)
  • faster startup
  • more scalable: enables better overcommitment (not all containers are active at the same time, this could improve resource utilization)

Drawbacks of Containers (compared with hardware virtualization)

  • less isolation
  • does not support multiple OSes per host
  • less secure
  • need to modify host OS

Types of Isolation Requirements

  • Fault Isolation: buggy / malicious containers should not affect other containers
  • Resource Isolation: each container should receive the resources allocated to it, regardless of what other containers do.
  • Security Isolation: logical objects (e.g. pid, file, …) of a container should not be visible to others.

Techniques to achieve isolation

  • Contexts:
    • each container has its own context (namespace) for the logical and physical objects provided by the OS, such as PIDs, files, IP addresses, ports, …
  • Filters:
    • tag the objects provided by the OS with container-specific tags, and show each container only the objects carrying its own tag.
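
The two techniques can be sketched in Python on a toy in-memory process table. This is only an illustration: `ToyKernel`, `Process`, and the container names are hypothetical and are not taken from VServer.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Process:
    global_pid: int   # PID as the host kernel sees it
    container: str    # tag naming the owning container (used by the filter)
    name: str


@dataclass
class ToyKernel:
    """Hypothetical host kernel with one process table shared by all containers."""
    processes: list = field(default_factory=list)
    _next_pid: count = field(default_factory=lambda: count(1))

    def spawn(self, container: str, name: str) -> Process:
        proc = Process(next(self._next_pid), container, name)
        self.processes.append(proc)
        return proc

    def ps(self, container: str) -> list:
        """Filter: show only the processes tagged with this container."""
        return [p for p in self.processes if p.container == container]

    def local_pids(self, container: str) -> dict:
        """Context: renumber the visible processes in a per-container PID space."""
        return {pid: p.name for pid, p in enumerate(self.ps(container), start=1)}


kernel = ToyKernel()
kernel.spawn("web", "nginx")
kernel.spawn("db", "postgres")
kernel.spawn("web", "worker")

web_view = kernel.local_pids("web")   # {1: 'nginx', 2: 'worker'}
db_view = kernel.local_pids("db")     # {1: 'postgres'}
```

Both containers see a PID 1 of their own, and neither can name the other's processes, even though all three processes live in the same host table.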

Resource Allocation

  • Token Bucket Filter: each container accumulates tokens at a fixed rate over time; whenever it consumes a resource, it pays for it with its available tokens.
  • Resources such as CPU, network, and I/O bandwidth can be allocated in this manner.
  • This approach gives each container a specific share of the resource on average.
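
A minimal Python sketch of a token bucket, assuming an explicit clock passed in by the caller so the behaviour is deterministic. The class name, the `capacity` cap on saved-up tokens, and the numbers in the example are assumptions for illustration, not values from the paper.

```python
class TokenBucket:
    """Tokens accumulate at `rate` per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity   # assumption: the bucket starts full
        self.last = 0.0          # time of the last refill

    def _refill(self, now: float) -> None:
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def try_consume(self, cost: float, now: float) -> bool:
        """Pay `cost` tokens if available; otherwise refuse the request."""
        self._refill(now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=10, capacity=20)
first = bucket.try_consume(15, now=0.0)    # True: 20 tokens available, 5 left
second = bucket.try_consume(10, now=0.0)   # False: only 5 tokens left
third = bucket.try_consume(10, now=0.5)    # True: 5 + 0.5 * 10 = 10 tokens
```

Over a long run the container cannot spend more than `rate` tokens per second, which is what gives it a fixed average share; `capacity` bounds how large a burst it can save up for.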

File System

  • each container can see only its own subtree of the file system
  • copy-on-write (COW) is applied to files shared between containers
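
A rough Python sketch of both ideas, using dictionaries in place of a real file system. `ContainerFS`, the `/vservers/...` roots, and the file names are hypothetical; a real implementation works on inodes and disk blocks rather than strings.

```python
import posixpath


class ContainerFS:
    """Toy per-container file system view (illustrative only)."""

    def __init__(self, root, shared):
        self.root = root        # host subtree this container is confined to
        self.shared = shared    # files shared by all containers; never modified here
        self.private = {}       # this container's private copies

    @staticmethod
    def _normalize(path):
        # Anchor the path at "/" so that ".." can never climb above the root.
        return posixpath.normpath("/" + path.lstrip("/"))

    def host_path(self, path):
        """Map a container path to the host path inside the container's subtree."""
        return self.root + self._normalize(path)

    def read(self, path):
        key = self._normalize(path)
        if key in self.private:
            return self.private[key]
        return self.shared[key]

    def append(self, path, data):
        key = self._normalize(path)
        if key not in self.private:
            # First write: copy the shared file (or start empty) before modifying it.
            self.private[key] = self.shared.get(key, "")
        self.private[key] += data


shared = {"/etc/motd": "hello"}
a = ContainerFS("/vservers/a", shared)
b = ContainerFS("/vservers/b", shared)

a.append("/etc/motd", " from a")
escaped = a.host_path("../../etc/passwd")   # stays under /vservers/a
```

After the write, container `a` reads its private copy while container `b` and the shared original are unchanged, so unmodified files cost storage only once.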

Evaluation

  • Micro-benchmarks, network, disk: Xen shows high overhead compared with VServer
  • CPU, memory: similar performance, since these workloads make few hypercalls into the hypervisor.
  • The point of the paper is not that containers are better than hardware virtualization, but that they are an excellent fit for the applications that suit them.

Summary

  • container: virtualize at OS level
    • techniques: contexts, filters
  • techniques for resource allocation: token bucket filter

Firecracker: Lightweight Virtualization for Serverless Applications

Serverless Applications

  • programmers only need to implement a function
  • often short running
  • could rely on external storage (that is, the serverless application itself does not provide storage)

Use cases

  • short-running (often < 15s)
  • event-driven, on-demand work that does not need to run continuously
  • bursty or periodic
  • stateless
  • edge settings
  • microservices

Benefits of serverless applications

  • simpler for programmers, who only need to focus on application logic
  • finer-grained accounting: pay only for what you use
  • the ability to autoscale
  • easy to manage

Serverless Requirements

  • fast setup and teardown (~100ms), since many short-running jobs come and go
  • low overhead & high density
  • soft allocation: the ability to overcommit

Improvements in Firecracker

  • simplified VMM, specialized for serverless
    • runs only on Linux hosts (it is built on KVM)
    • supports only Linux and OSv as guest OSes
    • supports only a limited set of devices
  • controlled via RESTful API
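
To give a feel for the API, the sketch below builds the sequence of requests that configures and boots one microVM, and renders them as HTTP text without sending anything. The endpoint and field names follow the Firecracker API as I understand it and should be checked against the API specification for the version in use; the file names, boot arguments, and sizes are placeholder assumptions. In a real deployment these requests go to the API socket (a Unix domain socket) of a running Firecracker process.

```python
import json


def microvm_requests(kernel_path, rootfs_path, vcpus=1, mem_mib=128):
    """Return (method, path, body) tuples that configure and start one microVM."""
    return [
        ("PUT", "/machine-config",
         {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        ("PUT", "/boot-source",
         {"kernel_image_path": kernel_path,
          "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"}),
        ("PUT", "/drives/rootfs",
         {"drive_id": "rootfs", "path_on_host": rootfs_path,
          "is_root_device": True, "is_read_only": False}),
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]


def render(method, path, body):
    """Render one request as HTTP/1.1 text, as it would be written to the socket."""
    payload = json.dumps(body)
    return (
        f"{method} {path} HTTP/1.1\r\n"
        "Host: localhost\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(payload.encode())}\r\n"
        "\r\n"
        f"{payload}"
    )


# Placeholder file names; a real setup needs an uncompressed kernel and a rootfs image.
requests_to_send = microvm_requests("vmlinux", "rootfs.ext4")
start_request = render(*requests_to_send[-1])
```

The whole lifecycle is a handful of small JSON documents, which is part of what makes it practical for a control plane to create and destroy microVMs at a high rate.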

Evaluation

  • Boot time: 125ms (about half of QEMU's)
  • Memory overhead per VM: 3MB (131MB for QEMU)
  • I/O: mixed results
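
To see why the per-VM memory overhead matters for density, here is a back-of-the-envelope calculation in Python. The host size (256 GiB) and guest size (128 MiB) are assumptions chosen for illustration; only the 3MB and 131MB overheads come from the numbers above, and the sketch ignores overcommitment and the host's own memory use.

```python
def vms_per_host(host_mib: int, guest_mib: int, overhead_mib: int) -> int:
    """How many VMs fit if each one costs its guest memory plus VMM overhead."""
    return host_mib // (guest_mib + overhead_mib)


HOST_MIB = 256 * 1024   # assumed host memory: 256 GiB
GUEST_MIB = 128         # assumed memory given to each guest

firecracker_vms = vms_per_host(HOST_MIB, GUEST_MIB, overhead_mib=3)    # 2001
qemu_vms = vms_per_host(HOST_MIB, GUEST_MIB, overhead_mib=131)         # 1012
```

With small guests the VMM overhead dominates: under these assumptions the lighter VMM fits roughly twice as many VMs on the same host.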

Summary

  • hardware based virtualization
  • optimization & simplification to reduce overhead in certain settings

© Lifan Sun 2023 - 2024