CSE221 - lec18: File System Cont. : GFS
date: Dec 7, 2024
slug: cse221-lec18
status: Published
tags: System
type: Post
The Google File System
Settings
- very large files (unlike the small files we saw in UNIX): 100 MB to 1 GB
- writes: mostly sequential appends
- failures are common (due to the quality and quantity of the machines)
- care more about throughput than latency (typically large-scale data analysis rather than serving interactive user requests)
- OK with modifying applications (used mostly by experts at Google)
- large scale: thousands of machines
- support for concurrent writers (e.g., for MapReduce)
GFS
Files
- files are divided into fixed-size chunks (64 MB)
- each chunk is stored as a regular Linux file on a chunk server
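A minimal sketch of the chunk abstraction (assuming 64 MB chunks and byte offsets, as in the notes): a client maps a file offset to a chunk index plus an offset within that chunk.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as described above

def locate(offset):
    """Return (chunk_index, offset_within_chunk) for a byte offset."""
    return offset // CHUNK_SIZE, offset % CHUNK_SIZE

print(locate(0))               # (0, 0): start of the first chunk
print(locate(CHUNK_SIZE + 5))  # (1, 5): 5 bytes into the second chunk
```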
Namespace
- hierarchical
- no real directory data structure; directories are an illusion: just one big hash table mapping full pathnames to file metadata
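The flat-namespace idea can be sketched as follows (paths and metadata here are illustrative, not from GFS itself): there are no directory inodes, only a single map keyed by full pathname, so listing a "directory" is just a prefix scan.

```python
# One flat map from full pathname to file metadata -- no directory objects.
namespace = {
    "/logs/2024/a.log": {"chunks": [101, 102]},
    "/logs/2024/b.log": {"chunks": [103]},
    "/data/x": {"chunks": [200]},
}

def list_dir(prefix):
    """Simulate `ls` on the illusionary directory `prefix` via a prefix scan."""
    if not prefix.endswith("/"):
        prefix += "/"
    return sorted(p for p in namespace if p.startswith(prefix))

print(list_dir("/logs/2024"))  # ['/logs/2024/a.log', '/logs/2024/b.log']
```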
GFS Architecture
- client
- master: for metadata operations
- chunk servers: for data operations
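A toy sketch of this split (all names and data structures are hypothetical): the client contacts the master only for metadata (chunk handle and replica locations), then fetches the bytes directly from a chunk server.

```python
CHUNK_SIZE = 64 * 1024 * 1024

# Master holds metadata only: (path, chunk index) -> handle + replica locations.
master = {
    ("/logs/a.log", 0): {"handle": 101, "servers": ["cs1", "cs2", "cs3"]},
}
# Chunk servers hold the actual bytes, keyed by chunk handle.
chunkservers = {
    "cs1": {101: b"hello world"},
}

def read(path, offset, length):
    # 1) metadata operation: ask the master which chunk, and where it lives
    meta = master[(path, offset // CHUNK_SIZE)]
    # 2) data operation: read directly from one replica, bypassing the master
    data = chunkservers[meta["servers"][0]][meta["handle"]]
    start = offset % CHUNK_SIZE
    return data[start : start + length]

print(read("/logs/a.log", 0, 5))  # b'hello'
```

Keeping the master on the metadata path only is what lets one master scale to thousands of chunk servers.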
GFS Example
- control and data flow are separated
- the primary replica decides the write order
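A sketch of "the primary decides the write order" (class and method names are illustrative): concurrent mutations may reach the primary in any order, but the primary assigns serial numbers and every replica applies mutations in that serial order, so all replicas end up identical.

```python
class Replica:
    """A secondary replica: applies mutations in the order it is told."""
    def __init__(self):
        self.log = []

    def apply(self, serial, data):
        self.log.append((serial, data))

class Primary:
    """The primary replica: picks a serial order and forwards it."""
    def __init__(self, secondaries):
        self.serial = 0
        self.secondaries = secondaries

    def mutate(self, data):
        self.serial += 1                    # decide this mutation's position
        for r in self.secondaries:          # all replicas apply in that order
            r.apply(self.serial, data)
        return self.serial

secondaries = [Replica(), Replica()]
p = Primary(secondaries)
p.mutate(b"A")
p.mutate(b"B")
# every replica sees the same order: [(1, b'A'), (2, b'B')]
```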
Append Operations
- record-append
- the primary chunk server orders append operations
- at-least-once semantics: clients retry on failure, so reads may see duplicated or incomplete records
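The at-least-once behavior can be simulated in a few lines (the failure injection and names are illustrative): the record lands on the replica but the acknowledgment is lost, so the client's retry appends the same record a second time.

```python
class Chunk:
    """A chunk replica that loses one acknowledgment to simulate failure."""
    def __init__(self):
        self.records = []
        self.fail_once = True

    def append(self, rec):
        self.records.append(rec)           # the record actually lands...
        if self.fail_once:
            self.fail_once = False
            raise IOError("ack lost")      # ...but the client never hears back

def record_append(chunk, rec, retries=3):
    """At-least-once append: retry blindly until an append succeeds."""
    for _ in range(retries):
        try:
            chunk.append(rec)
            return
        except IOError:
            continue                       # retry -> possible duplicate

c = Chunk()
record_append(c, b"r1")
print(c.records)  # [b'r1', b'r1'] -- the duplicate left by the retry
```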
Expose Inconsistencies to Apps
- padding between records (when a record does not fit in a chunk's remaining space)
- duplicates (can be resolved using record ids)
- fragments (due to failures)
- padding and fragments can be detected using per-record checksums
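A sketch of the application-side cleanup these bullets imply (record format and helper names are my own): each record carries a record id and a checksum, so a reader can drop records whose checksum fails (padding/fragments) and skip record ids it has already seen (duplicates).

```python
import zlib

def make_record(rid, payload):
    """Illustrative record format: (record id, payload, checksum)."""
    return (rid, payload, zlib.crc32(payload))

def clean_read(records):
    """Filter out corrupt and duplicate records, as a GFS app would."""
    seen, out = set(), []
    for rid, payload, crc in records:
        if zlib.crc32(payload) != crc:
            continue            # padding or fragment: checksum fails
        if rid in seen:
            continue            # duplicate left by a retried append
        seen.add(rid)
        out.append(payload)
    return out

records = [
    make_record(1, b"a"),
    (2, b"garbled", 0),         # corrupt fragment: checksum does not match
    make_record(3, b"b"),
    make_record(3, b"b"),       # duplicate of record 3
]
print(clean_read(records))  # [b'a', b'b']
```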
Summary
- distributed FS
- expose inconsistencies to apps
- challenges: scale, consistency, failures