# 6.824 Lecture 2 - RPC and Threads

## Intro

• Why Go?
• stdlib RPC package
• Memory-safe, etc. etc.
• Each thread: one PC, one set of registers, one stack. Shared address space.
• I/O concurrency: a different thread can execute when one is waiting for IO
• Parallelism: 2+ compute threads actually running in parallel
• Convenience: scheduled and/or background jobs
• Async / Event-Driven
• Single thread, sits in a loop and waits for input
• IO concurrency but no parallelism
• Can be more performant when the number of jobs/threads would be large, like a server with a million clients
• Hybrid: start N threads with an event loop in each one
• Two levels of scheduling: OS scheduling context-switches between OS threads, Go then schedules goroutines onto that thread
• Data races: fix with mutexes, channels
• Deadlock: in its simplest form: $T1$ is blocked waiting on $T2$, and $T2$ is blocked waiting on $T1$.
• Can we consider individual instructions to be atomic in a threading context?
• No, some instructions are not
• A 32-bit store is likely atomic. If two threads write a 32-bit value to memory at the same time, you’ll end up with one or the other, but not a mixture.
• A one-byte store likely isn’t. If two threads write a one-byte value to memory at the same time, you might end up with a mixture based on which 8 bits are being modified. Depends on the processor, but this is likely implemented as: load 32 bits, modify 8 bits, store 32 bits.
• More complex instructions like increment likely aren’t unless explicitly specified.
• Should locks be stored inside a data structure, like a ConcurrentHashMap?
• Reasonable choice, but has some issues
• If you need to coordinate two of these data structures, locking needs to happen at a higher level, so the per-data structure locks are wasted. More importantly, this could cause deadlocks.
• Coordination in Go
• Channels
• sync.Cond
• WaitGroup

## Web Crawler

• Recursively fetch pages and pages they link to
• Avoid cycles because the graph is cyclic; find a tree-shaped subset of this graph
• The actual fetch is embarassingly parallel
• Three styles: serial/recursive (DFS), concurrent + mutex, concurrent + channels
• Go maps are always pass-by-reference
• Go closures are similar to JS; variables within the closure aren’t necessarily copied when the closure is defined.
• Go has a race detector; invoke with -race
• (Possibly among other things, it) Checks for reads of shared data from a goroutine right after a write from a different goroutine without an intermediate lock/unlock
• Not guaranteed to catch all data races
• Runtime check, not static analysis
• Concurrent + mutex version uses an unbounded number of goroutines