Concurrent & incremental marking

Marking is the expensive part of a major GC, because it must traverse the entire live object graph. If V8 did that in one stop-the-world pause, it could freeze the main thread for ~100 ms — a visible jank, a dropped frame, a stuttering animation. Incremental and concurrent marking exist to make that pause nearly disappear. This is where most of Orinoco's "short pause" reputation is earned.

::: info Ubiquitous language Incremental: split the work into small steps interleaved with JS execution. Concurrent: do the work on background threads while JS runs on the main thread. V8 uses both, together. :::

Incremental: many small steps

Incremental marking breaks the traversal into chunks and runs a little at a time — on allocation, or on small tasks — instead of all at once:

class IncrementalMarking final {
  void Start(GarbageCollector, GarbageCollectionReason, const char* reason);
  bool Stop();
  void AdvanceAndFinalizeIfComplete();   // do a step; finish if marking is done
  void AdvanceOnAllocation();            // do a step driven by allocation rate
};

— src/heap/incremental-marking.h#L55-L100

Each step is capped to a few milliseconds, and steps are paced by how fast the program allocates:

static constexpr size_t kMajorGCOldGenerationAllocationObserverStep = 256 * KB;
static constexpr v8::base::TimeDelta kMaxStepSizeOnTask =
    v8::base::TimeDelta::FromMilliseconds(1);
static constexpr v8::base::TimeDelta kMaxStepSizeOnAllocation =
    v8::base::TimeDelta::FromMilliseconds(5);

— src/heap/incremental-marking.cc#L50-L80

The pacing is adaptive: a program that allocates fast (and so will hit the heap limit sooner) is made to mark faster, so marking finishes before the heap fills.

Concurrent: marking on background threads

Incremental steps still run on the main thread. Concurrent marking goes further: it runs markers on background threads in parallel with JavaScript, so the main thread barely participates:

class ConcurrentMarking {
  void TryScheduleJob(GarbageCollector, TaskPriority = kUserVisible);
  void Join();
  bool Pause();
  void RescheduleJobIfNeeded(GarbageCollector, TaskPriority = kUserVisible);
};

— src/heap/concurrent-marking.h#L35-L100

This is hard because the main thread is mutating the object graph while background threads traverse it. Two mechanisms make it correct:

The write barrier records pointers the mutator writes during marking, so concurrent markers never miss an edge (preserving the tri-color invariant).
The marking bitmap with atomic set operations (see Mark-Compact) lets many threads claim objects without a global lock.

::: details A scalability detail worth admiring Updating each page's live-byte counter atomically for every marked object would serialize the markers on those counters. So each concurrent marker keeps a tiny local cache of per-page live bytes and flushes it with a CAS, collapsing many atomic ops into few:

// Caches page live bytes during concurrent marking to avoid costly CAS
// operations on MutablePage::live_byte_count_ for each traced object.
class MemoryChunkLiveBytesMap { … static constexpr size_t kTableSize = 32; };

— src/heap/concurrent-marking.cc#L57-L100. This kind of "batch the contention" pattern recurs all over high-performance runtimes. :::

How they combine

DEFINE_BOOL(incremental_marking, true,  "use incremental marking")
DEFINE_BOOL(concurrent_marking,  true,  "use concurrent marking")
DEFINE_INT(concurrent_marking_max_worker_num, 7, "max concurrent marking workers")
DEFINE_IMPLICATION(concurrent_marking, incremental_marking)

— src/flags/flag-definitions.h#L2545-L2591

Concurrent marking requires incremental marking (it relies on the same write barrier and start/finish structure). In a typical major GC: marking starts, runs concurrently on up to ~7 background threads (with incremental main-thread steps helping along), and then a short atomic pause finishes marking and does the parts that cannot be concurrent. The result is that the user-visible pause is a fraction of the total marking work.

The big picture

 ┌─ JS running on main thread ─────────────────────────────────────────┐
 │   (write barrier records mutations)                                   │
 └───────────────────────────────────────────────────────────────────────┘
        ▲ tiny incremental steps         ▲ short final pause: finish marking
 ═══════╪════════════════════════════════╪═════════════════════════════►  time
   background threads marking concurrently (atomic bitmap, work-stealing)

Concurrent & incremental marking

Incremental: many small steps

Concurrent: marking on background threads

How they combine

The big picture

See also