Compaction Error Categorization & Logging in Cassandra 4.x/5.x

Compaction failures in Apache Cassandra rarely announce themselves with a clean stack trace. They surface as latent I/O stalls, tombstone accumulation, or silent SSTable corruption that propagates into repair backlogs and a degraded read path. For DBAs and distributed-systems engineers running v4.x and v5.x clusters, a deterministic way to categorize those failures — and route each category to the correct remediation — is the difference between a five-minute automated recovery and a multi-hour incident. This guide sits under Advanced Compaction Strategy Tuning & Monitoring; read that first if you have not yet aligned your table strategy with your workload. Use this page when you need to turn raw log lines from system.log and debug.log into a severity-tiered signal that automation can act on without human triage.

Everything here is validated against Cassandra 4.0, 4.1, and 5.0. Where behaviour diverges between releases, the version-specific detail is called out inline: the compaction_throughput_mb_per_sec → compaction_throughput rename in 4.1, the introduction of UnifiedCompactionStrategy in 5.0 (STCS remains the table default unless you configure otherwise), and the system_views virtual tables.

How compaction failures surface in the logs

Cassandra’s compaction subsystem emits telemetry across two files: system.log carries INFO/WARN/ERROR operator-facing events, while debug.log carries the finer-grained CompactionTask and CompactionExecutor traces you need for root-cause work. In v5.0, UnifiedCompactionStrategy introduces more granular state transitions and dynamic I/O budgeting, but the underlying failure signatures fall into four stable categories. Categorizing on the signature — not on the severity Cassandra happened to log — is what makes routing deterministic.

I/O and resource exhaustion. java.io.IOException: No space left on device, FSWriteError/DiskFullException, or RejectedExecutionException from a saturated CompactionExecutor. These indicate storage-capacity breaches, filesystem quota limits, or a thread pool that has no headroom. A compaction that fails mid-write leaves partial SSTables that the next cycle must reconcile, so exhaustion is rarely a one-shot event.
SSTable integrity failures. CorruptSSTableException, ChecksumMismatchException, or an IndexOutOfBoundsException during read-ahead. These usually stem from interrupted compaction cycles, underlying NVMe/SSD degradation, or an unclean node shutdown that left an SSTable component (the Data.db, Index.db, or CompressionInfo.db file) inconsistent with its checksum.
Tombstone and GC pressure. Scanned over N tombstones warnings, TombstoneOverwhelmingException, or OutOfMemoryError: Java heap space paired with long GC pauses. These correlate directly with read amplification and, left unaddressed, trigger speculative-retry storms. The root cause lives in tombstone management and garbage collection, where gc_grace_seconds and purge eligibility decide whether compaction can actually reclaim the deletes.
Strategy and configuration drift. InvalidRequestException raised at DDL time when an ALTER TABLE ... WITH compaction = {...} statement carries invalid options — an out-of-range min_threshold/max_threshold, or a min_threshold greater than max_threshold. This is distinct from runtime throughput governance: compaction_throughput_mb_per_sec (renamed compaction_throughput in 4.1) is a cassandra.yaml / nodetool setcompactionthroughput setting and is not validated by table DDL. Drift of this kind follows manual ALTER TABLE edits, rolling restarts without validation, or mismatched cassandra.yaml deployments. Because strategy choice itself drives the failure surface, categorization has to be read against the trade-offs between STCS, LCS, and TWCS.
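
As a rough sketch of the four categories above, the signatures can be held in one ordered lookup. The category labels (io_exhaustion, sstable_integrity, tombstone_gc, config_drift) and the function name are illustrative assumptions, not names Cassandra defines; only the signatures come from this guide.

```python
import re

# Hypothetical category labels; the signatures are the ones listed above.
CATEGORY_SIGNATURES = [
    ("io_exhaustion",
     r"No space left on device|FSWriteError|DiskFullException|RejectedExecutionException"),
    ("sstable_integrity",
     r"CorruptSSTableException|ChecksumMismatchException|IndexOutOfBoundsException"),
    ("tombstone_gc",
     r"Scanned over \d+ tombstones|TombstoneOverwhelmingException|OutOfMemoryError: Java heap space"),
    ("config_drift", r"InvalidRequestException"),
]


def categorize(log_line: str) -> str:
    """Return the first category whose signature appears in the line."""
    for category, pattern in CATEGORY_SIGNATURES:
        if re.search(pattern, log_line):
            return category
    return "uncategorized"
```

Keeping the category separate from the severity tier means a routing change (say, demoting a category from CRITICAL to HIGH) does not require touching the signature patterns.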

Figure: compaction error categorization and remediation tree. It classifies a raw compaction exception by its signature, not by the log level Cassandra happened to emit, and routes it to the correct remediation branch.

Configuration reference: the knobs that shape each error class

Error categorization is only actionable if you know which configuration surface governs each category. The table maps the parameters you will read or set during triage. All are validated against 4.x/5.x; the throughput key name depends on release.

| Key | Scope | Default | Valid range | Impact on the failure surface |
| --- | --- | --- | --- | --- |
| `compaction_throughput` (4.1+); `compaction_throughput_mb_per_sec` (4.0) | cassandra.yaml / runtime | `64MiB/s` (4.1), `16` (4.0) | `0` (unthrottled) – disk ceiling | Too low → chronic backlog and exhaustion; too high → read/write thread starvation |
| `concurrent_compactors` | cassandra.yaml | min(cores, disks), capped | `1` – core count | Under-provisioning stalls the `CompactionExecutor`; over-provisioning triggers `RejectedExecutionException` and I/O contention |
| `tombstone_warn_threshold` | cassandra.yaml | `1000` | `100` – `100000` | Emits the `Scanned over N tombstones` WARN that seeds the tombstone category |
| `tombstone_failure_threshold` | cassandra.yaml | `100000` | `1000` – `1000000` | Above this, queries abort with `TombstoneOverwhelmingException` |
| `min_threshold` / `max_threshold` | table (STCS) | `4` / `32` | `2` – `64`, min ≤ max | Invalid pairs raise `InvalidRequestException` at DDL (the config-drift category) |
| `gc_grace_seconds` | table | `864000` | `0` – any | Shorter than the repair cadence → deleted data can resurrect; longer values delay tombstone purge and inflate GC pressure |

Validate table-level compaction options before they reach a node. A safe change on an STCS table looks like this, and the DDL is rejected atomically if the option set is invalid — a rejection here is a caught config-drift error, not a runtime failure:

ALTER TABLE sensor.readings
WITH compaction = {
  'class': 'SizeTieredCompactionStrategy',
  'min_threshold': '4',
  'max_threshold': '32'
};
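
A pre-flight check can catch the most common threshold mistakes before the statement reaches a coordinator. The sketch below is a client-side approximation that assumes only the two constraints named in this guide (the lower bound of 2 from the reference table, and min_threshold not exceeding max_threshold). The server remains the authority on what it accepts, and the function name is hypothetical.

```python
def stcs_threshold_errors(min_threshold: int, max_threshold: int) -> list[str]:
    """Return human-readable problems with an STCS threshold pair (empty if none)."""
    errors = []
    if min_threshold < 2:
        errors.append("min_threshold must be at least 2")
    if max_threshold < 2:
        errors.append("max_threshold must be at least 2")
    if min_threshold > max_threshold:
        errors.append("min_threshold must not exceed max_threshold")
    return errors
```

An empty list means the pair passes this local check; a non-empty list is a config-drift finding that never needs to reach a node.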

Runtime throughput governance is separate and is applied per node, without touching schema:

# Cassandra 4.1 / 5.0 — value is MiB/s; 0 disables the throttle
nodetool setcompactionthroughput 128
nodetool getcompactionthroughput

Step-by-step: parse, classify, and route compaction errors

The following procedure turns a live log stream into a severity-tiered signal. It is safe to run continuously on a production node; every step that acts on a live node is preceded by an inline safety gate.

1. Confirm the log surface and grep for the signatures

Before automating, verify the categories actually appear in your logs and that you are reading the right file. On 4.x/5.x the operator log is system.log:

grep -E "CorruptSSTableException|No space left on device|Scanned over [0-9]+ tombstones|InvalidRequestException" \
  /var/log/cassandra/system.log | tail -20

Expected output is a mix of WARN/ERROR lines with a class name and a keyspace/table scope. If nothing returns, widen to debug.log, where CompactionTask traces live.
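
When the same check has to run inside a collector rather than a shell, a stdlib-only equivalent of the grep is short. This is a sketch: the function name is illustrative, and the pattern is simply the one used in the grep above.

```python
import re
from collections import Counter
from typing import Iterable

SIGNATURES = re.compile(
    r"CorruptSSTableException|No space left on device|"
    r"Scanned over [0-9]+ tombstones|InvalidRequestException"
)


def count_signatures(lines: Iterable[str]) -> Counter:
    """Count how often each signature appears across the given log lines."""
    counts: Counter = Counter()
    for line in lines:
        match = SIGNATURES.search(line)
        if match:
            # Collapse the variable tombstone count into a single bucket.
            key = re.sub(r"[0-9]+", "N", match.group(0))
            counts[key] += 1
    return counts
```

Passing an open file handle for system.log works because the function only iterates over lines. An empty Counter is the cue to widen the search to debug.log.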

2. Classify each line by signature, not by log level

Cassandra’s own log level is unreliable for routing — a corrupt SSTable and a benign GC pause can both land at WARN. Classify on the signature. The rule set below is ordered so the most destructive categories win:

#!/usr/bin/env python3
# requirements: python>=3.10 (stdlib only)
"""Classify a Cassandra compaction log line into a severity tier."""
from __future__ import annotations
import re

# Ordered most-destructive first; first match wins.
SEVERITY_RULES: list[tuple[str, str]] = [
    (r"(No space left on device|DiskFullException|FSWriteError)", "CRITICAL"),
    (r"(CorruptSSTableException|ChecksumMismatchException|RejectedExecutionException)", "CRITICAL"),
    (r"(TombstoneOverwhelmingException|OutOfMemoryError: Java heap space)", "HIGH"),
    (r"Scanned over \d+ tombstones", "HIGH"),
    # GCInspector logs "<collector name> GC in <n>ms", so match that wording.
    (r"(ConcurrentMarkSweep|G1 \w+ Generation) GC in \d+ms", "MEDIUM"),
    (r"InvalidRequestException.*compaction", "MEDIUM"),
    (r"GC in \d+ms", "LOW"),
]

def classify_compaction_error(log_line: str) -> str:
    """Return the severity tier for a single log line, or INFO if unmatched."""
    for pattern, severity in SEVERITY_RULES:
        if re.search(pattern, log_line, re.IGNORECASE):
            return severity
    return "INFO"

This classifier is the same primitive the Python monitoring for Cassandra compaction pipeline consumes, where parsed events are enriched with JMX metrics and pushed to a central observability stack.
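
Before a line reaches the classifier it helps to split it into fields, so the log level travels as metadata instead of driving the routing. The sketch below assumes the stock logback layout (level, thread in brackets, ISO 8601 timestamp, source file and line, then the message); a customised logback.xml would need a different pattern, and the function name is hypothetical.

```python
import re
from typing import Optional

# Assumes the default layout: LEVEL [thread] yyyy-MM-dd HH:mm:ss,SSS File.java:NN - message
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+)\s+-\s+(?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[dict]:
    """Split one log line into named fields, or return None for continuation lines."""
    match = LOG_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None
```

Stack-trace continuation lines return None, so a caller can append them to the previous event's message before classifying it.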

3. Route each tier to its remediation — with a safety gate per action

The severity tier dictates the action. Each CRITICAL remediation isolates the node before mutating any data, so a false positive degrades throughput but never destroys an SSTable.

CRITICAL — isolate first: nodetool disablebinary to stop new client work, then nodetool disableautocompaction to freeze the failing subsystem. Only then run nodetool verify -e or nodetool scrub. Safety gate: confirm the replica set still satisfies your consistency level (a peer must be UN in nodetool status) before isolating.
HIGH — throttle and schedule: lower compaction_throughput via nodetool setcompactionthroughput, then schedule incremental repair, validating pending-task headroom against compaction backlog analysis and alerting baselines. Safety gate: do not raise throughput to “catch up” — that starves the read path.
MEDIUM — tune and observe: adjust runtime parameters or correct the offending DDL, monitor SSTable generation rates, and defer intervention unless backlog exceeds SLA.
LOW — record only: log for capacity trending and GC-tuning cycles; no automated action.
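
The CRITICAL safety gate can be approximated by parsing nodetool status output before any isolation command runs. This is a sketch under stated assumptions: it counts UN nodes cluster-wide, which is coarser than checking replica availability per token range, and the function names and the required_peers parameter are hypothetical.

```python
import re

# Data rows in `nodetool status` start with a two-letter state code such as "UN" or "DN".
STATUS_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def up_normal_peers(status_output: str, local_address: str) -> list[str]:
    """Addresses reported as UN, excluding the node about to be isolated."""
    peers = []
    for line in status_output.splitlines():
        row = STATUS_ROW.match(line.strip())
        if row and row.group(1) == "UN" and row.group(2) != local_address:
            peers.append(row.group(2))
    return peers


def safe_to_isolate(status_output: str, local_address: str, required_peers: int) -> bool:
    """Gate for the CRITICAL lane: proceed only if enough UN peers remain."""
    return len(up_normal_peers(status_output, local_address)) >= required_peers
```

Automation would call safe_to_isolate first and skip nodetool disablebinary entirely when it returns False, leaving the event for human triage.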

Figure: severity-routing pipeline. The classifier fans each line into a tiered lane, and every lane passes a safety gate (the lock) before its nodetool action runs.

4. Recover integrity failures in a strict order

Corruption is the only category where the order of operations is load-bearing. Streaming a replacement SSTable while compaction is still running risks overlapping the very file you are repairing. Run the sequence exactly:

Halt compaction on the affected node to prevent SSTable overlap during streaming: nodetool disableautocompaction <keyspace> <table>.
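Validate the data, not just the checksums: nodetool verify -e <keyspace> <table>. Plain nodetool verify checks only checksums; the -e (extended) flag reads and validates the rows and isolates corrupted segments. Recent patch releases refuse to run verify unless -f/--force is also passed (see CASSANDRA-17017), so check nodetool help verify on your build before scripting this step.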
Repair the range while watching streaming throughput: nodetool repair -pr <keyspace> <table> (incremental is the default on 4.x/5.x; pass -full only when a full repair is genuinely required), monitoring nodetool netstats and nodetool compactionstats as it runs.
Re-enable compaction only after pending tasks drop to zero and backlog metrics stabilize: nodetool enableautocompaction <keyspace> <table>.

This sequence keeps the fallback routing and read-path optimization mechanisms effective during the recovery window, and it aligns repair with the incremental paradigm rather than forcing an overlapping full repair.
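
The four steps can be sketched as an ordered plan plus a gate on the final step. Nothing below executes a command; the function names are hypothetical, and the gate assumes the "pending tasks: N" line that nodetool compactionstats prints.

```python
import re
from typing import Optional


def integrity_recovery_plan(keyspace: str, table: str) -> list[list[str]]:
    """Ordered nodetool invocations for the four-step sequence; nothing is executed."""
    return [
        ["nodetool", "disableautocompaction", keyspace, table],
        # Assumption: some releases also need -f/--force here; check `nodetool help verify`.
        ["nodetool", "verify", "-e", keyspace, table],
        ["nodetool", "repair", "-pr", keyspace, table],
        ["nodetool", "enableautocompaction", keyspace, table],
    ]


def pending_tasks(compactionstats_output: str) -> Optional[int]:
    """Extract the pending-task count, or None if the line is missing."""
    match = re.search(r"pending tasks:\s*(\d+)", compactionstats_output)
    return int(match.group(1)) if match else None


def may_reenable(compactionstats_output: str) -> bool:
    """Allow the final step only when the pending count is exactly zero."""
    return pending_tasks(compactionstats_output) == 0
```

Unparseable output yields None, which fails the gate, so a changed output format blocks re-enabling rather than silently allowing it.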

Verification & observability

Confirm each remediation actually landed rather than assuming the log went quiet. Three surfaces corroborate a recovery:

Live task state. nodetool compactionstats -H should show the failing table draining, not stuck. On Cassandra 5.0 you can read the same state without JMX through the virtual table, which is essential for automation inside restricted networks:

SELECT keyspace_name, table_name, task_id, completion_ratio, unit
FROM system_views.sstable_tasks;

JMX counters. Watch the org.apache.cassandra.metrics:type=Compaction MBeans — PendingTasks, CompletedTasks, and BytesCompacted. A recovering node shows BytesCompacted advancing and PendingTasks falling; a still-stalled node shows a flat BytesCompacted with non-zero pending, which is the starvation signature covered in async compaction tracking and metrics.
Log confirmation. Grep for the absence of new signatures and the presence of a clean finish: grep "Compacted" /var/log/cassandra/system.log | tail should show Compacted ... to [...] lines for the affected table with no interleaved Corrupt/FSWriteError entries.
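
The recovering-versus-stalled signature described for the JMX counters reduces to a comparison of two successive samples. The sketch below leaves out how the samples are collected (JMX, an exporter, or the virtual table), and the labels it returns are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionSample:
    bytes_compacted: int  # cumulative BytesCompacted counter
    pending_tasks: int    # PendingTasks gauge at the same instant


def compaction_trend(previous: CompactionSample, current: CompactionSample) -> str:
    """Label the change between two samples taken some interval apart."""
    if current.pending_tasks == 0:
        return "drained"
    advancing = current.bytes_compacted > previous.bytes_compacted
    if not advancing:
        # Flat BytesCompacted with non-zero pending work: the starvation signature.
        return "stalled"
    if current.pending_tasks < previous.pending_tasks:
        return "recovering"
    return "working"
```

A single "stalled" result can be a sampling artefact, so alerting on several consecutive stalled intervals is a reasonable design choice.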

Failure modes & rollback

Three failure modes are specific to error-driven automation on compaction. Each has a detection command and an explicit rollback.

Scrub-induced data loss on a recoverable SSTable

nodetool scrub discards rows it cannot deserialize. If the underlying issue was a transient checksum mismatch rather than true corruption, scrubbing can silently drop live data. Detect: compare the nodetool tablestats partition estimate before and after; a sharp drop is the tell. Rollback: unless it is run with --no-snapshot, scrub first takes a snapshot tagged pre-scrub-<timestamp> in the table's snapshots directory. Copy those SSTables back into the table's data directory, load them with nodetool refresh (nodetool import is its replacement from 4.0 onward), and re-run nodetool verify -e to confirm the SSTable was genuinely unrecoverable before scrubbing again.
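
The before-and-after comparison can be automated by reading the "Number of partitions (estimate)" line from nodetool tablestats. This is a sketch: the 10% default for max_loss_ratio is an arbitrary assumption to tune per table, and the function names are hypothetical.

```python
import re
from typing import Optional


def partition_estimate(tablestats_output: str) -> Optional[int]:
    """Read the partition estimate from tablestats output for a single table."""
    match = re.search(r"Number of partitions \(estimate\):\s*(\d+)", tablestats_output)
    return int(match.group(1)) if match else None


def sharp_drop(before: int, after: int, max_loss_ratio: float = 0.10) -> bool:
    """True when the estimate fell by more than the tolerated fraction."""
    if before <= 0:
        return False
    return (before - after) / before > max_loss_ratio
```

Because the figure is an estimate, small fluctuations are expected; the ratio exists to ignore those and flag only a pronounced fall.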

Classifier false-positive isolating a healthy node

An over-broad regex can tag a benign GC line as CRITICAL and isolate a node that was fine, dropping capacity below your consistency level. Detect: nodetool statusbinary reports the native transport as not running (nodetool status still shows the node as UN, because disablebinary leaves gossip up) with no corresponding disk or corruption error in the log. Rollback: re-enable the node with nodetool enablebinary and nodetool enableautocompaction, then tighten the offending SEVERITY_RULES pattern so that GC rules can never overlap the corruption rules.
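
One way to guard against this failure mode is a regression check that runs the CRITICAL patterns over known-benign lines. The sketch below copies the two CRITICAL patterns from step 2; the sample log lines are illustrative stand-ins to be replaced with lines from your own logs, and the function name is hypothetical.

```python
import re

# Copies of the CRITICAL patterns from the classifier in step 2.
CRITICAL_PATTERNS = [
    r"(No space left on device|DiskFullException|FSWriteError)",
    r"(CorruptSSTableException|ChecksumMismatchException|RejectedExecutionException)",
]

# Illustrative benign lines; replace with samples captured from your own nodes.
BENIGN_LINES = [
    "INFO  [Service Thread] 2024-05-01 12:00:00,123 GCInspector.java:294 - "
    "G1 Young Generation GC in 245ms.",
    "INFO  [CompactionExecutor:2] 2024-05-01 12:00:01,456 CompactionTask.java:241 - "
    "Compacted 4 sstables",
]


def critical_false_positives(patterns, benign_lines):
    """Return (pattern, line) pairs where a CRITICAL rule matches a benign line."""
    return [
        (pattern, line)
        for pattern in patterns
        for line in benign_lines
        if re.search(pattern, line, re.IGNORECASE)
    ]
```

Running this check whenever SEVERITY_RULES changes turns an accidental broadening of a pattern into a failed test rather than an isolated node.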

Config-drift rejection mistaken for a runtime outage

An InvalidRequestException from an ALTER TABLE is a caught, atomic rejection (the schema is unchanged), but naive automation can escalate it to a node incident. Detect: the exception message mentions compaction and arrives in a DDL context, and nodetool compactionstats shows no actual failure. Rollback: none needed at the node level; correct the option set (verify that min_threshold does not exceed max_threshold) and re-issue the ALTER TABLE. Route this category to MEDIUM, never CRITICAL.

FAQ

Should I route on Cassandra’s log level or on the exception signature?

On the signature. Cassandra logs corruption, disk-full, and benign GC pauses at overlapping levels, so WARN/ERROR alone cannot distinguish a node-down event from a slow collection. Match the class name (CorruptSSTableException, DiskFullException, TombstoneOverwhelmingException) and let severity be a property of the category, not of the log line.

When is `nodetool scrub` safe versus `nodetool verify`?

Run nodetool verify -e first: it checks whether an SSTable is actually corrupt without rewriting its data. Only escalate to nodetool scrub when verify confirms unrecoverable corruption, because scrub rewrites the SSTable and discards rows it cannot read. Treating verify as the gate for scrub prevents scrub-induced data loss on transiently flagged files.

Did the throughput parameter name change between Cassandra 4.0 and 4.1?

Yes. compaction_throughput_mb_per_sec was renamed compaction_throughput in 4.1 (the value is now a size string such as 64MiB/s). Automation that writes cassandra.yaml must branch on version, but nodetool setcompactionthroughput <MiB/s> accepts a plain integer on both, so prefer the nodetool path for runtime changes.
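
For automation that does write cassandra.yaml, the version branch can be isolated in one small helper. This is a sketch: the function name is hypothetical, and it assumes the version string begins with major.minor as in 4.0.11 or 5.0-beta1.

```python
import re


def throughput_yaml_entry(version: str, mib_per_sec: int) -> tuple[str, str]:
    """Return the (key, value) pair to write for the given Cassandra version."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) >= (4, 1):
        # 4.1 and later: renamed key, value carries its unit.
        return "compaction_throughput", f"{mib_per_sec}MiB/s"
    # 4.0: original key, bare integer.
    return "compaction_throughput_mb_per_sec", str(mib_per_sec)
```

Raising on an unparseable version is deliberate: guessing the key name would write a setting the node might reject or ignore at startup.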

Why does an `InvalidRequestException` appear during compaction changes but not a node failure?

Because table compaction options are validated at DDL time, before any node acts. An out-of-range threshold or a min_threshold greater than max_threshold is rejected atomically — the schema never changes and no SSTable is touched. It is a configuration-drift signal, so route it to MEDIUM and fix the statement, not to CRITICAL.

How does tombstone pressure differ from a genuine I/O failure in the logs?

Tombstone pressure shows Scanned over N tombstones warnings and, past tombstone_failure_threshold, TombstoneOverwhelmingException on the read path — the compaction subsystem itself is healthy. I/O failure shows No space left on device or FSWriteError from the write side. The first is tuned via gc_grace_seconds and strategy choice; the second requires capacity or thread-pool remediation.

Related guides

Advanced Compaction Strategy Tuning & Monitoring: the parent guide covering strategy selection, tuning, and observability end to end.
How to tune compaction_throughput_mb_per_sec safely: step-by-step validation for the throughput knob referenced throughout this page.
Compaction backlog analysis & alerting: dynamic thresholds and tiered severity that consume these categorized events.
Python monitoring for Cassandra compaction: wiring the classifier into Prometheus, Datadog, and custom telemetry.
Async compaction tracking & metrics: the starvation signatures that corroborate a stalled-compaction diagnosis.