Skip to content
dbm.lt
Open menu
Back to blog
performance t-sql benchmarks storage compression

ROW vs PAGE compression: it depends on your data, and it can make inserts faster

Benchmarked NONE / ROW / PAGE data compression on SQL Server 2025 across five data shapes. There is no single compression ratio — ROW reclaims fixed-width waste, PAGE adds dictionary compression for repetitive data, and neither touches high-entropy data. Plus the counterintuitive bits: compression that speeds up inserts, the warm-vs-cold read tradeoff, and what random-key inserts actually do to a compressed table.

SQL Server data compression looks like a free lunch: ALTER TABLE … REBUILD WITH (DATA_COMPRESSION = PAGE) and your table gets smaller. But “how much smaller” has no single answer — it depends entirely on the shape of your data — and the performance story has two genuinely counterintuitive twists. So I measured it: NONE vs ROW vs PAGE, on SQL Server 2025, across five deliberately different data shapes, with the same timing methodology as the rest of the series.

First, a one-paragraph refresher. ROW compression stores fixed-width types in a variable-width format — a BIGINT holding 42 stops costing 8 bytes, a CHAR(50) holding "item_7" stops being padded to 50. PAGE compression does ROW plus prefix compression and a per-page dictionary, so repeated and similar values across the page collapse. ROW is cheap CPU; PAGE is more. Neither does anything for data that is already compact and random.

There is no “compression ratio” — there’s your data

Same 50k–300k rows, five shapes, measured used-KB as a percentage of uncompressed:

ShapeROWPAGEPAGE factorWhy
padded_char (CHAR(n), short content)25%15%~6.6×ROW trims the padding; PAGE adds more
bigint_small (BIGINT, small values)33%32%~3.1×ROW does all the work; PAGE adds ~nothing
lowcard (repeated codes/strings)67%34%~2.9×PAGE’s dictionary is the win; ROW is modest
realistic (typical OLTP row)80%44%~2.3×PAGE is the clear pick
highentropy (GUID + random bytes)98%98%~1.0×random data won’t compress — don’t bother

Read the spread on that table: PAGE shrank the padded-CHAR table by 6.6× and the random-GUID table by nothing. Two rules fall out:

  • ROW is the safe baseline win wherever you have fixed-width slack — oversized integer types, padded CHAR, sparse DECIMAL. It’s cheap and often pays for itself.
  • PAGE earns its extra CPU specifically on low-cardinality, repetitive data. On a BIGINT-heavy table it adds almost nothing over ROW; on a status/category/code table it roughly halves ROW again.
  • On high-entropy data (GUIDs, hashes, encrypted/random blobs, already-compact rows) compression buys you nothing but CPU. Don’t apply it reflexively.

Before you compress anything, ask the engine: sp_estimate_data_compression_savings predicts each level’s size from a sample. In the benchmark it called the big lowcard win and the highentropy non-win correctly — use it.

The counterintuitive write: compression can make inserts faster

Everyone “knows” compression trades CPU for space, so it must slow writes down. On compressible data, the opposite happened. Bulk-loading the compressible shapes, ROW compression ran the insert faster than NONE — roughly 60–75% of the uncompressed time (padded CHAR ~63%, bigint_small ~70%, lowcard ~75%).

The reason is that an insert’s cost is dominated by pages touched and transaction log written, not by the compression arithmetic. Fewer bytes means fewer pages and less log — and on compressible data that saving outweighs the cost of ROW-encoding the values. PAGE was usually still faster than NONE but slower than ROW (more CPU per page, and on lowcard it landed at roughly break-even). On the incompressible and realistic shapes there was no byte saving to bank, so compression was a small net loss on insert. ROW compression on write-heavy, compressible tables is frequently a throughput win, not a tax — for sequential loads.

Reads: a CPU-vs-IO tradeoff, not a free win

This is where you have to be honest about your workload.

  • Warm cache → compression is slower. When the pages are already in the buffer pool, a scan does no IO, so you pay pure decompression CPU for nothing. Warm scans of compressed tables ran ~130–200% of the uncompressed time, with PAGE slower than ROW.
  • Cold / IO-bound → compression is faster. Flush the buffer pool and the same scan has to read from disk, where fewer pages wins: the cold scan of the padded-CHAR table under PAGE ran at ~40% of uncompressed.
  • Point seeks are a wash. Decompressing a single page to return one row is lost in the noise (~parity across all three levels).

So the read benefit of compression is the same lesson as UTF-8 collations: the payoff is buffer-pool density and less IO under memory pressure — you cache 2–6× the rows per GB of RAM — not a faster query on data that’s already in memory. On a server that comfortably holds its working set in RAM and is CPU-bound, PAGE compression on hot tables can cost you. On a server that’s IO-bound or memory-starved, it’s a clear win. Know which one you are.

Random inserts: compression doesn’t fragment you, your key does

The fear with compression and OLTP is page splits: pack more rows per page, surely you split more often and fragment faster? I tested it directly — a table clustered on a NEWID() key (the classic fragmentation generator), loaded in many small random batches, NONE vs ROW vs PAGE:

100k rows, 500 random batches:

LevelLeaf pages (≈ splits)FragmentationLoad time vs NONE
NONE96798.7%100%
ROW83799.2%127%
PAGE76998.8%107%

Two things hold and one flips versus the sequential case. Fragmentation was ~99% for all three — it’s a function of the random key, not the compression. And compression reduced the page count and leaf allocations (denser rows → fewer pages → fewer splits). But unlike the clean sequential load, here the random load was slower compressed (ROW ~+27%, PAGE ~+7%): when every insert triggers a split, you pay the compression CPU on the hot, split-heavy path and there’s no sequential-log saving to offset it. So compression doesn’t fragment you under random inserts — fragmentation is all down to the key — but on a split-heavy random workload it does add a real write cost. (Your NEWID() clustered key, meanwhile, is the thing actually worth fixing.)

A couple of gotchas before you flip the switch

  • LOB / off-row data is not compressed. ROW and PAGE work on in-row pages; a VARCHAR(MAX) value stored off-row is untouched. A table that’s mostly big blobs won’t shrink.
  • Compression is per index, not per table. ALTER TABLE … REBUILD WITH PAGE compresses the clustered index; every nonclustered index stays uncompressed until you ALTER INDEX … REBUILD WITH (DATA_COMPRESSION = PAGE) it explicitly. It’s easy to compress the table and leave half its size on the floor in the indexes.
  • Changing it is a rebuild, not a metadata flip — a full, size-of-data, log-generating operation (ONLINE only on Enterprise). Plan the maintenance window.

Verdict

There is no universal answer, only a procedure:

  1. sp_estimate_data_compression_savings first. If it predicts <10–15%, skip it — you’re in high-entropy territory and it’s all cost.
  2. ROW is the low-risk default for tables with fixed-width slack; it often helps write throughput and barely touches read CPU.
  3. PAGE for low-cardinality, repetitive, read-mostly or IO-bound tables — the archive-y, cold, or memory-starved data where the dictionary pays and the warm-CPU penalty rarely bites.
  4. Compress every index, not just the table, and remember the win is density and IO, realised under pressure — not a warm-query speedup.

Numbers from SQL Server 2025 (Developer), five synthetic data shapes, 50k–300k rows, on a shared host — trust the ratios over the absolute milliseconds. The harness builds the shapes, runs the matrix and the random-insert split study, and prints the receipts.


Need help with something like this?

I take on focused SQL Server engagements — performance, migrations, and CI/CD.

Book a 30-min consult