Tiered Historian Storage: Keep Years of Trend Data Without Running Out of Disk
Introduction
A typical medium-sized plant logging 500 tags at 1-second resolution produces about 43 million samples per day. Over a year, that's nearly 16 billion rows. Store them as full-resolution raw records at around 16 bytes each, and you're looking at roughly 250 GB of disk space per year — assuming no redundancy, no indexes, no backup copies.
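That arithmetic is easy to sanity-check. A quick sketch, assuming a 16-byte raw record (8-byte timestamp plus 8-byte double; an assumption, not a measured figure):

```python
# Back-of-envelope sizing for a 500-tag plant at 1-second resolution.
# The 16-byte record size (8-byte timestamp + 8-byte float) is an assumption.
TAGS = 500
SAMPLES_PER_TAG_PER_DAY = 24 * 60 * 60          # one sample per second
BYTES_PER_SAMPLE = 16

samples_per_day = TAGS * SAMPLES_PER_TAG_PER_DAY
samples_per_year = samples_per_day * 365
raw_gb_per_year = samples_per_year * BYTES_PER_SAMPLE / 1e9

print(f"{samples_per_day:,} samples/day")       # 43,200,000 samples/day
print(f"{samples_per_year:,} samples/year")     # 15,768,000,000 samples/year
print(f"{raw_gb_per_year:.0f} GB/year raw")     # 252 GB/year raw
```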
And yet: when operators open the trend view, they almost always want the last few hours or, at most, the last shift. The rare exception is a compliance audit or a root-cause analysis that looks back weeks or months — and even then, nobody is zooming in to individual 1-second samples from three years ago.
Tiered historian storage exploits this access pattern. Recent data stays at full resolution on fast storage. Older data gets progressively downsampled and compressed. You keep years of queryable history in tens of gigabytes instead of terabytes, and recent queries stay as fast as they've ever been.
The Three Tiers
Borrowed from general-purpose database and data-warehouse architectures, the three-tier model works like this:
🔥 Hot tier — raw samples, full resolution
- What: every sample exactly as it was recorded
- Where: fast SSD, local disk, primary database
- When: the most recent few days to a month
- Why: operators query recent data constantly; they want full fidelity and sub-second response times
- Footprint: large per sample, but small total because the time window is narrow
🌡️ Warm tier — downsampled aggregates
- What: one record per time bucket (commonly 1 minute), carrying the average (and optionally min/max) of the raw samples in that bucket
- Where: same fast storage as hot, or a secondary volume — it's small enough that it rarely matters
- When: from the end of the hot window up to a year or more
- Why: at plant timescales, a 1-minute-averaged trend is visually indistinguishable from a 1-second trend except at zoom levels almost nobody uses; 1-minute buckets shrink the data about 60× with negligible loss of information
- Footprint: 60× smaller than hot, for the same time window
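The bucket aggregation itself is only a few lines. A minimal sketch over (timestamp_ms, value) pairs — the function name and record shape are illustrative, not any historian's actual API:

```python
from collections import defaultdict

def downsample(samples, bucket_ms=60_000):
    """Group raw (timestamp_ms, value) samples into fixed-width buckets
    and return one (bucket_start_ms, avg, min, max) record per bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % bucket_ms].append(value)
    return [
        (start, sum(vals) / len(vals), min(vals), max(vals))
        for start, vals in sorted(buckets.items())
    ]

# Three 1-second samples in the same minute collapse into one warm record:
raw = [(0, 10.0), (1_000, 12.0), (59_000, 14.0), (60_000, 20.0)]
print(downsample(raw))
# [(0, 12.0, 10.0, 14.0), (60000, 20.0, 20.0, 20.0)]
```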
🧊 Cold tier — gzipped raw archive
- What: the original full-resolution data, gzip-compressed
- Where: cheap storage — a NAS share, a secondary drive, or cloud object storage
- When: everything older than the hot window, retained for compliance and forensic access
- Why: regulatory requirements ("keep 7 years") don't require fast access — they require that the data still exists and is retrievable. Decompressing an archive file on demand is acceptable for the kind of query that looks at old data.
- Footprint: 80–90% smaller than raw hot, and typically never re-read
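Retrieval from the cold tier is then an on-demand decompress. A sketch using Python's stdlib gzip, assuming a plain-text archive with one `timestamp_ms,value` line per sample (the file format here is an assumption; real historians typically use binary archives):

```python
import gzip

def read_cold_day(path):
    """Decompress a cold archive on demand and yield (timestamp_ms, value)
    samples. Assumes one 'timestamp_ms,value' line per sample."""
    with gzip.open(path, "rt") as f:
        for line in f:
            ts, value = line.strip().split(",")
            yield int(ts), float(value)
```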
Why Not Just Use a Time-Series Database?
Products like InfluxDB, TimescaleDB, and Prometheus already downsample internally. Why build a tier on top?
Two reasons:
- Control over the retention policy. Built-in downsampling in most time-series databases uses a single, global policy. Tiered storage lets you set per-tag retention — keep pressure readings for 90 days raw but temperature readings for 7 days raw, because temperature changes slowly and old raw data adds little value.
- Independence from any particular DB. File-based tiering works whether your historian uses SQLite, SQL Server, MySQL, or binary archive files. The same logic works against any backend.
Time-series databases remain an excellent choice when you want someone else to solve the tiering problem. Explicit tiering is the right choice when you need to understand exactly where each sample lives and how it's being aggregated.
The Migration Job
Whatever the underlying storage, the tiering logic is a scheduled job that runs once a day (usually just after midnight):
```
for each tag:
    for each daily archive file older than hotRetentionDays:
        if no warm file exists for this day:
            generate warm file: group raw samples into bucketMs buckets, write avg per bucket
        if the raw file has not yet been gzipped:
            gzip the raw file and delete the uncompressed version
    for each warm file older than warmRetentionDays:
        delete the warm file (keep only the cold gzip archive)
```
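Translated into a file-walking job, the pseudocode looks roughly like this. A sketch that assumes a plain-text `timestamp_ms,value` archive format and per-day files named `<tag>/<YYYY-MM-DD>.hda` — both are assumptions, and `write_warm_file` is a simplified stand-in for the real aggregation:

```python
import gzip
import os
import shutil
from datetime import date, timedelta

def write_warm_file(raw_path, warm_path, bucket_ms=60_000):
    """Downsample 'timestamp_ms,value' lines into one average per bucket.
    The plain-text format is illustrative."""
    buckets = {}
    with open(raw_path) as f:
        for line in f:
            ts, value = line.strip().split(",")
            key = int(ts) - int(ts) % bucket_ms
            buckets.setdefault(key, []).append(float(value))
    with open(warm_path, "w") as f:
        for start in sorted(buckets):
            vals = buckets[start]
            f.write(f"{start},{sum(vals) / len(vals)}\n")

def migrate(root, hot_retention_days=14, warm_retention_days=365, today=None):
    """Nightly tiering pass over per-tag directories of YYYY-MM-DD.hda files."""
    today = today or date.today()
    hot_cutoff = today - timedelta(days=hot_retention_days)
    warm_cutoff = today - timedelta(days=warm_retention_days)
    for tag in os.listdir(root):
        tag_dir = os.path.join(root, tag)
        if not os.path.isdir(tag_dir):
            continue
        for name in os.listdir(tag_dir):
            if not name.endswith(".hda") or name.endswith(".whda"):
                continue
            day = date.fromisoformat(name[:-4])
            if day >= hot_cutoff:
                continue                      # still inside the hot window
            raw = os.path.join(tag_dir, name)
            warm = raw[:-4] + ".whda"
            cold = raw + ".gz"
            if not os.path.exists(warm):
                write_warm_file(raw, warm)    # warm first, from the fast raw source
            if not os.path.exists(cold):
                with open(raw, "rb") as src, gzip.open(cold, "wb") as dst:
                    shutil.copyfileobj(src, dst)
            os.remove(raw)                    # raw now lives only in cold
        for name in os.listdir(tag_dir):
            # purge warm files past the warm window; the cold gzip remains
            if name.endswith(".whda") and date.fromisoformat(name[:-5]) < warm_cutoff:
                os.remove(os.path.join(tag_dir, name))
```

Running it daily is then a one-liner in whatever scheduler you already have (cron, a Windows scheduled task, or an in-process timer).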
The sequence matters. Create warm files before gzipping the raw data, so the aggregates are generated from the fast uncompressed source rather than from a decompressed gzip. And purge expired warm files only after the cold-compression step, so the gzipped raw archive is guaranteed to exist before the last remaining queryable copy of that day is deleted — this matters in the unusual-but-possible case where warm retention is shorter than hot retention, when a freshly created warm file can already be past its retention window.
Query Routing
A well-designed tiered historian makes the tier boundaries invisible to the querying application. The query engine picks the right source based on the time range:
- Pure hot range → read raw files only, return full-fidelity samples
- Range spanning hot + warm → read raw for the recent portion, warm for the older portion, merge on the timestamp axis
- Pure warm range → read warm files, return downsampled samples
- Range including cold data → decompress cold files on demand, return full fidelity
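The routing decision itself is a simple comparison against the tier boundaries. A sketch that returns the segments a query must read, oldest first (the function name and segment shape are illustrative; the per-tier readers are left to the caller):

```python
from datetime import datetime, timedelta

def route_query(start, end, now=None, hot_days=14, warm_days=365):
    """Split a [start, end) trend query into (tier, seg_start, seg_end)
    segments, oldest first. The caller reads each segment from the
    matching tier and merges the results on the timestamp axis."""
    now = now or datetime.now()
    hot_boundary = now - timedelta(days=hot_days)
    warm_boundary = now - timedelta(days=warm_days)
    segments = []
    if start < warm_boundary:                          # cold portion
        segments.append(("cold", start, min(end, warm_boundary)))
    if start < hot_boundary and end > warm_boundary:   # warm portion
        segments.append(("warm", max(start, warm_boundary), min(end, hot_boundary)))
    if end > hot_boundary:                             # hot portion
        segments.append(("hot", max(start, hot_boundary), end))
    return segments

now = datetime(2024, 6, 10)
# A three-week query spanning the hot boundary splits into warm + hot:
for tier, s, e in route_query(datetime(2024, 5, 20), now, now=now):
    print(tier, s, e)
```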
The client displaying the trend can optionally show an indicator per pen — 🔥 / 🌡️ / 🧊 — so the operator knows whether they're looking at raw or downsampled data. Most of the time they don't care; when they do care (during a detailed incident review), they can widen their zoom to pull in the cold tier.
Choosing Retention Thresholds
There's no universal answer, but these heuristics cover most plants:
- Hot retention: 7–30 days. Long enough to cover the typical operator question ("show me yesterday", "last week"), short enough that the hot tier stays small and fast.
- Warm bucket: 1 minute. Short enough to catch brief excursions that matter for trend analysis, long enough to cut data volume by ~60×.
- Warm retention: 1–3 years. Most compliance requirements and most operational questions fit within this window.
- Cold retention: indefinite, or per regulatory requirement. Gzipped archives are cheap; keep them until legal says otherwise.
A critical tag (setpoint, safety interlock) might warrant longer hot retention than a routine pressure reading. Per-tag overrides give you that flexibility without running two separate historians.
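Per-tag overrides can be as simple as a config map consulted by the migration job, falling back to plant-wide defaults. A sketch (the tag names and values are illustrative):

```python
DEFAULT_RETENTION = {"hot_days": 14, "warm_days": 365}

# Illustrative per-tag overrides: critical tags keep raw data longer,
# slow-moving tags drop to the warm tier sooner.
RETENTION_OVERRIDES = {
    "reactor.pressure": {"hot_days": 90},
    "ambient.temperature": {"hot_days": 7},
}

def retention_for(tag):
    """Merge a tag's overrides over the plant-wide defaults."""
    return {**DEFAULT_RETENTION, **RETENTION_OVERRIDES.get(tag, {})}

print(retention_for("reactor.pressure"))   # {'hot_days': 90, 'warm_days': 365}
```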
Operational Considerations
A few things to watch when running a tiered historian in production:
- Clock drift. The migration job uses the local system clock. If your server's clock is off by hours, you might migrate data that shouldn't be migrated yet. NTP isn't optional.
- Disk IO during migration. The nightly job reads old files, writes new files, deletes originals. On large archives this is meaningful IO — schedule it for your quietest maintenance window.
- Backup strategy. Warm and cold files need to be in your backup plan. Hot is typically also backed up, but it's worth confirming that the backup retention policy aligns with the historian retention policy.
- Monitoring. A dashboard showing per-tier file counts and disk usage over time catches problems early — a tier that's not shrinking after migration, a disk filling up faster than expected, or a tag that got stuck in hot because migration failed for some reason.
Conclusion
Tiered historian storage is the unglamorous architectural decision that keeps a plant historian from becoming a disk-space problem. By shrinking data 60× for the 95% of history nobody queries at full resolution, you buy yourself years of retention headroom without giving up fidelity where it matters.
OptiZeus ships with file-based tiered historian storage built in. Recent data stays in raw per-day .hda files for full-resolution queries. Files older than the configurable hot retention window are downsampled to .whda warm siblings (configurable bucket size from 10 seconds to 1 hour), and the raw files are gzipped to .hda.gz for long-term storage. A nightly scheduler runs the migration automatically, the admin page shows per-tier file counts and disk usage, and a "Run Migration Now" button lets you kick off the process on demand. The query engine transparently reads across tiers, so trends span hot, warm, and cold data without the user ever noticing which is which.