Technical White Paper
150 million rules at sub-microsecond speed
Enterprise and carrier-grade web filtering increasingly means combining threat-intelligence feeds, category blocklists and regulatory domain lists — aggregated, these reach tens to hundreds of millions of rules. Two questions decide whether a gateway can serve that scale: can it hold the rules at all, and is it still fast?
EnforceGate vX answers both on a single commodity machine. In our in-house benchmark it loaded and enforced 150 million domain-policy rules in under 5 GiB of memory, resolved the common request case in about two-tenths of a microsecond, and — critically — held that speed flat as the rule set tripled from 50M to 150M. Capacity and latency are decoupled by design: adding rules adds memory, not delay.
01 Why this matters to the business
For a CTO, the question behind a web gateway is operational risk: will the box that inspects every outbound request become the bottleneck — or the single point of failure — as policy grows? For a CSO, it is coverage and control: can we enforce the full breadth of threat-intel and category data we are paying for, on our own infrastructure, without shipping every lookup to a vendor cloud?
Most products force a trade-off. Cloud-rated filtering services keep the big database off-box but add a network round-trip — and an external dependency — to decisions they cannot answer from cache. On-box engines that try to hold large custom lists tend to slow down or run out of memory as the lists grow. EnforceGate vX is built to avoid both: the entire policy lives in memory on your node, and the match structure is designed so that lookup time does not grow with rule count.
02 The headline: 3× the rules, same speed
The result that matters is not raw capacity — it is that per-decision latency stays flat as the policy grows. Tripling the rule set from 50 million to 150 million leaves the common-case decision time essentially unchanged at roughly 0.23 µs in high-density mode. Capacity scales; latency does not.
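To make the idea of rule-count-independent lookup concrete, here is a minimal sketch of one common technique: a hash set probed once per parent domain of the requested host. This is an illustration only and an assumption on our part, not a description of the EnforceGate vX match engine; the function names `candidate_suffixes` and `is_blocked` and the sample rules are hypothetical.

```python
# Illustrative only: a hash-set suffix match, not the product's actual engine.

def candidate_suffixes(host: str) -> list[str]:
    """Return the host and each parent domain, longest first.

    "a.b.example.com" -> ["a.b.example.com", "b.example.com", "example.com", "com"]
    """
    labels = host.lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def is_blocked(host: str, rules: set[str]) -> bool:
    """True if the host or any of its parent domains is in the rule set.

    The work is at most one hash probe per label in the host name, so it
    depends on the depth of the name being checked, not on the number of
    rules loaded (hash probes are constant time on average).
    """
    return any(suffix in rules for suffix in candidate_suffixes(host))


rules = {"ads.example.net", "malware.test"}  # hypothetical sample rules
print(is_blocked("cdn.ads.example.net", rules))  # True
print(is_blocked("www.example.org", rules))      # False
```

In a structure like this, adding rules grows the set's memory footprint while the number of probes per request stays bounded by the host name's label count, which is the general shape of the "memory, not delay" trade described above.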
"Common case" means traffic to a destination that is not on a blocklist — the dominant real-world pattern, since most requests are to permitted sites. At 150M rules that decision clears in about 0.2 µs, equivalent to millions of policy decisions per second on a single CPU core. Real deployments serve requests across many cores in parallel, so aggregate throughput scales further.
03 A 150M-rule policy fits a modest server
High-density mode runs 150 million rules in under 5 GiB of steady-state memory, roughly 35 bytes per rule going by the table below, and memory grows linearly and predictably with rule count. Capacity planning is a simple multiplication, with no cliffs or sudden jumps. The whole policy loads well within a 16 GiB budget, comfortably inside a single mid-range VM or appliance.
| Rules | Memory — high-density | Memory — performance | Common-case decision | On-disk policy |
|---|---|---|---|---|
| 50 million | 1.7 GiB | 4.4 GiB | 0.23 µs | 0.86 GiB |
| 100 million | 3.2 GiB | 8.8 GiB | 0.26 µs | 1.72 GiB |
| 150 million | 4.9 GiB | 15.7 GiB | 0.23 µs | 2.58 GiB |
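The capacity-planning multiplication can be sketched directly from the high-density column of the table. The snippet below is a rough illustration using the table's rounded GiB values; the helper names and the 120M-rule example are hypothetical, and the estimate is an interpolation, not a measured result.

```python
GIB = 2**30  # bytes per GiB

# (rule count, high-density memory in GiB), copied from the table above.
MEASURED = [(50_000_000, 1.7), (100_000_000, 3.2), (150_000_000, 4.9)]


def bytes_per_rule(rules: int, gib: float) -> float:
    """Average memory cost of one rule at a measured data point."""
    return gib * GIB / rules


def estimate_gib(rules: int, per_rule_bytes: float) -> float:
    """Linear estimate of memory for a given rule count."""
    return rules * per_rule_bytes / GIB


for rules, gib in MEASURED:
    print(f"{rules // 1_000_000}M rules: {bytes_per_rule(rules, gib):.1f} bytes/rule")
# 50M rules: 36.5 bytes/rule
# 100M rules: 34.4 bytes/rule
# 150M rules: 35.1 bytes/rule

# Plan conservatively: use the largest per-rule cost seen in the measurements.
worst = max(bytes_per_rule(r, g) for r, g in MEASURED)
print(f"120M rules: about {estimate_gib(120_000_000, worst):.1f} GiB (estimate)")
# 120M rules: about 4.1 GiB (estimate)
```

Using the largest observed per-rule cost gives a slightly pessimistic figure, which is usually the safer direction for sizing.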
A fresh 150M-rule policy takes about two minutes to compile offline and a further one to two minutes for the engine to cold-load on start or deploy, so a deployment that fits on a small server is also provisioned within minutes.
04 How the approach compares
The table below contrasts EnforceGate vX with the architectures buyers most often weigh it against: enterprise next-generation firewalls with cloud-rated URL filtering, and the open-source proxy add-ons that teams once assembled themselves, chief among them SquidGuard, a legacy project that is no longer maintained (its last release was in 2009). The comparison is architectural: it reflects how each approach is designed to work, not a head-to-head lab benchmark of any specific product.
| Dimension | EnforceGate vX | Enterprise NGFW + cloud URL filtering (typical commercial appliance) | SquidGuard (legacy open source) |
|---|---|---|---|
| Where matching happens | On-box, in memory — every request decided locally | Local cache + cloud rating service for category / uncached lookups | On-box, against on-disk blacklist files |
| Large custom rule capacity | 150M rules, lab-measured, in < 5 GiB | Local/custom URL lists are bounded; broad coverage relies on the vendor cloud database | Designed for modest lists; not architected for 100M+ entries |
| Decision latency at scale | ~0.2 µs, flat as rules grow | Cache hits fast; cache/category misses incur a network round-trip to the cloud | Per-request file lookup; slows and grows memory as lists enlarge |
| External dependency at decision time | None | Cloud rating subscription for uncached / category decisions | None (you assemble and maintain the lists yourself) |
| Data residency | Fully self-hosted — traffic, policy and logs never leave your network | URL / category queries are sent to the vendor cloud | Self-hosted |
| Licensing model | By edition + connector sessions — no per-seat, per-Gbps or per-request metering | Appliance / VM tier plus throughput-bound service subscriptions | Free / open source |
| Maintenance status | Actively developed and supported by the engineers who write it | Vendor-maintained | Unmaintained legacy project; last release in 2009; not viable for new deployments |
On this comparison. The "Enterprise NGFW + cloud URL filtering" column describes the typical architecture of commercial next-generation firewalls with cloud-rated URL filtering, summarised from publicly available product documentation as of 2026 — not an independent benchmark of any specific product, and capabilities vary by model, platform and licence tier. EnforceGate vX figures are from the in-house benchmark described below.
05 How it was measured
So the numbers can be trusted and reproduced, the benchmark exercises the complete production path — author, compile, cold-load and enforce — using the shipping parser, storage layer and match engine end-to-end, not a model or simplified stand-in.
- Real product code. The same components that ship in the product produced these figures.
- Realistic corpus. Rules are unique, realistically shaped domain names, generated deterministically so any run is reproducible.
- Honest memory measurement. Memory is read from the operating system's own accounting of the engine process (resident set size), capturing both steady state and the transient load peak.
- Correctness-gated. Before any figure is accepted, the run verifies that a large random sample of rules resolves correctly and that non-rule traffic correctly does not match. A failure invalidates the run.
- Single node, single core for lookups. All figures come from one commodity x86-64 system (specified below); latencies are single-threaded, so multi-core nodes serve proportionally more.
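The shape of such a harness can be sketched in a few lines. The following is a simplified stand-in, not the internal benchmark: it uses a Python set in place of the product's match engine, and every name in it (`make_corpus`, `correctness_gate`, `mean_lookup_ns`, the label lists, the seeds) is hypothetical. It shows a seeded, reproducible corpus, a correctness gate that must pass before any timing is reported, and a single-threaded lookup timing loop.

```python
import random
import time

LABELS = ["mail", "cdn", "shop", "news", "api", "img", "app", "static"]
TLDS = ["com", "net", "org", "io"]


def make_corpus(n: int, seed: int) -> list[str]:
    """Generate n unique, domain-shaped names; the same seed gives the same corpus."""
    rng = random.Random(seed)
    seen: set[str] = set()
    corpus: list[str] = []
    while len(corpus) < n:
        name = f"{rng.choice(LABELS)}{rng.randrange(10**9)}.{rng.choice(TLDS)}"
        if name not in seen:
            seen.add(name)
            corpus.append(name)
    return corpus


def correctness_gate(rules: set[str], corpus: list[str], sample: int, seed: int) -> None:
    """Raise AssertionError if a sampled rule is missing or a non-rule matches."""
    rng = random.Random(seed)
    for name in rng.sample(corpus, sample):
        assert name in rules, f"rule did not resolve: {name}"
    for i in range(sample):
        probe = f"not-a-rule-{i}.invalid"  # cannot collide with generated names
        assert probe not in rules, f"false match: {probe}"


def mean_lookup_ns(rules: set[str], probes: list[str]) -> float:
    """Mean single-threaded time per lookup, in nanoseconds."""
    start = time.perf_counter_ns()
    for probe in probes:
        _ = probe in rules
    return (time.perf_counter_ns() - start) / len(probes)


corpus = make_corpus(100_000, seed=42)
rules = set(corpus)  # stand-in for the real match engine
correctness_gate(rules, corpus, sample=10_000, seed=7)  # a failure aborts the run
probes = [f"permitted-{i}.invalid" for i in range(100_000)]
print(f"mean miss lookup: {mean_lookup_ns(rules, probes):.0f} ns (includes interpreter overhead)")
```

The timing printed here reflects Python interpreter overhead and is not comparable to the figures in this paper; the point is the order of operations, with the gate running before any number is accepted.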
Test system
For full transparency, the figures in this paper were produced on a single, off-the-shelf commodity x86-64 machine — no specialised accelerators, no exotic hardware:
| Component | Specification |
|---|---|
| CPU | AMD Ryzen 9 9950X3D: 16 cores / 32 threads, AMD 3D V-Cache, boost up to ~5.76 GHz |
| Memory | 96 GB DDR5 (Socket AM5 platform) |
| Architecture | x86-64, single socket |
| Operating system | Linux (kernel 7.0) |
| Lookup measurement | Single-threaded — one CPU core, the rest idle |
The 150-million-rule policy uses under 5 GiB in high-density mode, so memory capacity was never the limiting factor on this host — the headroom simply confirms the footprint figures. Lookup latencies reflect a single core; a production node serving traffic across all available cores scales throughput proportionally.
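As a quick worked example, the single-core latencies in the results table convert to per-core decision rates as follows. This is plain arithmetic on the published common-case figures and assumes back-to-back lookups on one core with no other per-request work; the function name is hypothetical.

```python
def decisions_per_second(latency_us: float) -> float:
    """Single-core decisions per second for back-to-back lookups."""
    return 1_000_000 / latency_us


# (rules in millions, common-case latency in microseconds) from the results table.
for rules_millions, latency_us in [(50, 0.23), (100, 0.26), (150, 0.23)]:
    rate = decisions_per_second(latency_us)
    print(f"{rules_millions}M rules: {rate / 1e6:.1f}M decisions/s per core")
# 50M rules: 4.3M decisions/s per core
# 100M rules: 3.8M decisions/s per core
# 150M rules: 4.3M decisions/s per core
```

These are per-core lookup rates only; end-to-end request throughput on a production node also depends on parsing, I/O and the other work done per request.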
06 Scope & caveats
- Figures are lab-measured on a single commodity machine (see Test system) under controlled conditions; production numbers vary with hardware, traffic mix and co-located workloads.
- The "flat latency" result is for the common (permitted) path, which is the dominant case. Decisions for blocked destinations stay under 3 µs but rise modestly with scale.
- "Rules" here are domain-policy entries — the dominant high-volume rule type. Regular-expression URL rules have different, lower-cardinality characteristics and are not the subject of this 150M claim.
- The benchmark measures load and lookup; it is not a full production-traffic soak test. A sustained-load endurance test is a recommended follow-up before publishing hard SLAs.
Sizing a large deployment?
We are happy to walk a CTO or security team through the benchmark methodology and the raw per-run data under NDA for technical due diligence.
Underlying benchmark harnesses and full per-run data are retained internally and available on request for technical due diligence. Figures should be cited as lab-measured pending an independent validation pass.