Methodology

ipdex is a derived index. Every number, name, and ranking traces to a public source and is rebuilt daily through a validated pipeline — nothing is hand-edited.

Four sources

RIR delegated statistics (the five regional registries) give the official allocation of address space and ASNs per country, with allocation dates.
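
As an illustration of what a delegated-statistics record carries, here is a minimal parsing sketch. It assumes the pipe-separated layout the registries publish (registry, country code, type, start, value, date, status). The `Delegation` class and `parse_delegated_line` function are hypothetical names chosen for this example, not ipdex code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    registry: str
    country: str
    kind: str    # "ipv4", "ipv6", or "asn"
    start: str   # first address, or first AS number
    value: int   # ipv4: address count; ipv6: prefix length; asn: number of ASNs
    date: str    # allocation date as YYYYMMDD (may be empty on some records)
    status: str


def parse_delegated_line(line: str) -> Optional[Delegation]:
    """Parse one record line; return None for comments, headers, and summaries."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split("|")
    # Header and summary lines do not carry a resource type in the third field.
    if len(fields) < 7 or fields[2] not in {"ipv4", "ipv6", "asn"}:
        return None
    registry, country, kind, start, value, date, status = fields[:7]
    return Delegation(registry, country, kind, start, int(value), date, status)
```

The sample line in a call such as `parse_delegated_line("apnic|ZZ|ipv4|192.0.2.0|256|20100101|allocated")` is made up and uses a documentation address range; it is not a real allocation.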

BGP routing tables (APNIC thyme / RouteViews) give the prefixes each ASN actually announces — the routed reality, not just the registration.

PeeringDB gives each network its type (ISP, content, hosting, education, enterprise) and peering presence.

IPinfo Lite is the backbone: IP→country, IP→ASN, organization name and domain, for both IPv4 and IPv6.
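
An IP→ASN and IP→country lookup can be pictured as a longest-prefix match. The sketch below assumes the data is available as rows of (prefix, ASN, country). The rows, the `TABLE` layout, and the `lookup` function are hypothetical and use documentation ranges only; a production lookup would use an indexed structure rather than a linear scan.

```python
import ipaddress
from typing import Optional, Tuple

# Hypothetical rows: (prefix, ASN, country). Documentation ranges and ASNs only.
ROWS = [
    ("192.0.2.0/24", 64496, "ZZ"),
    ("192.0.2.128/25", 64497, "ZZ"),
    ("2001:db8::/32", 64498, "ZZ"),
]

TABLE = [(ipaddress.ip_network(p), asn, cc) for p, asn, cc in ROWS]


def lookup(ip: str) -> Optional[Tuple[int, str]]:
    """Return (ASN, country) for the most specific prefix containing ip."""
    addr = ipaddress.ip_address(ip)
    matches = [
        (net.prefixlen, asn, cc)
        for net, asn, cc in TABLE
        if net.version == addr.version and addr in net
    ]
    if not matches:
        return None
    # The longest prefix length is the most specific match.
    _, asn, cc = max(matches)
    return asn, cc
```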

The pipeline

Each run is fetch → validate → diff → derive → upsert → report. It is diff-first: only rows that actually changed are written. A full daily rewrite would be a bug, not an update.
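
The diff step can be sketched as a comparison of two keyed snapshots, so that only the resulting inserts, updates, and deletes reach the upsert step. This is a minimal sketch under assumptions: the snapshot shape (a dict from key to row tuple) and the names `Diff` and `diff_snapshots` are hypothetical.

```python
from typing import Dict, Hashable, List, NamedTuple


class Diff(NamedTuple):
    inserts: Dict[Hashable, tuple]
    updates: Dict[Hashable, tuple]
    deletes: List[Hashable]


def diff_snapshots(previous: Dict[Hashable, tuple],
                   current: Dict[Hashable, tuple]) -> Diff:
    """Compare two snapshots and keep only the rows that actually changed."""
    inserts = {k: v for k, v in current.items() if k not in previous}
    updates = {k: v for k, v in current.items()
               if k in previous and previous[k] != v}
    deletes = [k for k in previous if k not in current]
    return Diff(inserts, updates, deletes)
```

When the two snapshots are identical, all three collections are empty and nothing is written.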

Organizations are not a source — they are derived, by clustering ASNs that belong to the same operator. Per-country rankings are computed from announced prefix counts.
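
A per-country ranking by announced prefix count could look like the sketch below. The input shapes (a list of (ASN, prefix) announcements and an ASN→country map) and the function name `rank_by_country` are assumptions made for illustration. Breaking ties by ASN is a choice of this example, not something the methodology specifies.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_by_country(
    announcements: Iterable[Tuple[int, str]],
    asn_country: Dict[int, str],
) -> Dict[str, List[Tuple[int, int]]]:
    """Return {country: [(asn, prefix_count), ...]}, largest count first."""
    # Deduplicate so a prefix seen twice for the same ASN counts once.
    counts = Counter(asn for asn, _prefix in set(announcements))
    per_country: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for asn, n in counts.items():
        country = asn_country.get(asn)
        if country is not None:
            per_country[country].append((asn, n))
    # Sort by descending count, then ascending ASN for a stable order.
    return {
        country: sorted(rows, key=lambda r: (-r[1], r[0]))
        for country, rows in per_country.items()
    }
```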

Validation gates

Before anything is published, the run must pass every gate: a count-drift check against the previous snapshot, an unknown-country check, twenty fixed anchors (for example 8.8.8.8 must resolve to AS15169), and a country-agreement cross-check between IPinfo and the RIR data.

If any gate fails, the run aborts and production is left untouched. ipdex never publishes partial or unvalidated data.
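
The all-or-nothing behaviour of the gates can be sketched as follows. The 5% drift threshold, the `GateFailure` exception, and the function names are hypothetical; only the 8.8.8.8 → AS15169 anchor comes from the text above.

```python
from typing import Callable, Dict, List, Tuple


class GateFailure(Exception):
    """Raised when at least one gate fails; nothing should be published."""


def count_drift_ok(previous: int, current: int, max_drift: float = 0.05) -> bool:
    """True when the row count moved by at most max_drift (an assumed threshold)."""
    if previous == 0:
        return current == 0
    return abs(current - previous) / previous <= max_drift


def anchors_ok(resolve: Callable[[str], int], anchors: Dict[str, int]) -> bool:
    """True when every fixed anchor IP resolves to its expected ASN."""
    return all(resolve(ip) == asn for ip, asn in anchors.items())


def run_gates(gates: List[Tuple[str, Callable[[], bool]]]) -> None:
    """Evaluate every gate; raise before any write if one of them fails."""
    failed = [name for name, check in gates if not check()]
    if failed:
        raise GateFailure("failed gates: " + ", ".join(failed))
```

A caller would run `run_gates(...)` before the upsert step, so that a raised `GateFailure` stops the run while production still holds the previous snapshot.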

The cross-check compares each network’s home country. IPinfo reports geolocation (where addresses are used), while the registries report allocation (where they were assigned), so the two diverge by design. For that reason we measure agreement at the network level, not per address.
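
One way to picture network-level agreement is shown below. This is a sketch under assumptions: it takes a network's home country to be the country holding most of its addresses, which the text does not state, and the names `home_country` and `network_agreement` are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def home_country(address_counts: Dict[str, int]) -> Optional[str]:
    """Assumed rule: the country holding the most addresses of the network."""
    if not address_counts:
        return None
    return Counter(address_counts).most_common(1)[0][0]


def network_agreement(ipinfo_home: Dict[int, str],
                      rir_home: Dict[int, str]) -> Optional[float]:
    """Share of networks present in both sources whose home countries match."""
    shared = ipinfo_home.keys() & rir_home.keys()
    if not shared:
        return None
    agree = sum(1 for asn in shared if ipinfo_home[asn] == rir_home[asn])
    return agree / len(shared)
```

Because each network is reduced to one country before comparison, a network with a few addresses geolocated abroad still counts as agreeing.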

Versioning and freshness

Every publication is stamped with a content-hash data version and a date, shown in the provenance box on each page and listed on the changelog. You can always see how fresh the data behind a page is.
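
A content-hash version can be derived from a canonical serialization of the published rows, so that identical data always yields the same stamp. The sketch assumes SHA-256 over sorted, compact JSON and a 12-character truncation; the actual hash, serialization, and length used by ipdex are not stated here, and `data_version` is a hypothetical name.

```python
import hashlib
import json
from typing import Iterable, Sequence


def data_version(rows: Iterable[Sequence], length: int = 12) -> str:
    """Return a short hex digest that depends only on the content of rows."""
    # Sort rows and use compact separators so ordering and whitespace
    # differences cannot change the hash.
    canonical = json.dumps(
        sorted(list(r) for r in rows),
        separators=(",", ":"),
        ensure_ascii=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]
```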

The exposure score

The exposure score is a directional 0–100 index of how identifiable your browser looks. It is not a probability and not a forensic measure. Formula: score = 100 × B(34 × (1 − e^(−Σ min(bits_i, 9) / 34))) / 34, where each bits_i is a published per-signal surprisal estimate. Each estimate is capped at 9 bits so that no single trait dominates, and the capped sum passes through a saturating curve that models correlation between signals (real fingerprints overlap, so naive sums overcount).

Weights (bits, source, date): canvas rendering 8.5 (Eckersley, EFF Panopticlick, PETS 2010-05) · installed fonts 6.5 (same, 2010-05) · WebGL graphics 6.0 (Cao et al., NDSS 2017-02) · audio processing 5.0 (Englehardt & Narayanan, CCS 2016-10) · screen resolution 4.0, time zone 3.0, languages 2.0 (Eckersley 2010-05). The machine-readable file is data/rarity-baselines.json in the repository, versioned and dated.
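
The formula and weights above can be worked through in a few lines. The text does not define B, so this sketch treats it as a pass-through function; that is an assumption, and a different B would change the result. The names `exposure_score`, `BASELINE_BITS`, `CAP_BITS`, and `SCALE` are hypothetical.

```python
import math
from typing import Callable, Mapping

CAP_BITS = 9.0   # per-signal cap from the formula
SCALE = 34.0     # saturation constant from the formula

# Weights as listed above (bits per signal).
BASELINE_BITS = {
    "canvas rendering": 8.5,
    "installed fonts": 6.5,
    "WebGL graphics": 6.0,
    "audio processing": 5.0,
    "screen resolution": 4.0,
    "time zone": 3.0,
    "languages": 2.0,
}


def exposure_score(
    bits: Mapping[str, float],
    b: Callable[[float], float] = lambda x: x,  # ASSUMPTION: B is a pass-through
) -> float:
    """Compute 100 * B(34 * (1 - exp(-sum(min(bits_i, 9)) / 34))) / 34."""
    capped_sum = sum(min(v, CAP_BITS) for v in bits.values())
    saturated = SCALE * (1.0 - math.exp(-capped_sum / SCALE))
    return 100.0 * b(saturated) / SCALE
```

Under that assumption, a browser exposing all seven signals has a capped sum of 35 bits, and `exposure_score(BASELINE_BITS)` comes out at roughly 64. A single signal reported at 50 bits scores the same as one reported at 9 bits, which is the cap at work.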

Limitations, plainly: ipdex has no telemetry (the Red Line forbids measuring our own visitors), so every baseline is external published research with its date attached. Browser populations have shifted since those studies, so the weights are dated estimates rather than current measurements. The score varies between browsers and sessions. Signals with no credible published baseline are shown without any percentage rather than an invented one. The whole computation runs in your browser and is discarded: nothing is sent, stored, or put in a URL.