Long-running server exposes GET / (client IP) and GET /check?ip= for
ad-hoc lookups. signal.NotifyContext drives graceful shutdown; the
shared dataset.Group.Tick goroutine refreshes inbound/outbound views
in the background so the refresh path gets real exercise.
Factored the shared populate+report logic into a Checker struct so
oneshot and serve modes use the same code path.
Distilled from the previous net/dataset experiment and the inline
closure version in check-ip. Keeps what actually earned its keep:
- Group ties one Fetcher to N views; a single Load drives all swaps,
so shared sources (one git pull, one zip download) don't get
re-fetched per view.
- View[T].Value() is a lock-free atomic read; the atomic.Pointer is
hidden so consumers never see in-flight reloads.
- Tick runs Load on a ticker with stderr error logging.
Dropped from the v1 design: MultiSyncer (callers fan-out inline when
needed), Close (unused outside geoip), Name (callers wrap the logger),
standalone Dataset type (Group with one view covers it), Sync vs Init
asymmetry (Load handles first-call vs update internally).
check-ip rewires to use it — file/git/http modes all build a Group
with two views, uniform shape.
check-ip and geoip no longer use it; formmailer now takes
*atomic.Pointer[ipcohort.Cohort] for Blacklist so callers own the
refresh + swap lifecycle directly. gitshallow doc comments that
referenced dataset.Syncer are trimmed.
The concepts the package tried to share (atomic-swap, group sync,
ticker-driven refresh) may come back under sync/dataset once we have
more than one in-tree caller that wants them.
Databases is now just two *geoip2.Reader fields with Open/Close/PrintInfo.
OpenDatabases still auto-discovers conf and downloads stale .mmdb files
via httpcache before opening, but it no longer runs background goroutines
or holds atomic pointers. Long-running callers that want refresh can wire
httpcache.Cacher to atomic.Pointer themselves.
check-ip drops geo.Init/geo.Run — OpenDatabases does the fetch+open work
itself, and a one-shot CLI doesn't need background refresh.
Uses atomic.Pointer[ipcohort.Cohort] directly and builds a per-source
refresh closure (files / git / http). One goroutine drives the ticker.
Exercises what the dataset pkg was abstracting so we can judge which
bits are worth a shared pkg.
Extract the file/git/httpcache mode switch into newSource and the Group
wiring into newBlocklists. main becomes flag parsing + exit code logic
only; run owns ctx and the check. Helpers (loadCohort, cacheDir,
splitCSV, loadWhitelist) are small and single-purpose.
Still exercises dataset.Group + background refresh, gitshallow, and
httpcache as before.
OpenDatabases(confPath, cityPath, asnPath) handles conf discovery, cache
dir setup, and Databases construction. DefaultConfPaths lists the standard
GeoIP.conf locations. cmd/check-ip/geo.go deleted; main calls one function.
Remove View[T] — Add now returns *Dataset[T] directly. Callers use Load()
on the returned Dataset; Init/Run belong to the owning Group.
main.go simplified: declare syncer + file paths per case, then one
g.Init() and one g.Run(). No manual loops over individual datasets.
Add gitshallow.Repo.FilePath helper.
Blocklist:
- Add -inbound, -outbound, -whitelist flags for explicit file paths
- buildSources() replaces the old constructor trio; explicit flags always win
- -data-dir and -git still work as defaults for the bitwire-it layout
GeoIP:
- Auto-discover GeoIP.conf from ./GeoIP.conf then ~/.config/maxmind/GeoIP.conf
- If no conf found and no -city-db/-asn-db given: geoip disabled silently
- If no conf but paths given: use those files (Init fails if absent)
- Each package gets its own go.mod: net/{dataset,httpcache,gitshallow,ipcohort,geoip,formmailer}
- go.work with replace directives for cross-module workspace resolution
- dataset.Syncer/NopSyncer moved here from httpcache; callers duck-type it
- dataset.View[T] returned by Add to prevent Init/Sync/Run misuse on group members
- cmd/check-ip moved from net/ipcohort/cmd/check-ip to top-level cmd/check-ip
- Add net/ipcohort/cmd/ipcohort-contains for standalone cohort membership testing
Rewrite from feat-formmailer WIP:
- Blacklist is *dataset.View[ipcohort.Cohort] — caller wires dataset group
- http.Handler via ServeHTTP — drop-in for any mux
- SuccessBody/ErrorBody []byte — caller loads files; no file I/O per request
- Rate limiter per-instance (sync.Once init), not global
- Fields configurable (default standard names, not GravityForms input_N)
- AllowedCountries []string for geo-blocking via iploc (nil = allow all)
- ContainsAddr used directly (pre-parsed netip.Addr, no re-parse)
- No Init()/Run() — caller drives dataset lifecycle
- Fix getErrorBotty typo; expose support email only to legitimate errors
Records lastSynced time after each pull. A concurrent caller that was
waiting behind the mutex sees lastSynced < 1s ago and returns early,
avoiding a wasted network round-trip.
- geoip.DefaultCacheDir() → ~/.cache/maxmind (os.UserCacheDir based)
- check-ip defaults data dir to ~/.cache/bitwire-it; -data-dir flag overrides;
positional data-dir arg removed (IP is now the only required arg)
- geoip conf: DatabaseDirectory defaults to geoip.DefaultCacheDir() when blank
- httpcache integration tests now cover both inbound files (single_ips + networks)
httpcache: write <path>.meta JSON sidecar after each successful download;
load it on first Fetch so conditional GETs work after process restarts.
Tests verify: download, sidecar written, same-cacher 304, fresh-cacher 304
(the last being the key case — no in-memory state, sidecar drives ETag).
MaxMind integration test reads GeoIP.conf, downloads City+ASN, verifies
fresh-cacher conditional GET skips re-download via sidecar ETag.
Generic header pair works for any auth scheme — Bearer, X-API-Key, Basic, etc.
Auth is forwarded on redirects; the MaxMind-specific stripping is removed.
geoip.go encodes Basic auth credentials directly into AuthValue.
ConnTimeout (default 5s) caps TCP connect + TLS handshake via net.Dialer
and Transport.TLSHandshakeTimeout. Timeout (default 5m) caps the overall
request including body read. Previously a single 30s timeout covered both,
which was too short for large downloads and too long for connection failures.
Group-managed datasets must never have Init/Sync/Run called on them.
Rather than patching with NopSyncer, introduce View[T] — a thin wrapper
that exposes only Load(). The compiler now prevents misuse: callers can
read values but cannot drive fetch/reload cycles directly.
Dataset[T] no longer needs a syncer when owned by a Group; View.reload()
delegates to the inner Dataset.reload() for Group.reloadAll().
- gitshallow: replace in-place Depth mutation with effectiveDepth() method;
remove depth normalisation from New() since it was masking the bug
- ipcohort: extract sortNets() helper using cmp.Compare, eliminating 3 identical
sort closures; add ContainsAddr(netip.Addr) for pre-parsed callers; guard
Contains() against IPv6 panic (As4 panics on non-v4); add IPv6 test
- dataset: Add() now sets NopSyncer{} so callers cannot panic by accidentally
calling Init/Sync/Run on a Group-managed Dataset
Sources.Init() was redundant: gitshallow.Repo.Fetch() already clones
if missing via syncGit()->clone(). Removing it means blGroup.Init()
is the single entry point, no duplicate network calls.
httpcache.NopSyncer{} replaces the private nopSyncer in the cmd —
exported so any caller can build a file-only Dataset without a syncer.
dataset.Loader[T] is now func() (*T, error) — a closure capturing its own
paths/config, so multi-file cases (LoadFiles(paths...)) work naturally.
Dataset.Close func(*T) is called with the old value after each swap, enabling
resource cleanup (e.g. geoip2.Reader.Close).
Sources.Datasets() builds a dataset.Group + three typed *Dataset[ipcohort.Cohort].
main.go now uses blGroup.Run / cityDS.Run / asnDS.Run instead of hand-rolled
atomic.Pointer + polling loops. containsInbound/OutBound accept *Dataset[Cohort].
nopSyncer handles file-only GeoIP paths (no download, just open).
gitshallow.Repo.GCInterval int:
0 (default) = git auto gc (no explicit call)
N = aggressive gc + prune every Nth successful pull
GC() simplified to always aggressive+prune (the only mode we use).
Sync(), Init(), Fetch() all parameter-free; GCInterval baked into Repo.
Dataset[T]: one Syncer + one Loader + one atomic.Pointer. Init/Sync/Run.
Group: one Syncer driving N datasets — single Fetch, all reloads fire
together. Add[T](g, loader, path) registers a typed dataset in the group.
Discovered organically: the reload+atomic-swap pattern repeated across
every cmd is exactly this abstraction.
httpcache.Syncer interface: Fetch() (bool, error) — satisfied by both
*httpcache.Cacher and *gitshallow.Repo (new Fetch method + LightGC field).
httpcache.Cacher.Fetch now errors on zero-length 200 response instead of
clobbering the existing file with empty content.
Sources.Fetch/Init drop the lightGC param (baked into Repo.LightGC).
Sources.syncs []httpcache.Syncer replaces the separate git/httpInbound/
httpOutbound fields — Fetch iterates syncs uniformly, no more switch.
Sources itself satisfies httpcache.Syncer.
geoip.ParseConf() extracted from geoip-update into the geoip package so
both cmds can read GeoIP.conf without duplication.
check-ip gains -geoip-conf flag: reads AccountID+LicenseKey, resolves
mmdb paths into data-dir, builds httpcache.Cachers with geoip.NewCacher.
Background runLoop now refreshes both blocklists and GeoIP DBs on each
tick, hot-swapping geoip2.Reader via atomic.Pointer.Swap + old.Close().
httpcache.Cacher gains:
- Username/Password: Basic Auth, stripped before following redirects
- MaxAge: skip HTTP if local file mtime is within this duration
- MinInterval: skip HTTP if last Fetch attempt was within this duration
- Transform: post-process response body (e.g. extract .mmdb from tar.gz)
geoip.Downloader now builds an httpcache.Cacher via NewCacher(), removing
its own HTTP client. ExtractMMDB is now exported for use as a Transform.
check-ip-blacklist renamed to check-ip; adds -city-db / -asn-db flags
for GeoLite2 lookup (country, city, subdivision, ASN) printed after each
blocklist result.
Sources (blacklist.go) now owns only fetch/load logic — no atomic state.
main.go holds the three atomic.Pointer[Cohort] vars, calls reload() on
startup, and runs the background ticker directly. This makes the dataset
pattern (fetch → load → atomic.Store → poll) visible at the call site.
Top-layer callers (IPFilter) now drive all reloads directly after
Sync/Fetch return. gitshallow.Init now returns (bool, error).
httpcache drops Init and Sync — callers just call Fetch.
Blacklist → IPFilter with three separate atomic cohorts: whitelist
(never blocked), inbound, and outbound. ContainsInbound/ContainsOutbound
each skip the whitelist. HTTP sync fetches all cachers before a single
reload to avoid double-load. Also fixes httpcache.Init calling c.Fetch().