73 Commits

Author SHA1 Message Date
673d084bd2
refactor: dataset uses closure Loader + Close callback; check-ip uses Dataset/Group
dataset.Loader[T] is now func() (*T, error) — a closure capturing its own
paths/config, so multi-file cases (LoadFiles(paths...)) work naturally.

Dataset.Close func(*T) is called with the old value after each swap, enabling
resource cleanup (e.g. geoip2.Reader.Close).

Sources.Datasets() builds a dataset.Group + three typed *Dataset[ipcohort.Cohort].
main.go now uses blGroup.Run / cityDS.Run / asnDS.Run instead of hand-rolled
atomic.Pointer + polling loops. containsInbound/OutBound accept *Dataset[Cohort].
nopSyncer handles file-only GeoIP paths (no download, just open).
2026-04-20 09:28:20 -06:00
7c0cd26da1
refactor: GCInterval replaces LightGC; Sync/Init drop lightGC param
gitshallow.Repo.GCInterval int:
  0 (default) = git auto gc (no explicit call)
  N = aggressive gc + prune every Nth successful pull

GC() simplified to always aggressive+prune (the only mode we use).
Sync(), Init(), Fetch() all parameter-free; GCInterval baked into Repo.
2026-04-20 09:24:51 -06:00
10c4b6dbc3
feat: add net/dataset — generic Syncer→atomic.Pointer with Dataset and Group
Dataset[T]: one Syncer + one Loader + one atomic.Pointer. Init/Sync/Run.
Group: one Syncer driving N datasets — single Fetch, all reloads fire
together. Add[T](g, loader, path) registers a typed dataset in the group.

Discovered organically: the reload+atomic-swap pattern repeated across
every cmd is exactly this abstraction.
2026-04-20 09:23:14 -06:00
105e99532d
refactor: Syncer interface, zero-length guard, Sources uses []Syncer
httpcache.Syncer interface: Fetch() (bool, error) — satisfied by both
*httpcache.Cacher and *gitshallow.Repo (new Fetch method + LightGC field).

httpcache.Cacher.Fetch now errors on zero-length 200 response instead of
clobbering the existing file with empty content.

Sources.Fetch/Init drop the lightGC param (baked into Repo.LightGC).
Sources.syncs []httpcache.Syncer replaces the separate git/httpInbound/
httpOutbound fields — Fetch iterates syncs uniformly, no more switch.
Sources itself satisfies httpcache.Syncer.
2026-04-20 09:22:16 -06:00
2abdc1c229
feat: geoip.ParseConf, geoip-update uses it, check-ip auto-downloads+hot-swaps GeoIP
geoip.ParseConf() extracted from geoip-update into the geoip package so
both cmds can read GeoIP.conf without duplication.

check-ip gains -geoip-conf flag: reads AccountID+LicenseKey, resolves
mmdb paths into data-dir, builds httpcache.Cachers with geoip.NewCacher.
Background runLoop now refreshes both blocklists and GeoIP DBs on each
tick, hot-swapping geoip2.Reader via atomic.Pointer.Swap + old.Close().
2026-04-20 00:38:54 -06:00
52f422ec93
feat: httpcache auth+rate-limit, geoip via httpcache, rename cmd to check-ip
httpcache.Cacher gains:
  - Username/Password: Basic Auth, stripped before following redirects
  - MaxAge: skip HTTP if local file mtime is within this duration
  - MinInterval: skip HTTP if last Fetch attempt was within this duration
  - Transform: post-process response body (e.g. extract .mmdb from tar.gz)

geoip.Downloader now builds an httpcache.Cacher via NewCacher(), removing
its own HTTP client. ExtractMMDB is now exported for use as a Transform.

check-ip-blacklist renamed to check-ip; adds -city-db / -asn-db flags
for GeoLite2 lookup (country, city, subdivision, ASN) printed after each
blocklist result.
2026-04-20 00:31:49 -06:00
e29c294a75
docs: add MaxMind DB binary format spec to net/geoip 2026-04-20 00:23:49 -06:00
da33660c7c
feat: add net/geoip for MaxMind GeoLite2 database downloads
Downloader checks file mtime before fetching (30/day rate limit).
Extracts .mmdb atomically from tar.gz, preserving MaxMind's release
date as mtime so freshness checks survive restarts. Strips auth header
on redirects (302 → Cloudflare R2 presigned URL). Default: 3-day
threshold, 5-minute timeout.

Also ignores GeoIP.conf and *.mmdb in .gitignore.
2026-04-20 00:21:31 -06:00
8c578ee0c6
docs: update ipcohort README with git and HTTP periodic-update examples 2026-04-19 23:42:06 -06:00
4895553a91
refactor: move atomic swaps and polling loop into main
Sources (blacklist.go) now owns only fetch/load logic — no atomic state.
main.go holds the three atomic.Pointer[Cohort] vars, calls reload() on
startup, and runs the background ticker directly. This makes the dataset
pattern (fetch → load → atomic.Store → poll) visible at the call site.
2026-04-19 23:36:38 -06:00
e2236aa09b
refactor: remove callbacks from gitshallow and httpcache
Top-layer callers (IPFilter) now drive all reloads directly after
Sync/Fetch return. gitshallow.Init now returns (bool, error).
httpcache drops Init and Sync — callers just call Fetch.
2026-04-19 23:30:30 -06:00
5f48a9beaa
feat: ipcohort filter with inbound/outbound/whitelist cohorts
Blacklist → IPFilter with three separate atomic cohorts: whitelist
(never blocked), inbound, and outbound. ContainsInbound/ContainsOutbound
each skip the whitelist. HTTP sync fetches all cachers before a single
reload to avoid double-load. Also fixes httpcache.Init calling c.Fetch().
2026-04-19 23:17:12 -06:00
ff224c5bb1
feat: support split single_ips/networks files; ipcohort.LoadFiles variadic 2026-04-19 23:01:51 -06:00
a9adc3dc18
feat: add net/httpcache; wire git+http+file into Blacklist 2026-04-19 22:57:36 -06:00
4b0f943bd7
feat: add Blacklist type to check-ip-blacklist to test ergonomics 2026-04-19 22:55:39 -06:00
73b033c3e1
refactor: remove Run from gitshallow.Repo
Polling loop (ticker + Sync check) is generic to any update source —
git HEAD, HTTP ETag, file mtime. Caller drives the loop.
2026-04-19 22:54:21 -06:00
d6837d31ed
refactor: fold dataset into gitshallow, caller owns atomic.Pointer
fs/dataset deleted — generic File[T] wrapper didn't earn its abstraction layer
gitshallow.ShallowRepo → Repo (redundant with package name)
gitshallow.Repo.Register(func() error) — callbacks fire after each sync
gitshallow.Repo.Init/Run — full lifecycle in one package
caller (check-ip-blacklist) holds atomic.Pointer[Cohort] directly
2026-04-19 22:51:52 -06:00
8731eaf10b
refactor: decouple gitdataset/ipcohort for multi-file repos
gitshallow: fix double-fetch (pull already fetches), drop redundant -C flags
gitdataset: split into GitDataset[T] (file+atomic) and GitRepo (git+multi-dataset)
  - NewDataset for file-only use, AddDataset to register with a GitRepo
  - one clone/fetch per repo regardless of how many datasets it has
ipcohort: split Cohort into hosts (sorted /32, binary search) + nets (CIDRs, linear)
  - fixes false negatives when broad CIDRs (e.g. /8) precede specific entries
  - fixes Parse() sort-before-copy order bug
  - ReadAll always sorts; unsorted param removed (was dead code)
2026-04-19 22:34:25 -06:00
a8e108a05b
wip: ipcohort: move atomics to gitdataset 2026-04-19 19:49:52 -06:00
29f9760f4d
wip: feat: add net/gitdataset for data that updates via git 2026-04-19 19:49:52 -06:00
98fb592435
f: ipcohort / blacklist 2026-04-19 19:34:21 -06:00
0f909da44c
feat: add net/ipcohort (for blacklisting, whitelisting, etc) 2026-04-19 19:34:21 -06:00
eb5e1d1336
feat: add net/gitshallow (for incremental updates to data repos) 2026-04-19 19:34:21 -06:00