281 Commits

Author SHA1 Message Date
9b92136f91
refactor(geoip,check-ip): lift download/refresh out of geoip into cmd
geoip.Open now just opens files; download/refresh/polling logic lives at
the cmd layer using dataset.Group with a combined httpcache.Cacher
fetcher (or PollFiles when no GeoIP.conf is available). Removes
geoip.OpenDatabases — the library is no longer concerned with refresh.
2026-04-20 16:10:51 -06:00
d8b6638d97
refactor(check-ip): defer signal ctx + Tick until server starts
Signal handling and periodic refresh are only meaningful when the HTTP
server runs. Load blocklists with context.Background(); start Tick and
the signal-aware ctx inside the serve branch.
2026-04-20 16:04:23 -06:00
7aa4493cb0
refactor(check-ip): IPCheck struct holds flag config + handler method
Follow golang-cli-flags pattern: config struct holds parsed flags and
loaded resources; handle and serve are methods on *IPCheck. Adds -V/help
pre-parse handling. Inlines clientIP into the handler.
2026-04-20 16:02:55 -06:00
0c281a494b
fix(check-ip): explicit error handling for UserCacheDir + geo.Close 2026-04-20 15:58:13 -06:00
b61ca0aa94
refactor(check-ip): collapse remaining server indirection
- drop format type / formatPretty / formatJSON / requestFormat /
  write / writeGeo — inline into one handler
- drop the inner check closure — inline into the handler
- one handler serves both GET / and GET /check
- fatal() replaced with log.Fatalf
- --serve is optional; without it, databases load and main returns
2026-04-20 15:57:48 -06:00
a84116f806
refactor: strip all optional/nil-guard plumbing from check-ip + geoip
- drop Checker struct, loadCohort helper, and contains() nil-wrapper
- inline check logic into server as a closure
- geoip.Databases: no nil-receiver guards, no nil-field branches, no
  "disabled" mode. City + ASN are both required; caller hands explicit
  paths and OpenDatabases returns a fully-initialized value or an err
- main.go is now straight-line wiring with no helper functions
2026-04-20 15:55:55 -06:00
cdce7da04c
refactor(check-ip): simplify to 4 flags, push MkdirAll into libs
check-ip now takes only --serve, --geoip-conf, --blocklist-repo,
--cache-dir. Blocklist always comes from git; GeoIP mmdbs always go
through httpcache (when GeoIP.conf is available). Format negotiation
lives entirely server-side.

main.go is now straight-line wiring: parse flags, build the two
databases, run the server. All filesystem setup (MkdirAll for clone
target, for cache Path parents) is pushed into gitshallow and
httpcache so the cmd doesn't do filesystem bookkeeping.
2026-04-20 15:51:46 -06:00
3b5812ffcd
feat(dataset): add PollFiles fetcher for local-file sources
Stats the given paths and reports updated when any size/modtime
changes since the last call. First call always reports true so the
initial Load populates views.

check-ip uses it for --inbound/--outbound so edits to local lists
get picked up by Group.Tick without a restart.
2026-04-20 15:39:23 -06:00
7b798a739a
refactor(check-ip): split server into server.go, linearize main.go
main.go now reads top-to-bottom as setup + usage of the three
databases (blocklists group, whitelist cohort, geoip readers), then
dispatch to one-shot or serve. HTTP server code moved to server.go.

No behavior change.
2026-04-20 15:36:02 -06:00
786463cecd
refactor(dataset): Tick takes an onError callback, no more stderr
Libraries shouldn't decide where errors go. Tick now passes Load
errors to onError (nil to ignore); callers pick log/count/page.
check-ip supplies its own stderr writer.
2026-04-20 14:19:26 -06:00
912e1179d4
feat(check-ip): --format pretty|json, move rendering out of geoip
geoip.Databases now exposes a structured Lookup(ip) Info. Rendering
moved up to the cmd — the library no longer writes to io.Writer.

check-ip adds a Result struct and --format flag (pretty/json). Serve
mode dispatches on ?format=json or Accept: application/json. Pretty
is the default for both one-shot and HTTP.
2026-04-20 14:18:39 -06:00
a3d657ec61
fix(check-ip): create cache dir before httpcache writes into it
httpcache.Cacher.Fetch writes to <path>.tmp without MkdirAll; the
library expects the caller to own the directory. cacheDir now
MkdirAll's before returning.
2026-04-20 14:15:52 -06:00
82f0b53ba3
feat(check-ip): add --serve HTTP mode to exercise dataset.Tick
Long-running server exposes GET / (client IP) and GET /check?ip= for
ad-hoc lookups. signal.NotifyContext drives graceful shutdown; the
shared dataset.Group.Tick goroutine refreshes inbound/outbound views
in the background so the refresh path gets real exercise.

Factored the shared populate+report logic into a Checker struct so
oneshot and serve modes use the same code path.
2026-04-20 14:10:29 -06:00
11743c9a10
feat(sync/dataset): minimal group/view/fetcher for hot-swap refresh
Distilled from the previous net/dataset experiment and the inline
closure version in check-ip. Keeps what actually earned its keep:

  - Group ties one Fetcher to N views; a single Load drives all swaps,
    so shared sources (one git pull, one zip download) don't get
    re-fetched per view.
  - View[T].Value() is a lock-free atomic read; the atomic.Pointer is
    hidden so consumers never see in-flight reloads.
  - Tick runs Load on a ticker with stderr error logging.

Dropped from the v1 design: MultiSyncer (callers fan-out inline when
needed), Close (unused outside geoip), Name (callers wrap the logger),
standalone Dataset type (Group with one view covers it), Sync vs Init
asymmetry (Load handles first-call vs update internally).

check-ip rewires to use it — file/git/http modes all build a Group
with two views, uniform shape.
2026-04-20 13:33:05 -06:00
01a9185c03
refactor: delete net/dataset package
check-ip and geoip no longer use it; formmailer now takes
*atomic.Pointer[ipcohort.Cohort] for Blacklist so callers own the
refresh + swap lifecycle directly. gitshallow doc comments that
referenced dataset.Syncer are trimmed.

The concepts the package tried to share (atomic-swap, group sync,
ticker-driven refresh) may come back under sync/dataset once we have
more than one in-tree caller that wants them.
2026-04-20 13:22:08 -06:00
5985ea5e2d
refactor(geoip): drop dataset dep, become barebones load/open/get
Databases is now just two *geoip2.Reader fields with Open/Close/PrintInfo.
OpenDatabases still auto-discovers conf and downloads stale .mmdb files
via httpcache before opening, but it no longer runs background goroutines
or holds atomic pointers. Long-running callers that want refresh can wire
httpcache.Cacher to atomic.Pointer themselves.

check-ip drops geo.Init/geo.Run — OpenDatabases does the fetch+open work
itself, and a one-shot CLI doesn't need background refresh.
2026-04-20 13:20:34 -06:00
990b9e430c
refactor(check-ip): drop dataset pkg, inline atomic-swap + ticker
Uses atomic.Pointer[ipcohort.Cohort] directly and builds a per-source
refresh closure (files / git / http). One goroutine drives the ticker.
Exercises what the dataset pkg was abstracting so we can judge which
bits are worth a shared pkg.
2026-04-20 13:16:47 -06:00
9e9bd98540
refactor(check-ip): factor source selection, keep demo of all three backends
Extract the file/git/httpcache mode switch into newSource and the Group
wiring into newBlocklists. main becomes flag parsing + exit code logic
only; run owns ctx and the check. Helpers (loadCohort, cacheDir,
splitCSV, loadWhitelist) are small and single-purpose.

Still exercises dataset.Group + background refresh, gitshallow, and
httpcache as before.
2026-04-20 13:05:37 -06:00
bf4cba6fb5
feat: add bitwireGitURL const, show default URL in --git flag help 2026-04-20 12:55:08 -06:00
8c9924e559
refactor: check-ip uses CheckIPConfig struct + flag.FlagSet, adds -V/help 2026-04-20 12:54:13 -06:00
f5f992ae94
refactor: move geoip setup into geoip.OpenDatabases, remove cmd/check-ip/geo.go
OpenDatabases(confPath, cityPath, asnPath) handles conf discovery, cache
dir setup, and Databases construction. DefaultConfPaths lists the standard
GeoIP.conf locations. cmd/check-ip/geo.go deleted; main calls one function.
2026-04-20 12:51:50 -06:00
994d91b2bf
refactor: dataset.Add returns *Dataset, no View; main uses Group for all cases
Remove View[T] — Add now returns *Dataset[T] directly. Callers use Load()
on the returned Dataset; Init/Run belong to the owning Group.

main.go simplified: declare syncer + file paths per case, then one
g.Init() and one g.Run(). No manual loops over individual datasets.
Add gitshallow.Repo.FilePath helper.
2026-04-20 12:48:38 -06:00
cc945b0c09
refactor: dataset.Sync() = fetch+conditional-swap, no public Swap()
Callers only need Init() + Run() + Load(). Sync() handles the full
fetch→swap cycle internally when the source reports a change.
2026-04-20 12:46:02 -06:00
03ea6934e9
refactor: HTTP datasets are independent, no Group; Group only for shared git repo 2026-04-20 12:41:07 -06:00
7b71dec445
feat: gitshallow.File for per-file path/open/sync; use in check-ip git case 2026-04-20 12:39:24 -06:00
6b420badbc
refactor: merge blacklist.go into main.go via dataset.MultiSyncer 2026-04-20 12:23:13 -06:00
3ac9683015
style: use blacklist/whitelist (industry standard) 2026-04-20 12:20:59 -06:00
e1108f3de7
fix: explicit path flags for blocklist; auto-discover GeoIP.conf
Blocklist:
- Add -inbound, -outbound, -whitelist flags for explicit file paths
- buildSources() replaces the old constructor trio; explicit flags always win
- -data-dir and -git still work as defaults for the bitwire-it layout

GeoIP:
- Auto-discover GeoIP.conf from ./GeoIP.conf then ~/.config/maxmind/GeoIP.conf
- If no conf found and no -city-db/-asn-db given: geoip disabled silently
- If no conf but paths given: use those files (Init fails if absent)
2026-04-20 12:18:33 -06:00
ddd0986e20
refactor: push complexity into packages; main.go is orchestration only
- geoip.Databases: wraps city+ASN datasets with nil-safe Init/Run/PrintInfo
- geoip.(*Downloader).NewDatabases: builds Databases from downloader
- cmd/check-ip/geo.go: setupGeo() handles conf parsing, dir creation, DB path resolution
- cmd/check-ip/blacklist.go: isBlocked() + cohortSize() moved here
- cmd/check-ip/main.go: flags, source selection, init, check, print — nothing else
2026-04-20 12:15:14 -06:00
3e48e0a863
fix: check-ip fails on startup if data cannot be downloaded 2026-04-20 12:08:03 -06:00
34a54c2d66
refactor: multi-module workspace + dataset owns Syncer interface
- Each package gets its own go.mod: net/{dataset,httpcache,gitshallow,ipcohort,geoip,formmailer}
- go.work with replace directives for cross-module workspace resolution
- dataset.Syncer/NopSyncer moved here from httpcache; callers duck-type it
- dataset.View[T] returned by Add to prevent Init/Sync/Run misuse on group members
- cmd/check-ip moved from net/ipcohort/cmd/check-ip to top-level cmd/check-ip
- Add net/ipcohort/cmd/ipcohort-contains for standalone cohort membership testing
2026-04-20 11:22:01 -06:00
225faec549
fix: FormFields defaults to GravityForms-compatible input_N names 2026-04-20 11:02:46 -06:00
d57c810c2e
feat: add net/formmailer with updated paradigms
Rewrite from feat-formmailer WIP:
- Blacklist is *dataset.View[ipcohort.Cohort] — caller wires dataset group
- http.Handler via ServeHTTP — drop-in for any mux
- SuccessBody/ErrorBody []byte — caller loads files; no file I/O per request
- Rate limiter per-instance (sync.Once init), not global
- Fields configurable (default standard names, not GravityForms input_N)
- AllowedCountries []string for geo-blocking via iploc (nil = allow all)
- ContainsAddr used directly (pre-parsed netip.Addr, no re-parse)
- No Init()/Run() — caller drives dataset lifecycle
- Fix getErrorBotty typo; expose support email only to legitimate errors
2026-04-20 11:01:15 -06:00
b2eb5aef9a
fix: skip redundant pull when another caller just synced under the lock
Records lastSynced time after each pull. A concurrent caller that was
waiting behind the mutex sees lastSynced < 1s ago and returns early,
avoiding a wasted network round-trip.
2026-04-20 10:15:53 -06:00
bd62122ac8
feat: default cache dirs; test both inbound files
- geoip.DefaultCacheDir() → ~/.cache/maxmind (os.UserCacheDir based)
- check-ip defaults data dir to ~/.cache/bitwire-it; -data-dir flag overrides;
  positional data-dir arg removed (IP is now the only required arg)
- geoip conf: DatabaseDirectory defaults to geoip.DefaultCacheDir() when blank
- httpcache integration tests now cover both inbound files (single_ips + networks)
2026-04-20 10:11:49 -06:00
d24a34e0e5
test: strengthen gitshallow integration tests to assert updated=false on re-pull 2026-04-20 10:07:06 -06:00
297fba10f5
feat: persist ETag/Last-Modified to sidecar file; add integration tests
httpcache: write <path>.meta JSON sidecar after each successful download;
load it on first Fetch so conditional GETs work after process restarts.

Tests verify: download, sidecar written, same-cacher 304, fresh-cacher 304
(the last being the key case — no in-memory state, sidecar drives ETag).
MaxMind integration test reads GeoIP.conf, downloads City+ASN, verifies
fresh-cacher conditional GET skips re-download via sidecar ETag.
2026-04-20 10:04:56 -06:00
344246362f
test: add integration tests for httpcache and gitshallow 2026-04-20 10:01:57 -06:00
4e8321af97
fix: restore auth stripping on redirect, keyed off AuthHeader 2026-04-20 09:59:27 -06:00
3feb248ce1
refactor: replace Username/Password with AuthHeader/AuthValue in httpcache
Generic header pair works for any auth scheme — Bearer, X-API-Key, Basic, etc.
Auth is forwarded on redirects; the MaxMind-specific stripping is removed.
geoip.go encodes Basic auth credentials directly into AuthValue.
2026-04-20 09:58:08 -06:00
d0a5e0a9d2
fix: split connection and download timeouts in httpcache
ConnTimeout (default 5s) caps TCP connect + TLS handshake via net.Dialer
and Transport.TLSHandshakeTimeout. Timeout (default 5m) caps the overall
request including body read. Previously a single 30s timeout covered both,
which was too short for large downloads and too long for connection failures.
2026-04-20 09:56:24 -06:00
86ffa2fb23
chore: remove IPv6 special-casing (YAGNI)
Drop the explicit IPv6 early-exit in ReadAll — ParseIPv4 already rejects
non-IPv4 via Is4(). Remove IPv6-specific tests and error message wording.
2026-04-20 09:54:04 -06:00
ad5d696ce6
refactor: dataset.Add returns View[T] instead of Dataset[T]
Group-managed datasets must never have Init/Sync/Run called on them.
Rather than patching with NopSyncer, introduce View[T] — a thin wrapper
that exposes only Load(). The compiler now prevents misuse: callers can
read values but cannot drive fetch/reload cycles directly.

Dataset[T] no longer needs a syncer when owned by a Group; View.reload()
delegates to the inner Dataset.reload() for Group.reloadAll().
2026-04-20 09:50:48 -06:00
896031b6a8
fix: idiomatic Go cleanup across net packages
- gitshallow: replace in-place Depth mutation with effectiveDepth() method;
  remove depth normalisation from New() since it was masking the bug
- ipcohort: extract sortNets() helper using cmp.Compare, eliminating 3 identical
  sort closures; add ContainsAddr(netip.Addr) for pre-parsed callers; guard
  Contains() against IPv6 panic (As4 panics on non-v4); add IPv6 test
- dataset: Add() now sets NopSyncer{} so callers cannot panic by accidentally
  calling Init/Sync/Run on a Group-managed Dataset
2026-04-20 09:47:50 -06:00
410b52f72c
test: ipcohort + dataset; fix ParseIPv4 panic on IPv6
- ParseIPv4 now returns an error instead of panicking on IPv6 addrs
- Add ipcohort tests: ParseIPv4, Contains (host/CIDR/mixed/fail-closed/empty), Size, LoadFile, LoadFiles, IPv6 skip
- Add dataset tests: Init, Sync (updated/no-update), error paths, Close hook, Run tick, Group (single fetch drives all loaders)
2026-04-20 09:36:13 -06:00
aeb94fc26b
fix: remove double-fetch, add httpcache.NopSyncer, drop Sources.Init
Sources.Init() was redundant: gitshallow.Repo.Fetch() already clones
if missing via syncGit()->clone(). Removing it means blGroup.Init()
is the single entry point, no duplicate network calls.

httpcache.NopSyncer{} replaces the private nopSyncer in the cmd —
exported so any caller can build a file-only Dataset without a syncer.
2026-04-20 09:31:58 -06:00
673d084bd2
refactor: dataset uses closure Loader + Close callback; check-ip uses Dataset/Group
dataset.Loader[T] is now func() (*T, error) — a closure capturing its own
paths/config, so multi-file cases (LoadFiles(paths...)) work naturally.

Dataset.Close func(*T) is called with the old value after each swap, enabling
resource cleanup (e.g. geoip2.Reader.Close).

Sources.Datasets() builds a dataset.Group + three typed *Dataset[ipcohort.Cohort].
main.go now uses blGroup.Run / cityDS.Run / asnDS.Run instead of hand-rolled
atomic.Pointer + polling loops. containsInbound/OutBound accept *Dataset[Cohort].
nopSyncer handles file-only GeoIP paths (no download, just open).
2026-04-20 09:28:20 -06:00
7c0cd26da1
refactor: GCInterval replaces LightGC; Sync/Init drop lightGC param
gitshallow.Repo.GCInterval int:
  0 (default) = git auto gc (no explicit call)
  N = aggressive gc + prune every Nth successful pull

GC() simplified to always aggressive+prune (the only mode we use).
Sync(), Init(), Fetch() all parameter-free; GCInterval baked into Repo.
2026-04-20 09:24:51 -06:00
10c4b6dbc3
feat: add net/dataset — generic Syncer→atomic.Pointer with Dataset and Group
Dataset[T]: one Syncer + one Loader + one atomic.Pointer. Init/Sync/Run.
Group: one Syncer driving N datasets — single Fetch, all reloads fire
together. Add[T](g, loader, path) registers a typed dataset in the group.

Discovered organically: the reload+atomic-swap pattern repeated across
every cmd is exactly this abstraction.
2026-04-20 09:23:14 -06:00
105e99532d
refactor: Syncer interface, zero-length guard, Sources uses []Syncer
httpcache.Syncer interface: Fetch() (bool, error) — satisfied by both
*httpcache.Cacher and *gitshallow.Repo (new Fetch method + LightGC field).

httpcache.Cacher.Fetch now errors on zero-length 200 response instead of
clobbering the existing file with empty content.

Sources.Fetch/Init drop the lightGC param (baked into Repo.LightGC).
Sources.syncs []httpcache.Syncer replaces the separate git/httpInbound/
httpOutbound fields — Fetch iterates syncs uniformly, no more switch.
Sources itself satisfies httpcache.Syncer.
2026-04-20 09:22:16 -06:00