52 Commits

Author SHA1 Message Date
da2b230bc3
refactor(dataset)!: plumb ctx into loader callbacks
Loader signature changes from func() (*T, error) to
func(context.Context) (*T, error). Set.Load(ctx) already accepts a
ctx; it now flows through reload() into the loader so long-running
parses or downloads can honor ctx.Err() for graceful shutdown.

check-ip's loaders don't consume ctx yet (ipcohort/geoip are
in-memory and fast), but the hook is in place for future work.

BREAKING: dataset.Add and dataset.AddInitial signatures changed.
2026-04-20 20:02:48 -06:00
b40abe0a06
feat(check-ip): --async-load flag for non-blocking server startup
With --serve --async-load, blocklists + whitelist start empty and
load in background goroutines so the HTTP server binds immediately.
/healthz returns 503 until loads complete, then 200. Ignored in CLI
mode. Geo stays synchronous — geoip readers aren't nil-safe.
2026-04-20 19:26:32 -06:00
46b31b75c2
style: format entry counts with comma thousands separators
3,406,727 scans cleanly; 3406727 does not. Go's fmt has no
thousands-separator verb and golang.org/x/text/message pulls in a
multi-MB Unicode tree for what is 15 lines inline, so each cmd gets
its own commafy helper.
2026-04-20 19:15:47 -06:00
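For reference, an inline helper of the kind 46b31b75c2 describes might look like this. It is a sketch, not the repo's actual `commafy`; the signature (an `int` in, a string out) is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// commafy formats n with comma thousands separators.
func commafy(n int) string {
	s := strconv.Itoa(n)
	sign := ""
	if n < 0 {
		sign, s = "-", s[1:]
	}
	// Walk from the right, inserting a comma before every group of three digits.
	for i := len(s) - 3; i > 0; i -= 3 {
		s = s[:i] + "," + s[i:]
	}
	return sign + s
}

func main() {
	fmt.Println(commafy(3406727))
}
```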
a181133c2f
style(check-ip): blank line between stderr loading output and results
2026-04-20 17:37:50 -06:00
e3973b240e
feat(check-ip): print 'Loading X...' before each stage starts
Split the stage-timing lines into a pre-stage 'Loading X... ' (shown
before the work starts) and a post-stage '<duration> (<counts>)'. Makes
it obvious something is happening during the ~1s cold-start parse
instead of going silent and printing everything at the end.
2026-04-20 17:37:01 -06:00
36b015f84a
feat(check-ip): print stage timings to stderr
Always-on "Loading X... Nms" lines on stderr for blocklists, geoip,
and whitelist load stages. Makes it obvious at a glance that the cost
is cold-start parsing (not re-downloading) and surfaces the sizes of
the loaded sets. Stdout stays clean for pipe-friendly consumption.
2026-04-20 17:36:21 -06:00
c99cd3a2b8
refactor: default cache to ~/.cache on all platforms
os.UserCacheDir returns ~/Library/Caches on macOS, which is intended
for bundled desktop apps and hides files from anyone looking under
~/.cache. These are CLI tools — use the XDG convention everywhere so
the cache lives somewhere predictable and cross-platform-consistent.
2026-04-20 17:33:31 -06:00
5e6688c2a9
feat(gitshallow): add MaxAge gate via FETCH_HEAD mtime
Short-lived CLI invocations were doing a full git fetch+reset on every
run because the only debounce was an in-memory lastSynced field. MaxAge
skips the fetch when .git/FETCH_HEAD is younger than the configured
duration — git rewrites FETCH_HEAD on every successful fetch, so its
mtime is effectively "last time we talked to the remote", and it
survives process restart. Wire check-ip's blocklist repo to the same
47m refresh interval it uses for the background Tick.
2026-04-20 17:28:53 -06:00
e594f2503c
refactor(geoip): cache tarballs as <edition>_LATEST.tar.gz
Adds geoip.TarGzName(edition) as the single source of truth for the
cache filename. The _LATEST suffix signals that the file is whatever
MaxMind served most recently (versus the dated Content-Disposition
name) and keeps httpcache's ETag sidecar tied to a stable path across
releases.
2026-04-20 17:13:41 -06:00
5fc032dc56
docs(check-ip): keep --geoip-conf flag help concise
2026-04-20 17:10:00 -06:00
0c509fb563
docs: note GeoLite2 free signup in check-ip and geoip.Conf
Missing GeoIP.conf now points users at the free MaxMind signup with an
example config. Also documented on the geoip.Conf godoc.
2026-04-20 17:07:22 -06:00
f293f86b16
feat(check-ip): add --whitelist override, require GeoIP.conf
Whitelist is a combined IP+CIDR cohort file polled for mtime changes;
a match short-circuits the blocklist check and marks the result
allowlisted. Drops the geoip PollFiles fallback — missing GeoIP.conf
now fails fast instead of silently polling local tarballs.
2026-04-20 17:05:55 -06:00
f75d5c489a
refactor(httpcache): use http.Header instead of AuthHeader/AuthValue
Cacher.Header is a stdlib http.Header that's merged into every request.
Authorization is stripped on redirect unconditionally (presigned S3/R2
targets, etc). Callers build the header with the usual http.Header
literal; BasicAuth/Bearer still produce the Authorization value.
2026-04-20 16:55:15 -06:00
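To illustrate the redirect behaviour in f75d5c489a: Go's default client only drops Authorization when a redirect leaves the original domain, so stripping it unconditionally needs a CheckRedirect hook. The sketch below assumes that approach; `newClient`, `get`, and `demo` are hypothetical names, not httpcache's API.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newClient returns a client that removes Authorization on every redirect hop.
func newClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			if len(via) >= 10 {
				return errors.New("stopped after 10 redirects")
			}
			req.Header.Del("Authorization")
			return nil
		},
	}
}

// get issues a GET with the extra headers merged into the request.
func get(c *http.Client, url string, extra http.Header) (*http.Response, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	for k, vs := range extra {
		for _, v := range vs {
			req.Header.Add(k, v)
		}
	}
	return c.Do(req)
}

// demo returns the Authorization value seen by the first hop and by the
// redirect target, using a same-host redirect on a local test server.
func demo() (first, final string, err error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/start", func(w http.ResponseWriter, r *http.Request) {
		first = r.Header.Get("Authorization")
		http.Redirect(w, r, "/final", http.StatusFound)
	})
	mux.HandleFunc("/final", func(w http.ResponseWriter, r *http.Request) {
		final = r.Header.Get("Authorization")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	resp, err := get(newClient(), srv.URL+"/start", http.Header{"Authorization": {"Bearer example-token"}})
	if err != nil {
		return "", "", err
	}
	resp.Body.Close()
	return first, final, nil
}

func main() {
	first, final, err := demo()
	fmt.Printf("first=%q final=%q err=%v\n", first, final, err)
}
```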
4753888402
refactor(geoip): ParseConf takes a string, not a file path
The old ParseConf opened the file itself, which the name did not
convey. Now it parses the config text directly, matching
encoding/json.Unmarshal-style conventions: callers read the file (or
source the string however they like) and pass it in. Also introduce
errors.ErrMissingCredentials for the credential-missing case so callers
can branch on it.
2026-04-20 16:53:17 -06:00
e329c0f86b
refactor(dataset): rename Group to Set, accept variadic fetchers
Set handles both single-fetcher (one git repo) and multi-fetcher
(GeoLite2 City + ASN) cases uniformly. Any fetcher reporting an update
triggers a view reload. This replaces the per-caller FetcherFunc wrapper
that combined the two MaxMind cachers and the ad-hoc atomic.Pointer +
ticker goroutine in cmd/check-ip — geoip now rides on the same
Set/View/Load/Tick surface as the blocklists.
2026-04-20 16:50:33 -06:00
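The "any fetcher reporting an update triggers a view reload" rule from e329c0f86b can be sketched as below. The lowercase `fetcher`, `fetcherFunc`, and `set` types are simplified, hypothetical versions (no ctx, a single reload hook); only the `(updated, err)` return shape is taken from the log.

```go
package main

import "fmt"

// fetcher reports whether its source changed since the last call.
type fetcher interface {
	Fetch() (updated bool, err error)
}

// fetcherFunc adapts a plain function to the fetcher interface.
type fetcherFunc func() (bool, error)

func (f fetcherFunc) Fetch() (bool, error) { return f() }

// set ties any number of fetchers to one reload hook.
type set struct {
	fetchers []fetcher
	reload   func() error
}

func newSet(reload func() error, fetchers ...fetcher) *set {
	return &set{fetchers: fetchers, reload: reload}
}

// Load runs every fetcher and reloads once if at least one reported an update.
func (s *set) Load() error {
	updated := false
	for _, f := range s.fetchers {
		u, err := f.Fetch()
		if err != nil {
			return err
		}
		updated = updated || u
	}
	if !updated {
		return nil
	}
	return s.reload()
}

func main() {
	reloads := 0
	city := fetcherFunc(func() (bool, error) { return false, nil })
	asn := fetcherFunc(func() (bool, error) { return true, nil })

	s := newSet(func() error { reloads++; return nil }, city, asn)
	if err := s.Load(); err != nil {
		fmt.Println("load failed:", err)
	}
	fmt.Println("reloads:", reloads)
}
```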
01158aee55
revert: inline geoip sync instead of IPCheck.Sync method
Keep fetch+open+swap inline at both call sites (initial load in main,
background tick in the serve branch). No helper.
2026-04-20 16:43:38 -06:00
0c95156d4c
refactor(check-ip): add IPCheck.Sync for geoip, reuse for tick
geoip now syncs via an IPCheck.Sync() method that returns (updated, err)
— same signature as gitshallow.Repo.Sync / Fetcher.Fetch. The initial
load and the background refresh goroutine both call it, so there is no
duplicated fetch+open+swap logic.
2026-04-20 16:42:54 -06:00
b9295608db
feat(check-ip): accept IP args, require --serve or args
Positional args are IPs to check and print; at least one IP or --serve
must be provided. Refactor server.go: split handle into lookup + writeText
methods so main can reuse them.

geoip is no longer managed via dataset.Group — it's a single
atomic.Pointer[geoip.Databases]. The Fetcher (httpcache or PollFiles)
still drives refresh, but via an inline ticker in the serve branch that
fetches, reopens, and swaps.
2026-04-20 16:40:56 -06:00
6bcb493d02
refactor(check-ip): manage geoip via dataset.Group
Conditional Fetcher: httpcache cachers when GeoIP.conf basic auth is
present, dataset.PollFiles otherwise. geo is now a *dataset.View so the
background Tick in the serve branch refreshes it alongside blocklists.
2026-04-20 16:38:40 -06:00
35046bb17a
refactor(check-ip): rename ConfPath, resolve GeoIP conf + basic-auth early
IPCheck.ConfPath → GeoIPConfPath. After flag parsing, auto-discover the
conf path (when not explicit), parse it, and stash
cfg.GeoIPBasicAuth = httpcache.BasicAuth(...). The geoip download block
later just checks cfg.GeoIPBasicAuth != "" and uses the pre-built
value.
2026-04-20 16:35:32 -06:00
56a150826e
refactor: geoip opens tar.gz in place, no Transform, no intermediate mmdb
- httpcache.Cacher loses Transform (always atomic copy to Path); adds
  BasicAuth and Bearer helpers for Authorization header values.
- geoip.Open now reads <dir>/GeoLite2-City.tar.gz and GeoLite2-ASN.tar.gz
  directly: extracts the .mmdb entry in memory and opens via
  geoip2.FromBytes. No .mmdb files written to disk.
- geoip.Downloader/New/NewCacher/Fetch/ExtractMMDB removed — geoip is
  purely read/lookup; fetching is each caller's concern.
- cmd/check-ip/main.go is a single main() again: blocklists via
  gitshallow+dataset, geoip via two httpcache.Cachers (if GeoIP.conf
  present) + geoip.Open. No geo refresh loop, no dataset.Group for geo.
- cmd/geoip-update and the integration test construct httpcache.Cachers
  directly against geoip.DownloadBase + edition IDs, writing .tar.gz.
2026-04-20 16:27:32 -06:00
cb39f30d91
refactor(geoip,check-ip): inline literal mmdb filenames
Use 'GeoLite2-City.mmdb' / 'GeoLite2-ASN.mmdb' directly instead of
composing from the edition constants. Reads plainly — the actual
filename is right there.
2026-04-20 16:13:30 -06:00
359b740cec
refactor(geoip): Open takes dir, derives canonical edition paths
Filenames are deterministic (<dir>/GeoLite2-City.mmdb,
<dir>/GeoLite2-ASN.mmdb) — callers no longer pass both paths. cmd/check-ip
drops its cityPath/asnPath locals and just hands the maxmind dir to
geoip.Open and the fetcher builder.
2026-04-20 16:12:46 -06:00
9b92136f91
refactor(geoip,check-ip): lift download/refresh out of geoip into cmd
geoip.Open now just opens files; download/refresh/polling logic lives at
the cmd layer using dataset.Group with a combined httpcache.Cacher
fetcher (or PollFiles when no GeoIP.conf is available). Removes
geoip.OpenDatabases — the library is no longer concerned with refresh.
2026-04-20 16:10:51 -06:00
d8b6638d97
refactor(check-ip): defer signal ctx + Tick until server starts
Signal handling and periodic refresh are only meaningful when the HTTP
server runs. Load blocklists with context.Background(); start Tick and
the signal-aware ctx inside the serve branch.
2026-04-20 16:04:23 -06:00
7aa4493cb0
refactor(check-ip): IPCheck struct holds flag config + handler method
Follow golang-cli-flags pattern: config struct holds parsed flags and
loaded resources; handle and serve are methods on *IPCheck. Adds -V/help
pre-parse handling. Inlines clientIP into the handler.
2026-04-20 16:02:55 -06:00
0c281a494b
fix(check-ip): explicit error handling for UserCacheDir + geo.Close
2026-04-20 15:58:13 -06:00
b61ca0aa94
refactor(check-ip): collapse remaining server indirection
- drop format type / formatPretty / formatJSON / requestFormat /
  write / writeGeo — inline into one handler
- drop the inner check closure — inline into the handler
- one handler serves both GET / and GET /check
- fatal() replaced with log.Fatalf
- --serve is optional; without it, databases load and main returns
2026-04-20 15:57:48 -06:00
a84116f806
refactor: strip all optional/nil-guard plumbing from check-ip + geoip
- drop Checker struct, loadCohort helper, and contains() nil-wrapper
- inline check logic into server as a closure
- geoip.Databases: no nil-receiver guards, no nil-field branches, no
  "disabled" mode. City + ASN are both required; caller hands explicit
  paths and OpenDatabases returns a fully-initialized value or an err
- main.go is now straight-line wiring with no helper functions
2026-04-20 15:55:55 -06:00
cdce7da04c
refactor(check-ip): simplify to 4 flags, push MkdirAll into libs
check-ip now takes only --serve, --geoip-conf, --blocklist-repo,
--cache-dir. Blocklist always comes from git; GeoIP mmdbs always go
through httpcache (when GeoIP.conf is available). Format negotiation
lives entirely server-side.

main.go is now straight-line wiring: parse flags, build the two
databases, run the server. All filesystem setup (MkdirAll for clone
target, for cache Path parents) is pushed into gitshallow and
httpcache so the cmd doesn't do filesystem bookkeeping.
2026-04-20 15:51:46 -06:00
3b5812ffcd
feat(dataset): add PollFiles fetcher for local-file sources
Stats the given paths and reports updated when any size/modtime
changes since the last call. First call always reports true so the
initial Load populates views.

check-ip uses it for --inbound/--outbound so edits to local lists
get picked up by Group.Tick without a restart.
2026-04-20 15:39:23 -06:00
7b798a739a
refactor(check-ip): split server into server.go, linearize main.go
main.go now reads top-to-bottom as setup + usage of the three
databases (blocklists group, whitelist cohort, geoip readers), then
dispatch to one-shot or serve. HTTP server code moved to server.go.

No behavior change.
2026-04-20 15:36:02 -06:00
786463cecd
refactor(dataset): Tick takes an onError callback, no more stderr
Libraries shouldn't decide where errors go. Tick now passes Load
errors to onError (nil to ignore); callers pick log/count/page.
check-ip supplies its own stderr writer.
2026-04-20 14:19:26 -06:00
912e1179d4
feat(check-ip): --format pretty|json, move rendering out of geoip
geoip.Databases now exposes a structured Lookup(ip) Info. Rendering
moved up to the cmd — the library no longer writes to io.Writer.

check-ip adds a Result struct and --format flag (pretty/json). Serve
mode dispatches on ?format=json or Accept: application/json. Pretty
is the default for both one-shot and HTTP.
2026-04-20 14:18:39 -06:00
a3d657ec61
fix(check-ip): create cache dir before httpcache writes into it
httpcache.Cacher.Fetch writes to <path>.tmp without MkdirAll; the
library expects the caller to own the directory. cacheDir now
MkdirAll's before returning.
2026-04-20 14:15:52 -06:00
82f0b53ba3
feat(check-ip): add --serve HTTP mode to exercise dataset.Tick
Long-running server exposes GET / (client IP) and GET /check?ip= for
ad-hoc lookups. signal.NotifyContext drives graceful shutdown; the
shared dataset.Group.Tick goroutine refreshes inbound/outbound views
in the background so the refresh path gets real exercise.

Factored the shared populate+report logic into a Checker struct so
oneshot and serve modes use the same code path.
2026-04-20 14:10:29 -06:00
11743c9a10
feat(sync/dataset): minimal group/view/fetcher for hot-swap refresh
Distilled from the previous net/dataset experiment and the inline
closure version in check-ip. Keeps what actually earned its keep:

  - Group ties one Fetcher to N views; a single Load drives all swaps,
    so shared sources (one git pull, one zip download) don't get
    re-fetched per view.
  - View[T].Value() is a lock-free atomic read; the atomic.Pointer is
    hidden so consumers never see in-flight reloads.
  - Tick runs Load on a ticker with stderr error logging.

Dropped from the v1 design: MultiSyncer (callers fan-out inline when
needed), Close (unused outside geoip), Name (callers wrap the logger),
standalone Dataset type (Group with one view covers it), Sync vs Init
asymmetry (Load handles first-call vs update internally).

check-ip rewires to use it — file/git/http modes all build a Group
with two views, uniform shape.
2026-04-20 13:33:05 -06:00
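The Group/View relationship from 11743c9a10 can be sketched with generics and atomic.Pointer. Everything here (`view`, `group`, `addView`) is a hypothetical reduction of the design described above, not the package's actual API: one fetch per Load, N views swapped afterwards, and lock-free reads through Value.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// view hides an atomic.Pointer; readers only ever see a fully loaded value.
type view[T any] struct{ p atomic.Pointer[T] }

// Value is a lock-free read of the current snapshot (nil before the first Load).
func (v *view[T]) Value() *T { return v.p.Load() }

// group ties one fetch step to the reload hooks of all its views.
type group struct {
	fetch   func() (updated bool, err error)
	reloads []func() error
}

// addView registers a loader with the group and returns the view it fills.
func addView[T any](g *group, load func() (*T, error)) *view[T] {
	v := &view[T]{}
	g.reloads = append(g.reloads, func() error {
		next, err := load()
		if err != nil {
			return err // keep serving the previous value
		}
		v.p.Store(next)
		return nil
	})
	return v
}

// Load fetches once and, if anything changed, reloads every view.
func (g *group) Load() error {
	updated, err := g.fetch()
	if err != nil || !updated {
		return err
	}
	for _, reload := range g.reloads {
		if err := reload(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fetches := 0
	g := &group{fetch: func() (bool, error) { fetches++; return true, nil }}

	inbound := addView(g, func() (*[]string, error) { return &[]string{"192.0.2.1"}, nil })
	outbound := addView(g, func() (*[]string, error) { return &[]string{"198.51.100.7"}, nil })

	if err := g.Load(); err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Println(fetches, *inbound.Value(), *outbound.Value())
}
```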
5985ea5e2d
refactor(geoip): drop dataset dep, become barebones load/open/get
Databases is now just two *geoip2.Reader fields with Open/Close/PrintInfo.
OpenDatabases still auto-discovers conf and downloads stale .mmdb files
via httpcache before opening, but it no longer runs background goroutines
or holds atomic pointers. Long-running callers that want refresh can wire
httpcache.Cacher to atomic.Pointer themselves.

check-ip drops geo.Init/geo.Run — OpenDatabases does the fetch+open work
itself, and a one-shot CLI doesn't need background refresh.
2026-04-20 13:20:34 -06:00
990b9e430c
refactor(check-ip): drop dataset pkg, inline atomic-swap + ticker
Uses atomic.Pointer[ipcohort.Cohort] directly and builds a per-source
refresh closure (files / git / http). One goroutine drives the ticker.
Exercises what the dataset pkg was abstracting so we can judge which
bits are worth a shared pkg.
2026-04-20 13:16:47 -06:00
9e9bd98540
refactor(check-ip): factor source selection, keep demo of all three backends
Extract the file/git/httpcache mode switch into newSource and the Group
wiring into newBlocklists. main becomes flag parsing + exit code logic
only; run owns ctx and the check. Helpers (loadCohort, cacheDir,
splitCSV, loadWhitelist) are small and single-purpose.

Still exercises dataset.Group + background refresh, gitshallow, and
httpcache as before.
2026-04-20 13:05:37 -06:00
bf4cba6fb5
feat: add bitwireGitURL const, show default URL in --git flag help
2026-04-20 12:55:08 -06:00
8c9924e559
refactor: check-ip uses CheckIPConfig struct + flag.FlagSet, adds -V/help
2026-04-20 12:54:13 -06:00
f5f992ae94
refactor: move geoip setup into geoip.OpenDatabases, remove cmd/check-ip/geo.go
OpenDatabases(confPath, cityPath, asnPath) handles conf discovery, cache
dir setup, and Databases construction. DefaultConfPaths lists the standard
GeoIP.conf locations. cmd/check-ip/geo.go deleted; main calls one function.
2026-04-20 12:51:50 -06:00
994d91b2bf
refactor: dataset.Add returns *Dataset, no View; main uses Group for all cases
Remove View[T] — Add now returns *Dataset[T] directly. Callers use Load()
on the returned Dataset; Init/Run belong to the owning Group.

main.go simplified: declare syncer + file paths per case, then one
g.Init() and one g.Run(). No manual loops over individual datasets.
Add gitshallow.Repo.FilePath helper.
2026-04-20 12:48:38 -06:00
03ea6934e9
refactor: HTTP datasets are independent, no Group; Group only for shared git repo
2026-04-20 12:41:07 -06:00
7b71dec445
feat: gitshallow.File for per-file path/open/sync; use in check-ip git case
2026-04-20 12:39:24 -06:00
6b420badbc
refactor: merge blacklist.go into main.go via dataset.MultiSyncer
2026-04-20 12:23:13 -06:00
3ac9683015
style: use blacklist/whitelist (industry standard)
2026-04-20 12:20:59 -06:00
e1108f3de7
fix: explicit path flags for blocklist; auto-discover GeoIP.conf
Blocklist:
- Add -inbound, -outbound, -whitelist flags for explicit file paths
- buildSources() replaces the old constructor trio; explicit flags always win
- -data-dir and -git still work as defaults for the bitwire-it layout

GeoIP:
- Auto-discover GeoIP.conf from ./GeoIP.conf then ~/.config/maxmind/GeoIP.conf
- If no conf found and no -city-db/-asn-db given: geoip disabled silently
- If no conf but paths given: use those files (Init fails if absent)
2026-04-20 12:18:33 -06:00
ddd0986e20
refactor: push complexity into packages; main.go is orchestration only
- geoip.Databases: wraps city+ASN datasets with nil-safe Init/Run/PrintInfo
- geoip.(*Downloader).NewDatabases: builds Databases from downloader
- cmd/check-ip/geo.go: setupGeo() handles conf parsing, dir creation, DB path resolution
- cmd/check-ip/blacklist.go: isBlocked() + cohortSize() moved here
- cmd/check-ip/main.go: flags, source selection, init, check, print — nothing else
2026-04-20 12:15:14 -06:00