Building the venue before the model showdown: a model catalog to pick the contender, and the groundwork for a verified Milwaukee inventory.
There are an absurd number of AI models out there, more landing every week, and the question that actually matters — which one can I use for this? — is buried under marketing and leaderboards that all measure slightly different things. So I built a catalog to cut through it. Describe what you need in plain English — say, “an open-weight model good at legal summarization” — and it returns only models that are actually in the catalog, ranked, with the reason each one matched. Never an invented model, never a made-up spec. And you can still browse and compare them directly by the attributes that decide whether one fits a real project.
Each entry lays out the things you’d otherwise have to dig for — who made it, how big it is and what that implies for running it, how much context it can hold, whether the weights are open or closed, what the license allows commercially, and a plain-language summary of what it’s actually good at — with a short editorial take and comparable models alongside. Technical terms in the writeups are wrapped with plain-English definitions you can hover for, so the explanation comes to you instead of sending you off to look it up.
It’s an early version and I’ll keep filling and refining it, but it’s live and usable now. Regular readers will recognize the bigger reason it exists: this is the tool I’ll use to pick the model that goes into the Milwaukee assistant build — the contender, in this week’s Building Intelligence terms. Have a look and tell me what’s missing.
This series has been pointing at a showdown for months now: a small, purpose-built Milwaukee tech assistant going up against the big general-purpose models with web search, judged head-to-head on a fixed rubric. That’s the whole bet — that something narrow and carefully fed can beat something vast and generic on its home turf. This week was not the fight. It was building the place the fight happens. You don’t stage a contest without a venue, and the venue didn’t exist yet.
So I spent the week on the website. hardais.com is where this model will eventually live — not a slide about a model, an actual thing you can query — and a fair showdown needs three things that weren’t there a week ago: a contender to put in the ring, a ground truth to judge its answers against, and a place to hold the whole thing. I made progress on the first two.
The contender first. You can’t put “an AI” in a ring — you have to pick a specific model, and there are a staggering number to pick from, with new ones landing weekly. So I built a model catalog: a search-and-compare feature for narrowing the field down to a model I can actually work with. That last part matters, because “best in general” and “best for what I’m doing” are different questions — I need one I can run, shape, and afford, not just one that tops a leaderboard. It launched this week and it’s in the Announcements above; consider it the tool I’ll use to choose the fighter.
The ground truth is the Milwaukee inventory — the database the assistant will draw from. Last week’s edition was about the hardest part of that: deciding what doesn’t go in, and building a schema that enforces the discipline instead of leaving it in my head. With that structure finally locked, this week it became easy to build a friendly way to actually fill it — and that turned out to be a small lesson in its own right, which is this week’s Under the Hood. The short version: getting the rigorous part right first is exactly what made the easy part easy.
But “fill it” doesn’t mean scrape the web and dump it in. The entire premise is that nothing goes in unverified, and a lot of what makes a community real isn’t published anywhere — it lives in the heads of the people running it. So this week I put correspondence out to several players in the Milwaukee tech scene, asking something more specific than “can I list you”: would they consent to being a source — someone I can point to when I claim a fact is true. That’s why the database carries consent flags on people and a paper trail on every source: a ground truth made of real Milwaukee folks who said yes is a very different thing from a list I assembled by guessing. The letters are out. I’m waiting to hear back, and I won’t pretend the inbox is full yet.
So that’s the honest state of things: the arena is going up, but the bell hasn’t rung. The contender-selection tool is live, the ground truth has a structure and its first real outreach, and the place it all lives is taking shape. The actual test — the model against the baselines, scored — only means anything once the venue is real, and the venue gets built before the fight, not during it. Next week tells me whether the inventory starts filling with real, consented, sourced organizations. Until then: the stage, not the show.
Building Intelligence this week made a claim in passing: that getting the rigorous part right first is what made the easy part easy. This is the easy part. With the inventory’s structure finally locked, I needed a way to actually put organizations into it — and I let an AI build that for me in an afternoon. The interesting thing isn’t that I did it fast. It’s why doing it fast and loose was a safe choice rather than a reckless one.
Start with how data actually gets into a database. The commands that talk to a database come in two flavors, and both have names worth knowing. DDL — Data Definition Language — is the set of commands that define the structure: “create a table called organizations, give it these columns.” That’s the work I did last week building the schema. DML — Data Manipulation Language — is the set that handles the contents: “insert a row, put this name here, this website here.” Filling the inventory is a DML job. The old-fashioned way to do it is to hand-type those DML commands one organization at a time — which works, but it’s tedious and unforgiving: one fumbled line and you’ve quietly entered a broken or half-filled row. The friendlier way is a GUI — a graphical user interface, which just means a screen with labeled boxes and dropdown menus where you fill in a form, click Save, and something else writes the DML for you. I wanted the form. So I vibe-coded it: I described what I wanted in plain English to an AI, it wrote the code, and I shaped it by reaction — “make that a dropdown,” “move that field” — instead of writing a line of it myself.
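To make the DDL/DML split concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The database engine, the table layout, and the save_organization helper are assumptions for illustration only; the actual inventory schema and the generated form code are not shown in this piece.

```python
import sqlite3

# An in-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# DDL: define the structure. Column names here are hypothetical.
conn.execute(
    """
    CREATE TABLE organizations (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        website TEXT
    )
    """
)


def save_organization(conn, name, website):
    """DML: what a form's Save button might do on your behalf."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO organizations (name, website) VALUES (?, ?)",
            (name, website),
        )
    return cur.lastrowid


new_id = save_organization(conn, "Example Org", "https://example.org")
rows = conn.execute("SELECT id, name, website FROM organizations").fetchall()
print(new_id, rows)
```

The CREATE TABLE statement runs once, when the structure is defined. The INSERT runs every time someone clicks Save, which is why it is the part worth hiding behind a form.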
Here’s why that should make you nervous, in general. Vibe coding produces something that looks right very quickly, and “looks right” is exactly the trap: an AI will confidently generate code that does something subtly wrong, and you may not notice until the damage is done. Pointing that loosely built tool straight at the database that’s supposed to be my trustworthy source of truth sounds like a great way to fill it with quiet garbage.
It isn’t, and the reason is last week’s work. The rigor doesn’t live in the form — it lives in the database underneath it. All those rules I built into the schema (a field that can’t be left blank, an entry that has to point at a real cited source, a switch that defaults to “no” until a human says otherwise) are enforced by the database itself, no matter what hands it the data. The form is just a messenger. If the vibe-coded GUI tries to save something that breaks one of those rules, the database refuses it and hands back an error. The guardrails are in the foundation, so the convenience layer bolted on top is allowed to be casual — it physically cannot write a row the structure forbids.
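The three kinds of rules named above (a field that can’t be blank, an entry that must point at a real source, a switch that defaults to “no”) can be sketched as database constraints. This is a hedged illustration in Python with sqlite3, not the real schema: the table names, column names, and the try_save helper are all hypothetical, and SQLite itself is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE sources (
        id       INTEGER PRIMARY KEY,
        citation TEXT NOT NULL
    );
    CREATE TABLE organizations (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,                             -- cannot be blank
        source_id INTEGER NOT NULL REFERENCES sources(id),   -- must cite a real source
        approved  INTEGER NOT NULL DEFAULT 0
                  CHECK (approved IN (0, 1))                 -- "no" until a human says otherwise
    );
    """
)

with conn:
    conn.execute(
        "INSERT INTO sources (citation) VALUES (?)",
        ("Email from an organizer (hypothetical)",),
    )


def try_save(conn, name, source_id):
    """Stand-in for the form's Save button: the database decides, not the form."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO organizations (name, source_id) VALUES (?, ?)",
                (name, source_id),
            )
        return "saved"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


blank_name = try_save(conn, None, 1)            # breaks NOT NULL
missing_source = try_save(conn, "Example Org", 999)  # no such source row
valid = try_save(conn, "Example Org", 1)
approved = conn.execute(
    "SELECT approved FROM organizations WHERE name = ?", ("Example Org",)
).fetchone()[0]
print(blank_name, missing_source, valid, approved, sep="\n")
```

Note that try_save contains no validation logic of its own. The two bad rows are refused by the database, and the good row lands with approved set to 0 because nothing told it otherwise.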
And the structure didn’t just permit the form; it shaped it into something that nudges me toward clean data by default. Because the database already defines the fixed set of, say, allowed source types, the GUI can read that list and turn it into a dropdown automatically: I’m picking from known options, not free-typing “website” one day and “web site” the next and creating two things where there’s one. The form also puts the “raw, as-found” description and the “human-approved” description in two separate boxes, so the act of curating is built into the act of entering. The form is good not because the AI is clever, but because the schema gave it a clean shape to fill.
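One way a form could get its dropdown from the database is a small lookup table that the GUI queries for its options. The sketch below assumes that design; the source_types table, its three values, the two description columns, and the dropdown_options helper are hypothetical names chosen for illustration, and the real inventory may store its allowed values differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- The fixed set of allowed values lives in the database, not in the form.
    CREATE TABLE source_types (name TEXT PRIMARY KEY);
    INSERT INTO source_types (name) VALUES ('email'), ('interview'), ('website');

    CREATE TABLE sources (
        id                   INTEGER PRIMARY KEY,
        source_type          TEXT NOT NULL REFERENCES source_types(name),
        raw_description      TEXT NOT NULL,  -- as found
        approved_description TEXT            -- filled in only after human review
    );
    """
)


def dropdown_options(conn):
    """Read the allowed source types so the form can render them as a dropdown."""
    query = "SELECT name FROM source_types ORDER BY name"
    return [row[0] for row in conn.execute(query)]


options = dropdown_options(conn)

# A free-typed variant that is not in the list is refused by the database.
try:
    with conn:
        conn.execute(
            "INSERT INTO sources (source_type, raw_description) VALUES (?, ?)",
            ("web site", "as found on the page"),
        )
    free_typed = "saved"
except sqlite3.IntegrityError:
    free_typed = "rejected"

print(options, free_typed)
```

Because the options come from a query instead of being hard-coded in the form, adding a new source type is a one-row change in the database and the dropdown follows along.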
So here’s the lesson, and it cuts against how vibe coding usually gets sold. It’s pitched as a way to skip the hard part. What actually happened is the opposite: the hard part is the only reason the shortcut was safe. I did the slow, careful structural thinking where it counted — the schema, the rules, the defaults — and that’s precisely what earned me the right to be fast and loose on the layer where it didn’t. Schema before code, I’ve said before. This is the “before code” part paying off: the front door’s built, it can’t let anything ugly through, and now the only thing left is to walk the real organizations in.
“Curiosity is the engine of achievement.”
Ken Robinson
Ken Robinson was a British author, speaker, and international advisor on education in the arts to government, non-profits, education, and arts bodies. He is best known for his work on promoting creativity and innovation in education. Robinson was a professor emeritus at the University of Warwick in the UK and was knighted for his contributions to the arts.