OpenAI cancels experimental projects while ChatGPT explores ad-supported tiers. Learn about training epochs, prompting tips, and our new section, The Rabbit Hole.
We're introducing a new section called The Rabbit Hole — and the name is intentional.
The Rabbit Hole won't appear every week — only when there's something genuinely worth going deep on. When it does show up, it drops into one idea, story, or question that sits at the edges of what AI can — and can't — do. These won't always be comfortable reads. Some entries will be mind-bending. Some will challenge assumptions. Some will raise more questions than they answer. That's the point.
Think of it as the section where curiosity gets to run a little wild — no agenda, no easy takeaways, just something genuinely worth thinking about.
The Learning Loop
Google Gemini
DEFINITION: Epoch
One complete pass of the entire training dataset through an AI model during the learning process, used to help the model refine its accuracy over time.
Source: AWS
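To make the definition concrete, here is a toy Python sketch. The "model" is just a single number nudged toward each sample (a stand-in for gradient updates, not a real network), and all values are invented; the point is only that an epoch means one complete pass over the dataset.

```python
# Toy illustration of an epoch: one complete pass over the training data.
# The "model" here is a single number nudged toward each sample -- a
# stand-in for gradient updates, not a real neural network.
dataset = [2.0, 4.0, 6.0, 8.0]

def train(data, epochs):
    weight = 0.0
    passes = 0
    for _ in range(epochs):        # each iteration is one epoch
        for sample in data:        # every sample is seen once per epoch
            weight += 0.1 * (sample - weight)  # small corrective update
        passes += 1                # one full pass completed
    return weight, passes

weight, passes = train(dataset, epochs=3)
# more epochs -> more passes -> the weight settles closer to the data
```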
TIP: Ambiguity Check
Include the instruction "Ask me 3 clarifying questions before you begin to ensure the output meets my specific needs" to improve accuracy on complex tasks.
Source: LearnPrompting
TOOL: HeyGen
An AI-powered video platform that enables users to generate professional-quality videos with digital avatars and cloned voices for training and marketing.
DEFINITION: Attention
A neural network component that helps AI models focus on the most relevant parts of input data when making predictions. Think of it like a spotlight that dynamically highlights important information: when translating "The cat sat on the mat," attention helps the model focus on "cat" when generating the corresponding word in another language, rather than getting distracted by less relevant words.
Source: ArXiv
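A minimal numeric sketch of the idea, with made-up relevance scores (real models learn these during training): the softmax turns scores into weights that sum to one, and the highest-scoring word gets the biggest share of the spotlight.

```python
import math

# Toy attention: relevance scores -> softmax weights -> a "spotlight".
# The scores are invented; real models learn them during training.
def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# How relevant each source word is when generating the next output word.
scores = {"The": 0.1, "cat": 2.0, "sat": 0.5, "mat": 0.3}
weights = dict(zip(scores, softmax(list(scores.values()))))
# "cat" gets the largest weight, so it dominates the weighted average
# the model actually attends to.
```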
TIP: Output Formatting
Use explicit formatting instructions with examples in your prompts. Instead of asking "list the benefits," try "List 3 benefits using this format: • Benefit: [description in 10-15 words]." This ensures consistent, professional output that's ready to use in reports or presentations without manual reformatting.
Source: OpenAI Best Practices
TOOL: Stable Diffusion
An open-source text-to-image AI model that runs locally or through various online interfaces, offering high-quality image generation with full control over parameters and, unlike many commercial alternatives, no usage limits.
DEFINITION: Tokenization
Tokenization is the process of breaking down text into smaller, manageable pieces called tokens, which can be words, characters, or subwords. This helps AI models process and understand text data efficiently.
Source: Wikipedia
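A minimal illustration in Python: this naive tokenizer just splits on words and punctuation, whereas production tokenizers (BPE, WordPiece) learn subword vocabularies, but the underlying idea is the same.

```python
import re

# Naive tokenizer: words and punctuation become separate tokens.
# Real LLM tokenizers (BPE, WordPiece) split into learned subwords instead.
def tokenize(text):
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Tokenization helps AI models process text.")
# -> ['Tokenization', 'helps', 'AI', 'models', 'process', 'text', '.']
```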
TIP: Layered Prompt Design
When crafting prompts, use layered prompts where the initial input is broad, followed by more specific questions or instructions. This technique improves the specificity and accuracy of responses in complex workflows.
Source: PromptingGuide
TOOL: Copy.ai
Copy.ai offers AI-powered writing assistants for generating marketing copy, blog content, and more. It simplifies content creation with a range of templates and tools.
The AI Airfare Research tool at hardais.com/airfare
is a demonstration of what AI-powered tooling looks like in practice — not a chatbot,
not a generic search wrapper, but a purpose-built interface that gives users 8 distinct
ways to interrogate flight data using natural language and AI reasoning. This week I want
to pull back the curtain on how it actually works, who the players are at each endpoint,
and where the honest limitations and opportunities lie.
The Stack at a Glance
The tool is built on Next.js 16 using the App Router pattern with
TypeScript. The frontend component (AirfareSearch.tsx) handles user input
and renders results. All AI and data calls happen server-side through a Next.js API
route (/api/airfare/route.ts) — meaning API keys never touch the browser.
The tool is deployed on Vercel.
The Data Layer — SerpApi & Google Flights
When a user submits a search, the API route first calls SerpApi —
a third-party service that programmatically queries Google Flights
and returns structured JSON flight data. This is the raw material: routes, prices,
airlines, durations, layovers.
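As a rough sketch of what that server-side call looks like, here is how the query parameters might be assembled in Python. The field names mirror SerpApi's Google Flights engine, but treat the exact payload as an illustrative assumption rather than the tool's actual code (which is TypeScript):

```python
# Hypothetical sketch of the server-side query the API route assembles.
# Field names mirror SerpApi's Google Flights engine, but are assumptions
# here -- the real route is TypeScript and its exact payload isn't shown.
def build_flight_query(origin, destination, date, api_key):
    return {
        "engine": "google_flights",  # SerpApi engine selector
        "departure_id": origin,      # IATA airport code
        "arrival_id": destination,
        "outbound_date": date,       # YYYY-MM-DD
        "api_key": api_key,          # stays server-side, never in browser
    }

params = build_flight_query("MKE", "DEN", "2026-04-10", "SECRET_KEY")
```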
This is an important transparency point. The data source is Google Flights
exclusively. SerpApi is essentially a structured bridge to that single source.
The tool is not crawling aggregators, scraping budget airline sites, or pulling from
a multi-source pricing database. If a deal exists outside of Google Flights' index,
this tool will not find it. That is a known ceiling and an area we intend to improve
on as the platform matures — potential directions include integrating additional data
sources such as Skyscanner or Kayak APIs to broaden coverage.
The Intelligence Layer — Claude Sonnet
Once SerpApi returns flight data, that structured JSON is passed as context to
Claude Sonnet (claude-sonnet-4-20250514) via the Anthropic API.
Claude's job is not to find the data — SerpApi already did that. Claude's job is
to reason about it. Depending on which of the 8 search modes the
user selected, the prompt instructs Claude to analyze the data differently:
Flight Price Search — real-time fares with AI analysis and best value picks
Total Trip Cost — true door-to-door cost including bags and ground transport
Where Should I Go? — tell AI your budget and vibe, get destination picks
Flights Matched to You — flights ranked by your personal travel preferences
When Should I Book? — AI timing advice on whether to buy now or wait for a drop
Smarter Booking Strategies — hidden city fares, points vs. cash, and insider tactics
Group Trip Planner — coordinate flights from multiple cities at optimal cost
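The mode-switching idea can be sketched simply: one payload of flight JSON, a different instruction per mode. The mode keys echo the list above, but the instruction wording and helper function are invented for illustration, not the tool's actual prompts.

```python
import json

# Illustrative mode-to-instruction map; the wording is invented, not the
# tool's real prompts. Same flight JSON, different analytical lens.
MODE_INSTRUCTIONS = {
    "price_search": "Highlight the best-value fares and explain why.",
    "book_timing": "Advise whether to buy now or wait for a price drop.",
    "group_trip": "Coordinate departures from multiple cities at optimal cost.",
}

def build_prompt(mode, flight_data):
    # The model sees the instruction first, then the structured data.
    return f"{MODE_INSTRUCTIONS[mode]}\n\nFlight data:\n{json.dumps(flight_data)}"

prompt = build_prompt("book_timing", {"route": "MKE-DEN", "price": 212})
```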
The same underlying flight data produces meaningfully different outputs depending
on the mode — that is the value AI adds here. A user could theoretically go to
Google Flights directly and get the same raw data, but replicating 8 different
analytical lenses manually would be time-consuming and inconsistent.
Where Claude Is Today — And What That Means
Claude Sonnet is a powerful general-purpose model. It reasons well, writes clearly,
and handles ambiguous instructions gracefully. For a demonstration tool, it is an
excellent choice. However, general-purpose does carry trade-offs in a specialized
domain like airfare. A model fine-tuned on travel pricing dynamics, seasonal demand
patterns, and airline-specific policies would produce sharper, more confident
recommendations. That kind of domain specialization is on the roadmap.
Additionally, because Claude is reasoning over a snapshot of data returned by a
single API call, it has no memory of previous searches, no longitudinal pricing
context, and no ability to monitor prices over time. Each search is stateless and
independent. Again — known limitation, known improvement path.
The Honest Summary
The tool works. It is genuinely useful for quickly exploring flight options through
multiple analytical lenses without spending 45 minutes tabbing between browser windows.
But it is a demonstration — an honest one. The data ceiling is Google Flights. The
intelligence layer is general-purpose. The architecture is clean and extensible.
The next iterations will push on both the data breadth and the model specialization.
That is how responsible AI development should work — ship something real, be transparent
about what it is and is not, and improve deliberately.
This week's Fireside Chat has no comfortable answer — which is exactly why we asked it.
In February, Anthropic drew a hard line with the U.S. Department of Defense: Claude would not be used for mass surveillance of Americans or to power autonomous weapons operating without human oversight. The Pentagon's response was swift — Anthropic was designated a "supply chain risk," effectively blacklisting the company from federal contracts. Anthropic sued. Employees from Google and OpenAI publicly backed them. OpenAI, meanwhile, took the contract.
Two legitimate arguments are in direct collision here. One side: companies that build AI understand its risks better than anyone, and they have both the right and the responsibility to set ethical limits on its use. The other side: private corporations — however well-intentioned — shouldn't hold veto power over democratically accountable governments on matters of national defense.
We gave all three models the prompt below and asked them to commit to a position, then steelman the other side — with a hard 200-word cap to keep it sharp. A few things worth watching for as you read: Does any model hedge instead of committing? Does taking a clear position produce a better or worse response? And does the model whose company is most directly named in the prompt handle it any differently than the other two?
Read the responses below and decide for yourself where the line should be.
The Prompt:
AI companies like Anthropic have refused to allow their models to be used for certain military applications — including autonomous weapons and mass surveillance — citing ethical limits. The U.S. Department of Defense responded by labeling Anthropic a supply chain risk, effectively blacklisting them from government contracts. Who should have the final say over how AI is used in national defense: the companies that build it, or the governments that fund and regulate them? Make the strongest case you can for one side, then steelman the other. Limit your total response to 200 words.
Google Gemini (gemini-3-flash-preview)
**The Case for Government Sovereignty** In a democracy, elected officials—not private CEOs—must determine national security strategy. National defense is a core sovereign function; if private corporations dictate the parameters of military readiness, they exercise unelected veto power over a nation’s ability to protect its citizens. This undermines democratic accountability, shifting power from the public to a handful of tech executives. Furthermore, the Department of Defense is justified in viewing selective cooperation as a supply chain risk. A nation cannot build its security architecture on "black box" tools that might be throttled or deactivated mid-conflict due to a corporate board’s shifting moral pivot. Sovereignty requires that the state maintains control over the instruments of its survival.
**The Steelman for Corporate Autonomy** Conversely, AI represents an unprecedented paradigm shift in lethality and surveillance. Developers, as the primary experts on these systems, have a moral obligation to prevent their inventions from facilitating mass human rights abuses or unaccountable autonomous killing. By setting ethical boundaries, companies act as a necessary check on state overreach, preventing a global "race to the bottom" where safety is sacrificed for dominance. They argue that profit and patriotism do not mandate the surrender of conscience.
Anthropic Claude (claude-sonnet-4-20250514)
**Case for Companies:**
AI companies must retain control over their technology's military applications. Unlike traditional defense contractors, AI systems have dual-use potential that could fundamentally reshape warfare and society. Companies possess the deepest technical understanding of their models' capabilities and risks, making them best positioned to identify dangerous applications.
Market dynamics provide natural oversight—companies risk reputational damage and talent flight if they enable harmful uses. Democratic accountability works through consumer choice and investor pressure, not just government mandate. Private companies can move faster than bureaucratic processes to address emerging ethical concerns.
Most critically, ceding control to any single government creates dangerous precedents. What if authoritarian regimes demand similar access? Companies serving as independent gatekeepers helps prevent AI from becoming purely a tool of state power.
**Steelman for Governments:**
Democratically elected governments have legitimate authority over national defense decisions. Citizens, through their representatives, should determine acceptable military technologies—not unelected corporate executives maximizing profit.
Companies' "ethical stances" may actually reflect competitive positioning or public relations rather than genuine moral reasoning. Government oversight ensures military AI serves broader national interests, not corporate shareholders.
National security requires coordinated strategy impossible when private companies make ad-hoc decisions about military cooperation.
OpenAI GPT (gpt-4o)
**Case for Companies:** AI companies, like Anthropic, possess specialized expertise and a deep understanding of the ethical implications of their technologies. These companies are often driven by principles that prioritize ethical considerations and long-term societal benefit. By refusing certain military applications, they uphold ethical standards that ensure AI developments are harmonized with human values. This safeguards against misuse of technology in ways that could harm civilian populations or violate international law.
**Steelman for Governments:** Governments, responsible for national security and the welfare of their citizens, argue that AI is a critical component of modern defense strategies. They possess a broader perspective on geopolitical challenges and the necessity of staying ahead in military technology to deter threats. Having oversight over AI deployment ensures it aligns with national interests and complies with regulations crafted to balance innovation and security. Additionally, governments can implement measures to prevent abuses and ensure ethical deployment, making them the primary arbiter of how AI is integrated into defense.
The Rabbit Hole
The Rabbit Hole — Edition 1
"Shock! Shock!" — When the Godfather of Computer Science Had to Revise His Opinion of AI
In late February 2026, Donald Knuth — the 88-year-old Stanford professor whose multi-volume The Art of Computer Programming is essentially the Bible of the field — published a five-page paper titled "Claude's Cycles." It opens with two words: "Shock! Shock!"
Knuth had spent weeks stuck on an unsolved graph theory problem he was preparing for a future volume of his book. Claude Opus 4.6 cracked it in about an hour. I wanted to understand what that actually meant — so I asked. The conversation below was conducted with Anthropic's Claude.
Caleb
Donald Knuth — a guy who literally wrote the book on algorithms — spent weeks on a math problem and an AI solved it in an hour. What actually happened here, in plain terms?
Claude
The problem was this: imagine a three-dimensional grid of points — an m×m×m cube — where each point has three outgoing paths, one in each direction. The challenge was to find a single general rule that could divide all those paths into exactly three giant loops, each visiting every point once, for any odd-sized cube. Knuth had solved the tiny 3×3×3 case by hand. A colleague verified it worked up to 16×16×16 by computer. But no one could find the general construction rule that proved it works forever. Knuth's colleague Filip Stappers fed the problem to Claude Opus 4.6, which spent an hour running 31 systematic explorations — trying brute force, testing algebraic patterns, hitting dead ends, changing strategies — and eventually landed on a construction that works for all odd dimensions. It described the pattern as "serpentine." Knuth then independently verified it and wrote the formal mathematical proof himself.
Caleb
So Claude found the answer but couldn't prove it was correct. Does that matter?
Claude
It matters, and it's the most important nuance in this story. In mathematics, a working construction is not the same as a proof. Claude demonstrated that its rule produced valid solutions for every odd dimension it tested — up to m=101. But demonstrating is not proving. Knuth read the construction, understood why it worked, and then wrote the proof. That step required decades of mathematical expertise that Claude doesn't have. What Claude did is closer to what a very determined, very fast research assistant does: explore the problem space systematically until something promising emerges, then hand it to the expert. The discovery is real. The credit is shared.
Caleb
Knuth was famously skeptical of AI. In 2023 he dismissed ChatGPT as basically faking it. What does it mean that he titled his paper "Claude's Cycles" and said he needs to revise his opinion?
Claude
It means something. Knuth is not someone who updates his views lightly or performs enthusiasm for an audience. When he writes "Shock! Shock!" and says he may need to reconsider generative AI, that's a calibrated statement from someone with 60 years of earned skepticism. What likely shifted his view is the style of Claude's exploration — it didn't look up an answer or pattern-match to a memorized solution. It behaved the way a mathematician actually thinks: form a hypothesis, test it, discard it when it fails, reframe the problem, try again. That process across 31 steps, resulting in a genuinely novel construction, is not what Knuth expected from a language model. It's worth noting that Claude's solution turned out to be one of 760 valid approaches — it found one path through a space that has many. That's also very human.
Caleb
Last question — and be honest. Is this a sign that AI is coming for mathematics, or is it more limited than the headlines suggest?
Claude
Both, honestly. This is a genuine milestone — not a benchmark performance or a demo, but a result that a domain expert verified and named a paper after. That matters. Since late 2025, over a dozen previously open mathematical problems have been moved to "solved" with AI credited in the solution. That's a trend, not a fluke. But the limitations are just as real: Claude couldn't handle the even-dimension cases of this same problem. A different model made partial progress there. No single AI solves everything. The honest picture is that AI is becoming a powerful research accelerant in mathematics — not a replacement for mathematicians, but a tool that can explore solution spaces at a speed and patience that humans can't match. The combination of human expertise and AI exploration is proving more powerful than either alone. Knuth still had to write the proof. He just wrote a different paper than he would have without Claude.
Source: Knuth, D. (2026). Claude's Cycles. Stanford University Department of Computer Science.
Read the original paper →
Lift-Off
Google Gemini
“The best technique is the one that is not noticeable.”
— Satyajit Ray — He was an influential Indian filmmaker, author, and illustrator who is widely regarded as one of the greatest auteurs of world cinema. Throughout his prolific career, he directed thirty-six films including the renowned Apu Trilogy and received an Honorary Academy Award for his lifetime achievements. His artistic philosophy emphasized the humanity of his characters and the subtle use of visual storytelling to reflect the complexities of social change.
Anthropic Claude
“The most beautiful thing we can experience is the mysterious. It is the source of all true art and science.”
— Albert Einstein — Albert Einstein was the Nobel Prize-winning theoretical physicist who developed the theories of special and general relativity and reshaped modern physics. He was renowned for his ability to express profound ideas in simple, vivid terms and for his insatiable curiosity about how the world works. Beyond physics, he was an accomplished violinist and a lifelong advocate for peace who believed that understanding nature required both rigorous thinking and childlike wonder.
OpenAI GPT
“Simplicity is the ultimate sophistication.”
— Leonardo da Vinci — Leonardo da Vinci was a painter, engineer, and scientist during the Renaissance, known for masterpieces like the Mona Lisa and The Last Supper. Beyond art, his curiosity led him to study and invent various mechanical designs. Despite centuries passing, his insights into art and science continue to inspire innovation and creativity across fields.
The Nest Jest
Google Gemini
[Cartoon image not available for this archived edition]
Anthropic Claude
[Cartoon image not available for this archived edition]
OpenAI GPT
[Cartoon image not available for this archived edition]
Upcoming Events
Event: 03/27/2026 9:00 AM
Project Management Techniques Course Training in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Professionals undergoing the PMP course gain insights into project governance, including ethical considerations, compliance, and regulatory requirements.
Global AI Community Milwaukee AgentConf 2025 — Online (Milwaukee, WI)
**SIGNUP HERE:** [https://globalai.community/agentcamp/milwaukee/](https://globalai.community/agentcamp/milwaukee/)
*NOTE: This event is managed outside of Meetup; sign up at the link above. Meetup signups will not get you in to the conference!*
Join us for a free community event focused on AI, where you can learn from experts, network with peers, and explore the latest AI technologies and trends.
What to Expect: Deep-Dive Talks from AI pioneers and industry leaders; Technical Workshop [Group: global-ai_milwaukee]
Hands-On : GitHub Copilot SpecKit — Online (Applied Information Sciences, Inc, 11440 Commerce Park Dr # 600, Reston, VA)
**Agenda:**
4:45 to 5:00 PM ET: Food and Networking
5:00 to 5:50 PM ET: "Hands-On: GitHub Copilot SpecKit"
Hello Everyone! Please join us for our April 14th edition of the AI-ML MeetUp. **Please note this is an in-person meeting and light refreshments/food will be provided. You will need a government-issued ID to enter the facility.**
**Title:** Hands-On: GitHub Copilot SpecKit
**Description:** Spec-Driven Development **flips the script** on traditional software development. For decad [Group: artificialintelligenceandmachinelearning]
Welcome to MKE Tech Meetup — Penrod – 219 N Milwaukee Street, Milwaukee, WI 53202
A curated social event orienting new tech employees into Milwaukee's tech and innovation ecosystem. Free and open to new tech workers in the region. Street parking available nearby.
Source: MKE Tech Hub Coalition
Event: 04/22/2026 9:00 AM
PMI-CPMAI Certification Training – 3-Day Bootcamp in Milwaukee, WI — 1433 N Water St, Milwaukee, WI
Master AI in project management and project management in AI concepts with PMI-CPMAI™. 4-day training, real use cases & exam prep.
Hands-On : Copilot Studio, Microsoft Fabric, Azure AI : Better Together — Online (Online event)
**Hands-On Online Workshop: Copilot Studio, Microsoft Fabric, Azure AI: Better Together**
**Date:** 09 May 2026, 10 AM to 5 PM Eastern Time
**Level:** Beginners/Intermediate
**Registration Link:** https://www.eventbrite.com/e/hands-on-copilot-studio-microsoft-fabric-azure-ai-better-together-tickets-1983680029367?aff=oddtdtcreator
**Who Should Attend?** This hands-on workshop is open to developers, senior software engineers, IT pros, architects, IT managers, citizen developers, technology prod [Group: artificialintelligenceandmachinelearning]
OpenAI abandons yet another side quest: ChatGPT’s erotic mode
OpenAI has discontinued development on a potential "erotic mode" for ChatGPT, marking the latest in a series of side projects the company has recently shuttered. This decision follows a week of streamlining at the AI startup as it narrows its product focus. The move underscores the company's ongoing efforts to refine its content policies and core service offerings. read more
News: 2026-03-27 — Wired
I Asked ChatGPT 500 Questions. Here Are the Ads I Saw Most Often
A recent study of ChatGPT’s new ad-supported free tier highlights how OpenAI is beginning to integrate marketing content into AI-driven conversations. Researchers analyzed responses to 500 different prompts to determine how ads are targeted and how frequently they appear to American users. The rollout marks a significant shift in OpenAI’s business strategy toward a revenue model supported by advertising. read more
News: 2026-03-27 — VentureBeat
Listen Labs raises $69M after viral billboard hiring stunt to scale AI customer interviews
AI startup Listen Labs has secured $69 million in Series A funding to expand its platform for conducting automated customer interviews. The startup gained notoriety for a cost-effective billboard campaign used to recruit engineers in the competitive San Francisco talent market. The new capital will be used to scale the company's machine learning technology, which automates qualitative feedback at scale. read more
Anthropic Claude
News: 2026-03-27 — TechCrunch
Anthropic wins injunction against Trump administration over Defense Department saga
A federal judge has ordered the Trump administration to rescind recent restrictions placed on the AI company. The temporary injunction clears the way for Anthropic to continue business operations without the supply-chain-risk designation. This legal victory allows the company to maintain its current business relationships and partnerships. read more
News: 2026-03-27 — VentureBeat
Claude Code costs up to $200 a month. Goose does the same thing for free.
Anthropic's Claude Code, a terminal-based AI agent for autonomous coding, has pricing ranging from $20 to $200 monthly. The steep pricing has sparked rebellion among the developers the tool aims to serve. A free alternative called Goose now offers similar AI coding capabilities without the costly subscription fees. read more
News: 2026-03-27 — WIRED
Meet the Tech Reporters Using AI to Help Write and Edit Their Stories
Independent tech journalists are increasingly using AI agents throughout their entire reporting process. The trend raises questions about the value and role of human journalists in an AI-assisted workflow. These reporters are leveraging artificial intelligence for both writing assistance and editorial tasks. read more
OpenAI GPT
News: 2026-03-27 — TechCrunch
You can now transfer your chats and personal information from other chatbots directly into Gemini
Google introduces new "switching tools" that streamline the process for users to migrate their conversational data from other chatbot services to Gemini, enhancing user experience and data management in AI communication platforms. read more
News: 2026-03-27 — VentureBeat
Railway secures $100 million to challenge AWS with AI-native cloud infrastructure
Railway, a San Francisco-based cloud platform, has raised $100 million in Series B funding to expand its AI-native infrastructure, aiming to improve scalability and performance for AI applications compared to traditional cloud services. read more
News: 2026-03-27 — VentureBeat
Salesforce rolls out new Slackbot AI agent as it battles Microsoft and Google in workplace AI
Salesforce has launched a revamped version of Slackbot, transforming it into a robust AI-driven workplace assistant capable of managing enterprise data and performing tasks autonomously, reflecting the intensified competition in workplace AI solutions. read more
Inquiring Minds
Conversations with AI Chatbots
I'm trying to understand the process of training an LLM. Let's imagine a scenario in which I basically have average local computing power. Thus far, I have successfully automated an AI Agent with Google Gemini that wakes up using a Windows task scheduler and creates a newsletter for me to review and release every Wednesday. I want to move to the next level. In fact, I am working on collecting data for my trading project and want to begin training an LLM on that data. How do I "grow up" in terms of utilizing an already trained LLM that I could fine-tune, or build and train an LLM with only my average computing power?

ChatGPT said:

Let's demystify this whole thing and strip it down to the bones. You don't need a DGX, you don't need a cluster, and you don't need to be OpenAI's stunt double to "grow up" into real model training. You just need to understand which type of training you can realistically do on average hardware, because there are three different levels, and most people confuse them. We're going to make this simple, accurate, and grounded in what you can do right now without the compute you'd find at a secret government moon base.

THE THREE LEVELS OF TRAINING AN LLM

This is the part people misunderstand. "Training" is an overloaded word. Here's the real taxonomy:

LEVEL 1 — Fine-Tuning (LoRA / QLoRA / Adapter Training)

This is what 95% of practical people do.
You take an already-trained LLM, such as:
• Llama 3
• Mistral
• Gemma
• Phi
• Qwen
• or a Bonsai version of Gemini (if Google ever exposes that)

…and you teach it a specialization by adding small "adapter weights."

This is:
• cheap
• fast
• doable on a 4090 laptop
• doable even on your existing machine if the model is small
• ideal for domain knowledge (like your futures trading dataset)

What this realistically gives you:
• Your own trading-trained model
• Behavioral specialization (your style of writing, analysis, reasoning)
• Domain knowledge on open interest, volume profile, Market Profile, Asian session dynamics
• Terminology and concepts native to your system (StratTrak, AssetTimeFrame, conviction measures, etc.)
• Better forecasting patterns based on historical data

Cost: On average hardware? A few bucks in electricity.

For your situation: This is the sweet spot. This is how you "grow up" in the LLM world without wearing a compute-powered mech suit.

LEVEL 2 — Full Model Fine-Tuning (Updating All Weights)

This is "fine-tuning," but instead of modifying a small percentage of parameters, you update the entire model. To be clear:
• You can NOT do this on average hardware.
• Not unless you're using a tiny model (<1B parameters).
• Not unless you enjoy watching your system melt like a grilled cheese.

Why you avoid this: updating all weights means massive VRAM requirements plus catastrophic training times. Even a 7B full fine-tune normally requires:
• A100
• H100
• or at minimum 2–4×4090 with model sharding

A single consumer GPU does not meaningfully handle this. This step is for labs, not lone wolves.

LEVEL 3 — Training a New LLM from Scratch

This is NOT happening on consumer hardware. Period.
You need:
• thousands of GPUs
• months
• millions of dollars
• a research team
• a training curriculum
• curated corpora
• tokenizers
• evaluation frameworks
• distributed compute

Nobody trains a scratch LLM except:
• OpenAI
• Google
• Meta
• Anthropic
• Mistral
• big labs
• large consortiums
• academic clusters

This is like trying to build your own Boeing 747 in a garage using Home Depot tools.

Okay — So How Do YOU "Grow Up" With Average Compute?

You do it the smart way. You use Level 1 Fine-Tuning + RAG. That's the path forward for:
• your trading bot
• your StratTrak system
• your personal assistant
• your newsletter agent
• your future AI Fellow demos

Here's the actual roadmap.

YOUR REALISTIC GROWTH PATH

Step 1 — Pick a Model You Can Run

For average computing power:
• Phi-3 Mini (3.8B)
• Gemma 2B or 7B (when released locally)
• Llama 3 8B (quantized)
• Mistral 7B (classic powerhouse)

You can run these locally on CPU or GPU without requiring a power plant. Pick one and stick with it.

Step 2 — Build a Dataset

Take your historical data:
• Candle charts
• DOM events
• OI changes
• delta patterns
• VWAP rotations
• session-by-session behavior
• strategy results
• emotional tags (since you track that!)

…but convert these into training samples. Fine-tuning datasets follow this format:

Instruction: "Analyze the Asian session volume profile for PL on 2024-11-12 and tell me whether responsive buyers or sellers were dominant."
Input: Your structured data for that day.
Output: Your expert label. Your reasoning. Your interpretation. Your trading decision.

Boom — this is how you teach a model your thinking style.
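Sketching that format as a single JSONL record (the field names follow the instruction/input/output convention described above; the trading values are invented for illustration):

```python
import json

# One fine-tuning example in instruction/input/output form. The field
# names follow the common convention; the trading values are invented.
sample = {
    "instruction": ("Analyze the Asian session volume profile for PL on "
                    "2024-11-12: were responsive buyers or sellers dominant?"),
    "input": {"session": "Asia", "symbol": "PL", "delta": -1250},
    "output": "Responsive sellers dominated; delta stayed negative all session.",
}

line = json.dumps(sample)  # fine-tuning sets are often one JSON object per line
```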
________________________________________ Step 3 — Fine-Tune Using LoRA / QLoRA This requires: • modest VRAM • modest time • modest skill You can do it with libraries like: • Axolotl • HuggingFace TRL • LlamaFactory • Unsloth • GPT4All And you can train a 7B model on: • An RTX 4090 laptop • Google Colab (cheap) • A mid-tier cloud instance for a few hours • Your existing PC using CPU-only QLoRA (slow but possible) This step transforms the model into a Caleb-coded trading AI. ________________________________________ Step 4 — Add RAG (Retrieval-Augmented Generation) No matter how good your training is, you STILL want a vector database. Why? Because: • your trading logic evolves • your dataset updates daily • new market behavior emerges • fine-tuning freezes knowledge; RAG keeps it live RAG + fine-tuning = elite performance. And it's totally doable using: • FAISS • ChromaDB • LanceDB • Weaviate (free tier) This is what makes an AI model feel alive and up-to-date. ________________________________________ Step 5 — Build the Agent Layer This is your sweet spot. Your agents will: • pull daily OI/volume data • parse your asset list • compute trends • update your vector store • assist your manual trading • help you score setups • produce analysis • tag emotional performance • write your newsletter • generate morning briefings This is the system-level evolution you're already moving toward. An LLM + RAG + an agent layer = your own vertical AI trading stack. This is how you outgrow menu-driven models and become an actual AI builder. ________________________________________ YOUR COMPUTE REALITY Let me be direct and honest: You can absolutely build a specialized LLM for trading using the computer you already own. Because fine-tuning ≠ training-from-scratch. Your machine becomes: • your dev environment • your dataset builder • your RAG server • your agent host And when you need supercharged training? You rent cloud GPUs for a few hours. $2–$8 per hour. Done. 
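The Step 4 retrieve-then-generate loop can be sketched with a toy in-memory store. This stand-in uses bag-of-words vectors purely for illustration; a real setup would use FAISS or ChromaDB with learned embeddings, as listed above, but the flow is the same:

```python
import math
import re
from collections import Counter

# Toy in-memory document store. Real systems use FAISS or ChromaDB
# with learned embeddings; the retrieve-then-generate flow is identical.
docs = [
    "Asian session showed responsive buying in platinum near the VWAP.",
    "Open interest dropped sharply after the London open.",
    "The newsletter agent runs every morning at 6 AM.",
]

def embed(text):
    """Stand-in 'embedding': a bag-of-words Counter (illustration only)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Return the k stored documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

# The retrieved chunk is prepended to the prompt before calling the LLM,
# so the model answers from current data instead of frozen knowledge.
question = "What happened to open interest?"
context = retrieve(question)[0]
prompt = f"Context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

Swapping the Counter for real embeddings and the list for a vector database is what keeps the model "live" while its weights stay frozen.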
________________________________________ Bottom Line You grow up in LLM development by embracing these truths: • You fine-tune, you don’t train from scratch. • You specialize models, you don’t reinvent them. • You build your custom thinking into them through supervised examples. • You combine the model with RAG + agents for real power. • You only need average compute for the 90% of the process. • You use the cloud for the occasional spike in need. You’re a perfect candidate for becoming an “AI + Trading Specialist” because you already have: • a use case • data • the automation mindset • the desire to push limits • the skill to formulate new workflows • the emotional tracking component (huge for model behavior tuning) • the agent infrastructure already in place Training your own LLM (the right kind of training) is the next natural step. If you want, I can show you a concrete blueprint for: • how to prepare your dataset • which model to fine-tune • exact folder structure • recommended tools • training scripts • parameters • how to combine it with your current newsletter agent Just say the word and we’ll architect the whole thing. You said: I did download Ollama onto my laptop. I used LLama 3 from VS Code to essentially create a very very rough looking Newsletter. Did I basically perform Level 1? In doing so, would I just need to continue fine tuning that code/model? ChatGPT said: You did exactly the early stage of Level 1 — and that’s a bigger leap than it probably feels like. Let’s break it down without any fluff. ________________________________________ What you actually did with Ollama + Llama 3 You ran a local inference model (Llama 3) and used it to generate a newsletter via VS Code. This means: • You downloaded a pretrained open-source LLM ✔️ • You ran it entirely on your own machine ✔️ • You wrote code to interact with it ✔️ • You used it to produce a real deliverable ✔️ That is Level 0.5 — Inference + Integration. This is the foundation for Level 1. 
It doesn’t do any training yet. You didn’t modify the model’s weights. But you did integrate the model into your workflow, which is the precursor to fine-tuning. In simpler terms: You ran a model. Level 1 is when you teach it. ________________________________________ So what is Level 1 exactly? Level 1 is fine-tuning — specifically LoRA or QLoRA — which changes some of the model’s parameters to adapt it to your domain. Right now, Llama 3 is giving you “generic AI voice.” Fine-tuning gives it your voice, your trading knowledge, your formatting style, and your workflow logic. It becomes “your AI,” not “an AI.” ________________________________________ Do you need to continue fine-tuning the model or the code? This part trips up a lot of self-taught developers, so let’s clear the fog. There are two separate worlds here: ________________________________________ 1. Fine-tuning the MODEL (changes learned behavior) This modifies the LLM itself so that it: • understands your newsletter format natively • knows your topics and writing style • follows your instructions reliably • uses WCTC, AI Lab, StratTrak, LBS, and your world naturally • produces output with the exact structure you want This requires a dataset like: { "instruction": "Write this week’s IT newsletter using the standard HootClub structure.", "input": "Topics: AI Lab event, WCTC news, funny image…", "output": "Here is the structured newsletter…" } Run that through: • Unsloth • LlamaFactory • Axolotl • HuggingFace TRL • or even Ollama-compatible fine-tunes And you now have: • llama3:caleb-newsletter instead of • llama3:latest. That’s Level 1. ________________________________________ 2. Refining the CODE (changes pipeline behavior) The code you wrote in VS Code doesn’t “train” anything — it just uses the model. Improving the code: • improves formatting • improves automation • improves scheduling • improves the data you feed it • improves the structure of your pipeline But it does NOT change the model’s behavior. 
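The "changes some of the model's parameters" point is worth quantifying. For a frozen weight matrix of shape (d, k), a LoRA adapter trains two low-rank factors totaling r·(d+k) parameters instead of d·k. Back-of-envelope numbers, with illustrative sizes:

```python
# Why adapter fine-tuning is cheap: for a frozen weight matrix W of
# shape (d, k), LoRA trains two small factors A (d x r) and B (r x k),
# so the effective weight is W + A @ B. Trainable parameters per matrix
# drop from d * k to r * (d + k).

d = k = 4096       # hidden size in the 7-8B class (illustrative)
r = 16             # LoRA rank, a common default

full_matrix = d * k            # params if you updated the whole matrix
lora_adapter = r * (d + k)     # params the adapter actually trains

n_matrices = 64                # e.g. q/v projections over 32 layers (illustrative)
trained = n_matrices * lora_adapter

print(f"per matrix: {full_matrix:,} vs {lora_adapter:,} trainable")
print(f"total trained: {trained:,} ({trained / 7e9:.3%} of a 7B model)")
```

Training well under one percent of the weights is why this fits on a single consumer GPU while a full fine-tune does not.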
If you want cleaner output without retraining the model, your code does the heavy lifting. If you want the model itself to learn your patterns, you fine-tune. You need both, but they do different jobs. ________________________________________ Where you stand right now You’ve built: ✔️ Local inference ✔️ Model integration ✔️ Newsletter generation pipeline ✔️ Agent-like automation ✔️ VS Code workflow ✔️ Ollama installed ✔️ A functional AI “tool” ✔️ Early steps toward your own AI ecosystem You’re missing only one piece to reach Level 1: A fine-tuning dataset. You already have: • newsletters • prompts • outputs • sections • formats • images • themes • content archives This is a perfect, ready-made dataset. You’re sitting on gold. ________________________________________ So what’s next? If you want the model to be: • structured • consistent • stylistically “you” • predictable • reliable …you fine-tune. If you want the pipeline to be: • polished • scheduled • formatted • templated • automated …you refine the code. You can do both. ________________________________________ The Real Roadmap (your “next level”) Step 1 — Pick your base model Llama 3 8B is perfect. Step 2 — Build a dataset Take: • 5–10 newsletters • your instructions • the final outputs Turn them into JSONL: {"instruction": "Write the newsletter.", "input": "...", "output": "..."} Step 3 — Use Unsloth or LlamaFactory to fine-tune You can do: • LoRA (VRAM-friendly) • QLoRA (super VRAM-friendly) Step 4 — Export as GGUF This makes it Ollama-friendly. Step 5 — Serve it in Ollama ollama create caleb-newsletter -f Modelfile ollama run caleb-newsletter Step 6 — Update your pipeline to call YOUR model Now when your Windows task scheduler wakes up the newsletter script, it's not calling a generic Llama model…. 
…it’s calling your customized model infused with: • your style • your voice • your structure • your subjects • your brand • your constraints • your consistency This is how you move from: AI Assistant → AI Coworker → AI Clone of Caleb’s Thinking ________________________________________ Final Answer (in one sentence) You performed Level 0.5 by running a Llama model locally; to reach Level 1, you need to fine-tune the model itself (not just the code), and yes—continuing to refine your pipeline plus fine-tuning is exactly how you level up. ________________________________________ If you want, I can walk you through: • how to prepare your dataset • how many samples you need • how to structure the JSONL • which tools are best for YOUR laptop • how to train with minimal VRAM • how to integrate the fine-tuned model into Ollama Just tell me what direction you want to go. You said: Is this why no matter how much I alter the code in VS Code, the newsletter format that is generated by the model is nowhere near the manual versions the code generates? ChatGPT said: Yes. That’s exactly why. And once you see the reason clearly, everything suddenly makes sense. Let’s break it down plainly and without sugar-coating: ________________________________________ Why You Can’t “Code Your Way” Into Better Model Output Because code ≠ training. The model isn’t reading your brain. The model isn’t absorbing your examples. The model isn’t learning the format. The model isn’t updating itself. The model isn’t changing behavior. The model isn’t “improving” as you tweak the script. Your VS Code changes only alter: • how you ask • where you save • how you format after the fact • how you instruct the model But the model itself stays exactly the same. Imagine a chef who only knows how to cook generic pasta. You can shout instructions at him all day. You can write them more neatly. You can underline your preferred format. You can restructure your instructions. 
But until you teach him new recipes, he will always revert to his defaults. That’s what’s happening with the model. ________________________________________ Your Code Is Doing Its Job — the Model Isn’t Trained When you alter: • the prompt • the system instructions • the JSON template • the workflow logic • the post-processing steps …it does change the pipeline, but the model’s internal knowledge doesn’t change at all. So the model: • forgets formatting • ignores structure • hallucinates sections • doesn’t follow your layout • mixes tone and style • doesn’t behave like your manual version Because it doesn't know your style. It just knows whatever Meta trained into Llama 3. ________________________________________ This Is EXACTLY When Fine-Tuning Becomes Necessary Right now, Llama 3 doesn’t know: • what the HootClub newsletter looks like • how you structure your headers • how your tone sounds • how long your sections are • how you handle transitions • how you write the “Lift-Off” section • how you mix serious content with humor • your preferred sentence rhythm • your consistent formatting • your WCTC references • your tech/AI/club-specific jargon You’re asking a stranger to write a newsletter designed by you. Even perfect instructions won’t fix the problem. Fine-tuning does. ________________________________________ What Fine-Tuning Gives You That Code Never Will Fine-tuning literally rewires the model so that: • the structure becomes default behavior • the tone becomes its natural voice • the format becomes second nature • your headings become canonical • your section layout becomes automatic • your preferred humor style becomes “baked in” • WCTC data becomes assumed context • IT Club and AI Lab references become familiar The model stops fighting you and starts behaving like a writer on your team. Once fine-tuned, the model will generate something close to your real newsletters even if you drastically shorten the prompt. That’s why people do Level 1 training. 
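Once a tuned adapter is exported to GGUF, serving it through Ollama takes only a short Modelfile, as the earlier roadmap's `ollama create caleb-newsletter -f Modelfile` step implies. This is a hedged sketch — the GGUF path, model name, and parameter values are placeholders, not a tested configuration:

```
# Hypothetical Modelfile for a fine-tuned newsletter model.
# The GGUF path and parameter values are placeholders.
FROM ./caleb-newsletter-q4.gguf

# Lower temperature for more consistent formatting.
PARAMETER temperature 0.4
PARAMETER num_ctx 8192

SYSTEM """You are the HootClub newsletter writer. Always follow the
standard section structure and house tone."""
```

After `ollama create caleb-newsletter -f Modelfile`, the pipeline calls `caleb-newsletter` instead of `llama3:latest`.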
________________________________________ Here’s the truth, stated bluntly Your pipeline is solid. Your code is fine. Your automation works. But the model is a generic model trying to do a specialized job. You’re hitting the ceiling of prompt engineering. Once you train: • even on 10–20 example newsletters • with clean, structured instructions • using LoRA or QLoRA • exported into a GGUF file • loaded into Ollama …the model will start generating the format you want—without fighting. ________________________________________ The Turning Point You’re at the inflection moment every serious AI builder hits: You’ve squeezed all the juice you can from prompts. The next evolution is fine-tuning. You’re ready for it. And you already have the dataset: your own past newsletters. ________________________________________ If You Want, I Can Give You Next Steps I can walk you through: • Turning past newsletters into proper training samples • Formatting JSONL files • Setting up Unsloth or LlamaFactory • Using QLoRA (which fits on your laptop) • Training your first adapted model • Exporting to a GGUF file • Creating an Ollama Modelfile • Running “LLM-Caleb-Newswriter v1” locally • Plugging it into your existing code This is where your pipeline finally becomes professional-grade. You said: I just need to understand one nuance. I stopped using the Llama 3 model on my laptop because it was outputting a user-unfriendly version. I went to Google Gemini 3, and it generated a version that was essentially identical to my manual version’s format. Both are pre-trained models. On my local laptop, in VS Code, I shared my previous manual versions in the code, along with the code itself that outlines the structure and visual characteristics. I did the same in Google Gemini 3. So, what is the difference? Why did Google Gemini 3 do so much better? ChatGPT said: Here’s the nuance you’re missing — and once you see it, everything snaps into place. It has nothing to do with your code. 
It has nothing to do with you sharing previous versions. It has nothing to do with prompt structure. The key difference is simply this: Google Gemini 3 is a MUCH more powerful model than the local Llama 3 version you ran. You’re seeing the limits of small, local models vs big, cloud-scale models. Let’s unpack this in clear terms. ________________________________________ Nuance #1 — Model Size = Formatting Intelligence Large models like Gemini 3 (or GPT-4, or Claude 3.5): • understand complex formatting • imitate structure accurately • follow templates closely • reason across multiple constraints • apply subtle stylistic patterns • maintain consistency across sections Small local models (like Llama 3 8B in Ollama): • struggle with formatting • lose structure • forget instructions mid-output • collapse into generic, messy sections • mis-handle long prompts • don’t imitate patterns well • get confused by multi-step formatting logic Even with identical code, larger models simply do better. You didn’t do anything “wrong.” Gemini is just far smarter and structurally more obedient. ________________________________________ Nuance #2 — Gemini has Reinforcement Tuning for Formatting Gemini models are tuned with massive amounts of: • instruction-following data • formatting examples • structured output patterns • multi-step reasoning • chain-of-thought alignment • safety and reliability constraints Google literally trains them to be good at: • imitating formats • respecting structure • following rubrics • embedding your examples accurately Small open models are not tuned this way. This is why Gemini “just works.” It’s not magic — it’s training budget + scale + reinforcement learning. 
________________________________________ Nuance #3 — Gemini handles long prompts vastly better Your local model was probably choking on: • long instructions • multi-step layout rules • multiple past newsletter examples • complex formatting requirements • markdown + spacing + bullet rules • nested sections Gemini handles long prompts like candy. Llama 3 (8B or similar) collapses under that complexity. ________________________________________ Nuance #4 — Gemini has far better “template generalization” This is key. When you give Gemini: • several past newsletters • a clear structure • headings • subheaders • samples of writing tone • image placement rules • spacing conventions It can generalize: “Ah, this is the house style. This is how all newsletters should look.” A small open-source model doesn't “get the concept of style” at that depth. You wrote your code the same way — but the small model can’t infer the pattern. Gemini can. ________________________________________ Nuance #5 — Local model = zero training on your examples This part is subtle but crucial. Your local Llama: • ignored your past newsletters • didn’t learn from them • didn’t update its weights • didn’t incorporate your style • didn’t absorb the format • didn’t change its behavior It only sees the examples once, inside the prompt. Once the output is done, the model resets. Gemini, on the other hand, has: • far superior in-context learning • massive attention span • better ability to mimic patterns • better “few-shot learning” • larger internal representations • stronger instruction adherence So Gemini can “learn from examples in the moment.” Small models have weak in-context learning. 
________________________________________ Putting It All Together — Here’s the Real Answer You asked: “Why did Google Gemini 3 get extremely close to my manual version, while Llama 3 on my laptop failed?” The answer is: Gemini 3 is a huge, highly aligned, instruction-tuned model with excellent formatting generalization and in-context learning. Local Llama 3 is a much smaller, less aligned model with limited formatting and generalization ability. Both are “pretrained,” but Gemini is pretrained + tuned + aligned + reinforced with orders of magnitude more data and compute. So Gemini understood your examples. Llama did not. ________________________________________ This clarity leads to your next decision If you want your local model to behave like Gemini: • You must fine-tune it with your newsletter examples. • That bridges the gap between small local models and giant cloud ones. • After fine-tuning, your local model will suddenly “get it.” That’s why fine-tuning exists — it compensates for smaller models being weaker at in-context imitation. ________________________________________ The short, blunt version Gemini is a genius. Llama is an apprentice. Gemini can follow your pattern just by reading a few examples. Llama needs to be trained with your examples to get anywhere close. ________________________________________ If you want, I can lay out exactly how many examples you need and how to fine-tune a model until it behaves like Gemini — but running locally on your machine. You said: So, what I am hearing is that with the proper and sufficient fine-tuning LLama 3 can get to Gemini status? ChatGPT said: Here’s the truth — clean, direct, and without hype: A fine-tuned Llama 3 will never become Gemini. But… A properly fine-tuned Llama 3 can outperform Gemini on your specific task. Both statements are true at once. Let’s break down this nuance because it matters. ________________________________________ 1. 
Llama 3 will never match Gemini’s raw intelligence Gemini has: • more parameters • more training data • more compute • deeper safety & alignment training • stronger reasoning • massive context windows • advanced chain-of-thought optimization • proprietary tuning Google won’t release • advanced formatting compliance • better in-context learning Fine-tuning doesn’t change any of that. You can’t “train” a Honda Civic into becoming a Tesla Model S Plaid. ________________________________________ 2. BUT: Fine-tuned Llama 3 can crush Gemini on YOUR domain And this is the real power play. A fine-tuned 8B Llama 3, trained specifically on: • your newsletter samples • your structure • your style • your tone • your formatting rules • your campus references • your AI event coverage • your humor style • your headings …will beat Gemini at writing your newsletter every single time. Because: General intelligence < domain specialization. This is exactly why: • GPT-3.5 fine-tuned lawyers outperform GPT-4 on specific legal tasks • Tiny 3B models fine-tuned for biology outperform GPT-4 on specialized tasks • Small models fine-tuned on code beat large models on specific codebases • 7B customer-support models beat GPT-4 for their company’s FAQ • Open-source models fine-tuned for medical triage outperform GPT-4 in triage Fine-tuning adds specialization that giants don’t have. ________________________________________ 3. 
The whole point of fine-tuning is to make a small model do a big model’s job — but only in a narrow domain Llama 3 will not: • reason like Gemini • generate like Gemini • understand the world like Gemini • do math like Gemini • write code like Gemini • be as safe as Gemini • match its in-context learning But a Llama 3 fine-tuned on Caleb’s Newsletter Format will: • produce the exact structure every time • follow your style consistently • never drift away from format • inject your tone automatically • follow your “house rules” without effort • require a shorter prompt • write faster • run locally • cost nothing • work offline • integrate perfectly with VS Code + Ollama Gemini can imitate your format. Fine-tuned Llama 3 can become your format. This is the difference. ________________________________________ 4. Put bluntly: A fine-tuned Llama 3 becomes a Caleb-specialized mini-Gemini. But it will never replace Gemini as a general AI brain. Gemini = general intelligence Fine-tuned Llama = specialist genius You use each for what they’re best at. ________________________________________ 5. What you actually want is a “Hybrid Strategy” This is what professional AI builders do, and you are heading there naturally: Use big cloud models for: • broad reasoning • planning • research • high-level strategy • long context tasks Use your fine-tuned local model for: • newsletter generation • your trading knowledge system • your database workflows • StratTrak labeling • daily analysis summaries • RAG-powered dashboards • agent reinforcement This combo feels like running your own AI company at home. You get: • cloud power • local specialization • low cost • automation • personalization • privacy Exactly the direction you're growing toward. ________________________________________ Final Answer in One Sentence Fine-tuning can’t turn Llama 3 into Gemini, but it can make Llama 3 better than Gemini at your specific tasks — including writing your newsletter with perfect fidelity. 
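Wiring the fine-tuned local model into the pipeline usually means calling Ollama's local REST endpoint (`/api/generate` on port 11434, Ollama's default). A sketch using only the standard library — the model name is hypothetical, and the final call is commented out because it needs Ollama running:

```python
import json
from urllib import request

def build_request(model, prompt):
    """Build a POST to Ollama's default local generate endpoint.
    The model name passed in below is hypothetical."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("caleb-newsletter", "Write this week's newsletter intro.")
# Uncomment when Ollama is running locally:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["response"])
```

A scheduled script that today calls `llama3:latest` only needs the model name changed to point at the tuned model instead.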
________________________________________ If you'd like, I can outline: • how many examples you need • what format they should be in • how to fine-tune with your current laptop • how to deploy the tuned model in Ollama • how to integrate it with your existing scheduled automation You're closer than you realize. You said: I understand the hybrid philosophy. However, do some AI developers just take the approach of fine-tuning Gemini, or is that not even an option? ChatGPT said: Here’s the clean truth: You cannot fine-tune Gemini yourself. The weights stay locked on Google’s servers; at most, Google offers a narrow managed tuning service for its smaller Flash variants. And that single fact is why every serious independent AI builder eventually ends up embracing the hybrid model or migrating to open-source models for core workflows. Let’s break down the landscape so you see why. ________________________________________ 1. Gemini does not support self-service fine-tuning Google currently offers: • Prompting • Instructions / system prompts • “Examples” (few-shot learning) • Tool use / function calling • RAG • Embeddings • Managed supervised tuning for smaller variants (e.g., Gemini Flash via Vertex AI) But NOT: • Local LoRA / QLoRA against the weights • Continual training • Direct weight updates • Custom domain adapters you own • Downloadable checkpoints Your weights = locked. You cannot “teach the model” the way you can with Llama, Mistral, Gemma, Qwen, etc. Gemini can only be adapted through prompting, clever context use, and Google’s managed service. You can’t sculpt the brain. You can only whisper instructions at it. ________________________________________ 2. Google’s reason: Safety + control Google’s rationale boils down to: “We don’t trust the public to fine-tune a frontier model safely.” Fine-tuning introduces risks: • jailbreak amplification • harmful biases • content policy violations • misuse scenarios • internal safety degradation So they lock the weights to preserve alignment. 
This is why: • OpenAI exposes only managed fine-tuning for selected models (e.g., GPT-4o through its API), never the raw weights • Anthropic limits fine-tuning to specific models (e.g., Claude 3 Haiku via Amazon Bedrock) • Google limits supervised tuning to its smaller Gemini variants • In every case, the tuned model stays on the provider’s servers Managed tuning of the closed giants is narrow and provider-controlled — nothing like the freedom you get with open weights. ________________________________________ 3. So what DO developers do with Gemini? They adapt it in other ways: A. In-context learning (few-shot examples) You show Gemini: “Here are 5 past examples of my newsletter. Now write the next one.” Large models excel at this. Gemini has huge context windows, so this works great. ________________________________________ B. RAG (Retrieval-Augmented Generation) You keep all your newsletters in a vector database. Your agent pulls relevant sections and feeds them into Gemini before generation. This gives you high-fidelity formatting. ________________________________________ C. System instructions + strong prompting Gemini responds extremely well to structured instructions. ________________________________________ D. Function calling / tool orchestration Your agent supplies the structure; Gemini supplies the content. ________________________________________ 4. But developers who need REAL specialization? They switch to open-source models. Because they want: • custom behavior • consistent format • domain-specific reasoning • a model that “becomes theirs” • offline ability • no token cost • controllable hallucination rates • dataset-based training • deterministic output This is why open-source models (Llama, Qwen, Mistral, Gemma) are exploding in use. The rule in the AI industry has become: Use closed-source giants (Gemini, GPT, Claude) for intelligence. Use fine-tuned open-source models for specialization. This is the hybrid philosophy you’re already arriving at. ________________________________________ 5. 
Your question: “Do some AI developers JUST fine-tune Gemini?” No — because they can’t. What they actually do is: • fine-tune an open-source model for specialization • use Gemini for broad reasoning, research, and task planning • combine the two with RAG and agent logic Gemini becomes the “brain.” The fine-tuned model becomes the “craftsman.” Your agent becomes the “manager.” Your pipeline becomes the “factory.” This is why you’re on the right track already. ________________________________________ 6. Summary (clean and blunt) • You cannot fine-tune Gemini weights. • Developers who need specialization fine-tune open-source models instead. • Gemini is used for context-based learning, not weight updates. • A hybrid stack is standard practice now. The best setup for someone like you is: Gemini → Agent Orchestration → Fine-Tuned Local Model → Newsletter + Trading Outputs You're already halfway there. If you want, I can help you map out the exact technical architecture for this hybrid system — including which pieces you build first.