Weekend Builder/Intensive
/ A two-day build

The Weekend Builder's Intensive

Ship a real product with Claude. Build a local AI agent. Two days, two things that actually work.

Prep · before the weekend
Saturday · ship a product
Sunday · build an agent
Copy any code with the button, no retyping. Tick boxes to fill your progress bars. Guidance boxes explain every term in plain words. 📌 "For later" notes hold alternatives; skip them now.
/ Pack 0

Before the weekend

~2 hours the evening before, so the weekend is spent building, not installing.

0.1 Setup checklist

For the non-tech learner

Tick each box. If something fails, that's normal. Note the error and move on; you'll have buffer time Saturday morning. A "verify" command just confirms a tool installed correctly: it prints a version number if all is well.
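To make the "verify" idea concrete, here is a minimal Python sketch that runs a version check for several tools at once. The tool list is an assumption based on the commands used later in this guide, and the helper name check_tool is made up for illustration; adjust both to match your own checklist.

```python
import shutil
import subprocess

# Assumed tool list, based on commands that appear later in this guide.
TOOLS = ["git", "node", "npm", "python3", "ollama", "claude"]

def check_tool(name):
    """Return the first line of `<name> --version`, or None if the tool is missing or fails."""
    if shutil.which(name) is None:
        return None  # not on PATH: not installed, or the terminal needs a restart
    try:
        result = subprocess.run([name, "--version"], capture_output=True,
                                text=True, timeout=15)
    except (OSError, subprocess.TimeoutExpired):
        return None
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""

def report(tools=TOOLS):
    """Print one line per tool: its version, or MISSING."""
    for tool in tools:
        version = check_tool(tool)
        print(f"{tool:8} {'MISSING' if version is None else version}")
```

Call report() from a Python prompt. Any line that says MISSING is one to note down for Saturday morning's buffer time.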

Accounts (all free to start)

Laptop (the machine you code on)

Mac mini (for Sunday)

Pre-reading (15-min skim)

Alternatives (for later)
  • Editor: Cursor (AI-first) or Zed instead of VS Code.
  • Local model runner: LM Studio (point-and-click, no terminal) or llama.cpp instead of Ollama.
  • Node install: nvm, to switch Node versions easily.

0.2 Credentials & keys tracker

For the non-tech learner

Keys are easy to lose. Track where each one lives, and never paste a real secret value into a shared doc or your code. Keep real values in a password manager.

Item              | Where it lives          | Notes
Anthropic API key | Password manager + .env | Starts sk-ant-…; never commit to GitHub
GitHub login      | Password manager        | Enable 2-factor auth
Vercel login      | "Sign in with GitHub"   | No separate password
Google Places key | Password manager + .env | Restrict the key in Google Cloud
/ Pack 1 · Saturday

Ship a vibecode product

Goal: a working web app at a live public URL, version-controlled and auto-deploying.

How a web app is wired

For the non-tech learner

A web app is four parts talking to each other. You don't need to memorize this. Just recognize the words when Claude uses them.

flowchart LR
  U([User browser]) -->|clicks, types| FE[Frontend<br/>what users see]
  FE -->|asks for data| BE[Backend<br/>rules and logic]
  BE -->|reads / writes| DB[(Database<br/>stored data)]
  BE -->|calls| EXT[External APIs<br/>Claude, Maps]
  EXT --> BE --> FE --> U
      

Frontend = seen by user · Backend = logic · Database = storage · API = the messenger between systems.

The whole Saturday, as one flow

flowchart LR
  A[Idea] --> B[PRD / MVP]
  B --> C[Build with Claude Code]
  C --> D[Test on localhost]
  D --> E[Push to GitHub]
  E --> F[Deploy to Vercel]
  F --> G([Live public URL])
  G -.feedback.-> B
      

1.1 Product vision statement

For the non-tech learner

One paragraph that keeps you honest. When you're tempted to add a 10th feature at 4pm, re-read this. If the new idea doesn't serve it, it waits.

PRODUCT VISION

My app is called: <NAME>
It helps: <WHO: the specific person>
do: <WHAT: the one core job>
so that: <WHY: the benefit they get>

In one sentence: "<NAME> lets <WHO> <DO THE JOB> without <THE OLD PAIN>."

The single thing it must do well by tonight: <ONE CORE FEATURE>

1.2 One-page PRD

For the non-tech learner

A PRD (Product Requirements Document) is just a plan on one page: what you're building and where the edges are. The most important section is "Out of scope this weekend", because that's what protects your timeline. Let Claude draft it, then you cut it down.

PRD: <APP NAME>                          Date: <DATE>

1. PROBLEM        <who hurts, and how>
2. TARGET USER    <one specific kind of person>
3. CORE VALUE     <the one thing that must work>

4. MVP FEATURES  (max 5, be ruthless)
   [ ] F1: <feature>   (must have)
   [ ] F2: <feature>   (must have)
   [ ] F3: <feature>   (nice if time)

5. USER STORIES   As a <user>, I want <action>, so that <benefit>.
6. SCREENS        <Home>: <what's on it>
7. DATA           <what gets stored>
8. OUT OF SCOPE   <login, payments, mobile app...>   <- protect this
9. DONE =         <one sentence describing the demo>
Alternatives (for later)
  • For bigger projects, split into separate spec + roadmap docs, or use Notion / Linear. For now, one file is better.

1.3 CLAUDE.md: Claude Code's memory

For the non-tech learner

This file is Claude Code's "sticky note" about your project. Claude Code reads it automatically each session. Keep it short. For every line ask: "if I deleted this, would Claude make a mistake?" If no, delete it. Generate a starter with the /init command, then trim.

# Project: <APP NAME>

## What this is
<One line. e.g. "A habit tracker: add habits, tick them daily.">

## Tech stack
- Framework: Next.js (React)
- Hosting: Vercel
- Data: <in-memory / JSON file / Postgres later>

## Commands
- Run locally:  npm run dev
- Run tests:    npm test
- Build:        npm run build

## Conventions
- Keep components small and clearly named.
- Don't add dependencies without telling me first.
- After each feature: run the app, then I review before commit.

## Never
- Never commit secrets or .env files.
- Never delete files without asking.
Alternatives (for later)
  • Cursor uses .cursorrules; some teams keep a /docs folder Claude reads. Same idea, different filename.

1.4 Claude Code prompts

For the non-tech learner

Reusable "scripts" for talking to Claude Code. The pattern behind all of them is Explore → Plan → Code → Commit: make Claude look and plan before it writes, so it doesn't run off in the wrong direction.

flowchart TD
  E[Explore: Claude reads code] --> P[Plan: propose steps]
  P --> R{You approve?}
  R -->|No, adjust| P
  R -->|Yes| C[Code: make changes]
  C --> T[Test / run app]
  T --> OK{Works?}
  OK -->|No, paste error| C
  OK -->|Yes| CM[Commit to Git]
  CM --> E
      

A · Plan Mode kick-off

I'm building <APP NAME>. Here is my PRD: [paste PRD].
Before writing any code, use Plan Mode:
1. Propose a simple file structure.
2. List the steps to build MVP feature F1 only.
3. Flag anything risky or any choice you're making for me.
Wait for my approval before changing files.

B · Build one feature

Let's build feature <F1: NAME> from the PRD, and ONLY that.
- Keep it the simplest version that works.
- Explain each new file in one line as you create it.
- When done, tell me exactly how to run and see it.

C · Debug (when something breaks)

I ran <COMMAND> and got this error:

<PASTE THE FULL ERROR TEXT>

Diagnose the cause, propose the smallest fix, and show me the
change before applying it. Don't change anything unrelated.

D · Explain it to me

Explain what <FILE or CONCEPT> does as if I'm new to coding,
in 4 sentences. Then tell me the one thing I should understand
about it to not break it later.

E · Write a test

Write 2 simple tests for feature <F1> that would fail if it breaks.
Then run them and show me the result.
The golden habits

(1) approve plans before code; (2) review changes before committing; (3) when Claude drifts, stop and re-plan rather than piling on instructions; (4) run /clear when you switch to an unrelated task so Claude's memory stays clean.

1.5 Git workflow: your safety net

For the non-tech learner

Git saves snapshots of your work so you can always go back. GitHub stores those snapshots online and triggers your deploy. You only need five commands today. A "commit" = a saved snapshot with a label; a "push" = upload it to GitHub.

flowchart LR
  W[Change files] --> ADD[git add .]
  ADD --> COMMIT[git commit -m msg]
  COMMIT --> PUSH[git push]
  PUSH --> DEPLOY[Vercel auto-deploys]
      
# One-time, at project start
git init
git add .
git commit -m "Initial commit: project setup"
git branch -M main        # name the branch "main" so the push below matches
git remote add origin <YOUR_GITHUB_REPO_URL>
git push -u origin main

# Repeat after every working feature
git add .
git commit -m "Add <feature>: <what it does>"
git push

# Undo uncommitted edits to files Git already tracks
git restore .
Alternatives (for later)
  • GitHub Desktop (a click-based app, no terminal) or VS Code's Source Control panel do the same thing visually.

1.6 Secrets & .env

For the non-tech learner

Secrets (like your API key) must never go into your code or GitHub. They live in a file called .env that Git ignores, and in your host's settings. You commit a fake version called .env.example, with no real values, so others know what's needed.

# .env.example   (SAFE to commit: placeholders only)
ANTHROPIC_API_KEY=your-key-here
DATABASE_URL=your-db-url-here

# .gitignore must contain at least:
.env
.env.local
node_modules/
Rule of thumb

If a value would let a stranger spend your money or read your data, it's a secret → .env + host settings only. Set a spending limit in the Anthropic Console and a usage alert on your host.
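One way to double-check the .gitignore rule before your first push is a small script. The sketch below is an assumption-laden illustration: the helper names are made up, and it only looks for exact lines, so a broader pattern that also covers .env would still be reported as missing.

```python
from pathlib import Path

# Entries the .gitignore template in this section says must be present.
REQUIRED = [".env", ".env.local", "node_modules/"]

def missing_ignores(gitignore_text, required=REQUIRED):
    """Return the required entries that do not appear as their own line."""
    lines = {line.strip() for line in gitignore_text.splitlines()}
    return [entry for entry in required if entry not in lines]

def check_project(project_dir="."):
    """Read <project_dir>/.gitignore (if any) and list what is missing."""
    path = Path(project_dir) / ".gitignore"
    text = path.read_text() if path.exists() else ""
    return missing_ignores(text)
```

An empty list from check_project() means all three entries are there. Anything else is worth fixing before you run git add.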

1.7 Deploy to production

For the non-tech learner

"Deploying" = publishing your app to a hosting company's computers so anyone can visit it. Production = that live version. The first deploy feels scary; it's mostly clicking "Import" and "Deploy."

Alternatives (for later)
  • Netlify: same flow, great for static/frontend sites.
  • Render / Railway: when you need an always-on backend or hosted database.
  • Custom domain: buy from Namecheap/Cloudflare, point it at Vercel; do this after the app works.

1.8 Post-launch backlog

For the non-tech learner

The moment you launch you'll see things to improve. Don't fix them live in a panic. Write them here and tackle them calmly. This is how real products evolve.

BACKLOG β€” <APP NAME>

NOW (this week)    [ ] <bug or tiny win>
NEXT (this month)  [ ] <feature people asked for>
LATER (someday)    [ ] <bigger idea>

FEEDBACK LOG
- <date> <who> said: <quote>  -> action: <what you'll do>
/ Pack 2 · Sunday

Build a local AI agent

Goal: a Mac-mini pipeline that turns compliant data into confidence-scored POIs, proven with accuracy evals.

The pipeline at a glance

For the non-tech learner

You build a small "assembly line." Raw place data comes in from a source you're allowed to use; a local AI (Gemma, running on your Mac, so nothing leaves the machine) cleans and labels each place; you attach a confidence score (how sure it is, 0–1); out comes a tidy spreadsheet. Then you prove it's good with evals. (POI = Point of Interest = a place: name + category + location.)

flowchart TD
  S[Compliant source<br/>OSM / Foursquare / Overture] --> I[Ingest raw records]
  I --> X[Gemma extracts and normalizes]
  X --> SC[Assign confidence 0-1]
  SC --> TH{Confidence high enough?}
  TH -->|Yes| OUT[(Write CSV / JSON)]
  TH -->|No| RV[Flag for review<br/>or route to Claude]
  RV --> OUT
      

2.1 Mac mini setup

For the non-tech learner

This gets a local AI running on your Mac. "Local" means it runs on your own hardware: free per use, private, works offline. Ollama downloads and runs the model; it automatically uses your Mac's graphics chip with zero setup.

Which model size for your Mac? (memory decides)

Mac mini RAM | Comfortable Gemma size | Roughly
16 GB        | 12B (recommended)      | good quality, ~6.7 GB
24 GB        | 12B or 27B-class       | higher quality
48 GB        | 27B+ with long context | best
Alternatives (for later)
  • LM Studio: a click-based app to run models without the terminal.
  • Smaller model (gemma3:4b) if 12B feels slow: faster, slightly less accurate.
  • MLX backend (Apple's): noticeably faster on Apple Silicon, a tuning step for later.

2.2 Compliant data source picker

For the non-tech learner

The most important decision today. We do not scrape Facebook: it breaks their rules and is fragile. We use data we're allowed to use. Rule of thumb: open dataset or official API > scraping. Check three words on any source: License (am I allowed?), Rate limit (how fast may I ask?), robots.txt / ToS (what do they forbid?).

Source                            | What it gives            | Why it's safe                        | Cost
OpenStreetMap (Overpass)          | POIs by type + area      | Open data, query in browser first    | Free
Foursquare Open Source Places     | 100M+ POIs, 22 fields    | Apache-2.0, download as files        | Free
Overture Maps (places)            | Tens of millions of POIs | Open license, has a confidence field | Free
Google/Foursquare/Mapbox/HERE API | Rich, current POIs       | Official API, stay within terms      | Free tier+
Gov / civic open data             | Local registries         | Public, licensed                     | Free
Alternatives (for later)
  • Meta's official Graph API (within its terms) for pages you own or manage: the compliant way to touch Meta data. Never scraping.
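Of the three checks in this section, robots.txt is the only one a script can read for you. The sketch below uses Python's standard urllib.robotparser on a made-up robots.txt; the bot name and URLs are placeholders. License and ToS still need a human to read them, so treat a True here as necessary, not sufficient.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Made-up robots.txt for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /private/
"""

print(allowed(SAMPLE, "my-poi-bot", "https://example.com/places"))
print(allowed(SAMPLE, "my-poi-bot", "https://example.com/private/list"))
```

The first call checks a path the sample file does not mention; the second checks a path under the disallowed /private/ prefix.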

2.3 Overpass query (POIs from OpenStreetMap)

For the non-tech learner

Overpass is a free way to ask OpenStreetMap "give me all the X in this rectangle." Paste this into overpass-turbo.eu, press Run, then Export → JSON. The four numbers are south, west, north, east: a box on the map.

[out:json][timeout:25];
// all cafes in a small bounding box (south,west,north,east)
node["amenity"="cafe"](40.700,-74.020,40.730,-73.990);
out body;

Swap cafe for restaurant, pharmacy, bank, school, hotel, supermarket…
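The exported JSON wraps every place in an "elements" list, with the name and type tucked inside a "tags" object. As a rough sketch of what that looks like once loaded in Python, the helper below (flatten_overpass, a made-up name) flattens it into simple rows. The sample records are invented for illustration.

```python
def flatten_overpass(export):
    """Turn a loaded Overpass JSON export into a list of simple dicts."""
    rows = []
    for el in export.get("elements", []):
        tags = el.get("tags", {})
        rows.append({
            "osm_id": el.get("id"),
            "name": tags.get("name", ""),       # many nodes have no name tag
            "amenity": tags.get("amenity", ""),
            "lat": el.get("lat"),
            "lon": el.get("lon"),
        })
    return rows

# Invented sample in the same shape as an Overpass node export.
SAMPLE = {"elements": [
    {"type": "node", "id": 1, "lat": 40.71, "lon": -74.0,
     "tags": {"amenity": "cafe", "name": "Joe's Coffee"}},
    {"type": "node", "id": 2, "lat": 40.72, "lon": -74.01,
     "tags": {"amenity": "cafe"}},              # unnamed cafe: name stays ""
]}

for row in flatten_overpass(SAMPLE):
    print(row)
```

Unnamed places are common in OpenStreetMap, which is one reason the later steps attach a confidence score.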

2.4 First Gemma script

For the non-tech learner

A few lines to prove Gemma answers from your own Python. This needs the ollama Python package installed and the model already downloaded. Save as hello_gemma.py, then run python3 hello_gemma.py.

import ollama

reply = ollama.chat(
    model="gemma3:12b",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(reply["message"]["content"])

2.5 Extraction prompt

For the non-tech learner

The instruction you give Gemma for each place. The tricks that make a local model reliable: be explicit, show examples, ask for JSON only, and ask for a calibrated confidence. "Calibrated" means: if it says 0.9, it should be right about 9 times out of 10. Save as extraction_prompt.txt.

You are a data cleaner for Points of Interest (POIs).
For the raw record below, return ONE JSON object with these fields:
- name: cleaned business name (Title Case, no extra symbols)
- category: one of [cafe, restaurant, shop, pharmacy, bank, hotel, other]
- address: a single tidy line, or "" if unknown
- lat: number or null
- lon: number or null
- confidence: 0.0-1.0, how sure YOU are this is correct and complete.
  Be calibrated: 0.9 means you'd be right ~9 times in 10.

Rules:
- Return JSON only. No commentary, no markdown.
- If a field is unknown, use "" or null. Do not invent data.

EXAMPLE INPUT:  {"nm":"joe's  COFFEE","type":"coffee shop","addr":"12 main st"}
EXAMPLE OUTPUT: {"name":"Joe's Coffee","category":"cafe","address":"12 Main St","lat":null,"lon":null,"confidence":0.78}

RAW RECORD:
<PASTE ONE RECORD HERE>

2.6 POI schema: force clean output

For the non-tech learner

A "schema" is a strict shape for the data: every POI gets the same fields, so your spreadsheet isn't a mess. Ollama can enforce it. temperature=0 makes answers consistent. Save as poi_schema.py.

from pydantic import BaseModel
from typing import Optional
import ollama

class POI(BaseModel):
    name: str
    category: str
    address: str
    lat: Optional[float]
    lon: Optional[float]
    confidence: float

def extract_one(raw_record: str, prompt_template: str) -> POI:
    resp = ollama.chat(
        model="gemma3:12b",
        messages=[{"role": "user",
                   "content": prompt_template.replace("<PASTE ONE RECORD HERE>", raw_record)}],
        format=POI.model_json_schema(),   # enforce the shape
        options={"temperature": 0},        # consistent output
    )
    return POI.model_validate_json(resp["message"]["content"])
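The schema above fixes the field names and types, but as written it does not limit confidence to 0-1 or category to the list in the prompt. A plain-Python sketch of those extra checks might look like this. The function name poi_problems is made up, and the allowed categories are copied from the extraction prompt in 2.5.

```python
# Category list copied from the extraction prompt (2.5).
ALLOWED_CATEGORIES = {"cafe", "restaurant", "shop", "pharmacy", "bank", "hotel", "other"}

def poi_problems(poi):
    """Return a list of problems with one extracted POI dict (empty list = looks fine)."""
    problems = []
    if poi.get("category") not in ALLOWED_CATEGORIES:
        problems.append("category not in allowed list")
    conf = poi.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        problems.append("confidence outside 0-1")
    lat, lon = poi.get("lat"), poi.get("lon")
    if lat is not None and not -90 <= lat <= 90:
        problems.append("lat outside -90..90")
    if lon is not None and not -180 <= lon <= 180:
        problems.append("lon outside -180..180")
    return problems

example = {"name": "Joe's Coffee", "category": "cafe", "address": "12 Main St",
           "lat": None, "lon": None, "confidence": 0.78}
print(poi_problems(example))
```

In the pipeline you could call this on poi.model_dump() and flag any record that returns a non-empty list.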

2.7 Pipeline orchestration

For the non-tech learner

"Orchestration" is the glue that runs each step in order, survives a bad answer, and writes the file. Keep it one simple script. Tip: use Claude Code from Saturday to write and debug this. Paste it with your data sample and say "make this run on my file."

import csv
import json

from poi_schema import extract_one   # from 2.6

with open("extraction_prompt.txt") as f:
    PROMPT = f.read()                 # from 2.5
THRESHOLD = 0.6                       # tune later

def run(input_json_path, output_csv_path):
    with open(input_json_path) as f:
        raw = json.load(f)                             # ingest
    # Overpass exports wrap records in "elements"; a plain list is used as-is
    records = raw.get("elements", [raw]) if isinstance(raw, dict) else raw
    kept, flagged = [], []

    for rec in records:
        try:
            poi = extract_one(json.dumps(rec), PROMPT)   # extract + score
        except Exception as e:                           # bad answer: flag it, keep going
            flagged.append({"error": str(e), "record": rec})
            continue
        if poi.confidence >= THRESHOLD:
            kept.append(poi.model_dump())
        else:
            flagged.append(poi.model_dump())

    with open(output_csv_path, "w", newline="", encoding="utf-8") as f:
        cols = ["name","category","address","lat","lon","confidence"]
        w = csv.DictWriter(f, fieldnames=cols)
        w.writeheader()
        for r in kept:
            w.writerow({k: r.get(k, "") for k in cols})

    print(f"Kept {len(kept)} POIs, flagged {len(flagged)} for review.")

if __name__ == "__main__":
    run("raw_pois.json", "pois.csv")
Alternatives (for later)
  • Agent frameworks like LangChain or LlamaIndex add structure for bigger systems, which is overkill today. A plain script is the right altitude for learning.
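The script above counts flagged records but does not save them, so there is nothing to review afterwards. One possible addition, sketched here with a made-up helper name and file name, writes them as JSON Lines (one JSON object per line), which is easy to open in any editor.

```python
import json

def write_flagged(flagged, path="flagged.jsonl"):
    """Write one JSON object per line so flagged records can be reviewed later."""
    with open(path, "w", encoding="utf-8") as f:
        for item in flagged:
            f.write(json.dumps(item, ensure_ascii=False) + "\n")
    return len(flagged)
```

Calling write_flagged(flagged) just before the final print in run() would be enough. The file name flagged.jsonl is only a suggestion.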

2.8 Golden dataset: your answer key

For the non-tech learner

To know if your pipeline is good, you need known-right answers to compare against. Hand-check ~30–50 places (confirm name/category/location) and save as the "golden" truth. This is the most valuable hour of the day, because quality here decides everything. Save as golden.csv.

id,name,category,address,lat,lon
1,Joe's Coffee,cafe,12 Main St,40.71,-74.00
2,City Pharmacy,pharmacy,5 Oak Ave,40.72,-74.01
3,...

Include a few tricky cases (odd names, duplicates, missing addresses), because they reveal where your pipeline struggles.
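Because the golden set is typed by hand, a quick sanity check can catch slips before they distort your scores. The sketch below assumes the column names from the sample above and the category list from 2.5; check_golden is a made-up helper. Duplicates are reported as a note, not an error, since you may have added some on purpose.

```python
import csv
import io

# Category list copied from the extraction prompt (2.5).
ALLOWED = {"cafe", "restaurant", "shop", "pharmacy", "bank", "hotel", "other"}

def check_golden(csv_text):
    """Return a list of human-readable problems found in golden-set CSV text."""
    problems, seen = [], set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        rid = row.get("id", "?")
        name = (row.get("name") or "").strip()
        if not name:
            problems.append(f"row {rid}: missing name")
        category = (row.get("category") or "").strip()
        if category not in ALLOWED:
            problems.append(f"row {rid}: unknown category {category!r}")
        for field in ("lat", "lon"):
            value = (row.get(field) or "").strip()
            if value:
                try:
                    float(value)
                except ValueError:
                    problems.append(f"row {rid}: {field} is not a number")
        key = name.lower()
        if key and key in seen:
            problems.append(f"row {rid}: duplicate name {name!r} (fine if intentional)")
        seen.add(key)
    return problems
```

To run it on your file, pass in the text: check_golden(open("golden.csv", encoding="utf-8").read()).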

2.9 Eval grader: prove quality in numbers

For the non-tech learner

An "eval" is a repeatable quality test. Four numbers tell the story: Precision (of what I returned, how much was right?), Recall (of what existed, how much did I catch?), F1 (their balance), Accuracy (overall fraction right). Higher is better; 1.0 is perfect.

flowchart LR
  G[Golden dataset<br/>known-correct] --> CMP[Match by name + location]
  P[Pipeline output] --> CMP
  CMP --> M[Precision / Recall<br/>F1 / Accuracy]
  CMP --> CAL[Bucket by confidence<br/>then calibration]
  M --> R[Eval report]
  CAL --> R
      
import pandas as pd

def norm(s):
    # trim and lowercase so "Joe's Coffee " matches "joe's coffee"
    return str(s).strip().lower()

def evaluate(pred_csv, gold_csv):
    pred = pd.read_csv(pred_csv)
    gold = pd.read_csv(gold_csv)
    # compare unique names so a duplicate row is not counted twice
    gold_names = set(norm(n) for n in gold["name"].dropna())
    pred_names = set(norm(n) for n in pred["name"].dropna())

    tp = len(pred_names & gold_names)   # returned and in the golden set
    fp = len(pred_names - gold_names)   # returned but not in the golden set
    fn = len(gold_names - pred_names)   # in the golden set but missed

    precision = tp / (tp + fp) if (tp + fp) else 0
    recall    = tp / (tp + fn) if (tp + fn) else 0
    f1 = 2*precision*recall/(precision+recall) if (precision+recall) else 0

    print(f"Precision: {precision:.2f}")
    print(f"Recall:    {recall:.2f}")
    print(f"F1:        {f1:.2f}")
    return precision, recall, f1

if __name__ == "__main__":
    evaluate("pois.csv", "golden.csv")
Alternatives (for later)
  • Use the Anthropic Console's Evaluation tool to auto-generate extra test cases and grade them; add error bars so you don't over-read small samples.
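To see the arithmetic behind these numbers, here is a small worked sketch on four invented places. It also shows one possible reading of "Accuracy": of the names found in both files, how many got the right category. That definition is an assumption, and the grader in this section does not compute it.

```python
def prf(pred_names, gold_names):
    """Precision, recall and F1 from two collections of normalized names."""
    pred, gold = set(pred_names), set(gold_names)
    tp = len(pred & gold)   # returned and correct
    fp = len(pred - gold)   # returned but wrong
    fn = len(gold - pred)   # missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

def category_accuracy(pred, gold):
    """pred and gold map name -> category; accuracy over names present in both."""
    shared = [name for name in pred if name in gold]
    if not shared:
        return 0.0
    return sum(pred[name] == gold[name] for name in shared) / len(shared)

# Invented data: 3 names match, 1 extra was returned, 1 was missed.
gold = {"joe's coffee": "cafe", "city pharmacy": "pharmacy",
        "oak bank": "bank", "main hotel": "hotel"}
pred = {"joe's coffee": "cafe", "city pharmacy": "shop",
        "oak bank": "bank", "corner deli": "restaurant"}

precision, recall, f1 = prf(pred, gold)
print(precision, recall, f1)
print(category_accuracy(pred, gold))
```

Here tp = 3, fp = 1 and fn = 1, so precision and recall are both 3/4. Two of the three shared names have the right category.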

2.10 Confidence calibration check

For the non-tech learner

Does the AI know what it knows? Group POIs by the confidence it gave, then check how often each group was actually right. If the "0.9" group is right ~90% of the time, it's well-calibrated. Big gaps mean the confidence number can't be trusted yet.

CALIBRATION TABLE  (fill after running the grader per bucket)

Confidence bucket | # POIs | Actually correct | Actual accuracy
0.9 - 1.0         |        |                  |        %
0.7 - 0.9         |        |                  |        %
0.5 - 0.7         |        |                  |        %
below 0.5         |        |                  |        %

Reading it: actual accuracy should roughly MATCH the bucket.
If the 0.9 bucket is only 60% correct -> the model is
overconfident -> raise the THRESHOLD or improve the prompt.
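Filling the table by hand works for 30 places; a few lines of Python do the same for any number. This sketch assumes you already have, for each POI, the confidence the model gave and whether a human judged it correct. The bucket edges mirror the table above and the sample pairs are invented.

```python
# (low, high, label): low is inclusive, high is exclusive.
# The top bucket's upper edge sits just above 1.0 so a confidence of exactly 1.0 is included.
BUCKETS = [
    (0.9, 1.0001, "0.9 - 1.0"),
    (0.7, 0.9, "0.7 - 0.9"),
    (0.5, 0.7, "0.5 - 0.7"),
    (0.0, 0.5, "below 0.5"),
]

def calibration_table(results):
    """results: list of (confidence, is_correct) pairs.
    Returns rows of (label, count, number correct, actual accuracy or None)."""
    rows = []
    for low, high, label in BUCKETS:
        hits = [ok for conf, ok in results if low <= conf < high]
        n = len(hits)
        correct = sum(1 for ok in hits if ok)
        rows.append((label, n, correct, correct / n if n else None))
    return rows

# Invented sample: (confidence the model gave, was it actually right?)
sample = [(0.95, True), (0.92, False), (0.8, True),
          (0.75, True), (0.6, False), (0.3, False)]

for label, n, correct, acc in calibration_table(sample):
    shown = "n/a" if acc is None else f"{acc:.0%}"
    print(f"{label:10} | {n:3} | {correct:3} | {shown}")
```

With only a handful of POIs per bucket the percentages swing a lot, so read them as a rough signal.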

2.11 Model routing: Gemma vs Claude

For the non-tech learner

You don't have to pick one AI forever. A smart pattern: do the bulk locally on Gemma (free, private), and send only the hard, low-confidence cases to Claude (smarter, costs a little). This is your one "agentic" decision: the system choosing its own next step.

flowchart TD
  POI[POI to process] --> LOC[Gemma local<br/>free, private]
  LOC --> CONF{Confidence high?}
  CONF -->|Yes| KEEP[Keep result]
  CONF -->|No| CLA[Send hard case<br/>to Claude API]
  CLA --> KEEP
      
        | Gemma (local)     | Claude (hosted)
Privacy | data stays on Mac | sent to cloud
Cost    | $0 per use        | pay per token
Quality | good              | higher
Speed   | depends on Mac    | fast
Offline | yes               | no
Alternatives (for later)
  • Swap which Claude model you escalate to (Haiku = cheapest/fast, Sonnet = balanced, Opus = strongest). The migration is often just replacing ollama.chat() with the Anthropic SDK call.
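The routing decision itself is only a few lines. In the sketch below the two extractors are stand-in functions so it runs without any model installed. In a real pipeline, local_extract would wrap the Gemma call from 2.6 and cloud_extract would wrap a Claude API call, which is not shown here. All names are hypothetical.

```python
def route(record, local_extract, cloud_extract, threshold=0.6):
    """Try the local model first; escalate only when its confidence is below threshold."""
    result = local_extract(record)
    if result["confidence"] >= threshold:
        return result, "local"
    return cloud_extract(record), "cloud"

# Stand-in extractors for illustration only; they do not call any model.
def fake_local(record):
    confident = bool(record.get("addr"))   # pretend missing addresses are "hard"
    return {"name": record["nm"], "confidence": 0.9 if confident else 0.4}

def fake_cloud(record):
    return {"name": record["nm"], "confidence": 0.85}

easy = {"nm": "Joe's Coffee", "addr": "12 Main St"}
hard = {"nm": "City Pharmacy"}

print(route(easy, fake_local, fake_cloud))
print(route(hard, fake_local, fake_cloud))
```

Passing the extractors in as arguments keeps the rule separate from the models, so swapping which Claude model you escalate to does not touch route().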
/ Pack 3

Reference cards

Keep these open while you work.

Command cheat-sheet

CLAUDE CODE
  claude              start it
  /init               create CLAUDE.md
  /clear              wipe conversation memory (between tasks)
  /cost               see how much you've spent
  claude doctor       check your install

GIT
  git add .           stage changes
  git commit -m "msg" save a snapshot
  git push            upload to GitHub
  git restore .       undo uncommitted changes

WEB APP (Next.js)
  npm run dev         run locally
  npm test            run tests
  npm run build       production build check

OLLAMA / GEMMA
  ollama pull gemma3:12b   download the model
  ollama run gemma3:12b    chat in terminal
  ollama ps                what's loaded + memory
  ollama list              downloaded models

Troubleshooting: first things to try

For the non-tech learner

When stuck, copy the exact error and paste it into Claude Code's debug prompt (1.4-C). 90% of fixes start there.

Symptom                    | First thing to try
"command not found"        | tool isn't installed / restart the terminal
app won't load on localhost | is npm run dev still running? right URL/port?
secret / key errors        | is it in .env AND in the host's settings?
Gemma is very slow         | try gemma3:4b, or close other heavy apps
Gemma returns broken JSON  | set temperature 0, keep "JSON only", retry
deploy build fails         | read the build log top-to-bottom; paste it to Claude
git push rejected          | run git pull first, then push again
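For the "broken JSON" row, the retry can be automated. This is a minimal sketch: ask stands for any function of your own that returns the model's raw text (a hypothetical wrapper, not an Ollama API), and the helper gives up after a few attempts and does not loop forever.

```python
import json

def parse_with_retry(ask, max_attempts=3):
    """Call ask() until it returns text that parses as a JSON object."""
    last_error = None
    for _ in range(max_attempts):
        text = ask()
        try:
            data = json.loads(text)
        except json.JSONDecodeError as e:
            last_error = e          # not JSON at all: try again
            continue
        if isinstance(data, dict):
            return data
        last_error = ValueError("JSON was valid but not an object")
    raise RuntimeError(f"no valid JSON object after {max_attempts} attempts") from last_error
```

With temperature 0 the same prompt tends to give the same answer, so a retry helps most when you also tighten the prompt between attempts.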

Glossary at a glance

Vibe coding
describe in words, AI writes the code, you review & test
LLM
the AI (Claude, Gemma)
IDE
the app you edit code in (VS Code)
Terminal
the text window for typing commands
Git / GitHub
version snapshots / online store + deploy trigger
PRD / MVP
one-page plan / smallest useful version
Production
the live version users visit
Deploy
publish the app to a host
localhost
your own computer as a private server
.env / secret
values like API keys, kept out of code & GitHub
Context window
the AI's working memory; clear it between tasks
Agent / workflow
AI picks its own steps / fixed steps AI fills in
Open / local model
a model you download & run yourself (Gemma)
Quantization (Q4)
shrink a model to fit memory, minor quality loss
POI
a place: name + category + location
Schema
a strict shape for data (every record same fields)
Eval / golden set
a repeatable quality test / hand-verified correct answers
Precision / Recall
right-of-returned / caught-of-existing
Calibration
does "90% sure" mean right 90% of the time?
robots.txt / ToS
bot rules / the contract you accept

Go deeper (official first)

  • Anthropic Academy
  • Claude Code docs
  • Building Effective Agents
  • Prompt engineering tutorial
  • Ollama docs
  • Google Gemma docs
  • GitHub Hello World
  • Vercel docs
  • Overpass Turbo
  • Foursquare OS Places
  • Overture Maps
A note on accuracy

Model names, prices, and free tiers change month to month. Double-check any figures (Claude pricing, Gemma sizes, Mac configs, host tiers) on the official pages before you budget or buy.