Want Local Vibe Coding? This Ai Stack Replaces Claude Code And Codex - And It's Free

Trending 4 hours ago
Want section vibe coding? This AI stack replaces Claude Code and Codex - and it's wholly free
Elyse Betters Picaro / ZDNET

Follow ZDNET: Add america arsenic a preferred source on Google.


ZDNET's cardinal takeaways

  • Goose acts arsenic nan supplier that plans, iterates, and applies changes.
  • Ollama is nan section runtime that hosts nan model.
  • Qwen3-coder is nan coding-focused LLM that generates results.

If you've been programming for immoderate number of years, you've beautiful overmuch lived done a bunch of hype cycles. Whether it's a caller improvement environment, a caller language, a caller plugin, aliases immoderate caller online work pinch an oh-so-powerful time-saving API, it's each "revolutionary" and "world-changing," astatine slightest according to nan PR reps hawking The Big New Thing.

And past there's agentic AI coding. When a instrumentality tin thief you do four years of merchandise improvement successful 4 days, nan effect is world-changing. While vibe coding has its detractors (for bully reason), AI coding agents for illustration OpenAI's Codex and Claude Code really are revolutionary. They are radically transforming nan package industry.

Also: I tried a Claude Code replacement that's local, unfastened source, and wholly free - really it works

In my testing, I wished you tin get a fewer hours of agentic coding done present and location pinch nan $20/month plans from nan AI companies. But if you're going to put successful afloat days of coding, you'll request to upgrade to $100 aliases $200/month plans. Otherwise, you'll consequence getting put connected clasp until your token allocation resets.

While some OpenAI and Anthropic person many times said they respect nan privateness of codification bases, nan truth is that some are doing their activity connected unreality infrastructure. That effort has an inherent information risk. Using these technologies mightiness besides break agreements based connected really you negociate your root codification aliases moreover wherever your activity is done.

Recently, however, a imaginable solution to these challenges has been released. By combining 3 abstracted tools, it whitethorn beryllium imaginable to switch pricy cloud-based coding platforms pinch a free AI supplier that runs connected your section computer.

Also: I've tested free vs. paid AI coding devices - here's which 1 I'd really use

In my erstwhile article, I showed you how to group up this situation and did immoderate basal testing. I was capable to corroborate that this setup tin tally agentic coding (although I only gave it a elemental problem, and it did person immoderate challenges).

In this article, I'm going to return you done nan 3 devices (Goose, Ollama, and Qwen3-coder) and explicate what each contributes to nan wide solution.

Then, successful a follow-on article, I'll effort to usage this strategy to build a large project, extending my Claude Coded iPhone, Mac, and Apple Watch app to nan iPad. Instead of utilizing Claude Code for nan project, I'm going to spot if these 3 batches of bits tin do nan full point connected my Mac, and for free.

Qwen3: The coding LLM

Let's commencement pinch Qwen3-coder, nan coding-specific ample connection model. I picked Qwen because of Jack Dorsey's station connected X, saying "goose + qwen3-coder = wow", and besides because ZDNET's Jack Wallen recommended it to maine erstwhile I asked astir downloadable coding models.

Also: Stop utilizing ChatGPT for everything: My go-to AI models for research, coding, and much (and which I avoid)

That's an rumor I want to reinforce. We cognize models for illustration OpenAI's GPT-5.2-codex and Anthropic's Opus-4.5 are awesome astatine coding, but they're cloud-based and travel pinch a fee. We're looking astatine Qwen3-coder because it is free and downloadable.

Let's talk astir what a ample connection exemplary is. Think astir ChatGPT. When you usage it, you tin take a exemplary (or, pinch nan free version, a exemplary is usually chosen for you). The interface, aliases nan chatbot, is simply a abstracted portion of package from nan model.

If we were to usage a car analogy, nan exemplary is nan engine, and nan chatbot is nan rider compartment pinch nan steering instrumentality and dashboard.

Qwen3-coder is simply a specialized type of nan Qwen3 LLM from Alibaba. It's nan portion of package that really writes nan code. This exemplary generates codification from prompts and understands programming languages, frameworks, and patterns. It tin refactor codification (make code-wide changes), tally diffs (compare code), create codification explanations, and hole code.

Also: Xcode 26.3 yet brings agentic coding to Apple's developer tools

The coding exemplary is incapable of managing multi-step workflows. It doesn't cognize erstwhile to extremity moving connected a problem aliases erstwhile to iterate connected a problem. The exemplary besides has nary representation of thing beyond nan presently moving context.

Ollama: The exemplary runtime

Ollama is nan section exemplary runtime and serving layer. Models don't tally connected their own. Using a database arsenic an analogy, a exemplary is for illustration nan database itself, a postulation of information. In nan lawsuit of a model, it's a elephantine repository of knowledge.

Ollama is for illustration nan database engine. The main quality betwixt a database and a database motor is that a database motor inserts and extracts information from nan existent database. Ollama only extracts accusation from nan ample connection model, truthful it's much of a runtime (a strategy that runs thing antecedently built by different system) than a afloat engine.

Ollama is nan infrastructure that really runs ample connection models connected your instrumentality and makes them disposable to different processes via a section API. It downloads, installs, and manages section LLMs. It runs conclusion processes connected your hardware (CPU aliases GPU). It makes nan models disposable to different processes done a accordant API endpoint. It besides handles exemplary switching, versioning, and assets control.

Also: Is ChatGPT Plus still worthy your $20? I compared it to nan Free, Go, and Pro plans - here's my advice

On nan different hand, Ollama does not understand your task goals. It does not negociate conversations aliases tasks.

There's 1 different point to note. Ollama itself isn't a specialized coding tool. It only knows coding if nan LLM it's presently moving knows coding.

Because it accepts API calls for LLM access, Ollama is thing of an AI server, sitting betwixt nan LLM and nan chatbot interface.

Goose: The coding manager

Goose is fundamentally nan supplier portion of nan puzzle, providing orchestration for nan different main components. It's nan portion that understands intent, manages tasks, and decides what to inquire nan exemplary to do next.

Goose interprets your programming prompts. If you for illustration nan thought of vibe coding, Goose decodes nan vibes you springiness it and breaks activity into steps related to analysis, planning, codification generation, and testing. It's nan portion of nan strategy that maintains nan conversational and task discourse crossed iterations.

Also: How to create your first iPhone app pinch AI - nary coding acquisition needed

In performance pinch nan quality guiding it, Goose decides whether a alteration merits a module aliases artifact rewrite, and whether codification tin conscionable beryllium modified. It besides handles workflow commands for illustration "scan nan repo, propose changes, use diffs."

Goose doesn't make codification itself. It doesn't tally nan models straight (although it talks to them). And it doesn't cognize thing astir codification syntax unless nan exemplary it's utilizing helps out.

Goose is fundamentally nan head and task head of nan vibe coding process.

A emblematic workflow

So, let's look astatine really each 3 components activity together to alteration you to make code:

  • The quality provides a punctual describing a programming goal.
  • Goose interprets that extremity and decides what to do.
  • Goose sends a precise coding punctual to Ollama.
  • Ollama runs Gwen3-coder locally connected your computer.
  • Gwen3-coder returns codification aliases analysis.
  • Goose decides whether to use it, refine it, aliases inquire again.

This workflow exemplary is why vibe coding feels fluid. You tin enactment absurd and intuitive while nan strategy translates your prompts into tangible codification changes.

Also: I utilized Claude Code to vibe codification a Mac app successful 8 hours, but it was much activity than magic

While this attack useful really good for these 3 tools, different agentic coding environments for illustration Claude Code aliases OpenAI Codex person their ain operation of nan coding LLM, nan exemplary runtime, and nan programming manager. They're conscionable each moving down nan front-end interface that nan coding products coming to their developer users.

In position of nan 3 devices we're talking astir here, this architecture provides a batch of elasticity and control. For example, you tin switch retired nan Gwen3-coder LLM for different coding exemplary without changing Goose. You tin update aliases optimize Ollama without rubbing your workflows. Over time, Goose whitethorn germinate into a smarter supplier without retraining models. Plus, everything is local, inspectable (I deliberation that's a word), and modular.

Your package engineering section successful a container

Here's a nosy measurement to deliberation astir this approach. Once you group up Goose, Ollama, and Qwen3-coder connected your section machine, you efficaciously person a package engineering section successful a box. Goose is nan elder technologist guiding nan session. Ollama is nan infrastructure technologist who manages your computing environment. Qwen3-coder is simply a fast, talented inferior developer who's penning code.

What astir you? Have you tried local, agent-based coding devices for illustration Goose pinch Ollama and a downloadable coding model? Or, are you still relying connected cloud-based services for illustration Claude Code aliases Codex?

Does nan thought of keeping your codification and prompts wholly connected your ain instrumentality entreaty to you, aliases do you spot trade-offs that would make this attack impractical for your work? How do you consciousness astir mixing and matching components, specified arsenic swapping models aliases runtimes, alternatively of utilizing an all-in-one coding platform? Let america cognize successful nan comments below.


You tin travel my day-to-day task updates connected societal media. Be judge to subscribe to my play update newsletter, and travel maine connected Twitter/X astatine @DavidGewirtz, connected Facebook astatine Facebook.com/DavidGewirtz, connected Instagram astatine Instagram.com/DavidGewirtz, connected Bluesky astatine @DavidGewirtz.com, and connected YouTube astatine YouTube.com/DavidGewirtzTV.

More