Coding With AI? My Top 5 Tips For Vetting Its Output - And Staying Out Of Trouble

3 days ago

Our story begins, as many stories do, with a man and his AI. The man, like many men, is a bit of a geek and a bit of a programmer. He also needs a haircut.

The AI is the culmination of thousands of years of human advancement, all put to the service of making the man's life a little easier. The man, of course, is me. I'm that guy.

Unfortunately, while AI can be incredibly brilliant, it also has a propensity to lie, mislead, and make shockingly stupid mistakes. It is the stupid part that we will be discussing in this article.

Anecdotal evidence does have value. My reports on how I've solved some problems quickly with AI are real. The programs I used AI to write are still in use. I have used AI to help speed up aspects of my programming flow, especially when I focus on the sweet spots where I'm less productive and the AI is quite knowledgeable, like writing functions that call publicly published APIs.

You know how we got here. Generative AI burst onto the scene at the cusp of 2023 and has been blasting its way into knowledge work ever since.

One area, as the story goes, where AI truly shines is its ability to write code and help manage IT systems. Those claims are not untrue. I have shown, several times, how AI has solved coding and systems engineering problems I have personally experienced.

AI coding in the real world: What science reveals

New tools always come with big promises. But do they deliver in real-world settings?

Most of my reporting on programming effectiveness has been based on personal anecdotal evidence: my own programming experiences using AI. But I'm one guy. I have limited time to devote to programming and, like every programmer, I have certain areas where I spend most of my coding time.

Recently, though, a nonprofit research organization called METR (Model Evaluation & Threat Research) did a more thorough analysis of AI coding productivity.

Their methodology seems sound. They worked with 16 experienced open-source developers who have actively contributed to large, popular repositories. The METR analysts provided those developers with 246 issues from the repositories that needed fixing. The coders were given about half the issues where they had to work on their own, and about half where they could use an AI for help.

The results were striking and unexpected. While the developers themselves estimated that AI assistance increased their productivity by an average of 24%, METR's analysis showed instead that AI assistance slowed them down by an average of 19%.
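To make the size of that gap concrete, here is a minimal arithmetic sketch. It assumes that "24% more productive" means a task takes 24% less time and that "19% slower" means a task takes 19% longer. The 100-minute baseline and the function name are hypothetical, chosen only to make the numbers easy to read.

```python
def minutes_with_change(baseline_minutes, percent_change):
    """Scale a task time by a signed percentage (negative = faster, positive = slower)."""
    return baseline_minutes * (100 + percent_change) / 100


BASELINE = 100  # hypothetical task length in minutes, not a figure from the study

expected = minutes_with_change(BASELINE, -24)  # what the developers believed
measured = minutes_with_change(BASELINE, 19)   # what METR measured

print(f"expected with AI: {expected:.0f} min")
print(f"measured with AI: {measured:.0f} min")
print(f"perception gap:   {measured - expected:.0f} min")
```

Under those assumptions, a task the developers expected to finish in 76 minutes actually took 119, a 43-minute gap between perception and measurement on every 100 minutes of baseline work.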

That's a bit of a head-scratcher. METR put together a list of factors that might explain the slowdown, including over-optimism about AI usefulness, high developer familiarity with their repositories (and less AI knowledge), the complexity of large repositories, lack of AI reliability, and an ongoing problem where the AI refuses to use "important tacit knowledge or context."

I would suggest that two other factors might have limited effectiveness:

Choice of problem: The developers were told which issues they had to use AI help on and which issues they couldn't. My experience suggests experienced developers must choose where to use AI based on the problem that needs to be solved. In my case, for example, getting the AI to write a regular expression (something I don't like doing and I'm fairly crappy at) would save me a lot more time than getting the AI to modify unique code I've already written, work on regularly, and know inside and out.

Choice of AI: According to the report, the developers used Cursor, an AI-centric fork of VS Code, which used Claude 3.5/3.7 Sonnet at the time. When I tested 3.5 Sonnet, the results were terrible, with Sonnet failing three out of four of my tests. Subsequently, my tests of Claude 4 Sonnet were considerably better. METR reported that developers rejected more than 65% of the code the AI generated. That's going to take time.

That time when ChatGPT suggested nuking my system

METR's results are interesting. AI is clearly a double-edged sword when it comes to coding help. But there's also no doubt that AI can provide considerable value to coders. If anything, I think this test once again proves the contention that AI is a great tool for experienced programmers, but a potential high-risk resource for newbies.

Let's look at a concrete example, one that could have cost me a lot of time and trouble if I followed ChatGPT's advice.

I was setting up a Docker container on my home lab using Portainer (a tool that helps manage Docker containers). For some reason, Portainer would not enable the Deploy button to create the container.

It had been a long day, so I didn't see the obvious problem. Instead, I asked ChatGPT. I fed ChatGPT screenshots of the configuration, as well as my Docker configuration file.

ChatGPT recommended that I uninstall and reinstall Portainer. It also suggested I remove Docker from the Linux distro and use the package manager to reinstall it. These actions would have had the effect of killing all my containers.

Of note, ChatGPT didn't recommend or ask if I had backups of the containers. It just gave me the command line sequences it recommended I cut and paste to delete and rebuild Portainer and Docker. It was a wildly destructive and irresponsible recommendation.

The irony is that ChatGPT never figured out why Portainer wouldn't let me deploy the new container, but I did. It turns out I never filled out the container's name field. That's it.

Because I'm fairly experienced, I hesitated when ChatGPT told me to nuke my installation. However, someone relying on the AI for advice could have potentially brought down an entire server for want of typing in a container name.
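One way to build that hesitation into a workflow is to scan AI-suggested commands for destructive operations before pasting anything into a terminal. The sketch below is only an illustration: the pattern list, the function name, and the sample commands are all hypothetical (they are not the commands ChatGPT produced), and a simple pattern match like this will miss plenty of dangerous commands.

```python
import re

# Hypothetical patterns for commands that delete data or packages.
# This list is an assumption for illustration; extend it for your own systems.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-[a-zA-Z]*r",                           # recursive rm, e.g. rm -rf
    r"\bapt(-get)?\s+(remove|purge)\b",              # package removal
    r"\bdocker\s+(rm|system\s+prune|volume\s+rm)\b",  # container/volume deletion
]


def flag_destructive(commands):
    """Return the commands that match any destructive pattern, in order."""
    flagged = []
    for command in commands:
        if any(re.search(pattern, command) for pattern in DESTRUCTIVE_PATTERNS):
            flagged.append(command)
    return flagged


# Made-up example of an AI-suggested "fix" to review before running.
suggested = [
    "docker ps -a",
    "docker rm -f portainer",
    "sudo apt-get purge docker-ce",
    "rm -rf /var/lib/docker",
]

for command in flag_destructive(suggested):
    print(f"STOP and check backups before running: {command}")
```

A filter like this is a speed bump, not a safety net. Its job is to make you pause and ask whether you have backups and whether the destructive step is needed at all.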

Overconfident and underinformed AIs: A dangerous combo

I've also experienced the AI going completely off the rails. I've experienced it giving advice that was not only completely useless, but also presented with the apparent confidence of an expert.

If you're going to use AI tools to support your development or IT work, these tips might keep you out of trouble:

If there's not much publicly available information, the AI can't help. But the AI will make stuff up based on what little it knows, without admitting that it is lacking experience.
Like my dog, once the AI gets fixated on one thing, it often refuses to look at alternatives. If the AI is stuck on one approach, don't make the mistake of believing that its polite recommendations about a new approach are real. It's still going down the same rabbit hole. Start a new session.
If you don't know a lot, don't rely on the AI. Keep up your learning. Experienced devs can tell the difference between what will work and what won't. But if you're trying to put all the coding on the back of the AI, you won't know when or where it goes wrong or how to fix it.
Coders often use specific tools for specific tasks. A website might be built using Python, CSS, HTML, JavaScript, Flask, and Jinja. You choose each tool because you know what it does well. Choose your AI tools the same way. For example, I don't use AI for business logic, but I gain productivity using AI to write API calls and code that draws on public knowledge, where it can save me a lot of time.
Test everything an AI produces. Everything. Line by individual line. The AI can save a ton of time, but it can also make enormous mistakes. Yes, taking the time and energy to test by hand can help prevent errors. If the AI offers to write unit tests, let it. But test the tests.
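As a small illustration of that last tip, suppose an AI produced the regular-expression helper below. The function, its name, and the version-string task are all hypothetical. The point is the shape of the checks: cover the happy path, then deliberately probe inputs where a plausible-looking regex could go wrong.

```python
import re

# Hypothetical AI-generated helper: pull a three-part version like "2.10.3" out of text.
VERSION_RE = re.compile(r"\b(\d+)\.(\d+)\.(\d+)\b")


def extract_version(text):
    """Return (major, minor, patch) as ints, or None if no version is found."""
    match = VERSION_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def run_checks():
    """Hand-written checks: the happy path plus the edge cases an AI might skip."""
    # Happy path: a clean version string inside a sentence.
    assert extract_version("release 2.10.3 is out") == (2, 10, 3)
    # No version at all should give None, not an exception.
    assert extract_version("no version here") is None
    # A two-part version must not be mistaken for a three-part one.
    assert extract_version("v1.2") is None
    # Empty input is a classic source of surprises.
    assert extract_version("") is None
    return "all checks passed"


print(run_checks())
```

If the AI also wrote the tests, one way to test them is to break the function on purpose (for example, delete the last group from the pattern) and confirm that at least one check fails. A test suite that still passes against broken code isn't testing anything.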

Based on your experience level, here's how I recommend you think about AI assistance:

If you know nothing about a subject or skill: AI can help you pass as if you do, but it could be amazingly wrong, and you might not know.
If you're an expert in a subject or skill: AI can help, but it will piss you off. Your expertise gets used not only to separate the AI-stupid from the AI-useful, but to carefully craft a path where AI can really help.
If you're in between: AI is a mixed bag. It could help you or get you in trouble. Don't delegate your skill-building to the AI because it could leave you behind.

Generative AI can be an excellent helper for experienced developers and IT pros, especially when used for targeted, well-understood tasks. But its confidence can be deceptive and dangerous.

AI can be useful, but always double-check its work.

Have you used AI tools like ChatGPT or Claude to help with your development or IT work? Did they speed things up, or almost blow things up? Are you more confident or more cautious when using AI on critical systems? Have you found specific use cases where AI really shines, or where it fails hilariously? Let us know in the comments below.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.