AI pair-programming

Recently, I had to design and build a large HTTP API facade for an existing system. Instead of sticking to my usual tools and patterns, I decided to use this as an opportunity to learn new technologies and try different approaches. I also wanted to see how helpful AI coding assistants really are on a real project.

The Inspiration and Setup

I read this interesting post by Harper Reed about his workflow with LLMs for code generation, and I thought, “Why not try something similar?” So, I started by brainstorming and planning my API specification with Claude Sonnet. Then, I moved to the actual coding part, experimenting with different tools along the way.

For this project, I tested several AI coding assistants: GitHub Copilot, Amazon Q, Cursor, Aider, and Cline. I also tried different LLMs: Claude Sonnet and Haiku, OpenAI’s GPT-4o and o3, Google’s Gemini, and local models like Mistral, Gemma 3, Phi-4, and DeepSeek Coder.

Lastly, during the whole process, I frequently used Perplexity and Claude Sonnet for research and answering questions.

The Process

I spent about three weeks working intensively on this project, using these AI tools for 4–6 hours every day. I tried different combinations of models for the planning (ask) and coding (act) steps. Sometimes I would plan with o3 and code with GPT-4o; other times I would use Gemini for both research and implementation. Or I would use Claude Sonnet chat to build a detailed action plan, save it to a Markdown file, and then execute it with Claude Haiku using Aider or Cline.
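For illustration, here is the rough shape of one of those plan files (the task, file names, and steps below are made up for this example; the real plans were longer and task-specific):

```markdown
# Plan: add request logging to the facade (hypothetical example)

## Context
- NestJS HTTP API facade; tests use Vitest, NOT Jest
- Follow the existing module layout under src/

## Steps
1. Create a LoggingInterceptor in src/common/interceptors/
2. Register it globally in AppModule
3. Add unit tests (Vitest) covering success and error paths
4. Update the README section on observability

## Constraints
- Do not change public route signatures
- Keep new dependencies to zero
```

I would then hand a file like this to Aider or Cline as the context for the coding step.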

When I was done, around 40% of my final code was generated by these AI tools. The rest was either written by me from scratch or heavily modified from AI suggestions.

What I Learned

Planning First, Then Coding

I quickly realized that the most valuable way to use these tools was in chat/ask mode, like having a pair-programming partner. I would “talk” to the AI about ideas and challenges, and ask questions about code and libraries. This felt much more natural than just asking it to generate code for me.

The most effective approach was having “design chats” with a reasoning model (like DeepSeek R1): fine-tuning the ideas, correcting the model’s mistakes, and making it follow my project’s conventions. Then, after reviewing the plan, I would have it “code” the result. This allowed me to catch many discrepancies and bugs early on.

For example, I decided to use Vitest instead of Jest (which is the default test runner for NestJS). All the AI models constantly forgot this and kept generating code with Jest. I had to remind them repeatedly: “This project uses Vitest, not Jest. Pledge to follow the project’s conventions!”
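Swapping Jest for Vitest in a NestJS project also takes a little configuration, because NestJS relies on decorator metadata that Vitest’s default esbuild transform does not emit. A minimal sketch of the idea, using the SWC plugin as the transformer (the exact plugin choice and paths here are assumptions, not my verbatim config):

```typescript
// vitest.config.ts — minimal sketch for a NestJS project on Vitest
import swc from 'unplugin-swc';
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,                      // expose describe/it/expect, as Jest does
    include: ['src/**/*.spec.ts'],      // where the unit tests live (assumed layout)
    root: './',
  },
  plugins: [
    // SWC emits the decorator metadata NestJS needs;
    // esbuild (Vitest's default transformer) does not.
    swc.vite({ module: { type: 'es6' } }),
  ],
});
```

Even with the convention written down in a config file like this, the models still needed the “Vitest, not Jest” reminder in nearly every session.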

Tools and Models

The best results came from using DeepSeek R1 for planning, and then Claude Sonnet 3.7 for the actual coding.

As for local models, they are much slower than online models unless you have a powerful GPU like an RTX 4070 Ti Super. I tried running them on my M1 Pro with 32 GB of RAM; while it could handle models up to 14B parameters, it was quite slow. The smaller 7B and 4B models were often not accurate enough.

I found that code-completion tools like GitHub Copilot or Amazon Q weren’t as helpful as I expected. I ended up using Aider and Cline much more because they could understand the context and codebase better. They also felt more natural to use.

However, using Cline is costly! If you’re not careful, it can easily burn $5–10 on a single task. Aider was more cost-effective for me.

The “Vibe Coding” Trap

I recently read an article about “vibe coding,” and I can confirm it is a thing. Using AI tools can easily make you lazy and sloppy, because it feels like the AI is doing the work for you. This is dangerous: in my case it led to bugs and changes that were difficult to track down later, some of which significantly diverged from the initial design and the task I had given the model.

Complexity Matters

This is very important: the more complex and niche the problem, the more knowledge and experience YOU need as a developer. Using AI to solve problems where you have little-to-zero knowledge is a significant risk. You must always understand what the code does and why.

For example, I had an e2e test failing because of an issue with dependency resolution in NestJS. This turned out to be impossible for any AI to solve properly. I burned dozens of dollars on tokens, and every LLM fell into a loop of trying various approaches, eventually just rewriting the e2e test entirely, mocking and substituting everything until the test no longer served its purpose. And the AI happily explained that “pretending to test is the correct solution”! What a joke. I later traced the issue manually with a debugger and checked GitHub issues for reference. It took me about two hours to understand, identify, and fix the problem, and even at that pace I was more effective than the LLMs!

Where AI Shines

AI coding tools worked best for me when:

  • I had established strong project conventions and guidelines,
  • There was a sufficient amount of existing code to reference,
  • I needed help with mundane tasks like refactoring modules, generating test cases, writing boilerplate code, or updating documentation.

Good reasoning models were helpful for designing change plans and researching ideas, but only when I used good prompts and provided clear requirements. Claude Sonnet makes a good pair-programming partner to challenge ideas, especially when asked to generate questions and identify edge cases.

Also, Perplexity Pro has become my default tool for research — it’s just that good.

Final Thoughts

Overall, I’m mildly impressed with the current AI coding tools. They’re helpful but not magic. They work best as assistants rather than replacements. I’ll continue using these tools in moderation for future projects, but with realistic expectations about what they can and cannot do.

The biggest value I found was not in generating complete code but in having these tools as thinking partners, helping me explore options, navigate documentation, and catch issues I might have missed. Just remember that YOU are still the developer — the AI is just another tool.

This post is licensed under CC BY 4.0 by the author.