My vibe-coding experiment can be considered finished. The goal was to understand how viable this development method is, how effective it is, and where its limits are.

As the task I chose building an iOS app: first, I hadn’t touched iOS at all in the last ten years, and second, I’d been missing it. The point of the app is to automate routine operations with LLMs: in my case, fixing my clumsy texts in English and Ukrainian, translating into English and Bulgarian, and summarizing web pages. Building something like this for Raycast took me one evening, so the real problem was specifically iOS development. That part was supposed to be handled entirely by the LLM; my job was to help it as little as possible.
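
To give a sense of what each of those operations amounts to, here is a minimal Swift sketch of the core call, assuming the standard OpenAI chat completions endpoint. The function name, model, and prompt are illustrative placeholders, not what the app actually uses:

```swift
import Foundation

// Minimal sketch of one "routine operation": send an instruction plus the
// user's text to the OpenAI chat completions API and return the reply.
// Model name and prompts are placeholders, not the app's real ones.
func runOperation(instruction: String, text: String, apiKey: String) async throws -> String {
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": "gpt-4o-mini",
        "messages": [
            ["role": "system", "content": instruction], // e.g. "Fix the grammar, keep the meaning"
            ["role": "user", "content": text]
        ]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    let (data, _) = try await URLSession.shared.data(for: request)
    // Dig choices[0].message.content out of the response JSON.
    let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
    let choices = json?["choices"] as? [[String: Any]]
    let message = choices?.first?["message"] as? [String: Any]
    return message?["content"] as? String ?? ""
}
```

Everything else is essentially UI and plumbing around calls like this one.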

The result: the app was built in 60 hours. It works and does what it’s supposed to. That’s 6 screens, a custom look, and about 6,500 non-comment lines (ncloc) of Swift. It has roughly a dozen unresolved issues, but they’re livable. It’s uploaded to TestFlight, from where I can distribute it. And it most likely won’t pass App Store review, because it doesn’t fit the guidelines 🙂

Takeaways:

  • I definitely couldn’t have built an app like this on my own in 60 hours; there would have been too much to learn along the way.
  • I lost about 15 hours fighting xcodegen and building tooling around it, plus swiftlint, swiftformat, and so on. If I’d taken tuist right away, I most likely wouldn’t have lost a couple of hours on elementary problems like “the app isn’t in the share dialog because of wrong nesting in the YAML” (see the YAML sketch after this list) or “the app icon doesn’t appear because of a typo in the name”.
  • On the other hand, if I hadn’t tried to build an environment where everything can be done via the CLI, I’d still be clicking buttons in Xcode following the LLM’s instructions.
  • Hence the importance of building the fastest possible change/test loop for the LLM, via CLI or maybe MCP. The faster the loop and the better the error messages, the faster development goes.
  • Familiarity with the platform’s basic concepts would likely have sped up development significantly by avoiding errors and “workarounds”.
  • The model absolutely can’t generalize from existing code. An experienced programmer usually senses where to put an abstraction for future changes (the so‑called “technical gap for future hacks”). In my base instructions I required it to propose two ways to improve the code at the end of each task, and not once did it suggest anything useful at the level of the overall architecture. Because of this, I sometimes had to explicitly ask for an architecture review, and those did surface genuinely useful tips.
  • You need to constantly refine the instructions, adding frequent mistakes, new requirements, and so on. Even that doesn’t help in long sessions, when the model starts “forgetting” the beginning. For example, it constantly tried to add an import for the Common module, even though that code was visible without any import.
  • You need to constantly keep the project in a healthy state with respect to the linter and tests.
  • Sometimes it’s easier to start the task over than to try to figure out what’s wrong in the current solution.
  • Image support in Cursor is an amazing feature for explaining problems; you can solve even alignment and other GUI issues with it. You take a screenshot, circle the problem spot, and upload it to the IDE. If the LLM doesn’t get confused about what the padding consists of, it can even fix it 😉 (see the SwiftUI sketch after this list)
  • Next time, if I decide to develop from scratch on an unfamiliar stack, I’ll go with:
    • very careful tooling choice with maximum validation and clear error messages
    • as detailed as possible specs that are constantly updated (maybe it’s worth ditching architecture.md and implementation.md from the memory bank)
    • a short and clear change/test loop for the model with understandable error messages
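
As for the xcodegen YAML nesting mentioned above: a share extension only shows up in the system share sheet if its NSExtension keys land under info.properties of the extension target, and one wrong indentation level breaks that silently. Here is a hypothetical project.yml fragment with roughly correct nesting; the target names and paths are made up, not my actual config:

```yaml
# Hypothetical xcodegen project.yml fragment; names and paths are made up.
name: MyApp
targets:
  MyApp:
    type: application
    platform: iOS
    sources: [Sources/App]
    dependencies:
      - target: ShareExtension      # embeds the extension into the app
  ShareExtension:
    type: app-extension
    platform: iOS
    sources: [Sources/ShareExtension]
    info:
      path: ShareExtension/Info.plist
      properties:
        # Mis-nest this block by one level and the app silently
        # disappears from the share sheet.
        NSExtension:
          NSExtensionPointIdentifier: com.apple.share-services
          NSExtensionPrincipalClass: ShareViewController
```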
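
And on the padding point: in SwiftUI the visible gap is often the sum of several layers, which is exactly what trips the model up. A hypothetical fragment (not the app’s real view) showing three separate sources of spacing:

```swift
import SwiftUI

// Hypothetical view illustrating "what the padding consists of":
// three separate layers each contribute to the visible gap.
struct ResultCard: View {
    var body: some View {
        VStack(alignment: .leading, spacing: 8) {  // 1) stack spacing
            Text("Translation")
                .font(.headline)
            Text("Result goes here")
                .padding(.top, 4)                   // 2) per-element padding
        }
        .padding(16)                                // 3) container padding
        // A screenshot with the gap circled lets the model see which of
        // these three layers is actually responsible for a misalignment.
    }
}
```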

But still, the app is written and:

  • works with the specified OpenAI key
  • translates, simplifies, summarizes, fixes errors
  • can translate into Klingon and Sindarin