---
title: "2025 GenAI Recap"
date: 2026-01-03
draft: false
---


  • From “sometimes works” to “mostly works”: models of every type have made a qualitative leap. Reasoning capabilities, multimodal speech-to-speech models (including real-time translation), mixture-of-experts architectures, and new optimization techniques have emerged.
  • The boom of new modalities: explosive growth in image and video generation quality. Neural networks have learned to “see” and to generate video that respects the physics of the real world. Artifacts like a background moving the wrong way or doors behaving oddly still occur, but they are becoming the exception.
  • LLM problems remain the same: hallucinations, probabilistic nature (non-determinism), lack of “common sense,” and high cost.
  • There is no single best model, and there likely won’t be: models are becoming more numerous, and each is good at something specific. Sometimes it’s more effective to chain 2–3 requests to a cheap model than to make one request to a flagship.
  • Fast vs. Smart: with the advent of reasoning models, a clear division into “smart but slow” and “fast” has solidified. Competition among fast models is lower for now, as the industry is focused on top-tier benchmarks.
  • From “prompt magic” to context engineering: a shift to an engineering approach has occurred — RAG, tools, MCP, and other techniques for shaping context for specific tasks.
  • Every company is building agents: many (if not most) are still in the stage of experimentation and infrastructure creation (processes, tools, and data).
  • Humans are still more efficient: copilots remain more reliable than fully autonomous agents. Even in programming, with specialized models, “vibe coding” rarely goes beyond PoCs and small utilities.
  • Few autonomous agents in production: real results are visible only where tasks are either simple (first-line support) or natural for LLMs (text analysis and generation).
  • Multi-agent systems are still exotic: orchestrated chains of specialized agents are almost absent from mass-market products, even though such an architecture seems more promising than a single universal agent.
  • AI democratization and economics: the decrease in relative inference costs (thanks to DeepSeek and open-source models) and the development of local solutions are gradually breaking the monopoly of cloud giants.
  • Towards a single assistant: major players are beginning to combine disparate tools into universal assistants.
  • Standardization: protocols like MCP, A2A, and A2UI are appearing. The OpenAI Chat Completions API has become the de facto standard for interaction.
  • On-device generation: not mainstream yet, but the trend is set (Windows Copilot+, Apple Intelligence). In browsers, Web Neural Network API technology is still in draft status.
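The “chain 2–3 cheap requests” idea above can be sketched as follows. This is a minimal sketch, not a recommended pipeline: `call_model`, the model name, and the prompts are all illustrative assumptions, and the function is stubbed so the example runs offline; a real version would call any Chat Completions-compatible client.

```python
# Sketch: several chained requests to a cheap model instead of one
# flagship call. `call_model` is a hypothetical stand-in for an LLM
# client; here it is stubbed so the example runs offline.

def call_model(model: str, prompt: str) -> str:
    """Stub for an LLM call; a real version would hit an inference API."""
    return f"[{model}] answer to: {prompt[:40]}"

def answer_with_cheap_chain(question: str) -> str:
    # Step 1: ask the cheap model to extract the key facts.
    facts = call_model("cheap-model", f"List the key facts needed to answer: {question}")
    # Step 2: ask it to draft an answer from those facts.
    draft = call_model("cheap-model", f"Using these facts:\n{facts}\nDraft an answer to: {question}")
    # Step 3: ask it to review and tighten the draft.
    return call_model("cheap-model", f"Review and correct this draft:\n{draft}")

print(answer_with_cheap_chain("Why did inference costs fall in 2025?"))
```

Each step is simple enough for a cheap model, and intermediate outputs can be logged and inspected, which is harder with one opaque flagship call.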
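The shift to context engineering mentioned above (RAG in particular) can be shown with a toy sketch. This is only the shape of the technique under simplifying assumptions: real systems use embeddings and vector search, while here retrieval is plain word overlap and the documents are made up.

```python
# Minimal RAG sketch: pick the most relevant snippets by word overlap
# and prepend them to the prompt as context. Real systems use
# embeddings and a vector store; this only illustrates the shape.

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank docs by the number of words shared with the query."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Shape the context for the model instead of hoping for prompt magic."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "MCP standardizes how tools are exposed to models.",
    "Video models still struggle with physics artifacts.",
    "Inference costs fell thanks to open-source models.",
]
print(build_prompt("How are tools exposed to models?", docs))
```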
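Since the Chat Completions API is named as the de facto standard, here is a sketch of the wire format many providers now mirror. No network call is made; the model name is a placeholder, and the code only builds a request body and reads a response of the documented shape.

```python
import json

# Sketch of the OpenAI Chat Completions wire format that many
# providers now expose. We only build the request body and parse a
# response dict of the documented shape; nothing is sent anywhere.

def build_request(model: str, user_message: str) -> dict:
    """Assemble a Chat Completions request body."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }

def extract_answer(response: dict) -> str:
    """The reply text lives in choices[0].message.content."""
    return response["choices"][0]["message"]["content"]

body = build_request("any-compatible-model", "Summarize 2025 in one line.")
print(json.dumps(body, indent=2))

sample_response = {
    "choices": [{"message": {"role": "assistant", "content": "A year of agents."}}]
}
print(extract_answer(sample_response))
```

Because so many vendors accept this exact shape, switching providers is often just a matter of changing the base URL and the model string.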