Instruction Nuances

Codex Instructions Example

The 4282nd example of why context engineering is still difficult and not always reliable, even for companies building cutting-edge models.

I needed to translate 50 files into three languages. There was no requirement for perfect style matching or other complications, so any reasonably modern model should handle it. Codex includes free usage limits for users with a ChatGPT subscription, so why not use it?

The problem is that, despite the direct instruction “do not use scripts,” GPT-5.2 kept trying to write one of three things: a script to traverse directories, a script to translate via an external model, or a shell one-liner starting with '/bin/zsh -lc "cat <<'"'"'EOF'"'"' > index.en.md just to save a file.

It would seem that writing a script to save a file when you already have a tool for writing files is obvious nonsense. But it is obvious to us, not to the model. Its system prompt in Codex clearly states:

Try to use apply_patch for single file edits, but it is fine to explore other options to make the edit if it does not work well. Do not use apply_patch for changes that are auto-generated (i.e. generating package.json or running a lint or format command like gofmt) or when scripting is more efficient (such as search and replacing a string across a codebase).

As a result, the model reaches for a script in anything that resembles a bulk operation, even where a script makes absolutely no sense.
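
For readers puzzled by the '"'"' sequences in the one-liner quoted earlier: they are just how a log escapes a single quote inside a single-quoted string. The sketch below unpacks that quoting with Python's standard shlex module. The command in the log was truncated, so the heredoc body ("Hello") and the closing quotes here are assumptions added purely for illustration, and unwrap is a hypothetical helper name.

```python
import shlex

# Hypothetical completion of the truncated command from the log.
# The heredoc body ("Hello") and the closing quotes are assumptions.
logged = (
    "'/bin/zsh -lc \"cat <<'\"'\"'EOF'\"'\"' > index.en.md\n"
    "Hello\n"
    "EOF\"'"
)


def unwrap(logged_command: str) -> list[str]:
    """Strip the log's outer single quoting, then split into argv."""
    (inner,) = shlex.split(logged_command)  # outer layer is one quoted token
    return shlex.split(inner)               # inner layer is the real command


argv = unwrap(logged)
print(argv[0], argv[1])  # the shell and its flag
print(argv[2])           # a plain heredoc that writes index.en.md
```

Once unwrapped, the third argument is nothing more than a heredoc redirected into index.en.md, which is the same result a single file-write tool call would produce without any shell involved.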