This release significantly improves GPT-4o-mini's performance through the following changes:
- adding a separate reasoning step with a structured response over each label
- incorporating PrefectHQ/ControlFlow#372, in particular its changes to how success tool instructions are generated
- tweaking instructions to better reflect the objective
GPT-4o-mini can now handle more complex instructions, such as "never apply this label" (even when the label is otherwise obviously appropriate) and context-dependent label application. These cases previously failed and required the full GPT-4o model.
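
To illustrate what a structured response over each label might look like, here is a minimal sketch using only the Python standard library. It is not the project's actual implementation: the `LabelDecision` schema, the `parse_decisions` and `labels_to_apply` helpers, and the sample reply are all hypothetical names and data chosen for illustration. The idea is that the model must return one reasoning record per candidate label, so an instruction like "never apply this label" is addressed explicitly instead of being skipped.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelDecision:
    """One reasoning record per candidate label (hypothetical schema)."""

    label: str
    reasoning: str
    applies: bool


def parse_decisions(raw: str, candidate_labels: list[str]) -> list[LabelDecision]:
    """Parse a JSON reply and require exactly one decision per candidate label."""
    decisions = [LabelDecision(**item) for item in json.loads(raw)]
    seen = sorted(d.label for d in decisions)
    if seen != sorted(candidate_labels):
        raise ValueError(f"expected one decision per label, got {seen}")
    return decisions


def labels_to_apply(decisions: list[LabelDecision]) -> list[str]:
    """Keep only the labels the reasoning step decided to apply."""
    return [d.label for d in decisions if d.applies]


# Made-up reply for illustration: "bug" looks appropriate, but the label's
# instructions forbid applying it, and the reasoning step records that.
example_reply = json.dumps(
    [
        {
            "label": "bug",
            "reasoning": "The issue describes a crash, but the instructions say never to apply this label.",
            "applies": False,
        },
        {
            "label": "question",
            "reasoning": "The author asks how to configure the action.",
            "applies": True,
        },
    ]
)

decisions = parse_decisions(example_reply, ["bug", "question"])
print(labels_to_apply(decisions))  # ['question']
```

Requiring a decision for every label, and rejecting replies that omit one, is one way to make a smaller model consider each label's instructions individually before any label is applied.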
What's Changed
Full Changelog: v0.3.0...v0.4.0