The new baibot version (v1.5.0) supports the new Claude 3.7 Sonnet
model, which is reportedly improved and priced the same as its
predecessor, so it makes sense to upgrade to it in our static definitions.
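A sketch of the corresponding static definition upgrade, assuming the playbook's usual `matrix_bot_baibot_config_agents_static_definitions_anthropic_*` variable naming (treat the exact names as illustrative):

```yaml
# vars.yml sketch; the variable names are assumptions, not verified against the playbook
matrix_bot_baibot_config_agents_static_definitions_anthropic_enabled: true
matrix_bot_baibot_config_agents_static_definitions_anthropic_config_api_key: "YOUR_ANTHROPIC_API_KEY"
# The new model this commit switches the static definition to:
matrix_bot_baibot_config_agents_static_definitions_anthropic_config_text_generation_model_id: claude-3-7-sonnet-20250219
```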
This:
- brings consistency - no more mixing `_name_prefix` and `_registry_prefix`
- adds extensibility - a future patch will allow reconfiguring all registry prefixes for all roles in the playbook
We still have `_docker_` vs `_container_` inconsistencies.
These may be worked on later.
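For illustration, the kind of rename this consistency pass implies (both variable names here are hypothetical):

```yaml
# Before: mixed naming across roles
example_role_docker_image_name_prefix: docker.io/
# After: one convention, reconfigurable for all roles in a future patch
example_role_docker_image_registry_prefix: docker.io/
```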
Closes https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/4039
Partially reverts 30dad8ba27, which renamed
`config.yml` to `config.yaml` in the playbook and on the server, for
consistency with the rest of the playbook.
The problem is that:
- baibot defaults to looking for `config.yml`, not the `config.yaml` file we now provide.
This can be worked around by passing an additional `BAIBOT_CONFIG_FILE_PATH=config.yaml`
environment variable, but that brings more complexity.
- renaming the target file (on the server) to `config.yaml` means that people
with an existing installation would drag around the old file (`config.yml`) as well,
unless we create a new Ansible task (`ansible.builtin.file` with `state: absent`) to remove
the old file (see the sketch below). That brings more complexity as well.
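For illustration, the cleanup task that staying on `config.yaml` would have required; the `matrix_bot_baibot_base_path` variable here is an assumption, not necessarily the playbook's actual name:

```yaml
# Hypothetical cleanup task, avoided by this revert: it would remove the
# old config.yml left behind on existing installations after the rename.
- name: Remove old baibot config.yml
  ansible.builtin.file:
    path: "{{ matrix_bot_baibot_base_path }}/config.yml"
    state: absent
```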
https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/4039 adjusts where the file is mounted,
which fixes the immediate problem (baibot not starting), but still means
people will end up with 2 config files for baibot (`config.yml` and `config.yaml`).
This patch reverts a bit more, so that we continue using `config.yml` on the server.
People who have upgraded within the last ~17 hours may end up with 2 files, but it shouldn't be too many of them.
Reasoning models like `o1` and `o3`, as well as their `-mini` variants,
report errors if we try to configure `max_response_tokens` (which
ultimately influences the `max_tokens` field in the API request):
> invalid_request_error: Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead. (param: max_tokens) (code: unsupported_parameter)
`max_completion_tokens` is not yet supported by baibot, so the best we
can do for now is to drop the `max_response_tokens` (`max_tokens`)
setting for these models.
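A sketch of the resulting agent configuration for such a model (key names per baibot's `text_generation` schema as I understand it; treat them as illustrative):

```yaml
text_generation:
  model_id: o1-mini
  # max_response_tokens deliberately left unset: it would be sent as
  # `max_tokens`, which o1/o3 models reject in favor of the
  # not-yet-supported `max_completion_tokens` parameter.
```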
Ref: db9422740c
The new prompt makes use of the new `baibot_conversation_start_time_utc`
prompt variable, which, unlike `baibot_now_utc`, is not a moving target
and as such allows prompt caching to work.
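For illustration, a prompt built around the new variable (wording hypothetical): the rendered prefix stays identical across all requests in a conversation, so providers can cache it:

```yaml
text_generation:
  prompt: |-
    You are a helpful assistant in a Matrix chat.
    This conversation started on {{ baibot_conversation_start_time_utc }}.
```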
Ref: https://platform.openai.com/docs/guides/prompt-caching
Since 2024-10-02, `gpt-4o` is actually the same as `gpt-4o-2024-08-06`.
We previously used `gpt-4o-2024-08-06`, because it was a much better
model (allowing much longer responses) than what `gpt-4o` resolved to.
Since they're now the same, we'd better stick to the unpinned model,
making it easier for future users to get upgrades.
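The change itself is tiny; a sketch against baibot's `text_generation` section (key names per my reading of its config schema):

```yaml
text_generation:
  # Unpinned alias; since 2024-10-02 it resolves to gpt-4o-2024-08-06,
  # and will track future gpt-4o snapshots automatically.
  model_id: gpt-4o
```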
`gpt-4o` will point to `gpt-4o-2024-08-06` after the 2nd of October 2024
anyway. At that point, we can switch back to pointing at `gpt-4o`.
The reason `gpt-4o-2024-08-06` was chosen now instead of `gpt-4o`:
- the `max_response_tokens` configuration was set to 16k, which matches
`gpt-4o-2024-08-06`, but is too large for `gpt-4o` (max 4k)
- baibot's own configs for dynamically created agents, as well as its
static config examples, use `gpt-4o-2024-08-06` and the larger
`max_response_tokens` value
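A sketch of the resulting configuration (key names per baibot's schema as I understand it; treat them as illustrative):

```yaml
text_generation:
  model_id: gpt-4o-2024-08-06
  # 16k matches this snapshot's maximum response size;
  # plain gpt-4o currently caps out at 4k.
  max_response_tokens: 16384
```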
The playbook did not previously define a prompt for statically-defined
agents.
Since support for prompt variables landed in v1.1.0
(see 2a5a2d6a4d),
it makes sense to make use of it for a better out-of-the-box experience
(see https://github.com/etkecc/baibot/issues/10).
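For example, a default prompt along these lines (the wording is hypothetical; `baibot_now_utc` is one of the supported prompt variables, per the notes above):

```yaml
text_generation:
  prompt: |-
    You are a brief, but helpful assistant in a Matrix chat.
    The current date and time in UTC is {{ baibot_now_utc }}.
```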