That’s far-fetched. It’s in the interest of the model builders to solve your problem as efficiently as possible token-wise. High value to user + lower compute costs = better pricing power and better margins overall.
What are the details of this? I'm not playing dumb, and of course I've noticed the decline, but I thought it was a combination of losing the battle with SEO shite and leaning further and further into a 'give the user what you think they want, rather than what they actually asked for' philosophy.
As recently as 15 years ago, Google _explicitly_ stated in their employee handbook that they would NOT, as a matter of principle, include ads in the search results. (Source: worked there at that time.)
Now, they do their best to deprioritize and hide non-ad results...
Only if you are paying per token on the API. If you are paying a fixed monthly fee, then they lose money when you need to burn more tokens, and they lose customers when you can't solve your problems within the month: you max out your session limits, end up with idle time, and use that time to check whether other providers have caught up with or surpassed your current favourite.
> It’s in the interest of the model builders to solve your problem as efficiently as possible token-wise. High value to user + lower compute costs = better pricing power and better margins overall.
It's only in the interest of the model builders to do that IFF the user can actually tell that the model is giving them the best value per dollar.
Interesting point about model variation. It would be useful to run multiple trials and look at the statistical distribution of results rather than single runs. This could help identify which models are more consistent in their outputs.
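As a minimal sketch of what that could look like (everything here is a placeholder: `score_once` just simulates noisy scores, and the model names are made up; in real use you'd call the actual API and score the answers with your own benchmark):

```python
import random
import statistics

def score_once(model: str, prompt: str) -> float:
    # Stand-in for a real evaluation: call the model, grade the answer, return a score.
    # Simulated here with Gaussian noise so the sketch runs end to end.
    base = {"model-a": 0.80, "model-b": 0.75}[model]
    return min(1.0, max(0.0, random.gauss(base, 0.10)))

def summarize(model: str, prompt: str, trials: int = 20) -> dict:
    # Run the same prompt repeatedly and report the distribution, not a single run.
    scores = [score_once(model, prompt) for _ in range(trials)]
    return {
        "model": model,
        "mean": round(statistics.mean(scores), 3),
        "stdev": round(statistics.stdev(scores), 3),  # spread = consistency
        "min": round(min(scores), 3),
        "max": round(max(scores), 3),
    }

if __name__ == "__main__":
    for model in ["model-a", "model-b"]:
        print(summarize(model, "Fix the failing unit test in foo.py"))
```

Comparing the standard deviations (not just the means) would show which model is more consistent, at least at the time of testing.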
> Interesting point about model variation. It would be useful to run multiple trials and look at the statistical distribution of results rather than single runs. This could help identify which models are more consistent in their outputs.
That doesn't help in practical usage - all you'd know is their consistency at the point in time of testing. After all, five minutes after your test is done, your request to the API might be routed to a different model in the background because the limits of the current one were reached.