The profile as a memory primitive is pretty interesting, how do you scale it for a scoped access/hierarchical setting? (for example, overall memory of processes => customer specific => project specific memory)
I've struggled with adding evals to my AI agents for last few months, and felt that vibe evals should have a path to building a robust system down the line.
Working on a plugin for langfuse to create evals functions and dataset from ingested traces automatically, based on ad-hoc user feedback.
reply