I was on Chain of Thought to talk about ResearchRubrics, a benchmark for evaluating deep research agents.