across the industry and academia and the publication of our research findings. As a Research Scientist working on Scalable... experiments to simulate expertise and capability gaps between supervisor and model for scalable oversight experiments. Develop new...
on frontier agent data, evaluation and safety; scalable oversight and alignment of LLMs; science of evaluation for LLM... for frontier models. Building frontier evaluations for LLMs and agents, such as AI R&D. Developing scalable oversight protocols...