Evals Done the Sane Way

We all know the classic flow for building LLM features: poke at the prompt, decide it "seems to work", and happily ship it to production. To help teams move from chaotic testing to predictable systems, the team at the Higher School of Mathematics is hosting a webinar on simple approaches to systematically improving AI products.

The content looks like a must-have for ML engineers and developers. Product managers will finally be able to quantify user feedback and turn product hypotheses into measurable metrics for assessing the economic efficiency of features, while tech leads will learn how to build team processes around AI development.

The stream promises more than dry methodology for measuring the quality of language model responses. There will be a live demo of the entire product evaluation cycle in real time, from collecting raw logs to setting up automated systems. The speakers will also review the production stack of tools and share a ready-made framework that you can apply right away in commercial or pet projects.

The speakers are well qualified: Andrey Kiselev, Head of Product at an AI company with a background at Revolut and Yandex, and Fedor Azarov, who leads the data research direction at Sber CIB.

The webinar starts on May 28, 2026, at 19:30 Moscow time.