March 25, 2025
The rapid development of large language model (LLM) agents such as ChatGPT, Gemini, Perplexity, and Claude has made it increasingly clear that they will play a big role in shaping how we work in the near future. I've spoken with many people who already use LLM agents for diverse tasks, but are they ready for prime time in science?
I evaluated a few of the most common LLM agents on a summarization task in life science research. Read the report as a Google Doc or download the PDF.