Benjamin Xie
benchmarking
Using Benchmarking Infrastructure to Evaluate LLM Performance on CS Concept Inventories: Challenges, Opportunities, and Critiques
Used automated benchmarking infrastructure and expert review to understand differences in LLM and student performance on CS assessments with validity evidence.