The real measure of robotic capabilities comes from conducting “quantitative, large-scale evaluations” in real-world ...