Crash test dummies have basically the same problem. They're designed to be realistic in a few very narrow ways, and then the very small number of approved dummies are used for testing car safety.
The industry has made a bit of progress, surprisingly unprompted by regulations - female and child dummies came into circulation before they were required in tests. But overall, testing is still run against a tiny handful of body types which move 'realistically' in only a few regulation-guided respects.
I think some of this falls into the simulation paradox: the more accurate the simulation, the closer it comes to being the thing it models. But in most cases the cost seems to grow roughly quadratically with accuracy, so at some point meaningful increases in simulation accuracy cease to be economically viable.