On the hardware I was using, the downstream CDR actually did a reasonable job of locking onto the spread-spectrum clock. PCIe link negotiation would sometimes progress through quite a few states without error (I was watching the state machine in Xilinx/Vivado's ChipScope) before inevitably failing when the CDR lost lock.