To be pedantic about this complexity estimate for the 4 x 4 case, both of them reduce to a O(1). If N == 4, it's a constant, so we have O(4^2.778) == O(4^2.8074) == O(1).
Talking about scaling for a problem that has no scaling factor is a bit odd.
Yeah the paper uses O notation for complexity, but never when N is constant. For constant N and M, the paper uses the number of scalar multiplication operations performed as complexity measure. According to Figure 3, their algorithm has decreased the best known value from 49 to 47 for 4x4 matrices, and from 98 to 96 for 5x5 matrices (as well as some further state of the art improvements).
This is for matrix multiplication where elements are themselves 4x4 matrices. So yes, indeed this is about multiplying many many 4x4 matrices where N is the size of the outer matrix.
Talking about scaling for a problem that has no scaling factor is a bit odd.