> If the copy is made by the callee, then the callee can avoid a copy if it can ...

mort96 · on May 9, 2021

I'm not explaining myself clearly.

The ABI I had in mind was similar to the AArch64 ABI:

>If the argument type is a Composite Type that is larger than 16 bytes, then the argument is copied to memory allocated by the caller and the argument is replaced by a pointer to the copy.

But with a slight modification to put the copy in the callee:

>If the argument type is a Composite Type that is larger than 16 bytes, then the argument is replaced by a pointer to the original object. The callee copies the pointed-to object into memory allocated by the callee.

This immediately has the advantage of less binary bloat, because the number of parameter-copying instructions in the binary becomes O(number of functions) rather than O(number of function calls). (As an aside: that could be a huge advantage for C++ with its large, inlined copy constructors.)
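To make the code-size point concrete, here's a rough C sketch of the two schemes. The explicit `memcpy` calls stand in for the copy sequence a compiler would emit, and `struct big` and all the function names are made up for illustration; none of this comes from a real ABI document.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical composite type, larger than the 16-byte threshold. */
struct big { long v[8]; };
static_assert(sizeof(struct big) > 16, "must be passed by pointer");

/* Status quo (caller copies): the callee gets a pointer to a private
   copy and is free to modify it. */
long sum_caller_copies(struct big *private_copy) {
    private_copy->v[0] += 1;
    long s = 0;
    for (int i = 0; i < 8; i++) s += private_copy->v[i];
    return s;
}

/* What every call site has to do under the status quo. With N calls,
   the copy sequence appears N times in the binary. */
long call_site_status_quo(const struct big *x) {
    struct big tmp;
    memcpy(&tmp, x, sizeof tmp);      /* emitted at each call site */
    return sum_caller_copies(&tmp);
}

/* Modified scheme (callee copies): the caller passes a pointer to the
   original, and the copy sequence appears once, in the callee. */
long sum_callee_copies(const struct big *arg) {
    struct big local;
    memcpy(&local, arg, sizeof local); /* emitted once per function */
    local.v[0] += 1;
    long s = 0;
    for (int i = 0; i < 8; i++) s += local.v[i];
    return s;
}
```

Both versions leave the caller's object untouched; the only difference is where the copy instructions live.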

When the copy is made in the callee, we can start identifying cases where a copy isn't necessary, or cases where only certain parts of the struct have to be copied. It would have to be fairly conservative though, since unlike with your ABI, the caller would make no guarantee that there are no other references to the parameter.
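Here's roughly what I mean by those cases, again as a hand-written C sketch where the function bodies stand in for what a compiler might generate under my scheme. The type and function names are hypothetical, and the exact conditions under which a real compiler could elide the copy are an assumption on my part.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical composite type, larger than the 16-byte threshold. */
struct big { long v[8]; };
static_assert(sizeof(struct big) > 16, "must be passed by pointer");

/* Only reads the parameter, with no stores or calls in between:
   the copy can be skipped entirely. */
long first_plus_last(const struct big *arg) {
    return arg->v[0] + arg->v[7];
}

/* Modifies a single field: copying just that field is enough. */
long bumped_first(const struct big *arg) {
    long v0 = arg->v[0];   /* partial copy */
    v0 += 1;
    return v0 + arg->v[7];
}

/* Stores through another pointer that might alias the parameter.
   The callee can't rule that out, so it has to be conservative and
   copy before the store. */
long store_then_read(const struct big *arg, struct big *out) {
    struct big local;
    memcpy(&local, arg, sizeof local);
    out->v[0] = 100;
    return local.v[0];     /* still the value at the time of the call */
}
```

The last function is the case that forces the conservatism: if the caller passes the same object as both `arg` and `out`, skipping the copy would change what the function returns.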

I think my version is a clear and obvious improvement over the status quo, with smaller binaries and performance that is as good or better. Your version is riskier, since its worst case is two copies per large parameter, but it will probably achieve zero copies in far more cases than mine. "Low risk / medium reward" versus "medium risk / probably high reward".

---

Anyways, I might end up writing a blog post on this stuff. If I do, it will refer to your blog post. How should I refer to you? Moonchild or elronnd or something else?