
I don't understand how you end up with extra copies. My understanding is that the caller would never make a copy; it would always pass a pointer to large structs. So the absolute worst case, unless I'm missing something, is that we end up with the same number of copies as we do today (i.e. one copy per large struct passed as a parameter).


  struct A { int m; } global = {5};
  int f(struct A a) {
   global.m = 7;
   return a.m;
  }
  
  int main() {
   f(global);
   // need to make a copy of 'global' here
   // otherwise f will return 7 instead of 5
  }
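
A self-contained variant of the example above, as a sketch. `f_by_pointer` and `copy_matters` are hypothetical names that are not part of the original snippet; `f_by_pointer` stands in for what the callee would effectively do if the caller passed a pointer and nobody made a copy.

```c
#include <assert.h>

struct A { int m; };
struct A global = {5};

/* Ordinary C by-value semantics: 'a' is a private copy of the argument. */
int f(struct A a) {
    global.m = 7;
    return a.m;
}

/* Hypothetical lowering: the caller passes a pointer and nobody copies. */
int f_by_pointer(const struct A *a) {
    global.m = 7;
    return a->m;
}

/* Returns 1 when skipping the copy changes the observable result. */
int copy_matters(void) {
    global.m = 5;
    int by_value = f(global);
    global.m = 5;
    int by_pointer = f_by_pointer(&global);
    assert(by_value == 5);   /* what the language requires */
    return by_value != by_pointer;
}
```

Calling `copy_matters()` shows the two lowerings disagreeing, which is why a copy has to be made somewhere when the argument is a global that the callee can reach.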


Hey, I've realized that there are two understandings of the proposed ABI: One in which the only promise is that the callee won't modify the object through the pointer, and one in which the callee promises to not modify the object through the pointer and the caller promises that nothing else will modify the object. Maybe you could shed some light on it since you're the author?

In the first version, the worst case situation is that only one copy is made, and it's always made by the caller. However, the caller has to make a copy if the object is referenced after any function is called, because that function might otherwise modify the parameter if a pointer to the caller's version of the object has leaked out somewhere.

In the second version, the worst case situation is that two copies are made where old ABIs would make just one copy (if the caller has to make a copy and the callee has to make a copy). However, the callee would only have to make a copy if it actually does something which might modify the object through the pointer passed as an argument, so the optimization would apply for more functions.

I think it's fairly clear from the article that your intended ABI is the first version, due to the sentence "In the event that a copy is needed, it will happen only once, in the callee, rather than needing to be repeated by every caller". But in this comment, you're implying that the caller makes a copy if it can't guarantee that nothing else has a pointer to the object?


I should have been clearer; my intention was your second interpretation. The copying happening only once is predicated on the assumption that the struct wasn't aliased, and it's unlikely to be aliased if you're passing it around by value.

Your first interpretation is essentially what the MS/ARM/RISC-V ABIs do. The reason I don't think that works as well is:

In general, it's rare for functions to mutate their by-value parameters. We can effectively treat this as an edge case, and 'compensate' by making copies in the callee when necessary. But when does the caller need to make a copy?

Version 1: whenever the object is aliased before the call, or read from after it

Version 2: whenever the object is aliased before the call

I think using the same struct multiple times is something that happens relatively frequently, so compared with v1, v2 elides a lot of caller-side copies. In exchange, it adds a relatively small number of callee-side copies. Which, despite the few pathological cases, seems likely to be overwhelmingly worth it most of the time.
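
A hand-lowered sketch of the two caller rules for the "same struct used twice" case. Everything here is illustrative: `struct Big`, `make_big`, `sum`, `twice_v1`, `twice_v2` and the `caller_copies` counter are hypothetical names, and the explicit pointer passing only imitates what a compiler might emit under each convention.

```c
#include <assert.h>

struct Big { long v[8]; };

/* Illustration-only counter; a real ABI has no such thing. */
int caller_copies = 0;

/* Hypothetical helper: builds a struct holding 1..8. */
struct Big make_big(void) {
    struct Big b;
    for (int i = 0; i < 8; i++) b.v[i] = i + 1;
    return b;
}

/* Read-only callee, lowered to take a pointer. */
long sum(const struct Big *p) {
    assert(p != 0);
    long s = 0;
    for (int i = 0; i < 8; i++) s += p->v[i];
    return s;
}

/* Version 1 caller: 'b' is read again after the first call, so the
   first call gets a temporary copy. The last use passes 'b' itself. */
long twice_v1(void) {
    struct Big b = make_big();
    struct Big tmp = b;
    caller_copies++;
    long first = sum(&tmp);
    return first + sum(&b);
}

/* Version 2 caller: 'b' is never aliased, so both calls pass it directly. */
long twice_v2(void) {
    struct Big b = make_big();
    return sum(&b) + sum(&b);
}
```

In this sketch `twice_v1` pays one caller-side copy and `twice_v2` pays none. A callee that did want to mutate its parameter under v2 would make its own copy first, which is the callee-side cost mentioned above.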


Sorry, I messed up. I meant to write that in the first version, the copy is made by the callee. If the copy is made by the callee, then the callee can avoid a copy if it can guarantee that the caller's version of the object isn't changed before the callee uses it, and at most one copy is made.

Anyways, your intention is clear now at least. I'd be a bit worried about an ABI which might produce two copies for one parameter. It would be interesting to analyze a bunch of real-world code and see A) how often would my version create a copy, B) how often does the MS/ARM/RISC-V version have to make a copy, C) how often would your version make a copy, and D) how often would your version require two copies.

Would also be interesting to see an analysis of code bloat due to copying parameters.


> If the copy is made by the callee, then the callee can avoid a copy if it can guarantee that the caller's version of the object isn't changed before the callee uses it, and at most one copy is made.

So the callee has to know what every caller of it will ever do? That's ... not an ABI. The whole point is that functions can exist in a vacuum without knowledge of who they will be called by.

To be clear, I think it would be really cool if compilers could generate ad-hoc calling conventions using LTO to optimize spillage, but that's not really useful as an ABI.

> would be interesting to analyze a bunch of real-world code and see A) how often would my version create a copy, B) how often does the MS/ARM/RISC-V version have to make a copy, C) how often would your version make a copy, and D) how often would your version require two copies.

> Would also be interesting to see an analysis of code bloat due to copying parameters

I agree!


I'm not explaining myself clearly.

The ABI I had in mind was similar to the AArch64 ABI:

> If the argument type is a Composite Type that is larger than 16 bytes, then the argument is copied to memory allocated by the caller and the argument is replaced by a pointer to the copy.

But with a slight modification to put the copy in the callee:

> If the argument type is a Composite Type that is larger than 16 bytes, then the argument is replaced by a pointer to the object. The callee copies the pointed-to object into memory allocated by the callee.

This immediately has the advantage of less binary bloat, because the number of parameter-copying instructions in the binary becomes O(number of functions) rather than O(number of function calls). (As an aside: that can probably be a huge advantage for C++ with its large, inlined copy constructors.)

When the copy is made in the callee, we can start identifying cases where a copy isn't necessary, or cases where only certain parts of the struct have to be copied. It would have to be fairly conservative though, since unlike with your ABI, there would be no guarantee made by the caller that there are no other references to the parameter.

I think my version is a clear and obvious improvement over the status quo, with decreased binary sizes and as good or better performance. Your version is riskier, since the worst case is two copies per large parameter, but it will probably achieve zero copies in way more cases than my version. "Low risk / medium reward" versus "medium risk / probably high reward".
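
Roughly what the callee-side copy could look like when lowered by hand. This is only a sketch: `struct Big`, `bump_lowered` and `peek_lowered` are hypothetical names, and the source-level functions in the comments are made up for illustration.

```c
#include <assert.h>
#include <string.h>

struct Big { char name[24]; int count; };

/* Source-level idea: int bump(struct Big b) { b.count++; return b.count; }
   The callee writes to its parameter, so it copies on entry. That copy
   is emitted once per function, not once per call site. */
int bump_lowered(const struct Big *arg) {
    struct Big b;
    memcpy(&b, arg, sizeof b);
    b.count++;
    return b.count;
}

/* Source-level idea: int peek(struct Big b) { return b.count; }
   No stores and no calls to other functions, so even a conservative
   callee can read through the pointer and skip the copy. */
int peek_lowered(const struct Big *arg) {
    assert(arg != 0);
    return arg->count;
}
```

Every call site just passes an address, so the copying code shows up in `bump_lowered` only, and `peek_lowered` contains none at all.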

---

Anyways, I might end up writing a blog post on this stuff. If I do, it will refer to your blog post. How should I refer to you? Moonchild or elronnd or something else?


If the address of the object escapes on the caller side, then the caller has to make a copy, since otherwise the object could be mutated, or the call could simply break the language's distinct-address guarantee.
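
A small sketch of the distinct-address point, using hypothetical names (`struct S`, `g`, `param_is_g`, `param_is_g_nocopy`, `address_leak_observable`). The second function imitates a lowering in which the caller passes the address of `g` itself and no copy is made anywhere.

```c
#include <assert.h>

struct S { int v[8]; };
struct S g;

/* Ordinary C: 's' is its own object, so its address never equals &g. */
int param_is_g(struct S s) {
    return &s == &g;
}

/* Hypothetical lowering with no copy at all: the "parameter" is g. */
int param_is_g_nocopy(const struct S *s) {
    return s == &g;
}

/* Returns 1 when skipping the copy is observable through addresses. */
int address_leak_observable(void) {
    assert(param_is_g(g) == 0);
    return param_is_g_nocopy(&g) == 1;
}
```

Even with no mutation involved, the callee can tell the difference just by comparing addresses, so the copy can't be skipped once the address is visible to it.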


I still don't understand, sorry. If the callee does something which could cause the caller's object to change, such as calling an unknown function or modifying through another pointer which might alias the parameter, the callee would just have to make a copy.

Could you provide an example of a situation where there would be more copies made using the proposed ABI than in traditional ABIs?


Sure, if calling any external function or writing through any pointer would force the callee to copy the object, then yes, you can have only the callee do the copy, but then it seems that this optimization would apply only to a very small subset of functions.


Right. That was my understanding, but I now see that there are more ways to understand it. I don't know which is correct, so I wrote a response to moonchild's comment here: https://news.ycombinator.com/item?id=27091726



