> Microsoft characterizes the output of Copilot as a series of code "suggestions". Microsoft "does not claim any rights" in these suggestions. But neither does Microsoft make any guarantees about the correctness, security, or extenuating intellectual-property entanglements of the code so produced. Once you accept a Copilot suggestion, all that becomes your problem:
> "You are responsible for ensuring the security and quality of your code. We recommend you take the same precautions when using code generated by GitHub Copilot that you would when using any code you didn’t write yourself. These precautions include rigorous testing, intellectual property scanning, and tracking for security vulnerabilities."
I can't help but recall:
"Linux is a cancer that attaches itself in an intellectual property sense to everything it touches."
- Steve Ballmer, while CEO of Microsoft
With "normal" code I can generally see (or figure out) who posted/published it and reach out for explicit permission. It's not uncommon for me to do this.
How is one supposed to do that for the generated stuff? Seems like an awfully hands-off attitude. As challenging as it is, they really ought to be qualifying the input samples of training code before ingesting them.
There are some techniques used mostly to detect when students copy-paste code. I've seen some of the tools in that space and they have varying degrees of accuracy. MOSS is a common one[0].
There are some vendors in this space too (BlackDuck comes to mind) but they're $$$ so only within the scope of large corporations.
If anybody has any ideas relating to this type of analysis, I'd be excited to chat. I am working on a project[1] in this space for "Software Composition Analysis" which could potentially overlap with snippet detection for code generated by tools like Copilot. (We basically just have a big pipeline of analysis jobs that run on code and store the results. I need to update the docs!)
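For anyone curious what that kind of snippet detection looks like in practice, here is a minimal Python sketch of winnowing, the fingerprinting technique MOSS is built on. It is a toy under stated assumptions, not MOSS's actual implementation: the function names, the k-gram size, the window size, and the Jaccard score are all illustrative choices of mine, and real tools tokenize the code rather than just stripping whitespace.

```python
import hashlib
import re
from typing import List, Set


def normalize(source: str) -> str:
    """Lowercase and drop all whitespace so formatting changes don't matter."""
    return re.sub(r"\s+", "", source.lower())


def kgram_hashes(text: str, k: int) -> List[int]:
    """Hash every k-character window of the text (32-bit slice of SHA-1)."""
    return [
        int.from_bytes(hashlib.sha1(text[i:i + k].encode("utf-8")).digest()[:4], "big")
        for i in range(len(text) - k + 1)
    ]


def winnow(hashes: List[int], window: int) -> Set[int]:
    """Keep the minimum hash from each sliding window of hashes."""
    if not hashes:
        return set()
    if len(hashes) < window:
        # Too short for a full window: fall back to a single fingerprint.
        return {min(hashes)}
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


def fingerprint(source: str, k: int = 5, window: int = 4) -> Set[int]:
    """Fingerprint a snippet; k and window are illustrative defaults."""
    return winnow(kgram_hashes(normalize(source), k), window)


def similarity(a: str, b: str, k: int = 5, window: int = 4) -> float:
    """Jaccard overlap of two fingerprint sets (an assumed scoring choice)."""
    fa, fb = fingerprint(a, k, window), fingerprint(b, k, window)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

With this normalization, a snippet and a copy that differs only in whitespace or letter case score 1.0 (as long as they are at least k characters long after normalization). Renamed identifiers will pull the score down, since nothing here is token-aware, which is roughly where the "varying degrees of accuracy" comes from.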
I don't think it's right to characterize it as hands off after they had their hands all up in the generated code. It's just malfeasant. They've produced a tool that is fundamentally (legally) unsafe to use and said that's not their problem.
It isn't so much a connection as an example of cognitive dissonance from the organisation.
On the one hand, it states plainly that mixing in copyleft code and the like can be disastrously dangerous because it spreads like a virus. On the other hand, it does not understand why people think it might be a problem that its tool could encourage mixing in copyleft code.
Microsoft released a product which gives you cancer the moment you use it.
At least, that is what follows from the company's ex-CEO's opinion of what including open source code in your projects does. It seems a bit of a far-fetched conclusion, but then, Ballmer did say it.
The point is not clear, but if I were to guess, it's that GitHub Copilot should come with a California Prop 65 warning, because it can give your code "cancer" (GPL-licensed snippets from sources like the Linux codebase).
Linux is open source, and Ballmer is displaying Microsoft's negative attitude towards open source, the same attitude that is demonstrated in the author's arguments regarding Copilot.
Fun story: That was my first employee town hall, in 2000. I was concerned for the fellow (and so very glad when he left; Satya has been so so so much better for the company and morale). It was definitely an... interesting introduction to the company.
At the time, I was doing Linux, OpenBSD and FreeBSD stuff in Bellingham. The reaction from the local and regional non-Microsoft community was really like "Holy shit what is going on down there?!"