Is there really a Python library called ImagePatch that can find any item in an image, and does it work as well as in this video? Google didn’t find an obvious match for “Python ImagePatch”.
There is a GitHub repo / Python lib called com2fun that exploits this: it lets you get results from functions that you only pretend exist. (Am on mobile and can’t link to it right now.)
According to the ViperGPT paper their "ImagePatch.find()" uses GLIP.
According to the GLIP paper, accuracy on a test set not seen during training is around 60%, so... neat demos, but whether it'll be reliable enough depends on your application.
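To make the "find() is a thin wrapper around a detector" point concrete, here is a rough sketch. It is not the ViperGPT code: the class layout, the `detector` callable, the `threshold` field and the canned `fake_detector` are all made up for illustration, and a real setup would plug a grounding model such as GLIP in where the stub is.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower), in pixels
# Hypothetical detector interface: (image, phrase) -> [(box, confidence), ...]
Detector = Callable[[object, str], List[Tuple[Box, float]]]


@dataclass
class ImagePatch:
    """Illustrative stand-in for the ImagePatch API shown in the demo."""
    image: object
    detector: Detector
    box: Optional[Box] = None   # None means "the whole image"
    threshold: float = 0.5      # assumed cut-off, not taken from the paper

    def find(self, object_name: str) -> List["ImagePatch"]:
        # Ask the detector for candidate boxes, keep only the confident ones.
        detections = self.detector(self.image, object_name)
        return [
            ImagePatch(self.image, self.detector, box, self.threshold)
            for box, score in detections
            if score >= self.threshold
        ]


def fake_detector(image: object, object_name: str) -> List[Tuple[Box, float]]:
    # Canned results standing in for a real model; one hit is low-confidence.
    canned = {"muffin": [((10, 10, 50, 50), 0.9), ((60, 10, 100, 50), 0.4)]}
    return canned.get(object_name, [])


patches = ImagePatch(image=None, detector=fake_detector).find("muffin")
print(len(patches))
```

The point is just that the quality of `find()` is entirely the quality of whatever detector sits behind it, which is why the detector's accuracy matters more than the wrapper.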
I guess the idea is to trick the model into generating pseudocode, which really doesn’t do much more than act as a “scratchpad” that focuses the model’s attention on reasoning through the problem.
Besides, the Codex models are free right now. So… one more reason to rephrase questions as coding questions ;-)
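For what "rephrasing a question as a coding question" might look like, a small sketch: it only builds the prompt text and makes no model call. The API stub, the `answer` function name and the prompt layout are my own guesses for illustration, not the format any particular paper or library uses.

```python
# Hypothetical API description shown to the model; it is never executed here.
API_STUB = '''class ImagePatch:
    def find(self, object_name: str) -> list["ImagePatch"]:
        """Return the patches of the image that contain object_name."""
'''


def as_coding_prompt(question: str) -> str:
    # Present the question as a comment above an unfinished function,
    # so a code model continues by writing the function body.
    return (
        f"{API_STUB}\n"
        f"# Question: {question}\n"
        "def answer(image):\n"
    )


prompt = as_coding_prompt("How many muffins are there?")
print(prompt)
```

Whatever the model writes after `def answer(image):` is the "scratchpad"; whether that text can actually be executed depends on whether something real backs the stubbed API.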
Oh, so maybe I misunderstood what I was seeing. It wrote pseudo-code that makes sense conceptually, not code that I can paste in Jupyter and run (given the right imports)?