What kind of blur is used? Blurs are annoyingly bad at obscuring things like faces. They may be good at making faces unrecognisable to people, but they’re not nearly as good at making faces unrecognisable to machines.
I've seen this sentiment mentioned quite a bit, but is it still true at the level of blur shown in their example images? The blur is so heavy that it has essentially left behind a smooth gradient. Even with the algorithm known, is there enough recoverable information left?
Well, if you want to leave behind a smooth gradient, then leave behind a smooth gradient.
Suggestion for an algorithm:
* start with the blur
* sample the four colors at the four corners of the blurred region
* quantize them
* fill in the region with bilinear interpolation.
Then the whole region can reveal at most those four quantized color values. If you only blur, it's much harder to put a bound on how much information has leaked.
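The corner-sampling idea above can be sketched in Python with NumPy; the quantization level is an arbitrary choice, and this assumes a colour (H, W, 3) image:

```python
import numpy as np

def corner_fill(img, y0, y1, x0, x1, levels=8):
    """Replace img[y0:y1, x0:x1] with a bilinear gradient between its
    four quantized corner colors, so the region can leak at most four
    coarse color values. `levels` (the quantization) is illustrative."""
    region = img[y0:y1, x0:x1].astype(float)
    h, w = region.shape[:2]
    step = 256 // levels
    # Sample and quantize the four corner colors.
    q00 = (region[0, 0] // step) * step
    q01 = (region[0, -1] // step) * step
    q10 = (region[-1, 0] // step) * step
    q11 = (region[-1, -1] // step) * step
    # Fill the region by bilinear interpolation between the corners.
    ys = np.linspace(0.0, 1.0, h)[:, None, None]
    xs = np.linspace(0.0, 1.0, w)[None, :, None]
    fill = ((1 - ys) * (1 - xs) * q00 + (1 - ys) * xs * q01
            + ys * (1 - xs) * q10 + ys * xs * q11)
    img[y0:y1, x0:x1] = np.rint(fill).astype(img.dtype)
    return img
```

Whatever was under the region is gone; only the four quantized corner samples survive.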
* detect (or get the user to select) faces
* replace the pixels in the face bounding box with a generic face
* blur the bounding box edges
* do whatever blur you think ends up "looking nice"
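A rough NumPy sketch of the paste-and-feather steps above; the face detector and the `generic_face` image are assumed inputs, and the feather width is arbitrary:

```python
import numpy as np

def paste_generic_face(img, box, generic_face, feather=8):
    """Paste a resized generic face over the bounding box
    `box` = (y0, y1, x0, x1), feathering the edges so it blends in.
    The face detector and `generic_face` are assumed inputs;
    `feather` (the blend width in pixels) is arbitrary."""
    y0, y1, x0, x1 = box
    h, w = y1 - y0, x1 - x0
    # Nearest-neighbour resize of the generic face to the box size.
    gh, gw = generic_face.shape[:2]
    patch = generic_face[np.arange(h) * gh // h][:, np.arange(w) * gw // w]
    # Alpha mask: 1 in the middle, ramping to 0 at the box edges.
    ry = np.minimum(np.arange(h), np.arange(h)[::-1])
    rx = np.minimum(np.arange(w), np.arange(w)[::-1])
    alpha = np.clip(np.minimum(ry[:, None], rx[None, :]) / feather, 0, 1)[..., None]
    region = img[y0:y1, x0:x1].astype(float)
    img[y0:y1, x0:x1] = (alpha * patch + (1 - alpha) * region).astype(img.dtype)
    return img
```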
It's missing the step of detecting the subject's skin tone and applying it to the generic face. Right now I can't imagine people being too happy about a "generic white face blur", which is what you'd get if the generic face were white.
I suspect there's not much information in an individual blurred face, but I wonder: given enough examples, could you determine with any confidence whether a particular unblurred face is the one in a sample of images? You can do that with text (http://dheera.net/projects/blur).
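That matching attack can be sketched like this, assuming the attacker knows (or can guess) the blur algorithm; note that nothing here inverts the blur itself:

```python
import numpy as np

def rank_candidates(blurred_target, candidates, blur_fn):
    """Blur each candidate face with the (assumed known) blur_fn and
    rank candidates by mean squared error against the blurred target.
    Returns the index of the best match plus all scores."""
    scores = [float(np.mean((blur_fn(c).astype(float)
                             - np.asarray(blurred_target, dtype=float)) ** 2))
              for c in candidates]
    return int(np.argmin(scores)), scores
```

The open question in this thread is how well the ranking holds up once lighting, pose and background vary between the candidate photos and the blurred one.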
The face and its surroundings are blurred almost to a single colour. The average RGB value of my face might be unique-ish, but if you mix in some variable background, photographed on a camera whose lens has been smeared against the pocket of someone's jeans, the result should be human, not individual.
A side comment: AFAICT what the Signal developers have done is take code that was developed so the phone camera could autofocus on faces, and use that code to defocus faces. What a sweet hack.
A very sweet hack, but I think the concern was based on the example image provided in the link posted. While the face is blurred, there's still a lot of information you can glean about the person: their haircut, their neck, the clothes worn etc. -- so I'm guessing the threat vector here is that if you also have a general set of pictures from the same demo, you may be able to automatically identify who the blurred person is.
Blurring is better than nothing, but when it comes to avoiding being traced, the best picture is the one that was never taken.
Let's be real for two seconds: this is pure nonsense. No court of law would accept "we arrested that guy because he has two eyes, a mouth, and the same t-shirt as that other guy who was protesting yesterday". If it comes to that, you don't need a picture of a blurred face at all: just arrest whoever you want and provide forged evidence (or none), because it amounts to exactly the same thing.
And even then, law enforcement is already filming protesters (CCTV plus aerial footage) and tracking their phones; the last thing you have to worry about is a fully blurred face that no amount of computing power could match back to you.
Picture A: an individual, unblurred, protesting peacefully.
Picture B: a blurred individual from later in the same protest, wearing the exact same clothes, committing questionable acts. That's circumstantially incriminating.
You're overthinking it. Police already have their own camera people doing video surveillance in addition to CCTV and other surveillance tools. The sort of forensic analysis you mention is of course possible and is sometimes engaged in, but obscuring all such information would defeat the purpose of photojournalism altogether.
Yes. Someone who has access to many photos of the same set of people might well be able to identify people in one photo, even though their faces are blurred in that particular one.
I'm not sure whether the large number of photos nowadays is a net negative, though. That's also what finally stopped Derek Chauvin.
https://www.androidpolice.com/wp-content/uploads/2020/06/04/... is blurred by Signal. Suppose that you have all the photos that have been posted to Facebook, and that both of those women are on Facebook, and lastly that you have resources enough to run all of those through the Signal code. How would you match those other photos to the blurred part of this one?
Not just any black spot either. A black spot of random size larger than what you want to redact. That way you avoid leaking the size of what's being redacted. The size of what's being redacted can sometimes provide enough information to determine plausible contents: http://blog.nuclearsecrecy.com/2014/07/11/smeared-richard-fe...
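A minimal sketch of that random-margin idea; the padding bound is an arbitrary illustrative choice:

```python
import secrets

import numpy as np

def redact(img, y0, y1, x0, x1, max_pad=40):
    """Black out a region plus an unpredictable margin on each side,
    so the box size doesn't leak the size of what it covers.
    `max_pad` is an arbitrary illustrative bound."""
    y0 = max(0, y0 - secrets.randbelow(max_pad + 1))
    y1 = min(img.shape[0], y1 + secrets.randbelow(max_pad + 1))
    x0 = max(0, x0 - secrets.randbelow(max_pad + 1))
    x1 = min(img.shape[1], x1 + secrets.randbelow(max_pad + 1))
    img[y0:y1, x0:x1] = 0  # solid black: no pixel data survives
    return img
```

Using `secrets` rather than `random` matters here: a predictable margin would defeat the purpose.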
Ok, but just to be clear, we're redacting faces here. There isn't much meaningful here other than an exceptionally rough indication of age/development.
The examples on the Signal website give you hair color, hair style, likely race, and the shape of the top of the protesters' ears. While it's not definitive, given that a fuller redaction is easy and has no disadvantages, I don't see why someone shouldn't try.
> "The face and its surroundings are blurred almost to a single colour."
To your eyes, maybe. To a machine, you have an array of pixels, each with different values which, using an algorithm, could be adjusted into something your eyes can resolve into a unique face.
Hard to tell, I'm not a computer; but it does look better than most. To be fair (to me), I was basing my critique on the picture in TFA, which seems to have far more detail in it.
That said, the whole point of my post was that humans are really bad at judging this. Many blur algorithms can be reversed because they just modify the color values of the pixels in a reversible way. You can't always tell by looking at a picture what data is still there, in much the same way you can't see the stars in an ISO 200 picture of the night sky. It's not until you open it in GIMP and crank the exposure up to max that you see just how much data is there that your eyes couldn't perceive.
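A toy NumPy version of the exposure-cranking point: pixel values that read as pure black on screen still carry the data, and multiplying them up makes it plainly visible:

```python
import numpy as np

# A faint square at value 3 out of 255 is indistinguishable from
# black to the eye, but the data is still in the file; boosting the
# "exposure" reveals it, just like maxing the exposure slider in GIMP.
dark = np.zeros((8, 8), dtype=np.uint8)
dark[2:6, 2:6] = 3
boosted = np.clip(dark.astype(int) * 60, 0, 255).astype(np.uint8)
```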
I kinda hope there's not only enough information there to algorithmically reverse, but that when you do that - all the de-blurred faces end up being George Floyd.
Probably not. You can remove Gaussian blur by performing the inverse convolution (tricky, because you need to find the actual parameters that produced the blur), and you can remove motion blur the same way. But it doesn't look like there's nearly enough information left here to do any of that.
No chance. That level of blur is essentially impossible to reverse. I think a lot of people here are confused because they know Gaussian blurs are theoretically reversible, but they aren't thinking about how ill-conditioned the inverse becomes as the blur gets larger.
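A quick back-of-the-envelope illustration of that ill-conditioning: a Gaussian blur of width sigma attenuates spatial frequency f by exp(-2π²σ²f²), so inverting it amplifies any noise at that frequency by the reciprocal:

```python
import numpy as np

# Deconvolution divides each frequency by the blur's attenuation
# factor, so noise at that frequency is amplified by the reciprocal.
# For a large sigma the required gain is astronomical.
f = 0.1  # a mid-range spatial frequency, in cycles per pixel
for sigma in (2, 10, 25):
    atten = np.exp(-2 * np.pi ** 2 * sigma ** 2 * f ** 2)
    print(f"sigma={sigma:2d}: attenuation {atten:.1e}, noise gain {1 / atten:.1e}")
```

At sigma = 2 the gain is only a couple of x, at sigma = 25 it's on the order of 10^53: any quantization noise at all swamps the signal.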
There's another concern—even if I can't usably invert the convolution, if I have photos of a thousand people's faces and one of them is that blurred face, can I figure out which one with high confidence?
Aesthetic. A blurred face looks better on a picture than a fat black box.
But as others have pointed out, you can achieve (almost) the same effect if you remove enough information before blurring, or just draw a smooth gradient; it's just harder to make either of those look as nice as blurring the actual image.
FWIW this was updated before release to also scale down the image before blurring it. We cut the size in half, or cap it to 300x300, whichever is smaller. This was to ensure that the effectiveness of the blur isn't reduced on higher-resolution images.
https://github.com/signalapp/Signal-Android/blob/master/app/...
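For illustration, the sizing rule described above could be restated like this (a sketch of the described behaviour, not a port of the linked Java code):

```python
def blur_canvas_size(width, height):
    """Blur on a canvas that is the smaller of half the original size
    and a 300x300 cap, preserving aspect ratio, per the rule described
    above."""
    half_w, half_h = width // 2, height // 2
    scale = min(1.0, 300 / max(half_w, half_h))
    return round(half_w * scale), round(half_h * scale)
```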
I tend to pixelate regions I want to make unrecognizable in photos instead of using a Gaussian blur, for exactly this reason. Pixelation should be safe as long as the pixels are large enough, right?
I wonder why Signal didn't do something like that...
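Pixelation by block averaging is easy to sketch, and its information leak is easy to bound: at most one colour survives per block (this sketch assumes dimensions divisible by the block size, for brevity):

```python
import numpy as np

def pixelate(img, block=16):
    """Pixelate by block averaging: each block keeps only its mean
    colour, so what's left is bounded by the number of blocks.
    Assumes dimensions divisible by `block`, for brevity."""
    h, w = img.shape[:2]
    x = img.astype(float).reshape(h // block, block, w // block, block, -1)
    means = x.mean(axis=(1, 3), keepdims=True)
    return np.broadcast_to(means, x.shape).reshape(img.shape).astype(img.dtype)
```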
But a bad blur doesn't. That's why your parent comment asks what kind of blur they use.
Edit: Turns out it's a 25px Gaussian blur. There's some downsampling beforehand, but not much, and no color discretization. In other words, they use a bad blur, but compensate with a large security margin. I wouldn't be surprised if this was vulnerable to "if I have a thousand photos of faces and I think one of them matches the blurred face, I can figure out which one with high confidence", and they can get basically the same aesthetic effect if they heavily pixelate and discretize colors before blurring.
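The pixelate-and-discretize-then-blur pipeline might look like this; the block size, colour levels, and the crude box blur are all illustrative choices:

```python
import numpy as np

def destroy_then_blur(img, block=16, levels=4, passes=3):
    """First irreversibly discard information (pixelate, then
    discretize colours), and only then blur for looks. Inverting the
    blur can now recover at best the coarse mosaic, never the face.
    Assumes dimensions divisible by `block`, for brevity."""
    h, w = img.shape[:2]
    x = img.astype(float).reshape(h // block, block, w // block, block, -1)
    x = np.broadcast_to(x.mean(axis=(1, 3), keepdims=True), x.shape)
    x = x.reshape(img.shape)              # 1. pixelated
    step = 256 // levels
    x = (x // step) * step                # 2. colours discretized
    for _ in range(passes):               # 3. cheap box blur, looks only
        x = (x + np.roll(x, 1, 0) + np.roll(x, -1, 0)
             + np.roll(x, 1, 1) + np.roll(x, -1, 1)) / 5
    return x.astype(img.dtype)
```

The point is that steps 1 and 2 bound the leak regardless of how well step 3 can be undone.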
> "if I have a thousand photos of faces and I think one of them matches the blurred face, I can figure out which one with high confidence",
Is that actually a practical attack here? I can see it working if you have 1000 passport photos or mugshots and a blurred photo with the same level, front-on framing. But is there a practical attack for blurred pictures that aren't facing the camera directly? (Assuming the scale is "locals at a local protest" rather than "find Edward Snowden's blurred face at any BLM protest, no matter the cost!")
Good point; I'd be surprised if lighting and orientation didn't overwhelm all other information at this level of blur.
But I've been surprised by impressive digital forensics before—what if you can determine lighting and orientation from the rest of the photo, and then simulate them on each passport photo/mugshot? I'd still feel much more comfortable if they pixelized and color-discretized before blurring, and I still think the aesthetic effect would be much the same.
Edit: Most face recognition software starts by down-sizing and blurring an image so it can detect facial features faster. So in theory it is easy to detect facial features in a blurred image, and a deblurring tool can then use that information to reconstruct the face more accurately.
But is deblurring camera shake, an out-of-focus lens, or even a Gaussian blur the same problem as the random gradient blur they seem to be using?
Edit: The images in the Signal article don't look like blurred faces; they look like blurry images overlaid on the faces. If the face itself was never blurred, how could it be unblurred?
You can't recreate data that isn't there; any "enhancement" is fundamentally a guess. You can enhance your way to a face or a license plate, but there is zero guarantee it will be the face or plate that the low-quality image is actually of. This is why solid blocks of colour or emoji are so effective at censoring images: they replace the data with pure junk.
You do, however, lose colour-depth information (e.g. deep colour down to true colour, or true colour down to high colour / 256 colours). That still leaves enough to detect a face.