
What kind of blur is used? Blurs are annoyingly bad at obscuring things like faces. They may be good at making faces unrecognisable to people, but they’re not nearly as good at making faces unrecognisable to machines.


I've seen this sentiment mentioned quite a bit, but is it still true at the level of blur shown in their example images? The blur level is so high that it has essentially left behind a smooth gradient. Even with the algorithm known, is there enough recoverable information left?


Well, if you want to leave behind a smooth gradient, then leave behind a smooth gradient.

    Suggestion for an algorithm:
    * start with the blur
    * sample the four colors at the four corners of the blurred region
    * quantize them
    * fill in the region with bilinear interpolation.
Then the whole region can only reveal those four quantized colour values. If you only blur, you'll have a much harder time bounding how much information leaks.
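Something like this, as a rough numpy sketch for grayscale images (the blur step only changes which corner colours you end up sampling, so it's skipped here; the 8-level quantization is an arbitrary choice):

```python
import numpy as np

def redact_gradient(img, y0, y1, x0, x1, levels=8):
    """Fill img[y0:y1, x0:x1] with a bilinear gradient built from the four
    quantized corner values, so only those four values can leak."""
    out = img.astype(float).copy()
    region = out[y0:y1, x0:x1]
    h, w = region.shape
    step = 256 // levels
    # Sample the four corners and quantize them.
    tl = region[0, 0] // step * step
    tr = region[0, -1] // step * step
    bl = region[-1, 0] // step * step
    br = region[-1, -1] // step * step
    # Bilinear interpolation across the region: every filled pixel is a
    # convex combination of the four corner values.
    v = np.linspace(0, 1, h)[:, None]
    u = np.linspace(0, 1, w)[None, :]
    region[:] = (tl * (1 - u) + tr * u) * (1 - v) + (bl * (1 - u) + br * u) * v
    return out.astype(img.dtype)
```

Since the fill is a convex combination of the corners, you can state the leak precisely: nothing outside those four quantized values survives.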


Or perhaps

  * detect (or get the user to select) faces
  * replace the pixels in the face bounding box with a generic face
  * blur the bounding box edges
  * do whatever blur you think ends up "looking nice"


It's missing the step of detecting the skin tone and applying it to the generic face. Right now I can't imagine people being too happy about a "generic white face blur", which is what you'd get if the generic face were white.


I reckon all faces should de-blur into George Floyd's face...

(From the examples in the Signal blog post, I don't think that grey gradient box is gonna be able to specifically imply white or POC faces...)


I suspect there's not much information in an individual blurred face, but I wonder whether, given enough examples, you could determine with any confidence whether a particular unblurred face is the one in a sample of images. You can do that with text (http://dheera.net/projects/blur).


The face and its surroundings are blurred almost to a single colour. The average RGB value of my face might be unique-ish, but if you mix in some variable background, photographed on a camera whose lens has been smeared against the pocket of someone's jeans, the result should be human, not individual.

A side comment: AFAICT what the Signal developers have done is take code that was developed so that the phone camera could autofocus on faces, and used that code to defocus faces. What a sweet hack.


A very sweet hack, but I think the concern was based on the example image in the link posted. While the face is blurred, there's still a lot of information you can glean about the person: their haircut, their neck, the clothes worn, etc. So I'm guessing the threat here is that if you also have a general set of pictures from the same demonstration, you may be able to automatically identify who the blurred person is.

Blurring is better than nothing, but the best picture when it comes to avoiding being traced is the one that was never taken.


Let's be real for two seconds: this is pure nonsense. No court of law would accept "we arrested that guy because he has two eyes, a mouth, and the same t-shirt as that other guy who was protesting yesterday". If it comes to that, you wouldn't even need a picture of blurred faces; just arrest whoever you want and provide forged evidence (or none), because that's exactly the same thing.

And even then, law enforcement is already filming protesters (CCTV plus aerial footage) and tracking their phones; the last thing you have to worry about is a fully blurred face that no amount of technical power could process or match back to you.


Picture A of an individual, unblurred, protesting peacefully.

Picture B of a blurred individual from later in the same protest, wearing the exact same clothes, committing questionable acts, is circumstantially incriminating.


bikeshedding? on hacker news? no way


You're overthinking it. Police already have their own camera people doing video surveillance in addition to CCTV and other surveillance tools. The sort of forensic analysis you mention is of course possible and is sometimes engaged in, but obscuring all such information would defeat the purpose of photojournalism altogether.


Yes. Someone who has access to many photos of the same set of people might well be able to identify people in one photo, even though their faces are blurred in that particular one.

I'm not sure whether the large number of photos nowadays is a net negative, though. That's also what finally stopped Derek Chauvin.


It shouldn't blur, it should be a black box.

You could definitely take Signal's code, run it over a set of test images, and find which output matches the target image most closely.


What set of test images?

https://www.androidpolice.com/wp-content/uploads/2020/06/04/... is blurred by Signal. Suppose that you have all the photos that have been posted to Facebook, and that both of those women are on Facebook, and lastly that you have resources enough to run all of those through the Signal code. How would you match those other photos to the blurred part of this one?


Paging clearview.ai's enterprise sales department. Call for you on line 7...


Not just any black spot either. A black spot of random size larger than what you want to redact. That way you avoid leaking the size of what's being redacted. The size of what's being redacted can sometimes provide enough information to determine plausible contents: http://blog.nuclearsecrecy.com/2014/07/11/smeared-richard-fe...
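The padding idea is only a couple of lines; a sketch (the `max_pad` value is an arbitrary choice, and you'd clamp the result to the image bounds in practice):

```python
import random

def padded_redaction_box(x, y, w, h, max_pad=40):
    """Grow a redaction box by a random margin on each side, so the box
    size no longer reveals the size of what's underneath."""
    left, top = random.randint(0, max_pad), random.randint(0, max_pad)
    right, bottom = random.randint(0, max_pad), random.randint(0, max_pad)
    return x - left, y - top, w + left + right, h + top + bottom
```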


Ok, but just to be clear, we're redacting faces here. There isn't much meaningful information left other than an exceptionally rough indication of age/development.


The examples on the Signal website give you hair color, hair style, likely race, and the shape of the top of the protesters' ears. While it's not definitive, given that a fuller redaction is easy and has no disadvantages, I don't see why someone shouldn't try.


> "The face and its surroundings are blurred almost to a single colour."

To your eyes, maybe. To a machine, you have an array of pixels, each with different values which, using an algorithm, could be adjusted into something your eyes can resolve into a unique face.


Seriously? Look at https://www.androidpolice.com/wp-content/uploads/2020/06/04/... — do you really think there's enough information in those two rectangles to reconstruct the faces even approximately?


Hard to tell, I'm not a computer; but it does look better than most. To be fair (to me), I was basing my critique on the picture in TFA, which seems to have far more detail in it.

That said, the whole point of my post was that humans are really bad at judging this. Many blur algorithms can be reversed because they just modify the color values of the pixels in a reversible way. You can't always tell by looking at a picture what data is still there, in much the same way you can't see the stars in an ISO 200 picture of the night sky. It's not until you open it in GIMP and crank the exposure up to max that you see just how much data is there that your eyes couldn't perceive.
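A toy numpy illustration of that last point: an array whose values are all under 4 out of 255 looks uniformly black, but rescaling (the "crank the exposure" step) shows the structure was there all along.

```python
import numpy as np

rng = np.random.default_rng(42)
# A "black" frame: every pixel is 0-3 out of 255, invisible to the eye.
dark = rng.integers(0, 4, size=(16, 16)).astype(np.uint8)

# Crank the exposure: stretch the tiny value range to full scale.
boosted = (dark.astype(np.float64) * (255 / 3)).astype(np.uint8)

# No information was created or destroyed, it was just made visible:
# the original values come back exactly from the boosted frame.
restored = (boosted.astype(np.float64) / (255 / 3)).round().astype(np.uint8)
```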


I kinda hope there's not only enough information there to algorithmically reverse, but that when you do that - all the de-blurred faces end up being George Floyd.


The number of distinct characters is quite limited. Does this work for Chinese, or only for Latin-script languages?


Probably not. You can remove Gaussian blur by performing the inverse convolution (it's tricky because you need to find the actual parameters that caused the initial blur), you can remove motion blur the same way, etc. This looks like there isn't nearly enough information there to do any of this, though.


No chance. That level of blur is essentially impossible to reverse. I think lots of people here are a bit confused because they know that all gaussian blurs are theoretically reversible. But they aren't thinking about how ill-conditioned the inverse gets as the blur gets larger and larger.


There's another concern—even if I can't usably invert the convolution, if I have photos of a thousand people's faces and one of them is that blurred face, can I figure out which one with high confidence?


No, not with this level of blur.


Why don't they just literally cut out/replace whatever would be blurred with black pixels? Why blur anything at all?


Aesthetics. A blurred face looks better in a picture than a fat black box.

But as others have pointed out, you can achieve (almost) the same effect if you remove enough information before blurring, or just draw a smooth gradient; it's just harder to make that alone look as nice as blurring the actual image.


They should simply explain what convolution they are using; then it would be easy to know.


I've been digging through the latest related commit in the repo: https://github.com/signalapp/Signal-Android/commits/master

They appear to use "com.google.firebase:firebase-ml-vision-face-model:20.0.1" to detect the faces.

The actual blur appears to be done here: https://github.com/signalapp/Signal-Android/blob/514048171bf...

Not sure exactly what "ScriptIntrinsicBlur" does; it appears to come from the Android SDK itself: import android.renderscript.RenderScript;

EDIT: https://developer.android.com/reference/kotlin/android/rende...

It's a gaussian blur filter with a radius of 25px if I understand the code correctly.


FWIW this was updated before release to also scale down the image before blurring it. We cut the size in half, or cap it to 300x300, whichever is smaller. This was to ensure that the effectiveness of the blur isn't reduced on higher-resolution images. https://github.com/signalapp/Signal-Android/blob/master/app/...
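If I read that right, the sizing rule works out to roughly the following (a sketch of my understanding, not the actual Signal code; the exact rounding may differ):

```python
def blur_input_size(w, h):
    """Halve the image, but cap the longer side at 300 px, so a fixed
    25 px blur radius stays proportionally strong on big photos."""
    w, h = w // 2, h // 2
    scale = min(1.0, 300 / max(w, h))
    return round(w * scale), round(h * scale)
```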


You should perform a non-invertible blur, though. Or, even easier, use the same noise image for all faces.

EDIT: come to think of it, you can generate random noise using a palette from the color in the blur area (say, take four or five colors and mix them).

Applying a convolutional blur for anonymization is very risky, because you might end up with something either invertible or nearly so.


Ouch: a Gaussian blur might be invertible if you are not careful. That is why you need the explicit parameters of the convolution.

Thanks for digging.


I thought security through obscurity didn't work ^^. /s


Doesn't seem to be that good:

https://news.ycombinator.com/item?id=23422993

In fact it almost seems like the actual blur used in the app is different from what they show in the article.


I tend to pixelate regions I want to make unrecognizable in photos, instead of using a Gaussian blur, for exactly this reason. Pixelation should be safe as long as the pixels are large enough, right?

I wonder why Signal didn't do something like that...
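For reference, pixelation is just block-averaging, and the information loss is easy to state: exactly one value per tile survives. A numpy sketch (works for grayscale or RGB arrays; the block size is up to you):

```python
import numpy as np

def pixelate(img, block=16):
    """Replace every block x block tile with its mean colour. Each tile
    keeps exactly one value, so the leak is bounded and obvious."""
    out = img.astype(float).copy()
    h, w = out.shape[:2]
    for y in range(0, h, block):
        for x in range(0, w, block):
            out[y:y+block, x:x+block] = out[y:y+block, x:x+block].mean(axis=(0, 1))
    return out.astype(img.dtype)
```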


I think there's a way to do super-resolution on pixelated video.

It's okay for still images, but video leaks a lot more information across frames. Just black everything out.


You should also discretize the pixel colors.

Note that if you blur the pixelated region, it'll be just as aesthetically pleasing as the reversible Gaussian blur.
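Discretizing is one line in numpy (8 levels per channel is an arbitrary choice):

```python
import numpy as np

def quantize(img, levels=8):
    """Snap each channel to `levels` evenly spaced values, destroying the
    low-order bits an inverse filter would otherwise exploit."""
    step = 256 // levels
    return (np.asarray(img) // step * step).astype(np.uint8)
```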


A good blur sheds far too much information to be meaningfully reversible.


But a bad blur doesn't. That's why your parent comment asks what kind of blur they use.

Edit: Turns out it's a 25px Gaussian blur. There's some downsampling beforehand, but not much, and no color discretization. In other words, they use a bad blur, but compensate with a large security margin. I wouldn't be surprised if this was vulnerable to "if I have a thousand photos of faces and I think one of them matches the blurred face, I can figure out which one with high confidence", and they can get basically the same aesthetic effect if they heavily pixelate and discretize colors before blurring.

https://news.ycombinator.com/item?id=23415600


> "if I have a thousand photos of faces and I think one of them matches the blurred face, I can figure out which one with high confidence",

Is that actually a practical attack here? I can see it working if you have 1000 passport photos or mugshots and a blurred photo with the same level, front-on framing. But is there a practical attack for blurred pictures that aren't directly facing the camera? (Assuming the scale of "locals at a local protest" instead of "find Edward Snowden's blurred face at any BLM protest, no matter what the cost!!!")


Good point, I'd be surprised if lighting and orientation didn't overwhelm all other information at this level of blur.

But I've been surprised by impressive digital forensics before—what if you can determine lighting and orientation from the rest of the photo, and then simulate them on each passport photo/mugshot? I'd still feel much more comfortable if they pixelized and color-discretized before blurring, and I still think the aesthetic effect would be much the same.


Citation needed.


2012: https://www.instantfundas.com/2012/10/how-to-unblur-out-of-f...

2017: https://arxiv.org/pdf/1702.00783.pdf (Pixel Recursive Super Resolution)

2020: https://venturebeat.com/2020/01/22/researchers-use-ai-to-deb...

Edit: Most face-recognition software works by down-sizing and blurring an image to detect face features faster. So in theory it is very easy to detect face features from a blurred image. A deblur tool can then use this information to better deblur a face.


But is deblurring from hand shake, an out-of-focus lens, or even a Gaussian blur the same as the random gradient blur they seem to be using?

Edit: The images in the Signal article don't look like images of blurred faces. They look like blurry images overlaid onto faces. If the face itself isn't blurred, how can it be unblurred?


Yes, they are using a 25px Gaussian blur, they are not overlaying a different image on top: https://news.ycombinator.com/item?id=23415600


Impressive examples. Thanks for posting them!

So the ridiculous "Enhance!" one sees in TV crime dramas could one day actually come true.


You can't conjure data that isn't there. It's fundamentally going to be a guess. You can "enhance" your way to a face or a license plate, but there is zero guarantee it will be the face or the license plate that the low-quality image/video actually shows. This is why solid blocks of colour or emojis are so effective at censoring images: they take the data and replace it with pure junk.


If you know it's a gaussian blur with a known radius, you can uniquely reverse it.
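Here's that reversal in miniature with numpy: a circular Gaussian blur with a small, known sigma is undone exactly by dividing in the frequency domain (real camera blurs add noise and boundary effects, so this is the best case):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64))          # stand-in grayscale "image"

# Known blur kernel: circular 2-D Gaussian with sigma = 1.
sigma, n = 1.0, 64
d = np.minimum(np.arange(n), n - np.arange(n)).astype(float)
g = np.exp(-d ** 2 / (2 * sigma ** 2))
kernel = np.outer(g, g)
kernel /= kernel.sum()

# Blur = multiplication by the kernel's transfer function K.
K = np.fft.fft2(kernel)
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * K))

# Deblur = division by the same transfer function.
recovered = np.real(np.fft.ifft2(np.fft.fft2(blurred) / K))
```

Crank sigma up toward 25 (the radius Signal uses) and the division blows up on floating-point rounding, which is the ill-conditioning mentioned elsewhere in the thread.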


You do, however, lose colour-depth information (e.g. deep colour to true colour, true colour to high colour / 256 colours). Still enough to detect a face.


Yes, that works if the face itself is blurred, not if random noise is used in place of the face.


They're not using random noise in place of the face, they're using a 25px Gaussian blur: https://news.ycombinator.com/item?id=23415600


Nice find. That is unfortunate then. I thought they'd make more effort.


Trivial to test: deblur and sharpen a blurred picture and pass it to, say, OpenCV.



