Comment by lifis

1 day ago

Obviously an image picker shouldn't leak filenames... The filename is a property of the directory entry storing the file storing the image. The image picker only grants access to the image, not to directories, directory entries or files.

If you want filenames, you need to request access to a directory, not to an image.

"Obviously"

There are plenty of use cases where the filename is relevant (and many, many people intentionally use the image name for sorting / cataloging).

  • I have had more cases where I was very surprised that the local filename I used for something became part of its record when I uploaded it somewhere. (For instance, uploading an MP3 using Discord on desktop web.)

  • There are many, many more cases where the user doesn’t expect the name to become public when he sends a photo. If I send you a photo of a friend, that doesn’t mean I want you to know his name (which is the name I gave the file when I saved it).

    • I email images as attachments very, very frequently. I go through the browser's file picker and I pick out the photo by its filename. I would be surprised and angry if somewhere along the way the filename got changed to some random string without my knowledge and consent.

      In fact, I often refer to the name of the photo in the body of the email (e.g., "front_before.jpg shows the front of the car when I picked it up, front_after.jpg shows it after the accident.")

      I imagine this is an extremely common use case.

    • So in webmail, when you upload an image / file to attach it to an email, you expect it to be renamed? I don't.

The path is different from the filename, though. If I want to find duplicates, it will be impossible if the filename changes. In my use case,

/User/user/Images/20240110/happy_birthday.jpg

and

/User/user/Desktop/happy_birthday.jpg

are the same image.

  • > it will be impossible if the filename changes.

    Not impossible, just different and arguably better - comparing hashes is a better tool for finding duplicates.

    • From a technological standpoint, sure. I'd argue when you're staring down the barrel of 19,234 duplicate file deletions, with names like `image01.jpg`, `image02.jpg` instead of `happy_birthday.jpg`, there's a level of perceptual cognitive trust there that I just can't provide.

  • If your camera (or phone) uses the DCF standard [0], you will eventually end up with duplicates when you hit IMG_9999.JPG and it loops around to IMG_0001.JPG. Filename alone is an unreliable indicator.

    [0]: https://en.wikipedia.org/wiki/Design_rule_for_Camera_File_sy...

    • > loops around to IMG_0001

      Almost all cameras create a new directory, e.g. DSC002, and start from IMG_0001 to prevent collision.

    • Which systems still use this shortsighted convention? All photos I’ve taken with the default camera app in the last many years are named with a timestamp.

  • > If I want to find duplicates, it will be impossible if the filename changes.

    Depends on what is meant by a "duplicate." A checksum of the file can detect exact data duplicates, but not copies where metadata was removed or the image was rescaled. Perceptual hashing is more expensive but is better at matching rescaled or cropped images.

    https://en.wikipedia.org/wiki/Perceptual_hashing
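
For what it's worth, the checksum approach mentioned above can be sketched in a few lines of Python. This is only a rough illustration, not anyone's actual tool: `file_digest` and `find_duplicates` are made-up names, SHA-256 is an arbitrary choice of hash, and it only catches byte-identical files (a rescaled or metadata-stripped copy will not match).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root by content; filenames and paths are ignored."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Keep only groups where two or more files share the same bytes.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Because the grouping is by content, two copies of the same image end up in the same group whether or not they share a name, and two different photos that happen to share a name do not.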

It's not "obvious" at all, since it's contextual: it depends on the purpose and semantics of whatever service you're uploading the photo to.

Depending on how it'll be used next, not only can the current filename be important, I may even want to give something a custom filename with more data than before.