They see your photos

1 year ago (theyseeyourphotos.com)

  magick convert IMG_1111.HEIC -strip -quality 87 -shave 10x10 -resize 91% -attenuate 1.1 +noise Uniform out.jpg

This will strip all EXIF metadata, set the JPEG quality to 87, shave 10 pixels off each edge just because, resize to 91%, set the noise attenuation factor to 1.1, and add noise of type "Uniform".

Some additional notes:

- attenuate needs to come before the +noise switch in the command line

- the lower the JPEG quality figure, the harder it is to detect image modifications[1]

- resize percentage can be a real number - so 91.5% or 92.1% ...

So, AI image detection notwithstanding, you can not only remove metadata but also make each image you publish different from the others - and certainly very different from the original picture you took.

[1] https://fotoforensics.com/tutorial.php?tt=estq
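If you script this, the parameter values can be drawn at random per image so that no two published files share the same transformation. A rough Python sketch that only builds the ImageMagick command line (the ranges here are arbitrary illustrative choices, not recommendations):

```python
import random
import subprocess

def scrub_command(src, dst):
    """Build a magick command with per-image randomized parameters."""
    quality = random.randint(84, 92)                 # JPEG quality
    shave = random.randint(6, 14)                    # pixels per edge
    resize = round(random.uniform(89.0, 93.0), 1)    # resize percentage
    attenuate = round(random.uniform(0.8, 1.3), 2)   # scales the added noise
    return [
        "magick", "convert", src,
        "-strip",                        # drop EXIF and other metadata
        "-quality", str(quality),
        "-shave", f"{shave}x{shave}",
        "-resize", f"{resize}%",
        "-attenuate", str(attenuate),    # must precede +noise
        "+noise", "Uniform",
        dst,
    ]

cmd = scrub_command("IMG_1111.HEIC", "out.jpg")
# subprocess.run(cmd, check=True)  # uncomment to actually run ImageMagick
print(" ".join(cmd))
```

Scrambling the values this way is what makes each output differ from the others as well as from the original.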

  • > So, AI image detection notwithstanding, ...

    There is no point in trying. You're just making it worse for human viewers

  • Clearly better than nothing, but how does it work with perceptual hashes? I gave it five minutes to try to get pHash to run locally but didn't manage to get any useful results from it, I was probably holding it wrong.

    • I’ve been working with perceptual hashes a lot lately for a side project, and my experience is that they are extremely resilient to noise, re-encoding, resizing, and some changes in color (since most implementations desaturate the image). Mirroring and rotation can in theory defeat perceptual hashing, but it’s fast enough to compute that if you care you can easily hash horizontally and vertically mirrored versions at 1-degree increments of rotation to identify those cases. Affine transformations can easily defeat some perceptual hashing algorithms, but others are resistant to them.

      The big weakness is that most perceptual hashing algorithms aren’t content aware, so you can easily defeat them by adding or removing background objects that might not be noticed or considered meaningful by a human observer.
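      My rough intuition for why noise barely moves these hashes: an average hash reduces the image to a tiny grid and thresholds each cell against the mean, so per-pixel noise mostly cancels within each cell. A pure-stdlib toy illustration on a synthetic grayscale image (this is a toy aHash, not the pHash library's DCT-based algorithm):

```python
import random

def average_hash(pixels, grid=8):
    """Downscale an HxW grayscale image to grid x grid by block-averaging,
    then emit 1 bit per cell: is the cell above the global mean?"""
    h, w = len(pixels), len(pixels[0])
    bh, bw = h // grid, w // grid
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            block = [pixels[y][x]
                     for y in range(gy * bh, (gy + 1) * bh)
                     for x in range(gx * bw, (gx + 1) * bw)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return [1 if c > mean else 0 for c in cells]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

random.seed(0)
# Synthetic 64x64 "image": a bright square on a dark background.
img = [[200 if 16 <= y < 48 and 16 <= x < 48 else 30
        for x in range(64)] for y in range(64)]
# Same image with strong uniform noise added to every pixel.
noisy = [[min(255, max(0, p + random.randint(-40, 40))) for p in row]
         for row in img]

h1, h2 = average_hash(img), average_hash(noisy)
print(hamming(h1, h2))  # distance stays small despite heavy noise
```

      Real pHash thresholds low-frequency DCT coefficients rather than plain block averages, which makes it even more robust to re-encoding.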

    • Could probably get one of the many repos up and running pretty quickly [1].

      Potentially what you could do is generate smaller versions of the images, test their hash matching under different conditions against multiple algorithms and then pick the parameters where you get fewest hash collisions.

      [1] https://github.com/JohannesBuchner/imagehash

  • Cropping or recompressing the image doesn't help at all.

    See TinEye reverse image search; it matches such variations with ease:

    https://tineye.com/

    • You and several of your siblings here are missing the point - this is not about resisting or obfuscating image content or subject or fooling an AI classifier, etc.

      This imagemagick command is an attempt to remove digital forensic clues that would tie, for instance, an image posted by one pseudonym to an image posted by another pseudonym.

      At what confidence level can a raw HEIC from my iPhone be tied to the JPEG that results from this cropping, resizing, noise and attenuation?

      At what confidence level can one such transformed JPEG be tied to another such transformed JPEG? (Assuming that you scramble the values for quality/shave/resize/attenuate...)

      This is tangential to the OP and the discussion - forgive me - but I think it's an interesting tangent.

  • If the image is watermarked, you can't remove it that way. Watermarks easily survive uniform noise higher than humans can tolerate. Watermark data is typically stored redundantly in multiple locations and channels, so uniform noise mostly averages itself out, and cropping won't do much. Watermarks often add signal in a different color model than RGB and in a heavily transformed domain of the image, so you're not adding noise along the "axis" of the watermark's signal.

    For similarity search, it also won't do much. Algorithms for this look for dozens of "landmarks", and then search for images that share a high percentage of them. The landmarks traditionally were high-contrast geometric features like corners, which wouldn't be affected by noise. Nowadays, landmarks can be whatever a neural network learns to pick when trained against typical deformations like compression and noise.
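    The redundancy point can be illustrated with a toy spread-spectrum sketch (not any real watermarking scheme): each payload bit is spread as a small ±delta across many samples, so averaging at extraction cancels per-sample noise that is far larger than the embedded signal:

```python
import random

def embed(samples, bits, delta=3.0):
    """Spread each payload bit as +/-delta across a chunk of samples."""
    chunk = len(samples) // len(bits)
    out = list(samples)
    for i, b in enumerate(bits):
        sign = 1.0 if b else -1.0
        for j in range(i * chunk, (i + 1) * chunk):
            out[j] += sign * delta
    return out

def extract(samples, nbits):
    """Recover each bit by averaging its chunk: zero-mean noise cancels."""
    chunk = len(samples) // nbits
    return [1 if sum(samples[i * chunk:(i + 1) * chunk]) / chunk > 0 else 0
            for i in range(nbits)]

random.seed(1)
payload = [1, 0, 1, 1, 0, 0, 1, 0]
carrier = [0.0] * 8000          # stand-in for a zero-mean image residual
marked = embed(carrier, payload)
# Per-sample noise ten times stronger than the embedded delta:
noisy = [s + random.uniform(-30.0, 30.0) for s in marked]
print(extract(noisy, len(payload)) == payload)  # the payload survives
```

    Real schemes additionally interleave the chunks spatially and embed in a transformed domain, which is why cropping and RGB-space noise accomplish so little.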

  • Does this remove the color profile, though? I strip all mine with exiftool, but I exclude the color profile otherwise the entire image is screwed, especially if it's in some odd colorspace.

  • I don't really understand the target audience for this. If I don't want my personal media on the internet, I just don't put it there.

Keep in mind, when you use this, you are waiving your right to sue the authors in court: https://theyseeyourphotos.com/legal/terms

> PLEASE NOTE THAT THESE TERMS CONTAIN A BINDING ARBITRATION PROVISION AND CLASS ACTION/JURY TRIAL WAIVER.

These are common in the US, and consistently upheld in the US. Curiously, Ente did not add the opt-out provision they have in their usual ToS (https://ente.io/terms). I wonder why they made their Terms more restrictive for this specific service only.

This is just an ad for their photo service. Which presumably has terrible search features, if it doesn't use AI to analyse them. That's one of the best features in Google Photos!

  • Hey, one of the folks working on the said photo service here.

    Ente has reasonably good search[1] powered by on-device machine learning[2].

    [1]: https://ente.io/blog/machine-learning

    [2]: https://ente.io/ml

    • Thanks for the clarification.

      My family library is around 1 TB, the weakest device is an iPhone SE, and the most common access is through a web browser.

      Does on-device machine learning (provided you're syncing inter-device) work in that scenario?

    • Thanks. I would gladly take away the "convenience of AI search" for the privacy that your service provides.

      I used Ente once, and it was great, but I am poor, so I just store my images locally now. Not that your service is expensive or not worth it because I think it is.

  • Terrible search without AI is a bit of a stretch. Also Google does not have a monopoly on object/face recognition in photos. There are self-hosted solutions that readily provide you with that without feeding a faceless AI with your photos while boiling the ocean.

    • > Terrible search without AI is a bit of a stretch.

      How so? I was looking for a photo of a grave I took some years ago. In Google Photos I just searched for "grave" and it found 2 photos, including the one I wanted.

      Without AI I would have to search all my photos. Maybe I could narrow it down by date and location but it would take a lot longer.

  • It is a weird advertisement when Ente (the name of the service) will also see your photos.

    • Ente is client-side E2EE at rest, with on-device AI; it's open source and audited.

      E2EE solves all of these issues as long as it's open source and reproducible.

      Efforts like these should be praised.

    • The service is end to end encrypted with local AI for indexing.

      I tried it a few months ago however and the upload/encryption was so slow from their desktop app it would have taken weeks to migrate my photos to the service.

    • I think it's a pretty clever advert to be fair.

      And I really like some of the stuff they are doing.

      Their TOTP app is great.

  • How do you opt out of Google updating your user/advertising profile based on information they glean from your photos?

    • Google has been caught multiple times violating their own rules and the law to use all the information they have on you for advertising purposes. The only opt out is to stop using their services.

    • Move to the EU.

      Somehow, neither Google, nor Microsoft, nor Samsung, nor (probably) any other big tech company, can usefully extract data from photos anymore. Face recognition in particular works like one of those Shabbat-compatible appliances: something gets extracted at some point, eventually, but infrequently, and only when you're not looking - and, most importantly, it's not possible for you to control or advise the process. The AI processing runs autonomously in such a way that you may start doubting whether it's happening at all!

      I assume that this is the vendors' workaround for GDPR and such in relevant jurisdictions, but this also makes face search/grouping nearly useless. Don't get me wrong - I'm very much with the EU on data protection and privacy, but being gaslit by the apps themselves about the extent of and reasons for ML limitations in those apps, that's just annoying.

  • > This is just an ad for their photo service.

    Good. The more companies that treat users like humans instead of chattel the better.

This would be pretty great for generating descriptions for the vision-impaired, but it doesn't provide any profound insight beyond what you can tell from a glance.

It has a lot of "trying to sound smart" waffle, for example, it had this to say about some tree branches:

> A careful observer will also note the subtle variations in the thickness and texture of the branches, implying a natural, organic growth pattern.

Gee, thanks, I might've thought it was an unnatural inorganic tree otherwise.

  • My guess is they had "Elaborate on some subtle details of the photo" or "What conclusions can you draw from the situation?" or something as an instruction in the prompt, because it seems to try this with any photo, regardless of whether there are any noteworthy details or implications in it.

    I get the idea - demonstrating some "Sherlock Holmes style" inference of hidden facts from the photo - but it gets ridiculous if there is nothing for the model to find.

  • Facebook has been doing basic alt-text generation for years (not the best, but better than nothing), e.g.: May be an image of 4 people, people smiling and text

It'd be more terrifying if it didn't hallucinate earrings on somebody whose ears are out of frame, make comments about the left shoe of a barefoot child being out of focus, and so forth...

  • Heh. We have some of those Harry Potter style "floating" candles hanging above our dining table right now. I uploaded a photo that included them prominently -- it gave a great description of everything else in the room but ignored them completely. I was imagining it thinking desperately "don't hallucinate floating candles, don't hallucinate floating candles".

  • I had it trying to guess the economic status of a snow leopard.

    > The image centers on a single snow leopard; there are no humans present. The leopard's expression is alert and slightly wary but not aggressive. It's difficult to definitively determine the leopard's age or exact health from the image, but it appears to be an adult in relatively good physical condition. There are no clear indications of its economic status or lifestyle

    • Hilarious! The text generator is primed to remark on the subject's economic status. Because that sounds greasy when it analyzes your children. A snow leopard must have a rad lifestyle too.

    • The economic status thing is interesting. I can’t help but wonder if there’s bias there.

      I uploaded a picture of me and a friend. I am Caucasian, he is of African descent. It said that my attire indicated I was of a higher socio-economic status than him. I was wearing a black t-shirt with a worn print. He was wearing a shirt and sports jacket.

  • It is terrifying exactly because it does hallucinate.

    • "The subtle shadow suggests that he is a well-known terrorist. The paving stones in the square appear to be recently laid, implying relatively recent explosives training background."

        cat *.txt | grep -nE 'explo|terror' | fbi -open -up
      

      Expect something similar to this in the near future.

  • It hallucinated seeing something on a jacket, and another item of clothing for a few I tried too.

    It was very interesting to see when it was confused. It's not like when I think of LLM word math that I could sort of guess yeah maybe it comes up with that.

    The visual hallucinations were things like "that's straight up not in the image...".

As much as I appreciate the effort to create a technological solution that avoids big tech like Google, I find the best way is still prints. I'm usually 'the photographer' in the family and after an event I just order prints to the house of the relevant family members (or bring them over myself). Nothing can really compare to holding the physical product in your hand.

Additionally, due to the small cost of prints, there's a real incentive to only show a few of the best so that it doesn't devolve into endless scrolling.

  • Have you checked the privacy policy of your photo lab/printer? It's possible that they're collecting digital copies of your pictures, selling them (or just information about them) to third parties, as well as selling them/turning them over to the police and other government agencies.

    • Yes, I do. I read the privacy policy of all the websites I sign up for. In fact, that is the exact reason why I never got a Facebook account. When I read their privacy policy when it first came out when I was an undergraduate student, I was horrified and never signed up.

      Of course, that doesn't guarantee everything in this deceptive world, but it's the best I can do certainly.

  • Making prints easily available could be a good business idea for a photo storage app

I uploaded a photo of some damage I found on my chimney because of bad flashing, and it was surprisingly insightful. Although, it said my house was dilapidated and neglected. Hey man, fuck you.

Anyways, I’m pretty skeptical on most AI shit, but using it to help steer me in the right direction with home repair actually sounds pretty compelling considering how it’s nearly impossible to find contractors who aren’t full of shit, affordable, and actually show up.

It seems to engage in the same kind of saying-a-lot-without-actually-saying-much that LLMs do these days. I uploaded several private images, and beyond a mediocre description of the scenes, it didn't provide much identifying information. E.g.: "The background features a mix of modern and older buildings characteristic of a European city, with a mix of architectural styles."

  • Huh, my experience was quite different. I fed it this picture:

    https://postimg.cc/yD4YZKFk

    And it led with "The image shows the interior of Union Station in Chicago Illinois," even though I'm certain there's no location data in the photo.

    There's some half-baked art critique at the end, but it got the exact place, and the Christmas tree, and the people sitting on the benches. It did miss the jazz ensemble off to the side.

Ente looks like Immich[0] (which I self-host for myself and family) with e2ee. I like non-e2ee because if something breaks then the files are stored as-is on disk for easy retrieval.

[0]: https://immich.app/

  • It's amazing how close it looks to Google Photos, Immich might be the perfect fit if Google chooses to hike the price again.

> The age and other details (racial characteristics, ethnicity, economic status, lifestyle) are impossible to ascertain regarding the swan.

Emphasis added.

  • I uploaded a bunch of drawings and it explicitly mentioned "racial characteristics, ethnicity, economic status, lifestyle" could not be ascertained for almost all of the portraits. I'm not a great artist but it was able to pick up a lot of detail about everything else. I imagine the prompt is probably asking for these things and the AI is reluctant to answer, although it did say that the artist was probably male due to the art style. I am male, and I suppose my art style could be more masculine, but I don't know how to quantify that!

    • I'm sure that's in the prompt. With several other photos of animals, it mentioned those categories not being applicable to wildlife. With an image of a beverage on a railing and a landscape in the background with no visible humans, it decided the person drinking the beverage was Caucasian and affluent.

I uploaded an old image of a keyboard PCB from when I was troubleshooting it and it gave a very detailed response including naming the keyboard the PCB comes from, the time of day the photo was likely taken, and where the photo was likely taken.

Delay between uploading and response led to me uploading the pic 3 times.

The result: The AI analyzed the pic 3 times and each time added more detail - like the model of the burned out SUV, text on a traffic sign and more in-depth analysis of objects laying around the SUV.

A fourth upload yielded some pure conjecture; it seemed to be looking for increasingly sinister causes.

    There appears to be some damage to the windows of the car that is more than just fire damage suggesting that the vehicle may have been vandalized or attacked before the fire occurred. The debris scattered around the car is inconsistent, suggesting a possibility that the fire was not accidental.

EXIF tags in images can have your camera and GPS info.

You can clear EXIF tags before sharing files.

Most social networks will remove EXIF tags from pictures before serving them to other people. Except silly social networks used by insurrectionists.

But even without the EXIF tags there's plenty of info that can be extracted from a picture. Face biometrics is one of them.
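To check whether a stripped or re-served JPEG still carries EXIF, one option is to scan its segment markers for an APP1 segment with an "Exif" header. A simplified stdlib sketch (not a full JPEG parser; the byte strings at the bottom are contrived test inputs, not real images):

```python
def has_exif(jpeg_bytes: bytes) -> bool:
    """Scan JPEG segment markers for an APP1 segment carrying EXIF data."""
    if jpeg_bytes[:2] != b"\xff\xd8":            # SOI marker: not a JPEG
        return False
    i = 2
    while i + 4 <= len(jpeg_bytes):
        if jpeg_bytes[i] != 0xFF:
            break                                 # lost sync; stop scanning
        marker = jpeg_bytes[i + 1]
        if marker == 0xDA:                        # start of scan: no more headers
            break
        length = int.from_bytes(jpeg_bytes[i + 2:i + 4], "big")
        if marker == 0xE1 and jpeg_bytes[i + 4:i + 10] == b"Exif\x00\x00":
            return True
        i += 2 + length                           # jump to the next segment
    return False

# Tiny synthetic examples (not real image data):
with_exif = (b"\xff\xd8" + b"\xff\xe1" + (10).to_bytes(2, "big")
             + b"Exif\x00\x00\x00\x00" + b"\xff\xda")
without = b"\xff\xd8\xff\xda"
print(has_exif(with_exif), has_exif(without))  # → True False
```

In practice `exiftool` gives a far more complete answer; this only shows how little it takes to spot the tag block.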

This is really cool. I posted a photo of what I think were my great grandparents into it and it explained their circumstances in fascinating ways (to the point of mentioning aged clothing, a detail I overlooked).

I’ve been trying to figure out how to process hundreds of my own scanned photos to determine any context about them. This was convincing enough for me to consider google’s vision API. No way I’d ever trust OpenAI’s apis for this.

Edit: can anybody recommend how to get similar text results (prompt or processing pipeline to prompt)?

  • > consider google’s vision API. No way I’d ever trust OpenAI’s apis for this.

    lol?

  • I use ChatGPT every day and I just throw a pic and say "alt text", that will give you insane detail, but also limited because the prompt itself insinuates a shorter description for a HTML tag.

    I just threw a pic in here of my gf holding a loaf she just made and part of it said "The slight imperfections on the bread's crust indicate it's freshly baked, and the woman's posture and facial expression suggest that she is very pleased with her creation."

  • Where is the difference in trust level between google and openai coming from?

    • One company has the capacity to maintain HIPAA compliance and the other is best known for vacuuming up the entire web and users' prompts. For something as sensitive as family photos, I know which company/product I'd prefer for this potential project.

  • Yeah, I know the point of this site is to give us a dystopian shock by showing us how much information Big Tech extracts from our photos, but it's inadvertently a pretty good advertisement for Google's Vision API. It did a fantastic job of summarizing the photos I threw at it.

I uploaded a picture of my dog, and laughed out loud at:

> The dog's economic status and lifestyle are unclear, but the setting suggests a comfortable environment.

He does indeed live the good life!

This is already amazing, but one possible improvement: use the metadata (time and coordinates) to look up possible landmarks in the area, or possible events/gatherings/conferences/etc. that took place near the location around that time, then add those to the prompt.

I posted some images that showed a well-known local landmark during a christmas fair event, as well as view of a close city.

The model accurately described the architectural details of the landmark that could be inferred from the photo, mentioned that there seems to be some event going on and made some speculations about the city in the background - but purely from the photo it had of course no way of knowing which landmark, event and city it was looking at.

I think this slightly underestimates the amount of information you can extract from the photo: if you have a GIS database, it's not hard to figure this stuff out (or at least get a list of likely candidates) - and the kind of actors this project is warning against very likely have one.

Also I'd be interested to see if the model could combine the context and the details from the photo to make some interesting additional observations.

I gave it a picture of some meatballs I was making, and it didn't capture the interesting parts:

a) it didn't catch that they were made of ground pork and not beef
b) it didn't realize that the inconsistent browning and the look of the fat was from butter browning the breadcrumbs and flour
c) it didn't realize that the surrounding matter in the pan was bits of browned meat that fell off while rolling, instead claiming it was garlic or herbs
d) it didn't spot that one had fallen apart a little bit
e) it didn't get that I took the picture because I thought I rolled them too big
f) it made up a counter, when only the cast iron pan was visible

With a different picture, it couldn't figure out what my makeshift Halloween costume was, despite it having been a pretty obvious Squid Game character.

It seems like it can see what's in the picture mechanically, but it can't see what the picture is of. What's the point of all this AI photo stuff if I can't give it a picture of a cake and have it tell me to turn down my oven a couple of degrees next time?

Reminds me of the article “Language Models Model Us”:

> “On a dataset of human-written essays, we find that gpt-3.5-turbo can accurately infer demographic information about the authors from just the essay text, and suspect it's inferring much more.

> Every time we sit down in front of an LLM like GPT-4, it starts with a blank slate. It knows nothing about who we are, other than what it knows about users in general. But with every word we type, we reveal more about ourselves -- our beliefs, our personality, our education level, even our gender. Just how clearly does the model see us by the end of the conversation, and why should that worry us?“

https://www.lesswrong.com/posts/dLg7CyeTE4pqbbcnp/language-m...

Interesting how it‘s refusing to describe my penis, which I uploaded, all I get is:

> The photo appears to be a self-portrait, taken from an overhead angle. A person's torso is prominently featured in the foreground, the individual's gender is apparent.

A vibrant red Ducati SuperSport motorcycle takes center stage in the foreground, parked against a backdrop of a modern, light beige building. The building's architectural design features tall, slender vertical panels, creating a clean and contemporary aesthetic. In the background, the city's subtle hum hints at the urban environment, a scene of quiet sophistication and stylish urbanity. The sleek lines of the motorcycle contrast beautifully with the building's minimalist design.

The man, appearing to be in his late 20s to early 30s, exudes an air of refined confidence. His attire - a crisp white shirt, a grey waistcoat, and dark trousers - suggests a lifestyle of success and a keen eye for detail. He appears to be of Caucasian descent, his calm demeanor suggesting a moment of quiet contemplation rather than hurried activity. He is meticulously adjusting his helmet, exhibiting meticulous care in preparation. He looks to be in a good mood. The photo appears to have been taken with a professional DSLR camera in the daytime.

The subtle sheen on the motorcycle's paint job hints at a high-quality finish, reflecting the care and attention to detail apparent in both the rider's attire and the choice of machine. The watch on his wrist seems expensive, reflecting his status. The overall composition is balanced and well-lit, likely the result of careful planning and execution. This is not just a man riding a bike, it is a carefully crafted image of a stylish moment in time.

---

"He looks to be in a good mood"?!

I'm never in a good mood!

The painting is dominated by a large, anthropomorphic hot dog in the foreground, its body taking up most of the canvas. The background is a dark, muted purple, providing a stark contrast to the hot dog's reddish-brown skin. The hot dog appears to be in a state of distress, holding something smaller and lighter in its arms. The background is plain, drawing all the attention to the central figure. There are no other discernible objects in either the background or the foreground, except for what appears to be another hot dog of a different color in the central figure's hands.

The hot dog's expression is one of fear and pain; its eyes are wide and its mouth is open in a silent scream. Its skin is smooth and glossy, giving it an almost unnatural appearance. It's difficult to determine the exact details such as its race, ethnicity, age and other demographics from just the image itself. There is no indication of other people present. The activity in the image seems to be the hot dog grappling with another hot dog. There is no information regarding camera details or creation time.

The hot dog's skin exhibits subtle textural variations, suggesting a possible blend of oil and acrylic paints. The smaller hot dog in its grasp shows a slight discoloration around its base, hinting at a possible internal struggle or a change in its state.

These are the obvious things they can see in the photos. Not shown are the various assumptions they'll make about you based on your photos such as: gay, likely uneducated, high income earner, most likely republican, narcissistic, etc.

Also not shown is what they'll learn by the totality of the data they collect from your pictures such as how often you go on vacation, how often you're seen in new clothing and what kinds of clothes you typically wear, your health, what types of foods you eat, social graphs of everyone you're seen with and changes to your relationship status over time, how often you consume drugs/alcohol, your general level of cleanliness and personal hygiene, etc.

Even a handful of photos can give companies like Google, Apple, and Amazon massive amounts of very personal data but nobody thinks about that when they pull out their phones to take pictures or install a ring camera on their front door.

  • Do you do any ML for Big Tech? Because it's actually a lot simpler than that: the input is the sum total of your activity, and the output is the likelihood that you'll click on an ad or buy a product on a specific surface. You certainly can predict demographic information like sexual orientation, education level, income, political party, and with a fair degree of accuracy, but all it does is add noise to the calculation you really want, which is optimizing the amount of money you'll make. To the extent that demographics are computed, it's to make advertisers feel better about themselves. They would almost always be better off with a blanket "optimize my sales" campaign, but it's hard for ad agencies and digital marketers to justify their existence that way.

    • > You certainly can predict demographic information like sexual orientation, education level, income, political party, and with a fair degree of accuracy, but all it does is add noise to the calculation you really want, which is optimizing the amount of money you'll make.

      How are all those data points noise? They're crucial information used for targeting ads to a specific audience. Advertisers pay extra for it, because it leads to more sales. This is not just a gimmick, but a proven tactic that has made the web the most lucrative ad platform over any other. Adtech wouldn't be the behemoth it is without targeting, and the companies that do this well are some of the richest on the planet.

    • I understand that tech companies simply care about whether the user will click on the ad, video or like the next song or show. But can this also be used to change user's preferences or thought process?

    • > and the output is the likelihood that you'll click on an ad or buy a product on a specific surface.

      Surveillance capitalism isn't really about ads. Increasingly that data is being used to impact your life offline. It influences how much companies charge you for their products and services. It determines what version of their policies companies will inform you of and hold you to. It determines very big things like whether or not you get a job offer or a rental agreement, but it's also being used to determine even small things like how long a company keeps you on hold when you call them. It's being used to make people suspects for crimes. It's being used against people in criminal trials and custody battles. It informs decisions on whether or not your health insurer covers your medical treatments. Activists and extremists use it to target and harass people they perceive as being their enemies.

      The data you hand over to companies is being used to build dossiers stuffed with inaccuracies and assumptions that will be used against you in countless ways yet you aren't even allowed to know who has it, what they're using it for, when they use it, or who they share it with.

      Nobody really cares about what ads they get shown when they use the internet so companies like to pretend that that's what their data collection is all about, and they absolutely do use it for marketing, but the truth is that digital marketing is a smokescreen for everything else that your data is being used for and will later be used for.

  • For most people, it is too taxing to be on guard 24/7 and they have other things going on in their life that are more pressing like paying rent. I don't blame people for not thinking twice about that Ring camera because unlike most open-source solutions, "it just works."

  • > Not shown are the various assumptions they'll make about you based on your photos such as: gay, likely uneducated, high income earner, most likely republican, narcissistic, etc.

    I uploaded a photo of myself as a child and based on the house in the background, the brand of shoes on my feet, and the clothes my dad was wearing, it flagged me as “middle class”, so at least one part of your claim is incorrect. I suspect this may be the same model Google uses internally.

I’m very flattered that this tool think I’m in my 30s. Otherwise, not a lot of surprises. Yes smartphones encode GPS data and timestamps into EXIF.

For all 4 of their sample photos and one that I uploaded, their thing failed to notice that there were humans in the pictures. It said the opposite, that there weren't any. I'm disappointed. The one I uploaded is one that I took some years ago, but I've forgotten the time and place, and I'd like to have had it tell me.

  • If you're using a browser with heavy anti-fingerprinting capabilities it will upload a randomized canvas image instead of the intended image, and you'll get a lot of descriptions of pictures of wavy lines and no people.

    • That is weird, I upload photos from the exact same browser to other sites all the time and they look fine. Uploading a photo shouldn't touch the canvas; it's just an HTTP POST.

    • I find that hard to believe. Which browser intentionally tampers with images before they are uploaded, especially in such a destructive manner?

In other words, images don't only potentially contain a lot of metadata (serial numbers, a geolocation, time since last OS reboot etc.), but people or algorithms could also... look at them, and then find out what's depicted?

I'll be sure to keep that in mind going forward!

I uploaded an image of a 6-panel hand-drawn cartoon I created and it very accurately described the scene and overall theme of the joke, even pointing out that it was hand-drawn, used no colors, and that the text in the speech bubbles was very legible. I did not expect that level of detail.

I like how the last paragraph completely oversells my photographing skills. The picture was not meant to be unique. It seems to always end with such a paragraph, even for dumb photos of nothing really.

“The photo's perspective is unique; it is taken from a very low angle, creating an unusual, almost childlike point of view. Another detail is that the photographer seems to have excellent timing as they captured the hand gesture at this precise moment. The lighting in the photo indicates it was taken during daytime, with the sun illuminating the scene beautifully. The contrast between the modern architecture of the building and the traditional costumes adds a rich cultural element to the photograph.”

Are machine learning image classifiers new to people? I don't get what's controversial here. How did people think they were searching their photo apps for "beach" and "dog" and getting automatic albums this whole time? Am I missing the point of this post/website?

I enjoyed a "the photographer is likely male given the technical nature of the subject". (on a picture of computer equipment: https://nt4tn.net/photos/garage1sm.jpg)

It hits a lot of details in images, but it also hallucinates a lot. And, typical of modern LLM hallucinations, they're insidious in how plausible they are.

It's fun seeing what triggers its social-class classification. People in a wooded area: middle class. Add welding to the image: working class.

It seems to have been prompted to seek out interesting, easily overlooked ("subtle!") details, but actually still misses them even if some are present.

I tried this picture [1] of a model nativity scene, which caused it to go on and on about the dryness of the moss and the indications of wear on the (fake) stable while completely overlooking that the scene had no Jesus.

[1] https://imgur.com/a/x7D1GC7

I was once foolish enough to upload a lot of personal photos to Picasa Web Albums (integrated with the desktop Google Picasa software) back in 2007, then years later deleted all of them. To this day I keep wondering whether Google still keeps all those photos in a data lake somewhere.

https://en.wikipedia.org/wiki/Picasa_Web_Albums

This tells me nothing much of interest. It seems to think all my photos were taken with a NORITSU KOKI QSS-30 camera, which, btw, is not a camera of any sort; it's a photo-lab minilab scanner/printer, so the EXIF presumably came from scanned film prints.

The description generated is completely useless fluff.

Nearly as useless as the automatically generated image titles in Word and PowerPoint, which make the title/alt feature less useful: since most modern documents carry those autogenerated titles, which add no value at all, people skip reading the descriptions entirely.

So I just clicked the example pic with a guy and two kids, one on his shoulders, and the description says it "shows a detailed close-up view of a textured surface, possibly a fabric or wallpaper". Then it goes on to say that the "photograph itself seems to be devoid of any human presence, focusing entirely on the abstract design."

I clicked another one with a family on a field. It says mostly the same as before.

EDIT: Oh, wait a minute! I had Resist Fingerprinting activated. So they're probably just reading the image through a <canvas> and getting shit from that.

In any case it's interesting to know that it works as a way to block some of it. But Google & co. just run it on their servers so...

I gave the following advice to someone I was chatting with on Tinder:

1. Remember that when you send pics through iMessage, it sends the EXIF data, which includes the location, the date the photo was taken, and other info.

2. Disable Live Photos, as it often captures things you may not want to capture in the few moments before and after the pic is taken.
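Point 1 is easy to see at the byte level: EXIF rides in a JPEG's APP1 segment, so stripping it amounts to walking the file's marker segments and dropping that one. A minimal Python sketch (the `strip_exif` name and the synthetic test bytes are my own illustration, not from any commenter or tool):

```python
def strip_exif(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string with its EXIF APP1 segment removed."""
    # A JPEG is a marker stream: 0xFF, a marker ID, and (for most markers)
    # a 2-byte big-endian length that counts itself plus the payload.
    # EXIF lives in an APP1 (0xE1) segment whose payload begins "Exif\0\0".
    assert data[:2] == b"\xff\xd8", "not a JPEG (missing SOI marker)"
    out = bytearray(b"\xff\xd8")
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:      # malformed stream; keep the rest untouched
            out += data[i:]
            break
        marker = data[i + 1]
        if marker == 0xDA:       # SOS: entropy-coded image data follows
            out += data[i:]
            break
        length = int.from_bytes(data[i + 2:i + 4], "big")
        segment = data[i:i + 2 + length]
        if not (marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"):
            out += segment       # keep every segment except the EXIF APP1
        i += 2 + length
    return bytes(out)

# A tiny synthetic JPEG: SOI, an APP1 EXIF segment, then a fake SOS + scan data.
exif_app1 = b"\xff\xe1" + (10).to_bytes(2, "big") + b"Exif\x00\x00xx"
jpeg = b"\xff\xd8" + exif_app1 + b"\xff\xda\x00\x02scan"
print(strip_exif(jpeg))  # the APP1 segment is gone, the image data remains
```

In practice you'd reach for `exiftool` or the `magick ... -strip` command from the top comment, which also remove the XMP and IPTC segments this sketch ignores.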

Pretty nice idea, also introduced me to Ente's service which features shared event albums and guest append-only uploads - exactly what I needed a few months ago and even considered building myself.

Heh. I gave it a handwritten historical document from my genealogy research. Sure enough, they got the metadata from the picture, but they weren't able to read a word of it.

I sent a photo of a subway information screen in Hamburg with a clearly visible line and direction; it did not pick up anything except the line number and "it's possibly a subway".

Took me a while to realize it was just describing Firefox's canvas anti-fingerprinting measures. "Looks to be a textile..."

From the title, I was hoping this was going to be an exposé on iCloud Photos, which by default are not end-to-end encrypted and allow Apple to view your entire photo roll.

TIL I am Hispanic or Latino. I am also of Middle Eastern descent. My Latino co-worker is also of Mediterranean or Middle Eastern descent.

Perfect description of a photo I uploaded: age, gender, traditional clothes ... and even my ethnic group. Quite scary.

I find it interesting that it doesn't recognise AI-generated images (on the other hand, maybe that's intentional).

As others also pointed out, I found this site to be useful for other tasks as well; I might use this often!

Probably I am the wrong audience but does this privacy scaremongering style actually work on anyone?

  • I, too, must be in the wrong audience, because I can't fathom consciously requesting an AI, whether local or remote, to examine a photo I took - for any reason. Certainly not just to help me organize a collection.

  • If that's the only hammer they have, that's the one they have to use whether it works or not.

I uploaded a photo of myself and this tool identified my ethnicity as Caucasian, which according to DNA tests is not correct. Also it was not able to recognize the brand of a cap I was wearing even though it should be obvious to a human. But it gave an interesting/useful description of the stones near me.

  • please do not expect the automatic phrenology machine to correctly identify races or ethnicities

Clicked the photo with what appears to be a father and two children.

> Although there are no people present in this image, [...]

Clicked the photo with what seems to be an African-American couple in front of a tree.

> The photograph lacks any human presence.

Clicked the photo with a family sitting among flowers.

> There are no people present in this image.

Clicked the photo with two people silhouetted in front of a window seen from the inside.

> The photograph is a study in texture; there are no people or discernible activity in the image.

But yeah, sure, let's hand all critical decision-making to AI.

If AI writing had a smell, this tool would smell as bad as a monkey chopping onions. They somehow spun 4 paragraphs out of a group vacation photo. Impressive on paper, yet half of the description was painfully obvious:

>The image shows a lively nighttime scene, possibly a parade or street festival. In the foreground, a group of people wearing elaborate, colorful hats and red shirts are prominently featured. The background includes brightly lit storefronts, one of which appears to be a pizza place, suggesting a bustling urban or suburban setting. The overall atmosphere is festive and energetic. There are also some indistinct shapes in the background that might be more people or decorations, but they are not clearly visible.

...

Several details are harder to make out at first glance. The hats themselves are quite elaborate and appear to be custom-made or part of a themed event, hinting at a possible local cultural or community celebration. There's a subtle variation in the lighting across the scene, indicating either the illumination from different sources (streetlights and storefront signs) or the varying distances of people from the camera. The signs in the background suggest a location, potentially in a town with a commercial district.

  • The reason this is interesting is that five years ago, or even two years ago, producing such a "painfully obvious" description from such a photo with a computer was utterly impossible, and ten years ago it was unthinkable. The capability to do this automatically at scale for trivial cost has many nonobvious implications (for example for cloud photo storage and drone warfare), and it invalidates many widely held implicit beliefs. We are only beginning to dimly grasp how this will change the world.

The few I tried were pretty unimpressive. It felt like elementary deduction with a lot of filler words and few facts… and straight up bad information.

I don’t see the problem here: if you remove the metadata from the image, you are left with a very bland ChatGPT description of the image that sounds like a fifth grader trying to hit a minimum word count on an essay. Even if a photo service did this with every single image I have on my phone right now, I don’t care.

This is just another attempt to shoehorn AI into absolutely anything

Here are some example photos we can discuss. First a photo[1] of me trying to look Amish, and the story it gives:

  The image shows a man in a beige polo shirt and a black fedora hat. He is sitting in what appears to be an office, indicated by the presence of boxes and what looks like a printer in the background. There is a landscape photograph on the wall behind him, showing what looks like trees and a field. The foreground is dominated by the man himself, while the background includes office supplies and a wall with a picture.

  The man appears to be middle-aged, with a serious expression. He has a goatee and glasses. His ethnicity and racial background are not readily apparent from the image. He appears to be of a middle-class socioeconomic status based on the office environment. He seems to be at work, possibly taking a selfie. The picture was taken on May 9th, 2007, at 9:14 AM, using a NIKON COOLPIX L12 camera.

  The man's glasses have a slight reflection, and this reflection shows part of his workspace and other objects. It is possible to make out the small print on the label of a box behind him. The lighting is relatively soft and comes from the front, as indicated by how it falls on his face. The focus is sharpest on the man, but the background is reasonably clear.

Here's another[2]... of me doing my best "Some Like It Hot" pose:

  The photo shows a man standing on a city sidewalk. In the foreground, there is a man wearing khaki shorts and a gray t-shirt. The background includes older brick buildings, a street with traffic, and some trees. There's also a lamp post next to the man and a modern glass building in the distance. The overall setting appears to be an urban area, possibly in Chicago, given the architectural style of the buildings.

  The man in the image appears to be middle-aged, with a fair complexion. He seems happy, possibly amused, judging by his smile. He looks like he may be of Caucasian descent. His economic status is difficult to ascertain, but his attire suggests a middle-class lifestyle. The photo was taken on August 7, 2008, at 12:07 PM using a NIKON CORPORATION NIKON D40 camera. He appears to be simply standing on the sidewalk, perhaps taking a break or waiting for something.

  The man's watch shows a bit of wear suggesting regular use. There is a subtle reflection visible on the man's glasses that provides a small glimpse of the surroundings. The image quality indicates it was likely taken outdoors in bright sunlight. The shadows suggest the time of day, adding depth to the scene and providing an additional element of reality to the photo.

Last, a photo of Chicago[3]:

  The image is a nighttime shot of the Chicago skyline from across the lake. In the foreground, there's a dark, paved walkway with a few lights and what looks like a small building or structure near the water's edge. The background is dominated by the brightly lit cityscape of Chicago, with many skyscrapers and buildings of varying heights and architectural styles. The water reflects the city lights, creating a shimmering effect.

  The photo appears to have been taken by a lone photographer, judging by the lack of people in the foreground. The picture was taken on Saturday, November 20th, 2010, at around 10:32 AM using a NIKON CORPORATION NIKON D40 camera. No people are clearly visible, so there is no information about their characteristics or activities. The overall mood of the scene is serene and peaceful, with the city lights providing a sense of quiet vibrancy.

  The reflection of the city lights on the water isn't perfectly uniform, which is a subtle detail to notice, and the slight variations in the brightness of different buildings hint at differences in their energy consumption or lighting design. The darkness of the sky suggests a clear night with minimal light pollution, outside of the city itself. The overall lighting and composition create a breathtaking view of the Chicago skyline at night.

Note that it didn't catch the inconsistency between being a night-time photograph and supposedly being taken at 10:32 AM (likely the edit date).

[1] https://www.flickr.com/photos/---mike---/52196125239/in/date...

[2] https://www.flickr.com/photos/---mike---/52194857882/in/date...

[3] https://www.flickr.com/photos/---mike---/51857839297/in/date...

This is a good example of the F in FUD marketing. I hope to never work for a company that has to scare people into using a worse product.

Imagine how much Meta has on your overall profile from your fb photo uploads.

  • Plus whom you message on WhatsApp, plus what you like on Instagram, plus what ads you see online, what ads you click through on, etc.

  • > your fb photo uploads

    Don't forget other people's uploads. You don't have to use facebook to be on there. At one time not using it probably only served to make you more interesting to the system.

I fed it Trump's bullshit AI photo from today and it's clearly hallucinating bullshit:

“the subtle shadows of the drones indicates they are real not photoshopped…”

lol. Another AI snakeoil page.