Comment by xlii
3 months ago
It seems nice, but every single time I see a service allowing anonymous uploads like this, I immediately think: criminal use.
How hard would it be to write a protocol that uses these relatively safe-looking URLs to encode messages, e.g. by arranging the sequence of emojis so it's actually a serialized URL, credentials to some stash, or an encoded picture no one wants to keep?
> It seems nice, but every single time I see a service allowing anonymous uploads like this, I immediately think: criminal use.
This seems like the Hollywood movie plot criminal use.
Actual criminals just put a normal server/proxy in a non-extradition country or compromise any of the zillion unpatched Wordpress instances on the internet or something equally boring.
Might I say that this whole safetyist moral panic is very convenient for large corporations? If you can't host your own service due to these concerns, you'll use the cloud :)
It's not a moral panic; it's called "an extended engagement with law enforcement will be unpleasant and costly," and you probably don't want that.
And if you're wondering why it's that way, casually observe how often people declare that those under arrest or on trial "don't deserve..." something.
It's even more boring: when I share criminal data (usually old movies that are still in copyright), I just put it in an encrypted 7zip archive, upload it to Google Drive, then delete it after my friend downloads it.
I mean, in this case we're talking about emoji, so I'm having a hard time picturing the criminal use, but in general anonymous file uploads or text uploads absolutely get used by criminals as soon as they're discovered. Anyone who's run a service for long enough will have stories of the fight against spam and CSAM (I do!).
> I mean, in this case we're talking about emoji, so I'm having a hard time picturing the criminal use, but in general anonymous file uploads or text uploads absolutely get used by criminals as soon as they're discovered
You can use the emoji service as an anonymous data upload service because it transfers information and you can encode arbitrary data into other data. But that sounds like work and people are lazy and criminals are people so they'll generally do the lazy thing and use one of the numerous other options available to them which are less work than creating and distributing an emoji encoder.
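To make that concrete: such an encoder really is only a few lines. This is a toy sketch (not anything from the thread), and the 256-emoji "alphabet" is an arbitrary choice for the demo:

```python
# Toy illustration: map each byte onto an alphabet of 256 emoji, turning
# arbitrary data into an innocuous-looking emoji sequence and back.
# Using 256 consecutive codepoints from the Misc Symbols and Pictographs
# block purely for convenience; any 256 distinct emoji would do.
ALPHABET = [chr(0x1F300 + i) for i in range(256)]
INDEX = {e: i for i, e in enumerate(ALPHABET)}

def encode(data: bytes) -> str:
    """One emoji per payload byte."""
    return "".join(ALPHABET[b] for b in data)

def decode(text: str) -> bytes:
    """Invert the mapping to recover the original bytes."""
    return bytes(INDEX[c] for c in text)

secret = b"drop: 52.2297,21.0122"
emoji_msg = encode(secret)
assert decode(emoji_msg) == secret
```

The point being: it's trivially possible, but it's still extra work compared to the boring alternatives.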
If you make a generic file upload service, well, they don't have to do as much work to use that. Then the question is, what should we do about that?
The next question is, does preventing them from using a given service meaningfully prevent any crime? That one we know the answer to. No, it does not. Because they still have all of the other alternatives, like putting it on a server or service in a foreign country or compromising random Wordpress instances etc.
Then we can ask, from the perspective of what the law should be and the perspective of a host under a given set of laws, what should we do? And these are related questions, because you want to consider how people are going to respond to a given set of laws.
So, what happens if you impose strict liability on hosts regardless of whether they know that a given thing is a crime? Well, then you don't have any services hosting data for people, because nobody has a 0% false negative rate, but without one you're going to jail.
What if you only impose liability if they know about it? Then knowing is a liability, because you still can't have a 0% false negative rate, so they're going to avoid knowing, and you end up with Mega encrypting user data so that they themselves can't see it. That seems pretty dumb; you'd like them to be able to remove obviously bad stuff without putting liability on them for not being 100% perfect.
What if you only impose liability if someone else reports it? This works like the DMCA takedown process, and then you get a combination of the first two. They can allow uploads but they can also remove things they're aware of and want to remove, but they end up de facto required to remove anything anyone reports, because if they don't and they ever get it wrong then they're screwed. So then you get widespread takedown abuse and have created a trolling mechanism. This is not a great option.
What if you let them moderate without any liability but require a court order to force them to take something down? This is like the approach taken by the CDA and is the best option, because you're not forcing risk-averse corporate bureaucrats to comply with evidence-free fraudulent takedowns but you still allow them to remove obvious spam etc. without liability. This leaves the service with a good set of incentives, because in general they'll want to satisfy users, so they'll try to remove spam etc. but not remove non-spam. Meanwhile this still leaves the option for crimes to be investigated by the people who are actually supposed to be investigating crimes, i.e. law enforcement, and then the courts can still order things to be taken down -- and more than that, put the actual criminals in jail -- without putting penalties on the service for not themselves being infallible adjudicators of what is and isn't crime.
What do you have in mind? It seems to only allow sending a single character at a time from a limited set. What criminal use does that allow?
The ultimate exploit is to create fake "likes". Once any system of likes becomes successful, it gets used for (a) filtering news feeds, and (b) establishing consensus & social truth. This is the biggest exploit there is.
A cheap system for "likes", such as this, is only safe when few people use it. Once it becomes popular, and worth something, it gets exploited, and then utterly fails.
"A Open Heart message should contain of a single emoji sequence. However, the emoji sequence may be followed by arbitrary data which the server is expected to ignore."
Italics mine.
That arbitrary data could be a multi-gigabyte zip file of some expensive program, classified data, copyrighted video/music, or anything for all this spec cares.
Ok, so? Provided the receiving server is configured to redirect the arbitrary data into the trash, you're in the clear, right? Not your fault someone sent you extra data, and you're not expected to keep it, and if you don't keep it anywhere, law enforcement could search your server but there's nothing to find because your system doesn't retain anything from the arbitrary data.
With ZWJ (Zero Width Joiner) sequences you could in theory encode an unlimited amount of data in a single emoji.
Particularly interesting are the "family" emojis, made by joining any number of person-type emoji with ZWJ characters. So in theory, a family made of thousands of men, women, girls, boys, etc... would be a valid emoji.
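For illustration (standard Python, standard person-emoji codepoints; note that most renderers will just fall back to drawing the individual people rather than one glyph):

```python
# A "family" emoji is just person emoji glued together with the
# Zero Width Joiner (U+200D). Nothing in the syntax caps how many
# members you join, so length/order could in principle carry data.
ZWJ = "\u200D"
MAN, WOMAN, BOY, GIRL = "\U0001F468", "\U0001F469", "\U0001F466", "\U0001F467"

family = ZWJ.join([MAN, WOMAN, GIRL, BOY])  # the standard 4-person family
oversized = ZWJ.join([MAN] * 1000)          # syntactically one ZWJ sequence

print(len(family))  # 7 codepoints: 4 people + 3 joiners
```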
This nerd-sniped me, I wrote a tool for encoding arbitrary strings into one emoji: https://news.ycombinator.com/item?id=42829938
I tried with ZWJ but it turns out variation selectors were easier to make work.
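A sketch of how the variation-selector approach can work (this is my reconstruction, not necessarily the linked tool's exact scheme): Unicode defines 256 variation selectors (VS1–VS16 at U+FE00–FE0F, VS17–VS256 at U+E0100–E01EF), so each payload byte can ride along as one invisible selector appended after a base emoji:

```python
# Encode each byte as one of the 256 Unicode variation selectors and
# append them after a base character; the result renders as a single
# emoji in most UIs while carrying arbitrary hidden bytes.
def byte_to_vs(b: int) -> str:
    return chr(0xFE00 + b) if b < 16 else chr(0xE0100 + b - 16)

def vs_to_byte(c: str) -> int:
    cp = ord(c)
    return cp - 0xFE00 if 0xFE00 <= cp <= 0xFE0F else cp - 0xE0100 + 16

def encode(base: str, data: bytes) -> str:
    return base + "".join(byte_to_vs(b) for b in data)

def decode(text: str) -> bytes:
    # Assumes the base is a single codepoint, as below.
    return bytes(vs_to_byte(c) for c in text[1:])

msg = encode("\U0001F600", b"hello")
assert decode(msg) == b"hello"
```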
Yup, ZWJ was my first thought, and yes, it works.
Tried up to 4.
Too lazy to push it to see how many joins until the api breaks.
https://emojipedia.org/zero-width-joiner
Ottoman harem emoji is valid Unicode now.
>A Open Heart message should contain of a single emoji sequence
This probably refers to emojis made out of multiple codepoints (e.g. skin tone + person, or flags, which are built from two country-code letters in a special "regional indicator" range).
A single emoji is a sequence (of bytes)
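For example (standard Unicode behaviour, any Python 3):

```python
# One visible "emoji" is often several codepoints, and more bytes still.
flag_se = "\U0001F1F8\U0001F1EA"  # regional indicators S + E -> Swedish flag
waving = "\U0001F44B\U0001F3FD"   # waving hand + medium skin tone modifier

print(len(flag_se), len(waving))     # 2 codepoints each
print(len(flag_se.encode("utf-8")))  # 8 bytes on the wire
```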
You are aware that computers also use just zeros and ones to enable everything around us?
> It seems to only allow sending a single character at a time from a limited set. What criminal use does that allow?
In the age of beepers, criminals found plenty of creative ways to send messages in just a few characters. And this permits emojis, which in binary terms contain far more bits than a beeper message.
The problem isn't that criminals can use your service, it's that the service provider really doesn't want to be liable for that happening, which generally only happens when you host illegal content.
You don't need to upload a lot of data in order to have an illegal data stash, and there are creative criminals out there.
E.g. GPS coordinates need only about 16 digits. Emojis are up to 8 bytes each, so by selecting specific ones, adding a control character (or two), and ensuring the others stay in sequence, you can encode that data in.
And then I can simply respond with "Did you read the article in ACME Times about a car riding a bike?", which is a simple pointer to a URL you might check for the drop coordinates.
Thus it's also possible to pass encryption keys, serialized URLs, cryptocurrency wallet pointers, etc. And sure, this seems complicated and dystopian, but when the government asks you to provide data on users who committed serious crimes, it's not fun to be in the position of saying "I don't know who my users are."
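As a hedged sketch of the scheme described above (the digit alphabet and helper names are invented for illustration, and this toy version ignores signs):

```python
# Pick ten innocuous emoji as a digit alphabet; the ~16 digits of a
# coordinate pair then become a plausible-looking reaction stream.
DIGITS = ["😀", "😁", "😂", "😃", "😄", "😅", "😆", "😇", "😈", "😉"]
LOOKUP = {e: str(i) for i, e in enumerate(DIGITS)}

def encode_coords(lat: float, lon: float) -> str:
    # Strip punctuation/signs: this demo keeps only the digits.
    raw = f"{lat:.4f}{lon:.4f}".replace(".", "").replace("-", "")
    return "".join(DIGITS[int(d)] for d in raw)

def decode_digits(text: str) -> str:
    return "".join(LOOKUP[c] for c in text)

msg = encode_coords(52.2297, 21.0122)
print(decode_digits(msg))  # the digit string "522297210122"
```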
From my experience any service that allows anonymous write and anonymous read over long periods will sooner or later be used for illicit activity. It doesn’t matter if that’s 1mb or 10 bytes.
Sure, I guess that could happen. Hacker News allows anonymous data uploads over long periods. How many online services actually do KYC if they don't legally have to?
Any motivated criminal could also just use a book cipher or any number of less trackable options.
The GET request does not return data in sequence, does it? Just counts for each emoji.
What exactly does the government do if you don't have the data they want? I assume if you run a service like this you comply with any data-retention requirements in your country and hand over logs, though the older ones you might already have deleted to comply with other laws!
Unless you have ID verification, criminals can sign up with false identities.
Why not just use pastebin for a "hey I left ur drugs at this coord", or even just a plain ol' encrypted message over email, Signal, etc...? I'm a little lost here, probably due to naivete. Is the storage of URLs or crypto wallet pointers really the bottleneck for cybercrime?
<http://habitatchronicles.com/2007/03/the-untold-history-of-t...>
that's really only a risk when you allow direct retrieval of the uploaded data.
if you're only returning counts and you're not even offering a guarantee that every submission will be counted, then the potential for abuse isn't really any higher than any other website out there.
A byte's value is a count of flipped bits. Those bits aren't even guaranteed to be correct (see cosmic-ray bit flips), and yet our computers work this out.
IMO this is risky because it's easy to distribute the uploads, e.g. I could run an infected, semi-popular website that submits distributed requests on each visit (think of it as 1000 credits daily to use to encode a message). Visitors of this website wouldn't see a thing, and yet the encoded message would be consistent.
As for other websites, especially free image hosts: they often keep a metric ton of data, some won't work if you don't have an identifiable partner cookie on the submission request, and there is post-upload analysis, etc.
like yes, theoretically somebody could probably manage to encode a few bytes of secret message.
but there are just so many easier and better ways to do that, and even if they managed to accomplish it here, it's hardly hurting this project: worst case it's a bit of unwanted noise. it seems very silly to worry about it.