> I should have used "side-effect-free" instead of "idempotent" in my tweets
The HTTP term is "safe method". Although you weren't even wrong because section 4.2.2 of RFC7231 (i.e. HTTP) defines all safe methods, including GET, as idempotent.
I think they use this language because nothing is truly side-effect free. In fact GETs can have side-effects, the most obvious of which is writing the fact of it to a logfile, and that's the most harmless side-effect of all, right until you run out of disk space.
'Side-effect free' means that doing it once, twice or n >= 3 times (with same parameters) yields the same result, i.e. what it returns doesn't depend on any remote state that is altered by the call itself.
However, an idempotent HTTP call is certainly not a pure function which some people seem to be mixing up. Pure functions don't work with I/O.
REST is a bit more specific and explicitly requires GET to be nullipotent, which really means "effect free" - it just reads and doesn't alter the state on the remote system at all.
Side-effects like log files, rate-limiting, etc. will always exist, but they do belong to a different 'layer', so to speak. That is, these should be unobservable side-effects (also think about minuscule effects on the power grid, the fact that a request might write something to an ARP-cache, etc. - they all happen at different layers, so the quantum world state keeps changing, but that's not what this is about). Whether an X-Request-Count header violates the requirements or not depends on interpretation. From the garage door perspective, I wouldn't care...
IMO "side-effect free" is always a statement constrained by the operating level of abstraction. Logging is not an effect at the level of the application, but rather some subset of its context (system, db).
I love that term and am going to use it as much as possible
When I recently added 'click to unsubscribe' functionality to my emails, the URL got written out into some logs. Those logs got written to a Slack channel and Slack loves to click any link it sees. Oh, and it doesn't respect robots.txt.
But all I saw was every member of my list clicking 'unsubscribe'. It took a good hour to figure out exactly what was going on.
Idempotence is not the problem here, by the way. That just means calling the method twice has the same effect. But GET should have no side-effect, in an ideal world. Of course, in the case of unsubscribe links, it needs to have a side-effect to comply with the law.
Actual public Certificate Authorities have done this too.
You request a cert, it's authorized, everything seems fine. Except, huh, the guy who was supposed to authorize is off sick today, how did that work? The email to the authorizer should just be sat in his INBOX until he gets back.
Oh - the company's "Malware protection" system automatically dereferenced the "Do you want to issue this certificate?" link from the email and there was no second step. So for affected companies basically anybody could request any certificate in their domains and it would get issued.
As far as I remember nobody has proof any bad guys ever used this, but grey hats posted some fun they had with it. Likewise for a CA that decided to OCR the images from an unco-operative DNS hierarchy that wouldn't provide machine readable data to them. Grey hats obtained domain names that confused the OCR into allowing them to get certs for other people's names. Did any black hats do it? We have no proof.
Aren't you allowed to have your "click to unsubscribe" button lead to a page with a button that does a POST that actually unsubscribes? I feel like I've seen that approach in use.
How about just having, near the unsubscribe link in the email, a link that says "click here to ignore up to one unsubscribe link press within two minutes of clicking this link", so that the automated process that clicked the unsubscribe link by mistake will also click that link.
The page in question could also have an "actually, I do want to unsubscribe, I just clicked the wrong link by accident" button, in case a human reader is confused.
---
Or, one link that says "unsubscribe immediately" and another that says "unsubscribe only after confirmation", and the first one unsubscribes immediately, unless they also click the second link immediately before or after, while the second link only unsubscribes them if they click a confirm button on the page?
---
Or, maybe the unsubscribe link could have a confirmation button, but would also have some javascript to confirm the unsubscribe after a few moments of the page being fully loaded (the button being used if they have javascript disabled, for example)
> Idempotence is not the problem here, by the way. That just means calling the method twice has the same effect. But GET should have no side-effect, in an ideal world. Of course, in the case of unsubscribe links, it needs to have a side-effect to comply with the law.
Thank you! I felt like I was taking crazy pills with my understanding of idempotence.
Calling GET /door/open is idempotent too, but it's still gross in my opinion. The author makes it sound like that would be fine.
Facebook Messenger has the same problem. For a while my friends and I couldn't figure out why our referral links for a service weren't working; it turns out Messenger was "using them up" for us.
It's perfectly fine to have certain side-effects, provided the GET is idempotent, but not others/most. Specifically, it's fine to idempotently have side-effects where it doesn't matter who is causing the effect, the side-effects are desirable, and the side-effect load won't be overwhelming.
In the case of a link shortener one can pretend that the side effect did not happen the first time the service sees some particular link, that the shortening has already occurred (at the beginning of time!). There is definitely a side-effect, since it involves updating a persistent hash table, and it is idempotent (though one could construct a shortener where it's possible to get more than one shortened form for a given URI when racing to shorten it, but this is not a problem for this particular sort of service).
“Toggle” is by definition not idempotent, because you get a different result each and every time. “Open” and “close” are idempotent, but not safe. The result of a GET request should always be idempotent and safe.
I disagree. There are some ways in which GET can be non-idempotent, such as pageview counters and endpoints with a vast amount of constantly-changing content, for instance. One may argue that the first example may be possible with a GET followed by a POST, but any subsequent GET response (assuming it contains the counter) would still be different to its prior.
It would be sufficient to GET a resource that uses a script to POST the side-effect. Slack's user-agent probably isn't sophisticated enough to mess this up (although heaven help us when they implement their preview with something like headless chrome).
What would the problem be in just showing a button that says "Confirm unsubscribe" that sends a POST request? A lot of sites do something like that for their newsletter unsubscription.
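For what it's worth, here is a rough sketch of that two-step approach. Everything in it is made up for illustration (the `handle_unsubscribe` function, the `/unsubscribe` path, the token format); it only shows the shape of the idea: GET renders a confirmation button and changes nothing, and only POST touches the list.

```python
import html


def handle_unsubscribe(method, token, subscribers):
    """Return (status, body). Hypothetical handler; `subscribers` is a set of tokens."""
    if method in ("GET", "HEAD"):
        # Safe: render a confirmation form and change nothing on the server.
        form = (
            '<form method="post" action="/unsubscribe?token=%s">'
            "<button>Confirm unsubscribe</button></form>"
        ) % html.escape(token, quote=True)
        return 200, form
    if method == "POST":
        # Idempotent: repeating the POST leaves the set in the same state.
        subscribers.discard(token)
        return 200, "You have been unsubscribed."
    return 405, "Method Not Allowed"


subscribers = {"alice-token", "bob-token"}
# A link previewer that only issues GETs leaves the list untouched.
handle_unsubscribe("GET", "alice-token", subscribers)
# The human pressing the button sends the POST that actually unsubscribes.
handle_unsubscribe("POST", "alice-token", subscribers)
```

A bot that follows every link it sees still ends up on the confirmation page, but it would have to submit the form to cause any damage.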
I remember having a similar issue in 2000, before any (meaningful) client-side javascript. Solution: each link that did something also had a request_id parameter, which was a timestamp in milliseconds. Two requests with the same request_id meant the user had clicked something twice, so any action would NOT be performed multiple times if the same request_id came in more than once.
This let users double click on links and have the action performed only once. In 2000, hyperlinks were still confusing to some users who were used to "double-click = open", especially for file icons.
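A minimal sketch of that request_id trick, assuming an in-memory set of seen ids (the `OnceOnly` class and its method names are invented here, not the original code, and a real version would need to expire old ids):

```python
class OnceOnly:
    """Perform an action at most once per request_id (hypothetical helper)."""

    def __init__(self):
        self.seen = set()

    def perform(self, request_id, action):
        # A repeated request_id means the same link was clicked again: skip it.
        if request_id in self.seen:
            return False
        self.seen.add(request_id)
        action()
        return True


clicks = []
guard = OnceOnly()
# A double click sends the same request_id twice; the action runs only once.
guard.perform("1581000000123", lambda: clicks.append("delete item"))
guard.perform("1581000000123", lambda: clicks.append("delete item"))
```

Note that this makes the request idempotent, not safe: the first GET still changes state.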
EDIT: added text in italics because initial wording was confusing.
At 13:45 in this interview with Sergey Brin and Larry Page, you can listen to them discuss the meaning of idempotent on the air with Terry Gross. Pretty amusing thing to hear on NPR. I wonder if the term has ever been mentioned on public radio before or since.
> Idempotence is not the problem here, by the way. That just means calling the method twice has the same effect. But GET should have no side-effect, in an ideal world. Of course, in the case of unsubscribe links, it needs to have a side-effect to comply with the law.
Could you have the page redirect to itself with POST? Like a javascript redirect or meta tag. Browsers would do this, and there wouldn't be any difference for users, but bots, Slack and Safari probably wouldn't.
The problem is not that your endpoint had a side effect. Rather, it didn't have any unintended side effect (it is supposed to unsub, and it does exactly that).
The problem is with lack of authentication. Slack's ability to unsubscribe people on their behalf, without their explicit permission seems to be the real issue here.
Even a "I'm not a bot" check would provide some protection.
I thought that initially, but then I considered that for users to be bulk unsubscribed the link would surely have to be the same for every user, at which point each user gets the same unsubscribe link, and then when they click on it it unsubscribes everyone.
This doesn't seem very likely, so I guess that a whole load of unique unsubscribe links got dumped into slack which started following them.
Many years ago, I was asked to look at why all the content had vanished from a site (not built by me). After digging in a bit, I found that:
1) the original developer's idea of handling an unauthorized /admin request was just to set a redirect header and continue processing the current request.
2) the /admin page had a grid of all the content on the site, with handy 'Delete' links that ran over GET without confirmation.
You can probably guess where this is going – some search bot hit the overview page, ignored the redirect header, saw the content, and dutifully crawled every single link on it…
I think the state of the web has improved slightly over the last decade but this is a great example of why browser vendors are so conservative. You can do this now but only opt-in.
Was it blekko? We had a website owner email us about that issue when blekko's ScoutJet crawler was new... although I don't recall the bit about ignored redirect headers.
I'm pretty sure everyone with a crawler has hit this sort of problem before. The first startup I was at did with someone's wiki that had "delete" links everywhere with no auth.
Idempotency might be necessary for GET calls, but it's not sufficient. Imagine he had two separate GET calls (opened/closed): the author would still have the same problem. Browsers assume GET to be safe (non-mutating), and safety implies idempotency.
At the risk of being overly pedantic/piling on -- this is what bad REST-ful API design looks like in practice.
When you talk to your teammates about the semantics of these verbs and someone just says "oh a GET is fine" and the team agrees but you don't and you can't say it so you don't become "that guy" it's time to find a new engineering org to be a part of.
On the topic of PATCH, check out JSON merge patches (application/merge-patch+json):
Absolutely agree. A PUT method carrying an open/closed flag would seem like a natural choice. Calling it any number of consecutive times with the same payload would be idempotent. There would probably be a GET method to go along with it. And of course, it would model the desired state, not the actual position of the garage door since garage doors don't instantaneously flip (would be cool though).
A toggle would actually be a good use of POST, though PUTting the desired state would be better (PATCH works instead of PUT if you are changing some part of the state and not the whole state, but is unnecessary if the door state consists entirely of either “open” or ”closed”.)
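To make the PUT-versus-toggle difference concrete, here is a small sketch. The `GarageDoor` class is hypothetical and only models the desired state, not the physical door:

```python
class GarageDoor:
    """Models the desired state of a door (illustrative only)."""

    def __init__(self, desired="closed"):
        self.desired = desired

    def put(self, desired):
        # PUT semantics: replace the state with the payload. Idempotent.
        if desired not in ("open", "closed"):
            raise ValueError("state must be 'open' or 'closed'")
        self.desired = desired
        return self.desired

    def toggle(self):
        # POST-style action: the result depends on how many times it is called.
        self.desired = "open" if self.desired == "closed" else "closed"
        return self.desired


door = GarageDoor()
door.put("open")
door.put("open")   # a retried PUT: still "open"
door.toggle()
door.toggle()      # a retried toggle: back to "open", the first one is undone
```

A retried or duplicated PUT is harmless, while a duplicated toggle undoes the first one, which is why the toggle belongs on POST.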
One more thing: PATCH needs to do atomic updates on a partial resource per http://restcookbook.com/ (which I think is a great TLDR resource on the topic).
HTTP GET is nice because you can "debug" via browser. But I don't think it's a good protocol choice for opening/closing doors nor any other service not related to document requests.
This stuff is also why you should be afraid of any libraries/ frameworks/ tooling that says it's going to automatically offer TLS 1.3's "Zero round trip" (0RTT) feature for code as opposed to trivial stuff like resource downloads.
Normally, TLS ensures you can't replay somebody else's conversations. So even if I know Barry, who is authorised to toggle the door, just sent a "toggle the door" command, if I try playing it back that won't work, the setup will be different each connection and I can't respond.
But for 0RTT there is no setup - there can't be, no time to do it, and so if I replay Barry's "toggle the door" it would work.
The specification is very clear that the right thing here will be to never allow 0RTT for such features. But the moment that's hidden behind some library API you can bet _somebody_ is going to screw up badly. Alas our industry doesn't exactly have a "safety first" mentality.
It's updating the thumbnail screenshots. This only happens if you have the Safari "blank page" be your favorites instead of either your "home page" or a truly blank page.
Literally the reason HTTP verbs are a thing is so that User Agents like Safari can do exactly this. If this weren't a by-design property of the HTTP protocol, we wouldn't even have methods. Read the spec!
Now I wonder if Safari (& other browsers) has distinct headers for their favorites lookups, to tell these lookups apart from real users and discard these accesses from site analytics..
Hey! I've been thinking about this all day. I thought it was a GET to fetch the title of the page, but it might only be an OPTIONS or HEAD request? I'm not sure. Either way, my code activates the garage door on that endpoint no matter the HTTP verb.
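A sketch of what checking the verb could look like, written in Python rather than the actual firmware code. The `route` function and the dict-based door are assumptions for illustration; the point is that GET, HEAD and OPTIONS never activate anything:

```python
def route(method, door):
    """Return (status, headers, body) for a request to a hypothetical /door."""
    allow = {"Allow": "GET, HEAD, OPTIONS, POST"}
    if method == "GET":
        return 200, {}, door["state"]   # read-only: report the state
    if method == "HEAD":
        return 200, {}, ""              # like GET, but without a body
    if method == "OPTIONS":
        return 204, allow, ""           # advertise the supported methods
    if method == "POST":
        # The only verb that is allowed to activate the door.
        door["state"] = "open" if door["state"] == "closed" else "closed"
        return 200, {}, door["state"]
    return 405, allow, ""               # any other verb is rejected


door = {"state": "closed"}
for probe in ("GET", "HEAD", "OPTIONS"):
    route(probe, door)                  # prefetches and previews change nothing
route("POST", door)                     # an explicit action opens the door
```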
The intersection of full-stack web devs from the commercial line-of-business world; and hardware/embedded hackers brings a lot of room for accidents IMO. I'm not saying any one of these groups are bad or inept. I'm in the former and completely accept that I'm new to embedded programming. It seems kool and I wanna learn about it. But I can also see the flip side where a hardware hacker sees query strings for toggling an output as a perfectly reasonable interface. Do we expect the embedded guys to grok HTTP/REST? The web-dev would be like "no, no, that has to be POST or PUT". But these things are going to happen. We don't yet have a large pool of experts across both fields.
It's no surprise the level of compromise and breach when you intersect what were pretty distinct skillsets and dump them in the mixing bowl together. That's what this IoT thing is like - it's a bunch of household and industrial chemicals all poured into the one container. It's not going to be very safe.
> Do we expect the embedded guys to grok HTTP/REST?
REST is irrelevant to this; HTTP alone covers the reason why this is bad. So ultimately, the question is: "Do we expect somebody designing an HTTP API to understand HTTP?" I think that's a reasonable expectation. If your embedded guys don't understand HTTP, then get somebody who does understand HTTP to design the API. They don't need embedded experience to do so, they aren't implementing it, just designing it. This isn't a difficult cross-functional intersection, you just don't assign tasks to people who aren't qualified to carry them out.
This is pretty much the classic newbie web developer mistake, heard many stories about people making it when they first start. I've also seen people fuck up in the opposite way, using POST when they should use GET and having unexpected behavior. Though not usually as "funny" as the classic "using GET instead of POST" errors are.
This concept of HTTP request methods really should be explained to new developers in a more accessible way, with examples of mistakes. It might not be intuitive at first or they might not think it's important as it is.
"Idempotence" isn't really the problem here, nor "should" GET requests be idempotent, think kittenwar.com or stumbleupon, the problem here is GET is reserved for retrieving (getting!) data, it shouldn't modify data. (Other than access information.)
> Methods can also have the property of "idempotence" in that (aside from
> error or expiration issues) the side-effects of N > 0 identical
> requests is the same as for a single request. The methods GET, HEAD,
> PUT and DELETE share this property.
Looks like the RFC talks about idempotence from a "side effect" perspective where I was talking about it from an "output" perspective (the generated HTML).
I agree with the RFC and I mistook what the person meant
Well, a while ago I saw this code (on my own project!):
window.open("?controller=users&action=changePassword&name=" + user_name + "&password=" + password)
I was horrified, glad it isn't live yet, and I fixed it immediately. But I'm still wondering whether I was so sleep-deprived or drunk when I wrote this. It's over SSL, so it should not be that big deal, but still, GET shouldn't be used for such things.
Well you don’t seem to validate the existing password prior to authorizing the change.
Good CSRF protection on GET requests is also near impossible to implement as GET is intended to be a “safe” request as in a request that does not modify a state but this isn’t something that is actually practiced.
Actually, I do. This is not a form for a user to change his own password, but rather an administrator's form to change another user's password. And for such actions the administrator's identity and privileges are checked. But I understand your reasoning and thank you for pointing it out.
And yeah, I try to use GET only for safe requests, but I should be more careful.
This comment thread has really put me in a good mood. These stories have so much pedagogical value:
• The grammars we engineer have important semantic value
• understanding and adhering to them is important, and hard
• relying on others to adhere to them is dangerous, and hard to avoid
• "experts" make mistakes in both areas constantly
I genuinely love seeing this kind of lively discussion, because these seemingly "trivial details" matter, a lot. The Three Mile Island accident was more or less caused by "message sent" being conflated with "state changed" at the UI level, directly leading to a nuclear meltdown. They basically had a system with the equivalent design of GET /open and /close that assumed success for both
https://en.wikipedia.org/wiki/Three_Mile_Island_accident#Con...
I hooked up an ESP32, 2 channel relay (up/down control), and distance sensor (to detect height). Pushes height to graphite and position is settable remotely. :)
I get that it's cool, but I'm missing the why? Would you ever need to change your desk height when you're not already at your desk? Wouldn't manually changing the height be easier than hitting an HTTP endpoint on your PC to adjust it? Maybe I'm missing something, and like I said, I'll give you that it's cool and that alone is sometimes reason enough.
Primary effect - An effect on the input arguments, whose effect is captured in the output.
Side effect - An effect on state that was not passed in as input arguments, whose effect may or may not be captured in the output.
Side effect free - Also known as pure, means the function only has a primary effect. Thus it only affects the input in a way that the output captures.
Idempotent - Applying the function again to its own result gives the same effects as applying it once. Applies to both primary and side effects.
Where things get weird is that there's also the following:
- An effect on implicit input state, which did not come from input arguments, whose effect is captured in the output. This would be like an HTTP GET. Or any query on a DB where the DB is an implicit input.
- An effect that is captured in implicit output state, either by having its effect captured on an input (like a modification to a pointed-to object), or captured on output not returned by the function (like print to screen). This would be like an HTTP POST.
And now if you look at all these, there's an easy permutation of them. So you can build a table like so:
Input | Output | Idempotent
Arguments | Return Value | Yes
Arguments | Return Value | No
Arguments | Outside State | Yes
Arguments | Outside State | No
Arguments | Arguments | Yes
Arguments | Arguments | No
Outside State | Return Value | Yes
Outside State | Return Value | No
Outside State | Arguments | Yes
Outside State | Arguments | No
Outside State | Outside State | Yes
Outside State | Outside State | No
All these combinations are possible. That's why it can be really tricky.
Yeah, did wonder about that. My thinking at the time was that the device was on the local WiFi network, not exposed to the internet, and there would be easier ways of getting into the garage if you really wanted to.
I recently soldered a wire to my garage door opener on the wall and ran it to a relay and then to the pins on a raspberry pi. Knowing the state of the door is key because the opener is just a toggle. I also have my alarm system hooked up to the pi, so it checks the state before and after any request. Repeatedly asking it to open will open it, or return success if it already is. Same with close.
It took a bit of testing before I trusted it would all work the way I thought it would, but now I use it and don't even think about it, it just works and is handy to have.
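Roughly, the check-the-state-first logic could look like this. It is a simulation, not the actual Pi code: `ToggleOnlyOpener` stands in for the relay plus the door sensor, and `ensure` is a made-up name:

```python
class ToggleOnlyOpener:
    """Stand-in for the wall button: every pulse flips the door."""

    def __init__(self):
        self.is_open = False   # what the door sensor would report
        self.pulses = 0        # how many times the relay has fired

    def pulse(self):
        self.is_open = not self.is_open
        self.pulses += 1


def ensure(opener, want_open):
    """Idempotent open/close built on top of a toggle-only relay."""
    if opener.is_open == want_open:
        return "already in requested state"   # no pulse, nothing moves
    opener.pulse()
    return "moved"


opener = ToggleOnlyOpener()
ensure(opener, True)    # the relay fires once, the door opens
ensure(opener, True)    # repeated request: success, but the relay does not fire
```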
I know this is quite unrelated, and based on hazy memory of things from almost 10 years ago, DML statements in databases especially 'insert into' statements are not idempotent as I remember; ie if you try to select a few rows from a table table1 and insert them into table2 with same schema, if there were any identical rows already in table2, then whole insert will fail. My thinking at that time was that if these insert operations were idempotent, then there would be no need to explicitly check for duplicates before the insert.
At least in MySQL you can use “ON DUPLICATE KEY” to either ignore such things or optionally execute an update statement to change something about the matching row.
There is also REPLACE INTO
Assuming you have a relevant unique key setup of course.
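Here is a runnable sketch of the same idea. Note the swap: it uses SQLite's `INSERT OR IGNORE` through Python's `sqlite3` module instead of MySQL's `ON DUPLICATE KEY`, purely so it runs without a server. The table name comes from the comment above; the columns and the `copy_rows` helper are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key is the "relevant unique key" that makes duplicates detectable.
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT)")


def copy_rows(conn, rows):
    # Rows whose key already exists are skipped instead of failing the insert.
    conn.executemany(
        "INSERT OR IGNORE INTO table2 (id, name) VALUES (?, ?)", rows
    )
    conn.commit()


rows = [(1, "first"), (2, "second")]
copy_rows(conn, rows)
copy_rows(conn, rows)  # running the same copy again adds nothing and raises nothing
count = conn.execute("SELECT COUNT(*) FROM table2").fetchone()[0]
```

With a plain `INSERT` the second call would raise an integrity error on the duplicate keys, which is the non-idempotent behaviour described above.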
> DML statements in databases especially 'insert into' statements are not idempotent as I remember;
GET is actually supposed to be safe which is stronger than idempotent; the SQL command that most naturally corresponds to GET—SELECT—is normally safe, but DML inherently is not.
But, sure, that INSERT isn't safe increases the amount of code needed to implement idempotent PUTs.
REST is HTTP. "loose REST conventions" is when someone chose to ignore big chunks of the HTTP spec.
In other words, you can build whatever you want (like SOAP) on top of HTTP and ignore the spec that describes content negotiation, HTTP methods, Caching policies, etc. It's still technically HTTP. But if you were to read the HTTP spec and follow it to a tee, you'd build a REST application.
This is untrue. I hate to quote from Wikipedia, but it sums it up quite nicely: "REST is not a standard in itself, but RESTful implementations make use of standards, such as HTTP, URI, JSON, and XML"
More colloquially: REST is what happens when people mix up transport layers in their head.
That's actually backwards; HTTP is the motivating example of REST (that is, REST was developed from observed properties which HTTP/1.0 loosely exhibited and was consciously applied in design of HTTP/1.1.)
A reboot of the machine isn't the end of the world and far less risky than a garage door randomly opening. It probably needed it anyway as they tend to degrade over time. This is a very specific usecase. If it becomes an issue, I can always push out an update of the software that switches things to POST (thanks to using the golang library, overseer).
I really (seriously) don’t get why you didn’t use POST in the first place? If it’s all for the “easy hit of the url in a browser” there are addons for that? Care to explain? One can’t be that lazy :)
This garage door opening story reminds me of something that happened in the 50's. The first powerful comm satellites happened to use a frequency that garage doors used (no id codes then). Garage doors opened and closed by themselves. My grandmother had it happen to her.
Forgive my density, but why would you make a web page trigger a toggle when it is loaded? Shouldn't there be a button or some other user interaction to initiate the state change?
Is it like loading a web page for the weather- so that when the page is loaded it goes and 'GETS' the latest weather info? Is it for convenience? So that all you have to do is go to a webpage and have it do stuff?
The real problem here, as usual, is web browsers, the worst class of software ever written. We constantly come across these CSRF-style bugs that are only made possible by how stupid the browser and HTTP are, but instead of blaming the culprit and trying to deal with the source, we blame ourselves for not being accommodating enough. Fool me once, shame on me, fool me 5,000 times, shame on me. Oh, and occasionally invent hackneyed fixes like CORS.
Why do people like you always want to suggest that web developers shouldn't have to learn the basics of the tools they use every day? This is page one HTTP stuff.
This is the anti-intellectualism in our field. Where people are so used to finding a YouTube video tutorial for the exact thing they want to do that anything that's inherently hard (like client development) or requires some extra knowledge to do correctly is somehow shitty and needs to be reworked. More and more often it's somehow everything's fault but the craftsman's.
It's that mentality that's coasting parts of our field into code monkey cost center positions. Go somewhere like /r/webdev and watch how unresourceful the beginners are and how bad the advice is.
I'm not seeing how this is an issue with the browser as much as the server handling the request. Safari provides a feature that shows thumbnails of frequently visited sites; these thumbnails are loaded with a header specifying it's for a preview. It's on the server to understand that a GET request _by definition_ should not have side effects (like toggling the open/closed state of a door), and optionally to perform special handling when seeing the preview header.
> In computing, an idempotent operation is one that has no additional effect if it is called more than once with the same input parameters. For example, removing an item from a set can be considered an idempotent operation on the set.
This, kids, is why GET requests should be idempotent.
Or, like, you know, how about a browser only sends a request when I'm actually fucking asking for something, and doesn't try to fetch everything I've ever thought about, with my every slightest accidental finger twitch against its touch screen?
Well, ideally you'd have both. When I "GET" I assume that it's GET as opposed to any of the other standard HTTP methods, not "GET" as in Indiana Jones getting the golden idol from the pedestal.
This is not only a problem with browsers preemptively requesting URLs, but also when it comes to caching. What happens when the URL I'm GETing is cached? Absolutely nothing, as far as the original server is concerned.
The current situation with browsers doing smart things to make slow websites appear fast is a bit like compiler writers doing smart things with UB in C, though. Speed a lot of things up by utilizing every undefined nook and cranny of the spec, breaking tons of legacy software that make pretty sound assumptions about how things actually work. I use a bunch of poorly designed legacy systems where GET often has intentional side effects. They break because the browser starts issuing HTTP requests long before I have finished typing an address. Let slow sites be slow and leave the speed problem to the people that should be dealing with it instead.
First of all “/toggle” is not a proper REST endpoint. In fact, that is exactly what an RPC endpoint might look like. I can’t believe so many “developers” miss this.
With JSON RPC, he would POST a JSON object to a /toggle endpoint which runs a procedure to open or close his garage door.
If you want to be cute and conform to REST as much as possible, you would have to treat your garage door like a resource, and then use PUT to send the entire new state of the garage door to your API or PATCH to send the instructions for how to change the existing state (or use the +json media type for patch if you want to be lazy and just send a JSON object with updated key values). Your URL would probably be in the form of “/garage-doors/garage-door-id”.
Except, his garage door is probably NOT a resource and has no ID or even a serialized representation of its physical state. It’s just a physical door. And it should be driven by remote procedures.
Maybe you don’t need any of this. Maybe you are happy with just using GET and pushing everything into REST and basically doing the wrong things. But that’s how you end up with the problems in the article, by not respecting standards, by choosing to be close enough to correct instead of technically correct. And frankly, if there are any people who should strive for technical precision it should be engineers.
I personally would not hire a single software engineer who chose to approach this problem in the RESTful way without clear and deliberate reasoning for doing so.
To me this sounds like a CSRF problem. There's no token or session associated with these calls, so a browser was able to inadvertently CSRF the calls. Changing this call to POST or PUT would still leave this API vulnerable.
It's not about access control, it's about the fact that browsers are free to make speculative GET requests whenever they like, and they actively do to pre-fetch pages. His GET end-point was pre-fetched by his browser, activating the door. This would still happen even if there was a token or session associated.
> You know how HTTP GET requests are meant to be idempotent?
No, they aren't. They aren't meant to be anything specific. They can be idempotent, but generally these philosophical arguments are the net gain of this kind of condescension.
Hello! Long-time lurker, and guilty dev behind the garage door. You can see the (broken) code I wrote here:
https://github.com/wpearse/wemos-d1-garage-door-wifi
I'll get around to fixing it later this week.
Also, an apology: I should have used "side-effect-free" instead of "idempotent" in my tweets.
Being a language arse I think the high precision descriptor is actually nullipotent. https://en.wiktionary.org/wiki/nullipotent but I'd never say it out loud.
'Side-effect free' means that doing it once, twice or n >= 3 times (with same parameters) yields the same result, i.e. what it returns doesn't depend on any remote state that is altered by the call itself.
However, an idempotent HTTP call is certainly not a pure function which some people seem to be mixing up. Pure functions don't work with I/O.
REST is bit more specific and explicitly requires GET to be nullipotent which really means "effect free" - it just reads and doesn't alter the state on the remote system at all.
Side-effects like log files, rate-limiting, etc. will always exist, but they do belong to a different 'layer', so to speak. That is, these should be unobservable side-effects (also think about minuscule effects on the power grid, the fact that a request might write something to an ARP-cache, etc. - they all happen at different layers, so the quantum world state keeps changing, but that's not what this is about). Whether an X-Request-Count header violates the requirements or not depends on interpretation. From the garage door perspective, I wouldn't care...
IMO "side-effect free" is always a statement constrained by the operating level of abstraction. Logging is not an effect at the level of the application, but rather some subset of its context (system, db).
I love that term and am going to use it as much as possible
Fixed now: https://github.com/wpearse/wemos-d1-garage-door-wifi/commit/...
I think.
edit: updated link because left creds in the commit :-/
I hope that's not your real password!
Please, for the love of all that is holy, don't post long-form content on Twitter.
> Please, for the love of all that is holy, don't post on Twitter.
FTFY
I'm sorry. This is the first time I've used a thread. I won't do it again!
Anything side-effect-free is idempotent too, I suppose
No. A function which multiplies a number by two is side effect free, but is not idempotent.
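To make that distinction concrete, here is a minimal sketch in Python. Both functions are pure (no side effects), but only one satisfies f(f(x)) == f(x). The function names are made up for this example, and the check only samples a single input rather than proving anything in general.

```python
def double(x: int) -> int:
    """Pure (no side effects), but not idempotent: double(double(3)) != double(3)."""
    return x * 2


def clamp_to_zero(x: int) -> int:
    """Pure and idempotent: applying it a second time changes nothing."""
    return max(x, 0)


def is_idempotent_at(f, x) -> bool:
    """Check f(f(x)) == f(x) for one sample input (not a proof for all inputs)."""
    return f(f(x)) == f(x)


print(is_idempotent_at(double, 3))          # False
print(is_idempotent_at(clamp_to_zero, -5))  # True
```

Note this is the mathematical sense of the word. The HTTP sense discussed elsewhere in the thread is about repeating the same request and ending up with the same server state.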
No side effects means no garage door opening or closing. I guess you wanted it to open or close in some cases at least?
Isn't the point of this thread that the OT says that he shouldn't have used GET for this?
When I recently added 'click to unsubscribe' functionality to my emails, the URL got written out into some logs. Those logs got written to a Slack channel and Slack loves to click any link it sees. Oh, and it doesn't respect robots.txt.
But all I saw was every member of my list clicking 'unsubscribe'. It took a good hour to figure out exactly what was going on.
Idempotence is not the problem here, by the way. That just means calling the method twice has the same effect. But GET should have no side-effect, in an ideal world. Of course, in the case of unsubscribe links, it needs to have a side-effect to comply with the law.
Actual public Certificate Authorities have done this too.
You request a cert, it's authorized, everything seems fine. Except, huh, the guy who was supposed to authorize is off sick today, how did that work? The email to the authorizer should just be sat in his INBOX until he gets back.
Oh - the company's "Malware protection" system automatically dereferenced the "Do you want to issue this certificate?" link from the email and there was no second step. So for affected companies basically anybody could request any certificate in their domains and it would get issued.
As far as I remember nobody has proof any bad guys ever used this, but grey hats posted some fun they had with it. Likewise for a CA that decided to OCR the images from an unco-operative DNS hierarchy that wouldn't provide machine readable data to them. Grey hats obtained domain names that confused the OCR into allowing them to get certs for other people's names. Did any black hats do it? We have no proof.
Any chance you could provide articles on these grey hat activities? Sounds interesting
Aren't you allowed to have your "click to unsubscribe" button lead to a page with a button that does a POST that actually unsubscribes? I feel like I've seen that approach in use.
What they do is load a page with a form redirect to do the POST, I believe, so link loaders won't follow it and you'll be safe.
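A rough sketch of the "GET shows a page, POST does the work" approach, written as a plain Python function so it can be read without any web framework. Everything here is an assumption for illustration: the `/unsubscribe` path, the `handle_unsubscribe` name, and the in-memory `subscribers` set. This is the button variant; the auto-submitting form mentioned above would serve the same page with a script that submits it.

```python
from html import escape

# Hypothetical in-memory mailing list; a real system would use a database
# and an unguessable per-subscriber token instead of a raw address.
subscribers = {"alice@example.com", "bob@example.com"}

CONFIRM_FORM = (
    '<form method="post" action="/unsubscribe">'
    '<input type="hidden" name="email" value="{email}">'
    "<button>Confirm unsubscribe</button>"
    "</form>"
)


def handle_unsubscribe(method: str, email: str) -> tuple[int, str]:
    """Return (status, body). Only POST is allowed to change state."""
    if method in ("GET", "HEAD"):
        # Safe: a link preview bot or crawler hitting this changes nothing.
        return 200, CONFIRM_FORM.format(email=escape(email, quote=True))
    if method == "POST":
        # Idempotent: unsubscribing twice leaves the same end state.
        subscribers.discard(email)
        return 200, "You have been unsubscribed."
    return 405, "Method not allowed"
```

With this shape, Slack (or a malware scanner) following the emailed link only ever fetches the confirmation form.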
How about just having, near the unsubscribe link in the email, a link that says "click here to ignore up to one unsubscribe link press within two minutes of clicking this link", so that the automated process that clicked the unsubscribe link by mistake will also click that link.
The page in question could also have a "actually, I do want to unsubscribe, I just clicked the wrong link by accident" button as well, in case a human reader is confused.
---
Or, one link that says "unsubscribe immediately" and another that says "unsubscribe only after confirmation", and the first one unsubscribes immediately, unless they also click the second link immediately before or after, while the second link only unsubscribes them if they click a confirm button on the page?
---
Or, maybe the unsubscribe link could have a confirmation button, but would also have some javascript to confirm the unsubscribe after a few moments of the page being fully loaded (the button being used if they have javascript disabled, for example)
Don't make your users jump through hoops to unsubscribe. That seems like a typical dark pattern to me.
> Idempotence is not the problem here, by the way. That just means calling the method twice has the same effect. But GET should have no side-effect, in an ideal world. Of course, in the case of unsubscribe links, it needs to have a side-effect to comply with the law.
Thank you! I felt like I was taking crazy pills with my understanding of idempotence.
Calling GET /door/open is idempotent too, but it's still gross in my opinion. The author makes it sound like that would be fine.
Kinda seems like he really just wanted to use the word.
I think he only had GET /door/toggle, which cannot be idempotent.
Facebook Messenger has the same problem, for a while my friends and I couldn't figure out why our referral links for a service weren't working, turns out Messenger was "using them up" for us.
Consider link shortener services...
It's perfectly fine to have certain side-effects, provided the GET is idempotent, but not others/most. Specifically, it's fine to idempotently have side-effects where it doesn't matter who is causing the effect, the side-effects are desirable, and the side-effect load won't be overwhelming.
In the case of a link shortener one can pretend that the side effect did not happen the first time the service sees some particular link, that the shortening has already occurred (at the beginning of time!). There is definitely a side-effect, since it involves updating a persistent hash table, and it is idempotent (though one could construct a shortener where it's possible to get more than one shortened form for a given URI when racing to shorten it, but this is not a problem for this particular sort of service).
“Toggle” is by definition not idempotent, because you get a different result each and every time. “Open” and “close” are idempotent, but not safe. The result of a GET request should always be idempotent and safe.
GET requests should have no side effects. In other words NOOP is idempotent
> But GET should have no side-effect
I disagree. There are some ways in which GET can be non-idempotent, such as pageview counters and endpoints with a vast amount of constantly-changing content, for instance. One may argue that the first example may be possible with a GET followed by a POST, but any subsequent GET response (assuming it contains the counter) would still be different to its prior.
It would be sufficient to GET a resource that uses a script to POST the side-effect. Slack's user-agent probably isn't sophisticated enough to mess this up (although heaven help us when they implement their preview with something like headless chrome).
What would the problem be in just showing a button that says "Confirm unsubscribe" that sends a POST request? A lot of sites do something like that for their newsletter unsubscription.
I remember having a similar issue in 2000, before any (meaningful) client-side javascript. Solution: each link that did something also had a request_id parameter, which was a timestamp in milliseconds. Two requests with the same request_id meant the user had clicked something twice, so any action would NOT be performed multiple times if the same request_id came in more than once.
This let users double click on links and have the action performed only once. In 2000, hyperlinks were still confusing to some users who were used to "double-click = open", especially for file icons.
EDIT: added text in italics because initial wording was confusing.
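A minimal sketch of that request_id idea, assuming the IDs are kept in memory. The class and method names are invented for this example, and it ignores things a real version would need, such as expiring old IDs and thread safety.

```python
class OncePerRequest:
    """Runs an action at most once per request_id (hypothetical helper)."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def run(self, request_id: str, action) -> bool:
        """Run action unless request_id was already handled.

        Returns True if the action ran, False if it was a duplicate.
        """
        if request_id in self._seen:
            return False
        self._seen.add(request_id)
        action()
        return True


orders = []
guard = OncePerRequest()

# A double click sends the same link (and so the same request_id) twice.
guard.run("957139200123", lambda: orders.append("widget"))
guard.run("957139200123", lambda: orders.append("widget"))
print(orders)  # ['widget']
```

This makes a repeated request idempotent. It does not make the link safe: the first fetch by a crawler or preview bot would still perform the action.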
A not-small percentage of users double click links (and buttons, and anything else that needs clicking).
At 13:45 in this interview with Sergey Brin and Larry Page, you can listen to them discuss the meaning of idempotent on the air with Terry Gross. Pretty amusing thing to hear on NPR. I wonder if the term has ever been mentioned on public radio before or since.
http://www.npr.org/2003/10/14/167643282/google-founders-larr...
> Idempotence is not the problem here, by the way. That just means calling the method twice has the same effect. But GET should have no side-effect, in an ideal world. Of course, in the case of unsubscribe links, it needs to have a side-effect to comply with the law.
Could you have the page redirect to itself with POST? Like javascript redirect or metatag. Browsers would do this, and there wouldn't be any difference for users, but bots, slack and Safari probably wouldn't.
The problem is not that your endpoint had side effect. Rather, it didn't have any side effect (it is supposed to unsub, and it does exactly that).
The problem is with lack of authentication. Slack's ability to unsubscribe people on their behalf, without their explicit permission seems to be the real issue here.
Even a "I'm not a bot" check would provide some protection.
If the unsubscribe link were unique for each subscriber would the law still be satisfied?
I don't understand how that prevents an automated system from unsubscribing them?
I thought that initially, but then I considered that for users to be bulk unsubscribed then the link would surely have to be the same for every user, at which point each user gets the same unsubscribe link, and then when they click on it it unsubscribes everyone.
This doesn't seem very likely, so I guess that a whole load of unique unsubscribe links got dumped into slack which started following them.
It should be a DELETE imho or a PATCH.
same thing with games running on facebook's platform
Many years ago, I was asked to look at why all the content had vanished from a site (not built by me). After digging in a bit, I found that:
1) the original developer's idea of handling an unauthorized /admin request was just to set a redirect header and continue processing the current request.
2) the /admin page had a grid of all the content on the site, with handy 'Delete' links that ran over GET without confirmation.
You can probably guess where this is going – some search bot hit the overview page, ignored the redirect header, saw the content, and dutifully crawled every single link on it…
There were at least two browser extensions which also discovered that poor design was widespread and had to disable prefetching for similar reasons:
http://fasterfox.mozdev.org/index.html
https://signalvnoise.com/archives2/google_web_accelerator_he...
I think the state of the web has improved slightly over the last decade but this is a great example of why browser vendors are so conservative. You can do this now but only opt-in.
Was it blekko? We had a website owner email us about that issue when blekko's ScoutJet crawler was new... although I don't recall the bit about ignored redirect headers.
I'm pretty sure everyone with a crawler has hit this sort of problem before. The first startup I was at did with someone's wiki that had "delete" links everywhere with no auth.
Idempotency might be necessary for GET calls, but it's not sufficient. Imagine he had two separate GET calls (opened/closed): the author would still have the same problem. Browsers assume GET to be safe (non-mutating), and safety implies idempotency.
Exactly. A 'toggle' should really be implemented as a PATCH request, or maybe a PUT if there's no data other than the door state.
At the risk of being overly pedantic/piling on -- this is what bad REST-ful API design looks like in practice.
When you talk to your teammates about the semantics of these verbs and someone just says "oh a GET is fine" and the team agrees but you don't and you can't say it so you don't become "that guy" it's time to find a new engineering org to be a part of.
On the topic of PATCH, check out JSON merge patches (application/merge-patch+json):
https://tools.ietf.org/html/rfc7386
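For anyone curious what a merge patch does, here is a small Python sketch that follows the MergePatch pseudocode in RFC 7386, with JSON objects as dicts and JSON null as None. The `door` and `light` field names are made up for the example. Unlike the RFC pseudocode it returns a new dict instead of modifying the target in place.

```python
def merge_patch(target, patch):
    """Apply a JSON merge patch to target and return the result."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    if not isinstance(target, dict):
        target = {}
    result = dict(target)  # shallow copy so the caller's dict is untouched
    for name, value in patch.items():
        if value is None:
            # null in the patch means "remove this member".
            result.pop(name, None)
        else:
            result[name] = merge_patch(result.get(name), value)
    return result


state = {"door": "closed", "light": "on"}
patch = {"door": "open", "light": None}
print(merge_patch(state, patch))  # {'door': 'open'}
```

Applying the same patch a second time gives the same document, so a PATCH with this body behaves idempotently, which a "toggle" never can.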
Absolutely agree. A PUT method carrying an open/closed flag would seem like a natural choice. Calling it any number of consecutive times with the same payload would be idempotent. There would probably be a GET method to go along with it. And of course, it would model the desired state, not the actual position of the garage door since garage doors don't instantaneously flip (would be cool though).
A toggle would actually be a good use of POST, though PUTting the desired state would be better (PATCH works instead of PUT if you are changing some part of the state and not the whole state, but is unnecessary if the door state consists entirely of either “open” or ”closed”.)
One more thing: PATCH needs to do atomic updates on a partial resource per http://restcookbook.com/ (which I think is a great TLDR resource on the topic).
HTTP GET is nice because you can "debug" via browser. But I don't think it's a good protocol choice for opening/closing doors nor any other service not related to document requests.
It's trivial to debug a POST in a browser if you can open the console. https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequ...
I really don’t like that, I prefer debugging with curl which easily supports the other verbs.
This stuff is also why you should be afraid of any libraries/frameworks/tooling that says it's going to automatically offer TLS 1.3's "Zero round trip" (0RTT) feature for code as opposed to trivial stuff like resource downloads.
Normally, TLS ensures you can't replay somebody else's conversations. So even if I know Barry, who is authorised to toggle the door, just sent a "toggle the door" command, if I try playing it back that won't work, the setup will be different each connection and I can't respond.
But for 0RTT there is no setup - there can't be, no time to do it, and so if I replay Barry's "toggle the door" it would work.
The specification is very clear that the right thing here will be to never allow 0RTT for such features. But the moment that's hidden behind some library API you can bet _somebody_ is going to screw up badly. Alas our industry doesn't exactly have a "safety first" mentality.
I’m more surprised that the Safari new tab window makes GET requests to every “favorite” URL, which I gather is what was happening.
It's updating the thumbnail screenshots. This only happens if you have the Safari "blank page" be your favorites instead of either your "home page" or a truly blank page.
> if you have the Safari "blank page" be your favorites instead of either your "home page" or a truly blank page.
Yay for modern user friendly applications making simple words like blank completely meaningless and ambiguous.
Literally the reason HTTP verbs are a thing is so that User Agents like Safari can do exactly this. If this weren't a by-design property of the HTTP protocol, we wouldn't even have methods. Read the spec!
If I recall, Safari's default is to show your "favourites" screen in a new tab, which routinely refetches to update icons/previews.
I'm skeptical that it's "every time" but I do remember it doing it way more than I thought was needed.
Now I wonder if Safari (& other browsers) has distinct headers for their favorites lookups, to tell these lookups apart from real users and discard these accesses from site analytics..
They do, "X-Purpose: preview"
Hey! I've been thinking about this all day. I thought it was a GET to fetch the title of the page, but it might only be an OPTIONS or HEAD request? I'm not sure. Either way, my code activates the garage door on that endpoint no matter the HTTP verb.
The intersection of full-stack web devs from the commercial line-of-business world; and hardware/embedded hackers brings a lot of room for accidents IMO. I'm not saying any one of these groups are bad or inept. I'm in the former and completely accept that I'm new to embedded programming. It seems kool and I wanna learn about it. But I can also see the flip side where a hardware hacker sees query strings for toggling an output as a perfectly reasonable interface. Do we expect the embedded guys to grok HTTP/REST? The web-dev would be like "no, no, that has to be POST or PUT". But these things are going to happen. We don't yet have a large pool of experts across both fields.
It's no surprise the level of compromise and breach when you intersect what were pretty distinct skillsets and dump them in the mixing bowl together. That's what this IoT thing is like - it's a bunch of household and industrial chemicals all poured into the one container. It's not going to be very safe.
> Do we expect the embedded guys to grok HTTP/REST?
REST is irrelevant to this; HTTP alone covers the reason why this is bad. So ultimately, the question is: "Do we expect somebody designing an HTTP API to understand HTTP?" I think that's a reasonable expectation. If your embedded guys don't understand HTTP, then get somebody who does understand HTTP to design the API. They don't need embedded experience to do so, they aren't implementing it, just designing it. This isn't a difficult cross-functional intersection, you just don't assign tasks to people who aren't qualified to carry them out.
This is pretty much the classic newbie web developer mistake, heard many stories about people making it when they first start. I've also seen people fuck up in the opposite way, using POST when they should use GET and having unexpected behavior. Though not usually as "funny" as the classic "using GET instead of POST" errors are.
This concept of HTTP request methods really should be explained to new developers in a more accessible way, with examples of mistakes. It might not be intuitive at first or they might not think it's important as it is.
"Idempotence" isn't really the problem here, nor "should" GET requests be idempotent, think kittenwar.com or stumbleupon, the problem here is GET is reserved for retrieving (getting!) data, it shouldn't modify data. (Other than access information.)
GET requests are specified[0] to be idempotent:
[0] https://tools.ietf.org/html/rfc2616#section-9.1.2
edit: formatting
Looks like the RFC talks about idempotence from a "side effect" perspective where I was talking about it from an "output" perspective (the generated HTML).
I agree with the RFC and I mistook what the person meant
What are some of these unexpected behaviors associated with using POST when you're meant to use GET?
I saw “GET request” “idempotent” and “WiFi control garage doors” and immediately inferred the punchline.
At least it was his own devices and not Googlebot or something, I guess.
Perhaps we need a specific nomenclature for this sort of case: oddempotent!
After all, you get the same state if you do it 3,5,7,9,etc times as if you do it once, right? ;)
That would be an involution!
https://en.wikipedia.org/wiki/Involution_(mathematics)
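A tiny Python sketch of the involution property, for illustration only (the helper names are invented). Applying `toggle` twice gets you back where you started, so any odd number of calls looks like one call and any even number looks like none.

```python
def toggle(state: str) -> str:
    """An involution on {'open', 'closed'}: toggle(toggle(s)) == s."""
    return "closed" if state == "open" else "open"


def apply_n(f, x, n: int):
    """Apply f to x n times in a row."""
    for _ in range(n):
        x = f(x)
    return x


print(apply_n(toggle, "closed", 2))  # closed
print(apply_n(toggle, "closed", 7))  # open
```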
Readable format.
https://threadreaderapp.com/thread/990684453734203392.html
The first time I clicked the link, Twitter said I was rate limited. I thought that was the joke.
Well, a while ago I saw this code (on my own project!): window.open("?controller=users&action=changePassword&name=" + user_name + "&password=" + password)
I was horrified, glad it isn't live yet, and I fixed it immediately. But I'm still wondering whether I was so sleep-deprived or drunk when I wrote this. It's over SSL, so it should not be that big deal, but still, GET shouldn't be used for such things.
Well you don’t seem to validate the existing password prior to authorizing the change.
Good CSRF protection on GET requests is also near impossible to implement as GET is intended to be a “safe” request as in a request that does not modify a state but this isn’t something that is actually practiced.
Actually, I do. This is not a form for a user to change his own password, rather an administrator's form to change another user's password. And for such actions the administrator's identity and privileges are checked. But I understand your reasoning and thank you for pointing it out.
And yeah, I try to use GET only for safe requests, but I should be more careful.
Another big deal is that it'll get stored in server logs too.
It's a big deal since it will be visible in access logs in plaintext, so if the logs are compromised your users would be too.
This comment thread has really put me in a good mood. These stories have so much pedagogical value:
- The grammars we engineer have important semantic value
- understanding and adhering to them is important, and hard
- relying on others to adhere to them is dangerous, and hard to avoid
- "experts" make mistakes in both areas constantly
I genuinely love seeing this kind of lively discussion, because these seemingly "trivial details" matter, a lot. The Three Mile Island accident was more or less caused by "message sent" being conflated with "state changed" at the UI level, directly leading to a nuclear meltdown. They basically had a system with the equivalent design of GET /open and /close that assumed success for both https://en.wikipedia.org/wiki/Three_Mile_Island_accident#Con...
Two thoughts:
1) Twitter is a terrible medium for anything, let alone posts longer than a sentence.
2) The level of over-engineering tech people readily engage in without a second thought is truly mind boggling.
1) Market disagrees with you strongly, hence hundreds of millions of people using it regularly.
2) He's just having fun, working on a side project that's useful to him, and learning. Nothing wrong with any of that.
Why so negative?
1) Popularity is rarely an indicator of quality.
2) That doesn't mean it isn't over-engineering.
Adding Wi-Fi control to your garage door using a WeMos is not over-engineering. It's just a fun little weekend hack.
Over-engineering is what I did...
- Raspberry pi
- Open/close sensors on the garage door as well as the side-entry door
- Camera pointed at the side entry door taking photos while it is left open
- Push alerts to my phone if either door is opened between specific hours of the night (Break-ins to detached garages were huge in my neighborhood)
- Voice controls from my fucking phone to open the door
I sold the house otherwise it'd probably be an even larger monstrosity today
Over-engineering can only be assessed based on the goals of the project, which he hasn't detailed.
70 lines of code is over-engineering? That's nothing.
> "I threw the code together in minutes and was too lazy to spend another couple minutes figuring out POST."
So it's not the vendor's problem then. They provide you with two ways to make a request. You have a choice to do it right, you didn't.
The device should not support GET at all for this. It opens up a number of attacks and there’s no good reason to support it.
Who says it was the vendor's problem?
What's your point? Nobody said it was the vendor's problem except you.
GET requests aren't supposed to be idempotent. They're not supposed to change state in the first place.
> GET requests aren't supposed to be idempotent.
RFC 7231 disagrees.
> They're not supposed to change state in the first place.
Well, yeah, GET is supposed to be safe, but all safe methods are also idempotent.
Just because a server announces HTTP/1.1 doesn't mean it conforms to that specific RFC.
My desk height is set with a GET.
I was going to fix it, but considering how hilarious this is I might not.
Your desk runs a web server? I need to step up my game...
Yeah, it's a sit/stand Linak desk.
I hooked up an ESP32, 2 channel relay (up/down control), and distance sensor (to detect height). Pushes height to graphite and position is settable remotely. :)
I get that it's cool, but I'm missing the why? Would you ever need to change your desk height when you're not already at your desk? Wouldn't manually changing the height be easier than hitting an HTTP endpoint on your PC to adjust it? Maybe I'm missing something, and like I said, I'll give you that it's cool and that alone is sometimes reason enough.
> I get that it's cool, but I'm missing the why?
I'm a hacker, I did it because it's cool and I wanted to learn.
Primary effect - An effect on the input arguments, whose effect is captured in the output.
Side effect - An effect on state that was not passed in as input arguments, whose effect may or may not be captured on the output.
Side effect free - Also known as pure, means the function only has a primary effect. Thus it only affects the input in a way that the output captures.
Idempotent - Applying a function to itself results in the same effects. Applies to both primary and side effects.
Where things get weird, is that there's also the following:
- An effect on implicit input state, which did not come from input arguments, whose effect is captured on the output. This would be like an HTTP GET. Or any query on a DB where the DB is an implicit input.
- An effect whose effect is captured on implicit output state, either by having its effect captured on an input (like a modification to a pointed object), or captured on output not returned by the function (like print to screen). This would be like an HTTP POST.
And now if you look at all these, there are easy permutations of them. So you can build a table like so:
All these combinations are possible. That's why it can be really tricky.
Also, having your garage door opened with unauthorized requests seems like looking for trouble
Yeah, did wonder about that. My thinking at the time was that the device was on the local WiFi network, not exposed to the internet, and there would be easier ways of getting into the garage if you really wanted to.
war-driving -- now with parking included!
See "Important Programming Concepts (Even on Embedded Systems) Part I: Idempotence"
https://www.embeddedrelated.com/showarticle/629.php
I recently soldered a wire to my garage door opener on the wall and ran it to a relay and then to the pins on a raspberry pi. Knowing the state of the door is key because the opener is just a toggle. I also have my alarm system hooked up to the pi, so it checks the state before and after any request. Repeatedly asking it to open will open it, or return success if it already is. Same with close.
It took a bit of testing before I trusted it would all work the way I thought it would, but now I use it and don't even think about it, it just works and is handy to have.
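The pattern described here (check the sensor, pulse the toggle-only opener only when needed) can be sketched like this in Python. The `Door` class is a stand-in I made up in place of real GPIO and sensor code, and it pretends the door moves instantly, which a real one does not.

```python
class Door:
    """Simulated toggle-only opener plus a door sensor (hypothetical)."""

    def __init__(self, is_open: bool = False) -> None:
        self.is_open = is_open  # what the sensor would report
        self.pulses = 0         # how many times the relay fired

    def pulse_relay(self) -> None:
        """The only thing the opener understands: toggle."""
        self.pulses += 1
        self.is_open = not self.is_open


def ensure_state(door: Door, want_open: bool) -> bool:
    """Drive the door to the wanted state. Returns True if a pulse was sent."""
    if door.is_open == want_open:
        return False  # already there: report success without touching the relay
    door.pulse_relay()
    return True


door = Door(is_open=False)
ensure_state(door, want_open=True)
ensure_state(door, want_open=True)  # repeated request, no second pulse
print(door.is_open, door.pulses)    # True 1
```

Reading the state first is what turns the non-idempotent toggle into idempotent "open" and "close" operations.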
I know this is quite unrelated, and based on hazy memory of things from almost 10 years ago, DML statements in databases especially 'insert into' statements are not idempotent as I remember; ie if you try to select a few rows from a table table1 and insert them into table2 with same schema, if there were any identical rows already in table2, then whole insert will fail. My thinking at that time was that if these insert operations were idempotent, then there would be no need to explicitly check for duplicates before the insert.
At least in MySQL you can use “ON DUPLICATE KEY” to either ignore such things or optionally execute an update statement to change something about the matching row.
There is also REPLACE INTO
Assuming you have a relevant unique key setup of course.
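Here is the same idea as a runnable sketch, with one loud caveat: it uses SQLite through Python's standard `sqlite3` module rather than MySQL, so the statement is SQLite's `INSERT OR IGNORE`, not MySQL's `ON DUPLICATE KEY` or `REPLACE INTO`. The table layout is made up; `table2` is borrowed from the comment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key is the "relevant unique key" that makes duplicates detectable.
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT)")


def insert_rows(rows) -> None:
    """Insert rows, silently skipping any whose key already exists."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO table2 (id, name) VALUES (?, ?)", rows
        )


rows = [(1, "a"), (2, "b")]
insert_rows(rows)
insert_rows(rows)  # running it again neither fails nor duplicates
count = conn.execute("SELECT COUNT(*) FROM table2").fetchone()[0]
print(count)  # 2
```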
> DML statements in databases especially 'insert into' statements are not idempotent as I remember;
GET is actually supposed to be safe which is stronger than idempotent; the SQL command that most naturally corresponds to GET—SELECT—is normally safe, but DML inherently is not.
But, sure, that INSERT isn't safe increases the amount of code needed to implement idempotent PUTs.
See also: email-based click-to-confirm produces many false positives.
Is idempotency part of HTTP or just part of loose REST conventions?
HTTP: https://tools.ietf.org/html/rfc7231#section-4.2
Part of HTTP: https://github.com/for-GET/know-your-http-well/blob/master/m...
For a fascinating overview how all of HTTP fits together I also recommend the HTTP decision diagram: https://github.com/for-GET/http-decision-diagram/blob/master...
> loose REST conventions
REST is HTTP. "loose REST conventions" is when someone chose to ignore big chunks of the HTTP spec.
In other words, you can build whatever you want (like SOAP) on top of HTTP and ignore the spec that describes content negotiation, HTTP methods, Caching policies, etc. It's still technically HTTP. But if you were to read the HTTP spec and follow it to a tee, you'd build a REST application.
> REST is HTTP
This is untrue. I hate to quote from Wikipedia, but it sums it up quite nicely: "REST is not a standard in itself, but RESTful implementations make use of standards, such as HTTP, URI, JSON, and XML"
More colloquially: REST is what happens when people mix up transport layers in their head.
> REST is HTTP.
That's actually backwards; HTTP is the motivating example of REST (that is, REST was developed from observed properties which HTTP/1.0 loosely exhibited and was consciously applied in design of HTTP/1.1.)
> you'd build a REST application
Nah, nerds would come out of the woodwork to inform you that what you've built is not a real REST.
I have a small (golang) agent that runs on over a thousand raspberry pi class machines (not on the open internet).
The agent has a GET /reboot api because it is really convenient to be able to just hit that url in a browser window when we need to.
Adding all the no-cache headers to the response seems to have worked well enough to prevent browsers from randomly hitting the url.
I just added a check for the x-purpose header as well, thanks for the suggestion.
This still seems like really bad idea when the POST request is right there.
Serve a page with a button saying "are you sure"?
A reboot of the machine isn't the end of the world and far less risky than a garage door randomly opening. It probably needed it anyway as they tend to degrade over time. This is a very specific usecase. If it becomes an issue, I can always push out an update of the software that switches things to POST (thanks to using the golang library, overseer).
Do you really like having to confirm every action? For interfaces that you use all the time, it's nice to be able to eliminate extra steps.
I really (seriously) don’t get why you didn’t use POST in the first place? If it’s all for the “easy hit of the url in a browser” there are addons for that? Care to explain? One can’t be that lazy :)
This garage door opening story reminds me of something that happened in the 50's. The first powerful comm satellites happened to use a frequency that garage doors used (no id codes then). Garage doors opened and closed by themselves. My grandmother had it happen to her.
These kinds of chaotic accidents are just going to get more frequent as everything moves towards IoT
Forgive my density, but why would you make a web page trigger a toggle when it is loaded? Shouldn't there be a button or some other user interaction to initiate the state change?
Is it like loading a web page for the weather- so that when the page is loaded it goes and 'GETS' the latest weather info? Is it for convenience? So that all you have to do is go to a webpage and have it do stuff?
I think there was a homepage with a "toggle" button, the toggle button redirected to /toggle.
site.com/toggle then got added to his favorites by safari.
So it wasn't loading the site with the button, but the site that the button redirects to.
Correct. Site now has a form and a button. /toggle now responds only to POSTs, and redirects back to /.
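For readers who want to see the shape of that fix, here is a rough equivalent using Python's standard `http.server`. It is not the actual firmware (that is Arduino code in the linked repo); the `door` dict stands in for the relay, and the handler and `make_server` names are invented for this sketch.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

door = {"open": False}  # stand-in for the real relay

PAGE = b'<form method="post" action="/toggle"><button>Toggle door</button></form>'


class DoorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # GET is safe: it only ever serves the page with the button.
        if self.path != "/":
            self.send_error(405 if self.path == "/toggle" else 404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def do_POST(self):
        if self.path != "/toggle":
            self.send_error(404)
            return
        door["open"] = not door["open"]  # the only place state changes
        self.send_response(303)          # See Other: browser re-GETs "/"
        self.send_header("Location", "/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), DoorHandler)
```

Calling `make_server(8000).serve_forever()` would run it. Because only `do_GET` and `do_POST` are defined, other verbs such as HEAD or OPTIONS get an error response from the base class instead of moving the door.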
The real problem here, as usual, is web browsers, the worst class of software ever written. We constantly come across these CSRF-style bugs that are only made possible by how stupid the browser and HTTP are, but instead of blaming the culprit and trying to deal with the source, we blame ourselves for not being accommodating enough. Fool me once, shame on me, fool me 5,000 times, shame on me. Oh, and occasionally invent hackneyed fixes like CORS.
Craftsmen should understand the tools they use.
We expect it in every other industry.
Why do people like you always want to suggest that web developers shouldn't have to learn the basics of the tools they use every day? This is page one HTTP stuff.
This is the anti-intellectualism in our field. Where people are so used to finding a YouTube video tutorial for the exact thing they want to do that anything that's inherently hard (like client development) or requires some extra knowledge to do correctly is somehow shitty and needs to be reworked. More and more often it's somehow everything's fault but the craftsman's.
It's that mentality that's coasting parts of our field into code monkey cost center positions. Go somewhere like /r/webdev and watch how unresourceful the beginners are and how bad the advice is.
I'm not seeing how this is an issue with the browser as much as the server handling the request. Safari provides a feature that shows thumbnails of frequently visited sites; these thumbnails are loaded with a header specifying it's for a preview. It's on the server to understand that a GET request _by definition_ should not have side effects (like toggling the open/closed state of a door), and optionally to perform special handling when seeing the preview header.
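A rough sketch of what that optional special handling could look like. The header names and values here (`X-Purpose: preview`, `Purpose: prefetch`, `Sec-Purpose: prefetch`) are assumptions based on what browsers have commonly sent; check what your clients actually send before relying on them. `is_speculative` is a hypothetical helper, and it is no substitute for keeping GET safe in the first place.

```python
# Assumed header names/values; verify against real traffic.
PURPOSE_HEADERS = ("X-Purpose", "Purpose", "Sec-Purpose")
SPECULATIVE_PURPOSES = {"preview", "prefetch"}


def is_speculative(headers):
    """Guess whether a request was made for a thumbnail or prefetch."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    for name in PURPOSE_HEADERS:
        value = lowered.get(name.lower(), "")
        # The value may carry several tokens, e.g. "prefetch;prerender".
        tokens = {t.strip() for t in value.replace(";", ",").split(",")}
        if tokens & SPECULATIVE_PURPOSES:
            return True
    return False
```

A server might use this to serve a static placeholder to preview requests, but a client that omits the header still must not be able to open the door with a GET.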
I don't see why the browser or HTTP is to blame here? GET is supposed to be safe, so what the browser is doing seems perfectly fine to me.
CORS is to _allow_ cross-origin requests, not to restrict them. Most such requests are restricted by default.
Postel's law: "an implementation should be conservative in its sending behavior, and liberal in its receiving behavior"
Hope for the best but expect the worst.
>"The real problem here, as usual, is web browsers, the worst class of software ever written."
Can you elaborate? What class of software is this specifically? And why is it the worst?
Nay. TFA correctly identifies the problem.
Or don't connect your doors to the internet.
Too funny!
>idempotent
what does this mean
It's an important term in computer science. There is a good explanation on StackOverflow which I found by searching for "define: idempotent":
https://stackoverflow.com/a/1077421/111327
> In computing, an idempotent operation is one that has no additional effect if it is called more than once with the same input parameters. For example, removing an item from a set can be considered an idempotent operation on the set.
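To make the quoted definition concrete, here is a small sketch contrasting the set-removal example with a toggle. The function names are made up for illustration.

```python
def remove_item(items, item):
    """Idempotent: removing an item that is already gone changes nothing."""
    items.discard(item)  # discard() does not raise if the item is missing
    return items


def toggle(is_open):
    """Not idempotent: every call flips the state again."""
    return not is_open


once = remove_item({1, 2, 3}, 2)
twice = remove_item(remove_item({1, 2, 3}, 2), 2)
print(once == twice)                            # True
print(toggle(False) == toggle(toggle(False)))   # False
```

That second line is the garage door: calling it once and calling it twice leave the door in different states.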
Or, like, you know, how about a browser only sends a request when I'm actually fucking asking for something, and doesn't try to fetch everything I've ever thought about, with my every slightest accidental finger twitch against its touch screen?
What happened to deterministic user interaction?
Well, ideally you'd have both. When I "GET" I assume that it's GET as opposed to any of the other standard HTTP methods, not "GET" as in Indiana Jones getting the golden idol from the pedestal.
This is not only a problem with browsers preemptively requesting URLs, but also when it comes to caching. What happens when the URL I'm GETing is cached? Absolutely nothing, as far as the original server is concerned.
The current situation with browsers doing smart things to make slow websites appear fast is a bit like compiler writers doing smart things with UB in C, though. Speed a lot of things up by utilizing every undefined nook and cranny of the spec, breaking tons of legacy software that makes pretty sound assumptions about how things actually work. I use a bunch of poorly designed legacy systems where GET often has intentional side effects. They break because the browser starts issuing HTTP requests long before I have finished typing an address. Let slow sites be slow and leave the speed problem to the people that should be dealing with it instead.
GET is meant to get, not set. Since 1.0.
https://www.w3.org/Protocols/HTTP/1.0/spec.html#GET
> The GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI
So what?
If I never intended for the GET to be GOT, then the browser is just as much at fault for such unintended consequences.
Darn kids and their prefetching, GET off my lawn!
This is a little bit (not a lot though) like the /reset.htm endpoint on some Arris Surfboard modems.
https://news.softpedia.com/news/csrf-bug-in-over-135-million...
I'm only terrified by how Twitter is stealing my private data
This kind of thing shouldn’t even be a REST call. Use JSON RPC.
Why is that better?
First of all “/toggle” is not a proper REST endpoint. In fact, that is exactly what an RPC endpoint might look like. I can’t believe so many “developers” miss this.
With JSON RPC, he would POST a JSON object to a /toggle endpoint which runs a procedure to open or close his garage door.
If you want to be cute and conform to REST as much as possible, you would have to treat your garage door like a resource, and then use PUT to send the entire new state of the garage door to your API or PATCH to send the instructions for how to change the existing state (or use the +json media type for patch if you want to be lazy and just send a JSON object with updated key values). Your URL would probably be in the form of “/garage-doors/garage-door-id”.
Except, his garage door is probably NOT a resource and has no ID or even a serialized representation of its physical state. It’s just a physical door. And it should be driven by remote procedures.
Maybe you don’t need any of this. Maybe you are happy with just using GET and pushing everything into REST and basically doing the wrong things. But that’s how you end up with the problems in the article, by not respecting standards, by choosing to be close enough to correct instead of technically correct. And frankly, if there are any people who should strive for technical precision, it should be engineers.
I personally would not hire a single software engineer who chose to approach this problem in the RESTful way without clear and deliberate reasoning for doing so.
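A sketch of the two styles described in this comment, under stated assumptions: the request shape follows JSON-RPC 2.0, while `handle_rpc`, `put_door` and the `door` dict are hypothetical names and the dict stands in for the physical door.

```python
import json


def handle_rpc(raw, door):
    """RPC style: the body of a POST names a procedure to run."""
    request = json.loads(raw)
    reply = {"jsonrpc": "2.0", "id": request.get("id")}
    if request.get("method") == "toggle":
        door["open"] = not door["open"]  # not idempotent by design
        reply["result"] = door["open"]
    else:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        reply["error"] = {"code": -32601, "message": "Method not found"}
    return json.dumps(reply)


def put_door(desired, door):
    """REST style: PUT carries the whole desired state, so it is idempotent."""
    door["open"] = bool(desired["open"])
    return dict(door)


door = {"open": False}
request = json.dumps({"jsonrpc": "2.0", "method": "toggle", "id": 1})
print(handle_rpc(request, door))         # the door is now open
print(put_door({"open": True}, door))    # still open
print(put_door({"open": True}, door))    # repeating the PUT changes nothing
```

Either way the state change rides on POST or PUT, so a speculative GET never reaches it.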
This guy is stupid.
You're stupid.
To me this sounds like a CSRF problem. There's no token or session associated with these calls, so a browser was able to inadvertently CSRF the calls. Changing this call to POST or PUT would still leave this API vulnerable.
It's not about access control, it's about the fact that browsers are free to make speculative GET requests whenever they like, and they actively do to pre-fetch pages. His GET end-point was pre-fetched by his browser, activating the door. This would still happen even if there was a token or session associated.
> This would still happen even if there was a token or session associated.
This is exactly the scenario a CSRF token is supposed to prevent. But I understand your point.
Not just browsers, but any service.
> You know how HTTP GET requests are meant to be idempotent?
No, they aren't. They aren't meant to be anything specific. They can be idempotent, but generally these philosophical arguments are the net gain of this kind of condescension.