← Back to context

Comment by doctorpangloss

10 months ago

> As has been frequently pointed out, the benefits from improved profiling cannot be understated, even a 10% cost to having frame pointers can be well worth it when you leverage that information to target the actual bottlenecks that are eating up your cycles.

Few can leverage that information because the open source software you are talking about lacks telemetry in the self hosted case.

The profiling issue really comes down to the cultural opposition in these communities to collecting telemetry and opening it for anyone to see and use. The average user struggles to ally with a trustworthy actor who will share the information like profiling freely and anonymize it at a per-user level, the level that is actually useful. Such things exist, like the Linux hardware site, but only because they have not attracted the attention of agitators.

Basically users are okay with profiling, so long as it is quietly done by Amazon or Microsoft or Google, and not by the guy actually writing the code and giving it out for everyone to use for free. It’s one of the most moronic cultural trends, and blame can be put squarely on product growth grifters who equivocate telemetry with privacy violations; open source maintainers, who have enough responsibilities as is, besides educating their users; and Apple, who have made their essentially vaporous claims about privacy a central part of their brand.

Of course people know the answer to your question. Why doesn’t Google publish every profile of every piece of open source software? What exactly is sensitive about their workloads? Meta publishes a whole library about every single one of its customers, for anyone to freely read. I don’t buy into the holiness of the backend developer’s “cleverness” or whatever is deemed sensitive, and it’s so hypocritical.

> Basically users are okay with profiling, so long as it is quietly done by Amazon or Microsoft or Google, and not by the guy actually writing the code and giving it out for everyone to use for free.

No; the groups are approximately "cares whether software respects the user, including privacy", or "doesn't know or doesn't care". I seriously doubt that any meaningful number of people are okay with companies invading their privacy but not smaller projects.

"Agitators". We don't trust telemetry precisely because of comments like that. World is full of people like you who apparently see absolutely nothing wrong with exfiltrating identifying information from other people's computers. We have to actively resist such attempts, they are constant, never ending and it only seems to get worse over time but you dismiss it all as "cultural opposition" to telemetry.

For the record I'm NOT OK with being profiled, measured or otherwise studied in any way without my explicit consent. That even extends to the unethical human experiments that corporations run on people and which they euphemistically call A/B tests. I don't care if it's Google or a hobbyist developer, I will block it if I can and I will not lose a second of sleep over it.

  • > World is full of people like you who apparently see absolutely nothing wrong with exfiltrating identifying information from other people's computers.

    True. But such people are like cockroaches. They know what they are doing will be unpopular with their targets, so they keep it hidden. This is easy enough to do in closed designs, car manufacturers selling your driving habits to insurance companies and health monitoring app selling menstrual cycle data to retailers selling to women.

    Compare that do say Debian and RedHat. They too collect performance data. But the code is open source, Debian has repeatable builds so you can be 100% sure that is the code in use, and every so often someone takes a look it. Guess what, the data they send back is so unidentifiable it satisfies even the most paranoid of their 1000's of members.

    All it takes is a little bit of sunlight to keep the cockroaches at bay, and then we can safely let the devs collect the data they need to improve code. And everyone benefits.

    • I fully support the schemes for telemetry that already happens. It’s just obnoxious that there’s no apparent reason behind the support of one kind of anonymous telemetry versus another. For every 1 person who digests the package repo’s explanation of the telemetry strategy, there are 19 who feel okay with GitHub having all the telemetry and none of the open source repos they host because of vibes.

I think the kind of profiling information you're imagining is a little different from what I am.

Continuous profiling of your system that gets relayed to someone else by telemetry is very different from continuous profiling of your own system, handled only by yourself (or, generalising, your community/group/company). You seem to be imagining we're operating more in the former, whereas I am imagining more in the latter.

When it's our own system, better instrumented for our own uses, and we're the only ones getting the information, then there's nothing to worry about, and we can get much more meaningful and informative profiling done when more information about the system is available. I don't even need telemetry. When it's "someone else's" system, in other words, when we have no say in telemetry (or have to exercise a right to opt-out, rather than a more self-executing contract around opt-in), then we start to have exactly the kinds of issues you're envisaging.

When it's not completely out of our hands, then we need to recognise different users, different demands, different contexts. Catering to the user matters, and when it comes to sensitive information, well, people have different priorities and threat models.

If I'm opening a calendar on my phone, I don't expect it to be heavily instrumented and relaying all of that, I just want to see my calendar. When I open a calendar on my phone, and it is unreasonably slow, then I might want to submit relevant telemetry back in some capacity. Meanwhile, if I'm running the calendar server, I'm absolutely wanting to have all my instrumentation available and recording every morsel I reasonably can about that server, otherwise improving it or fixing it becomes much harder.

From the other side, if I'm running the server, I may want telemetry from users, but if it's not essential, then I can "make do" with only the occasional opt-in telemetry. I also have other means of profiling real usage, not just scooping it all up from unknowing users (or begrudging users). Those often have some other "cost", but in turn, they don't have the "cost" of demanding it from users. For people to freely choose requires acknowledging the asymmetries present, and that means we can't just take the path of least resistance, as we may have to pay for it later.

In short, it's a consent issue. Many violate that, knowingly, because they care not for the consequences. Many others don't even seem to think about it, and just go ahead regardless. And it's so much easier behind closed doors. Open source in comparison, even if not everything is public, must contend with the fact that the actions and consequences are (PRs, telemetry traffic, etc), so it inhabits a space in which violating consent is much more easily held accountable (though no guarantee).

Of course, this does not mean it's always done properly in open source. It's often an uphill battle to get telemetry that's off-by-default, where users explicitly consent via opt-in, as people see how that could easily be undermined, or later invalidated. Many opt-in mechanisms (e.g. a toggle in the settings menu) often do not have expiration built in, so fail to check at a later point that someone still consents. Not to say that's the way you must do it, just giving an example of a way that people seem to be more in favour of, as with the generally favourable response to such features making their way into "permissions" on mobile.

We can see how the suspicion creeps in, informed by experience... but that's also known by another word: vigilance.

So, users are not "okay" with it. There's a power imbalance where these companies are afforded the impunity because many are left to conclude they have no choice but to let them get away with it. That hasn't formed in a vacuum, and it's not so simple that we just pull back the curtain and reveal the wizard for what he is. Most seem to already know.

It's proven extremely difficult to push alternatives. One reason is that information is frequently not ready-to-hand for more typical users, but another is that said alternatives may not actually fulfil the needs of some users: notably, accessibility remains hugely inconsistent in open source, and is usually not funded on par with, say, projects that affect "backend" performance.

The result? Many people just give their grandma an iPhone. That's what's telling about the state of open source, and of the actual cultural trends that made it this way. The threat model is fraudsters and scammers, not nation-state actors or corporate malfeasance. This app has tons of profiling and privacy issues? So what? At least grandma can use it, and we can stay in contact, dealing with the very real cultural trends towards isolation. On a certain level, it's just pragmatic. They'd choose differently if they could, but they don't feel like they can, and they've got bigger worries.

Unless we do different, the status quo will remain. If there's any agitation to be had, it's in getting more people to care about improving things and then actually doing them, even if it's just taking small steps. There won't be a perfect solution that appears out of nowhere tomorrow, but we only have a low bar to clear. Besides, we've all thought "I could do better than that", so why not? Why not just aim for better?

Who knows, we might actually achieve it.

  • [flagged]

    • Telemetry is exceedingly useful, and it's basically a guaranteed boon when you operate your own systems. But telemetry isn't essential, and it's not the heart of the matter I was addressing. Again, the crux of this is consent, as an imbalance of power easily distorts the nature of consent.

      Suppose Chrome added new telemetry, for example, like it did when WebRTC was added in Chrome 28, so we really can just track this against something we're all familiar (enough with). When a user clicks "Update", or it auto-updated and "seamlessly" switched version in the background / between launches, well, did the user consent to the newly added telemetry?

      Perhaps most importantly: did they even know? After all, the headline feature of Chrome 28 was Blink, not some feature that had only really been shown off in a few demos, and was still a little while away from mass adoption. No reporting on Chrome 28 that I could find from the time even mentions WebRTC, despite entire separate articles going out just based on seeing WebRTC demos! Notifications got more

      So, capabilities to alter software like this are, knowingly or unknowingly, undermine the nature of consent that many find implicit in downloading a browser, since what you download and what you end up using may be two very different things.

      Now, let's consider a second imbalance. Did you even download Chrome? Most Android devices often have it preinstalled, or some similar "open-core" browser (often a Chromium-derivative). Some are even protected from being uninstalled, so you can't opt out that way, and Apple only just had to open up iOS to non-Safari backed browsers.

      So the notion of consent via the choice to install is easily undermined.

      Lastly, because we really could go on all day with examples, what about when you do use it? Didn't you consent then?

      Well, they may try to onboard you, and have you pretend to read some EULA, or just have it linked and give up the charade. If you don't tick the box for "I read and agree to this EULA", you don't progress. Of course, this is hardly a robust system. Enforceability aside, the moment you had it over to someone else to look at a webpage, did they consent to the same EULA you did?

      ... Basically, all the "default" ways to consider consent are nebulous, potentially non-binding, and may be self-defeating. After all, you generally don't consent to every single line of code, every single feature, and so on, you are usually assumed to consent to the entire thing or nothing. Granularity with permissions has improved that somewhat, but there is usually still a bulk core you must accept before everything else; otherwise the software is usually kept in a non-functional state.

      I'm not focused too specifically on Chrome here, but rather the broad patterns of how user consent typically assumed in software don't quite pan out as is often claimed. Was that telemetry the specific reason why libwebrtc was adopted by others? I'm not privy to the conversations that occurred with these decisions, but I imagine it's more one factor among many (not to mention, Pion is in/for Go, which was only 4 years old then, and the pion git repo only goes back to 2018). People were excited out of the gate, and libwebrtc being available (and C++) would have kept them in-step (all had support within 2013). But, again, really this is nothing to do with the actual topic at hand, so let's not get distracted.

      The user has no opportunity to meaningfully consent to this. Ask most people about these things, and they wouldn't even recognise the features by now (as WebRTC or whatever is ubiquitous), let alone any mechanisms they may have to control how it engages with them.

      Yet, the onus is put on the user. Why do we not ask about anything/anyone else in the equation, or consider what influences the user?

      A recent example I think illustrates the imbalance and how it affects and warps consent is the recent snafu with a vending machine with limited facial recognition capabilities. In other words, the vending machine had a camera, ostensibly to know when to turn on or not and save power. When this got noticed at a university, it was removed, and everyone made a huge fuss, as they had not consented to this!

      What I'd like to put in juxtaposition with that is how, in all likelihood, this vending machine was probably being monitored by CCTV, and even if not, that there is certainly CCTV at the university, and nearby, and everywhere else for that matter.

      So what changed? The scale. CCTV everywhere does not feel like something you can, individually, do anything about; the imbalance of power is such that you have no recourse if you did not consent to it. A single vending machine? That scale and imbalance has shifted, it's now one machine, not put in place by your established security contracts, and not something ubiquitous. It's also something easily sabotaged without clear consequence (students at the university covered the cameras of it quite promptly upon realising), ironically, perhaps, given that this was not their own property and potentially in clear view of CCTV, but despite having all the same qualities as CCTV, the context it embedded in was such that they took action against it.

      This is the difference between Chrome demanding user consent and someone else asking for it. When the imbalance of power is against you, even just being asked feels like being demanded, whereas when it doesn't quite feel that way, well, users often take a chance to prevent such an imbalance forming, and so work against something that may (in the case of some telemetry) actually be in their favour. However, part and parcel with meeting user needs is respecting their own desires -- as some say, the customer is always right in matters of taste.

      To re-iterate myself from before, there are other ways of getting profiling information, or anything you might relay via telemetry, that do not have to conform to the Google/Meta/Amazon/Microsoft/etc model of user consent. They choose the way they do because, to them, it's the most efficient way. At their scale, they get the benefits of ubiquitous presence and leverage of the imbalance of power, and so what you view as your system, they view as theirs, altering with impunity, backed by enough power to prevent many taking meaningful action to the contrary.

      For the rest of us, however, that might just be the wrong way to go about it. If we're trying to avoid all the nightmares that such companies have wrought, and to do it right by one another, then the first step is to evaluate how we engage with users, what the relationship ("contract") we intend to form is, and how we might inspire mutual respect.

      In ethical user studies, users are remunerated for their participation, and must explicitly give knowing consent, with the ability to withdraw at any time. Online, they're continually A/B tested, frequently without consent. On one hand, the user is placed in control, informed, and provided with the affordances and impunity to consent entirely according to their own will and discretion. On the other, the user is controlled, their agency taken away by the impunity of another, often without the awareness that this is ongoing, or that they might have been able to leverage consent (and often ignored even if they did, after all, it's easy to do so when you hold the power). I know which I'd rather be on the other end of, at least personally speaking.

      So, if we want to enable telemetry, or other approaches to collaborating with users to improve our software, then we need to do just that. Collaborate. Rethink how we engage, respect them, respect their consent. It's not just that we can't replicate Google, but that maybe we shouldn't, maybe that approach is what's poisoned the well for others wanting to use it, and what's forcing us to try something else. Maybe not, after all, that's not for us to judge at this point, it's only with hindsight that we might truly know. Either way, I think there's some chance for people to come in, make something that actually fits with people, something that regards them as a person, not simply a user, and respects their consent. Stuff like that might start to shift the needle, not by trying to replace Google or libwebrtc or whatever and get the next billion users, but by paving a way and meeting the needs of those who need it, even if it's just a handful of customers or even just friends and family. Who knows, we might start solving some of the problems we're all complaining about yet never seem to fix. At the very least, it feels like a breath of fresh air.

      1 reply →