Comment by mwcampbell
4 years ago
However, his experience, in games and game development tools AFAIK, might not be fully applicable to the development of mainstream commercial software that has to try to be all things to all people, including considerations like internationalization, accessibility, and backward compatibility. The performance difference that he demonstrated between Windows Terminal and refterm is certainly dramatic, but I wouldn't be surprised if there's something he's overlooking.
When I saw this mentioned on HN I immediately knew this kind of comment would be there, because something along the lines of (I'm paraphrasing) "it's probably fast because it's not enterprise enough" was repeated in every place refterm was shared, by different people, sometimes multiple times. Even after being shown all the proof in the world that it's in fact the opposite, they almost refused to believe that software can be that much better than today's standard, to the point of arguing that a 16 fps terminal is better than a 7500 fps one because that many frames would probably consume too many resources. I used to find Casey's tone when criticizing bad software off-putting, but now I understand: after many years of such arguments, it takes a toll on you.
Seconding. It takes doing some low-level gamedev[0] stuff, or using software written by people like Casey, to realize just how fast software can be. There's an art to it, and it hits diminishing returns with complex software, but the baseline of popular software is so low it doesn't take much more than a pinch of care to beat it by an order of magnitude.
(Cue the "but why bother, my app is IO bound anyway" counterarguments. That happens, but people too often forget you can design software to avoid or minimize the time spent waiting on IO. And I don't mean just "use async" - I mean designing its whole structure and UX alike to manage waits better. A sketch of what I mean follows the footnote.)
--
[0] - Or HFT, or browser engine development, or a few other performance-minded areas of software.
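To make the parenthetical above concrete, here's a minimal sketch of designing around IO waits rather than just awaiting them: the next chunk of a file is prefetched on a background task while the current chunk is processed, so disk and CPU work at the same time. The filename, chunk size, and process() body are all placeholders.

```cpp
#include <cstddef>
#include <cstdio>
#include <future>
#include <vector>

// Read one chunk of the file starting at `offset`; returns an empty
// vector at EOF.
static std::vector<char> read_chunk(std::FILE* f, long offset, std::size_t size) {
    std::vector<char> buf(size);
    std::fseek(f, offset, SEEK_SET);
    buf.resize(std::fread(buf.data(), 1, size, f));
    return buf;
}

static void process(const std::vector<char>& chunk) {
    // ... whatever per-chunk work the application actually does ...
    (void)chunk;
}

int main() {
    std::FILE* f = std::fopen("input.bin", "rb");  // placeholder input file
    if (!f) return 1;
    const std::size_t kChunk = 1 << 20;  // 1 MiB, arbitrary
    long offset = 0;
    auto pending = std::async(std::launch::async, read_chunk, f, offset, kChunk);
    for (;;) {
        std::vector<char> chunk = pending.get();
        if (chunk.empty()) break;
        offset += (long)chunk.size();
        // Kick off the next read *before* processing the current chunk,
        // so the disk works while the CPU works instead of after it.
        pending = std::async(std::launch::async, read_chunk, f, offset, kChunk);
        process(chunk);
    }
    std::fclose(f);
}
```

Only one read is ever in flight, so the single FILE* is never touched by two threads at once; a real design would go further (overlapped/uring IO, more buffers), but the structural point is the same.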
I feel obliged to point out the destructive power of Knuth's statement, "Premature optimization is the root of all evil."
I have encountered far too many people who interpret that to mean, "thou shalt not even consider performance until a user, PM, or executive complains about it."
14 replies →
I find it striking that a decent modern laptop would have been a supercomputer 20 years ago, when people used Office 97, which IMO was already feature-complete. I can't help this constant cognitive dissonance with modern software; do we really need supercomputers just to run Windows out of the box?
5 replies →
I think there's a story in here that most are missing, but your comment is closest to touching on. This was not a performance problem. This was a fundamental problem that surfaced as a performance issue.
The tech stack in use in the Windows Terminal project is new code bolted onto old code, and no one on the current team knows how that old code works. No one understands what it's doing. No one knows whether the things that old code was written to do are still needed.
It took someone like Casey, who knew gamedev, to know instinctively that all of that stuff was junk and you could rewrite it in a weekend. The Microsoft devs, if they wanted to dive into the issue, would have been forced to Chesterton's-Fence every single line of code. It WOULD have taken them years.
We've always recommended that programmers know the code one, and possibly two, layers below them. This recommendation failed here, it failed during the GTA loading-times scandal, and it has failed millions of times; the ramifications of that failure are the chaos of performance issues we see everywhere.
I've come to realize that much of the problem we have gotten ourselves into comes down to what I call feedback bandwidth. If you are an expert, as Casey is, you have effectively infinite bandwidth and are limited only by your ability to experiment. If your change-and-test cycle is a couple of seconds, you will be able to create projects that are fundamentally impossible without that feedback.
If you need to discuss something with someone else, that bandwidth drops like a stone. If you need a team of experts, all IMing each other 1-on-1, you might as well give up. Two-week Agile sprints are much better than months-to-years-long waterfall, but we still have so much to learn. If you only know whether the sprint was a success after everyone comes together, you are doomed. The people iterating dozens of times every hour will eat your shorts.
I'm not saying that only a single developer should work on entire projects. But what I am saying is that when you have a Quarterback and Wide Receiver that are on the same page, talking at the same abstraction level, sometimes all it takes is one turn, one bit of information, to know exactly what the other is about to do. They can react together.
Simple is not easy. Matching essential complexity might very well be impossible. Communication will never be perfect. But you have to give it a shot.
1 reply →
I started off programming doing web development, working on a community-run asynchronous game where we needed to optimize everything to run in minimal time and power to save on cost and annoyance. It was a great project to work on as a high schooler.
Then in college, I studied ECE and worked in a physics lab where everything needed to run fast enough to read out ADCs as quickly as allowed by the datasheet.
Then I moved to defense doing FPGA and SW work (and I moonlighted in another group consulting internally on verification for ASICs). Again, everything was tightly controlled. On a PCIe transfer, we were allowed a maximum of 5 µs of overhead. The rest of the time could only be used for streaming data to and from a device. So if you needed to do computation with the data, you needed to do it in flight, and every data structure had to be perfectly optimized. Weirdly, once you have data structures that are optimized for your hardware, the algorithms kind of just fall into place. After that, I worked on sensor fusion and video applications for about a year, where our data rates for a single card were measured in TB/s. Needless to say, efficiency was the name of the game.
After that, I moved to HFT. And weirdly, outside of the critical tick-to-trade path or microwave stuff, this industry has a lot less care around tight latencies and has crazy low data rates compared to what I'm used to working with.
So when I go look at software and stuff is slow, I'm just suffering, because I know all of this can be done faster and more efficiently (I once shaved 99.5% of the run time off a person's code with better data packing to align to cache lines, better addressing to minimize page thrashing, and automated loop unrolling into threads, all in about one day of work). Software developers seriously need to learn to optimize proactively... or just write less shitty code to begin with.
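For illustration, here's roughly what the first of those tricks (packing data to align with cache lines) looks like. The structs and numbers are invented for the sketch, not the code from that anecdote.

```cpp
#include <cstdint>
#include <vector>

// Before: hot and cold fields interleaved. sizeof == 64 here, so a
// physics loop that only needs pos/vel still drags one full cache line
// through the CPU per particle.
struct Particle {
    float    pos[3];
    float    vel[3];
    char     debug_name[32];  // cold: touched only in tooling
    uint64_t spawn_tick;      // cold: touched only at spawn
};

// After: hot fields split out and tightly packed. sizeof == 24, so one
// 64-byte cache line now carries ~2.6 particles instead of exactly 1,
// roughly a 2.6x cut in memory traffic for the hot loop.
struct ParticleHot {
    float pos[3];
    float vel[3];
};
struct ParticleCold {
    char     debug_name[32];
    uint64_t spawn_tick;
};

void integrate(std::vector<ParticleHot>& hot, float dt) {
    for (auto& p : hot) {  // streams through contiguous, fully-used bytes
        p.pos[0] += p.vel[0] * dt;
        p.pos[1] += p.vel[1] * dt;
        p.pos[2] += p.vel[2] * dt;
    }
}

int main() {
    std::vector<ParticleHot> hot(100000);
    integrate(hot, 1.0f / 60.0f);
}
```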
> There's an art to it
While that's true, in this particular case with Casey's impl it's not an art. The one thing that drastically improved performance was caching: literally the simplest, most obvious thing to do when you have performance problems.
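For readers who haven't seen one, the shape of such a cache is roughly this. As I understand it, refterm's real cache keys on whole shaped glyph runs rather than single codepoints and needs eviction when its atlas fills up; this toy version omits both, and all names are invented.

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>

struct GlyphTile { int atlas_x, atlas_y; };  // slot in a texture atlas

class GlyphCache {
public:
    // Return the atlas tile for `codepoint`, rasterizing only on a miss.
    GlyphTile get(uint32_t codepoint) {
        auto it = cache_.find(codepoint);
        if (it != cache_.end()) return it->second;  // hit: no rendering at all
        GlyphTile tile = rasterize(codepoint);      // miss: do the slow work once
        cache_.emplace(codepoint, tile);
        return tile;
    }
private:
    GlyphTile rasterize(uint32_t /*codepoint*/) {
        // Stand-in for the actual call into DirectWrite/FreeType/etc.
        int slot = (int)cache_.size();
        return GlyphTile{(slot % 64) * 16, (slot / 64) * 16};
    }
    std::unordered_map<uint32_t, GlyphTile> cache_;
};

int main() {
    GlyphCache cache;
    // A terminal redrawing the same screenful mostly re-hits the cache.
    for (int frame = 0; frame < 1000; ++frame)
        for (uint32_t c = 'a'; c <= 'z'; ++c)
            cache.get(c);  // 26 rasterizations total, ~26,000 lookups
    std::printf("done\n");
}
```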
2 replies →
Even something like a JSON parser is often claimed to be IO bound. It almost never is: few parsers could keep up with modern IO, and some cannot keep up even with old HDDs.
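This claim is easy to sanity-check on your own machine. The sketch below times a bare byte-counting pass (a stand-in for how fast the bytes can merely be moved) against a toy JSON tokenizer over the same in-memory buffer; a real parser does far more work per byte than this toy, so its throughput would be lower still. The tokenizer and all numbers are illustrative only.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <string>

// Build a synthetic JSON array in memory so disk speed can't interfere.
static std::string make_json(std::size_t entries) {
    std::string s = "[";
    for (std::size_t i = 0; i < entries; ++i)
        s += "{\"id\":123,\"name\":\"abcdef\",\"flag\":true},";
    s.back() = ']';  // replace the trailing comma
    return s;
}

// Toy "parse": touch every byte, tracking string state and counting
// structural characters. A real parser does far more than this.
static std::size_t tokenize(const std::string& s) {
    std::size_t tokens = 0;
    bool in_str = false;
    for (char c : s) {
        if (c == '"') { in_str = !in_str; ++tokens; }
        else if (!in_str && (c == '{' || c == '}' || c == '[' ||
                             c == ']' || c == ':' || c == ',')) ++tokens;
    }
    return tokens;
}

// Baseline: the cost of merely looking at each byte once.
static std::size_t byte_sum(const std::string& s) {
    std::size_t sum = 0;
    for (char c : s) sum += (unsigned char)c;
    return sum;
}

static double run_gbps(const std::string& s, std::size_t (*f)(const std::string&)) {
    auto t0 = std::chrono::steady_clock::now();
    volatile std::size_t sink = f(s);  // volatile: keep the pass from being optimized out
    (void)sink;
    std::chrono::duration<double> dt = std::chrono::steady_clock::now() - t0;
    return (double)s.size() / dt.count() / 1e9;
}

int main() {
    std::string json = make_json(4000000);  // ~160 MB
    std::printf("byte pass:    %.2f GB/s\n", run_gbps(json, byte_sum));
    std::printf("toy tokenize: %.2f GB/s\n", run_gbps(json, tokenize));
    // Compare both numbers to your drive: NVMe SSDs stream several GB/s.
}
```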
Third'ing: the current crop of new developers have no freaking idea how much power they have at their fingertips. Us greybeard old game developers look at today's hardware and literally cream our jeans in comparison to the crap we put up with in previous decades. People have no idea, playing with their crappy interpreted languages, just how much raw power we have if one is willing to learn the low-level languages to access it. (Granted, NumPy and BLAS do a wonderful job for JIT languages.)
2 replies →
Worth bearing in mind that Casey has a long history of unsuccessfully trying to nudge Microsoft to care about performance, and the fact that he's still being constructive about it is to his credit.
I highly respect Casey, but given his abrasive communication style I sometimes wonder if he is trying to trigger people (MS devs in this case) into pushing back so he can make his point.
Honestly, I felt like the ones who started with the condescending tone were the Microsoft devs, who kept talking down to Casey about You Don't Understand How Hard This Is while also readily admitting they didn't understand the internals very well.
I don't think they're actually contradicting themselves there. They know enough about how hard text rendering is to conclude that they're better off delegating it to the team that specializes in that particular area, even though it means they have to settle for a good-enough abstraction rather than winning at a benchmark.
2 replies →
In my experience as a former game dev who moved to enterprise apps, game dev techniques are broadly applicable and speed up enterprise apps without compromising on functionality.
Consider memory management techniques like caching layers or reference pools. Or optimizing draws for the platform's render loop. Or being familiar with profiler tools to identify hotspots. These techniques are all orthogonal to functionality. That is, applying them when you see an opportunity to will not somehow limit features.
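As an illustration of one technique named above (reference pools), here's a generic sketch: a pool that recycles objects so that steady-state operation does no heap allocation. The names and the 64 KiB buffer size are invented for the example.

```cpp
#include <memory>
#include <vector>

// A minimal object pool: acquire() hands out a recycled instance when
// one is available, so the allocator is only hit during warm-up.
template <typename T>
class Pool {
public:
    std::unique_ptr<T> acquire() {
        if (free_.empty()) return std::make_unique<T>();
        std::unique_ptr<T> obj = std::move(free_.back());
        free_.pop_back();
        return obj;
    }
    void release(std::unique_ptr<T> obj) {
        free_.push_back(std::move(obj));  // keep for reuse instead of freeing
    }
private:
    std::vector<std::unique_ptr<T>> free_;
};

// Usage: e.g. per-request buffers in a server loop.
struct RequestBuffer { std::vector<char> bytes = std::vector<char>(64 * 1024); };

int main() {
    Pool<RequestBuffer> pool;
    for (int i = 0; i < 1000; ++i) {
        auto buf = pool.acquire();  // allocates only on the first lap
        // ... fill and process buf->bytes ...
        pool.release(std::move(buf));
    }
}
```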
So why aren't enterprise apps fast, if it's so easy? I think that boils down to incentives. Enterprise apps are sales- or product-led, and the roadmap only accommodates functionality that makes selling the software easier. Whereas in games, the table-stakes bar you need to clear for graphics is not reachable by naively pursuing game features.
Put another way, computers and laptops are way stronger than consoles, and performance is a gas: it expands to fill whatever container you give it. Enterprise devs are used to writing at 1 PSI or less, and game devs are used to writing at 1 PSI or more.
With enterprise apps, I also have the budget to throw more computers at a problem. If it's between 2 weeks of my time, or to throwing another core at a VM, the extra core wins most of the time.
I actually have a lot of respect for old-school game programmers because they have two traits that many of us who develop mainstream commercial software often lack: a) they care about performance, and not in the abstract but as evaluated by an actual human (latency issues in a messaging app are tolerable; a game with latency issues is simply not fun to play), and b) they can sit down without much fuss and quickly write the damn code (an ability that slowly atrophies as one works on a multi-year-old codebase where every change is a bit of a PITA). Sure, the constraints are different, but a lot of it is simply learned helplessness.
> might not be fully applicable to the development of mainstream commercial software that has to try to be all things to all people, including considerations like internationalization, accessibility, and backward compatibility.
Windows Terminal has none of that. And his refterm already has more features implemented correctly (such as proper handling of Arabic etc.) than Windows Terminal. See feature support: https://github.com/cmuratori/refterm#feature-support
Also see FAQ: https://github.com/cmuratori/refterm/blob/main/faq.md
Internationalization and accessibility are very important in game development. A lot of time is invested in this and larger studios have dedicated UI/UX teams which spend a lot of time on these issues.
The same is true of backwards compatibility. As an example, making sure old save data is compatible with new versions is an important consideration; a sketch of the usual approach follows below.
Source: I'm a game programmer working mainly with graphics and performance, but I previously spent five years working on the UI team at a AAA studio.
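To make the save-data point concrete, here's a common shape for it: a version tag in the header lets new code keep loading old saves, with defaults for fields that didn't exist yet. The format and field names here are invented for illustration.

```cpp
#include <cstdint>
#include <cstdio>

struct SaveDataV1 { uint32_t gold; };                  // shipped in 1.0
struct SaveDataV2 { uint32_t gold; uint32_t gems; };   // `gems` added in 2.0

// Current in-memory representation.
struct SaveData { uint32_t gold = 0; uint32_t gems = 0; };

// Load either version: the header's version field says which struct
// follows, and fields missing from old saves get sensible defaults.
static bool load_save(std::FILE* f, SaveData& out) {
    uint32_t version = 0;
    if (std::fread(&version, sizeof version, 1, f) != 1) return false;
    if (version == 1) {
        SaveDataV1 v1;
        if (std::fread(&v1, sizeof v1, 1, f) != 1) return false;
        out.gold = v1.gold;
        out.gems = 0;        // default for pre-2.0 saves
        return true;
    }
    if (version == 2) {
        SaveDataV2 v2;
        if (std::fread(&v2, sizeof v2, 1, f) != 1) return false;
        out = SaveData{v2.gold, v2.gems};
        return true;
    }
    return false;            // unknown (future) version
}

int main() {
    // Round-trip demo: write a v1 save, load it with current code.
    std::FILE* f = std::fopen("save.bin", "wb+");
    if (!f) return 1;
    uint32_t version = 1;
    SaveDataV1 v1{500};
    std::fwrite(&version, sizeof version, 1, f);
    std::fwrite(&v1, sizeof v1, 1, f);
    std::rewind(f);
    SaveData s;
    if (load_save(f, s)) std::printf("gold=%u gems=%u\n", s.gold, s.gems);
    std::fclose(f);
}
```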
How is it not applicable when the thing in question is rendering text, and rendering is the core of game development? This argument is stupid. Do you have to be a slowpoke to develop commercial apps?
My point is that a UI that meets the needs of as many users as possible, including things like internationalization and accessibility, is much more complex than a typical game UI. That complexity drives developers to use abstractions that often make it much more difficult to optimize. And in the big picture, support for these things that add complexity is often more important than top-speed rendering.
Games are typically much better at internationalization and accessibility than developer tooling, though. For example, this new Windows console has neither, but all big games get translated into, and handle text from, languages all over the world.
Video games often have an international audience and go to great lengths to support accessibility and multiple platforms, e.g. supporting both tablet and desktop. It's laughable how bad many enterprise UIs are, failing to handle different locales, having issues displaying right-to-left text, and assuming everyone is using an English-speaking standard desktop environment, whereas many indie games manage to handle these things very well.
Games usually handle internationalization and accessibility much better than most software.
This includes audio localization (something no 'Enterprise' software has ever needed AFAIK), and multiple colour palettes for different types of colour blindness.
Sometimes video games are the only software with reasonable localizations I ever find installed on a particular computer.
Can't recall the last time I played a game with no internationalization support.
Backward compatibility is a huge one here.
There is a newer version of component X, but we can't leverage it due to dependency Y.