Comment by dmk
7 days ago
The quote from the CMU guy about modern Agile and DevOps approaches challenging architectural discipline is a nice way of saying most of us have completely forgotten how to build deterministic systems. Time-triggered Ethernet with strict frame scheduling feels like it's from a parallel universe compared to how we ship software now.
During the time of the first Apollo missions, a dominant portion of computing research was funded by the defense department and related arms of government, making this type of deterministic, WCET-aware (worst-case execution time) computing a dominant paradigm. Now that we have a huge free market for things like online shopping and social media, it's a bit of a neglected field and suffers from poor investment and mindshare, but I think it's still a fascinating field with some really interesting algorithms -- check out the work of Frank Mueller or Johann Blieberger.
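The static-analysis side of that field can be sketched in miniature. A toy example (all names and cycle counts invented): model a program as a DAG of basic blocks with known worst-case cycle costs and take the longest path from entry to exit. Real WCET tools also have to bound loops and model caches and pipelines, which is where most of the interesting research lives; this ignores all of that.

```python
def wcet(blocks, edges, entry, exit_):
    """Longest-path bound over a loop-free CFG.
    blocks: {name: worst-case cycle cost}, edges: {name: [successors]}."""
    # Topologically order the DAG via DFS post-order.
    order, seen = [], set()

    def visit(n):
        if n in seen:
            return
        seen.add(n)
        for s in edges.get(n, []):
            visit(s)
        order.append(n)

    visit(entry)
    # Relax edges in entry-first order to find the longest path.
    longest = {n: float("-inf") for n in blocks}
    longest[entry] = blocks[entry]
    for n in reversed(order):
        if longest[n] == float("-inf"):
            continue
        for s in edges.get(n, []):
            longest[s] = max(longest[s], longest[n] + blocks[s])
    return longest[exit_]

# A diamond-shaped CFG: the bound must take the expensive branch.
cost = {"A": 2, "B": 10, "C": 3, "D": 1}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
print(wcet(cost, succ, "A", "D"))  # -> 13  (A -> B -> D)
```

The point of the paradigm is that this number is a guarantee, not a benchmark: no input can make the path cost more.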
It still lives on as a bit of a hard skill in automotive/robotics. As someone who crosses the divide between enterprise web software and hacking about with embedded automotive bits, I don't really lament that we're not using WCET and real-time OSes in web applications!
I suppose the rough edges of RTOSes are mostly due to that mainstream neglect: they are specialist tools for seasoned professionals, whose own edges have long since been worn into shapes compatible with the existing RTOSes.
If you ever worked in automotive, you know it's BS.
Since CAN, all that reliability and predictability went out the window. We now have redundancy everywhere, with everything just rebooting all the time.
Install an aftermarket radio and your ECU will probably reboot every time you press play or something. And that's just "normal".
2 replies →
ever use WordStar on a Z80 system with a 5 MB hard drive?
responsive. everything dealing with user interaction is fast. sure, reading a 1 MB document took time, but 'up 4 lines' was bam!.
linux ought to be this good, but the I/O subsystem slows down responsiveness. it should be possible to copy a file to a USB drive, and not impact good response from typing, but it is not. real time patches used to improve it.
windows has always been terrible.
what is my point? well, i think a web stack run under an RTOS (and sized appropriately) might be a much more pleasurable experience. Get rid of all those lags, intermittent hangs, and calls for more GB of memory.
QNX is also a good example of an RTOS that can be used as a desktop. Although an example with a lot of political and business problems.
7 replies →
> making this type of deterministic, WCET-aware (worst-case execution time) computing a dominant paradigm.
Oh wow, really? I never knew that. huh.
I feel like as I grow older, the more I start to appreciate history. Curse my naive younger self! (Well, to be fair, I don't know if I would've learned history like that in school...)
Contrary to propaganda from the likes of Ludwig von Mises, the free market is not some kind of optimal solution to all of our problems. And it certainly does not produce excellent software.
I can't think of a time when I've found an absolutist position useful or intelligent, in any field. Free-market absolutism is as stupid as totalitarianism. The content of economics papers does not need to be evaluated to discard an extreme position, one need merely say "there are more things in earth and heaven than are dreamed of in your philosophies"
1 reply →
Mises never claimed that the free market produces optimal solutions at any given moment. In fact, Mises explicitly stated many times that the free market does indeed undergo semi-frequent self-corrections, speculation, and manipulation by its agents.
What Mises proposed - in essence - is that an autonomous market with enough participating agents will reach an optimal Nash equilibrium where supply and demand are balanced. Only an external disruption (interventionism, new technologies, production methods, an influx or efflux of agents in the market) can break the Nash equilibrium momentarily, and that leads to either supply or demand being favored.
18 replies →
Propaganda is quite a strong term to describe the works of an economist. If one wants to debate the ideas of von Mises, it'd be useful to consider the Zeitgeist at that time. Von Mises preferred free markets in contrast to the planned economy of the communists. Partly because the latter has difficulties in proper resource allocation and pricing. Note that this was decades before we had working digital computers and digital communication systems, which, at least in theory, change the feasibility of a planned economy.
Also, the last time I checked, the US government produces its goods and services using the free market. Government contractors (private enterprises) are usually tasked with building things, as compared with the government itself doing so in a non-free, purely planned economy (if you refer to von Mises).
I assume that you originally meant to refer to the idea that without government intervention (funding for deep R&D), the free market itself would probably not have produced things like the internet or the moon landing (or at least not within the observed time span). That is, however, a rather interesting idea.
11 replies →
Are _you_ making software for the government?
1 reply →
Time-triggered Ethernet is part of certified aircraft data buses and has a deep, decades-long history. I believe INRIA did work on this, feeding Airbus maybe. It makes perfect sense when you can design for it. An aircraft is a bounded problem space of inputs and outputs which can have deterministic required minima, so you can build for it, and hopefully even have headroom for extras.
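Roughly, the determinism comes from an offline-computed schedule: every frame gets a fixed (offset, duration) window in a repeating cycle, so worst-case latency is known by construction. A toy sketch of the offline check (all times invented; a real TTEthernet schedule also has to budget for clock-synchronization precision):

```python
from math import lcm

def slots_collide(a_offset, a_len, a_period, b_offset, b_len, b_period):
    """Check whether two periodic transmission windows on the same link
    ever overlap, by enumerating the hyperperiod (lcm of the periods)."""
    hyper = lcm(a_period, b_period)
    for sa in range(a_offset, hyper, a_period):
        for sb in range(b_offset, hyper, b_period):
            # Standard interval-overlap test.
            if sa < sb + b_len and sb < sa + a_len:
                return True
    return False

# Two frames sharing a link, times in microseconds.
print(slots_collide(0, 100, 1000, 100, 100, 500))  # False: slots interleave
print(slots_collide(0, 100, 1000, 50, 100, 500))   # True: windows overlap
```

If the check passes for every pair of frames on every link, contention simply cannot happen at runtime, which is the whole point.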
Ethernet is such a misnomer for something which now is innately about a switching core ASIC or special purpose hardware, and direct (optical even) connects to a device.
I'm sure there are also buses, dual redundant, master/slave failover, you name it. And given it's air or space probably a clockwork backup with a squirrel.
A real squirrel would need acorns, I would assume it's a clockwork squirrel too.
A software squirrel, maybe? https://sqrrl.io :-)
Aircraft also have software and components that form a proclaimed "working" ecosystem in lockstep - a baseline. This is why there are paper "additions" upon bug discovery, until the bug is patched and the whole ecosystem of devices is lifted to the next "baseline".
You could even say that part of the value of Artemis is that we're remembering how to do some very hard things, including the software side. This is something that you can't fake. In a world where one of the more plausible threats of AI is the atrophy of real human skills -- the goose that lays the golden eggs that trains the models -- this is a software feat where I'd claim you couldn't rely on vibe code, at least not fully.
That alone is worth my tax dollars.
Don’t count your chickens before they hatch.
I'm not sure you really understood my comment. A large portion of the kind of value I'm talking about comes from attempting the hard thing. If these chickens do not hatch that will be tragic, but we will still have learned something from it. In some ways, we will have learned even more, by getting taught about what we don't know.
Anyway, let's all hope for a safe landing tonight.
Agile is not meant to make solid, robust products. It’s so you can make product fragments/iterations quickly, with okay quality and out to the customer asap to maximize profits.
“Agile” doesn’t mean that you release the first iteration, it’s just a methodology that emphasizes short iteration loops. You can definitely develop reliable real-time systems with Agile.
> “Agile” doesn’t mean that you release the first iteration
Someone needs to inform the management of the last three companies I worked for about this.
1 reply →
I would differentiate between iterative development and incremental development.
Incremental development is like painting a picture line by line, like a printer, where you add new pieces to the final result without affecting old pieces.
Iterative is where you do the big brush strokes first and then add more and more detail, depending on what you learn from each previous brush stroke. You can also stop at any time, when you think the result is good enough.
If you are making a new type of system and don’t know what issues will come up and what customers will value (highly complex environment) iterative is the thing to do.
But if you have a very predictable environment and you are implementing a standard or a very well-specified system (which can be highly complicated yet not very complex), you might as well do incremental development.
Roughly speaking, though: since there is of course no perfect specification other than the final implementation, there are always learnings, so there is always some iterative part to it.
A physicist who worked on radiation-tolerant electronics here. Apart from the short iteration loops, agile also means that the SW/HW requirements are not fully defined during the first iterations, because they may evolve over time. But this cannot be applied to projects where radiation/fault tolerance is the top priority. Most of the time, the requirements are 100% defined ahead of time, leading to a waterfall-like process, or a mixed one where the development is still agile but the requirements are never discussed again, except in negligible terms.
I think people mean so many different things when talking about agile. I'm pretty sure a small team of experts is a good fit for critical systems.
A fixed amount of meetings every day/week/month to appease management and rushing to pile features into buggy software will do more harm than good.
SCRUM methodology absolutely prioritizes a "Potentially Shippable Product Increment" as the output of every sprint.
1 reply →
You can absolutely build robust products using agile. Apart from some of the human benefits of any kind of incremental/iterative development, the big win with Agile is a realistic way to elicit requirements from normal people.
You hopefully know that's not true. But it's a matter of quality goals. Need absolute robustness? Prioritize it and build it. Need speed, to be first to market? Prioritize that and build it. You can do both in an agile way. Many would argue that you won't be as fast in a non-agile way. There is no bullet point in the agile manifesto saying to build unreliable software.
Yeah, I know it’s not true in the sense that that’s not what it’s meant to do, but I’m saying practically that’s what usually ends up happening.
The generous way of seeing it is that you don't know what the customer wants, and the customer doesn't know all that well what they want either, and certainly not how to express it to you. So you try something, and improve it from there.
But for aerospace, the customer probably knows pretty well what they want.
The manifesto refers to “working software”. It does not say anything about “okay quality”.
... and it mechanically promotes planned obsolescence by its nature (likely to be of disastrous quality). The perfect mur... errr... the perfect fraud.
Tesla’s Cybertruck uses that in its ethernet as well!
All the ADAS automotive systems use this, there are several startups in this space as well, such as Ethernovia.
All thanks to twisted pair https://de.wikipedia.org/wiki/BroadR-Reach
Some of us still work on embedded systems with real-time guarantees.
Believe it or not, at least some of those modern practices (unit testing, CI, etc) do make a big (positive) difference there.
The depressing part is that these "modern practices" were essentially invented in the 1960s by defense and aerospace projects like the NTDS, LLRV/LLTV, and Digital Fly-by-Wire to produce safety-critical software, and the rest of the software industry simply ignored them until the last couple of decades.
Microsoft fired all its QA people ten or fifteen years ago. I'd imagine it's a similar story: boxed software needed much higher guarantees of correctness. Digital delivery leaves much more room for error, because it leaves room for easier, cheaper fixes.
> “Modern Agile and DevOps approaches prioritize iteration, which can challenge architectural discipline,” Riley explained. “As a result, technical debt accumulates, and maintainability and system resiliency suffer.”
Not sure I agree with the premise that "doing agile" implies decision-making at odds with architecture: you can still iterate on architecture. Terraform etc. make that very easy. Sure, tech debt accumulates naturally as a byproduct, but every team I've been on regularly does dedicated tech-debt sprints.
I don't think the average CRUD API or app needs "perfect determinism", as long as modifications are idempotent.
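To make the idempotency point concrete, here's a toy sketch (store and handler names invented for illustration): a PUT-style update keyed by resource id can be blindly retried after a timeout, while an increment-style update cannot.

```python
store = {}

def put_item(item_id, payload):
    """Idempotent: applying it twice leaves the same state as applying it once,
    so a client can safely retry after a lost response."""
    store[item_id] = payload
    return store[item_id]

def add_to_balance(item_id, delta):
    """NOT idempotent: a blind retry after a lost response double-applies."""
    store[item_id] = store.get(item_id, 0) + delta
    return store[item_id]

put_item("cart:42", {"qty": 3})
put_item("cart:42", {"qty": 3})   # retry: harmless
print(store["cart:42"])           # -> {'qty': 3}

add_to_balance("acct:7", 10)
add_to_balance("acct:7", 10)      # retry: balance is now wrong
print(store["acct:7"])            # -> 20
```

That property is usually all the "determinism" a CRUD API needs; the worst case is a wasted request, not a corrupted state.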
In theory, yes, you could iterate on architecture with an agile approach and potentially even come up with a better one.
In practice, so many aspects follow from it that it’s not practical to iterate with today’s tools.
Agile is like communism. Whenever something bad happens to people who practice agile, the explanation is that they did agile wrong, had they being doing the true agile, the problem would've been totally avoided.
In reality, agile doesn't mean anything. Anyone can claim to do agile. Anyone can be blamed for only pretending to do agile. There's no yardstick.
But it's also easy to understand what the author was trying to say, if we don't try to defend or blame a particular fashionable ideology. I've worked on projects that required high quality of code and product reliability and those that had no such requirement. There is, indeed, a very big difference in approach to the development process. Things that are often associated with agile and DevOps are bad for developing high-quality reliable programs. Here's why:
The development process before DevOps looked like this: the developers would finish the whole product first, then hand it over to QA for a long testing cycle, and only after that would it be released.
The "smart" idea behind DevOps, or, as it used to be called at the time, "shift left", was to start QA before the whole of the programming was done, in parallel with the development process, so that the testers wouldn't be idling for a year waiting for the developers to deliver the product, and the developers would get faster feedback on the changes they made. Iterating on this idea produced the concept of "continuous delivery" (and that's where DevOps came into play: they are the ones, fundamentally, responsible for making this happen). Continuous delivery observed that since developers were getting feedback sooner in the development process, the release, too, could be "shifted left", thus starting marketing and sales earlier.
Back in those days, however, it was common to expect that testers would conduct a kind of double-blind experiment. I.e., testers weren't supposed to know the ins and outs of the code, so that they wouldn't inadvertently side with the developers on whatever issues they discovered. Something that today, perhaps, would be called "black-box testing". This became impossible with CD, because testers would be incrementally exposed to the decisions governing the internal workings of the product.
Another aspect of the more rigorous testing is "mileage". Critical systems normally aren't released without being run intensively for a very long time, typically orders of magnitude longer than a single QA cycle (say, if QA gets a day of computer time to run their tests, then the mileage needs to be a month or so). This is a very inconvenient time for development, as feature freeze and code freeze are still in effect, so coding can only happen on the next version of the product (provided it's even planned). But the incremental approach used by CD managed to sell the lie that "we've run the program for a substantial amount of time across all the increments we've made so far, therefore we don't need to collect more mileage". This, of course, overlooks the fact that changes to the program don't contribute proportionally to its quality or performance.
In other words, what I'm trying to say is that agile and DevOps practices made the development process cheaper by making it faster, while still maintaining some degree of quality control; however, they are inadequate for products with high quality requirements because they don't address the worst-case scenarios.
I think he refers to SpaceWire https://en.wikipedia.org/wiki/SpaceWire.
As a '70s child who was there when agile took over and systems engineers got rebranded as devops, I fully agree with them.
Add TDD, XP and mob programming as well.
While in some ways better than pure waterfall, most companies never adopted them fully, and in some scenarios they are better suited to a Silicon Valley TV show than to anything else.
If you look at code as art, where its value is a measure of the effort it takes to make, sure.
Or if you're building something important, like a spaceship.
In that case, our test infrastructure belongs in the Louvre…
If your implication is that stencil art does not take effort then perhaps you may not fully appreciate Banksy. Works like Gaza Kitty or Flower Thrower don’t just appear haphazardly without effort.
It's not like the approach they took is any different. They just slapped 8x the number of computers on it to calculate the same thing and waited to see if any disagree. Not the pinnacle of engineering. The equivalent of throwing money at the problem.
>Just slapped 8x the number of computers on it
‘Just’ is not an appropriate word in this context. Much of the article is about the difficulty of synchronization, recovery from faults, and about the redundant backup and recovery systems.
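Even the "easy" part - comparing redundant outputs - has sharp edges. A minimal sketch of N-modular-redundancy voting (channel count and values invented; the hard parts the article describes, like clock sync, fault isolation, and state recovery of a failed channel, are exactly what this toy omits):

```python
from collections import Counter

def vote(outputs):
    """Return the majority value from redundant channels, plus the indices
    of channels that disagreed (candidates for being marked faulty)."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        # No strict majority: the fault cannot be masked by voting alone.
        raise RuntimeError("no majority: cannot mask the fault")
    dissenters = [i for i, v in enumerate(outputs) if v != value]
    return value, dissenters

# Four channels computing the same command; one has flipped a bit.
print(vote([0.44, 0.44, 0.45, 0.44]))  # -> (0.44, [2])
```

Note that the `RuntimeError` branch is where the real engineering lives: deciding what the vehicle does when redundancy itself fails is the expensive part, not the voter.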
What happens when they don't?
If you have a point to make, make it.
4 replies →
I take the opposite message from that line - out of touch teams working on something so over budget and so overdue, and so bureaucratic, and with such an insanely poor history of success, and they talk as if they have cured cancer.
This is the equivalent of Altavista touting how amazing their custom server racks are when Google just starts up on a rack of naked motherboards and eats their lunch and then the world.
Let's at least wait until the capsule comes back safely before touting how much better they are than "DevOps" teams running websites - apparently a comparison that's somehow relevant here to stoke egos.
You mean like this?
"With limited funds, Google founders Larry Page and Sergey Brin initially deployed this system of inexpensive, interconnected PCs to process many thousands of search requests per second from Google users. This hardware system reflected the Google search algorithm itself, which is based on tolerating multiple computer failures and optimizing around them. This production server was one of about thirty such racks in the first Google data center. Even though many of the installed PCs never worked and were difficult to repair, these racks provided Google with its first large-scale computing system and allowed the company to grow quickly and at minimal cost."
https://blog.codinghorror.com/building-a-computer-the-google...
The biggest innovation from Google regarding hardware was understanding that dropping memory prices had made it feasible to serve most data directly from memory. Even though memory was more expensive, you could serve requests faster, which meant less server capacity was needed, which meant reduced cost - on top of the faster responses themselves.
The problem they solved isn't easy. But it's not some insane technical breakthrough either. Literally add redundancy, that's the ask. They didn't invent quantum computing to solve the issue, did they? Why dunk on sprints?
3 replies →
Google then had complete regret not doing this with ECC RAM: https://news.ycombinator.com/item?id=14206811
3 replies →
No, space is just hard.
Everything is bespoke.
You need 10x cost to get every extra '9' in reliability and manned flight needs a lot of nines.
People died on the Apollo missions.
It just costs that much.
Please, this is hacker news. Nothing else is hard outside of our generic software jobs, and we could totally solve any other industry in an afternoon.
5 replies →
Yep, spend 100 billion on what should have cost 1/50th of that, send people up to the moon on rockets that we're still crossing our fingers won't kill them tomorrow, and we have to congratulate them for dunking on some irrelevant career?
Modern software development is a fucking joke. I’m sorry if that offends you. Somehow despite Moore’s law, the industry has figured out how to actually regress on quality.
Lately it strikes me that there's a big gap between the value promised and the value actually delivered, compared to simple home-grown solutions (with a generic tool like a text editor or a spreadsheet, for example). If they just showed us how to fish, we wouldn't be buying; the magic would be gone.
In this sense all of the West is full of shit, and it's a requirement. The intent is not to help and make life better for everyone, to cooperate; it is to deceive and impoverish those that need our help. Because we pity ourselves and feed the coward within, the one that never took his first option and chose to do what was asked of him instead.
This is what our society deviates us from, in its wish to be the GOAT, and to control. It results in the production of lives full of fake achievements, the constant highs which I see Muslims actively opt out of. So they must be doing something right.
We have a lot more software developers than 50 years ago and intelligence is still normally distributed.
3 replies →
And overall performance in terms of visible UX.
One simply does not [“provision” more hardware|(reboot systems)|(redeploy software)] in space.
What would you suggest? Vibe coding a react app that runs on a Mac mini to control trajectory? What happens when that Mac mini gets hit with an SEU or even a SEGR? Guess everyone just dies?
No, of course not! It would be far better to have an openClaw instance running on a Mac Mini. We would only need to vibe code a 15s cron job for assistant prompting...
USER: You are a HELPFUL ASSISTANT. You are a brilliant robot. You are a lunar orbiter flight computer. Your job is to calculate burn times and attitudes for a critical mission to orbit the moon. You never make a mistake. You are an EXPERT at calculating orbital trajectories and have a Jack Parsons level knowledge of rocket fuel and engines. You are a staff level engineer at SpaceX. You are incredible and brilliant and have a Stanley Kubrick level attention to detail. You will be fired if you make a mistake. Many people will DIE if you make any mistakes.
USER: Your job is to calculate the throttle for each of the 24 orientation thrusters of the spacecraft. The thrusters burn a hypergolic monopropellant and can provide up to 0.44kN of thrust with a 2.2 kN/s slew rate and an 8ms minimum burn time. Format your answer as JSON, like so:
one value for each of the 24 independent monopropellant attitude thrusters on the spacecraft, x1, x2, x3, x4, y1, y2, y3, y4, z1, z2, z3, z4, u1, u2, u3, u4, v1, v2, v3, v4, w1, w2, w3, w4. You may reference the collection of markdown files stored in `/home/user/geoff/stuff/SPACECRAFT_GEOMETRY` to inform your analysis.
USER: Please provide the next 15 seconds of spacecraft thruster data to the USER. A puppy will be killed if you make a mistake so make sure the attitude is really good. ONLY respond in JSON.
[flagged]
5 replies →
> ...they talk as if they have cured cancer.
I'd chalk that up to the author of the article writing for a relatively nontechnical audience and asking for quotes at that level.
So the quote is somewhat right then, no? If you are writing for nontechnical people, you end up using such lofty wording.
1 reply →