Transmeta made a technology bet that dynamic compilation could beat out-of-order superscalar CPUs on SPEC.

It turned out to be wrong, but it was genuinely controversial among experts at the time.

I'm glad they tried it even though it didn't pan out. Many of the lessons learned are documented in systems conferences and incorporated into modern designs, e.g., GPUs.

To me, Transmeta is a great example of a venture investment. If it had beaten Intel at SPEC by a meaningful margin, it would have dominated the market. Sometimes the only way to get to the bottom of a complex system is to build it.

The same could be said of scaling laws and LLMs. They were theory before Dario, Ilya, OpenAI, et al. trained at scale.
I was an Intel CPU architect when Transmeta started making claims. We were baffled by them. We were pushing the limits of our pipelines to get incremental gains, and they were claiming to beat a dedicated architecture on the fly! None of their claims made sense to ANYONE with a shred of CPU architecture experience. I think your summary has rose-colored lenses, or reflects the layman's perspective.
Wasn't Intel trying to do something similar with Itanium, i.e., using software to translate code into VLIW instructions to exploit many parallel execution units? Only they wanted the C++ compiler to do it rather than a dynamic recompiler. At least some people at Intel thought that was a good idea.

I wonder if the x86 teams at Intel were similarly baffled by that.
I think this is a classic hill-climbing dilemma. If you start in the same place, and one org has worked very hard and spent a lot of money optimizing the system, they will probably come out on top. But if you start in a different place, reimagining the problem from first principles, you may or may not find yourself with a taller hill to climb. Decisions made very early on in your hill-climbing process lock you in to a path, and then the people tasked with optimizing the system later can't fight the organizational inertia to backtrack and pick a different path. But a new startup can.
It's worth noting that Google actually did succeed with a wildly different architecture a couple years later. They figured "Well, if CPU performance is hitting a wall - why use just one CPU? Why not put together thousands of commodity CPUs that individually are not that powerful, and then use software to distribute workloads across those CPUs?" And the obvious objection to that is "If we did that, it won't be compatible with all the products out there that depend upon x86 binary compatibility", and Google's response was the ultimate in hubris: "Well we'll just build new products then, ones that are bigger and better than the whole industry." Miraculously it worked, and made a multi-trillion-dollar company (multiple multi-trillion-dollar companies, if you now consider how AWS, Facebook, TSMC, and NVidia revenue depends upon the cloud).
Transmeta's mistake was that they didn't re-examine enough assumptions. They assumed they were building a CPU rather than an industry. If they'd backed up even farther they would've found that there actually was fertile territory there.
I think of it more as the timing being wrong - betting on software in an era of exponential hardware growth was unwise (software performance can't scale that way). The problem is that you need to marry it to a significantly better CPU/architecture, because the JIT is about not losing performance while retaining backwards compatibility.
However, if you add it onto a better CPU it’s a fine technique to bet on - case in point Apple’s move away from Intel onto homegrown CPUs.
> However, if you add it onto a better CPU it’s a fine technique to bet on - case in point Apple’s move away from Intel onto homegrown CPUs.
I don't think Apple is a good example here. Arm was extremely well-established when Apple began its own phone/tablet CPU designs. By the time Macs began to transition, much of their developer ecosystem was already familiar.
Apple's CPUs are actually notably conservative when compared to the truly wild variety of Arm implementations; no special vector instructions (e.g. SVE), no online translation (e.g. Nvidia Denver), no crazy little/big/bigger core complexes.
Exactly... I think that if you look at the accelerator paths that Apple's chips have for x86 emulation combined with software it's pretty nifty. I do wish these were somewhat standardized/licensed/upstreamed so that other arm vendors could use them in a normalized way.
They were also the first to produce an x86 CPU with an integrated northbridge; they could have pitched it more at embedded and industrial markets, where SPEC scores are less important.
That's kind of the bet they made, but misses a key point.
Their fundamental idea was that by having simpler CPUs, they could iterate on Moore's law more quickly. And eventually they would win on performance. Not just on a few speculative edge cases, but overall. The dynamic compilation was needed to be able to run existing software on it.
The first iterations, of course, would be slower. So their initial market - the one that had to carry them through those software generations - would be low-power use cases, because the complexity of a CISC chip made low power a weak point for Intel.
They ran into a number of problems.
The first is that the team building that dynamic compilation layer was more familiar with the demands of Linux than Windows, with the result that the compilation worked better for Linux than Windows.
The second problem was that the "simple iterates faster" also turns out to be true for ARM chips. And the most profitable segments of that low power market turned out to be willing to rewrite their software for that use case.
And the third problem is that Intel proved to be able to address their architectural shortcomings by throwing enough engineers at the problem to iterate faster.
If Transmeta had won its bet, they would have completely dominated. But they didn't.
It is worth noting that Apple pursued a somewhat similar idea with Rosetta, both in changing to Intel and later in changing to ARM64, with the crucial difference that they also controlled the operating system. Instead of constantly dynamically compiling, they could rely on the operating system to decide what needed to be compiled, and when, and to invoke the result correctly. They also better understood what to optimize for.
I don't know if the bet was even particularly wrong. If they had done a little better job on performance, capitalized on the pains of Netburst + AMD64 transition, and survived long enough to do integrated 3D graphics and native libraries for Javascript + media decoding it might have worked out fine. That alternate universe might have involved a merger with Imagination when the Kyro was doing poorly and the company had financial pain. We'll never know.
> Aren't modern CPUs, essentially, dynamic translators from the x86_64 instruction set into internal RISC-like instruction sets?

Folks like to say that, but that's not what's happening.
The key difference is: what is an instruction set? Is it a Turing-complete thing with branches, calls, etc? Or is it just data flow instructions (math, compares, loads and stores, etc)?
X86 CPUs handle branching in the frontend using speculation. They predict where the branch will go, issue data flow instructions from that branch destination, along with a special "verify that I branched to the right place" instruction, which is basically just the compare portion of the branch. ARM CPUs do the same thing. In both X86 and ARM CPUs, the data flow instructions that the CPU actually executes look different (are lower level, have more registers) than the original instruction set.
This means that there is no need to translate branch destinations. There's never a place in the CPU that has to take a branch destination (an integer address in virtual memory) in your X86 instruction stream and work out what the corresponding branch destination is in the lower-level data flow stream. This is because the data flow stream doesn't branch; it only speculates.
On the other hand, a DBT has to have a story for translating branch destinations, and it does have to target a full instruction set that does have branching.
That said, I don't know what the Transmeta CPUs did. Maybe they had a low-level instruction set that had all sorts of hacks to help the translation layer avoid the problems of branch destination translation.
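To make the distinction concrete, here is a toy sketch (illustrative only - this is not how any real frontend is implemented, and `frontend_issue`, `predict`, and `dbt_lookup` are invented names): a speculating frontend never emits a branch into its data-flow stream, only a "verify" op, while a dynamic binary translator has to maintain an explicit guest-to-host branch-target map.

```python
def frontend_issue(guest_code, predict, budget=8):
    """Speculatively turn guest instructions into a data-flow stream.

    guest_code: list of tuples; a branch is ("branch", cond_reg, target_index).
    """
    stream, pc = [], 0
    while pc < len(guest_code) and len(stream) < budget:
        ins = guest_code[pc]
        if ins[0] == "branch":
            taken = predict(pc)  # the predictor guesses a direction
            # Only the compare survives, as a "verify" data-flow op;
            # a misprediction would flush the stream and refetch.
            stream.append(("verify", ins[1], taken))
            pc = ins[2] if taken else pc + 1
        else:
            stream.append(ins)
            pc += 1
    return stream

# A DBT, by contrast, needs an explicit map from guest branch targets
# to translated host code; a miss means "stop and translate this block".
target_map = {}  # guest address -> host address

def dbt_lookup(guest_target):
    return target_map.get(guest_target)  # None -> not translated yet
```

For example, `frontend_issue([("add", "r1"), ("branch", "r1", 0)], lambda pc: False)` yields `[("add", "r1"), ("verify", "r1", False)]` - the branch itself never appears in the issued stream.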
Not to the same level. Crusoe was, in many ways, more classic CISC than x86 - except its microcode was actually doing dynamic translation to an internal ISA instead of operating like the interpreters in old CISCs.

The x86 ISA had the funny advantage of being way closer to RISC than "beloved" CISC architectures of old like m68k or VAX. Many common instructions translate to a single "RISCy" instruction in the internal microarchitecture (something AMD noted, IIRC, for the original K5 with its Am29050-derived core: "most instructions translate to 1 internal microinstruction, some between 2 to 4"). x86 prefixes are also way simpler than the complicated decode logic of m68k or VAX; an instruction with multiple prefixes will quite probably still decode to a single microinstruction.

That said, there's a funny thing in that Transmeta tech survived quite a long way - to the point that there were Android tablets, in fact flagship Google ones like the Nexus 9, whose CPU was based on it - because Nvidia's "Denver" architecture used the same technology (AFAIK licensed from Transmeta, but don't cite me on this).
Modern CPUs still translate individual instructions to corresponding micro-ops, and do a bit of optimization with adjacent micro-ops. Transmeta converted whole regions of code at a time, and I think it tried to do higher-level optimizations.
Did anyone try dynamic recompilation from x86 to x86? Like a JIT taking advantage of the fact that the target ISA is compatible with the source ISA.
Yes, I think the conclusion was that it did improve performance on binaries that were not compiled with optimizations, but didn't generate enough gains on optimized binaries to offset the cost of recompilation: https://dl.acm.org/doi/10.1145/358438.349303 (that one is about PA-RISC rather than x86, but the conclusions would likely be very similar).
Notably, VMware and the like in the pre-hardware-virtualization era did something like that to run x86 programs fast under virtualization instead of interpreting x86 through emulation.
One aspect of Transmeta not mentioned by this article is their "Code Morphing" technique used by the Crusoe and Efficeon processors. This was a low level piece of software similar to a JIT compiler that translated x86 instructions to the processor's native VLIW instruction set.
Similar technology was developed later by Nvidia, which had licensed Transmeta's IP, for the Denver CPU cores used in the HTC Nexus 9 and the Carmel CPU cores in the Magic Leap One. Denver was originally intended to target both ARM and x86, but they had to abandon the x86 support due to patent issues: https://en.wikipedia.org/wiki/Project_Denver
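The caching idea at the heart of a code-morphing layer can be sketched in a few lines of Python (a loose illustration with invented names - the real software scheduled translated regions into VLIW words and optimized across them, where `translate_block` here just bundles ops):

```python
translation_cache = {}  # guest entry address -> translated block

def translate_block(guest_ops, addr):
    """Stand-in for translation: bundle guest ops until a block end.
    A real morpher would schedule these into VLIW words and optimize."""
    block = []
    while addr < len(guest_ops) and guest_ops[addr] != "end":
        block.append(("native", guest_ops[addr]))
        addr += 1
    return block

def execute_at(guest_ops, addr):
    """First execution pays the translation cost; repeats hit the cache."""
    if addr not in translation_cache:
        translation_cache[addr] = translate_block(guest_ops, addr)
    return translation_cache[addr]
```

This is why such designs were slow on first execution of a region and much faster on reuse - the steady-state cost is a dictionary lookup, not a retranslation.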
Code morphing was fascinating. I had no idea Nvidia tried anything similar.
I always felt Transmeta could have carved out a small but sustained niche by offering even less-efficient "morphing" for other architectures, especially discontinued ones. 680x0, SPARC, MIPS, Alpha, PA-RISC... anything the vendors stopped developing hardware (or competitive hardware) for.
So glad someone else also knew about this connection :) Details about Denver are pretty minimal, but this talk at Stanford is one of the most detailed I’ve been able to find for those interested. It’s fascinating stuff with lots of similarities to how Transmeta operated: https://youtu.be/oEuXA0_9feM?si=WXuBDzCXMM4_5YhA
There was a Hot Chips presentation by them that also gave some good details. Unlike the original Transmeta design they first ran code natively and only recompiled the hot spots.
God, I had a manager there - the worst manager I ever had - who was the last one to stay at Transmeta and turn off the lights. Between working there and working at DEC, he could boast that he'd supervised both Dave Cutler and Linus Torvalds.
One time I had to unravel a race condition, and he seemed pissed that it took a few days; when I tried to explain the complexity, he told me his name was on a patent for a system that let several VAXes share a single disk and that he didn't need a lecture.
I interviewed there around 1997 or 98. As part of the interview process, I had the opportunity to have lunch with Linus Torvalds. (I did not get an offer)
To me, what's important about Transmeta is that they brought over some kid developer named Linus Torvalds to the States from Finland. He had invented some hobby operating system. I wonder what ever happened to him. :-)
Didn't Transmeta's technology end up in Apple's PowerPC emulator Rosetta, following the switch to Intel?
IIRC Transmeta's technology came out of HP (?) research into dynamic inlining of compiled code, giving performance comparable to profile-guided optimization without the upfront work. It worked similarly to an inlining JIT compiler, except it was working with already compiled code. Very interesting approach and one I think could be generally useful. Imagine if, say, your machine's bootup process was optimized for the hardware you actually have. I'm going off decades old memories here, so the details might be incorrect.
In the early 1990s, HP had a product called “SoftPC” that was used to emulate x86 on PA-RISC. IIRC, however, this was an OEM product written externally. My recollection of how it worked was similar to what is described in the Dynamo paper. I’m wondering if HP bought the technology and whether Dynamo was a later iteration of it? Essentially, it was a tracing JIT. Regardless, all these ideas ended up morphing into Rosetta (versions 1 and 2), though as I understand it, Rosetta also uses a couple hardware hooks to speed up some cases that would be slow if just performed in software.
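The tracing-JIT scheme described above can be sketched roughly like this (the threshold and names are invented for illustration; Dynamo's actual heuristics were more involved): interpret normally, count how often backward branches hit each loop head, and once a head gets hot, stash a "compiled" trace for it.

```python
HOT_THRESHOLD = 2
counters = {}     # loop-head address -> times a backward branch hit it
trace_cache = {}  # loop-head address -> "compiled" trace

def on_backward_branch(head, recent_ops):
    """Called when a backward branch to `head` retires while interpreting.
    Returns the compiled trace once the head is hot, else None."""
    counters[head] = counters.get(head, 0) + 1
    if counters[head] >= HOT_THRESHOLD and head not in trace_cache:
        # "Compile": a real system would optimize the recorded trace and
        # emit native code; here we just stash the ops as a stand-in.
        trace_cache[head] = list(recent_ops)
    return trace_cache.get(head)  # non-None -> take the fast path
```

The payoff mirrors profile-guided optimization: the hot path is straightened and optimized using actual runtime behavior, with no upfront profiling build.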
I worked at Transmeta. I remember for the launch of one of the Crusoe-powered laptops, there was a bug that prevented the BIOS from booting Linux. Since the laptop was only going to run Windows ME, they didn’t fix it. Of course when Linus got a demo unit to play with, the first thing he did was try to install Linux on it. He let everyone know, and the bug was fixed soon there after.
Back in the day, Linux was less tolerant of incorrect behavior than Windows 9x was, and would crash, terminate a process, or otherwise surface errors at times when Windows 9x would just keep going until the bugs corrupted memory or similar. Having Linus aboard as a technical advisor, someone to whom you could say "hey, the CPU is crashing here, what's the kernel trying to do at that spot?", alone, would probably have been well worth the money to hire him.
Not before becoming the worst sort of patent trolls. "in January 2009, Transmeta sold itself to Novafora, who in turn sold the patent portfolio to Intellectual Ventures". (This was long after Linus had left.)
Well, they ended up being mobile-oriented, but even that didn’t work. They were definitely not server-oriented and they really couldn’t compete at desktop. Honestly, while the tech was interesting, it wasn’t really solving a problem that anyone was struggling with.
> it wasn’t really solving a problem that anyone was struggling with
They did push the envelope on efficiency. My Crusoe-equipped laptop could go six hours on the stock battery (12+ on the extended batteries) back when most laptops struggled to get three.
They probably would have worked well as server processors, because they were pretty energy efficient; they were slow the first time a program was run, but sped up after caching the translation. Most servers run the same software over and over again, so they could have been competitive.
It would have been an extremely difficult time to enter the market though, because at the time Intel was successfully paying server manufacturers to not offer superior competing products.
I used a Fujitsu Lifebook P-2046 laptop at university. It had an 800Mhz Crusoe chip. IIRC it shipped with 256 MB of RAM, which I eventually upgraded to 384.
Somehow I managed to tolerate running Gentoo on it. Compiling X, OpenOffice, or Firefox was a multi-day affair. One thing that annoyed me was that I could never get the graphics card (an ATI Rage 128 with 4 MB RAM, IIRC) working with acceleration under Linux, and that was when compositing window managers were gaining prevalence; I kept trying to get it working in the hope that it would take a bit of the load off of the struggling CPU.
Despite the bad performance, it worked really well for a college student: it was great for taking notes, and the batteries (extended main and optical drive bay) would easily last a full day of classes. It wouldn't run Eclipse very well, but most of my CS assignments were done using a text editor, anyways.
The last non Apple laptop I had was a Fujitsu lifebook with a Transmeta processor in it. I did way too much work trying to get Linux to use every bit of the hardware. Mostly researching what others had done, but also contributed to the ACPI code - half the buttons didn’t work on Linux because the factory default was broken. Windows had its own, but rather than pilfer that, someone pointed out that later lifebooks had fewer issues so I backported fixes from those, and I think invented one of my own by trying things that seemed reasonable.
I also looked at the TM specific flags that they documented, and was surprised to find some that hadn’t been enabled on Linux despite Linus still working there at the time. They looked to be useful for low power mode, and at that time I was looking for a carry-everywhere laptop with decent run time so I invested in those flags.
Turns out they didn't do anything observable to the system. Power draw was unfazed by flipping those toggles. I don't believe those changes ever got merged.
But it was the Linux fuckery that convinced me I wanted a bash shell and a Unix CLI and to just get shit done without having to fiddle all the time. I had better things to do. So I've been on Apple since, except for Pi, Docker, and work.
At the time, just out of undergrad, I ended up working for the remnants of the Number Nine video card company, which had been bought by S3 and was making a last effort at a Linux-based, Transmeta-powered "web-pad" (tablet): the "Frontpath ProGear" (new management wouldn't let them give it a Beatles-related name like #9 equipment used to get).

In any case, due to the unfortunate timing of the dot-com implosion, it never really went anywhere (I wish I had managed to keep one; they used to appear on eBay occasionally).

The one thing I remember is that it was memory limited: it had 64MB, but I think the code-morphing software really wanted 16MB of it, which really cut into the available system memory.
More interestingly, whatever happened to David "Pardo" Keppel, whose PhD thesis and person were somewhat central to Transmeta (at least as far as I remember it)? For someone who was doing a CS PhD in the mid-'90s, he has a vanishingly tiny online footprint. Not sure if that is inspiring or concerning...
Cisco and Cray used IBM fabs for multiple generations in the aughts but they weren't startups. Before the rise of TSMC it was a weird situation where fabless companies were kind of picking up extra capacity from IDMs.
I remember my Compaq TC1000 well, a pen tablet convertible running Windows very sluggishly with a Transmeta Crusoe processor. Nice promise, execution not so much unfortunately.
I liked the Transmeta web page from before they launched. It was just bare HTML with no styling. It said:
This page is not here yet.
The product hype and lack of knowledge about what it was meant that nobody knew what to expect. In these hyped expectations, and with Torvalds on board, everyone expected that everything would be different. But it wasn't.
A similar product launch was the Segway, where we went from this incredible vision of everyone on Segways to nobody wanting one.
The hype was part of the problem with Transmeta. Even in its delivered form it could have found a niche. For example, the network computer was in vogue at the time, thanks to Oracle. A different type of device, like a Chromebook, might have worked.
With Torvalds connected to Transmeta and the stealthy development, we never did get to hear about who was really behind Transmeta and why.
It looks like rather than hiring a designer, they let one of their engineers (or worse, the CEO) design the Transmeta logo. I don’t know what that font is, but it might be even worse than Papyrus.
> A similar product launch was the Segway, where we went from this incredible vision of everyone on Segways to nobody wanting one.
The problem with Segway in Germany was rather the certification for road traffic. Because of the insane red tape involved, the introduction was delayed, and for the same reason nobody thus wanted one.
I recall a coworker being excited several years ago about catching someone lying about their Linux experience before their interview. If what they said was true, they'd have to have been working on it during its first year.
He was then excited after the interview because the individual had been working at transmeta with Linus, and his resume was accurate. He didn't end up working with us, but I wasn't privy to any additional information.
That's different qualitatively and quantitatively from buying patent rights for cheap (since even the original patent holders didn't think they were worth much) and suing random people who happen to use a product that may infringe on the patent.
> To me transmeta is a great example of a venture investment.

It was risky. From my perspective it was more exciting to the programming systems and compiler community than to the computer architecture community.
> They were also the first to produce an x86 CPU with an integrated northbridge, they could have pitched it more at embedded and industrial markets where SPEC scores are less important.

They did! There are many Transmeta-powered thin clients, for example.
Here is an old doc on how the Code Morphing software worked: https://homepage.divms.uiowa.edu/~ghosh/4-18-06.pdf

I think it's correct to say Transmeta did partial software emulation, though the lines get blurry here.

A very similar approach is used in MCST Elbrus CPUs: https://en.wikipedia.org/wiki/Elbrus-8S#Supported_operating_...
One time I had to unravel a race condition and he seemed pissed that it took a few days and when I tried to explain the complexity he told me his name was on a patent for a system that would let several VAXes share a single disk and didn't need a lecture.
>he told me his name was on a patent for a system that would let several VAXes share a single disk
Ha, "We stand on the shoulders of giants"...
“Ah, let’s just put it on a VAX then…”
I interviewed there around 1997 or 98. As part of the interview process, I had the opportunity to have lunch with Linus Torvalds. (I did not get an offer)
To me, what's important about Transmeta is that they brought over some kid developer named Linus Torvalds to the States from Finland. He had invented some hobby operating system. I wonder what ever happened to him. :-)
Didn't Transmeta's technology end up in Apple's PowerPC emulator Rosetta, following the switch to Intel?
IIRC Transmeta's technology came out of HP (?) research into dynamic inlining of compiled code, giving performance comparable to profile-guided optimization without the upfront work. It worked similarly to an inlining JIT compiler, except it was working with already compiled code. Very interesting approach and one I think could be generally useful. Imagine if, say, your machine's bootup process was optimized for the hardware you actually have. I'm going off decades old memories here, so the details might be incorrect.
No, you are confusing Transmeta with Transitive. https://en.wikipedia.org/wiki/QuickTransit
Dynamo <https://www.cse.iitm.ac.in/~krishna/courses/2022/odd-cs6013/...>?
In the early 1990s, HP had a product called “SoftPC” that was used to emulate x86 on PA-RISC. IIRC, however, this was an OEM product written externally. My recollection of how it worked was similar to what is described in the Dynamo paper. I’m wondering if HP bought the technology and whether Dynamo was a later iteration of it? Essentially, it was a tracing JIT. Regardless, all these ideas ended up morphing into Rosetta (versions 1 and 2), though as I understand it, Rosetta also uses a couple hardware hooks to speed up some cases that would be slow if just performed in software.
2 replies →
I remember it being in one of Sony VAIO's product lines called the picturebook, for its small form factor and a swivel webcam.
That was the first laptop I owned ;-) As a frequent traveler it was a very useful device.
A lot ended up in HotSpot for the JVM. I know a number of extremely good engineers whose career path went TransMeta -> Sun -> Google.
All I know about Transmeta is that Linus Torvalds moved over from Finland to the USA to work at this startup.
Other than that, it seems to have sunk without a trace.
I worked at Transmeta. I remember for the launch of one of the Crusoe-powered laptops, there was a bug that prevented the BIOS from booting Linux. Since the laptop was only going to run Windows ME, they didn't fix it. Of course when Linus got a demo unit to play with, the first thing he did was try to install Linux on it. He let everyone know, and the bug was fixed soon thereafter.
Back in the day, Linux was less tolerant of incorrect behavior than Windows 9x was, and would crash, terminate a process, or otherwise surface errors at times when Windows 9x would just keep going until the bugs corrupted memory or similar. Having Linus aboard as a technical advisor, someone to whom you can say "hey, the CPU is crashing here, what's the kernel trying to do at that spot?", alone, probably would have been well worth the money to hire him.
1 reply →
Nice.
Glad you were a part of it at the time?
1 reply →
Not before becoming the worst sort of patent trolls. "in January 2009, Transmeta sold itself to Novafora, who in turn sold the patent portfolio to Intellectual Ventures". (This was long after Linus had left.)
> But they were still a technology company, and if their plans had gone well, they would have sold their product to dotcoms
I'm not sure that that's really correct; they were very desktop-oriented.
Well, they ended up being mobile-oriented, but even that didn’t work. They were definitely not server-oriented and they really couldn’t compete at desktop. Honestly, while the tech was interesting, it wasn’t really solving a problem that anyone was struggling with.
> it wasn’t really solving a problem that anyone was struggling with
They did push the envelope on efficiency. My Crusoe-equipped laptop could go six hours on the stock battery (12+ on the extended batteries) back when most laptops struggled to get three.
They probably would have worked well as server processors, because they were pretty energy efficient. They were slow the first time a program was run, but sped up once the translation was cached. Most servers run the same software over and over again, so they could have been competitive.
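The caching argument comes down to amortization, which is easy to make concrete with back-of-envelope arithmetic (all numbers below are made up for illustration, not measured Crusoe figures):

```python
# Back-of-envelope amortization of one-time translation cost.
# All numbers are hypothetical, purely for illustration.
translate_cost = 500.0   # one-time cost to translate a hot region (us)
native_time = 1.0        # per-execution time of the translated region (us)
interp_time = 5.0        # per-execution time when interpreting (us)

def total_time(runs):
    """Total time: translate-once-then-cache vs. pure interpretation."""
    cached = translate_cost + runs * native_time
    interpreted = runs * interp_time
    return cached, interpreted

# Break-even point: translation pays off once the region runs more than
# translate_cost / (interp_time - native_time) times.
break_even = translate_cost / (interp_time - native_time)  # 125 runs here
```

With these toy numbers the translation pays for itself after 125 executions; a server re-running the same workload for hours crosses that point almost immediately, while a desktop launching fresh code all day may never amortize it, which is the asymmetry the comment is pointing at.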
It would have been an extremely difficult time to enter the market though, because at the time Intel was successfully paying server manufacturers to not offer superior competing products.
I had a pretty slick Toshiba Libretto L1 from Japan at the time - twice as wide as long, with a 1280x600 display.
Its 600 MHz Transmeta Crusoe CPU was pretty slow, unfortunately - like a Celeron 333 MHz, IIRC.
I used a Fujitsu Lifebook P-2046 laptop at university. It had an 800 MHz Crusoe chip. IIRC it shipped with 256 MB of RAM, which I eventually upgraded to 384.
Somehow I managed to tolerate running Gentoo on it. Compiling X, OpenOffice, or Firefox were multi-day affairs. One thing that annoyed me was I could never get the graphics card (an ATI Rage 128 with 4 MB RAM, IIRC) working with acceleration under Linux, and that was when compositing window managers were gaining prevalence; I kept trying to get it working in the hope that it would take a bit of the load off of the struggling CPU.
Despite the bad performance, it worked really well for a college student: it was great for taking notes, and the batteries (extended main and optical drive bay) would easily last a full day of classes. It wouldn't run Eclipse very well, but most of my CS assignments were done using a text editor, anyways.
Just looked up their investments. Those were the quaint days when an $88 million investment was a lot of money.
The last non Apple laptop I had was a Fujitsu lifebook with a Transmeta processor in it. I did way too much work trying to get Linux to use every bit of the hardware. Mostly researching what others had done, but also contributed to the ACPI code - half the buttons didn’t work on Linux because the factory default was broken. Windows had its own, but rather than pilfer that, someone pointed out that later lifebooks had fewer issues so I backported fixes from those, and I think invented one of my own by trying things that seemed reasonable.
I also looked at the TM specific flags that they documented, and was surprised to find some that hadn’t been enabled on Linux despite Linus still working there at the time. They looked to be useful for low power mode, and at that time I was looking for a carry-everywhere laptop with decent run time so I invested in those flags.
Turns out they didn’t do anything observable to the system. Power draw was unphased by flipping these toggles. I don’t believe those changes ever got merged.
But it was the Linux fuckery that convinced me I wanted a bash shell and a Unix CLI and to just get shit done without having to fiddle all the time. I had better things to do. So I've been on Apple since, except for Pi, Docker, and work.
at the time, just out of undergrad, I ended up working for the remnants of the #9 Video Card company that had been bought by S3 and was making a last effort at a Linux-based Transmeta-powered "web-pad" (tablet): the "Frontpath ProGear" (new management wouldn't let them give it a Beatles-related name like #9 equipment used to get)
in any case due to the unfortunate timing of the dot-com implosion it never really went anywhere (I wish I had managed to keep one, they used to appear on ebay occasionally)
the one thing I remember is that it was memory limited: it had 64MB, but I think the code-morphing software wanted 16MB of it, which really cut into the available system memory
More interestingly, whatever happened to David "Pardo" Keppel, whose PhD thesis and person were somewhat central to Transmeta (at least as far as I remember it)? For someone who was doing a CS PhD in the mid 90s, he has a vanishingly tiny online footprint. Not sure if that is inspiring or concerning ...
Hopefully he is living his best life and doing what he enjoys.
> so IBM handled manufacturing of its first-generation CPUs.
I'm curious: Is there a consensus on which startup companies achieved success using IBM as a fab? or if not a consensus, I'd settle for anecdotes too.
My own company (which built 40G optical transponders) used them back in that era. While the tech was first rate, the pricing was something to behold.
Cisco and Cray used IBM fabs for multiple generations in the aughts but they weren't startups. Before the rise of TSMC it was a weird situation where fabless companies were kind of picking up extra capacity from IDMs.
I don't know about startups, but the Cell processor in the PS3 and the Xenon processor in the Xbox 360 were both fabricated by IBM.
The Nintendo GameCube and Wii also had IBM CPUs.
> the pricing was something to behold
I guess you mean that not in a good way?
I'd imagine so. IBM are many things (some of them brilliant), but I don't think anyone's ever accused them of being cheap.
I remember my Compaq TC1000 well, a pen tablet convertible running Windows very sluggishly with a Transmeta Crusoe processor. Nice promise, execution not so much unfortunately.
I liked the Transmeta web page from before they launched. It was just bare HTML with no styling. It said:
The product hype and lack of knowledge about what it was meant that nobody knew what to expect. In these hyped expectations, and with Torvalds on board, everyone expected that everything would be different. But it wasn't.
A similar product launch was the Segway, where we went from this incredible vision of everyone on Segways to nobody wanting one.
The hype was part of the problem with Transmeta. Even in its delivered form it could have found a niche. For example, the network computer was in vogue at the time, thanks to Oracle. A different type of device, like a Chromebook, might have worked.
With Torvalds connected to Transmeta and the stealthy development, we never did get to hear about who was really behind Transmeta and why.
https://web.archive.org/web/19970710102251/http://www.transm...
Then, https://web.archive.org/web/20000229173916/http://www.transm... , when content appeared around Feb 2000.
Product launch PDF from Jan 19, 2000: https://web.archive.org/web/20000815231116/http://www.transm...
It looks like rather than hiring a designer, they let one of their engineers (or worse, the CEO) design the Transmeta logo. I don’t know what that font is, but it might be even worse than Papyrus.
Thanks for that, I was almost right - This web page is not here yet.
I still use this as important placeholder text, not that anyone outside HN would get the reference.
> I liked the Transmeta web page from before they launched. It was just bare HTML with no styling. It said: > > This page is not here yet.
I remember that fondly.
If you did view source there was a comment that said something like:
No, there are no hidden messages in the source code, either.
It also said in an html comment,
which at the time I took as some inside joke.
> A similar product launch was the Segway, where we went from this incredible vision of everyone on Segways to nobody wanting one.
The problem with the Segway in Germany was rather the certification for road traffic. Because of the insane red tape involved, the introduction was delayed, and for the same reason nobody wanted one.
The children of Segway are still here!
Electric unicycles and Onewheels!
And they're really fun!
2 replies →
I recall a coworker being excited several years ago about catching someone lying about their Linux experience before their interview. If what they said was true, they'd have to have been working on it during its first year.
He was then excited after the interview because the individual had been working at Transmeta with Linus, and his resume was accurate. He didn't end up working with us, but I wasn't privy to any additional information.
[dead]
TL;DR:
…they became a patent troll
Nonsense. They are licensing IP they created.
That's different qualitatively and quantitatively from buying patent rights for cheap (since even the original patent holders didn't think they were worth much) and suing random people who happen to use a product that may infringe on the patent.