The Hidden Power of BCD Instructions

11 years ago (hugi.scene.org)

AAD is basically an 8-bit immediate multiply-accumulate, so it could be said that the 8088 had a MAC instruction - with self-modifying code you could do a real 8-bit multiply-accumulate. ;-)
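To make the multiply-accumulate reading concrete, here is a minimal Python model of what AAD does to AH and AL. The function name `aad` is made up for illustration, and the flag updates (SF, ZF, PF) are left out; treat it as a sketch of the documented behaviour, not a cycle-accurate emulator.

```python
def aad(ah, al, imm8=10):
    """Rough model of x86 AAD: AL <- (AL + AH * imm8) mod 256, AH <- 0.

    Returns the new (AH, AL) pair. Flags are not modelled.
    """
    al = (al + ah * imm8) & 0xFF
    return 0, al


# Default encoding (imm8 = 10): unpacked BCD digits 4 and 2 become binary 42.
print(aad(4, 2))        # (0, 42)

# Patching the immediate byte gives a general 8-bit multiply-accumulate:
# 3 * 16 + 7 = 55.
print(aad(3, 7, 16))    # (0, 55)
```

The "self-modifying code" remark refers to the fact that the multiplier is an immediate byte in the instruction stream, so changing it at run time means overwriting that byte.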

There's also this 5-byte sequence to convert a nybble (0-F) in AL into the appropriate ASCII hex digit (30h-39h, 41h-46h):

    cmp al, 10
    sbb al, 69h
    das
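For anyone wondering why that works, the sequence can be traced with a small Python model of the three instructions. The helper names (`cmp8`, `sbb8`, `das`, `nibble_to_hex`) are invented for this sketch, and only the flags the trick depends on (CF and AF) are tracked; the DAS logic follows the pseudocode in Intel's manual as I understand it.

```python
def cmp8(a, b):
    """CMP: return CF (the borrow), the only flag the next SBB consumes."""
    return a < b


def sbb8(a, b, cf):
    """SBB: return (result, CF, AF) for a - b - cf on 8-bit values."""
    sub = b + cf
    res = (a - sub) & 0xFF
    cf_out = a < sub                              # borrow out of bit 7
    af_out = (a & 0x0F) < (b & 0x0F) + cf         # borrow out of bit 3
    return res, cf_out, af_out


def das(al, cf, af):
    """DAS: decimal-adjust AL after subtraction (resulting flags ignored)."""
    old_al, old_cf = al, cf
    if (al & 0x0F) > 9 or af:
        al = (al - 6) & 0xFF
    if old_al > 0x99 or old_cf:
        al = (al - 0x60) & 0xFF
    return al


def nibble_to_hex(n):
    """Model of: cmp al, 10 / sbb al, 69h / das."""
    cf = cmp8(n, 10)                  # CF = 1 for 0..9, 0 for 10..15
    al, cf, af = sbb8(n, 0x69, cf)    # 0..9 -> 96h..9Fh, 10..15 -> A1h..A6h
    return das(al, cf, af)            # -66h for digits, -60h for letters


print("".join(chr(nibble_to_hex(n)) for n in range(16)))  # 0123456789ABCDEF
```

The key is that SBB leaves AF set only for 0-9, so DAS subtracts 66h for digits (96h..9Fh becomes 30h..39h) but only 60h for letters (A1h..A6h becomes 41h..46h).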

x86 code can be insanely dense - check out the 256-byte and below categories in the demoscene, for example.

This is good stuff. If anybody wants to dig deeper into articles like this, I have to mention the Hugi Coding Digest[1] (an executable "diskmag") from 2003, which contains all the programming-related articles from Hugi #11 to Hugi #27, including this one.

The topics of the articles are as follows: "Mathematics & Theoretical Computer Science", "General Programming Techniques", "Searching & Sorting", "Object-Orientated Programming", "File Formats", "Text Processing", "2D Graphics Programming", "3D Graphics Programming", "Windows Graphics Programming (GDI, DirectDraw, Direct3D)", "OpenGL", "Sound Programming", "Synchronization & Scripting for Demos", "Hardware-centered Programming", "Code Optimization, FPU", "Data Compression", "64k, 4k and even smaller intros", "Windows", "Linux", "Other Non-Wintel platforms", "Active Server Pages", "ActiveX", "Assembler", "C++", "Flash", "Java", "JavaScript", "PHP", "Other Programming Languages", "Miscellaneous".

Hell, it also has some nice tracker music in the background.

Obviously the format is a bit cumbersome -- but I think it's a good dive into the demoscene culture. Also most of the articles are written by hobbyists -- the real young hackers (oh and a few crackers too!) who just want to share what they have learned.

I think it should run natively on Windows, and it runs on Linux via Wine. Just launch hugicode.exe -- of course with appropriate security caution, and if you trust me, Hugi and scene.org to have no malicious intent. :)

Why is the hacking culture like this dead? It was still fairly alive just 10 years ago, never mind 15 or 20 years ago. Even after so many years, it still saddens me to look back at gems like this Hugi Special Digest from a decade ago and see them forgotten and gone. Not just the contents or the release itself, but the computing culture which has died along with the demoscene.

[1]: https://www.scene.org/file.php?file=/mags/hugi/hugise01.zip&...

  • Why is the hacking culture like this dead?

    The demoscene is very much alive; if you look at places like pouet.net, there are plenty of new demos being released, even in the sub-1k categories. The newest ones there are from this month. However, you might be correct to say that it's become less known amongst general computer users and programmers, and I think the consumer-oriented nature of computers today (especially mobile devices) is mostly to blame: users are restrained and actively discouraged from tinkering with their machines, software- and hardware-wise, and isolated from knowledge by many layers of abstraction and complexity. There's a big movement against users sharing executables with each other and running them, and while the security concerns are real, I think it's also had a chilling effect on hobbyists. The fact that antimalware software tends to flag packed demos as suspicious/infected (false positives) doesn't help either. In addition, many people probably found their way into the demoscene via the warez scene that it grew from, and with the growing antipiracy concerns, that route is becoming narrower too.

    While I don't think the demoscene is currently "dead" per se, it's certainly at risk of becoming even more of an obscure and fringe culture than it is now.

    • I agree with this. I think the fraction of people who look at computers in this deep way is similar to what it has always been, but it is still a small fraction. And as such, its activities are swamped in the noise of other things with the same name.

      Perhaps part of the difference is that before (when RAM/CPU was expensive/slow) you were forced to do this to make something impressive and now we have an excess of compute and RAM. So to rekindle that challenge we set an artificial limit.

This really brings back memories. 25 years ago I used (what was then) the undocumented variation in the itoa() routine for the Borland C run-time library. The purpose was to eliminate the need for a 16-byte table to generate hex codes when base-16 output was desired. itoa() was a part of the printf() library so this table became embedded in virtually every executable. Knocking that out was a meaningful size optimization in those days.

Note that all of the decimal arithmetic instructions are invalid in 64-bit mode.

They had to scavenge opcode space from somewhere, and the BCD instructions were deemed unnecessary.

  • Unfortunately they (AMD) didn't reassign those opcodes for some other purpose - they've just become completely invalid.

    Instead, a whole row of useful general-purpose increment and decrement instructions was replaced by 16 REX prefixes. A bit odd if you consider that the number of BCD and segment prefix opcodes they made invalid would've been more than enough to be assigned to the new REXes, and still maintain a consistent encoding...

Since both AAA (opcode 0x37) and AAS (opcode 0x3F) are 1 byte long, and a sequential combination of them won't change the program semantics (as long as nothing depends on AL, AH or the flags at that point)...another bonus is that the ASCII characters for 0x37 and 0x3F ('7' and '?') are printable...

This would make them a good replacement for a sequence of NOPs...
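One caveat worth checking before using opcodes 0x37 and 0x3F as padding: per Intel's manual they decode as AAA and AAS, and both write AL, AH and the AF/CF flags. The Python sketch below models that, following the manual's pseudocode as I remember it (the original 8086 handles the carry between AL and AH slightly differently). The function names are invented, and CF is not tracked separately because it always ends up equal to AF.

```python
def aaa(ax, af):
    """Model of AAA on a 16-bit AX; returns (AX, AF). CF mirrors AF."""
    if (ax & 0x0F) > 9 or af:
        ax = (ax + 0x106) & 0xFFFF    # AL += 6, AH += 1 (with carry)
        af = True
    else:
        af = False
    return ax & 0xFF0F, af            # high nibble of AL is always cleared


def aas(ax, af):
    """Model of AAS on a 16-bit AX; returns (AX, AF). CF mirrors AF."""
    if (ax & 0x0F) > 9 or af:
        ax = (ax - 6) & 0xFFFF        # AX -= 6
        ax = (ax - 0x100) & 0xFFFF    # AH -= 1
        af = True
    else:
        af = False
    return ax & 0xFF0F, af


def aaa_then_aas(ax, af=False):
    """AX after executing the byte pair 0x37 0x3F."""
    ax, af = aaa(ax, af)
    ax, af = aas(ax, af)
    return ax


print(hex(aaa_then_aas(0x0005)))  # 0x5    (unchanged)
print(hex(aaa_then_aas(0x1234)))  # 0x1204 (high nibble of AL lost)
```

So the pair only behaves like a NOP when AL already holds a value from 0 to 9 and AF is clear, or when AX and the flags are dead at that point.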

Anyone want to benchmark to see if these short sequences are faster?

  • They're slower, but things like this can be really useful if you're golfing for size. They're available on some other chips like the 6502, as well - although how they actually work varies! (And is mostly undocumented, too - anything that wants to use the Z flag after an ADC, for example.)

    You can also abuse them as part of things like base change routines sometimes… which is sort of what they're intended for.

    • The 6502 and variants (65c02, 65816, emulators, etc) have slightly different algorithms for BCD math. Adding illegal BCD numbers and checking the result is one way to identify the processor.

  • The author himself states that they are slow. I reckon that these instructions are translated into a whole slew of micro-ops. Rarely used instructions like this are often not well optimized.