Comment by saagarjha

1 day ago

No, I think Python's struct module is also really bad. My point is: if you are making a new DSL for laying out arbitrary formats, why not do something better than what we have?

Author here, this is a valid point but there are also valid reasons to choose C structures. The larger framework that this is a part of is primarily targeted towards people working in cybersecurity, not software engineers. Cybersecurity people are very often not great software engineers and there is a high throughput of “throwaway” scripts, or “make a quick hacky change”. C is commonly already well understood, a bespoke DSL usually is not and requires a learning step. You can “hit the ground running”, so to say.

And, as a bonus, creating, say, a filesystem implementation is now often as easy as copy/pasting existing C structure definitions, either from the original source (which is usually C) or from reversing tools such as IDA/Ghidra.

There’s no right or wrong way in my opinion, just preferences.

I would assume dissect.cstruct was written for interop with C programs using C structs, or to use formats documented as C structs. Not as a greenfield tool for arbitrary formats.

C structs seem less bad than Python structs, so why not use them? In particular, why write a struct parser and create a DSL for it when there's already one you can use, based on a well-known DSL you might already understand?
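For concreteness, here is a minimal sketch of what the stdlib `struct` approach looks like. The 16-byte header layout and the field names are made up for illustration and do not come from any real format. The format string carries the sizes, while the field names only exist in the unpacking assignment, which is arguably the part people find hard to read.

```python
import struct

# Hypothetical header, equivalent to the C declaration:
#   struct header { char magic[4]; uint32_t version; uint64_t length; };
# "<" selects little-endian with no padding; 4s = 4 bytes, I = u32, Q = u64.
HEADER = struct.Struct("<4sIQ")

# Build some sample bytes so the example is self-contained.
data = b"DEMO" + (1).to_bytes(4, "little") + (4096).to_bytes(8, "little")

# Field names are positional: they must match the format string by hand.
magic, version, length = HEADER.unpack(data)
print(magic, version, length)
```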

OK, so what's your alternative then? It's easy to say you don't like something, but the onus is on you to show there's something actually better.

The library used in the author's post seems perfectly readable to me, enough that it didn't even register until I read your comment. Could it be tweaked slightly to not use C syntax? Sure, but it's still going to need roughly the same pattern of identifier + type (including size). Types in C are straightforward so long as you don't have functions/pointers (which have the "inside out" problem, but they're not needed for binary encodings), so you're going to be looking at pretty trivial changes to syntax. Certainly not enough to warrant this level of quibbling.

  • idk just spitballing I would maybe do something like

      from parser import struct, packed, array, u8, u32, u64
      
      @struct(packed)
      class ASIF:
          magic: array[u8, 4]
          field4: u32
          field8: u32
          fieldC: u32
          field10: u64
          field18: u64
          field20: array[u8, 16]
          field30: u64
          field38: u64
          field40: u32
          field44: u32
          field48: u32
          field4C: u32
      
      asif = ASIF.from_bytes(...)
      print(asif.fieldC)

    • I'll admit I do really like that.

      I still think it proves my point: your original objection was about the syntax being C-like and, as I predicted, the differences in syntax in your idea (where the type goes, colon vs positional, etc.) are all trivialities that don't affect usability.

      What's better about your idea is that it's actual Python code rather than being embedded in a string. Maybe that was your point originally and I misunderstood.

      Looks like this package works like this: https://harrymander.xyz/dataclasses-struct/