Comment by hansvm

1 hour ago

My initial implementation used [9][9]u9 (which desugars to [9][9]u16 with some zero bits) and was a fair bit slower. If I had to guess, it's because the shift/extract/align you're describing isn't actually part of the core solving algorithm, and when you have box constraints, knight's-move constraints, etc., you're usually not doing anything that fits in a single u16.
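To make the last point concrete, here is a rough sketch, in Rust rather than Zig, of what those constraint checks could look like. It assumes each cell holds a 9-bit candidate mask stored in a `u16`; that layout, and the names `Mask`, `Grid`, `box_union`, and `knight_union`, are hypothetical and not taken from the actual solver.

```rust
/// Hypothetical layout: one candidate mask per cell, where bit d-1 set
/// means digit d is still possible. Only the low 9 bits are used; the top
/// 7 bits stay zero, mirroring a u9 padded out to 16 bits of storage.
type Mask = u16;
type Grid = [[Mask; 9]; 9];

/// Union of the candidate masks in the 3x3 box containing (row, col).
/// The nine cells span three different rows of the grid.
fn box_union(grid: &Grid, row: usize, col: usize) -> Mask {
    let (r0, c0) = (row / 3 * 3, col / 3 * 3);
    let mut acc: Mask = 0;
    for r in r0..r0 + 3 {
        for c in c0..c0 + 3 {
            acc |= grid[r][c];
        }
    }
    acc
}

/// Union of the candidate masks of every cell a knight's move away
/// from (row, col), skipping jumps that land off the board.
fn knight_union(grid: &Grid, row: usize, col: usize) -> Mask {
    const JUMPS: [(i32, i32); 8] = [
        (-2, -1), (-2, 1), (-1, -2), (-1, 2),
        (1, -2), (1, 2), (2, -1), (2, 1),
    ];
    let mut acc: Mask = 0;
    for (dr, dc) in JUMPS {
        let (r, c) = (row as i32 + dr, col as i32 + dc);
        if (0..9).contains(&r) && (0..9).contains(&c) {
            acc |= grid[r as usize][c as usize];
        }
    }
    acc
}
```

Both helpers gather masks from cells scattered across several rows, so the work is a series of per-cell loads and ORs rather than a shift or extract on one packed word.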