
Comment by affyboi

4 years ago

Hey guys, author of the project here. I'll admit I was a bit surprised to see this on the front page. I want to say that this is very much alpha software (I very deliberately haven't let it hit 1.0 yet), so it's definitely rough around the edges. I've been very busy with work recently, so there's been a lull in my open source contributions, but I hope that I can get back on track soon (TM).

Right now I have plans to implement the Myers diff algorithm to make diffsitter more efficient (profiling shows that the current biggest bottleneck is memory allocations), and after that I want to start experimenting with different heuristics to make the diffs themselves more useful. Support for the unified diff format is also in my plans.
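For anyone unfamiliar with the Myers algorithm mentioned above, here is a minimal Python sketch of its core loop. It is only an illustration, not diffsitter's code: the function name `myers_edit_distance` is made up for this example, and it computes just the length of the shortest edit script (insertions plus deletions) rather than the script itself.

```python
def myers_edit_distance(a, b):
    """Length of the shortest edit script (insertions + deletions) turning a into b.

    Illustrative sketch of the greedy O(ND) search; works on any two sequences
    whose elements can be compared with ==.
    """
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]       # move down: take an element from b (insertion)
            else:
                x = v[k - 1] + 1   # move right: drop an element from a (deletion)
            y = x - k
            # Follow the "snake": matching elements cost nothing.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d
```

The inputs can be strings, lists of lines, or lists of AST leaf tokens. Only the furthest point per diagonal is stored, which keeps this version at linear space. Recovering the actual edit script needs either a saved copy of `v` per round or the linear-space divide-and-conquer variant, and neither is shown here.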

It's nice to see people discussing the project, and I'm very flattered that you guys have taken the time to look at it. Please feel free to open issues with features that you want to see or any shortcomings you've found.

The backstory for this project is that I once had to review a diff in a golang codebase that realigned a big struct because someone added a new field, and I was slightly annoyed that I was seeing all these extra lines changed even though I really only cared about the fields that were added. So I did a little weekend project to see if it was possible to do a diff directly on the AST instead, and I've been working on this in small pieces ever since.

cool idea!

have you thought about using the ast representation to regenerate "canonicalized" versions of the source files upon which you can then represent the diffs (and context)?

  • Sorry, I'm not sure I quite follow what you mean by canonical. Do you mean something like applying a standard style?

    • yeah, but more aggressive. for both the left and right side, parse to an ast and then go back to text, but do things like enforce orderings on things where orderings don't matter and strip out comments, and force whitespace to be clearly defined... so you have a "just the facts" view into what changed that plays nice with existing diff tools.

      (i'm not suggesting this be committed, but rather it be a view)
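To make the "canonicalized view" idea concrete, here is a rough Python sketch. It assumes Python source and Python 3.9+ (for `ast.unparse`), and the names `canonicalize` and `canonical_diff` are invented for this example. It only covers the comment-stripping and whitespace-normalizing parts of the suggestion; reordering order-insensitive constructs is not attempted.

```python
import ast
import difflib


def canonicalize(source: str) -> str:
    """Parse Python source and regenerate text from the AST.

    Comments and the original spacing are not stored in the AST, so they
    disappear in the regenerated text. Docstrings are kept, since they are
    ordinary expression nodes.
    """
    return ast.unparse(ast.parse(source)) + "\n"


def canonical_diff(left: str, right: str) -> list[str]:
    """Unified diff between the canonicalized forms of two sources."""
    left_lines = canonicalize(left).splitlines(keepends=True)
    right_lines = canonicalize(right).splitlines(keepends=True)
    return list(difflib.unified_diff(left_lines, right_lines, "left", "right"))


if __name__ == "__main__":
    old = "x = 1  # old comment\ny   =   2\n"
    new = "x = 1\ny = 3  # value changed\n"
    print("".join(canonical_diff(old, new)))
```

Since the output is ordinary text, it can also be written to temporary files and fed to any existing diff tool. The obvious trade-off is that line numbers in the canonical view no longer match the real files.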