Comment by tom_mellior

5 years ago

Ah, sorry, I should probably clarify that there are two completely distinct uses of the term "interval" in data flow analysis.

One (introduced in https://amturing.acm.org/p137-allen.pdf and discussed in the old Dragon Book) groups program points into nested "intervals", i.e., intervals are pieces of code. Analysis information is then propagated among these intervals. This is what I was referring to above. As far as I know this approach is completely obsolete by now, with everyone using methods based more or less indirectly on Kildall's worklist iteration approach (http://static.aminer.org/pdf/PDF/000/546/451/a_unified_appro...). The theoretical advantage I was referring to above was that the interval approach is AFAIR sometimes more efficient because it can take program structure (loop nesting) into account more directly.

The other meaning of "intervals" is used for an analysis domain, i.e., the kind of data you want your analysis to compute. For example, you might determine that the possible values of some variable x are in the interval [10, 20], the values of y are in the interval [5, 15], and then you can compute that the values of z = x + y must be in [15, 35]. There is nothing wrong with this approach, and any optimizing compiler is pretty much forced to use something like this. Based on your reference to modular congruences I think you probably want to do analysis of numerical values and meant this sense, which is anyway the only one in common use nowadays. Go ahead and do this. Extensions are with congruences, like "x is in [10, 20] and also congruent to 1 modulo 4" and/or "known bits", like "x is in [10, 20], and I know that (counting from the LSB) its bit 2 is 0 and bit 3 is 1".

Personally I learned about data flow analysis in a university course based on Principles of Program Analysis: https://www.springer.com/gp/book/9783540654100. It's a book that goes into a lot of mathematical depth but not much breadth of analyses, and it doesn't handle data flow analysis on SSA form. Then I learned additional details from various well-known papers like the ones linked above, and the classical early papers around analysis on SSA form, like https://www.cse.wustl.edu/~cytron/531Pages/f11/Resources/Pap.... I'm sure there must be a more straightforward text to get you there, but I don't think I can give a definite recommendation. I have vague memories of the coverage in Muchnick's "Advanced Compiler Design and Implementation" and/or Cooper and Torczon's "Engineering a Compiler" being reasonable, but I can't find my copies right now.