← Back to context

Comment by hkopp

5 years ago

Cool. I have a background in IT security and am sometimes doing symbolic (or concolic) execution on binaries instead of source code. So I may not be the target audience.

Nevertheless I think you should do a comprehensive list what your tool can detect and what not, instead of some examples. From the sample programs i learned that it can check two programs for semantical equivalence and can detect undefined behavior. I did not get the example with the memory violation, as symbolica.h is not included in the code. Additionally, your tool could create test cases that trigger the error, afaik.

Regarding the feedback for the user, as multiple others have mentioned, the running time is very long and thus a simple timer may not be sufficient. But I do not have an idea how to improve that.

I am not sure if you are working on the target program directly, or on the compiled binary. In the example with the division by zero, there is a print statement in order to stop the compiler from removing dead code. But I do not understand where you need a compiler, if you are working directly on C. On the other hand, when you are working on the compiled binary, then your tool is very similar to other symbolic execution engines, such as angr.

Hey, thanks for the feedback. A more comprehensive list of features is on our todo list, we’ll update the site shortly.

The memory violation one doesn’t include symbolica.h because we don’t need to symbolize any variables for that one. When we run the compiled code through our solver it adds additional constraints such as, “don’t access memory that’s out of bounds”, so we can detect that automatically.

On your final point, we’re working on the LLVM bitcode so we do have to compile the code, hence the print statement. You’re right in that this is similar to other symbolic executors. We built this one as we found others were quite difficult to get started with and were hard to extend with the features we personally wanted as well as not supporting the languages we wanted either. Obviously at this point in time we’ve not achieved these goals yet, but we think we’ve built the foundations to be able to do so.

  • Thank you for the answer and a big kudos for the project.

    Regarding the memory violation, could symbolica deal with symbolic memory? Can it deal with symbolic files as input? How about syscalls with symbolic inputs? These are the main problems I had when I worked on my toy symbolic execution engine. If yes, you should definitely market these features.

    • Yeah it can deal with symbolic memory. As long as the memory allocation is fixed size the contents can be fully or partially symbolic. We also support symbolic addresses for allocations to ensure that pointer arithmetic is fully tested.

      We're currently simulating a lot of the underlying system at the C std library level. For a number of reasons we'd like to lower this to the raw syscall and assembly level. This would allow any lib C implementation to be tested along with the application code, and we may even be able to simulate threading and the file system. Syscalls could be made symbolic along with files by treating the entire system symbolically, but obviously this is a lot of work so it's something that we're gradually building towards.