Add a new delegate to allow API tracing #505
…vironment (and later TraceInterpolatingProverEnvironment)
I started working on the delta-debugger today and wrote a Python script to reduce the size of the traces. So far it does little more than some dead-code elimination, but that's already enough to bring down the size of the trace by a factor of ten. I believe that another factor of two should be possible with some aggressive optimization. The issue now is that I don't quite know where to put such a script in JavaSMT. We could handle this as a separate project, or maybe include it in the JavaSMT source tree, similar to the @baierd, @kfriedberger: What is your opinion? Here is the file in question:

#!/usr/bin/env python3
import re
import sys
from collections import defaultdict
from pathlib import Path


# Read a trace file
def readTrace(path):
    with open(path) as file:
        return [line.rstrip() for line in file]


# Build a map with line numbers for all variable definitions
def getLinesForDefinitions(trace):
    lineNumber = 1
    lineDefs = dict()
    for line in trace:
        if line.find('=') >= 0:
            leftSide = line[0:(line.find('=') - 1)]
            name = re.match(r'var (.*)', leftSide)
            lineDefs[name.group(1)] = lineNumber
        lineNumber = lineNumber + 1
    return lineDefs


# Build a dependency graph for the definitions
# Maps from variables to the places where they are used
def buildDependencies(lineDefs, trace):
    lineNumber = 1
    deps = defaultdict(list)
    for line in trace:
        expr = line[(line.find('=') + 2):] if line.find('=') >= 0 else line
        # The object the method is called on ("receiver" avoids shadowing the builtin "object")
        receiver = expr[0:expr.find('.')]
        if receiver[0].islower():
            deps[lineDefs[receiver]].append(lineNumber)
        # FIXME Parse the expression to get the variables
        for m in re.finditer(r'(config|logger|notifier|var[0-9]+)', expr):
            deps[lineDefs[m.group()]].append(lineNumber)
        lineNumber += 1
    return deps


# Collect all top-level statements
# Top-level statements are:
# *.addConstraint(*)
# *.isUnsat()
# *.getModel()
# *.asList()
# FIXME Finish this list
def usedTopLevel(lineDefs, trace):
    tl = set()
    for line in trace:
        # The dot before the method name is escaped so that it only matches a literal '.'
        m = re.fullmatch(
            r'var (var[0-9]+) = (var[0-9]+)\.(isUnsat\(\)|getModel\(\)|asList\(\)|addConstraint\((var[0-9]+)\));',
            line)
        if m is not None:
            tl.add(lineDefs[m.group(1)])
    return tl


# Calculate the closure of all used definitions, starting with the top-level statements
def usedClosure(tl, deps):
    cl = set()
    st = set(tl)
    while cl.union(st) != cl:
        cl = cl.union(st)
        st = set()
        for (key, val) in deps.items():
            if set(val).intersection(cl) != set():
                st.add(key)
    return cl


# Keep only statements and definitions that are used
def filterUnused(used, trace):
    lineNumber = 1
    reduced = []
    for line in trace:
        if line.find('=') == -1 or lineNumber in used:
            reduced.append(line)
        lineNumber += 1
    return reduced


# Remove all definitions that are not used (recursively)
def removeDeadCode(trace):
    lineDefs = getLinesForDefinitions(trace)
    deps = buildDependencies(lineDefs, trace)
    tl = usedTopLevel(lineDefs, trace)
    cl = usedClosure(tl, deps)
    return filterUnused(cl, trace)


# We'll use multiple passes to reduce the size of the trace:
# 1. Read the trace
# 2. Remove unused code
# 3. Remove unnecessary toplevel commands
# 4. Loop: Remove aliasing (by duplicating the definitions)
# 5. Loop: Reduce terms
# 6. Remove unused prover environments
if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('Expecting a path to a trace file as argument')
        sys.exit(-1)
    path = Path(sys.argv[1])
    if not path.is_file():
        print(f'Could not find file "{path}"')
        sys.exit(-1)
    # TODO Implement steps 3-6
    # TODO Check that the reduced trace still crashes
    trace = readTrace(path)
    for line in removeDeadCode(trace):
        print(line)

The idea is to run JavaSMT with
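To make the dead-code elimination step concrete, here is a minimal, self-contained sketch on a toy trace. It is not the script above: it uses a single backward pass instead of the dependency map and closure, and the trace lines, the `mgr` receiver and its method names are made up for illustration. It also assumes that every definition appears before its first use.

```python
import re

# A definition has the shape "var <name> = <expression>;"
DEFINITION = re.compile(r"var (\w+) = (.*);")
# Calls that are always kept, mirroring the top-level list in the script above
TOP_LEVEL = re.compile(r"\w+\.(addConstraint|isUnsat|getModel|asList)\(")
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def remove_dead_code(trace):
    """Drop definitions that no kept line depends on (single backward pass)."""
    live = set()  # identifiers mentioned by lines we decided to keep
    kept = []
    for line in reversed(trace):
        m = DEFINITION.fullmatch(line)
        if m is None:
            # Not a definition: keep it and everything it mentions
            kept.append(line)
            live.update(IDENTIFIER.findall(line))
            continue
        name, expr = m.groups()
        if name in live or TOP_LEVEL.match(expr):
            kept.append(line)
            # Over-approximation: method names are added too, which is harmless here
            live.update(IDENTIFIER.findall(expr))
    kept.reverse()
    return kept


# Hypothetical trace; var2 and var4 never reach a top-level call
TRACE = [
    "var var0 = mgr.newProver();",
    "var var1 = mgr.makeVariable();",
    "var var2 = mgr.makeVariable();",
    "var var3 = mgr.negate(var1);",
    "var var4 = mgr.negate(var2);",
    "var var5 = var0.addConstraint(var3);",
]

for line in remove_dead_code(TRACE):
    print(line)
```

The pass keeps the definitions of `var0`, `var1`, `var3` and `var5` and drops the two lines that only feed the unused `var4`.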
…ct that is not tracked
…stitute to add the new terms to the cache
Trace: Rebuild formulas in mgr.applyTactic and mgr.simplify to add the terms to the cache
JavaSMT throws an exception only after these declarations were already added to the trace. We simply ignore these symbols as they are never used
The script translates all traces from the JavaSMT tests and then passes them to the solver
How to use:
ant tests
cd scripts
./tracingTest.sh cvc5 --incremental --fp-exp
Since Smtlib 2.7 the standard says the name should be int_to_bv, but this is not recognized by the MathSAT parser
I've added a new script that converts the traces from the tests to Smtlib and then tries them out on a solver. Here is how it can be used:
@Parameters(name = "{0}")
public static Solvers[] getAllSolvers() {
  return new Solvers[] {Solvers.CVC5};
}
The last step requires the cvc5 binaries to be installed on the system. Alternatively, you may download the files from GitHub and then call the script with a path to the cvc5 binary:
Thank you very much for your hard work! To answer your previous question about the script and address the current state: ideally we want everything to be integrated into JavaSMT in a way that we can just set one (or multiple) options and get the result of all of this in a single file. Could you make a full list of the steps that execute either a new script or JavaSMT, and what would be needed to move each step into JavaSMT (per point in the list)? E.g. 1. execute script 1 (needed because ...), 2. execute JavaSMT from script 1 (to...)...
Thanks!
We could have the tracer output Smtlib directly. However, collecting the trace and reducing the trace will always have to be done in two separate steps, as the original run might segfault. After that there just isn't any (safe) way to recover and start with the reduction step. The current workflow therefore looks like this:
As mentioned, we could get rid of the second step by outputting Smtlib directly. The downside is that this limits our options if some of the trace can't be expressed in Smtlib. It's hard to tell how often this will happen in practice, but I think that for now it's better to output the trace as a JavaSMT program first. That way, we could still try to write our own delta-debugger for JavaSMT if the conversion to Smtlib turns out to be a problem.
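The reason for keeping collection and reduction in separate processes can be sketched as follows: a driver starts the possibly crashing run as a child process and only inspects its exit status, so the driver itself survives a segfault in the child. This is a minimal sketch, not part of the PR. The function name `crashes` is hypothetical, and it assumes a POSIX system, where Python's `subprocess` reports a child killed by signal N as return code -N.

```python
import subprocess
import sys


def crashes(command, timeout=60):
    """Return True if the command was terminated by a signal (e.g. SIGSEGV or SIGABRT)."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # A hanging run is not counted as a crash in this sketch
        return False
    # POSIX only: a negative return code -N means the child died from signal N
    return result.returncode < 0


if __name__ == "__main__":
    # Stand-in for "run the solver on a trace": a child that aborts itself
    aborting_child = [sys.executable, "-c", "import os; os.abort()"]
    print(crashes(aborting_child))
```

A check like this is also what a reduction step needs after every shrinking attempt, to confirm that the smaller trace still triggers the crash.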
You can find an example run in one of my earlier runs. It should work if you apply this little patch to your

Alternatively you could just run one of the tests in IntelliJ. Tracing is enabled by default on this branch, so there is no need to set any options. I've picked

After the run the traces can be found in

Now you could simply copy this code into a new JavaSMT project and run it. However, even for this simple test the trace is already close to 100 lines long, and generally traces can grow much larger. This is a problem as the JVM has a limit on the code size of a method, so traces larger than ~5000 lines will generally not compile. Even if we found some way around this limitation, the traces would still be way too large to report to the developers if there is a bug in one of the solvers. Because of this, the next step is to convert the JavaSMT trace to Smtlib:

Here is the output for our test. It's mostly a line-by-line translation of the earlier JavaSMT trace:

The smt trace can then be run on any solver:

Since there is no crash, there is nothing to debug and we're finished. Otherwise, the next step would be to use

Finally, the bash script
Needed if the variable does not occur in the body of the quantified formula

This is a preliminary draft for adding API tracing to JavaSMT with the help of a new delegate. The idea is to record all API calls and generate a new Java program from them. By running this program, the exact sequence of calls can then be recreated. The main application here is debugging, where the traces allow us to create easy-to-reproduce examples for solver errors. This is especially useful when the error occurs as part of a larger program, where it can be hard to pin down the exact sequence of JavaSMT calls that are needed to trigger the bug.
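The recording idea can be illustrated with a small Python sketch. It is only an analogy for the delegate pattern: the actual delegate in this PR is written in Java and emits a Java program, and the class name `TracingDelegate` and the wrapped list are hypothetical. Each call is written and flushed before it is forwarded, so the line is already handed to the output stream if the call itself brings down the process.

```python
import io


class TracingDelegate:
    """Forwards every method call to the wrapped object and logs it as a line of code."""

    def __init__(self, target, name, out):
        self._target = target
        self._name = name
        self._out = out

    def __getattr__(self, method):
        # Only called for attributes that are not found on the delegate itself
        real = getattr(self._target, method)

        def traced(*args):
            arguments = ", ".join(repr(a) for a in args)
            # Log first, then call: the trace line exists even if the call crashes
            self._out.write(f"{self._name}.{method}({arguments})\n")
            self._out.flush()
            return real(*args)

        return traced


# Usage: trace calls on a plain list standing in for a solver API object
out = io.StringIO()
stack = TracingDelegate([], "stack", out)
stack.append(1)
stack.append(2)
stack.pop()
print(out.getvalue(), end="")
```

Replaying the three logged lines against a fresh list named `stack` recreates the same sequence of calls, which is the property the trace relies on.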
We use a new delegate to implement this feature. Setting solver.trace to true will enable tracing, and the output will be stored in a file called trace*.java
TODO
- Finish the implementation. Currently we only have (parts of) the ArrayFormulaManager, IntegerFormulaManager, BooleanFormulaManager, UFManager and ProverEnvironment
- Write the trace to a file while it's being created. We'll need this to debug segfaults as the trace is otherwise lost. Done
- Consider adding an option to skip duplicate calls. (The trace is currently way too long.) Fixed, but not committed yet
- Write a simple delta-debugger to shrink the trace down even further. Maybe later. We're now using ddSmt, see comment #505 (comment)
Things left to do
- Add support for missing formula managers in the script. Still missing: floating point, quantifier, strings and separation logic. At least the first two should still be added before merging
- Handle solver options in the script. Model generation and global definitions are now turned on by default. Other options can be added by the user
- Fix undo point in the trace logger. Done, but we should double check the Rebuilder. Should be fine now, but still needs to be cleaned up