Gymbo is a Proof of Concept for a Gradient-based Symbolic Execution Engine, implemented from scratch. Building on recent advancements that utilize gradient descent to solve SMT formulas [1, 2], Gymbo leverages gradient descent to discover input values that fulfill each path constraint during symbolic execution.
Compared to other prominent symbolic execution tools, Gymbo's implementation is notably simpler and more compact. We hope that this project will assist individuals in grasping the fundamental principles of symbolic execution and SMT problem-solving through gradient descent.
One practical usage of Gymbo is debugging ML models like neural networks to detect unexpected behaviors. For example, you can generate adversarial examples with Gymbo by converting neural networks to imperative programs.
Another unique feature of Gymbo is that it can track the probabilistic symbolic variables. We adopt the PBRANCH algorithm proposed in [3] and currently support the discrete uniform, bernoulli, and binomial distributions.
Please note that Gymbo is currently under development and may have bugs.
Your feedback and contributions are always greatly appreciated.
Parses a logical expression from a C-like language program.
Definition: parser.h:240
Please note that Gymbo currently ignores / when solving path constraints.
Internal Algorithm
Gymbo converts the path constraint into a numerical loss function, which becomes negative only when the path constraint is satisfied. Gymbo uses the following transformation rule:
!(a) => -a + eps
(a < b) => a - b + eps
(a <= b) => a - b
(a > b) => b - a + eps
(a >= b) => b - a
(a == b) => abs(a - b)
(a != b) => -abs(a - b) + eps
(a && b) => max(a, b)
(a || b) => min(a, b)
Here, eps is the smallest positive value of the type for a and b.
Optionally, Gymbo can use DPLL (SAT solver) to decide the assignment for each unique term, sometimes resulting in better scalability.
CLI Tool
gymbo command accepts the following command-line options:
-d: Set the maximum depth for symbolic execution (default: 256).
-v: Set the verbosity level (default: 1). Use -1 for minimal output.
-i: Set the number of iterations for gradient descent (default: 100).
-a: Set the step size for gradient descent (default: 1.0).
-e: Set the eps for the differentiable expression (default: 1.0).
-t: Set the maximum number of trials of gradient descent (default: 3)
-l: Set the minimum value of initial parameters (default: -10)
-h: Set the maximum value of initial parameters (default: 10)
-s: Set the random seed (default: 42)
-p: (optional) If set, use DPLL to determine the assignment for each term. Otherwise, solve the loss function directly transformed from the path constraints.
./gymbo "if (a < 3) if (a > 4) return 1;" -v 0
>Compiling the input program...
>Start Symbolic Execution...
>---------------------------
>Result Summary
>#Total Path Constraints: 4
>#SAT: 3
>#UNSAT: 1
>List of UNSAT Path Constraints
># var_1 < 3 and 4 < var_1 and 1
libgymbo: Header-only Library
Since gymbo consists of the header-only library, you can easily create your own symbolic execution tool.
We offer some helper functions within pymlgymbo library to convert the ML models of famous Python library like sklearn and PyTorch to the program that gymbo can process.
The following code generates the adversarial examples against a neural network, as proposed in [3]
Gymbo is entirely implemented in C++ and requires only standard libraries. The process of compiling from source code to stack machines is based on the implementation of rui314/chibicc [4], while the symbolic execution approach is inspired by beala/symbolic [5].