*Notes from my GSoC application that I missed! Based on **this** paper. I was applying for CPA checker’s project on implementing DAR ~ unfortunately I was late because I thought it meant 1800 EST for some reason, it was 1800 UTC (or something along those lines)! Anyway, I am taking graduate complexity theory next semester, but till then I thought I’d post my notes and elaborate on a few bits and pieces.*

### Interpolation

Interpolation is a concept used in mathematics and logic to describe the process of constructing a new formula that "fills the gap" between two existing formulas. The objective of Craig's interpolation is to provide a way to find an intermediate formula between two logical formulas that preserves their logical relationship.

Craig's interpolation theorem can be applied to formal verification, which is the process of checking whether a system or program satisfies certain properties or requirements. Specifically, interpolation can be used to simplify the process of verifying properties of a system or program by reducing the search space of the verification process.

Formal verification typically involves exploring the state space of a system or program, which can be large and complex. By finding an interpolant between two formulas, we can often reduce the search space by focusing on the shared structure between the two formulas. For example, if we have a property that we want to verify, we can use interpolation to find a formula that captures the common structure between the property and the negation of the property. This formula can then be used to guide the search for counterexamples or to construct a proof of correctness.

### Craig’s Interpolation Theorem

Suppose we have two propositional formulas:

φ= ¬(P ∧ Q) → (¬R ∧ Q)

ψ= (S → P) ∨ (S → ¬R)

We want to determine whether there exists a formula ρ that "interpolates" between φ and ψ, meaning that it satisfies the following conditions:

- Every non-logical symbol (i.e. propositional variable) in ρ occurs in both φ and ψ.
- φ logically implies ρ, i.e. whenever φ is true, ρ must also be true.
- ρ logically implies ψ, i.e. whenever ρ is true, ψ must also be true.

To find an interpolant ρ, we first write φ in conjunctive normal form (CNF), which is a standard way of expressing propositional formulas as conjunctions of disjunctions. Using standard rules of propositional logic, we can simplify φ as follows:

¬(P ∧ Q) → (¬R ∧ Q) ≡ (¬¬(P ∧ Q)) ∨ (¬R ∧ Q) (material implication) ≡ (¬P ∨ ¬Q) ∨ (¬R ∧ Q) (double negation) ≡ (¬P ∨ ¬R) ∧ (¬P ∨ Q) (distributivity)

Thus, φ is equivalent to the CNF formula (¬P ∨ ¬R) ∧ (¬P ∨ Q). Now, notice that the two propositional variables P and R that appear in this formula also appear in ψ. Specifically, P appears in the first disjunct (S → P) and in the second disjunct (S → ¬R), and R appears in the second disjunct (S → ¬R).

Therefore, we can take ρ to be the propositional formula P ∨ ¬R, which we can verify is indeed an interpolant:

- Every non-logical symbol in ρ (i.e. P and R) also appears in both φ and ψ.
- If φ is true, then either ¬P or ¬R must be true. But if P is true, then (S → P) is true, which by disjunction implies ψ. If ¬R is true, then (¬R ∧ Q) is false, which makes the whole right-hand side of φ true, so ψ is also true. Therefore, whenever φ is true, ρ (i.e. P ∨ ¬R) must also be true.
- If ρ (i.e. P ∨ ¬R) is true, then either P is true or ¬R is true. If P is true, then (S → P) is true, which by disjunction implies ψ. If ¬R is true, then (S → ¬R) is true, which by disjunction implies ψ. Therefore, whenever ρ is true, ψ must also be true.

**ELI5**: Let's imagine you have two friends, Alice and Bob, who are arguing about something. Alice says one thing, and Bob says something else. You want to figure out if Alice's statement can lead to Bob's statement being true.

Craig's interpolation theorem is like a special tool that can help you solve this problem. The tool is a special formula that you can use to find out if Alice's statement can lead to Bob's statement.

To use the tool, you need to find a common part in Alice's and Bob's statements. This common part is like a shared toy that both Alice and Bob like to play with.

Once you find this common part, you can use Craig's formula to create a new statement that uses only this common part. This new statement is like a special toy that both Alice and Bob can play with together.

If Alice's statement can lead to Bob's statement being true, then this new statement will be true too. This is like Alice and Bob both playing with the special toy and having fun together.

But if Alice's statement cannot lead to Bob's statement being true, then the new statement will be false. This is like the special toy being broken or not working properly, and Alice and Bob not being able to play with it together.

So, Craig's interpolation theorem is like a special tool that helps you figure out if one statement can lead to another statement being true, by finding a common part and creating a new statement that uses only this common part.

### A note on bounded and unbounded model checking

Unbounded model checking is a technique for verifying properties of systems with infinite or unbounded state spaces. In contrast to bounded model checking, which only explores a finite prefix of the system's behavior, unbounded model checking explores the entire state space of the system.

The basic idea behind unbounded model checking is to use abstraction and refinement techniques to represent the infinite state space of the system using a finite, abstract model. The abstract model captures the essential behavior of the system and allows for efficient verification algorithms to be applied.

### Limitations on Craig Interpolation

Craig interpolation is a refutation-based technique in formal verification.

In a refutation-based approach, the goal is to prove the negation of a property, rather than proving the property directly. To prove the negation of a property, we assume that the property is false, and then attempt to construct a proof that leads to a contradiction.

In the context of Craig interpolation, we are trying to find an interpolant between a formula A and its negation ¬A. The interpolant captures the shared structure between A and ¬A, and can be used to guide the search for a counterexample to the property A. To construct the interpolant, Craig interpolation uses a refutation-based technique, where we assume that A is true and ¬A is false, and attempt to construct a proof that leads to a contradiction.

Specifically, Craig interpolation constructs a proof by contradiction using a tableau-based algorithm. The algorithm builds a tree structure of possible truth assignments for the variables in A and ¬A, and attempts to find a contradiction by propagating conflicts between the truth assignments. The interpolant is constructed by selecting a subset of the conflict clauses that are common to both A and ¬A, and combining them in a way that preserves their logical relationship.

While Craig interpolation is sound in theory, there is no guarantee that an interpolant will always be found, even if A logically implies ¬B. This can happen if the formulas A and ¬B do not share enough structure to construct an interpolant.

In some cases, the interpolant may be too specific to the given formulas A and ¬A, and may not generalize well to other similar formulas. This can lead to overfitting and inaccurate results.

### IC3 (Incremental Inductive Construction and Checking)

The main idea behind IC3 is to incrementally build an over-approximation of the reachable state space of the system, starting from an initial set of reachable states. The over-approximation is constructed using an induction-based approach, where the goal is to prove that the system satisfies a safety property by showing that the negation of the property is unreachable from the initial set of reachable states.

The IC3 algorithm works by iteratively adding new clauses to the over-approximation until the negation of the safety property is proved to be unreachable. In each iteration, the algorithm checks whether the new clauses can be added to the over-approximation without introducing any new reachable states. If the new clauses do not introduce any new reachable states, then the over-approximation is updated and the algorithm proceeds to the next iteration. If the new clauses do introduce new reachable states, then the algorithm uses Craig's interpolation to construct a proof that shows the existence of a counterexample.

### Forward Reachability Analysis

Forward reachability analysis is a method used in formal verification to determine whether a system can reach a certain state or set of states from its initial state. The system is typically represented as a finite-state machine, where each state represents a configuration of the system, and the transitions between states represent the possible actions that the system can take.

Suppose we have a simple finite-state machine that models a traffic light system with two states: "Green" and "Red". The traffic light starts in the "Green" state, and it can transition to the "Red" state when a button is pressed. We want to use forward reachability analysis to determine whether the traffic light can ever reach the "Red" state from its initial state.

```
public class TrafficLight {
enum State {
Green,
Red
}
private State currentState = State.Green;
public void pressButton() {
if (currentState == State.Green) {
currentState = State.Red;
System.out.println("Traffic light turned red");
} else {
System.out.println("Traffic light is already red");
}
}
public static void main(String[] args) {
TrafficLight trafficLight = new TrafficLight();
System.out.println("Initial state: " + trafficLight.currentState);
// Perform forward reachability analysis
boolean canReachRedState = false;
for (int i = 0; i < 10; i++) {
trafficLight.pressButton();
System.out.println("Current state: " + trafficLight.currentState);
if (trafficLight.currentState == State.Red) {
canReachRedState = true;
break;
}
}
if (canReachRedState) {
System.out.println("The traffic light can reach the Red state");
} else {
System.out.println("The traffic light cannot reach the Red state");
}
}
}
```

Think of FRA as fundamentally a loop over a bunch of states, but it is obviously not that simple.

### Backward Reachability Analysis

Backward reachability analysis is a technique used in formal verification to check whether a system can reach a certain state, such as an error state or an unsafe state, starting from any of its initial states. The technique starts from the goal state, which is typically an error state or an unsafe state, and works backward to determine whether there exists a path of transitions from the initial state(s) of the system that can lead to the goal state.

The backward reachability analysis typically involves computing the pre-image of the goal state under the system's transition relation. The pre-image is the set of states from which the system can transition to the goal state in one step. This pre-image is then used as the new goal state for the next step of the analysis, and the process is repeated until either a path from an initial state to the goal state is found, or it is determined that no such path exists.

Backward reachability analysis is particularly useful for verifying safety properties of a system, such as absence of deadlock, buffer overflow, or other types of errors. It complements forward reachability analysis, which checks whether a system can reach a certain state by starting from its initial state and working forward.

### Overapproximation

Over-approximation is a technique used to compute an approximation of the reachable states of a system that may be larger than the actual set of reachable states. An over-approximation can be computed by using a conservative approximation of the system's behavior, which may lead to false positives, but guarantees that any property that holds for the over-approximation also holds for the actual set of reachable states.

Over-approximations are useful in formal verification as they can be used to prove properties of the system without exhaustively exploring the entire state space of the system, which may be very large or even infinite. However, the trade-off for using over-approximations is that they may be more computationally expensive to compute and may lead to less precise results than under-approximations or exact techniques.

### Transition Relation

In formal verification of systems, a transition relation is a mathematical representation of the behavior of a system, which describes how the system can move from one state to another. The transition relation is typically modeled as a binary relation between states, where each pair of states in the relation represents a possible transition between those states.

### DAR

DAR uses the observation that a single SAT check provides information about both states reachable from the initial state and states that reach the bad state. The DAR algorithm computes a Forward Reachability Sequence (F¯) and a Backward Reachability Sequence (B¯) that represent an over-approximation of the set of states that are reachable from the initial state and the set of states that can reach the bad state, respectively, in exactly i transitions. Both sets are disjoint from each other. The existence of either F¯ or B¯ of length n ensures that no counterexample of length n exists in M. The algorithm seems to use two different interpolants extracted from an unsatisfiable propositional formula, one for forward reachability and the other for backward reachability, to traverse the state space in a dual fashion.

### First Iteration of DAR

- Initialize the forward reachability sequence F¯ to the set of initial states, denoted by INIT(V), and the backward reachability sequence B¯ to the complement of the bad states, denoted by ¬p(V).
- Check the satisfiability of the propositional formula INIT(V) ∧ TR(V,V') ∧ ¬p(V), where TR(V,V') is the transition relation. If the formula is satisfiable, then a counterexample of length one exists, meaning that there is a transition from an initial state to a bad state in one step. If the formula is unsatisfiable, proceed to step 3.
- Extract two interpolants using the unsatisfiable formula INIT(V) ∧ TR(V,V') ∧ ¬p(V). The forward interpolant provides an over-approximation of the post-image of INIT(V) that is disjoint from ¬p(V), denoted by F1 = FI(F0, B0). The backward interpolant provides an over-approximation of the pre-image of ¬p(V) that is disjoint from INIT(V), denoted by B1 = BI(F0, B0).
- Update the forward reachability sequence F¯ to include the new over-approximation of the states reachable in one additional step: F1 = FI(F0, B0). Similarly, update the backward reachability sequence B¯ to include the new over-approximation of the states that can reach the bad states in one additional step: B1 = BI(F0, B0).
- Repeat steps 2-4 for a larger number of steps to compute longer sequences of reachable states, until the desired level of precision is achieved or a counterexample is found.

### Generalised DAR iteration

(I wasn’t sure about this, but here’s my understanding)

The algorithm aims to check the validity of a counterexample of a given length in a finite transition system by iteratively refining over-approximations of the set of reachable states from the initial state and the set of states that can reach the bad state. The local strengthening phase checks if the approximated counterexample is locally invalid, which is a sufficient condition for its invalidity. The algorithm performs iterative local strengthening to update the approximations of reachable states and states that can reach the bad state, in order to refine the over-approximations and find a valid or invalid counterexample of the given length.