Problem

IntervalMDP.ProblemType
Problem{S <: IntervalMarkovProcess, F <: Specification}

A problem is a tuple of an interval Markov process and a specification.

Fields

  • system::S: interval Markov process.
  • spec::F: specification (either temporal logic or reachability-like).
source
IntervalMDP.SpecificationType
Specification{F <: Property}

A specification is a property together with a satisfaction mode and a strategy mode. The satisfaction mode is either Optimistic or Pessimistic. See SatisfactionMode for more details. The strategy mode is either Maximize or Minimize. See StrategyMode for more details.

Fields

  • prop::F: verification property (either temporal logic or reachability-like).
  • satisfaction::SatisfactionMode: satisfaction mode (either optimistic or pessimistic). Default is pessimistic.
  • strategy::StrategyMode: strategy mode (either maximize or minimize). Default is maximize.
source
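
A minimal construction sketch, assuming the Specification and Problem constructors take arguments in the order of the fields listed above; mdp is a placeholder for an interval Markov process built elsewhere, and the property is a FiniteTimeReachability instance as defined further below (the target state and horizon are illustrative).

prop = FiniteTimeReachability([CartesianIndex(100)], 10)  # illustrative target state and horizon
spec = Specification(prop, Pessimistic, Maximize)          # lower-bound satisfaction, maximizing strategy
problem = Problem(mdp, spec)                               # pair the system with the specification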
IntervalMDP.SatisfactionModeType
SatisfactionMode

When computing the satisfaction probability of a property over an interval Markov process, be it IMC or IMDP, the satisfaction mode to verify can be either Optimistic or Pessimistic, i.e. the upper or lower bound, respectively, on the satisfaction probability within the probability uncertainty.

source
IntervalMDP.StrategyModeType
StrategyMode

When computing the satisfaction probability of a property over an IMDP, the strategy can either Maximize or Minimize the satisfaction probability (with respect to the satisfaction mode).

source

Temporal logic

IntervalMDP.LTLFormulaType
LTLFormula

Linear Temporal Logic (LTL) property, i.e. propositional logic extended with the next and until temporal operators [1]. Let $ϕ$ denote the formula and $M$ denote an interval Markov process. Then compute $M ⊧ ϕ$.

[1] Vardi, M.Y. (1996). An automata-theoretic approach to linear temporal logic. In: Moller, F., Birtwistle, G. (eds) Logics for Concurrency. Lecture Notes in Computer Science, vol 1043. Springer, Berlin, Heidelberg.

source
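
A construction sketch, assuming LTLFormula wraps a formula string in the same way as LTLfFormula below (the formula and its atomic proposition are illustrative).

prop = LTLFormula("G !avoid")  # illustrative LTL formula: always avoid the states labelled "avoid"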
IntervalMDP.LTLfFormulaType
LTLfFormula

An LTL formula over finite traces [1]. See LTLFormula for the structure of LTL formulas. Let $ϕ$ denote the formula, $M$ denote an interval Markov process, and $H$ the time horizon. Then compute $M ⊧ ϕ$ within traces of length $H$.

Fields

  • formula::String: LTL formula
  • time_horizon::T: Time horizon of the finite traces

[1] Giuseppe De Giacomo and Moshe Y. Vardi. 2013. Linear temporal logic and linear dynamic logic on finite traces. In Proceedings of the Twenty-Third international joint conference on Artificial Intelligence (IJCAI '13). AAAI Press, 854–860.

source
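
A construction sketch based on the fields listed above (the formula string, atomic proposition, and horizon are illustrative).

prop = LTLfFormula("F reach", 10)  # satisfy "eventually reach" within traces of length 10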
IntervalMDP.isfinitetimeMethod
isfinitetime(spec::LTLfFormula)

Return true for an LTLf formula, since LTLf formulas are defined specifically over finite traces.

source
IntervalMDP.PCTLFormulaType
PCTLFormula

A Probabilistic Computation Tree Logic (PCTL) formula [1]. Let $ϕ$ denote the formula and $M$ denote an interval Markov process. Then compute $M ⊧ ϕ$.

[1] M. Lahijanian, S. B. Andersson and C. Belta, "Formal Verification and Synthesis for Discrete-Time Stochastic Systems," in IEEE Transactions on Automatic Control, vol. 60, no. 8, pp. 2031-2045, Aug. 2015, doi: 10.1109/TAC.2015.2398883.

source
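
A construction sketch, assuming PCTLFormula wraps a formula string like the LTL types above (the formula string and its syntax are purely illustrative).

prop = PCTLFormula("P>=0.95 [F target]")  # illustrative PCTL formula string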

Reachability

IntervalMDP.FiniteTimeReachabilityType
FiniteTimeReachability{VT <: Vector{<:CartesianIndex}, T <: Integer}

Finite time reachability specified by a set of target/terminal states and a time horizon. That is, denote a trace by $s_0 s_1 s_2 \cdots$, then if $T$ is the set of target states and $H$ is the time horizon, the property is

\[ \mathbb{P}(\exists k \in \{0, \ldots, H\}, s_k \in T).\]

source
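
A construction sketch, assuming the constructor takes the target states and the time horizon in the order of the type parameters (the state index and horizon are illustrative).

terminal = [CartesianIndex(100)]              # target/terminal states T
prop = FiniteTimeReachability(terminal, 10)   # reach T within H = 10 steps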
IntervalMDP.InfiniteTimeReachabilityType
InfiniteTimeReachability{R <: Real, VT <: Vector{<:CartesianIndex}}

InfiniteTimeReachability is similar to FiniteTimeReachability except that the time horizon is infinite, i.e., $H = \infty$. In practice, this means performing value iteration until the value function has converged, where the convergence criterion is that the largest entry of the most recent Bellman residual is less than convergence_eps.

source
IntervalMDP.terminal_statesMethod
terminal_states(prop::InfiniteTimeReachability)

Return the set of terminal states of an infinite time reachability property.

source
IntervalMDP.convergence_epsMethod
convergence_eps(prop::InfiniteTimeReachability)

Return the convergence threshold of an infinite time reachability property.

source
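
A construction and accessor sketch, assuming the constructor takes the target states and the convergence threshold (the values are illustrative).

prop = InfiniteTimeReachability([CartesianIndex(100)], 1e-6)
terminal_states(prop)   # [CartesianIndex(100)]
convergence_eps(prop)   # 1.0e-6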

Reach-avoid

IntervalMDP.FiniteTimeReachAvoidType
FiniteTimeReachAvoid{VT <: AbstractVector{<:CartesianIndex}, T <: Integer}

Finite time reach-avoid specified by a set of target/terminal states, a set of avoid states, and a time horizon. That is, denote a trace by $s_0 s_1 s_2 \cdots$, then if $T$ is the set of target states, $A$ is the set of states to avoid, and $H$ is the time horizon, the property is

\[ \mathbb{P}(\exists k \in \{0, \ldots, H\}, s_k \in T \text{ and } \forall k' \in \{0, \ldots, k\}, s_{k'} \notin A).\]

source
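
A construction sketch, assuming the constructor takes the reach set, the avoid set, and the time horizon in that order (the state indices and horizon are illustrative).

reach = [CartesianIndex(100)]                  # target states T
avoid = [CartesianIndex(1)]                    # states A to avoid
prop = FiniteTimeReachAvoid(reach, avoid, 10)  # reach T within H = 10 steps while avoiding A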
IntervalMDP.terminal_statesMethod
terminal_states(prop::FiniteTimeReachAvoid)

Return the set of terminal states of a finite time reach-avoid property. That is, the union of the reach and avoid sets.

source
IntervalMDP.terminal_statesMethod
terminal_states(prop::InfiniteTimeReachAvoid)

Return the set of terminal states of an infinite time reach-avoid property. That is, the union of the reach and avoid sets.

source

Safety

IntervalMDP.FiniteTimeSafetyType
FiniteTimeSafety{VT <: Vector{<:CartesianIndex}, T <: Integer}

Finite time safety specified by a set of avoid states and a time horizon. That is, denote a trace by $s_0 s_1 s_2 \cdots$, then if $A$ is the set of avoid states and $H$ is the time horizon, the property is

\[ \mathbb{P}(\forall k \in \{0, \ldots, H\}, s_k \notin A).\]

source
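
A construction sketch, assuming the constructor takes the avoid states and the time horizon (the state index and horizon are illustrative).

avoid = [CartesianIndex(1)]          # states A to avoid
prop = FiniteTimeSafety(avoid, 10)   # stay outside A for H = 10 steps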
IntervalMDP.InfiniteTimeSafetyType
InfiniteTimeSafety{R <: Real, VT <: Vector{<:CartesianIndex}}

InfiniteTimeSafety is similar to FiniteTimeSafety except that the time horizon is infinite, i.e., $H = \infty$. In practice, this means performing value iteration until the value function has converged, where the convergence criterion is that the largest entry of the most recent Bellman residual is less than convergence_eps.

source
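
A construction sketch, assuming the constructor takes the avoid states and the convergence threshold (the values are illustrative).

prop = InfiniteTimeSafety([CartesianIndex(1)], 1e-6)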

Reward specification

IntervalMDP.FiniteTimeRewardType
FiniteTimeReward{R <: Real, AR <: AbstractArray{R}, T <: Integer}

FiniteTimeReward is a property consisting of a reward function $R : S \to \mathbb{R}$, assigned to each state at each iteration, and a discount factor $\gamma$. The time horizon $H$ is finite, so the discount factor is optional and the optimal policy will be time-varying. Given a strategy $\pi : S \to A$, the property is

\[ V(s_0) = \mathbb{E}\left[\sum_{k=0}^{H} \gamma^k R(s_k) \mid s_0, \pi\right].\]

source
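
A construction sketch, assuming the constructor takes the reward array, the discount factor, and the time horizon in the order of the type parameters (the values are illustrative).

R = [1.0, 2.0, 0.0]                    # one reward per state
prop = FiniteTimeReward(R, 0.95, 10)   # γ = 0.95, H = 10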
IntervalMDP.rewardMethod
reward(prop::FiniteTimeReward)

Return the reward vector of a finite time reward optimization.

source
IntervalMDP.discountMethod
discount(prop::FiniteTimeReward)

Return the discount factor of a finite time reward optimization.

source
IntervalMDP.InfiniteTimeRewardType
InfiniteTimeReward{R <: Real, AR <: AbstractArray{R}}

InfiniteTimeReward is a property consisting of a reward function assigned to each state at each iteration and a discount factor, which must be strictly less than one to guarantee convergence. The time horizon is infinite, i.e. $H = \infty$, so the optimal policy will be stationary.

source
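
For comparison with FiniteTimeReward, letting $H = \infty$ in the value function above gives, for a stationary strategy $\pi : S \to A$,

\[ V(s_0) = \mathbb{E}\left[\sum_{k=0}^{\infty} \gamma^k R(s_k) \mid s_0, \pi\right].\]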
IntervalMDP.rewardMethod
reward(prop::InfiniteTimeReward)

Return the reward vector of an infinite time reward optimization.

source
IntervalMDP.discountMethod
discount(prop::InfiniteTimeReward)

Return the discount factor of an infinite time reward optimization.

source

Hitting time

IntervalMDP.ExpectedExitTimeType
ExpectedExitTime{R <: Real, VT <: Vector{<:CartesianIndex}}

ExpectedExitTime is a hitting-time property with respect to an unsafe set. An equivalent characterization is the expected number of steps spent in the safe set before reaching the unsafe set. The time horizon is infinite, i.e., $H = \infty$, thus the package performs value iteration until the value function has converged, where the convergence criterion is that the largest entry of the most recent Bellman residual is less than convergence_eps. As this is an infinite horizon property, the resulting optimal policy will be stationary. Formally, given a strategy $\pi : S \to A$ and an unsafe set $O$, the property is defined as

\[ V(s_0) = \mathbb{E}\left[\lvert \omega_{0:k-1} \rvert \mid s_0, \pi, \omega_{0:k-1} \notin O, \omega_k \in O \right]\]

where $\omega = s_0 s_1 \ldots s_k$ is the trajectory of the system, $\omega_{0:k-1} = s_0 s_1 \ldots s_{k-1}$ denotes the subtrajectory excluding the final state, and $\omega_k = s_k$.

source
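
A construction sketch, assuming the constructor takes the unsafe states and the convergence threshold (the values are illustrative).

prop = ExpectedExitTime([CartesianIndex(1)], 1e-6)   # expected number of safe steps before entering the unsafe set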