A (finite piecewise-constant dense-time) signal over alphabet Σ\Sigma is a left-continuous function f ⁣:(0,δ]Σf \colon (0, \delta] \to \Sigma with δR0\delta \in \R_{\geq 0} and having finitely many discontinuities. The duration δ\delta of ff is denoted f|f|. Every signal can be summarized by a unique sequence seq(f)=(σ1,δ1)(σk,δk)(Σ×R>0)\mathrm{seq}(f) = (\sigma_1, \delta_1) \cdots (\sigma_k, \delta_k) \in (\Sigma \times \R_{>0})^*, where σiσi+1\sigma_i \neq \sigma_{i+1} for all i<ki < k, which we write more compactly as σ1δ1σkδk\sigma_1^{\delta_1} \cdots \sigma_k^{\delta_k}.

For example, a1.5b1a2.5a^{1.5} b^1 a^{2.5} represents the signal f ⁣:(0,5]Σf \colon (0, 5] \to \Sigma given by

(0,1.5]a, (1.5,2.5]b, (2.5,5]a. (0, 1.5] \mapsto a,\ (1.5, 2.5] \mapsto b,\ (2.5, 5] \mapsto a.

This signal can be visualized as follows:

Given a signal ff with seq(f)=σ1δ1σkδk\mathrm{seq}(f) = \sigma_1^{\delta_1} \cdots \sigma_k^{\delta_k}, the aa-duration of ff is fa=i:σi=aδi|f|_a = \sum_{i : \sigma_i = a} \delta_i. Moreover, we define the number of pieces of ff as #f=k\#f = k. For our previous example, we have fa=4|f|_a = 4, fb=1|f|_b = 1 and #f=3\# f = 3.

Concatenation

We write S(Σ)\mathcal{S}(\Sigma) to denote the set of all signals, and we write ε\varepsilon to denote the unique signal of duration 00. Given f,gS(Σ)f, g \in \mathcal{S}(\Sigma), we write fgf \cdot g to denote the signal h ⁣:(0,f+g]h \colon (0, |f| + |g|] given by

h(τ)={f(τ)if τf,g(τf)otherwise. h(\tau) = \begin{cases} f(\tau) & \text{if } \tau \leq |f|, \\ g(\tau - |f|) & \text{otherwise}. \end{cases}

Note that seq(fg)\mathrm{seq}(f \cdot g) differs from seq(f)seq(g)\mathrm{seq}(f) \cdot \mathrm{seq}(g) when the last piece of ff and the first piece of gg output the same letter. For example, if seq(f)=a1.5b1a2.5\mathrm{seq}(f) = a^{1.5} b^1 a^{2.5} and seq(g)=a0.5b2\mathrm{seq}(g) = a^{0.5} b^2, then seq(fg)=a1.5b1a3b2\mathrm{seq}(f \cdot g) = a^{1.5} b^1 a^3 b^2:

In general, we have

  • fg=f+g|f \cdot g| = |f| + |g|,
  • fgσ=fσ+gσ|f \cdot g|_\sigma = |f|_\sigma + |g|_\sigma,
  • #f+#g1#(fg)#f+#g\#f + \#g - 1 \leq \#(f \cdot g) \leq \#f + \#g.

Algebraic view and relation with words

The set of signals S(Σ)\mathcal{S}(\Sigma) equipped with \cdot forms a monoid with the empty signal ε\varepsilon as the identity. We consider R0\R_{\geq 0} as a monoid under addition with 00 as the identity.

Let \equiv be the relation over (Σ×R0)(\Sigma \times \R_{\geq 0})^* defined by uvu \equiv v iff uu and vv are equal after reduction using the rewriting rules (σ,δ)(σ,δ)(σ,δ+δ)(\sigma, \delta) (\sigma, \delta') \to (\sigma, \delta + \delta') and (σ,0)ε(\sigma, 0) \to \varepsilon. For example, a0.5a1b0.5b0.5a1a1.5b1a1a^{0.5} a^1 b^{0.5} b^{0.5} a^1 \equiv a^{1.5} b^1 a^1:

Note that {\equiv} is a congruence and that S(Σ)(Σ×R0)/\mathcal{S}(\Sigma) \cong (\Sigma \times \R_{\geq 0})^*/{\equiv} due to the isomorphism f[seq(f)]f \mapsto [\mathrm{seq}(f)]. In more algebraic terms, the monoid S(Σ)\mathcal{S}(\Sigma) can be seen as the free product R0R0\R_{\geq 0} \ast \cdots \ast \R_{\geq 0} (with Σ|\Sigma| copies).

It is the case that Σ(Σ×N)/\Sigma^* \cong (\Sigma \times \N)^*/{\equiv} due to the isomorphism σ1σn[σ11σn1]\sigma_1 \cdots \sigma_n \mapsto [\sigma_1^1 \cdots \sigma_n^1]. Furthermore, concatenation and durations behave as expected for signals representing standard finite words. For example, consider the signal fS(Σ)f \in \mathcal{S}(\Sigma) with seq(f)=a1b1a2\mathrm{seq}(f) = a^1 b^1 a^2. It corresponds to the word w=abaaΣw = abaa \in \Sigma^*. We have f=w=4|f| = |w| = 4, fa=wa=3|f|_a = |w|_a = 3, fb=wb=1|f|_b = |w|_b = 1, and f(i)=w(i)f(i) = w(i) for each i[1..4]i \in [1..4].

Yet another way to present Σ\Sigma^* is by considering words as finite discrete-time signals, e.g. w=abaaw = abaa can be seen as the signal f ⁣:[1..4]Σf \colon [1..4] \to \Sigma given by [1..2]a[1..2] \mapsto a, [3..3]b[3..3] \mapsto b and [4..4]a[4..4] \mapsto a.

Note that Σ\Sigma^* is not isomorphic to S(Σ)\mathcal{S}(\Sigma), even if we used rational durations or changed the alphabet.

Click for a proof.

For the sake of contradiction, suppose there exists an isomorphism f ⁣:S(Σ)Γf \colon \mathcal{S}(\Sigma) \to \Gamma^*. Let f1(a)=σ1δ1σkδkf^{-1}(a) = \sigma_1^{\delta_1} \cdots \sigma_k^{\delta_k} be in reduced form, i.e. with σiσi+1\sigma_i \neq \sigma_{i+1} and δi>0\delta_i > 0 for each ii. We have

a=f(f1(a))=f(σ1δ1σkδk)=f(σ1δ1)f(σkδk). \begin{aligned} a &= f(f^{-1}(a)) \\ &= f(\sigma_1^{\delta_1} \cdots \sigma_k^{\delta_k}) \\ &= f(\sigma_1^{\delta_1}) \cdots f(\sigma_k^{\delta_k}). \end{aligned}

Thus, all terms but one are equal to ε\varepsilon, i.e. a=f(σiδi)a = f(\sigma_i^{\delta_i}) for some i[1..k]i \in [1..k]. Note that

f(σiδi/2)f(σiδi/2)=f(σiδi/2σiδi/2)=f(σiδi)=a. f(\sigma_i^{\delta_i/2}) \cdot f(\sigma_i^{\delta_i/2}) = f(\sigma_i^{\delta_i/2} \cdot \sigma_i^{\delta_i/2}) = f(\sigma_i^{\delta_i}) = a.

Consequently, ww=aww = a for some wΓw \in \Gamma^*, which is impossible.

Relation with timed words

Timed words are similar to signals and have been extensively studied by the formal verification community. They can also be seen as sequences from (Σ×R0)(\Sigma \times \R_{\geq 0})^*, but where zero durations matter. For example, the timed word a1b0a0a1.5b0.5a^1 b^0 a^0 a^{1.5} b^{0.5} can be interpreted as follows:

  • After one unit of time, event abaaba occurred,
  • After another 1.5 unit of time, event aa occurred,
  • After another 0.5 unit of time, event bb occurred.

In that context, a1b0a0a1.5b0.5a^1 b^0 a^0 a^{1.5} b^{0.5} is not equivalent to a2.5b0.5a^{2.5} b^{0.5}. In fact, timed words are usually defined as sequences (σ1,τ1)(σk,τk)(Σ×R0)(\sigma_1, \tau_1) \cdots (\sigma_k, \tau_k) \in (\Sigma \times \R_{\geq 0})^* where τ1τk\tau_1 \leq \cdots \leq \tau_k are timestamps rather than durations. In that setting, our previous example is written as (a,1)(b,1)(a,1)(a,2.5)(b,3)(a, 1) (b, 1) (a, 1) (a, 2.5) (b, 3). For a more formal algebraic treatment of timed words and their relation with signals, see the work of Asarin, Caspi and Maler1.

Real-time regular languages

A subset LS(Σ)L \subseteq \mathcal{S}(\Sigma) is called a (signal) language. Let us introduce a counterpart to regular languages2. A real-time regular expression is obtained from this grammar, where σ\sigma ranges over Σ\Sigma, and II ranges over rational-bounded intervals:

r::=εσIr+rrrr r ::= \emptyset \mid \varepsilon \mid \sigma_I \mid r + r \mid r \cdot r \mid r^*

For example, a(0,)(b(0,2]a(0,1])a_{(0, \infty)} \cdot (b_{(0, 2]} \cdot a_{(0, 1]})^* denotes the language of signals of the form aδbβ1aα1bβkaαka^\delta b^{\beta_1} a^{\alpha_1} \cdots b^{\beta_k} a^{\alpha_k} where kNk \in \N, δR>0\delta \in \R_{> 0}, αi(0,1]\alpha_i \in (0, 1] and βi(0,2]\beta_i \in (0, 2]. Formally, the semantics is defined inductively by

=,ε={ε},σI={(0,δ]σ:δR>0},r+r=rr,rr={ww:wr,wr},r=r0r1r2. \begin{aligned} \llbracket \emptyset \rrbracket &= \emptyset, \\ % \llbracket \varepsilon \rrbracket &= \{\varepsilon\}, \\ % \llbracket \sigma_I \rrbracket &= \{(0, \delta] \mapsto \sigma : \delta \in \R_{> 0}\}, \\ % \llbracket r + r' \rrbracket &= \llbracket r \rrbracket \cup \llbracket r' \rrbracket, \\ % \llbracket r \cdot r' \rrbracket &= \{w \cdot w' : w \in \llbracket r \rrbracket, w' \in \llbracket r' \rrbracket\}, \\ % \llbracket r^* \rrbracket &= \llbracket r^0 \rrbracket \cup \llbracket r^1 \rrbracket \cup \llbracket r^2 \rrbracket \cup \cdots. \end{aligned}

A real-time regular language is a signal language LS(Σ)L \subseteq \mathcal{S}(\Sigma) such that L=rL = \llbracket r \rrbracket for some real-time regular expression. As in the classical setting, we can further provide an automaton model for these languages.

A real-time automaton is a tuple A=(Q,Σ,Δ,Q0,F,λ,ι)\mathcal{A} = (Q, \Sigma, \Delta, Q_0, F, \lambda, \iota) where

  • QQ is finite set of states;
  • Σ\Sigma is a finite alphabet;
  • ΔQ×Q\Delta \subseteq Q \times Q is the relation transition;
  • Q0,FQQ_0, F \subseteq Q are the initial and final states;
  • λ ⁣:QΣ\lambda \colon Q \to \Sigma is the state labeling function;
  • ι\iota is the time labeling function that maps each state to a rational-bounded interval.

An execution is a sequence (q1,δ1)(qn,δn)(q_1, \delta_1) \cdots (q_n, \delta_n) such that δiι(qi)\delta_i \in \iota(q_i) for all i[1..n]i \in [1..n], and (qi,qi+1)Δ(q_i, q_{i+1}) \in \Delta for all i[1..n1]i \in [1..n-1]. If q1Q0q_1 \in Q_0 and qnFq_n \in F, then we say that the execution accepts the following signal:

λ(q1)δ1λ(qn)δn. \lambda(q_1)^{\delta_1} \cdots \lambda(q_n)^{\delta_n}.

An execution is stuttering if λ(qi)=λ(qi+1)\lambda(q_i) = \lambda(q_{i+1}) for some i[1..n1]i \in [1..n-1]. Note that an execution accepting a signal ww is non-stuttering iff #w=n\# w = n.

For example, the language a(0,)(b(0,2]a(0,1])\llbracket a_{(0, \infty)} \cdot (b_{(0, 2]} \cdot a_{(0, 1]})^* \rrbracket is accepted by this real-time automaton:

Real-timed automata and real-time regular expressions have the same expressive power.

Stuttering and determinism

We say that an automaton is stuttering-free if

  • 0ι(qε)0 \in \iota(q_\varepsilon) for at most one state qεq_\varepsilon, in which case we have qεQ0q_\varepsilon \in Q_0, ι(qε)=[0,0]\iota(q_\varepsilon) = [0, 0] and qεq_\varepsilon is not connected to any other state;

  • λ(p)λ(q)\lambda(p) \neq \lambda(q) for each (p,q)Δ(p, q) \in \Delta.

We say that an automaton is state-deterministic if the following holds for all states qrq \neq r:

  • q,rQ0q, r \in Q_0 implies λ(q)λ(r)\lambda(q) \neq \lambda(r) or ι(q)ι(r)=\iota(q) \cap \iota(r) = \emptyset;

  • (p,q),(p,r)Δ(p, q), (p, r) \in \Delta implies λ(p)λ(q)\lambda(p) \neq \lambda(q) or ι(p)ι(q)=\iota(p) \cap \iota(q) = \emptyset.

An automaton is deterministic if it is both stuttering-free and state-deterministic. For example, the automaton depicted above is deterministic.

If we see (discrete) words as signals with integer durations, then any standard regular language can be accepted by a real-time automaton. For example, the language {ε,a,aa,}\{\varepsilon, a, aa, \ldots\} can be seen as the signal language LN={a(0,n]:nN}L_\N = \{a \mapsto (0, n] : n \in \N\} which equals (a[1,1])\llbracket (a_{[1, 1]})^* \rrbracket. However, determinism does not suffice.

No stutter-free real-time automaton accepts the signal language LNL_\N.

Augmented automata

A Kleene constraint is a duration constraint obtained by combining rational-bounded intervals with union Minkowski sum, star and complementation. For example, the subset of R0\R_{\geq 0} associated with the Kleene constraint (0,1]([2,4)+[5,5])(0, 1] \cup ([2, 4) + [5, 5]^*) is

(0,1][2,4)[7,9)[12,14). (0, 1] \cup [2, 4) \cup [7, 9) \cup [12, 14) \cup \cdots.

An augmented real-time automaton is a real-time automaton where ι(q)\iota(q) is a Kleene constraint rather than a single interval. For example, it is now trivial to provide a deterministic automaton for LNL_{\N}:

In fact, augmented real-time automata allow us to retrieve good properties of classical automata.

Augmented real-time automata are closed under complementation.
The emptiness and universality problems are decidable for real-time regular languages.
The following models have the same expressiveness: real-timed automata, augmented real-timed automata, and deterministic real-timed automata.

Let LS(Σ)L \subseteq \mathcal{S}(\Sigma) be accepted by a real-time automaton. There exists nN1n \in \N_{\geq 1} such that for all wS(Σ)w \in \mathcal{S}(\Sigma) with #wn\# w \geq n, there exist x,y,zS(Σ)x, y, z \in \mathcal{S}(\Sigma) such that

  • w=xyzw = xyz,
  • #y>1\# y > 1,
  • xykzLx y^k z \in L for all kNk \in \N.

From the above pumping lemma, we can show, e.g., that the language L(0,1]={wS({a,b}):0<w1}L_{(0, 1]} = \{w \in \mathcal{S}(\{a, b\}) : 0 < |w| \leq 1\} is not real-time regular. Indeed, for the sake of contradiction suppose L(0,1]L_{(0, 1]} is real-time regular. Let nNn \in \N be the threshold given by the pumping lemma. Let w=(a1/2nb1/2n)nw = (a^{1/2n} b^{1/2n})^n. We have w=1|w| = 1 and hence wLw \in L. Moreover, #w=2n\# w = 2n, and hence we can obtain w=xyzw = xyz from the pumping lemma. Consequently, we have xy2zL(0,1]xy^2z \in L_{(0, 1]}, which is a contradiction since y>0|y| > 0 by #y>1\# y > 1.

Timed regular languages

A shortcoming of real-time regular expressions is that they can only locally constrain the duration of letters. Thus, let us enrich them. A generalized timed regular expression is an expression derived from the following grammar, where σ\sigma ranges over Σ\Sigma, and II ranges over integer-bounded intervals:

r::=εσrIrrr(rr)(rr) r ::= \emptyset \mid \varepsilon \mid \underline{\sigma} \mid \langle r \rangle_I \mid r \cdot r \mid r^* \mid (r \lor r) \mid (r \land r)

The semantics of generalized timed regular expressions is defined inductively by

=,ε={ε},σ={(0,δ]σ:δR>0},rI={wr:wI},rr={ww:wr,wr},r=r0r1r2,rr=rr,rr=rr. \begin{aligned} \llbracket \emptyset \rrbracket &= \emptyset, \\ % \llbracket \varepsilon \rrbracket &= \{\varepsilon\}, \\ % \llbracket \underline{\sigma} \rrbracket &= \{(0, \delta] \mapsto \sigma : \delta \in \R_{> 0}\}, \\ % \llbracket \langle r \rangle_I \rrbracket &= \{w \in \llbracket r \rrbracket : |w| \in I\}, \\ % \llbracket r \cdot r' \rrbracket &= \{w \cdot w' : w \in \llbracket r \rrbracket, w' \in \llbracket r' \rrbracket\}, \\ % \llbracket r^* \rrbracket &= \llbracket r^0 \rrbracket \cup \llbracket r^1 \rrbracket \cup \llbracket r^2 \rrbracket \cup \cdots, \\ % \llbracket r \lor r' \rrbracket &= \llbracket r \rrbracket \cup \llbracket r' \rrbracket, \\ % \llbracket r \land r' \rrbracket &= \llbracket r \rrbracket \cap \llbracket r' \rrbracket. \end{aligned}

When an expression has no conjunction, we say that it is a timed regular expression. For example, the timed regular expression a(0,2](bc(0,8])\langle \underline{a}\rangle_{(0, 2]} (\langle \underline{b} \cdot \underline{c} \rangle_{(0, 8]})^* describes the language of signals of the form aαbβ1cγ1bβkcγka^\alpha b^{\beta_1} c^{\gamma_1} \cdots b^{\beta_k} c^{\gamma_k} where 0<α20 < \alpha \leq 2, kNk \in \N and 0<βi+γi80 < \beta_i + \gamma_i \leq 8 for all i[1..k]i \in [1..k]. The language L(0,1]L_{(0, 1]} seen earlier can be expressed by the timed regular expression (ab)(0,1]\langle(\underline{a} \lor \underline{b})^*\rangle_{(0, 1]}.

Timed regular expressions are less expressive than generalized timed regular expressions. Indeed, it is known that L={aδb1δcδ:δ(0,1)}L = \{a^\delta b^{1 - \delta} c^\delta : \delta \in (0, 1)\} cannot be described by a timed regular expression3. Yet, it is described by the generalized timed regular expression r=(ab[1,1]c)(abc[1,1])r = (\langle \underline{a} \cdot \underline{b}\rangle_{[1, 1]} \cdot \underline{c}) \land (\underline{a} \cdot \langle \underline{b} \cdot \underline{c}\rangle_{[1, 1]}).

Click for a proof.

\Rightarrow) Let w=aδb1δcδw = a^\delta b^{1 - \delta} c^\delta where δ(0,1)\delta \in (0, 1). We have aδb1δ=δ+(1δ)=1|a^\delta b^{1-\delta}| = \delta + (1 - \delta) = 1 and b1δcδ=(1δ)+δ=1|b^{1-\delta} c^\delta| = (1 - \delta) + \delta = 1. Therefore, wrw \in \llbracket r \rrbracket.

\Leftarrow) Let wrw \in \llbracket r \rrbracket. By definition, there exist x,y,z,x,y,zR>0x, y, z, x', y', z' \in \R_{>0} such that w=axbyczw = a^x b^y c^z, x+y=1x + y = 1, w=axbyczw = a^{x'} b^{y'} c^{z'} and y+z=1y' + z' = 1. Since all letters are distinct, we must have x=xx = x', y=yy = y' and z=zz = z'. Thus, 1x=y=y=1z=1z1 - x = y = y' = 1 - z' = 1 - z and hence x=zx = z. This means that w=axb1xcxLw = a^x b^{1 - x} c^x \in L.

Timed automata

Timed automata are a well-established formalism for describing languages of timed words. Let us explain, through an example, how they can also describe signal languages. Let us reconsider the timed regular expression r=a(0,2](bc(0,8])r = \langle \underline{a}\rangle_{(0, 2]} (\langle \underline{b} \cdot \underline{c} \rangle_{(0, 8]})^*. The language of rr is described by this timed automaton:

We start in the initial state q0q_0 at time 00. From there, time elapses at the same rate in clocks xx and yy. A transition labeled by “φ;op\varphi; \texttt{op}” can be taken if constraint φ\varphi holds in the current clock valuation, upon which op\texttt{op} is executed. Formally, φ\varphi is a Boolean combination of integer-bounded interval constraints over clocks, and the operation resets a (possibly empty) subset of the clocks.

Consider an execution

(q0,x0,y0)δ0(q1,x1,y1)δ1δn1(qn,xn,yn), (q_0, x_0, y_0) \to^{\delta_0} (q_1, x_1, y_1) \to^{\delta_1} \cdots \to^{\delta_{n-1}} (q_n, x_n, y_n),

where δi\delta_i is the time elapsed in qiq_i, and (xi,yi)(x_i, y_i) is the value of the two clocks. Let λ ⁣:QΣ\lambda \colon Q \to \Sigma label states. If x0=y0=0x_0 = y_0 = 0 and state qnq_n is accepting, then the automaton accepts the signal fS(Σ)f \in \mathcal{S}(\Sigma) such that

f=λ(q0)δ0λ(qn1)δn1. f = \lambda(q_0)^{\delta_0} \cdots \lambda(q_{n-1})^{\delta_{n-1}}.

Recall the classical translation from standard regular expressions to non-deterministic finite automata. A similar inductive translation holds in the timed setting. For example, for rI\langle r \rangle_I, we construct a timed automaton for rr; add a new clock zz; and add the constraint “zIz \in I” to each transition leading to an accepting state.

Any generalized timed regular language is accepted by a timed automaton.

It can further be shown that (i) the signal language of a one-clock timed automaton is expressible by a timed regular expression; and (ii) the signal language LL of a timed automaton can be written as the renaming of the intersection of the language of finitely many one-clock timed automata. By “renaming”, we mean that LS(Σ)L \subseteq \mathcal{S}(\Sigma) equals h(L)h(L'), where LS(Σ)L' \subseteq \mathcal{S}(\Sigma') and h ⁣:ΣΣh \colon \Sigma' \to \Sigma is a letter-to-letter morphism. Altogether, we obtain the following characterization for signal languages:

A signal language is generalized timed regular iff it is the renaming of the language of a timed automaton.

The same theorem holds for timed words. Herrmann proved that renaming is necessary in that setting, and claimed that the proof can be adapted to signals4.

Relation with real-time regular languages

Real-time regular languages form a strict subset of languages recognized by one-clock timed automata. Indeed, a real-timed automaton can be seen as a one-clock timed automaton where the single clock is reset at each transition. In particular, the language L(0,1]={wS({a,b}):0<w1}L_{(0, 1]} = \{w \in \mathcal{S}(\{a, b\}) : 0 < |w| \leq 1\}, which is not real-time regular, can be recognized by a one-clock timed automaton: