A (finite piecewise-constant dense-time) signal over alphabet
Σ is a left-continuous function f:(0,δ]→Σ with δ∈R≥0 and having finitely many
discontinuities. The duration δ of f is denoted ∣f∣.
Every signal can be summarized by a unique sequence seq(f)=(σ1,δ1)⋯(σk,δk)∈(Σ×R>0)∗, where σi=σi+1 for all i<k, which we write more compactly as σ1δ1⋯σkδk.
For example, a1.5b1a2.5 represents the signal f:(0,5]→Σ given by
(0,1.5]↦a, (1.5,2.5]↦b, (2.5,5]↦a.
This signal can be visualized as follows:
Given a signal f with seq(f)=σ1δ1⋯σkδk, the a-duration of f is ∣f∣a=∑i:σi=aδi. Moreover, we define the number of pieces
of f as #f=k. For our previous example, we have ∣f∣a=4,
∣f∣b=1 and #f=3.
Concatenation
We write S(Σ) to denote the set of all signals, and
we write ε to denote the unique signal of duration
0. Given f,g∈S(Σ), we write f⋅g to
denote the signal h:(0,∣f∣+∣g∣] given by
h(τ)={f(τ)g(τ−∣f∣)if τ≤∣f∣,otherwise.Note that seq(f⋅g) differs from seq(f)⋅seq(g) when the last piece of f and the first piece
of g output the same letter. For example, if seq(f)=a1.5b1a2.5 and seq(g)=a0.5b2, then
seq(f⋅g)=a1.5b1a3b2:
In general, we have
- ∣f⋅g∣=∣f∣+∣g∣,
- ∣f⋅g∣σ=∣f∣σ+∣g∣σ,
- #f+#g−1≤#(f⋅g)≤#f+#g.
Algebraic view and relation with words
The set of signals S(Σ) equipped with ⋅ forms a
monoid with the empty signal
ε as the identity. We consider R≥0 as a monoid
under addition with 0 as the identity.
Let ≡ be the relation over (Σ×R≥0)∗
defined by u≡v iff u and v are equal after reduction
using the rewriting rules (σ,δ)(σ,δ′)→(σ,δ+δ′) and (σ,0)→ε. For
example, a0.5a1b0.5b0.5a1≡a1.5b1a1:
Note that ≡ is a
congruence and
that S(Σ)≅(Σ×R≥0)∗/≡ due to the isomorphism f↦[seq(f)]. In more algebraic terms, the monoid
S(Σ) can be seen as the free
product R≥0∗⋯∗R≥0 (with ∣Σ∣ copies).
It is the case that Σ∗≅(Σ×N)∗/≡ due
to the isomorphism σ1⋯σn↦[σ11⋯σn1]. Furthermore, concatenation and durations behave
as expected for signals representing standard finite words. For
example, consider the signal f∈S(Σ) with
seq(f)=a1b1a2. It corresponds to the word w=abaa∈Σ∗. We have ∣f∣=∣w∣=4, ∣f∣a=∣w∣a=3, ∣f∣b=∣w∣b=1, and f(i)=w(i) for each i∈[1..4].
Yet another way to present Σ∗ is by considering words as
finite discrete-time signals, e.g. w=abaa can be seen as the
signal f:[1..4]→Σ given by [1..2]↦a,
[3..3]↦b and [4..4]↦a.
Note that Σ∗ is not isomorphic to S(Σ), even
if we used rational durations or changed the alphabet.
Click for a proof.
For the sake of contradiction, suppose there exists an isomorphism f:S(Σ)→Γ∗. Let f−1(a)=σ1δ1⋯σkδk be in reduced form,
i.e. with σi=σi+1 and δi>0 for each
i. We have
a=f(f−1(a))=f(σ1δ1⋯σkδk)=f(σ1δ1)⋯f(σkδk).Thus, all terms but one are equal to ε, i.e. a=f(σiδi) for some i∈[1..k]. Note that
f(σiδi/2)⋅f(σiδi/2)=f(σiδi/2⋅σiδi/2)=f(σiδi)=a.
Consequently, ww=a for some w∈Γ∗, which is impossible.
Relation with timed words
Timed words are similar
to signals and have been extensively studied by the formal
verification community. They can also be seen as sequences from (Σ×R≥0)∗, but where zero durations matter. For example,
the timed word a1b0a0a1.5b0.5 can be interpreted as
follows:
- After one unit of time, event aba occurred,
- After another 1.5 unit of time, event a occurred,
- After another 0.5 unit of time, event b occurred.
In that context, a1b0a0a1.5b0.5 is not equivalent to
a2.5b0.5. In fact, timed words are usually defined as
sequences (σ1,τ1)⋯(σk,τk)∈(Σ×R≥0)∗ where τ1≤⋯≤τk are
timestamps rather than durations. In that setting, our previous
example is written as (a,1)(b,1)(a,1)(a,2.5)(b,3). For a
more formal algebraic treatment of timed words and their relation with
signals, see the work of Asarin, Caspi and Maler.
Real-time regular languages
A subset L⊆S(Σ) is called a (signal)
language. Let us introduce a counterpart to regular languages. A
real-time regular expression is obtained from this grammar, where
σ ranges over Σ, and I ranges over rational-bounded
intervals:
r::=∅∣ε∣σI∣r+r∣r⋅r∣r∗For example, a(0,∞)⋅(b(0,2]⋅a(0,1])∗
denotes the language of signals of the form aδbβ1aα1⋯bβkaαk where k∈N,
δ∈R>0, αi∈(0,1] and βi∈(0,2]. Formally, the semantics is defined inductively by
[[∅]][[ε]][[σI]][[r+r′]][[r⋅r′]][[r∗]]=∅,={ε},={(0,δ]↦σ:δ∈R>0},=[[r]]∪[[r′]],={w⋅w′:w∈[[r]],w′∈[[r′]]},=[[r0]]∪[[r1]]∪[[r2]]∪⋯.A real-time regular language is a signal language L⊆S(Σ) such that L=[[r]] for some
real-time regular expression. As in the classical setting, we can
further provide an automaton model for these languages.
A real-time automaton is a tuple A=(Q,Σ,Δ,Q0,F,λ,ι) where
- Q is finite set of states;
- Σ is a finite alphabet;
- Δ⊆Q×Q is the relation transition;
- Q0,F⊆Q are the initial and final states;
- λ:Q→Σ is the state labeling function;
- ι is the time labeling function that maps each state to a
rational-bounded interval.
An execution is a sequence (q1,δ1)⋯(qn,δn)
such that δi∈ι(qi) for all i∈[1..n], and (qi,qi+1)∈Δ for all i∈[1..n−1]. If q1∈Q0 and
qn∈F, then we say that the execution accepts the following
signal:
λ(q1)δ1⋯λ(qn)δn.An execution is stuttering if λ(qi)=λ(qi+1) for
some i∈[1..n−1]. Note that an execution accepting a signal w
is non-stuttering iff #w=n.
For example, the language [[a(0,∞)⋅(b(0,2]⋅a(0,1])∗]] is accepted by this real-time
automaton:
Real-timed automata and real-time regular expressions have the same
expressive power.
Stuttering and determinism
We say that an automaton is stuttering-free if
-
0∈ι(qε) for at most one state qε,
in which case we have qε∈Q0, ι(qε)=[0,0] and qε is not connected to any other state;
-
λ(p)=λ(q) for each (p,q)∈Δ.
We say that an automaton is state-deterministic if the following
holds for all states q=r:
-
q,r∈Q0 implies λ(q)=λ(r) or ι(q)∩ι(r)=∅;
-
(p,q),(p,r)∈Δ implies λ(p)=λ(q) or
ι(p)∩ι(q)=∅.
An automaton is deterministic if it is both stuttering-free and
state-deterministic. For example, the automaton depicted above is
deterministic.
If we see (discrete) words as signals with integer durations, then any
standard regular language can be accepted by a real-time
automaton. For example, the language {ε,a,aa,…}
can be seen as the signal language LN={a↦(0,n]:n∈N} which equals [[(a[1,1])∗]]. However,
determinism does not suffice.
No stutter-free real-time automaton accepts the signal language
LN.
Augmented automata
A Kleene constraint is a duration constraint obtained by combining
rational-bounded intervals with union Minkowski
sum, star and
complementation. For example, the subset of R≥0 associated
with the Kleene constraint (0,1]∪([2,4)+[5,5]∗) is
(0,1]∪[2,4)∪[7,9)∪[12,14)∪⋯.An augmented real-time automaton is a real-time automaton where
ι(q) is a Kleene constraint rather than a single interval. For
example, it is now trivial to provide a deterministic automaton for
LN:
In fact, augmented real-time automata allow us to retrieve good
properties of classical automata.
Augmented real-time automata are closed under complementation.
The emptiness and universality problems are decidable for real-time
regular languages.
The following models have the same expressiveness: real-timed
automata, augmented real-timed automata, and deterministic real-timed
automata.
Let L⊆S(Σ) be accepted by a real-time
automaton. There exists n∈N≥1 such that for all w∈S(Σ) with #w≥n, there exist x,y,z∈S(Σ) such that
- w=xyz,
- #y>1,
- xykz∈L for all k∈N.
From the above pumping lemma, we can show, e.g., that the language
L(0,1]={w∈S({a,b}):0<∣w∣≤1} is not
real-time regular. Indeed, for the sake of contradiction suppose
L(0,1] is real-time regular. Let n∈N be the threshold
given by the pumping lemma. Let w=(a1/2nb1/2n)n. We have
∣w∣=1 and hence w∈L. Moreover, #w=2n, and hence we can
obtain w=xyz from the pumping lemma. Consequently, we have xy2z∈L(0,1], which is a contradiction since ∣y∣>0 by #y>1.
Timed regular languages
A shortcoming of real-time regular expressions is that they can only
locally constrain the duration of letters. Thus, let us enrich them. A
generalized timed regular expression is an expression derived from
the following grammar, where σ ranges over Σ, and I
ranges over integer-bounded intervals:
r::=∅∣ε∣σ∣⟨r⟩I∣r⋅r∣r∗∣(r∨r)∣(r∧r)The semantics of generalized timed regular expressions is defined
inductively by
[[∅]][[ε]][[σ]][[⟨r⟩I]][[r⋅r′]][[r∗]][[r∨r′]][[r∧r′]]=∅,={ε},={(0,δ]↦σ:δ∈R>0},={w∈[[r]]:∣w∣∈I},={w⋅w′:w∈[[r]],w′∈[[r′]]},=[[r0]]∪[[r1]]∪[[r2]]∪⋯,=[[r]]∪[[r′]],=[[r]]∩[[r′]].When an expression has no conjunction, we say that it is a timed
regular expression. For example, the timed regular expression
⟨a⟩(0,2](⟨b⋅c⟩(0,8])∗ describes the language of signals
of the form aαbβ1cγ1⋯bβkcγk where 0<α≤2, k∈N and 0<βi+γi≤8 for all i∈[1..k]. The language L(0,1]
seen earlier can be expressed by the timed regular expression
⟨(a∨b)∗⟩(0,1].
Timed regular expressions are less expressive than generalized timed
regular expressions. Indeed, it is known that L={aδb1−δcδ:δ∈(0,1)} cannot be described by a timed
regular expression. Yet, it is described by the generalized timed
regular expression r=(⟨a⋅b⟩[1,1]⋅c)∧(a⋅⟨b⋅c⟩[1,1]).
Click for a proof.
⇒) Let w=aδb1−δcδ where
δ∈(0,1). We have ∣aδb1−δ∣=δ+(1−δ)=1 and ∣b1−δcδ∣=(1−δ)+δ=1. Therefore, w∈[[r]].
⇐) Let w∈[[r]]. By definition,
there exist x,y,z,x′,y′,z′∈R>0 such that w=axbycz, x+y=1, w=ax′by′cz′ and y′+z′=1. Since
all letters are distinct, we must have x=x′, y=y′ and z=z′. Thus, 1−x=y=y′=1−z′=1−z and hence x=z. This
means that w=axb1−xcx∈L.
Timed automata
Timed automata are
a well-established formalism for describing languages of timed
words. Let us explain, through an example, how they can also describe
signal languages. Let us reconsider the timed regular expression r=⟨a⟩(0,2](⟨b⋅c⟩(0,8])∗. The language of r is described
by this timed automaton:
We start in the initial state q0 at time 0. From there, time
elapses at the same rate in clocks x and y. A transition labeled
by “φ;op” can be taken if constraint φ holds
in the current clock valuation, upon which op is
executed. Formally, φ is a Boolean combination of
integer-bounded interval constraints over clocks, and the operation
resets a (possibly empty) subset of the clocks.
Consider an execution
(q0,x0,y0)→δ0(q1,x1,y1)→δ1⋯→δn−1(qn,xn,yn), where δi is the time elapsed in qi, and (xi,yi) is
the value of the two clocks. Let λ:Q→Σ label
states. If x0=y0=0 and state qn is accepting, then the
automaton accepts the signal f∈S(Σ) such that
f=λ(q0)δ0⋯λ(qn−1)δn−1.Recall the classical translation from standard regular expressions to
non-deterministic finite automata. A similar inductive translation
holds in the timed setting. For example, for ⟨r⟩I, we
construct a timed automaton for r; add a new clock z; and add the
constraint “z∈I” to each transition leading to an accepting
state.
Any generalized timed regular language is accepted by a timed automaton.
It can further be shown that (i) the signal language of a one-clock
timed automaton is expressible by a timed regular expression; and
(ii) the signal language L of a timed automaton can be written as
the renaming of the intersection of the language of finitely many
one-clock timed automata. By “renaming”, we mean that L⊆S(Σ) equals h(L′), where L′⊆S(Σ′) and h:Σ′→Σ is a
letter-to-letter morphism. Altogether, we obtain the following
characterization for signal languages:
A signal language is generalized timed regular iff it is the renaming of the language of a timed automaton.
The same theorem holds for timed words. Herrmann proved that renaming
is necessary in that setting, and claimed that the proof can
be adapted to signals.
Relation with real-time regular languages
Real-time regular languages form a strict subset of languages
recognized by one-clock timed automata. Indeed, a real-timed automaton
can be seen as a one-clock timed automaton where the single clock is
reset at each transition. In particular, the language L(0,1]={w∈S({a,b}):0<∣w∣≤1}, which is not
real-time regular, can be recognized by a one-clock timed automaton: