Defining mtts
An mtt is defined in a plain text file looking
almost as described in the definition. The following example illustrates
how to define the rules:
q_init(Doc(r,l)) -> Doc(q_inbx(r),e())
q_inbx(Inbox(r,l)) -> Inbox(q_mail(r),q_trsh1(l,q_spam(r)))
q_trsh1(Trash(r,l),y) -> Trash(q_trsh2(r,y),e())
q_mail(Mail(r,l)) -> Mail(e(),q_mail(l))
q_mail(Spam(r,l)) -> q_mail(l)
q_mail(e()) -> e()
q_spam(Mail(r,l)) -> q_spam(l)
q_spam(Spam(r,l)) -> Spam(e(),q_spam(l))
q_spam(e()) -> e()
q_trsh2(Mail(r,l),y) -> Mail(e(),q_trsh2(l,y))
q_trsh2(Spam(r,l),y) -> Spam(e(),q_trsh2(l,y))
q_trsh2(e(),y) -> y
Each line defines the transformation rule for exactly one function.
The first line, for example, defines a rule for function q_init.
Since it is the first rule, q_init is the initial function
of the defined mtt, meaning that transforming the input starts at its
root node with this function.
The left-hand side of a rule defines at which labels the rule
can be applied and the number and names of the additional parameters.
Function q_trsh1 in the third rule has one parameter named
y. The matching label can be seen in the first argument
of each function, i.e., the initial rule transformes the root node
of the input, if it is labled with Doc. The variables
r and l denote the sons of the current node.
A right-hand side describes how to transform the current node and
how to continue the transformation. It is assembled of constant
parts like Doc or e in the first rule and recursive
function calls like q_inbx(r). A function is always called
for one of the input variables of the left-hand side, here r.
If the function has additonal parameters, their positions can
also be filled with constant output part or recursive calls to other
functions. This can be seen in the second rule, where q_trsh1
is called for the left child of the current node and the parameter
contains the call to function q_spam.
|
Defining TL programs
A TL program is defined in a plain text file like the following one:
q("Doc") -> Doc(q_inbx("Inbox"),q_trsh("Trash",p_inbx("Inbox")));
q_inbx("Inbox") -> Inbox(q1("Mail"));
p_inbx("Inbox") -> p1("Spam/Mail");
q1("Mail") -> Mail();
p1("Mail") -> Mail();
This example defines the same transformation as the mtt above and
illustrates in its shortness, that TL is a very compact language to
define transformations. TL has a different view on its input, because
it can handle an arbitrary number of sons of a node -- whereas mtts
work on ranked trees.
The first line again defines the initial rule, here for function
q. The left-hand side specifies in the first argument
the match pattern, defining at which nodes the rule can be applied.
In our example, q transformes nodes labeled with Doc.
The right-hand sides specify the action part of each rule. The first
rule generates a Doc node in the output, whose content is
generated by the functions q_inbx and q_trsh.
The latter function has one parameter containing the result of a recursive
call to another function. Each recursive call has a select pattern
determining which nodes have to be transformed by this function.
A select pattern is a path from the current node to the nodes that
should be transformed. In the third rule, for example, the select
pattern Spam/Mail specifies, that each Mail node
under a Spam node is transformed by function p1.
This means, the rule selects all mails marked as spam in the
inbox.
|
Defining the output type
The socalled output type defines all unwanted outputs, because the
type checking routine tests whether the intersection of the computed
output type of the transformation with the predefined type is empty.
If the intersection is empty, the transformation generates only
those documents intended by the programmer.
For our mtt example, this type looks as follows:
pM,Mail,pL,pL;
pM,Mail,pL,pM;
Error,Mail,pL,pS;
Error,Mail,pL,Error;
pS,Spam,pL,pL;
pS,Spam,pL,pM;
pS,Spam,pL,pS;
pT,Trash,pL,pL;
pT,Trash,pM,pL;
pT,Trash,pS,pL;
Error,Inbox,pS,pT;
Error,Inbox,Error,pT;
pD,Doc,pI,pL;
Error,Doc,Error,pL;
pL,e;.
Error
The definition is partitioned into two parts: a rule set and after
the . the acepting states.
Each line of the rule set defines a transition of the automaton
accepting the set of documents. The first line, for example,
has to be read in the following way:
The automaton changes into state pM if the automaton is in state
pL for the left son and in state pM for the right son,
and the current node is labeled with Mail.
This is a straight forward notation for the transitions of a tree
automaton.
|
Type checking a program or mtt
If one wants to check the above mtt against the defined type, one
has to type
mttchk -top -mtt remove_spam.mtt -out mailbox.dfa
where the -mtt option defines the mtt and -out the
defined type.
Analogously, the TL program is checked in the following way:
mttchk -top -tl mail.tl -out test.dfa
The suffixes .tl, .mtt, and .dfa help only
to distinguish the different files, but are completely useless for
the program.
back to project page
|
|
|
|