In theoretical computer science and formal language theory, a regular tree grammar (RTG)^{[1]} is a formal grammar that describes a set of directed trees, or terms. A regular word grammar can be seen as a special kind of regular tree grammar, describing a set of singlepath trees.
Contents

Definition 1

Derivation of trees 2

Examples 3

Language properties 4

Alternative characterizations and relation to other formal languages 5

See also 6

References 7

External links 8
Definition
A regular tree grammar G is defined by the tuple
G = (N, Σ, Z, P),
where

N is a set of nonterminals,

Σ is a ranked alphabet (i.e., an alphabet whose symbols have an associated arity) disjoint from N,

Z is the starting nonterminal, with Z ∈ N, and

P is a set of productions of the form A → t, with A ∈ N, and t ∈ T_{Σ}(N), where T_{Σ}(N) is the associated term algebra, i.e. the set of all trees composed from symbols in Σ ∪ N according to their arities, where nonterminals are considered nullary.
Derivation of trees
The grammar G implicitly defines a set of trees: any tree that can be derived from Z using the rule set P is said to be described by G. This set of trees is known as the language of G. More formally, the relation ⇒_{G} on the set T_{Σ}(N) is defined as follows:
A tree t_{1}∈ T_{Σ}(N) can be derived in a single step into a tree t_{2} ∈ T_{Σ}(N) (in short: t_{1} ⇒_{G} t_{2}), if there is a context S and a production (A→t) ∈ P such that:

t_{1} = S[A], and

t_{2} = S[t].
Here, a context means a tree with exactly one hole in it; if S is such a context, S[t] denotes the result of filling the tree t into the hole of S.
The tree language generated by G is the language L(G) = { t ∈ T_{Σ}  Z ⇒_{G*} t }.
Here, T_{Σ} denotes the set of all trees composed from symbols of Σ, while ⇒_{G*} denotes successive applications of ⇒_{G}.
A language generated by some regular tree grammar is called a regular tree language.
Examples
Example
derivation tree from G
_{1} in linear (upper left table) and graphical (main picture) notation
Let G_{1} = (N_{1},Σ_{1},Z_{1},P_{1}), where

N_{1} = {Bool, BList } is our set of nonterminals,

Σ_{1} = { true, false, nil, cons(.,.) } is our ranked alphabet, arities indicated by dummy arguments (i.e. the symbol cons has arity 2),

Z_{1} = BList is our starting nonterminal, and

the set P_{1} consists of the following productions:

Bool → false

Bool → true

BList → nil

BList → cons(Bool,BList)
An example derivation from the grammar G_{1} is
BList ⇒ cons(Bool,BList) ⇒ cons(false,cons(Bool,BList)) ⇒ cons(false,cons(true,nil)).
The image shows the corresponding derivation tree; it is a tree of trees (main picture), whereas a derivation tree in word grammars is a tree of strings (upper left table).
The tree language generated by G_{1} is the set of all finite lists of boolean values, that is, L(G_{1}) happens to equal T_{Σ1}. The grammar G_{1} corresponds to the algebraic data type declarations
datatype Bool
= false
 true
datatype BList
= nil
 cons of Bool * BList
in the Standard ML programming language: every member of L(G_{1}) corresponds to a StandardML value of type BList.
For another example, let G_{2} = (N_{1},Σ_{1},BList1,P_{1} ∪ P_{2}), using the nonterminal set and the alphabet from above, but extending the production set by P_{2}, consisting of the following productions:

BList1 → cons(true,BList)

BList1 → cons(false,BList1)
The language L(G_{2}) is the set of all finite lists of boolean values that contain true at least once. The set L(G_{2}) has no datatype counterpart in Standard ML, nor in any other functional language. It is a proper subset of L(G_{1}). The above example term happens to be in L(G_{2}), too, as the following derivation shows:
BList1 ⇒ cons(false,BList1) ⇒ cons(false,cons(true,BList)) ⇒ cons(false,cons(true,nil)).
Language properties
If L_{1}, L_{2} both are regular tree languages, then the tree sets L_{1} ∩ L_{2}, L_{1} ∪ L_{2}, and L_{1} \ L_{2} are also regular tree languages, and it is decidable whether L_{1} ⊆ L_{2}, and whether L_{1} = L_{2}.
Alternative characterizations and relation to other formal languages
Rajeev Alur and Parthasarathy Madhusudan^{[2]}^{[3]} related a subclass of regular binary tree languages to nested words and visibly pushdown languages.
The regular tree languages are also^{[4]} the languages recognized by bottomup tree automata and nondeterministic topdown tree automata.
Regular tree grammars are a generalization of regular word grammars.
See also

Set constraint — a generalization of regular tree grammars

Treeadjoining grammar

A book devoted to tree grammars: Nivat, Maurice; Podelski, Andreas (1992). Tree Automata and Languages. Studies in Computer Science and Artificial Intelligence 10. NorthHolland.

Regular tree grammars were already described in 1968 by:

Brainerd, W.S. (1968). "The Minimalization of Tree Automata" (pdf). Information and Control 13: 484–491.

Thatcher, J.W.; Wright, J.B. (1968). "Generalized Finite Automata Theory with an Application to a Decision Problem of SecondOrder Logic". Mathematical Systems Theory 2 (1).

Applications of regular tree grammars include:


A

Solving

The set of all truths expressible in

Algorithms on regular tree grammars are discussed from an efficiencyoriented view in: Aiken, A.; Murphy, B. (1991). "Implementing Regular Tree Expressions". ACM Conference on Functional Programming Languages and Computer Architecture. pp. 427–447.

Given a mapping from trees to weights, Knuth's generalization of . Based on this information, it is straightforward to enumerate its language in increasing weight order. In particular, any nonterminal with infinite minimum weight produces the empty language.

Regular tree automata have been generalized to admit equality tests between sibling nodes in trees: Bogaert, B.; Tison, Sophie (1992). "Equality and Disequality Constraints on Direct Subterms in Tree Automata". Proc. 9th STACS. LNCS 577. Springer. pp. 161–172.

Allowing equality tests between deeper nodes leads to undecidibility: Tommasi, M. (1991), Automates d'Arbres avec Tests d'Égalités entre Cousins Germains, LIFLIT
References

^ "Regular tree grammars as a formalism for scope underspecification".

^ Alur, R.; Madhusudan, P. (2004). "Visibly pushdown languages" (PDF). Proceedings of the thirtysixth annual ACM symposium on Theory of computing  STOC '04. pp. 202–211. Sect.4, Theorem 5,

^ Alur, R.; Madhusudan, P. (2009). "Adding nesting structure to words" (PDF). Journal of the ACM 56 (3): 1–43. Sect.7

^ Comon et al, Tree Automata Techniques and Applications, 1997
External links

Tree Automata Techniques and Applications




Each category of languages, except those marked by a ^{*}, is a proper subset of the category directly above it. Any language in each category is generated by a grammar and by an automaton in the category in the same line.


This article was sourced from Creative Commons AttributionShareAlike License; additional terms may apply. World Heritage Encyclopedia content is assembled from numerous content providers, Open Access Publishing, and in compliance with The Fair Access to Science and Technology Research Act (FASTR), Wikimedia Foundation, Inc., Public Library of Science, The Encyclopedia of Life, Open Book Publishers (OBP), PubMed, U.S. National Library of Medicine, National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health (NIH), U.S. Department of Health & Human Services, and USA.gov, which sources content from all federal, state, local, tribal, and territorial government publication portals (.gov, .mil, .edu). Funding for USA.gov and content contributors is made possible from the U.S. Congress, EGovernment Act of 2002.
Crowd sourced content that is contributed to World Heritage Encyclopedia is peer reviewed and edited by our editorial staff to ensure quality scholarly research articles.
By using this site, you agree to the Terms of Use and Privacy Policy. World Heritage Encyclopedia™ is a registered trademark of the World Public Library Association, a nonprofit organization.