Mixing Language and Grammar Semantics: Unordered "Blocks"

  • Question

  • I think the easiest way to illustrate my problem is a simple example:

    I want to match three keywords, each on its own line. The keywords stand for any complex semantic part of my language, for example variables, methods, properties and comments within a C# class.

    Given the following rules:
    • Each keyword can occur 0 to 1 times
    • At least one keyword is required
    • Ordered output, while input order doesn't matter
    • Empty lines are ignored
    ...the grammar should match all of these examples:

    keyA
    keyB 
    keyC 


    keyC 
     
    keyA 

     
     
    keyB 

    Semantic Model Result
    Independent of the input order, I expect a result Foo with an ordered list of the keywords.
    Foo[
      "keyA"
      "keyB"
      "keyC" 
    ]
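
The four rules above can also be stated outside of MGrammar. The following is only a sketch in Python, not MGrammar: the function name `match_block` and the constant `KEYWORDS` are made up for illustration, and it assumes that a keyword that does not occur is simply left out of the result.

```python
from typing import List

# Assumption: the canonical output order is the order of this tuple.
KEYWORDS = ("keyA", "keyB", "keyC")


def match_block(text: str) -> List[str]:
    """Return the keywords found in `text` in canonical order, or raise ValueError."""
    # Empty lines are ignored; leading and trailing spaces count as whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    seen = set()
    for line in lines:
        if line not in KEYWORDS:
            raise ValueError(f"unexpected input: {line!r}")
        if line in seen:
            # Each keyword can occur 0 to 1 times.
            raise ValueError(f"{line} occurs more than once")
        seen.add(line)
    if not seen:
        # At least one keyword is required.
        raise ValueError("at least one keyword is required")
    # The output order is fixed by KEYWORDS, not by the input order.
    return [key for key in KEYWORDS if key in seen]
```

Both example inputs above would produce the same list under this sketch, because the order is only decided when the result is built.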


    First Draft, Too simple
    language TooSimple {
       syntax Main = a:"keyA"? b:"keyB"? c:"keyC"
           => Foo [a, b, c];
       interleave WhiteSpace = (" " | "\r" | "\n");
    }

    Matching, but too complicated?
    language TooLong { 
     token A = "keyA";
     token B = "keyB";
     token C = "keyC";
      
     syntax Main = Br? v:Any(A, B, C) Br? => v; 
      
     syntax Any(T1, T2, T3) =  
        t:T1 => t | t:T2 => t | t:T3 => t 
        | p:Pair(T1, T2) => p | p:Pair(T2, T3) => p | p:Pair(T1, T3) => p 
        | t:Trio(T1, T2, T3) => t; 
     
     syntax Pair(T1, T2) = t1:T1 Br t2:T2 
        => Foo [t1, t2] 
      | t2:T2 Br t1:T1 
        => Foo [t1, t2]; 
         
     syntax Trio(T1, T2, T3) =  
        t1:T1 Br t2:T2 Br t3:T3 
            => Foo [t1, t2, t3] 
        | t1:T1 Br t3:T3 Br t2:T2 
            => Foo [t1, t2, t3] 
        | t2:T2 Br t1:T1 Br t3:T3 
            => Foo [t1, t2, t3] 
        | t2:T2 Br t3:T3 Br t1:T1 
            => Foo [t1, t2, t3] 
        | t3:T3 Br t1:T1 Br t2:T2 
            => Foo [t1, t2, t3] 
        | t3:T3 Br t2:T2 Br t1:T1 
            => Foo [t1, t2, t3]; 
      
     token Br = ("\r" | "\n")+; 
     interleave WhiteSpace = " ";
    }

    IMHO the problem is that I have to mix my grammar and my "language" in one place. Usually the different blocks would have a similar shape. I could just write an abstract grammar for these shapes instead, but then I would lose the "type safety" in my resulting MGraph.

    What do you think? Is there a better way to solve this problem? In an XML-based DSL, xsd:choice would help out.
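
To put a number on why the TooLong approach does not scale, here is a small Python sketch (not MGrammar; the function name `count_alternatives` is made up for illustration). It counts how many alternatives have to be written out by hand when every non-empty selection of distinct keywords must be listed in every order.

```python
from math import factorial


def count_alternatives(n: int) -> int:
    """Count ordered, non-empty selections without repetition from n keywords."""
    # For each selection size k there are n! / (n - k)! orderings.
    return sum(factorial(n) // factorial(n - k) for k in range(1, n + 1))


for n in (3, 4, 5):
    print(n, count_alternatives(n))
```

For three keywords this gives 15, which matches the 3 + 6 + 6 alternatives spelled out in Any, Pair and Trio. For four keywords it is already 64, and for five it is 325.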

    Wednesday, November 12, 2008 12:11 PM

Answers

  • Hi!  I'm on the dev team for MGrammar and we've been thinking about this scenario, so I'm glad to see a customer hitting it.  We'd like to make this easy to indicate in the language.  Will let you know once we've made progress here.
    • Marked as answer by lcorneliussen Monday, November 17, 2008 10:30 PM
    Monday, November 17, 2008 9:31 PM

All replies

  • Thinking of a DSL that would solve that specific problem, I would like to write:

    syntax Main = {A? | B? | C? delimiter Br}+; 

    • "{}" would indicate an unordered list within my grammar. This way I could also get rid of the recursive definition of lists.
    • A? doesn't mean that the syntax is optional, but that the list element is optional and limited to one occurrence.
    Comments?

    Wednesday, November 12, 2008 1:11 PM
  • Nice! Glad to hear from you! How much of MGrammar is done so far? What's your guess? Is there a roadmap that could be made public?

    Often the structure of my text won't match my semantic model exactly. The projections could offer more control, or there could be a second step: a model-to-model transformation, which could be based on LINQ syntax.
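
As a rough illustration of such a second step, here is a Python sketch (not MGrammar and not LINQ). It assumes the grammar only produces a loose, unordered list of items, and that a separate transformation turns this list into a typed model. The class `Foo`, the mapping `SLOTS` and the function `to_model` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Foo:
    # Hypothetical typed target model: one optional slot per keyword.
    key_a: Optional[str] = None
    key_b: Optional[str] = None
    key_c: Optional[str] = None


# Assumption: maps each keyword in the text to a field of the model.
SLOTS = {"keyA": "key_a", "keyB": "key_b", "keyC": "key_c"}


def to_model(items: List[str]) -> Foo:
    """Second step: turn a loosely parsed, unordered item list into a typed Foo."""
    foo = Foo()
    for item in items:
        slot = SLOTS.get(item)
        if slot is None:
            raise ValueError(f"unknown item: {item!r}")
        if getattr(foo, slot) is not None:
            raise ValueError(f"{item} occurs more than once")
        setattr(foo, slot, item)
    if foo == Foo():
        raise ValueError("at least one keyword is required")
    return foo
```

With this split the grammar could stay a plain list of alternatives, while the "0 to 1 times" and "at least one" rules, and the fixed order of the fields, live in the transformation.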

    What do you think?

    Monday, November 17, 2008 10:42 PM