\input  "supp-pdf"
\input "/usr/local/yacco2/diagrams/o2mac.tex"
\DOCtitle{Lr K Vocabulary}{yacco2\_k\_symbols}
{NS\_yacco2\_k\_symbols}{8}

@i "/usr/local/yacco2/copyright.w"
@** K symbols vocabulary.\fbreak
Ahh, the ``constant grammar'' symbols used throughout all grammars.
Depending on the command line options, \O2 can generate the grammar
and possibly the various flavours of the terminal vocabulary.
Under normal development, the grammar writer
compiles and emits just the grammar.
Flavours of the terminal vocabulary are not usually generated unless
there have been changes
to either {\bf error} or
{\bf terminal}, together with the accompanying command line option.
At the
initial ``big bang'' of bootstrapping \O2's library and compiler / compiler,
both {\bf raw characters} and {\bf lr k} terminals were generated
using their command line options
{/rc} and {/lrk}. The command line
now uses Unix-style options {-t, -err}, and
these 2 terminal types, ``lrk'' and ``raw characters'',
are now cast in cement: you cannot regenerate them
 from their ``*.T'' file definitions, but
you can change their ``big bang'' generated ``c++'' modules.
These terminals are now read-only:
they will never be changed by a user of \O2, and who'd want to anyway?

The hardwired  ``k'' terminals are used by \O2's library
 for internal parsing situations.
Apart from {\bf eog}, which represents
the end-of-grammar and end-of-file conditions,
none of the other definitions is part of the token source
 stream being parsed.
I call them meta terminals,
as they are never in the token stream but
represent internal parsing conditions within the emitted
finite-state table that trigger \O2's library routines.
For example, the presence of \paralleloperator within 
a parse state  
indicates the potential ``to run'' threads.
If you look carefully, their file definitions and
implementations reside in \O2's ``../yacco2/library/grammars'' folder.
Their definition files are ``yacco2\_k\_symbols.T'' and
``yacco2\_characters.T'', with
their ``c++'' variants having the ``.h'' and ``.cpp'' extensions.

@*2 {\bf eog}.\fbreak
Enum: T\_LR1\_eog\_
\fbreak
\line{Class: LR1\_eog \hfil AB: N \hfil AD: N}


 Used to indicate an end-of-grammar or an end-of-file condition.
 When the end of the token container is reached, calls for another terminal
 will always return the |eog|.
 It's your door bouncer before hell.
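
A minimal sketch of that end-of-container behaviour follows. The
``Token\_container'' class, the ``Sym'' structure and the ``eog\_sym'' object
are hypothetical stand-ins invented for illustration; they are not the
library's real container or terminal classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not the yacco2 library's real types.
struct Sym { std::string name; };

static Sym eog_sym{"eog"};  // one shared end-of-grammar terminal

class Token_container {
 public:
  explicit Token_container(std::vector<Sym> toks) : toks_(std::move(toks)) {}

  // Return the terminal at pos; at or past the end, always hand back eog.
  Sym* get(std::size_t pos) {
    if (pos < toks_.size()) return &toks_[pos];
    return &eog_sym;
  }

 private:
  std::vector<Sym> toks_;
};
```

However far past the end the caller asks, it gets the same shared object back, so a parser loop needs no separate bounds test.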
 
\fbreak 
\hrule
@*3 eog user-declaration directive.
@<eog user-declaration directive@>=

    LR1_eog();
  
@*3 eog user-implementation directive.
@<eog user-implementation directive@>=

    LR1_eog::LR1_eog()      
    T_CTOR("eog",T_LR1_eog_,0,false,false)
	{}
    LR1_eog LR1_eog__;
    yacco2::CAbs_lr1_sym* yacco2::PTR_LR1_eog__ = &LR1_eog__;
  
@*2 {\bf eolr}.\fbreak
Enum: T\_LR1\_eolr\_
\fbreak
\line{Class: LR1\_eolr \hfil AB: N \hfil AD: N}


 Used to indicate all-terminals of the 
 terminal vocabulary including itself.
 It saves finger blisters, as one need not spell out
 every terminal in the thread's lookahead expression.
 Dieting has never been this effective against code bloat.
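
The wildcard idea can be sketched as a lookahead test. This is only an
illustration under stated assumptions: the ``Term'' enumeration and the
``in\_lookahead'' function are hypothetical, and the real lookahead sets
emitted by \O2 are not stored as a std::set.

```cpp
#include <cassert>
#include <set>

// Hypothetical terminal ids; the real enumeration is emitted by the compiler.
enum Term { T_eolr = 0, T_a = 1, T_b = 2 };

// A lookahead set that contains eolr accepts every terminal, eolr included.
// Otherwise the terminal must be listed explicitly.
bool in_lookahead(const std::set<int>& la, int tok) {
  if (la.count(T_eolr) != 0) return true;
  return la.count(tok) != 0;
}
```

A set holding only the eolr entry therefore stands in for the whole terminal vocabulary, which is where the savings in the lookahead expression come from.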
 
\fbreak 
\hrule
@*3 eolr user-declaration directive.
@<eolr user-declaration directive@>=

    LR1_eolr();
  
@*3 eolr user-implementation directive.
@<eolr user-implementation directive@>=

    LR1_eolr::LR1_eolr()
      T_CTOR("eolr",T_LR1_eolr_,0,false,false)
    {}
    LR1_eolr LR1_eolr__;
    yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_eolr__ = &LR1_eolr__;
  
@*2 {\bf \ALLshift{}}.\fbreak
Enum: T\_LR1\_all\_shift\_operator\_
\fbreak
\line{Class: LR1\_all\_shift\_operator \hfil AB: N \hfil AD: N}


 Represents the wild token situation.
 Lowers the number of specific shifts in the finite-state table and
 allows the grammar writer to field the unexpected from returned
 threads. Good stuff.

Caveat: One should use the \QUEshift to field unknown returned Ts if
they are to be interpreted as errors.
 
\fbreak 
\hrule
@*3 \ALLshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\ALLshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=

    LR1_all_shift_operator();
  
@*3 \ALLshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\ALLshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=

    LR1_all_shift_operator::LR1_all_shift_operator()
      T_CTOR("|+|",T_LR1_all_shift_operator_,0,false,false)
    {}
    LR1_all_shift_operator LR1_all_shift_operator__;
    yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_all_shift_operator__ 
= &LR1_all_shift_operator__;
  
@*2 {\bf \INVshift{}}.\fbreak
Enum: T\_LR1\_invisible\_shift\_operator\_
\fbreak
\line{Class: LR1\_invisible\_shift\_operator \hfil AB: N \hfil AD: N}


 It's a nice way to program 
 out of an ambiguous grammar.
 It can also lower the code bloat
 of a thread's first set. 
 
\fbreak 
\hrule
@*3 \INVshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\INVshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=

    LR1_invisible_shift_operator();
  
@*3 \INVshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\INVshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=

    LR1_invisible_shift_operator::LR1_invisible_shift_operator()
      T_CTOR("|.|",T_LR1_invisible_shift_operator_,0,false,false)
    {}
    LR1_invisible_shift_operator LR1_invisible_shift_operator__;
    yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_invisible_shift_operator__
 = &LR1_invisible_shift_operator__;
  
@*2 {\bf \QUEshift{}}.\fbreak
Enum: T\_LR1\_questionable\_shift\_operator\_
\fbreak
\line{Class: LR1\_questionable\_shift\_operator \hfil AB: N \hfil AD: N}


 Represents a questionable grammar situation.
It pinpoints programmed error points within the grammar.
The subrule using this symbol has an lr(0) reduction, as
the lookahead is not kosher and so would probably not reduce
in the lr(1) context.
It can be used in both of the following grammar expressions:\fbreak
\INDENT{1.5cm}{1) \subrule \QUEshift}
\INDENT{1.5cm}{2) \subrule \PARshift \quad \QUEshift \quad NULL}
Point 1 covers the state where the current token being parsed
is improper. Point 2 is more interesting as it captures
a returned terminal that the thread passes back as an error.

The \QUEshift was not one of the original
``k'' terminals. It replaced the ``eof'' terminal,
which was marginal in intent.
I felt the \QUEshift symbol drew the grammar reader's eye to where
``faulty'' points were captured, and it forces lr(0) context processing
to reduce its subrule.
Why lr(0) context? Glad you asked: the lookahead terminal (the current
terminal being parsed) is in error, and
so ``how is the subrule with the \QUEshift to reduce after its shifted T?''.
It must be divorced from any lookahead and just acted upon.

Now another question arises: ``how is this condition detected in 
a parsing state of
mixed conditions --- threading, shifting, reducing''?
There is a pecking order on the conditions tried by the parser:\fbreak
\INDENT{1.5cm}{$\circ$ threading} 
\INDENT{2.5cm}{if tried and unsuccessful, the remaining conditions are attempted}
\INDENT{1.5cm}{$\circ$ shifts pecking order by their presence in current parse state:}
\INDENT{2.5cm}{can the current token be shifted?}
\INDENT{2.5cm}{\QUEshift --- error condition}
\INDENT{2.5cm}{\INVshift --- explicit \emptyrule}
\INDENT{2.5cm}{\ALLshift --- any terminal}
\INDENT{1.5cm}{$\circ$ reduce}
\INDENT{2.5cm}{note shifting is favoured over reducing}
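
The pecking order above can be sketched as a single decision function.
This is a simplified model under stated assumptions: the ``State''
structure, the ``Action'' enumeration and the ``decide'' function are
hypothetical names, thread success is reduced to one flag, and the real
emitted table and parser loop are laid out differently.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of one parse state; not the emitted fsm table layout.
enum class Action { Thread, Shift, QueShift, InvShift, AllShift, Reduce, Error };

struct State {
  bool has_parallel_op = false;  // parallel operator present: threads may run
  bool thread_accepts = false;   // stand-in for "a launched thread succeeded"
  std::set<int> shifts;          // explicit terminals that can be shifted
  bool has_que = false;          // questionable shift present
  bool has_inv = false;          // invisible shift present
  bool has_all = false;          // all shift present
  std::set<int> reduce_la;       // lookaheads that allow a reduce
};

// Try the conditions in the order listed: threading, then the shifts
// in their own pecking order, and only then a reduce.
Action decide(const State& s, int tok) {
  if (s.has_parallel_op && s.thread_accepts) return Action::Thread;
  if (s.shifts.count(tok) != 0) return Action::Shift;
  if (s.has_que) return Action::QueShift;
  if (s.has_inv) return Action::InvShift;
  if (s.has_all) return Action::AllShift;
  if (s.reduce_la.count(tok) != 0) return Action::Reduce;
  return Action::Error;
}
```

In this model a state holding both the questionable and the all shift answers an unexpected token with the questionable shift, and a reduce is reached only when no shift of any kind applies.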
 
\fbreak 
\hrule
@*3 \QUEshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\QUEshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=

    LR1_questionable_shift_operator();
  
@*3 \QUEshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\QUEshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=

    LR1_questionable_shift_operator::LR1_questionable_shift_operator()
      T_CTOR("|?|",T_LR1_questionable_shift_operator_,0,false,false)
    {}
    LR1_questionable_shift_operator LR1_questionable_shift_operator__;
    yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_questionable_shift_operator__ 
= &LR1_questionable_shift_operator__;
  
@*2 {\bf \REDshift{}}.\fbreak
Enum: T\_LR1\_reduce\_operator\_
\fbreak
\line{Class: LR1\_reduce\_operator \hfil AB: Y \hfil AD: Y}


 Its presence within the individual state of
 the ``fsm'' table is to force a reduce operation.
 Why? It's a back-to-back
 situation within the state table, whereby a thread should reduce
 while its reducing lookahead is the \paralleloperator, which indicates that a thread is to run.
 
\fbreak 
\hrule
@*2 {\bf \TRAshift{}}.\fbreak
Enum: T\_LR1\_fset\_transience\_operator\_
\fbreak
\line{Class: LR1\_fset\_transience\_operator \hfil AB: Y \hfil AD: Y}


 \TRAshift serves two purposes: it is used in \O2linker to process the transient
 first sets generated by threads, and it is used within a grammar's
``chained call procedure'' expression to lower thread overhead by
calling a procedure with the explicit intent of using its
``first set'' token twice.
I'll give an example of a ``chained procedure call'' expression drawn from
the ``pass3.lex'' grammar handling the grammar's file include expression:\fbreak
\fbreak
\INDENT{1cm}{\subrule "\ATsign" Rprefile\_inc\_dispatcher}
\fbreak
The ``Rprefile\_inc\_dispatcher'' grammar rule has the following subrule:\fbreak
\fbreak
\INDENT{1cm}{\subrule  \TRAshift "file-inclusion" NS\_prefile\_include::PROC\_TH\_prefile\_include}
\fbreak
The ``chained'' part is in the duplicating of ``\ATsign'';
that is, the parsing mechanism does not get a new terminal when shifted but
passes this T onto the called procedure.
The called PROC\_TH\_prefile\_include procedure / thread has its 
start rule as:\fbreak
\fbreak
\INDENT{1cm}{\subrule  "\ATsign" Rpossible\_ws  Rfile\_string Reof}
\fbreak
The repeated use of ``\ATsign'' was to reinforce the idea
that the procedure is called because of ``\ATsign'': there's that ``first set'' again.
Well, time will pass its comments on this thought process.
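
The chaining can be sketched with a toy token cursor. Everything here is
a hypothetical illustration: the ``Cursor'' structure and the
``dispatcher'' and ``include\_proc'' functions are invented names, the
string AT stands in for the at-sign terminal, and the real parser and its
procedure call mechanism work on the emitted tables rather than on strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token cursor; not the yacco2 parser's real interface.
struct Cursor {
  const std::vector<std::string>* toks;
  std::size_t pos;
  const std::string& current() const { return (*toks)[pos]; }
};

// Called procedure: its start rule begins with the at-sign itself,
// so it expects to be entered with the cursor still on that token.
bool include_proc(Cursor& c) {
  if (c.current() != "AT") return false;
  ++c.pos;  // consume the at-sign
  if (c.pos >= c.toks->size()) return false;
  ++c.pos;  // consume the file name that follows
  return true;
}

// Caller: recognises the first-set token but does not advance past it
// before chaining to the procedure, so the callee sees the same token.
bool dispatcher(Cursor& c) {
  if (c.pos >= c.toks->size()) return false;
  if (c.current() != "AT") return false;
  return include_proc(c);
}
```

The point of the sketch is only that the caller tests the token without consuming it, so the callee's own start rule can name it again.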
 
\fbreak 
\hrule
@*3 \TRAshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\TRAshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=

    LR1_fset_transience_operator();
  
@*3 \TRAshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\TRAshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=

    LR1_fset_transience_operator::LR1_fset_transience_operator()
      T_CTOR("|t|",T_LR1_fset_transience_operator_,0,false,false)
    {}
    LR1_fset_transience_operator LR1_fset_transience_operator__;
    yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_fset_transience_operator__ 
        = &LR1_fset_transience_operator__;
  
@*2 {\bf \PARshift{}}.\fbreak
Enum: T\_LR1\_parallel\_operator\_
\fbreak
\line{Class: LR1\_parallel\_operator \hfil AB: N \hfil AD: N}


 Its presence within the individual state of
 the ``fsm'' table dictates potential threads to run.
 You see it sprinkled throughout my grammars 
 to call threads.
 This is part of \O2's raison d'{\^e}tre. 
 
\fbreak 
\hrule
@*3 \PARshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive.
@<\PARshift\BRACEOPEN{}\BRACECLOSE{} user-declaration directive@>=

    LR1_parallel_operator();
  
@*3 \PARshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive.
@<\PARshift\BRACEOPEN{}\BRACECLOSE{} user-implementation directive@>=

    LR1_parallel_operator::LR1_parallel_operator()
      T_CTOR("|||",T_LR1_parallel_operator_,0,false,false)
    {}
    LR1_parallel_operator LR1_parallel_operator__;
    yacco2::CAbs_lr1_sym* NS_yacco2_k_symbols::PTR_LR1_parallel_operator__ 
= &LR1_parallel_operator__;
  
@*1 {\bf lrk-sufx} directive.\fbreak

As they are constants, they are defined globally
to save space / overhead in the typical new create / delete 
cycle of terminals. Thar's recycling going on in this green space.

@<lrk-sufx directive@>=

    extern yacco2::CAbs_lr1_sym* PTR_LR1_parallel_operator__;
    extern yacco2::CAbs_lr1_sym* PTR_LR1_fset_transience_operator__;
    extern yacco2::CAbs_lr1_sym* PTR_LR1_invisible_shift_operator__;
    extern yacco2::CAbs_lr1_sym* PTR_LR1_questionable_shift_operator__;
    extern yacco2::CAbs_lr1_sym* PTR_LR1_all_shift_operator__;
    extern yacco2::CAbs_lr1_sym* PTR_LR1_eolr__;


@** Index.
