Resurrect the %errorhandlertype directive for back-compat (#320) #322

Merged
merged 1 commit into from
Oct 22, 2024
10 changes: 10 additions & 0 deletions ChangeLog.md
@@ -1,10 +1,20 @@
# Revision history for Happy

## 2.1.1

This release fixes two breaking changes:

* Properly qualify all uses of Prelude functions, fixing #131
* Bring back the old `%errorhandlertype` directive, the use of which is
discouraged in favour of the "Reporting expected tokens" mechanism
in Happy 2.1, accessible via `%error.expected`.

## 2.1

* Added `--numeric-version` CLI flag.
* Documented and implemented the new feature "Resumptive parsing with ``catch``"
* Documented (and reimplemented) the "Reporting expected tokens" feature
(which turned out to cause a breaking change in this release: #320)

## 2.0.2

91 changes: 89 additions & 2 deletions doc/syntax.rst
Expand Up @@ -273,10 +273,34 @@ Error declaration

%error { <identifier> }

%error { <identifier> } { <identifier> }

.. index:: ``%error``

Specifies the function to be called in the event of a parse error.
The type of ``<identifier>`` varies depending on the presence of ``%lexer`` (see :ref:`Summary <sec-monad-summary>`) and ``%errorhandlertype`` (see the following).
(optional)
Specifies the functions to be called in the event of a parse error.

The first, one-action form specifies a single function (often referred to as
``parseError``) that reports the error and aborts the parse (in the sense of
early return).
When ``%error`` is not specified, the function is assumed to be called ``happyError``.

The type of ``parseError`` varies depending on the presence of ``%lexer``
(see :ref:`Summary <sec-monad-summary>`) and
the :ref:`presence of ``%error.expected`` <sec-error-expected-directive>`.

The second, two-action form specifies a pair of functions ``abort`` and
``report`` which are necessary to handle multiple parse errors during
:ref:`resumptive parsing using the ``catch`` mechanism <sec-catch>`.
In this case, ``report`` is called for every parse error and additionally
receives a continuation for resuming the parse as the last argument.
When Happy is unable to resume the parse after a parse error, it calls
``abort``, which is *not* supposed to report an error as well.

To illustrate the correspondence between the two forms:
In a non-resumptive parser (i.e. one that does not use ``catch``),
the one-action form ``%error { \ tks -> report tks abort }`` is equivalent to
the two-action form ``%error { abort } { report }``.
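
The correspondence can be sketched in plain Haskell. Everything below is an assumption made for illustration, not code from Happy: a non-threaded lexer, a made-up ``Token`` type, and a parser monad ``P`` that is just ``Either [String]``, so that ``report`` can collect messages while ``abort`` adds none.

```haskell
module Main where

-- Hypothetical token type and parser monad, for illustration only.
data Token = TokZero | TokSucc deriving (Show, Eq)
type P a = Either [String] a

-- abort ends the parse and deliberately reports nothing itself.
abort :: [Token] -> P a
abort _ = Left []

-- report records one message, then hands the remaining tokens to the
-- continuation and keeps whatever messages the rest of the parse produced.
report :: [Token] -> ([Token] -> P a) -> P a
report tks resume =
  case resume tks of
    Left errs -> Left (msg : errs)
    Right _   -> Left [msg]
  where
    msg = "parse error at " ++ show (take 1 tks)

-- The one-action handler built from the pair: report once, then abort.
parseError :: [Token] -> P a
parseError tks = report tks abort

main :: IO ()
main = print (parseError [TokSucc] :: P Int)
```

Because ``abort`` contributes no message of its own, the single error is reported exactly once, which is why ``abort`` is not supposed to report an error as well.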

.. _sec-errorhandlertype-directive:

Expand All @@ -289,6 +313,69 @@ Additional error information

.. index:: ``%errorhandlertype``

(deprecated)
Happy 2.1 superseded this directive with the simple, optional flag directive
``%error.expected``; see :ref:`sec-error-expected-directive`.
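
Judging from the generated call in this change, a handler written for the legacy directive receives the tokens and the expected strings as a single pair, whereas a handler for ``%error.expected`` takes them as two separate arguments. The sketch below shows one way to adapt between the two shapes, assuming a non-threaded lexer; ``Token``, ``P`` and both handler names are hypothetical.

```haskell
module Main where

-- Hypothetical token type and parser monad, for illustration only.
data Token = TokZero | TokSucc deriving (Show, Eq)
type P a = Either String a

-- Handler in the shape assumed for %error.expected: two separate arguments.
newHandler :: [Token] -> [String] -> P a
newHandler tks expected =
  Left ("unexpected " ++ show (take 1 tks)
        ++ ", expected one of " ++ show expected)

-- Handler in the shape assumed for the legacy directive: a single pair.
legacyHandler :: ([Token], [String]) -> P a
legacyHandler = uncurry newHandler

main :: IO ()
main = print (legacyHandler ([TokSucc], ["'Z'"]) :: P Int)
```

Going the other way, ``curry legacyHandler`` would yield a handler in the two-argument shape, which may help when migrating an existing grammar.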

.. _sec-error-expected-directive:

Reporting expected tokens
-------------------------

.. index:: ``%error.expected``

(optional)
Often, it is useful to present users with suggestions as to which kind of tokens
were expected at the site of a syntax error.
To this end, when the ``%error.expected`` directive is specified, happy assumes that
the error handling function (resp. ``report`` function when using the binary
form of the ``%error`` directive) takes a ``[String]`` argument (the argument
*after* the token stream, in case of a non-threaded lexer) listing all the
stringified tokens that could be shifted at the site of the syntax error.
The strings in this list are derived from the ``%token`` directive.

Here is an example, inspired by test case ``monaderror-explist``:

.. code-block:: none

%tokentype { Token }
%error { handleErrorExpList }
%error.expected

%monad { ParseM } { (>>=) } { return }

%token
'S' { TokenSucc }
'Z' { TokenZero }
'T' { TokenTest }

%%

Exp : 'Z' { 0 }
| 'T' 'Z' Exp { $3 + 1 }
| 'S' Exp { $2 + 1 }

%%

type ParseM = ...

handleErrorExpList :: [Token] -> [String] -> ParseM a
handleErrorExpList ts explist = throwError $ ParseError $ explist

...
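
The elided parts of the example could be filled in many ways. One minimal sketch, assuming ``ParseM`` is simply ``Either ParseError`` (for which ``throwError`` amounts to ``Left``) and assuming a ``ParseError`` wrapper around the list of expected tokens; neither definition is taken from the test case itself:

```haskell
module Main where

-- Token constructors as named in the %token block above.
data Token = TokenSucc | TokenZero | TokenTest deriving (Show, Eq)

-- Assumed error type: it only carries the expected-token strings.
newtype ParseError = ParseError [String] deriving (Show, Eq)

-- Assumed parser monad; (>>=) and return come from the Either instance.
type ParseM = Either ParseError

-- The handler ignores the remaining tokens and fails with the expected list.
handleErrorExpList :: [Token] -> [String] -> ParseM a
handleErrorExpList _ts explist = Left (ParseError explist)

main :: IO ()
main = print (handleErrorExpList [TokenTest] ["'Z'"] :: ParseM Int)
```

A real grammar would more likely use a richer monad (for example one that also tracks source positions), but the argument order of the handler stays the same.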


Additional error information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

%error.expected

.. index:: ``%error.expected``

Deprecated in favour of the simple, optional flag directive ``%error.expected``.

(optional)
The user-supplied error handling function can be passed additional information.
By default, no information is added, for compatibility with previous versions.
57 changes: 3 additions & 54 deletions doc/using.rst
Expand Up @@ -1053,7 +1053,8 @@ simple non-threaded lexer):
...

Note the use of ``catch`` in the second ``Exp`` rule and
the use of the binary form of the ``%error`` directive.
the use of the two-action form of the ``%error`` directive
(see :ref:`the documentation for ``%error`` <sec-error-directive>`).
The directive specifies a pair of functions ``abort`` and ``report``
which are necessary to handle multiple parse errors.

Expand Down Expand Up @@ -1110,15 +1111,9 @@ A couple of notes:
Similarly, ``abort`` must always throw an exception and cannot return a
syntax tree at all. It should *not* report a parse error as well.

To illustrate how the new binary ``%error`` decomposition corresponds to
the regular unary one, consider the definition
``myError tks = report tks abort``.
This definition could be used in ``%error { myError }``; in this case, the
parser would always abort after the first error.

* Whether or not the ``abort`` and ``report`` functions get passed the
list of tokens is subject to the :ref:`same decision logic as for ``parseError`` <sec-monad-summary>`.
When using :ref:`the ``%error.expected`` directive <sec-expected-list>`,
When using :ref:`the ``%error.expected`` directive <sec-error-expected-directive>`,
the list of expected tokens is passed to ``report`` only, between ``tks``
and ``resume``.

Expand All @@ -1127,52 +1122,6 @@ to the user of happy; the example above simply emitted the string ``catch``
whenever it stands in for an erroneous AST node.
A more reasonable implementation would be similar to typed holes in GHC.

.. _sec-expected-list:

Reporting expected tokens
-------------------------

.. index:: expected tokens

Often, it is useful to present users with suggestions as to which kind of tokens
were expected at the site of a syntax error.
To this end, when the ``%error.expected`` directive is specified, happy assumes that
the error handling function (resp. ``report`` function when using the binary
form of the ``%error`` directive) takes a ``[String]`` argument (the argument
*after* the token stream, in case of a non-threaded lexer) listing all the
stringified tokens that were expected at the site of the syntax error.
The strings in this list are derived from the ``%token`` directive.

Here is an example, inspired by test case ``monaderror-explist``:

.. code-block:: none

%tokentype { Token }
%error { handleErrorExpList }
%error.expected

%monad { ParseM } { (>>=) } { return }

%token
'S' { TokenSucc }
'Z' { TokenZero }
'T' { TokenTest }

%%

Exp : 'Z' { 0 }
| 'T' 'Z' Exp { $3 + 1 }
| 'S' Exp { $2 + 1 }

%%

type ParseM = ...

handleErrorExpList :: [Token] -> [String] -> ParseM a
handleErrorExpList ts explist = throwError $ ParseError $ explist

...

.. _sec-multiple-parsers:

Generating Multiple Parsers From a Single Grammar
4 changes: 2 additions & 2 deletions happy.cabal
@@ -1,5 +1,5 @@
name: happy
version: 2.1
version: 2.1.1
license: BSD2
license-file: LICENSE
copyright: (c) Andy Gill, Simon Marlow
Expand Down Expand Up @@ -139,7 +139,7 @@ executable happy
array,
containers >= 0.4.2,
mtl >= 2.2.1,
happy-lib == 2.1
happy-lib == 2.1.1

default-language: Haskell98
default-extensions: CPP, MagicHash, FlexibleContexts, NamedFieldPuns
8 changes: 4 additions & 4 deletions lib/backend-glr/src/Happy/Backend/GLR/ProduceCode.lhs
Expand Up @@ -91,7 +91,7 @@ the driver and data strs (large template).
> -> Maybe String -- User-defined stuff (token DT, lexer etc.)
> -> (DebugMode,Options) -- selecting code-gen style
> -> Grammar String -- Happy Grammar
> -> Pragmas -- Pragmas in the .y-file
> -> Directives -- Directives in the .y-file
> -> (String -- data
> ,String) -- parser
>
Expand Down Expand Up @@ -372,7 +372,7 @@ Do the same with the Happy goto table.
%-----------------------------------------------------------------------------
Create the 'GSymbol' ADT for the symbols in the grammar

> mkGSymbols :: Grammar String -> Pragmas -> ShowS
> mkGSymbols :: Grammar String -> Directives -> ShowS
> mkGSymbols g pragmas
> = str dec
> . str eof
Expand Down Expand Up @@ -423,7 +423,7 @@ Creating a type for storing semantic rules
> type SemInfo
> = [(String, String, [Int], [((Int, Int), ([(Int, TokenSpec)], String), [Int])])]

> mkGSemType :: Options -> Grammar String -> Pragmas -> (ShowS, SemInfo)
> mkGSemType :: Options -> Grammar String -> Directives -> (ShowS, SemInfo)
> mkGSemType (TreeDecode,_,_) g pragmas
> = (def, map snd syms)
> where
Expand Down Expand Up @@ -673,7 +673,7 @@ only unpacked when needed. Using classes here to manage the unpacking.
This selects the info used for monadic parser generation

> type MonadInfo = Maybe (String,String,String)
> monad_sub :: Pragmas -> MonadInfo
> monad_sub :: Directives -> MonadInfo
> monad_sub pragmas
> = case monad pragmas of
> (True, _, ty,bd,ret) -> Just (ty,bd,ret)
46 changes: 21 additions & 25 deletions lib/backend-lalr/src/Happy/Backend/LALR/ProduceCode.lhs
Expand Up @@ -30,7 +30,7 @@ Produce the complete output file.

> produceParser :: Grammar String -- grammar info
> -> Maybe AttributeGrammarExtras
> -> Pragmas -- pragmas supplied in the .y-file
> -> Directives -- directives supplied in the .y-file
> -> ActionTable -- action table
> -> GotoTable -- goto table
> -> [String] -- language extensions
Expand All @@ -53,7 +53,7 @@ Produce the complete output file.
> , starts = starts'
> })
> mAg
> (Pragmas
> (Directives
> { lexer = lexer'
> , imported_identity = imported_identity'
> , monad = (use_monad,monad_context,monad_tycon,monad_then,monad_return)
Expand Down Expand Up @@ -365,19 +365,16 @@ happyMonadReduce to get polymorphic recursion. Sigh.
The token conversion function.

> produceTokenConverter
> = case lexer' of {
>
> Nothing ->
> str "happyTerminalToTok term = case term of {\n" . indent
> = str "happyTerminalToTok term = case term of {\n" . indent
> . (case lexer' of Just (_, eof') -> str eof' . str " -> " . eofTok . str ";\n" . indent; _ -> id)
> . interleave (";\n" ++ indentStr) (map doToken token_rep)
> . str "_ -> -1#;\n" . indent . str "}\n" -- -1# signals an invalid token
> . str "_ -> -1#;\n" . indent . str "}\n" -- token number -1# (INVALID_TOK) signals an invalid token
> . str "{-# NOINLINE happyTerminalToTok #-}\n"
> . str "\n"
> . str "happyLex kend _kmore [] = kend notHappyAtAll []\n"
> . str "happyLex _kend kmore (tk:tks)\n"
> . str " | Happy_GHC_Exts.tagToEnum# (i Happy_GHC_Exts.==# -1#) = happyReport' (tk:tks) [] happyAbort\n" -- invalid token (-1#); lexer error.
> . str " | Prelude.otherwise = kmore i tk tks\n"
> . str " where i = happyTerminalToTok tk\n"
> . str "\n" .
> (case lexer' of {
> Nothing ->
> str "happyLex kend _kmore [] = kend notHappyAtAll []\n"
> . str "happyLex _kend kmore (tk:tks) = kmore (happyTerminalToTok tk) tk tks\n"
> . str "{-# INLINE happyLex #-}\n"
> . str "\n"
> . str "happyNewToken action sts stk = happyLex (\\tk -> " . eofAction "notHappyAtAll" . str ") ("
Expand All @@ -390,13 +387,7 @@ The token conversion function.
> . str "\n";

> Just (lexer'',eof') ->
> str "happyTerminalToTok term = case term of {\n" . indent
> . str eof' . str " -> " . eofTok . str ";\n" . indent
> . interleave (";\n" ++ indentStr) (map doToken token_rep)
> . str "_ -> Prelude.error \"Encountered a token that was not declared to happy\"\n" . indent . str "}\n"
> . str "{-# NOINLINE happyTerminalToTok #-}\n"
> . str "\n"
> . str "happyLex kend kmore = " . str lexer'' . str " (\\tk -> case tk of {\n" . indent
> str "happyLex kend kmore = " . str lexer'' . str " (\\tk -> case tk of {\n" . indent
> . str eof' . str " -> kend tk;\n" . indent
> . str "_ -> kmore (happyTerminalToTok tk) tk })\n"
> . str "{-# INLINE happyLex #-}\n"
Expand All @@ -409,7 +400,7 @@ The token conversion function.
> -- superfluous pattern match needed to force happyReport to
> -- have the correct type.
> . str "\n";
> }
> })

> where

Expand Down Expand Up @@ -729,16 +720,21 @@ in the presence of the %error.expected directive.
The last argument is the "resumption", a continuation that tries to find
an item on the stack taking a @catch@ terminal where parsing may resume,
in the presence of the two-argument form of the %error directive.
In order to support the legacy %errorhandlertype directive, we retain
a special code path for `OldExpected`.

> callReportError = -- this one wraps around report_error_handler to expose a unified interface
> str "(\\tokens expected resume -> " .
> (if use_monad then str ""
> else str "HappyIdentity Prelude.$ ") .
> report_error_handler .
> (case (error_handler', lexer') of (DefaultErrorHandler, Just _) -> id
> _ -> str " tokens") .
> (if error_expected' then str " expected"
> else id) .
> (case error_expected' of
> OldExpected -> str " (tokens, expected)" -- back-compat for %errorhandlertype
> _ ->
> (case (error_handler', lexer') of (DefaultErrorHandler, Just _) -> id
> _ -> str " tokens") .
> (case error_expected' of NewExpected -> str " expected"
> NoExpected -> id)) .
> (case error_handler' of ResumptiveErrorHandler{} -> str " resume"
> _ -> id) .
> str ")"
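
The argument list assembled by the new `callReportError` can be modelled as a small pure function. This is a simplified sketch rather than code from the generator: the constructor names `NoExpected`, `OldExpected` and `NewExpected` appear in the diff, while the type name `ErrorExpectedMode`, the function `handlerArgs` and its two `Bool` parameters are hypothetical stand-ins for the checks on `error_handler'` and `lexer'`.

```haskell
module Main where

-- Constructor names follow the diff; the type name is made up here.
data ErrorExpectedMode = NoExpected | OldExpected | NewExpected
  deriving (Eq, Show)

-- Arguments passed to the user's handler, in order.
--   passTokens: False models the (DefaultErrorHandler, Just _) case,
--               where the token stream is not passed.
--   resumptive: True models ResumptiveErrorHandler{}.
handlerArgs :: ErrorExpectedMode -> Bool -> Bool -> [String]
handlerArgs OldExpected _ resumptive =
  "(tokens, expected)" : [ "resume" | resumptive ]  -- legacy pair argument
handlerArgs mode passTokens resumptive =
     [ "tokens"   | passTokens ]
  ++ [ "expected" | mode == NewExpected ]
  ++ [ "resume"   | resumptive ]

main :: IO ()
main = mapM_ (putStrLn . unwords)
  [ handlerArgs NoExpected  True False
  , handlerArgs NewExpected True True
  , handlerArgs OldExpected True False
  ]
```

The model makes the back-compat path easy to see: only `OldExpected` bundles the tokens and the expected list into one tuple, and in that case the tuple is passed regardless of the lexer configuration.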
1 change: 1 addition & 0 deletions lib/data/HappyTemplate.hs
Expand Up @@ -19,6 +19,7 @@
type Happy_Int = Happy_GHC_Exts.Int#
data Happy_IntList = HappyCons Happy_Int Happy_IntList

#define INVALID_TOK -1#
#define ERROR_TOK 0#
#define CATCH_TOK 1#

Expand Down
2 changes: 2 additions & 0 deletions lib/frontend/boot-src/Parser.ly
Expand Up @@ -34,6 +34,7 @@ The parser.
> spec_expect { TokenKW TokSpecId_Expect }
> spec_error { TokenKW TokSpecId_Error }
> spec_errorexpected { TokenKW TokSpecId_ErrorExpected }
> spec_errorhandlertype { TokenKW TokSpecId_ErrorHandlerType }
> spec_attribute { TokenKW TokSpecId_Attribute }
> spec_attributetype { TokenKW TokSpecId_Attributetype }
> code { TokenInfo $$ TokCodeQuote }
Expand Down Expand Up @@ -125,6 +126,7 @@ The parser.
> | spec_expect int { TokenExpect $2 }
> | spec_error code optCode { TokenError $2 $3 }
> | spec_errorexpected { TokenErrorExpected }
> | spec_errorhandlertype id { TokenErrorHandlerType $2 }
> | spec_attributetype code { TokenAttributetype $2 }
> | spec_attribute id code { TokenAttribute $2 $3 }
