Create GF source files from a CF file #150

Open
inariksit opened this issue Jan 5, 2023 · 2 comments

inariksit commented Jan 5, 2023

I want to be able to do this:

$ cat myGrammar.cf
S ::= "Hello" "World" ;

$ gf -f gf myGrammar.cf
Writing myGrammar.gf myGrammarCnc.gf

$ cat myGrammar.gf myGrammarCnc.gf
abstract myGrammar = {
  cat S ;
  fun S_Hello_World : S ;
}

concrete myGrammarCnc of myGrammar = {
  lincat S = Str ;
  lin S_Hello_World = "Hello" ++ "World" ;
}
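
For illustration, here is a minimal sketch of what such a generator could look like, independent of the GF compiler's internals. The types `Symbol` and `CfRule` and all function names are hypothetical and not part of GF. The sketch assumes that every category linearizes to `Str`, derives function names from the rule as in the example above, and does not handle name clashes or non-ASCII string escaping.

```haskell
import Data.Char (isAlphaNum)
import Data.List (intercalate, nub)

-- | One symbol on the right-hand side of a CF rule (hypothetical type).
data Symbol = Terminal String | NonTerminal String
  deriving (Eq, Show)

-- | A CF rule such as @S ::= "Hello" "World" ;@ (hypothetical type).
data CfRule = CfRule String [Symbol]
  deriving (Eq, Show)

-- | Function name: the category followed by the RHS symbols, joined by '_'.
-- Characters that cannot occur in an identifier are replaced by '_'.
funName :: CfRule -> String
funName (CfRule cat rhs) = intercalate "_" (cat : map symName rhs)
  where
    symName (Terminal s)    = map (\c -> if isAlphaNum c then c else '_') s
    symName (NonTerminal c) = c

-- | Function type: the non-terminals of the RHS as arguments, the LHS as result.
funType :: CfRule -> String
funType (CfRule cat rhs) =
  intercalate " -> " ([c | NonTerminal c <- rhs] ++ [cat])

-- | Linearization rule: terminals become string literals,
-- non-terminals become variables named after their position.
linDef :: CfRule -> String
linDef r@(CfRule _ rhs) =
  "lin " ++ unwords (funName r : vars) ++ " = " ++ body ++ " ;"
  where
    numbered = zip [1 :: Int ..] rhs
    vars = ["x" ++ show i | (i, NonTerminal _) <- numbered]
    body
      | null rhs  = "\"\""
      | otherwise = intercalate " ++ " (map render numbered)
    render (_, Terminal s)    = show s  -- Haskell escaping, adequate for plain ASCII
    render (i, NonTerminal _) = "x" ++ show i

-- | All categories mentioned on either side of any rule, without duplicates.
categories :: [CfRule] -> [String]
categories rules =
  nub (concat [cat : [c | NonTerminal c <- rhs] | CfRule cat rhs <- rules])

-- | Text of the abstract module.
abstractModule :: String -> [CfRule] -> String
abstractModule name rules = unlines $
  ["abstract " ++ name ++ " = {"]
  ++ ["  cat " ++ c ++ " ;" | c <- categories rules]
  ++ ["  fun " ++ funName r ++ " : " ++ funType r ++ " ;" | r <- rules]
  ++ ["}"]

-- | Text of the concrete module, named after the abstract one plus "Cnc".
concreteModule :: String -> [CfRule] -> String
concreteModule name rules = unlines $
  ["concrete " ++ name ++ "Cnc of " ++ name ++ " = {"]
  ++ ["  lincat " ++ c ++ " = Str ;" | c <- categories rules]
  ++ ["  " ++ linDef r | r <- rules]
  ++ ["}"]
```

Loaded in GHCi, `putStr (abstractModule "myGrammar" [CfRule "S" [Terminal "Hello", Terminal "World"]])` prints the abstract module for the example rule, and `concreteModule` does the same for the concrete one.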

I infer from this line that this was already intended 12 years ago:

cfGF :: FilePath -> IO GF

However, this line shows that it has never worked:

cfGF = error "no cfGF"

I also see that the conversion from CF goes directly to PGF here, without going through GF source code first:

compileCFFiles :: Options -> [FilePath] -> IOE ()
compileCFFiles opts fs = do
  bnfc_rules <- fmap concat $ mapM (getBNFCRules opts) fs
  let rules = bnfc2cf bnfc_rules
  startCat <- case rules of
                (Rule cat _ _ : _) -> return cat
                _                  -> fail "empty CFG"
  let pgf = cf2pgf (last fs) (mkCFG startCat Set.empty rules)
  unless (flag optStopAfterPhase opts == Compile) $
     do probs <- liftIO (maybe (return . defaultProbabilities) readProbabilitiesFromFile (flag optProbsFile opts) pgf)
        let pgf' = setProbabilities probs $ if flag optOptimizePGF opts then optimizePGF pgf else pgf
        writePGF opts pgf'
        writeOutputs opts pgf'

Is there any way to piece together generation of GF source code from existing code, such as using any functions that produce canonical GF? I have already tried to use the -f canonical_gf flag, but it doesn't work for cf files as input, only for gf files.


anka-213 commented Jan 5, 2023

It doesn't look like a simple case of just connecting the components. These are the functions responsible for producing the canonical GF grammar for -f canonical_gf:

-- | Generate Canonical code for the named abstract syntax
abstract2canonical :: ModuleName -> G.Grammar -> Abstract

-- | Generate Canonical GF for the given concrete module.
concrete2canonical :: G.Grammar -> GlobalEnv -> ModuleName -> ModuleName -> ModuleInfo -> Concrete

and they require a GF.Grammar.Grammar.Grammar, which seems to contain a lot of details from a GF file:

-- | A grammar is a self-contained collection of grammar modules
data Grammar = MGrammar {
    moduleMap :: Map.Map ModuleName ModuleInfo,
    modules :: [Module]
  }
-- | Modules
type Module = (ModuleName, ModuleInfo)
data ModuleInfo = ModInfo {
    mtype :: ModuleType,
    mstatus :: ModuleStatus,
    mflags :: Options,
    mextend :: [(ModuleName,MInclude)],
    mwith :: Maybe (ModuleName,MInclude,[(ModuleName,ModuleName)]),
    mopens :: [OpenSpec],
    mexdeps :: [ModuleName],
    msrc :: FilePath,
    mseqs :: Maybe (Array SeqId Sequence),
    jments :: Map.Map Ident Info
  }

They convert it to the GF.Grammar.Canonical.Grammar format:

-- | Abstract Syntax
data Abstract = Abstract ModId Flags [CatDef] [FunDef] deriving Show
abstrName (Abstract mn _ _ _) = mn
data CatDef = CatDef CatId [CatId] deriving Show
data FunDef = FunDef FunId Type deriving Show
data Type = Type [TypeBinding] TypeApp deriving Show
data TypeApp = TypeApp CatId [Type] deriving Show
data TypeBinding = TypeBinding VarId Type deriving Show
--------------------------------------------------------------------------------
-- ** Concreate syntax
-- | Concrete Syntax
data Concrete = Concrete ModId ModId Flags [ParamDef] [LincatDef] [LinDef]
  deriving Show
concName (Concrete cnc _ _ _ _ _) = cnc
data ParamDef = ParamDef ParamId [ParamValueDef]
              | ParamAliasDef ParamId LinType
  deriving Show
data LincatDef = LincatDef CatId LinType deriving Show
data LinDef = LinDef FunId [VarId] LinValue deriving Show
-- | Linearization type, RHS of @lincat@
data LinType = FloatType
             | IntType
             | ParamType ParamType
             | RecordType [RecordRowType]
             | StrType
             | TableType LinType LinType
             | TupleType [LinType]
  deriving (Eq,Ord,Show)
newtype ParamType = ParamTypeId ParamId deriving (Eq,Ord,Show)
-- | Linearization value, RHS of @lin@
data LinValue = ConcatValue LinValue LinValue
              | LiteralValue LinLiteral
              | ErrorValue String
              | ParamConstant ParamValue
              | PredefValue PredefId
              | RecordValue [RecordRowValue]
              | TableValue LinType [TableRowValue]
---           | VTableValue LinType [LinValue]
              | TupleValue [LinValue]
              | VariantValue [LinValue]
              | VarValue VarValueId
              | PreValue [([String], LinValue)] LinValue
              | Projection LinValue LabelId
              | Selection LinValue LinValue
              | CommentedValue String LinValue
  deriving (Eq,Ord,Show)
data LinLiteral = FloatConstant Float
                | IntConstant Int
                | StrConstant String
  deriving (Eq,Ord,Show)
data LinPattern = ParamPattern ParamPattern
                | RecordPattern [RecordRow LinPattern]
                | TuplePattern [LinPattern]
                | WildPattern
  deriving (Eq,Ord,Show)
type ParamValue = Param LinValue
type ParamPattern = Param LinPattern
type ParamValueDef = Param ParamId
data Param arg = Param ParamId [arg]
  deriving (Eq,Ord,Show,Functor,Foldable,Traversable)
type RecordRowType = RecordRow LinType
type RecordRowValue = RecordRow LinValue
type TableRowValue = TableRow LinValue
data RecordRow rhs = RecordRow LabelId rhs
  deriving (Eq,Ord,Show,Functor,Foldable,Traversable)
data TableRow rhs = TableRow LinPattern rhs
  deriving (Eq,Ord,Show,Functor,Foldable,Traversable)
-- *** Identifiers in Concrete Syntax
newtype PredefId = PredefId Id deriving (Eq,Ord,Show)
newtype LabelId = LabelId Id deriving (Eq,Ord,Show)
data VarValueId = VarValueId QualId deriving (Eq,Ord,Show)
-- | Name of param type or param value
newtype ParamId = ParamId QualId deriving (Eq,Ord,Show)
--------------------------------------------------------------------------------
-- ** Used in both Abstract and Concrete Syntax
newtype ModId = ModId Id deriving (Eq,Ord,Show)
newtype CatId = CatId Id deriving (Eq,Ord,Show)
newtype FunId = FunId Id deriving (Eq,Show)
data VarId = Anonymous | VarId Id deriving Show
newtype Flags = Flags [(FlagName,FlagValue)] deriving Show
type FlagName = Id
data FlagValue = Str String | Int Int | Flt Double deriving Show
-- *** Identifiers
type Id = RawIdent
data QualId = Qual ModId Id | Unqual Id deriving (Eq,Ord,Show)

which is then printed directly using render80:

abs2canonical (cnc,gr) =
    writeExport ("canonical/"++render absname++".gf",render80 canAbs)
  where
    absname = srcAbsName gr cnc
    canAbs = abstract2canonical absname gr


So if one manages to write a function that converts a GF.Grammar.CFG.ParamCFG into either a GF.Grammar.Grammar.Grammar or a GF.Grammar.Canonical.Grammar, the rest would be simple. The question is how to do that conversion. Perhaps the cf2pgf function can at least be a source of inspiration:

cf2pgf :: FilePath -> ParamCFG -> PGF
cf2pgf fpath cf =


inariksit commented Jan 6, 2023

I see, thanks for digging into it @anka-213!

The abstract syntax is quite easy to piece together by copy and paste: open the cf file in the GF shell, run the commands pg -cats and pg -funs, copy the output into a file, and surround it with the required abstract myGrammar = { cat … fun … }. But I couldn't find anything for producing the concrete syntax.
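
The wrapping step could be scripted along these lines. This is only a sketch: `wrapAbstract` is a hypothetical helper, and it assumes the copied category list is whitespace-separated and the copied function list has one `name : Type` signature per line, optionally ending in a semicolon. The actual output of pg -cats and pg -funs may need cleaning up before it fits that shape.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Wrap a category list and a list of function signatures
-- into the text of an abstract module (hypothetical helper).
wrapAbstract :: String -> String -> String -> String
wrapAbstract name catsOut funsOut = unlines $
  ["abstract " ++ name ++ " = {"]
  ++ ["  cat " ++ c ++ " ;" | c <- words catsOut]
  ++ ["  fun " ++ sig ++ " ;" | l <- lines funsOut, let sig = clean l, not (null sig)]
  ++ ["}"]
  where
    -- Strip surrounding whitespace and any trailing semicolons.
    clean = dropWhileEnd (\ch -> isSpace ch || ch == ';') . dropWhile isSpace
```

For the example grammar, `putStr (wrapAbstract "myGrammar" "S" "S_Hello_World : S ;")` prints an abstract module with one category and one function.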

The motivation for my question is to recreate this work (https://github.com/smucclaw/sandbox/tree/default/aarne#readme), where the pipeline involves first producing a CF grammar automatically, then converting it to a GF grammar and continuing to refine the rules manually. But it may well be that the first step, a string-based GF grammar, is not even necessary, and one could jump right into an RGL-based concrete syntax, which inevitably needs human effort. (Or an automated script, but that's not the GF compiler's job.)
