Spanish: New Grammar Module with Initialization #35

tajmone · 2021-09-19T09:06:52Z

Maintainer

@Rich15, as mentioned in other Discussions (#33) I have the impression that the Spanish Foundation Library would greatly benefit from an initialization system that takes care of handling GNA (gender, number, animate/inanimate) by defining for each noun the required elements at initialization time.

I think that what I've done with the Alan3-Italian project (Italian translation of the StdLib 2) would be very close to what is needed here.
You can refer to the lib_italian.i module for the full-fledged (and commented) code (the Italian Foundation is still an early draft, and I haven't fully ported that code to it yet).

If you could study the lib_italian.i module, it might give you some insight into whether that approach would work for Spanish too. The ALAN code should be rather intuitive, and I've added lots of comments and examples to make it easier to study. Also, I think that Spanish and Italian are similar enough for you to understand all its linguistic references and examples (reading is always easy, the hard part is writing).

In any case, I think that the first step would be to move all code that relates to Spanish grammar issues to a dedicated module, e.g. gramática.i, so we have all the code in one place, which makes it easier to track what's doing what.
(ALAN doesn't impose any specific order on definitions, e.g. a class declaration may follow its instances, which makes it easy to organize code as one wishes).

Then, the second step will be to create at least one test example which covers all GNA cases (actors, containers, lists, inventories, etc.) so that we have in the repository a reference transcript with correct GNA agreement. This would allow us to implement the initialization code and track any breaking changes, because any change in the generated transcript would show up in Git's work area, allowing us to catch unexpected results in the diffs.

I can't stress enough how important it is to have a good test suite, especially before introducing big changes like this, because the transcripts are our means to ensure that the new code will produce the expected output.

In this respect, your help would be invaluable, because you could create one or more small test adventures to cover all the possible gender and number combinations for different game objects, actors, etc., and their test solution script to generate the transcripts — I can do very little here, due to lack of Spanish knowledge.

Solutions/transcripts (*.a3s/*.a3t) can also contain comments (starting with ;), so it's easy to annotate bugs, things to change, etc. Again, I use the ; @XXX notation in transcripts too (e.g. ; @BUG, ; @NOTE, etc.) to make it easy to search for annotations in transcripts throughout the entire repository at once.
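
To illustrate the idea, here is a minimal Python sketch of how such annotations could be collected from a transcript. It is not part of the repository tooling: the function name and the regular expression are assumptions, based only on the `; @XXX` convention described above (with or without a space after the semicolon).

```python
import re

# Matches comment lines such as "; @BUG wrong article" or ";@NOTE ok".
# The pattern is an assumption derived from the "; @XXX" convention.
ANNOTATION = re.compile(r"^\s*;\s*@(\w+)\b(.*)$")

def find_annotations(text):
    """Return (line number, tag, note) for every annotated comment line."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = ANNOTATION.match(line)
        if match:
            found.append((lineno, match.group(1), match.group(2).strip()))
    return found

sample = "> tomar la espada\n; @BUG wrong article\n;@NOTE ok\n"
for hit in find_annotations(sample):
    print(hit)
```

Running the same function over every *.a3s/*.a3t file would give a repository-wide list of open annotations.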

I'll now begin studying the Spanish code more closely, and move to a separate grammar module everything that has to do with grammar related definitions. I've also started to collect links to Spanish grammar references and tutorials in English, which I'll be adding to the Wiki, so I can get a clearer picture of what the initialization code should do.

I'll be posting further considerations, "discoveries", proposals and questions here, so we can keep this goal all in one discussion, at least until we've set some clear goals. Then we could create a milestone in the Dashboard, with all its sub-tasks as individual Issues to accomplish (this change is probably going to require multiple steps).

Any suggestions in this respect? Questions? Considerations?

Rich15 · 2021-09-20T03:03:36Z

Maintainer

@tajmone, after looking at the Italian library I think a pretty similar system could be implemented for the Spanish one. Initializing GNA should be a little bit easier because there are no gender considerations for most prepositions (with two exceptions, which I'll list in a moment). I guess some conjunctions should also be included ("and", "to"...).

Prepositions gender and number

From the prepositions most used (and the ones I remember), only two change depending on gender or number:

"de"(of/di): When preceding a m.s definite article ("el"), the preposition "absorbs" the article. E.g: "El traje de el vendedor" => "El traje del vendedor". All other cases are written with the preposition followed by the article ("El vestido de la mujer"). This does not apply to Proper Nouns (like El Paso).
"su"(his/her/their): When preceding plural nouns, a "-s" is added at the end. E.g: "El vendedor y sus zapatos", "Las mellizas y sus botas". (I don't know if this actually could be used somewhere or not, but it's the other preposition I remembered).

I'd want to try and make some test suites, but I'll need your help with the setup and things related to it :) (see my reply in #33). In the meantime, I'll keep looking for translations and orthographic correction in the code. I had a busy weekend so I couldn't review much, but when I have made some more commits I'll start a pull request.

1 reply

tajmone Sep 20, 2021
Maintainer Author

@tajmone, after looking at the Italian library I think a pretty similar system could be implemented for the Spanish one. Initializing GNA should be a little bit easier because there are no gender considerations for most prepositions

I think so too. Also, Spanish doesn't have articles and prepositions that contract and join to the noun via an apostrophe (e.g. "l'amico" instead of "lo amico"), which makes things way easier.

I'll now need to read through the whole Spanish library source code, to understand better how the original library is supposed to work (not a lot of instructions either), and then move all the grammar related code into a single module, since right now it's scattered all over the place.

From the prepositions most used (and the ones I remember), only two change depending on gender or number [...]

Right now, I have the impression the library is not handling much besides number and gender, and that defining the various articles is pretty much left to end authors. Here's where the Spanish Inform 6 sources will come to the rescue, for they contain all the required tables in the comments, which should help us understand what needs to be covered by the initialization code.

Right now, I have some piled up work to catch up with, but once that's dealt with I'll be able to find enough time to look into all this (I'll probably need an entire afternoon to study all the library code and test its details).

Now let's focus on getting you up and running with the repository (#36) ...

tajmone · 2021-10-06T01:04:39Z

Maintainer Author

GNA References in INFSP 6 Sources

@Rich15, I've been looking at the sources of the Spanish library for Inform 6, which I also stored on the Wiki for reference. The file that interests us for the sake of the grammar module is Spanish.h.

Here follow a few selections from that source, with the GNA arrays and the original comments (which are quite useful).

Articles

Array LanguageArticles -->
    ! Contraction form 0:
    ! Cdef   Def   Indef
    "El "    "el   " "un "
    "La "    "la   " "una "
    "Los "   "los  " "unos "
    "Las "   "las  " "unas ";

                   !             a           i
                   !             s     p     s     p
                   !             m f n m f n m f n m f n
Array LanguageGNAsToArticles --> 0 1 0 2 3 2 0 1 0 2 3 2;
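
For anyone not familiar with Inform 6, here is a minimal Python sketch of how I read these two arrays: the GNA number (0-11) selects a row through LanguageGNAsToArticles, and the column selects the article type. The function names are made up for illustration, and the trailing spaces of the original strings are dropped.

```python
# Rows copied from the LanguageArticles array above:
# (capitalised definite, definite, indefinite), trailing spaces dropped.
ARTICLES = [
    ("El", "el", "un"),
    ("La", "la", "una"),
    ("Los", "los", "unos"),
    ("Las", "las", "unas"),
]

# GNA number (0-11) -> row of ARTICLES, copied from LanguageGNAsToArticles.
GNA_TO_ROW = [0, 1, 0, 2, 3, 2, 0, 1, 0, 2, 3, 2]

COLUMNS = {"cdef": 0, "def": 1, "indef": 2}

def gna_number(animate, plural, gender):
    """GNA index following the comment header: animate/inanimate,
    then singular/plural, then m/f/n (hypothetical helper)."""
    return (0 if animate else 6) + (3 if plural else 0) + "mfn".index(gender)

def article(gna, kind="def"):
    """Look up the article for a GNA number (hypothetical helper)."""
    return ARTICLES[GNA_TO_ROW[gna]][COLUMNS[kind]]

print(article(gna_number(animate=False, plural=True, gender="f")))
print(article(gna_number(animate=True, plural=False, gender="m"), "indef"))
```

Note that neuter maps to the same rows as masculine, so for Spanish the table really only distinguishes four cases.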

GNA Tables

! Functions to handle gender and number agreement in the
! generic response messages.
!
! o -> prints the ending -o -a -os -as according to the gender and number
! of the given object, to make adjectives agree.
!
! n -> prints the ending -"" -n according to the number of the object, to
! make the verb agree in the plural.
!
! esta -> prints "está" or "están" according to the number of the object.
!
! del -> prints "del" "de la" "de los" "de las" according to the gender and
! number of the object. Actually, it works out which article would be used
! and, if it is "el", performs the contraction "del".
!
! al -> like del, but with "al" "a la" "a los" "a las".
!
! lo -> prints "lo" "la" "le" "los" "las" "les" (pronoun) according to the
! gender and number of the object and whether it is animate or not.
!

[ o obj gna;

    gna=GetGNAOfObject(obj);
    switch(gna)
    {
     0,6:  print "o";
     1,7:  print "a";
     3,9:  print "os";
     4,10: print "as";
    }
];

[ e obj gna;

    gna=GetGNAOfObject(obj);
    switch(gna)
    {
     0,6:  print "e";
     1,7:  print "a";
     3,9:  print "es";
     4,10: print "as";
    }
];


[ n obj;
    if (obj == player) print "s";
    else if (obj has pluralname) print "n";
];

[ s obj;
    if (obj has pluralname) print "s";
];

[ esta obj;
    print "está", (n) obj;
];

[ es obj;
    if (obj has pluralname) print "son";
    else print "es";
];

[ _Es obj;
    if (obj has pluralname) print "Son";
    else print "Es";
];

[ _s obj;
    if (obj has pluralname) print "es";
];

[ el_ obj;
    if (obj hasnt proper) PrefaceByArticle(obj,1,-1);
];

[ un_ obj;
    if (obj hasnt proper) PrefaceByArticle(obj,2,-1);
];


[ el obj;
    print (the) obj;
];

[ _El obj;
    print (The) obj;
];

![ un obj;
!    if (un_(obj)) print " ";
!    print (name) obj;
!];

[ un obj;
!  if (obj has proper) print "a "; ![infsp] this line was already included in the Indefart hack.
  print (a) obj;
];

[ _Un obj; ! indefinite with the first letter capitalized. -Eliuk Blau
  print (A) obj;
];


[ _nombre_ obj;
    print (name) obj;
];

[ numero obj;
    print (number) obj;
];


! "al" and "del" pose a nice problem. The contraction must take
! place if the article is "el", but this cannot be determined from
! the GNA, since there are words like "aguila" that are feminine and
! nevertheless take "el" as their article.
! The trick is to call (the) to find out which article would be
! printed (capturing the output of that print and storing it
! in a variable). If the article in question starts with 'e', the
! contraction takes place.
!
! To capture the output into a variable, it is necessary to drop down to
! Z-machine assembly language. I learned this trick from the
! Designer's Manual (answer to exercise 96, page 249)

Pronouns Table

It's less clear to me what the following table is needed for, but I'm including it here anyway.

Array LanguagePronouns table

!   word     GNAs that may                  connected
!            follow it:                     to:
!              a     i
!              s  p  s  p
!              mfnmfnmfnmfn

    '-lo'    $$101000100001                    NULL
    '-los'   $$000101000101                    NULL
    '-la'    $$010000010000                    NULL
    '-las'   $$000010000010                    NULL
    '-le'    $$110000110000                    NULL
    '-les'   $$000110000110                    NULL
    'él'     $$100000100000                    NULL
    'ella'   $$010000010000                    NULL
    'ellos'  $$000100000100                    NULL
    'ellas'  $$000010000010                    NULL;
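
As far as I can tell, each `$$` value is a 12-bit mask whose leftmost bit stands for GNA 0, in the same a/i, s/p, m/f/n order as the comment header. The Python sketch below decodes a mask under that assumption; the function name and the labels are mine, not the library's.

```python
# One label per GNA number 0-11, in the order given by the comment header:
# animate/inanimate, then singular/plural, then m/f/n.
LABELS = [
    f"{animacy} {number} {gender}"
    for animacy in ("animate", "inanimate")
    for number in ("singular", "plural")
    for gender in "mfn"
]

def decode_gna_mask(bits):
    """List the GNAs enabled in a 12-character bit string.
    Assumes the leftmost bit is GNA 0 (hypothetical helper)."""
    if len(bits) != 12 or set(bits) - {"0", "1"}:
        raise ValueError("expected a 12-character string of 0s and 1s")
    return [LABELS[i] for i, bit in enumerate(bits) if bit == "1"]

# The mask listed for '-la' and 'ella' in the table above.
print(decode_gna_mask("010000010000"))
```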

Of course, these snippets on their own are a bit out of context, but I believe they represent well all the aspects that our initialization code will need to handle.

Also, it's important to notice that Inform tries to deduce most of the GNA procedurally, on the fly, whereas in ALAN we'll be assigning all the required references on each and every game instance, at initialization time.

So our goal is to work out how to let authors provide a single descriptor (e.g. the definite article) and derive all other GNA-related information from it. If the article is not enough because of some edge cases (like the above-mentioned "aguila", which is feminine even though it takes "el" as article), we can always provide a special reference for that. E.g. in the Italian library I used a pseudo-article "l'f" to indicate the feminine variant of the "l'" article, which is masculine. So there's always an easy workaround to handle exceptions, once the general picture is clear.

Now we only need to come up with a way to translate all these grammar rules into ALAN code, while keeping it as simple and terse as possible.

tajmone · 2021-10-08T23:47:20Z

Maintainer Author

New Grammar Module (DRAFT)

@Rich15, I've managed to set up the new gramática.i module, although it's still a draft right now. You can follow the ongoing work by checking out the dev_alan-es_gramatica branch locally.

Immediate Results!

So far, I've only added some extra code to define all Spanish articles as NOISE WORDS (i.e. to be ignored by the parser):

Synonyms el, él, la, los, las = 'the'.

and this single line alone allowed me to safely delete around 100 redundant SYNTAX definitions from the library code! (Have a look at the branch commits to see for yourself how a small change can reduce code size.)

E.g. the pALANte library previously required the following SYNTAX to handle all possible input combinations:

Syntax
  afilar = afilar (obj1) con (obj2)
  afilar = afilar el (obj1) con (obj2).
  afilar = afilar el (obj1) con el (obj2).
  afilar = afilar (obj1) con el (obj2).

whereas now it's sufficient to write only:

Syntax afilar = afilar (obj1) con (obj2)

since all articles are stripped away from the player input at parse time.
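
The effect can be mimicked outside ALAN with a few lines of Python. This is only a stand-in for what the parser does with noise words, not ALAN code, and the function name is made up. "él" is left out of the set on purpose, since it is a pronoun rather than an article (see the Noise Words remark later in this thread).

```python
# Articles treated as noise words (assumption: the four definite articles).
ARTICLES = {"el", "la", "los", "las"}

def strip_articles(command):
    """Drop articles from the player's input before pattern matching."""
    return [word for word in command.lower().split() if word not in ARTICLES]

variants = [
    "afilar cuchillo con piedra",
    "afilar el cuchillo con piedra",
    "afilar el cuchillo con la piedra",
    "afilar cuchillo con la piedra",
]
# Every variant collapses to the same token list, so one pattern is enough.
for command in variants:
    print(strip_articles(command))
```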

I also believe that the library can now properly handle commands with any article, whereas before it only understood the "el" article.

The next step will be to define all preposition variants as synonyms of their base form (e.g. "del" → "de", etc.) so that SYNTAXes only need to mention one preposition for all genders and numbers, since the parser will translate all variations to the base form during parsing. This will further reduce the number of redundant SYNTAXes in the library, slimming down the code even more.

Initialization Code from pALANte Lib

While moving all "grammar related" code to the new module, I discovered that the original pALANte library did contain some sort of initialization code, defined as an EVENT which was supposed to be SCHEDULED for execution in the START section of every adventure (I didn't use it in the test adventures, which is why some GNA might not be showing correctly).

Here's the code:

-- Event that initializes the endings of objects and words
-- such as 'son' or 'es'.
-- Don't forget to call it in the "start" section, e.g. with the command:
--    schedule ini_terms at limbo after 0.


Event ini_terms
  For each o IsA object do
    If ser of o = "" then
      If o is plural then
        Set ser of o to "son".
      else
        Set ser of o to "es".
      End if.
    End if.
    If term_n of o = "" then
      If o is plural then
        Set term_n of o to "$$n".
      End if.
    End if.
    If term_s of o = "" then
      If o is femenina then
        If o is plural then
          Set term_s of o to "$$as".
        else
          Set term_s of o to "$$a".
        End if.
      else
        If o is plural then
          Set term_s of o to "$$os".
        else
          Set term_s of o to "$$o".
        End if.
      End if.
    End if.
  End for.
End event.

Although the approach I had in mind is a bit different, I'm trying to see if something can be recycled from this EVENT. But first I might need some help from you to understand a couple of things here...

The `ser` Attribute

I'm not sure what the ser string attribute is used for. It becomes "son" if the noun is plural, and "es" otherwise, but only if the author didn't provide a custom definition of ser on the specific object. I have the impression this refers to the form of the verb BEING that needs to be used in messages (i.e. "son" for are, "es" for is), is that it? Would this cover all possible GNAs?

Any idea what it's used for? (grammatically)
What does ser mean or stand for?
Is it clear enough that we keep the same name (ser) in the new grammar module, or do you think we should use another attribute name?

The `term_n` Attribute

It seems that the term_n attribute is what I've been calling vocale in our previous discussions — i.e. the GNA suffix that should be appended to attributes, so they mirror the GNA of the noun they refer to.

What does term_n mean or stand for?
Any idea of a good name for this attribute? We would like it to be short but intuitive, making it clear that it's a suffix for GNA agreement.

I'll be digging further into the library code to see how these are used (especially since we'll have to tweak that code later on) and decide if and how they can work with the new INITIALIZE base system I have in mind.

If you have some time, please check out the new dev_alan-es_gramatica branch locally, compile all the adventures (Vampiro and tests) and play around a bit with the various commands (especially in Vampiro) to get an idea of how the parser is handling gender and number in prepositions. That is, try to use the "full commands" instead of omitting articles and prepositions, so you can verify the current status of GNA coverage by the library.

If you have any ideas or intuitions on how the grammar code should handle something, please let me know.

tajmone · 2021-10-09T19:52:30Z

Maintainer Author

New Initialization Code In Place

@Rich15, I've managed to replace the old ini_terms EVENT in the gramática.i module with a proper INITIALIZE block on the entity class. So game authors no longer need to Schedule ini_terms in the Start at section of their games.

And I've added the new artículo string attribute on every entity, which will now be used to inform the library how to handle grammatical initialization for each instance (e.g. Has artículo "las".).

The original pALANte grammar attributes term_n and term_s were preserved, but I've currently omitted the ser attribute since it wasn't being used anywhere in the library (not documented either). Later we'll try to understand what its original usage intention was, and decide whether it's worth implementing it or not — since we're using a different approach, we might also come up with better alternative solutions.
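
To make the derivation concrete, here is a rough Python sketch of the logic, not the actual ALAN code in gramática.i. It assumes the four plain definite articles only (no exception nouns yet), reuses the suffix rules of the old ini_terms EVENT, and leaves out the `$$` prefix that the ALAN strings use. The class and function names are made up.

```python
from dataclasses import dataclass

@dataclass
class Grammar:
    femenina: bool
    plural: bool
    term_n: str  # verb suffix: "n" for plural subjects, otherwise empty
    term_s: str  # adjective suffix: o / a / os / as

# artículo -> (femenina, plural); only the four regular cases are covered.
ARTICULOS = {
    "el": (False, False),
    "la": (True, False),
    "los": (False, True),
    "las": (True, True),
}

def init_from_articulo(articulo):
    """Derive gender, number and both suffixes from the definite article."""
    femenina, plural = ARTICULOS[articulo]
    term_n = "n" if plural else ""
    term_s = ("a" if femenina else "o") + ("s" if plural else "")
    return Grammar(femenina, plural, term_n, term_s)

print(init_from_articulo("las"))
```

An unknown article raises a KeyError here, which is roughly where the exception nouns would need their own handling.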

I've tweaked Vampiro and the test adventures to adopt the new system, replacing all femenina and plural instance definitions with the new artículo attribute. None of their generated transcripts showed any differences from the old system, which I take as confirmation that the new approach has successfully replaced the original code from pALANte (hence the importance of proper test coverage).

So far, so good. But of course it's still a draft and there's quite some work left to do, i.e. covering all the various grammar aspects that were not dealt with in the pALANte library.

Next Steps

Here's a list of things that deserve being looked into:

Exception nouns: like "aguila", which is feminine but takes the "el" article.
- Are there other types of exceptions like this? I.e. affecting articles other than "el"? Or is this "feminine EL" the only exception?
Prepositions: do we need to define attributes to handle prepositions based on GNA? So far I only noticed that "de" sometimes becomes "del", i.e. if the noun has the "el" article (if I've understood correctly). In this case, we'll need an attribute like prep_DE to be able to retrieve the correct preposition from the instance (as opposed to having to use long If/Then/Else blocks).
- Could you list for me all the prepositions that vary depending on the noun's GNA?
Proper Named Nouns: currently the library has a dedicated named_actor class (defined in "atributos.i") to handle actors with proper names (i.e. so they are always mentioned without an article). We should get rid of this class and replace it with an attribute instead, so it can be applied to any class (after all, there are named objects and locations too) and be handled by the INITIALIZE block. We must do so, otherwise we'll be setting the Definite and Indefinite article on named instances too!
- What should this attribute be called? In English it would be something like Is named; in Italian it's Has nome_proprio.
Articles and Forms: we should investigate further whether the new INITIALIZE block also needs to define FORMs, besides ARTICLEs. It currently handles only Definite Article and Indefinite Article, but not Negative Article nor any type of Form! I'm not sure if and how these might be of interest for Spanish; and because in English they are rarely used, and in Italian I haven't yet found a way to use them, I'm not too experienced with them either. But it's definitely a topic worth looking into; somehow, I have the feeling that the Negative Article is going to be useful in Spanish.
Rename term_n and term_s: I don't like the names of these attributes, they don't really convey any useful information about their role. Since they are both suffixes, and affect adjectives and verbs, I was thinking something like verb_suffix and adj_suffix, but possibly a shorter version (adj_sufx, or something like that, as long as it's still a useful mnemonic). But of course, they'd have to be meaningful in Spanish, not in English.
- Any proposals for their new Spanish names?
Tests, Tests and More Tests! And of course, now more than ever we need extensive test coverage, since we need to surface any bugs or unexpected edge cases before making the new grammar module definitively part of the library.

So @Rich15, I thought you might want to be kept up to date with ongoing work on the new grammar module, and hopefully the above info provides a good status resume. I hope you'll find some time to play around with it and give me some feedback.

I'm quite optimistic: in the past weeks the Spanish library has made some huge leaps in terms of improvements and fixes, and it seems to be going in the right direction. Let's just hope there are no big surprises waiting around the corner (like the parameters bug of #43), you never know. But then again, it's thanks to new challenges like this non-English library update that many ALAN limits and bugs were discovered and resolved in the past. That's an aspect of pioneering work which I truly love, because any work that helps a language evolve always carries an added value of merit, besides its own intrinsic value as a final product.

Rich15 · 2021-10-12T23:40:47Z

Maintainer

Hello @tajmone! This last week I couldn't get on the computer because I was feeling pretty bad, but I'm finally feeling better. The grammar module is looking really great! You have done an excellent job creating the grammar initialization. I'll soon start testing some things with the existing adventures. I still have the mensajes test on my to-do list too.

Noise Words

So far, I've only added some extra code to define all Spanish articles as NOISE WORDS (i.e. to be ignored by the parser);
Synonyms el, él, la, los, las = 'the'.

"él" shouldn't be included in this line, since it isn't an article but a personal pronoun (he). I don't know if any commands use "él" but it's better to take it out of there before a bug pops.

pALANte Attributes

`ser`

I have the impression this refers to the form of the verb BEING that needs to be used in messages (i.e. "son" for are, "es" for is)

Yes, that's what it means. ser is the infinitive form, so when something is plural it changes to "son", and to "es" for singular. I think it already has a descriptive name. What's weird to me is that it wasn't used anywhere in the library.

`term_n` and `term_s`

I think term_n attribute is used to add "-n" at the end of verbs; this only happens if the subject doing the action is plural.

On the other hand, term_s adds a suffix to adjectives depending on the object's GNA. This could be a problem with nouns like "águila", but I'll talk about this next.

Since they are both suffixes, and affect adjectives and verbs, I was
thinking something like verb_suffix and adj_suffix

These attributes have very confusing names indeed, so I think your suggestions are more descriptive: term_n = verb_suffix; term_s = adj_suffix.

Exception nouns

Exception nouns: like "aguila", which is feminine but takes the "el" article.

Are there other types of exceptions like this? I.e. affecting articles other than "el"? Or is this "feminine EL" the only exception?

Yes, and this is a rather complex subject that confuses even native Spanish speakers, but I'll try to simplify it.

All feminine nouns starting with a stressed a or ha change their definite article to "el".

Some examples: el agua (water), el alma (soul), el águila (eagle), el arma (gun), el hada (fairy), el hacha (axe), etc.

All these nouns and their adjectives are feminine (e.g. el agua turbia, el hacha pesada).

NOTES:

Indefinite articles can be written in either the feminine or the masculine form. Un águila and Una águila are both correct.
This DOES NOT apply to plural forms; in that case the article is feminine. E.g. las aguas, las armas, las hachas.
If an adjective or other words are placed between the article and the noun, the article is feminine. E.g. la pesada hacha, la pequeña hada, la negra agua.
Demonstrative adjectives and other types of adjectives are not affected by this, i.e. they are still feminine.

There is also the case of azúcar (sugar), which is feminine and uses "el", but doesn't begin with a stressed a or ha.

I don't know how we could manage these cases, since ALAN can't differentiate stressed syllables in Spanish. Maybe they should be managed individually, adding something like a femenina attribute along with the artículo, plus code to change the definite article and adjective in these cases. As I said, this is a complex subject, and right now I'm stuck, but I'm sure we'll come up with a solution.
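
Just to show the kind of rule I mean, here is a rough Python sketch (not ALAN code). It assumes the author marks these nouns by hand with a flag, here called `stressed_a`, and that the library knows whether an adjective sits between the article and the noun. Both names are made up for illustration.

```python
def definite_article(femenina, plural, stressed_a=False, adjective_before=False):
    """Pick the definite article, applying the 'feminine el' exception.

    stressed_a: author-supplied flag for nouns like águila, agua, hacha
                (hypothetical attribute, set individually per noun).
    adjective_before: True when an adjective comes between article and noun.
    """
    if plural:
        return "las" if femenina else "los"   # plurals are never affected
    if femenina:
        if stressed_a and not adjective_before:
            return "el"                       # el águila, el hacha
        return "la"                           # la mesa, la pesada hacha
    return "el"

print(definite_article(femenina=True, plural=False, stressed_a=True), "águila")
print(definite_article(femenina=True, plural=True, stressed_a=True), "águilas")
print(definite_article(True, False, stressed_a=True, adjective_before=True), "pesada hacha")
```

Adjectives would keep using the feminine suffix in all of these cases, since only the article changes.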

Here is some additional info if you want to read it:

Spanish words that start with stressed A and HA
el|Diccionario Panhispánico de dudas (in Spanish)
The Enigmatic Morphology of Spanish azúcar and the “New Feminine el” || This article also talks about azúcar and other exceptions.

Prepositions

Could you list for me all the prepositions that vary depending on the noun's GNA?

There are two of them:

"a": When preceding a m.s. noun with a definite article, it "fuses" with the article. E.g. dar a el mago becomes dar al mago; the "a" fuses with "el" and becomes "al"
"de": Same case as with "a". E.g. la varita de el mago becomes la varita del mago. "de" transforms into "del".

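Both contractions depend only on the article being "el", so they can be sketched with a tiny lookup. This is illustrative Python with a made-up function name, not library code, and it ignores the proper noun case (like El Paso) mentioned earlier in this thread.

```python
# (preposition, article) pairs that fuse into a single word.
CONTRACTIONS = {("a", "el"): "al", ("de", "el"): "del"}

def prep_with_article(prep, article, noun):
    """Join preposition, definite article and noun, contracting a+el / de+el."""
    contracted = CONTRACTIONS.get((prep, article))
    if contracted is not None:
        return f"{contracted} {noun}"
    return f"{prep} {article} {noun}"

print(prep_with_article("de", "el", "mago"))
print(prep_with_article("a", "la", "mujer"))
print(prep_with_article("de", "el", "águila"))
```

Because the check is on the article and not on the gender, a feminine noun that takes "el" (like águila) contracts as well.
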
Proper Named Nouns attribute

What should this attribute be called? In English it would be something like Is named; in Italian it's Has nome_proprio.

I think an appropriate name would be Has nombre_propio. As you can see it's very similar to Italian.

ARTICLEs and FORMs

it currently handles only Definite Article and Indefinite Article, but not Negative Article nor any type of Form!

I don't fully understand Negative Article yet. I think it would be used in a message like No hay ninguna camisa aquí (There is no shirt here/There isn't any shirt here). If this is the case, ningún would vary depending on GNA:

-- m.s.
Ningún

-- f.s.
Ninguna

-- m.p.
Ningunos 

-- f.p.
Ningunas

However, if this represents extra work, a message like No hay una camisa aquí makes sense too. In other words, the Negative Article could be replaced by the Indefinite Article.
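
As a rough illustration of the four forms above (Python, not ALAN, with made-up names), the negative article could be picked from gender and number like this:

```python
# (gender, plural) -> negative article, following the four forms listed above.
NEGATIVE_ARTICLE = {
    ("m", False): "ningún",
    ("f", False): "ninguna",
    ("m", True): "ningunos",
    ("f", True): "ningunas",
}

def no_hay(noun, gender, plural=False):
    """Build a 'there is no ... here' message with the agreeing article."""
    return f"No hay {NEGATIVE_ARTICLE[(gender, plural)]} {noun} aquí."

print(no_hay("camisa", "f"))
print(no_hay("libro", "m"))
```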

So @Rich15, I thought you might want to be kept up to date with ongoing work on the new grammar module, and hopefully the above info provides a good status resume. I hope you'll find some time to play around with it and give me some feedback.

Thanks for the resume! I'm still a little sick but pretty soon I'll start doing tests and tweaking things locally.

I'm quite optimistic: in the past weeks the Spanish library has made some huge leaps in terms of improvements and fixes, and it seems to be going in the right direction. Let's just hope there are no big surprises waiting around the corner (like the parameters bug of #43), you never know.

I'm optimistic too. I think this grammar module will be a huge improvement for the library. Let's see what comes in the future.

9 replies

tajmone Oct 16, 2021
Maintainer Author

@thoni56,

This is interesting. We surely want Alan to be able to do, in some fashion, what other systems can support.

That would be nice, but bear in mind that some IF systems tend to provide a generic IF-oriented programming language, and then program the IF "environment" directly in the library. E.g. TADS3 exposes everything in the library, from how the parser works to how commands are processed, which is why translating the library to other languages can be so challenging (IIRC, 27,000 lines of code). In other systems the line separating the language and the library may be fuzzier, e.g. the language is built around some core IF classes and data structures, but exposes some of them to the library to allow further manipulation (e.g. the parser tokens as an array to be manipulated at each parsing iteration of the input, like in Hugo).

As I understand it, this is a problem only on output. On input you can easily solve this with synonyms, right?
Synonym uccidi = uccidilo.

More the input, actually. The problem with SYNONYMS is that they can only translate one word to another, whereas in cases of compound words where a pronoun was attached to a verb (or an article to a preposition) they would need to be translated to the original uncompounded words. E.g.

uccidilo -> uccidi + IT

That's what the Inform Italian library does. Of course, we might disregard the gender of the attached pronoun and just use IT (instead of HER/HIM), although in some cases plurality might need to be preserved, e.g. it might refer to multiple parameters of a previous verb:

> examine pizza AND pasta
> mangiale (eat THEM) -> mangia + THEM

... so that the eat verb will be applied twice, once to each parameter.

As for the output, I think that the ALAN approach of using string attributes to fill in the required vowels offers a rather elegant solution. Of course, the author has to ensure that messages honour gender and number by referring to the grammatical string attribute to fill in the gendered parts of a game message:

"You try to ucciderl" say gna_vowel of this. "but to no avail."

whereas in Inform the approach is to let the library handle that, since it has a more abstract approach to the language where it's the actual game engine and the library that handle the inflection of verbs, prepositions, etc. Basically, a library has to define all the grammar rules for verb conjugation, ordinal numbers and how they compound together, prepositions, etc., resulting in a long list of internal rules (GNA being a prominent term/rule applying everywhere). I believe this was done in order to allow translation to other languages, and at the same time to allow a single library to work in different narrative tenses, which is why even verbs and game-specific rules are described in generic terms, since they might apply to different tenses and subjects/objects, e.g.

Instead of taking something =>

The ALAN approach is more down to earth, and low-level, allowing the author to work much closer to the language and library, without all the hidden abstractions (what you see/write is what the player gets).

I still haven't read up on how Inform and TADS translations use features in the language. But I can't really see the point in manipulating the input during parsing. The point of input parsing is only to understand what was meant, not to ensure that input follows the grammar of the spoken language.

I don't know Inform well enough, but in TADS3 the parser is coded in the library itself. E.g. I saw that the German translation of the TADS library (the only translation so far) actually tweaks the parser algorithm in some places. First of all, both Inform and TADS have default settings on the max number of characters to parse in every input token (IIRC, around 6), so the words administrator and administration will be parsed identically (admini), but the German library extended this limit by some extra characters, since German words are usually longer.

Other tasks that are common in the parsing hooks are converting worded numbers to "arithmetic" numbers ("twelve" => 12), decomposing compound words ("mangiaLO" => "mangia + ESSO" / eat it), etc., depending on the language needs.

I believe that the original idea behind exposing the parser was to accommodate any foreign language needs — i.e. programmers not knowing the different structures and needs of every spoken language, they just thought that exposing the internal mechanics as much as possible should make any language supportable — but that's a conjecture on my side, although the Inform Design Manual does seem to confirm this when it presents the rationale behind GNA to support languages like French, Italian and Spanish.

Of course, in these IF systems the code to manipulate the parser tends to be very hackish in nature (i.e. real low level programming tasks), so to introduce something like this in ALAN without violating its core philosophy might require some clever abstraction.

I personally think that what we need is a construct like the current SYNONYM, but that can split compound words for reparsing — the problem is much more similar to the apostrophe elision case, where if l'acqua doesn't match a known noun, it's split into l'+acqua and reparsed (which then is correctly matched, since the article is stripped away, and the noun matches 'water').

So, SYNONYMs are for translating one word into another, what we need is to break up an unmatched token via some "reparse rules". They look similar but they are not:

A SYNONYM is always translated before parsing.
A reparse rule would only apply to an input token that didn't match, and would result in the original token being translated or split up into one, two or more tokens, and then the whole input is reparsed.

Furthermore, reparse rules would need to be rule-based, for having to spell out each verb with every suffix would be overkill:

mangialo -> mangia + it/him
mangiala -> mangia + it/her
mangiali -> mangia + it/them (m.p.)
mangiale -> mangia + it/them (f.p.)

whereas a simple Kleene star would do the trick:

*lo -> token + IT

where token is the token stripped of the ending part after *. I.e. a very simple RegEx system that only cares about handling prefixes and suffixes (some languages might use prefixes instead of suffixes).

I believe this construct should look something like this in ALAN:

Reparse "*lo" as (token) "it".
Reparse "dallo" as (token) "dai it".

Where the second example handles a more complex Italian case, that of the irregular verb give, which is DAI, but takes an irregular form when annexed to a pronoun, so in this case we need a literal reparse, which acts more like a SYNONYM. I.e. there's no Kleene star, so the original token is dropped and replaced entirely, whereas in rules with the star it's stripped of the suffix/prefix and reinserted where (token) is mentioned.
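To show how such rules could behave, here is a rough Python sketch of the idea. Everything in it is an assumption made for illustration (the rule table format, the function names, trying literal rules before starred ones); it is not how ALAN's parser works, and it only covers suffix rules.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical rule table mirroring the two examples above.
# A pattern starting with "*" is a suffix rule; "(token)" in its replacement
# stands for the input token with that suffix stripped.
# Any other pattern is a literal rule that replaces the whole token.
REPARSE_RULES: List[Tuple[str, str]] = [
    ("dallo", "dai it"),     # literal rule, listed before the generic "*lo"
    ("*lo", "(token) it"),   # suffix rule
]


def reparse(token: str, known_words: Set[str]) -> Optional[List[str]]:
    """Return replacement tokens for an unmatched token, or None if no rule applies."""
    if token in known_words:
        return None  # reparse rules only apply to tokens that didn't match
    for pattern, replacement in REPARSE_RULES:
        if pattern.startswith("*"):
            suffix = pattern[1:]
            if token.endswith(suffix) and len(token) > len(suffix):
                stem = token[: -len(suffix)]
                return replacement.replace("(token)", stem).split()
        elif token == pattern:
            return replacement.split()
    return None


def reparse_input(words: List[str], known_words: Set[str]) -> List[str]:
    """Expand every unmatched word; the result would then be parsed again."""
    result: List[str] = []
    for word in words:
        expansion = reparse(word, known_words)
        result.extend(expansion if expansion is not None else [word])
    return result
```

Because known words are never touched, a noun such as "cavallo" that happens to end in "lo" is left alone as long as it is in the dictionary, which is the difference from a SYNONYM applied before parsing.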

Does it make sense?

thoni56 Oct 16, 2021
Maintainer

Right. I misunderstood the meaning. The real problem is that the '-lo' part, that is "assimilated", actually represents the personal pronoun and should actually be replaced by 'it'.

I think your example makes perfect sense and was actually what made me understand my misinterpretation.

This is an aspect of languages that neither me nor Alan has encountered before. Neither of us has actually had any experience with Romance languages (except me slowly trying to grasp some beginner Italian).

Thanks for the glimpse into how other systems (not really) offer support for this. Again, working with Alan should be as little programming as possible.

Although your suggestion is clean and nice it builds on some mechanisms that are not easy to introduce in the command parser. I'll think about some other possible ways to support this.

Everything has a priority, and if this is essential I should put it high on the list. What is your current assessment of the "need" for this for writing games in Italian/Spanish with Alan?

Without this

there is no way we can write playable games
some games will be almost unplayable without explicit guidance
most games will be possible to play without explicit guidance
some games will have no problem at all
you can write almost any game with careful consideration

tajmone Oct 17, 2021
Maintainer Author

@thoni56, sorry for the late and sporadic replies but fever is really knocking me out in these days, so I spend most of the time in bed (or away from the PC anyhow).

Although your suggestion is clean and nice it builds on some mechanisms that are not easy to introduce in the command parser. I'll think about some other possible ways to support this.

Unfortunately I haven't yet got to understand the details of how the AMachine works internally, I'm still studying the storyfile header and how it organizes data. One day I shall get a better understanding of the AMachine, and be able to see these problems in the right perspective.

What is your current assessment of the "need" for this for writing games in Italian/Spanish with Alan? Without this ...

Every game will be playable by alerting the users not to use contracted forms but use the split form VERB + ACTOR, even though it may sound strange in the original language, in some cases.

E.g. instead of:

> x Zombie
> mataLO

use:

> x Zombie
> mata Zombie

or just > mata IT

The same applies to Italian, it's always possible to spell out the full verb. The only problem is when trying to use the pronoun after the verb, as in "KILL HIM", which in Spanish and Italian would be incorrect (or very strange) without the annexed pronoun — e.g. in Italian "uccidi ESSO" sounds really bad, because it would be natural to say just "uccidiLO", but saying "uccidi lo Zombie" is perfectly fine.

So it's really more a question of how well the parser can understand the native language, not about the game being unplayable.

Problems like the fact that currently the verbs with the extra 'DE' preposition are not working are more serious, because in some cases a preposition can mean a completely different action. But I haven't yet managed to look into that to see what's preventing it from working (but I have a theory about it, i.e. that there are too many alternative syntaxes, some of which overlap across different VERBs).

thoni56 Oct 17, 2021
Maintainer

@thoni56, sorry for the late and sporadic replies but fever is really knocking me out in these days, so I spend most of the time in bed (or away from the PC anyhow).

Not a problem. I mean the delay of a few days, to me that is definitely not a problem. Fever is more of a problem. Take it easy, this project, and the discussions will still be around. It's not like I suddenly have time and need an answer right now. Like you pointed out, it's a hobby. There are no deadlines.

Although your suggestion is clean and nice it builds on some mechanisms that are not easy to introduce in the command parser. I'll think about some other possible ways to support this.

Unfortunately I haven't yet got to understand the details of how the AMachine works internally, I'm still studying the storyfile header and how it organizes data. One day I shall get a better understanding of the AMachine, and be able to see these problems in the right perspective.

It was not my intention to indicate lack of understanding, or that you should have it. It was only meant as a data point.

What is your current assessment of the "need" for this for writing games in Italian/Spanish with Alan? Without this ...

Every game will be playable by alerting the users not to use contracted forms but use the split form VERB + ACTOR, even though it may sound strange in the original language, in some cases.

Good. Thanks, then I have a feeling for the priority.

The same applies to Italian, it's always possible to spell out the full verb. The only problem is when trying to use the pronoun after the verb, as in "KILL HIM", which in Spanish and Italian would be incorrect (or very strange) without the annexed pronoun — e.g. in Italian "uccidi ESSO" sounds really bad, because it would be natural to say just "uccidiLO", but saying "uccidi lo Zombie" is perfectly fine.

That is indeed a good solution that also would come natural as a second attempt when "uccidilo" did not work. It's like in the very old days when games did not have those pronoun forms.

Problems like the fact that currently the verbs with the extra 'DE' preposition are not working are more serious, because in some cases a preposition can mean a completely different action. But I haven't yet managed to look into that to see what's preventing it from working (but I have a theory about it, i.e. that there are too many alternative syntaxes, some of which overlap across different VERBs).

Once you have confirmed that theory, or another one, I'd be interested to see a game that exhibits that problem. Maybe there is something that can be done to avoid it. Since it is not easy to debug if it's just "not working", maybe we can also devise some author tools to analyze the situation.

Rich15 Oct 17, 2021
Maintainer

I think that the PRONOUN is still needed in many other verbs, e.g.
> give the apple to HIM
In Italian you would have to use the standalone pronoun in this case, and I believe in Spanish too.

Yes, it's the same in Spanish. We should define PRONOUN too then.

-- Pronouns in Spanish:
m.s: Él -- HIM
f.s: Ella -- HER
m.p: Ellos -- THEM
f.p: Ellas -- THEM

What is your current assessment of the "need" for this for writing games in Italian/Spanish with Alan? Without this ...

Every game will be playable by alerting the users not to use contracted forms but use the split form VERB + ACTOR, even though it may sound strange in the original language, in some cases.

Just wanted to confirm that is the same for Spanish games :)

The commands would be a bit longer, but they'd be perfectly understandable.

@thoni56, sorry for the late and sporadic replies but fever is really knocking me out in these days, so I spend most of the time in bed (or away from the PC anyhow).

Take care and get better soon @tajmone!

tajmone · 2021-10-24T04:42:03Z

tajmone
Oct 24, 2021
Maintainer Author

New Rake Functionality on The Way!

@thoni56 and @Rich15, I didn't disappear from the project, it's just that I've been (and still am) very busy updating the ALAN StdLib repository, where I'm switching the entire toolchain to use only Rake to carry out tasks that before were handled using Bash, batch and SED scripts.

So in the past week I've been working mostly on the custom Rake modules to automate ALAN and AsciiDoc related tasks — modules which are shared among all the ALAN repositories. Therefore, the good news is that once I've finished working on the StdLib repository I'll be updating the Rake modules here too, which will add more features to this repository.

The toughest part has been replacing the SED script that sanitizes/converts ARun-generated transcripts into well formed AsciiDoc files. Now the sanitation is done all in Ruby, so there's no need to run Rake in Bash in order to use SED. Also, I've improved the sanitation criteria compared to the previous SED script.

Once the Rake modules are ready, we'll be able to enjoy using dynamically generated ALAN examples in the documentation of the libraries too!

This is something which we really need right now, so we can start documenting how the new Spanish grammar module works, but at the same time ensure that all examples in the documentation always mirror the latest library behaviour — which means we'll spot immediately bugs and library breakages, but also that as bugs are being fixed so the documentation will auto-fix itself.

I've almost finished handling the Rake switch in StdLib, just need to add one more custom Rake method (sanitation of ALAN sources) and then update all the repository documentation, which I guess is going to take me the whole afternoon, since all the Bash and batch scripts have been deleted, and the new build system needs to be documented from scratch.

So, I probably won't be able to do any work on the libraries this weekend, but I should be able to update the Rake modules at some point during the week, and then set up a draft documentation for the Spanish library too, as well as start adding dynamic examples to the English documentation.

Apart from this, all is well here.

0 replies

tajmone · 2021-11-02T16:26:03Z

tajmone
Nov 2, 2021
Maintainer Author

Lib 0.3.1 — Added Support for Feminine Nouns taking "EL"

I've just updated the Spanish library with the tweaked grammar module that can now handle feminine nouns requiring the "EL" article. The solution was tested and documented, and seems to work fine, including when there's a preceding adjective and the normal "LA" article needs to be used instead.
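For readers who don't know the Spanish rule, the behaviour described above can be sketched in Python as follows. This is only an illustration of the rule (e.g. "el agua", but "la fría agua" and "las aguas"); the function and the takes_el flag are hypothetical names, not the attributes or code actually used in the library.

```python
def definite_article(gender: str, plural: bool,
                     takes_el: bool = False,
                     adjective_before: bool = False) -> str:
    """Pick the Spanish definite article for a noun.

    takes_el is a hypothetical flag marking feminine nouns such as "agua"
    that take "el" in the singular when the article directly precedes them.
    """
    if plural:
        return "los" if gender == "m" else "las"
    if gender == "m":
        return "el"
    # Feminine singular: "el" only when nothing stands between article and noun.
    if takes_el and not adjective_before:
        return "el"
    return "la"
```

The exception is limited to the singular with the article immediately before the noun, so a single per-noun flag plus knowledge of whether an adjective precedes is all the sketch needs.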

@Rich15, when you have time please have a look at the updated Spanish documentation and tests, and let me know what you think of this solution.

2 replies

Rich15 Nov 4, 2021
Maintainer

@tajmone, I've looked at the tests and everything seems to be working great!

I've been inactive due to personal issues and lack of time, so I hadn't tested the new grammar module properly. I haven't gone through the Spanish documentation in detail yet, but what I've seen so far is excellent :)

I will try to take a closer look at the documentation soon.

tajmone Nov 4, 2021
Maintainer Author

@tajmone, I've looked at the tests and everything seems to be working great!

Good to know, and thanks for looking into it.

I've been inactive due to personal issues and lack of time, so I hadn't tested the new grammar module properly.

That's fine, and normal. We all go through fluctuations of free time availability, and often maintainers are kept away from the project for weeks, even months due to priorities taking over; but this has never prevented the ALAN project from remaining alive in the past decades. As long as we annotate everything via Issues and Discussions, stepping back into the workflow is just a matter of sifting through the annotations to remember where we left things at, and what has happened since.

I haven't gone through the Spanish documentation in detail yet, but what I've seen so far is excellent :)

I'm also very happy with the results, especially the fact that now when we update the library all the example transcripts in the documentation are automatically updated accordingly (less maintenance work!). Believe me, I've worked on documents with outdated examples and it's really stressful to have to fix them if they were snippets pasted from a long-lost sample adventure, because you'd need to recreate the original context. "Setting & forgetting" them is much better.

I will try to take a closer look at the documentation soon.

If you have any thoughts on how it could be improved, you can annotate your ideas directly in the AsciiDoc sources, either as comments (in case of questions, doubts, or notes for the maintainers) or as aside contents via admonitions (in case of notes that benefit the reader).

We should always assume that some users might already be reading the documentation, and develop it accordingly. After all, the current library is already functional, as it does all that pALANte did, plus more. So it makes sense to keep the documentation tidy even if it's an early Alpha draft.

