Compoglot Language Tutorial
In language, there are often words that are spelt the same and yet have subtly different meanings. For instance, in the phrase `I drank water from my glass`, "glass" refers to the rounded cup used for drinking. On the other hand, in the phrase `My windows are made of glass`, the word "glass" refers to the transparent substance that is commonly used to make windows. These subtly different but look-alike words are one source of error in tools like Google Translate. By specifying which meaning of the word `glass` you chose, Compoglot can generate far better translations than it could by attempting to guess the meaning.
The smallest possible segment of compoglot code is `{}`. This code creates a new sentence, and all the code for that sentence should be contained within the braces. You might expect this to simply create an empty string; however, for most languages this is not the case. Instead, since sentences in many languages end with a period (full stop), one is output to signify that the sentence you have created has finished. To test this in a terminal, use the following:
> compoglot cmd en "{}"
.
Many simple sentences are split up into three separate parts - subject, verb, object. In English, these three parts come in that order. The subject is the person, place or thing that is completing an action, the verb is the action that is being completed, and the object is the person, place or thing that the action is done to. There may, however, be multiple subjects and objects, or even no objects at all. For example, in English:
The dog | bites | the man
He | ate | pasta
My friend and I | walked |
Subject | verb | object
In compoglot, these three separate parts are specified with three different constructs:
- The subjects of a sentence are specified with `Subject[<subjects>]`.
- The verbs of a sentence are specified with `Verb1(<verb_id>)`.
- The (main) objects of a sentence are specified with `Object{1[<objects>]}`.
Each individual subject and object is created in exactly the same way. We group the things that describe the subject or object within braces `{}`, which gives the construct its name, NounGroup. In most cases, you'll want to specify a noun, which you can do with `Noun(<noun_id>,<determiner>,<plural>)`.
- `<plural>` is simply a boolean value, which specifies if there is more than 1 of the noun.
- `<determiner>` is the ID of what determiner should be prefixed to the noun, common values are: (complete list here)
  - `0` for no determiner
  - `-1` for the definite article `the`
  - `-2` for the indefinite article `a` or `an`
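As a rough illustration, the `Noun(...)` construct described above can be assembled programmatically. The Python helper below is a hypothetical sketch and not part of compoglot: the names `noun` and `DETERMINERS` are invented here, and writing a true `<plural>` as `T` is an assumption, since this tutorial only ever shows `F`.

```python
# Hypothetical helper, not part of compoglot: it only builds the text
# of a NounGroup that contains a single Noun(...) construct.

# Determiner IDs taken from the list in this tutorial.
DETERMINERS = {"none": 0, "the": -1, "a": -2}


def noun(noun_id: int, determiner: str = "none", plural: bool = False) -> str:
    """Return a NounGroup string such as '{ Noun(397,-1,F) }'."""
    # Assumption: a true <plural> is written as T; the tutorial only shows F.
    flag = "T" if plural else "F"
    return "{ Noun(%d,%d,%s) }" % (noun_id, DETERMINERS[determiner], flag)


if __name__ == "__main__":
    # 397 is the ID the noun list gives for "dog" later in this tutorial.
    print(noun(397, "the"))
```

Keeping the determiner IDs in one lookup table means the rest of a script can refer to `"the"` or `"a"` instead of remembering `-1` and `-2`.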
Both the `<verb_id>` and `<noun_id>` values can be looked up in the verb and noun lists respectively. When you do so, however, make sure that not only the word but also its meaning matches what you wish to say.
Now let's try to create the sentence `The dog bites the man` with compoglot. The first thing that we'll need to do is look up the ID numbers of the various words we wish to use.
Looking at the noun list, we can see that the noun `dog` has an ID of `397`, and the noun `man` has an ID of `724`. Neither noun in the sentence is plural, and both use the determiner `the`. Hence they can be represented with `{ Noun(397,-1,F) }` and `{ Noun(724,-1,F) }` respectively.
Next we can look up the verb `bite` in the [verb](https://github.com/CoolAs/Compoglot/wiki/List-of-verbs) list, finding that it has an ID of `35`.
Putting this all together we create the compoglot code:
{ Subject[ { Noun(397,-1,F) } ] Verb1(35) Object{1[ { Noun(724,-1,F) } ]} }
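The same assembly can be sketched in Python for readers who want to generate compoglot code from a script. The function `simple_sentence` below is hypothetical and only concatenates strings. It deliberately handles exactly one subject and one object, because this tutorial does not show how several are written inside the brackets.

```python
# Hypothetical helper, not part of compoglot: it only assembles the text
# of a subject-verb-object sentence from parts that are already written.

def simple_sentence(subject_group: str, verb_id: int, object_group: str) -> str:
    """Wrap one subject NounGroup, one verb ID and one object NounGroup."""
    subject_part = "Subject[ %s ]" % subject_group
    verb_part = "Verb1(%d)" % verb_id
    object_part = "Object{1[ %s ]}" % object_group
    # The outer braces create the sentence itself.
    return "{ %s %s %s }" % (subject_part, verb_part, object_part)


if __name__ == "__main__":
    dog = "{ Noun(397,-1,F) }"  # "the dog", IDs from this tutorial
    man = "{ Noun(724,-1,F) }"  # "the man"
    print(simple_sentence(dog, 35, man))
```

With the IDs looked up above, the printed string is the same compoglot code as the hand-written line.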
If we run this through compoglot, we can create this sentence in a variety of different languages:
> compoglot cmd en "{ Subject[ {Noun(397,-1,F)} ] Verb1(35) Object{1[ { Noun(724,-1,F) } ]} }"
The dog bites the man.
> compoglot cmd de "{ Subject[ {Noun(397,-1,F)} ] Verb1(35) Object{1[ { Noun(724,-1,F) } ]} }"
Der Hund beißt der Mann.
> compoglot cmd eo "{ Subject[ {Noun(397,-1,F)} ] Verb1(35) Object{1[ { Noun(724,-1,F) } ]} }"
La hundo mordas la viron.
Voilà, perfect machine localisation.
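Running the same code for several languages can be scripted. The sketch below wraps the `compoglot cmd <language> "<code>"` invocation used above. It assumes that the `compoglot` executable is on your `PATH` and that it writes the translation to standard output; the function names `build_command`, `translate` and `translate_all` are invented for this example.

```python
import subprocess
from typing import Dict, List


def build_command(lang: str, code: str) -> List[str]:
    """Argument list for the `compoglot cmd <lang> "<code>"` call shown above."""
    return ["compoglot", "cmd", lang, code]


def translate(lang: str, code: str) -> str:
    """Run compoglot once and return what it printed.

    Assumption: the executable is on PATH and prints the result to stdout.
    """
    result = subprocess.run(
        build_command(lang, code),
        capture_output=True,
        text=True,
        check=True,  # raise if compoglot exits with an error
    )
    return result.stdout.strip()


def translate_all(langs: List[str], code: str) -> Dict[str, str]:
    """Translate the same code into every language code in `langs`."""
    return {lang: translate(lang, code) for lang in langs}
```

Calling `translate_all(["en", "de", "eo"], code)` with the sentence above should then collect the three outputs in one dictionary. Passing the code as a separate list element avoids any shell quoting issues with the braces and brackets.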
It is worth noting that whitespace is (mostly) ignored within compoglot code, and that the order within a group of braces is irrelevant. Hence the following snippets are equivalent:
{ Verb1(35) Subject[ { Noun(397,-1,F) } ] Object{1[ { Noun(724,-1,F) } ]} }
--------------------------------------------------------------------------------
{Object{1[{Noun(724,-1,F)}]}Verb1(35)Subject[{Noun(397,-1,F)}]}
--------------------------------------------------------------------------------
{
    Subject[
        {Noun(397,-1,F)}
    ]
    Verb1(35)
    Object{
        1[
            {Noun(724,-1,F)}
        ]
    }
}
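To make the equivalence concrete, here is a small Python sketch that compares snippets like these. It is not how compoglot parses code: it assumes whitespace is never significant (the text above says only "mostly"), and it only accounts for reordering of the top-level constructs inside the outer braces. The names `top_level_parts` and `equivalent` are invented for this example.

```python
from typing import List


def top_level_parts(code: str) -> List[str]:
    """Split the contents of the outer braces into top-level constructs."""
    compact = "".join(code.split())  # assumption: whitespace never matters
    if not (compact.startswith("{") and compact.endswith("}")):
        raise ValueError("expected a sentence wrapped in braces")
    inner = compact[1:-1]
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(inner):
        if ch in "{[(":
            depth += 1
        elif ch in "}])":
            depth -= 1
            if depth == 0:
                # A construct such as Verb1(35) has just been closed.
                parts.append(inner[start:i + 1])
                start = i + 1
    return parts


def equivalent(a: str, b: str) -> bool:
    """True if both snippets hold the same top-level constructs in any order."""
    return sorted(top_level_parts(a)) == sorted(top_level_parts(b))


if __name__ == "__main__":
    first = "{ Verb1(35) Subject[ { Noun(397,-1,F) } ] Object{1[ { Noun(724,-1,F) } ]} }"
    second = "{Object{1[{Noun(724,-1,F)}]}Verb1(35)Subject[{Noun(397,-1,F)}]}"
    print(equivalent(first, second))
```

Sorting the parts before comparing is what makes the check independent of order; removing whitespace first is what makes it independent of layout.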