
Project Structure

Adrian Zhao edited this page May 8, 2021 · 1 revision

The root folder contains two folders: app and pyscript.

The app folder contains the web application (frontend), built with ReactJS.

The pyscript folder contains the Python scripts (backend) that implement the main logic of phonogenesis.


pyscript/app contains the backend API (routes.py) and the database models (models.py), built with Flask and SQLAlchemy. Note that the database models are mostly unrelated to the core phonogenesis functionality. They were designed to support extended web features that require login (currently deprecated).


pyscript/script contains the main logic scripts and the data files.

pyscript/script/data contains the data files:

  • defaultgloss.txt contains the program's default gloss data. The data is split into sections, and each section starts with ### <category>. Gloss words on the same line are separated by | and have similar meanings and types.
  • defaultipa.csv contains the default IPA sound dataset. Each column corresponds to a feature, named by the column header. Each row starts with the sound symbol in the first column, followed by the feature values that uniquely identify that sound. No two different sounds should have exactly the same features.
  • defaultpresetphoneme.txt contains a set of sounds that are used by the program by default as the full phoneme set.
  • defaultphonemerandomization.txt contains a set of rules for randomly generating a phoneme set (see the file for details).
  • defaultrules.csv contains a set of rules written in a special syntax. Each rule is identified by its name and grouped by its rule family.
  • defaulttemplate.txt contains a set of templates used by the program by default.
  • paradigmtransdata.txt contains all possible column header combinations for each gloss word category in a paradigm question.
  • paradigmtranstemplates.txt contains templates that are used to generate morphemes in a paradigm question.
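
As a rough illustration of the defaultgloss.txt layout described above (sections opened by ### <category>, gloss words separated by |), the format could be parsed along these lines. This is only a sketch: the function name parse_gloss and the sample words are hypothetical, and it is not the code used in glossgroup.py or data_factory.py.

```python
from collections import defaultdict


def parse_gloss(text):
    """Parse gloss text into {category: [[word, ...], ...]}.

    Hypothetical helper: assumes sections start with '### <category>'
    and that words on one line are separated by '|'.
    """
    sections = defaultdict(list)
    category = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("###"):
            category = line[3:].strip()
            sections[category]  # register the category even if it stays empty
            continue
        if category is None:
            continue  # ignore anything before the first section header
        words = [w.strip() for w in line.split("|") if w.strip()]
        if words:
            sections[category].append(words)
    return dict(sections)


# Made-up sample data, not taken from the real defaultgloss.txt.
sample = """### noun
dog|hound
cat

### verb
run|sprint|dash
"""

groups = parse_gloss(sample)
print(groups)
```

Each inner list holds the gloss words that shared a line, so words with similar meanings and types stay grouped together.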

pyscript/script contains the following scripts:

  • data_factory.py provides utility functions for getting formatted data from the data files.
  • doubleRule.py contains logic for generating a question that involves two different rules applied one after the other.
  • feature_lib.py contains Particle class (data structure for representing a set of features) and logic for fetching raw data from defaultipa.csv.
  • generator.py contains Generator class that generates a simple question (UR, SR and single rule).
  • glossgroup.py contains the GlossGroup and GlossFamily classes and logic for fetching gloss data from defaultgloss.txt.
  • main.py is a test script, used only for testing purposes.
  • morphology.py contains classes (Paradigm and ParadigmGenerator) and logic for generating a morphology question.
  • phonemes.py fetches phoneme data and applies phoneme randomization from data files (defaultpresetphoneme.txt, defaultphonemerandomization.txt).
  • rules.py contains class Rule (data structure for representing a rule) and logic for fetching rules from defaultrules.csv.
  • sound.py contains class Sound (data structure for representing an IPA sound)
  • templates.py contains class Template (data structure for representing a template) and logic for fetching templates from defaulttemplate.txt.
  • word.py contains class Word (data structure for representing a word as a list of sounds).
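
Since feature_lib.py reads its raw data from defaultipa.csv, and that file requires that no two sounds share exactly the same features, a standalone sanity check could look something like the sketch below. The function name find_duplicate_feature_rows, the feature columns, and the sample rows are all assumptions made for illustration; the real file's columns and values may differ.

```python
import csv
import io


def find_duplicate_feature_rows(csv_text):
    """Return (first_symbol, later_symbol) pairs whose feature values match.

    Hypothetical helper: assumes the first row is a header and that the
    first column of every other row is the sound symbol.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row of feature names
    seen = {}  # maps a tuple of feature values to the first symbol using it
    duplicates = []
    for row in reader:
        if not row:
            continue  # skip empty lines
        symbol, features = row[0], tuple(row[1:])
        if features in seen:
            duplicates.append((seen[features], symbol))
        else:
            seen[features] = symbol
    return duplicates


# Made-up sample; 'B' deliberately repeats the features of 'b'.
sample_csv = "sound,voiced,nasal\np,-,-\nb,+,-\nm,+,+\nB,+,-\n"

dups = find_duplicate_feature_rows(sample_csv)
print(dups)
```

An empty result means every sound in the dataset is uniquely identified by its features.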