Merge pull request #23 from neocl/dev
Dev
letuananh authored Apr 15, 2021
2 parents ec9e876 + a33056a commit 5245068
Showing 20 changed files with 820 additions and 190 deletions.
5 changes: 5 additions & 0 deletions CHANGES.md
@@ -1,3 +1,8 @@
2021-04-15
- Make `lxml` optional
- Data package can be installed via PyPI with `jamdict_data` package
- Make configuration file optional as data files can be installed via PyPI.

2020-05-31
- [Version 0.1a7]
- Added Japanese Proper Names Dictionary (JMnedict) support
2 changes: 2 additions & 0 deletions README.md
@@ -1,5 +1,7 @@
Python library for manipulating Jim Breen's JMdict & KanjiDic2

[![ReadTheDocs Badge](https://readthedocs.org/projects/jamdict/badge/?version=latest&style=plastic)](https://jamdict.readthedocs.io/)

# Main features

* Support querying different Japanese language resources
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
27 changes: 27 additions & 0 deletions docs/api.rst
@@ -0,0 +1,27 @@
jamdict APIs
============

An overview of jamdict modules.

.. module:: jamdict

.. autoclass:: jamdict.util.LookupResult
   :members:
   :member-order: groupwise

.. autoclass:: jamdict.util.Jamdict
   :members:
   :member-order: groupwise
   :exclude-members: get_ne, has_jmne, import_data, jmnedict

.. module:: jamdict.jmdict

.. autoclass:: JMDEntry
   :members:

.. module:: jamdict.kanjidic2

.. autoclass:: Character
   :members:

.. automodule:: jamdict.krad
53 changes: 53 additions & 0 deletions docs/conf.py
@@ -0,0 +1,53 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('../'))


# -- Project information -----------------------------------------------------

project = 'jamdict'
copyright = '2021, Le Tuan Anh'
author = 'Le Tuan Anh'


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.viewcode', 'sphinx.ext.doctest']
# -- Highlight code block -----------------
pygments_style = 'sphinx'

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'bizstyle'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
150 changes: 150 additions & 0 deletions docs/index.rst
@@ -0,0 +1,150 @@

Jamdict's documentation!
========================

`Jamdict <https://github.com/neocl/jamdict>`_ is a Python 3 library for manipulating Jim Breen's JMdict, KanjiDic2, JMnedict and kanji-radical mappings.

Main features
-------------

- Support querying different Japanese language resources

  - Japanese-English dictionary JMDict
  - Kanji dictionary KanjiDic2
  - Kanji-radical and radical-kanji maps KRADFILE/RADKFILE
  - Japanese Proper Names Dictionary (JMnedict)

- Data are stored in an SQLite database
- Console lookup tool
- jamdol (jamdol-flask) - a Python/Flask server that provides Jamdict
  lookup via REST API (experimental state)

:ref:`Contributors are welcome! 🙇 <contributors>`

Installation
------------

Jamdict is `available on PyPI <https://pypi.org/project/jamdict/>`_ and
can be installed using pip:

.. code:: bash

   pip install jamdict jamdict_data

Sample jamdict Python code
--------------------------

Looking up words

>>> from jamdict import Jamdict
>>> jam = Jamdict()
>>> result = jam.lookup('はな')
>>> for word in result.entries:
... print(word)
...
[id#1194500] はな (花) : 1. flower/blossom/bloom/petal ((noun (common) (futsuumeishi))) 2. cherry blossom 3. beauty 4. blooming (esp. of cherry blossoms) 5. ikebana 6. Japanese playing cards 7. (the) best
[id#1486720] はな (鼻) : nose ((noun (common) (futsuumeishi)))
[id#1581610] はし (端) : 1. end (e.g. of street)/tip/point/edge/margin ((noun (common) (futsuumeishi))) 2. beginning/start/first 3. odds and ends/scrap/odd bit/least
[id#1634180] はな (洟) : snivel/nasal mucus/snot ((noun (common) (futsuumeishi)))

Looking up kanji characters

>>> for c in result.chars:
... print(repr(c))
...
花:7:flower
華:10:splendor,flower,petal,shine,luster,ostentatious,showy,gay,gorgeous
鼻:14:nose,snout
端:14:edge,origin,end,point,border,verge,cape
洟:9:tear,nasal discharge

Looking up named entities

>>> result = jam.lookup('ディズニー%')
>>> for name in result.names:
... print(name)
...
[id#5053163] ディズニー : Disney (family or surname/company name)
[id#5741091] ディズニーランド : Disneyland (place name)

See :ref:`recipes` for more sample code.
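
The three lookups above read the ``entries``, ``chars`` and ``names`` attributes of one result object. The following is a minimal sketch, not part of jamdict itself, of a helper that counts what a lookup returned. It assumes these three attributes are list-like, as the loops above suggest; ``summarize`` and ``lookup_summary`` are hypothetical names introduced here for illustration.

```python
from types import SimpleNamespace


def summarize(result):
    """Count the items in the three result lists used in the examples above.

    `result` can be any object exposing `entries`, `chars` and `names`.
    """
    return {
        "entries": len(result.entries),
        "chars": len(result.chars),
        "names": len(result.names),
    }


def lookup_summary(query):
    """Look up `query` with jamdict (imported lazily) and summarise the result."""
    # Requires jamdict and its dictionary data to be installed.
    from jamdict import Jamdict
    return summarize(Jamdict().lookup(query))


# Stand-in object so the helper can be tried without the dictionary data.
fake = SimpleNamespace(entries=["e1", "e2"], chars=["c1"], names=[])
print(summarize(fake))  # prints {'entries': 2, 'chars': 1, 'names': 0}
```

With jamdict and ``jamdict_data`` installed, ``lookup_summary('はな')`` should return the same kind of dictionary for a real query.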

Command line tools
------------------

Jamdict can be used from the command line.

.. code:: bash

   python3 -m jamdict lookup 言語学
   ========================================
   Found entries
   ========================================
   Entry: 1264430 | Kj: 言語学 | Kn: げんごがく
   --------------------
   1. linguistics ((noun (common) (futsuumeishi)))
   ========================================
   Found characters
   ========================================
   Char: 言 | Strokes: 7
   --------------------
   Readings: yan2, eon, 언, Ngôn, Ngân, ゲン, ゴン, い.う, こと
   Meanings: say, word
   Char: 語 | Strokes: 14
   --------------------
   Readings: yu3, yu4, eo, 어, Ngữ, Ngứ, ゴ, かた.る, かた.らう
   Meanings: word, speech, language
   Char: 学 | Strokes: 8
   --------------------
   Readings: xue2, hag, 학, Học, ガク, まな.ぶ
   Meanings: study, learning, science
   No name was found.

To show help, run:

.. code:: bash

   python3 -m jamdict --help

Documentation
-------------

.. toctree::
:maxdepth: 2

install
tutorials
recipes
api

Other info
==========

.. _contributors:

Contributors
------------

- `Matteo Fumagalli <https://github.com/matteofumagalli1275>`__
- `Reem Alghamdi <https://github.com/reem-codes>`__

Useful links
------------

- jamdict on PyPI: https://pypi.org/project/jamdict/
- jamdict source code: https://github.com/neocl/jamdict/
- Documentation: https://jamdict.readthedocs.io/
- Dictionaries
- JMdict: http://edrdg.org/jmdict/edict_doc.html
- kanjidic2: https://www.edrdg.org/wiki/index.php/KANJIDIC_Project
- JMnedict: https://www.edrdg.org/enamdict/enamdict_doc.html
- KRADFILE: http://www.edrdg.org/krad/kradinf.html

Indices and tables
------------------

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
70 changes: 70 additions & 0 deletions docs/install.rst
@@ -0,0 +1,70 @@
Installation
=============

jamdict and the jamdict dictionary data are both available on PyPI and can be installed using ``pip``.

.. code-block:: bash

   pip install --user jamdict jamdict_data
   # the pip script sometimes doesn't work properly,
   # so you may want to try this instead
   python3 -m pip install jamdict jamdict_data

Download database file manually
-------------------------------

This should no longer be necessary as of version 0.1a8, following the release of the `jamdict_data <https://pypi.org/project/jamdict_data/>`_ package on PyPI.
If for some reason you want to download and install the jamdict database yourself, here are the steps:

1. Download the official, pre-compiled jamdict database
   (``jamdict-0.1a7.tar.xz``) from Google Drive
   https://drive.google.com/drive/u/1/folders/1z4zF9ImZlNeTZZplflvvnpZfJp3WVLPk
2. Extract and copy ``jamdict.db`` to the jamdict data folder (defaults to
   ``~/.jamdict/data/jamdict.db``)
3. To find out where to copy data files, run ``python3 -m jamdict info`` in a terminal:

.. code:: bash

   python3 -m jamdict info
   # Jamdict 0.1a8
   # Python library for manipulating Jim Breen's JMdict, KanjiDic2, KRADFILE and JMnedict
   #
   # Basic configuration
   # ------------------------------------------------------------
   # JAMDICT_HOME : ~/local/jamdict
   # jamdict_data availability: False
   # Config file location : /home/tuananh/.jamdict/config.json
   #
   # Custom Data files
   # ------------------------------------------------------------
   # Jamdict DB location: ~/local/jamdict/data/jamdict.db - [OK]
   # JMDict XML file : ~/local/jamdict/data/JMdict_e.gz - [OK]
   # KanjiDic2 XML file : ~/local/jamdict/data/kanjidic2.xml.gz - [OK]
   # JMnedict XML file : ~/local/jamdict/data/JMnedict.xml.gz - [OK]
   #
   # Others
   # ------------------------------------------------------------
   # lxml availability: False

Build database file from source
-------------------------------

Users who just want to look up the dictionaries do not have to do this.
If you are a developer and want to build the jamdict database from source,
copy the dictionary source files to the jamdict data folder.
The original XML files can be downloaded either from the official website
https://www.edrdg.org/ or from `this jamdict Google Drive folder <https://drive.google.com/drive/folders/1ZMM6Xb46XcwwQGWBZnY3gj637exWPWuU>`_.

To find out where to copy the files or whether they are recognised by jamdict,
you may use the command ``python3 -m jamdict info`` as in the section above.
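
As a rough illustration of what such a check involves, the sketch below tests whether optional packages are importable and whether a database file exists at a given path. It is not jamdict's own code: ``is_installed`` and ``db_status`` are hypothetical helpers, the ``[NOT FOUND]`` label is invented here, and the path is only the default quoted earlier on this page (a custom ``JAMDICT_HOME`` changes it). The ``info`` command remains the authoritative source.

```python
import importlib.util
from pathlib import Path

# Assumption: the default location quoted earlier on this page.
DEFAULT_DB = Path("~/.jamdict/data/jamdict.db")


def is_installed(module_name):
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


def db_status(db_path=DEFAULT_DB):
    """Return '[OK]' if the file exists, otherwise an illustrative '[NOT FOUND]'."""
    return "[OK]" if Path(db_path).expanduser().is_file() else "[NOT FOUND]"


print("jamdict_data availability:", is_installed("jamdict_data"))
print("lxml availability:", is_installed("lxml"))
print("Jamdict DB location:", DEFAULT_DB, "-", db_status())
```

The printed values depend on the local environment, so no expected output is shown.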

Make sure that all files under the section ``Custom Data files`` are marked [OK].
After that you should be able to build the database with the command:

.. code:: bash

   python3 -m jamdict import

Note on XML parser: jamdict will use ``lxml`` instead of Python 3's default ``xml`` module when it is available.
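
The general pattern behind an optional parser dependency can be sketched as follows. This shows the common try/except import idiom and is not necessarily how jamdict implements it; ``PARSER_NAME``, ``count_entries`` and the tiny XML sample are assumptions made for this example.

```python
try:
    from lxml import etree  # optional third-party parser, used when installed
    PARSER_NAME = "lxml"
except ImportError:
    import xml.etree.ElementTree as etree  # standard library fallback
    PARSER_NAME = "xml"


def count_entries(xml_text):
    """Parse an XML string and count its direct <entry> children."""
    root = etree.fromstring(xml_text)
    return len(root.findall("entry"))


sample = "<JMdict><entry/><entry/></JMdict>"
print(PARSER_NAME, count_entries(sample))
```

Both modules expose ``fromstring`` and ``findall`` with compatible behaviour for this simple case, so the rest of the code does not need to know which parser was loaded.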


35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd