Merge pull request #38 from letuananh/main
jamdict version 0.1a11 ready
letuananh authored May 25, 2021
2 parents 886b2c2 + 1b1b90c commit 21242da
Showing 22 changed files with 963 additions and 324 deletions.
2 changes: 0 additions & 2 deletions README.md
@@ -16,8 +16,6 @@
* Fast look up (dictionaries are stored in SQLite databases)
* Command-line lookup tool [(Example)](#command-line-tools)

Homepage: [https://github.com/neocl/jamdict](https://github.com/neocl/jamdict)

[Contributors](#contributors) are welcome! 🙇. If you want to help, please see [Contributing](https://jamdict.readthedocs.io/en/latest/contributing.html) page.

# Try Jamdict out
13 changes: 11 additions & 2 deletions docs/api.rst
@@ -1,18 +1,27 @@
.. _api_index:

jamdict APIs
============

An overview of jamdict modules.

.. warning::
   👉 ⚠️ THIS SECTION IS STILL UNDER CONSTRUCTION ⚠️ Help is much needed.

.. module:: jamdict

.. autoclass:: jamdict.util.Jamdict
   :members:
   :member-order: groupwise
   :exclude-members: get_ne, has_jmne, import_data, jmnedict

.. autoclass:: jamdict.util.LookupResult
   :members:
   :member-order: groupwise

.. autoclass:: jamdict.util.Jamdict
.. autoclass:: jamdict.util.IterLookupResult
   :members:
   :member-order: groupwise
   :exclude-members: get_ne, has_jmne, import_data, jmnedict

.. module:: jamdict.jmdict

17 changes: 16 additions & 1 deletion docs/index.rst
@@ -4,6 +4,16 @@ Jamdict's documentation!

`Jamdict <https://github.com/neocl/jamdict>`_ is a Python 3 library for manipulating Jim Breen's JMdict, KanjiDic2, JMnedict and kanji-radical mappings.

Welcome
-------

Are you new to this documentation? Here are some useful pages:

- Want to try out Jamdict package? Try `Jamdict online demo <https://replit.com/@tuananhle/jamdict-demo>`_
- Want some useful code samples? See :ref:`recipes`.
- Want to look deeper into the package? See :ref:`api_index`.
- If you want to help developing Jamdict, please visit :ref:`contributing` page.

Main features
-------------

@@ -27,14 +37,18 @@ If you want to help developing Jamdict, please visit :ref:`contributing` page.
Installation
------------

Jamdict is `available on PyPI <https://pypi.org/project/jamdict/>`_ and
Jamdict and `jamdict-data <https://pypi.org/project/jamdict-data/>`_ are both `available on PyPI <https://pypi.org/project/jamdict/>`_ and
can be installed using pip.
For more information please see :ref:`installpage` page.

.. code:: bash

   pip install jamdict jamdict-data

Also, there is an online demo Jamdict virtual machine to try out on Repl.it

https://replit.com/@tuananhle/jamdict-demo

Sample jamdict Python code
--------------------------

@@ -125,6 +139,7 @@ Documentation
recipes
api
contributing
updates

Other info
==========
50 changes: 45 additions & 5 deletions docs/recipes.rst
@@ -3,10 +3,9 @@
Common Recipes
==============

- Search words using wildcards.
- Searching for kanji characters.
- Decomposing kanji characters into components, or search kanji characters by components.
- Search for named entities.
.. contents::
   :local:
   :depth: 2

.. warning::
   👉 ⚠️ THIS SECTION IS STILL UNDER CONSTRUCTION ⚠️
@@ -20,14 +19,55 @@ High-performance tuning
-----------------------

When you need to do a lot of queries on the database, it is possible to load the whole database
into memory to boost up querying performance (This will takes about 400 MB of RAM) by using the ``memory_mode``
into memory to boost querying performance (this takes about 400 MB of RAM) by using the :class:`memory_mode <jamdict.util.Jamdict>`
keyword argument, like this:

>>> from jamdict import Jamdict
>>> jam = Jamdict(memory_mode=True)

The first query will be extremely slow (it may take about a minute for the whole database to be loaded into memory)
but subsequent queries will be much faster.
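
The trade-off described above (a slow first load, then fast queries) can be illustrated with the standard library alone. The sketch below is not jamdict's implementation; it only shows the general technique of copying an on-disk SQLite database into an in-memory connection with ``sqlite3.Connection.backup``. The table name, columns, and helper names are hypothetical stand-ins.

```python
import os
import sqlite3
import tempfile


def load_into_memory(db_path):
    """Copy an on-disk SQLite database into a new in-memory connection."""
    disk = sqlite3.connect(db_path)
    mem = sqlite3.connect(":memory:")
    try:
        disk.backup(mem)  # one-off copy of the source database; this is the slow step
    finally:
        disk.close()
    return mem


def make_demo_db(path):
    """Create a tiny stand-in database (hypothetical schema, not jamdict's)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE entry (kana TEXT, gloss TEXT)")
    conn.executemany(
        "INSERT INTO entry VALUES (?, ?)",
        [("はなみ", "cherry blossom viewing"), ("かえる", "frog")],
    )
    conn.commit()
    conn.close()


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    make_demo_db(demo_path)
    mem_conn = load_into_memory(demo_path)

# The temporary file is gone, yet queries still work because the data lives in RAM.
rows = mem_conn.execute(
    "SELECT gloss FROM entry WHERE kana = ?", ("かえる",)
).fetchall()
print(rows)
```

With a real Jamdict instance none of this is needed: passing ``memory_mode=True`` to the constructor is the documented way to get the same effect.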

Iteration search
----------------

Sometimes you only need to pass through a set of search results once, keep the items of interest,
and discard the rest. In these cases, use :func:`lookup_iter <jamdict.util.Jamdict.lookup_iter>`.
This function returns an :class:`IterLookupResult <jamdict.util.IterLookupResult>` object immediately after being called.
You can loop through ``result.entries``, ``result.chars``, and ``result.names`` exactly once each
to find the items you want. You have to store the desired word entries, characters, and names
yourself, since they are discarded after being yielded.

>>> res = jam.lookup_iter("花見")
>>> for word in res.entries:
...     print(word)  # do something with the word
>>> for c in res.chars:
...     print(c)
>>> for name in res.names:
...     print(name)
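
Because each result set can be walked only once, anything worth keeping has to be copied out during that single pass. The sketch below shows one way to do so. It assumes only that ``entries``, ``chars``, and ``names`` behave like one-shot iterators; the ``res`` object here is a stand-in built from generators, not a real :class:`IterLookupResult`, and ``collect`` is a hypothetical helper.

```python
from types import SimpleNamespace


def collect(items, keep):
    """Walk a one-shot iterable once and store the items worth keeping."""
    return [item for item in items if keep(item)]


# Stand-in for the object returned by jam.lookup_iter(); plain generators
# mimic the "discarded after being yielded" behaviour described above.
res = SimpleNamespace(
    entries=(w for w in ["花見", "花見酒", "花"]),
    chars=(c for c in "花見"),
    names=(n for n in ["花見"]),
)

kept_words = collect(res.entries, lambda w: len(w) >= 2)
second_pass = list(res.entries)  # the generator is already exhausted

print(kept_words)
print(second_pass)
```

The second pass yields nothing, which is why the kept items are stored in a list during the first one.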

Parts of speech and named-entity types
--------------------------------------

Use :func:`Jamdict.all_pos <jamdict.util.Jamdict.all_pos>` to list all available parts of speech
and :func:`Jamdict.all_ne_type <jamdict.util.Jamdict.all_ne_type>` to list all named-entity types:

>>> for pos in jam.all_pos():
...     print(pos)  # pos is a string
>>> for ne_type in jam.all_ne_type():
...     print(ne_type)  # ne_type is a string

To filter words by part of speech, use the keyword argument ``pos``
in the :func:`lookup() <jamdict.util.Jamdict.lookup>` or :func:`lookup_iter() <jamdict.util.Jamdict.lookup_iter>`
functions.

For example, to look up all "かえる" entries that are nouns, use:

>>> result = jam.lookup("かえる", pos=["noun (common) (futsuumeishi)"])

To search for all named-entities that are "surname" use:

>>> result = jam.lookup("surname")

Kanjis and radical/components (KRAD/RADK mappings)
--------------------------------------------------
82 changes: 49 additions & 33 deletions docs/updates.rst
@@ -1,49 +1,65 @@
.. _updates:

Updates
=======
Jamdict Changelog
=================

2021-04-19
----------
jamdict 0.1a11
--------------

- [Version 0.1a9]
- Fix data audit query
- Enhanced Jamdict() constructor. ``Jamdict('/path/to/jamdict.db')``
works properly.
- Code quality review
- Automated documentation build via
`readthedocs.org <https://jamdict.readthedocs.io/en/latest/>`__
- 2021-05-25

.. _section-1:
- Added ``lookup_iter()`` for iteration search
- Added ``pos`` filter for filtering words by part of speech
- Added ``all_pos()`` and ``all_ne_type()`` to Jamdict to list parts of speech and named-entity types
- Better version checking in ``__version__.py``
- Improved documentation

2021-04-15
----------
jamdict 0.1a10
--------------

- Make ``lxml`` optional
- Data package can be installed via PyPI with ``jamdict_data`` package
- Make configuration file optional as data files can be installed via
PyPI.
- 2021-05-19

.. _section-2:
- Added ``memory_mode`` keyword to load database into memory before querying to boost up performance
- Improved import performance by using puchikarui's ``buckmode``
- Tested with both puchikarui 0.1.* and 0.2.*

2020-05-31
----------
jamdict 0.1a9
-------------

- [Version 0.1a7]
- Added Japanese Proper Names Dictionary (JMnedict) support
- Included built-in KRADFILE/RADKFile support
- Improved command line tools (json, compact mode, etc.)
- 2021-04-19

.. _section-3:
- Fix data audit query
- Enhanced ``Jamdict()`` constructor. ``Jamdict('/path/to/jamdict.db')``
works properly.
- Code quality review
- Automated documentation build via
`readthedocs.org <https://jamdict.readthedocs.io/en/latest/>`__

2017-08-18
----------
jamdict 0.1a8
-------------

- Support KanjiDic2 (XML/SQLite formats)
- 2021-04-15

.. _section-4:
- Make ``lxml`` optional
- Data package can be installed via PyPI with ``jamdict_data`` package
- Make configuration file optional as data files can be installed via PyPI.

2016-11-09
----------
jamdict 0.1a7
-------------

- Release first version to Github
- 2020-05-31

- Added Japanese Proper Names Dictionary (JMnedict) support
- Included built-in KRADFILE/RADKFile support
- Improved command line tools (json, compact mode, etc.)

Older versions
--------------

- 2017-08-18

- Support KanjiDic2 (XML/SQLite formats)

- 2016-11-09

- Release first version to Github
2 changes: 1 addition & 1 deletion jamdict/__init__.py
@@ -45,7 +45,7 @@
from . import __version__ as version_info
from .__version__ import __author__, __email__, __copyright__, __maintainer__
from .__version__ import __credits__, __license__, __description__, __url__
from .__version__ import __version_major__, __version_long__, __version__, __status__
from .__version__ import __version__, __version_long__, __status__

from .jmdict_sqlite import JMDictSQLite
from .kanjidic2_sqlite import KanjiDic2SQLite
28 changes: 24 additions & 4 deletions jamdict/__version__.py
@@ -6,10 +6,30 @@
__copyright__ = "Copyright (c) 2016, Le Tuan Anh"
__credits__ = []
__license__ = "MIT License"
__description__ = "Python library for manipulating Jim Breen's JMdict, KanjiDic2, KRADFILE and JMnedict"
__description__ = "Python library for using Japanese dictionaries and resources (Jim Breen's JMdict, KanjiDic2, KRADFILE, JMnedict)"
__url__ = "https://github.com/neocl/jamdict"
__maintainer__ = "Le Tuan Anh"
__version_major__ = "0.1"
__version__ = "{}a10".format(__version_major__)
__version_long__ = "{} - Alpha 10".format(__version_major__)
# ------------------------------------------------------------------------------
# Version configuration (enforcing PEP 440)
# ------------------------------------------------------------------------------
__status__ = "3 - Alpha"
__version_tuple__ = (0, 1, 0, 11)
__version_status__ = '' # a specific value ('rc', 'dev', etc.) or leave blank to be auto-filled
# ------------------------------------------------------------------------------
__status_map__ = {'3 - Alpha': 'a', '4 - Beta': 'b', '5 - Production/Stable': '', '6 - Mature': ''}
if not __version_status__:
    __version_status__ = __status_map__[__status__]
if len(__version_tuple__) == 3:
    __version_build__ = ''
elif len(__version_tuple__) == 4:
    __version_build__ = f"{__version_tuple__[3]}"
elif len(__version_tuple__) == 5:
    __version_build__ = f"{__version_tuple__[3]}.post{__version_tuple__[4]}"
else:
    raise ValueError("Invalid version information")
if __version_tuple__[2] == 0:
    __version_main__ = f"{'.'.join(str(n) for n in __version_tuple__[:2])}"
else:
    __version_main__ = f"{'.'.join(str(n) for n in __version_tuple__[:3])}"
__version__ = f"{__version_main__}{__version_status__}{__version_build__}"
__version_long__ = f"{__version_main__} - {__status__.split('-')[1].strip()} {__version_build__}"
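
The version logic above can be easier to follow when wrapped in a function and run on a few inputs. The sketch below restates the same steps; ``build_version`` and ``STATUS_MAP`` are hypothetical names introduced for illustration and do not exist in the package.

```python
# Same mapping as __status_map__ in the module above.
STATUS_MAP = {
    "3 - Alpha": "a",
    "4 - Beta": "b",
    "5 - Production/Stable": "",
    "6 - Mature": "",
}


def build_version(version_tuple, status, version_status=""):
    """Return (short, long) version strings following the module's rules."""
    if not version_status:
        version_status = STATUS_MAP[status]
    if len(version_tuple) == 3:
        build = ""
    elif len(version_tuple) == 4:
        build = f"{version_tuple[3]}"
    elif len(version_tuple) == 5:
        build = f"{version_tuple[3]}.post{version_tuple[4]}"
    else:
        raise ValueError("Invalid version information")
    # A zero patch number is dropped, so (0, 1, 0, ...) becomes "0.1".
    keep = 2 if version_tuple[2] == 0 else 3
    main = ".".join(str(n) for n in version_tuple[:keep])
    short = f"{main}{version_status}{build}"
    long_form = f"{main} - {status.split('-')[1].strip()} {build}"
    return short, long_form


short, long_form = build_version((0, 1, 0, 11), "3 - Alpha")
print(short)      # 0.1a11
print(long_form)  # 0.1 - Alpha 11
```

A five-element tuple adds a post-release segment, so ``(0, 1, 0, 11, 2)`` with the alpha status gives ``0.1a11.post2``.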