This repository has been archived by the owner on Jul 16, 2022. It is now read-only.

Change Log

All notable changes to this project will be documented in this file.

Fixed

  • HtmlDslVisitor: detection of lang tag with name attribute (#82)

Changed

Added

  • Support dictionaries encoded with UTF-16BE (Big Endian) (#73)

Fixed

Changed

  • Raise UnsupportedEncodingException for UTF-16BE files (#69, #73)
  • Check mandatory properties after reading the header (#70)

Changed

  • Behavior change
    • Now returns head words with the subentry key when one exists
  • Index cache file format
    • Cache file version raised to v2
    • Extended to hold each head word block's offset and size
  • Bump versions
  • Rewrite dictionary entries loader
    • Allow UTF-16LE without BOM
    • Allow LF line terminators with UTF-16LE
    • Support records without empty line separator
    • Handle comment-only head word lines
    • Accept UTF-8 with BOM
    • Improve metadata loading
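
The encoding cases listed above (UTF-8 with BOM, UTF-16LE without BOM) can be illustrated with a small byte-sniffing sketch. This is not DSL4j's actual loader code: the class name `CharsetSniffer` and its `sniff` method are hypothetical, and the BOM-less branch assumes the file starts with an ASCII `#` header character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of DSL4j: guesses a charset from a file's first bytes. */
public final class CharsetSniffer {

    private CharsetSniffer() {
    }

    /**
     * Returns a best-guess charset for the given leading bytes,
     * or {@code fallback} when no rule matches.
     */
    public static Charset sniff(byte[] head, Charset fallback) {
        // UTF-8 BOM: EF BB BF
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            // UTF-16 BOMs: FF FE is little endian, FE FF is big endian
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE;
            }
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE;
            }
            // No BOM (assumption: the file begins with an ASCII '#').
            // A zero byte right after '#' suggests UTF-16LE, right before it UTF-16BE.
            if (b0 == '#' && b1 == 0x00) {
                return StandardCharsets.UTF_16LE;
            }
            if (b0 == 0x00 && b1 == '#') {
                return StandardCharsets.UTF_16BE;
            }
        }
        return fallback;
    }
}
```

A caller would read the first few bytes of the file, pass them to `sniff` together with a fallback such as an ANSI codepage, and open a reader with the returned charset.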

Added

  • Update README
    • support matrix
    • Unsupported syntax
  • Update test cases
    • test data variations
    • test case with proprietary data

Removed

  • StreamSearcher class
  • Fix index parsing that occasionally produced a broken position (#35)
  • Handle []...[/] properly.
  • The default HtmlDslVisitor converts []...[/] to an HTML comment.
  • Support index file
  • Allow load/save index that is compressed with GZIP
  • Introduce new API to accept index file path.
  • Fix a bug that raised an exception when a file ends with a double EOL terminator.
  • Improve charset detection
    • Accept UTF-16LE without BOM
    • Accept UTF-8 without BOM
    • Accept UTF-16LE with LF-only line terminators.
  • DSL4j now reads only head words and article positions when loading, which substantially improves performance and reduces memory consumption.
  • Test with dictionary format variations: encodings such as UTF-16LE and Windows-1251, and end-of-line terminators CR+LF or LF-only.
  • Bump dependency
  • Accept UTF-16, UTF-8 and ANSI files
    • can accept Cp1250, Cp1251 and Cp1252 codepages
  • Refactor DslArticle class
  • Handle header properties
    • dictionary name
    • index language
    • contents language
  • Support lang tag with name and id attributes
    • Provide acceptable LanguageName and LanguageCode map
  • Improve standard HtmlDslVisitor
  • Fix parsing "]]" bracket
  • Add support for "br" and "'" tag
  • Improve url handling
  • Fix checkstyle warnings and firebugs error
  • HtmlDslVisitor: convert media tags to hyperlink or img tags
  • Add code examples in README
  • Introduce visitor and data package
  • Introduce DslResult class to integrate loader and parser
  • Add HtmlDslVisitor
  • Add PlainDslVisitor
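
The GZIP-compressed index mentioned in the list above can be illustrated with a short round-trip sketch built on the standard `java.util.zip` streams. The class `GzipIndexCodec` and its record layout (an entry count followed by head word and offset pairs) are assumptions made for illustration only; they are not DSL4j's actual index format or API.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical index codec; the layout is illustrative, not DSL4j's cache format. */
public final class GzipIndexCodec {

    private GzipIndexCodec() {
    }

    /** Writes the entry count, then (head word, offset) pairs, through a GZIP stream. */
    public static void save(Map<String, Long> index, OutputStream out) throws IOException {
        // Closing the DataOutputStream finishes the GZIP trailer and closes 'out'.
        try (DataOutputStream data = new DataOutputStream(new GZIPOutputStream(out))) {
            data.writeInt(index.size());
            for (Map.Entry<String, Long> entry : index.entrySet()) {
                data.writeUTF(entry.getKey());
                data.writeLong(entry.getValue());
            }
        }
    }

    /** Reads back an index written by {@link #save}, preserving entry order. */
    public static Map<String, Long> load(InputStream in) throws IOException {
        try (DataInputStream data = new DataInputStream(new GZIPInputStream(in))) {
            int count = data.readInt();
            Map<String, Long> index = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String headWord = data.readUTF();
                long offset = data.readLong();
                index.put(headWord, offset);
            }
            return index;
        }
    }
}
```

Storing only head words and offsets keeps the index small, so a loader can seek to an article on demand instead of holding every article in memory.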

0.1.0

  • First internal release