A simple word counter that also handles more complex word processor files.
On command (run manually or from cron):
- Retrieve list of known files from a database - waiting for database support
- Crawl the directory structure from a given starting point - in progress
- Count words in all supported files found in the directory - in progress
- Update (or create) records in a database, keep history - waiting for database support
- Display list of changed files and sum of the words - in progress
- Directory name - todo
- Specific files
- Changed database records
- Summary of changes on console
- The content is irrelevant; only the net change counts. If a file lost 50 words but gained 150, the result is an increase of 100.
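
The net-change rule above can be sketched as a plain subtraction of two stored counts. The function name `net_change` and the starting count of 1000 are illustrative assumptions, not part of the project.

```python
def net_change(previous_count: int, current_count: int) -> int:
    """Return the signed difference in word count between two snapshots."""
    return current_count - previous_count


# Assumed example: a file with 1000 words loses 50 and gains 150,
# so it now holds 1100 words. Only the difference is reported.
print(net_change(1000, 1100))  # prints 100
```

Because only totals are compared, edits that remove and add the same number of words report a change of 0.
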
- txt
- md
- docx
- odt
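
A minimal sketch of crawling a directory and counting words, limited here to the two plain-text types. The names `crawl`, `count_words`, and `PLAIN_TEXT_SUFFIXES` are hypothetical. The counter simply splits on whitespace, so Markdown markup such as `#` is counted as a word; docx and odt are zipped XML and would need dedicated readers that are not shown.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: only plain-text types are handled in this sketch.
PLAIN_TEXT_SUFFIXES = {".txt", ".md"}


def count_words(path: Path) -> int:
    """Count whitespace-separated tokens in a plain-text file."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return len(text.split())


def crawl(start: Path) -> dict[Path, int]:
    """Walk the tree under `start` and return a word count per supported file."""
    counts: dict[Path, int] = {}
    for path in sorted(start.rglob("*")):
        # Compare the full suffix so that ".doc" is not mistaken for ".docx".
        if path.is_file() and path.suffix.lower() in PLAIN_TEXT_SUFFIXES:
            counts[path] = count_words(path)
    return counts
```

Calling `sum(crawl(Path(".")).values())` would give the total for the current directory.
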
- Linking directory walking with file processing
- Better type recognition - `docx` vs. `doc`, accepting `dot`s, etc.
- Database handling
- Creating data structures
- Data retrieval
- Data saving
- Incorrect filename handling
- Counting algorithm verification - current results disagree with both LibreOffice and Microsoft Office
- Note that LibreOffice and Microsoft Office disagree with each other as well
- Unit testing
- Mechanism
- Coverage for `processor` classes
- Exchangeable database interface
- May start with SQLite
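
One possible shape for an exchangeable database interface, sketched with the standard-library `sqlite3` module. The class names `CountStore` and `SQLiteCountStore`, the `counts` table, and its columns are all assumptions made for illustration; the project has not defined a schema yet. History is kept by appending a row on every save instead of updating in place.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class CountStore(ABC):
    """Hypothetical storage interface; other backends would implement the same methods."""

    @abstractmethod
    def latest(self, path: str) -> Optional[int]:
        """Return the most recently saved count for `path`, or None if unknown."""

    @abstractmethod
    def save(self, path: str, words: int) -> None:
        """Record a new count for `path`."""


class SQLiteCountStore(CountStore):
    def __init__(self, database: str = ":memory:") -> None:
        self._conn = sqlite3.connect(database)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS counts ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " path TEXT NOT NULL,"
            " words INTEGER NOT NULL,"
            " recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
        )

    def latest(self, path: str) -> Optional[int]:
        row = self._conn.execute(
            "SELECT words FROM counts WHERE path = ? ORDER BY id DESC LIMIT 1",
            (path,),
        ).fetchone()
        return row[0] if row else None

    def save(self, path: str, words: int) -> None:
        # Append instead of updating so that earlier counts remain as history.
        self._conn.execute(
            "INSERT INTO counts (path, words) VALUES (?, ?)", (path, words)
        )
        self._conn.commit()
```

Code that depends only on `CountStore` would not need to change if SQLite were later swapped for another backend.
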