-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added replacement facility for page content. #104
base: develop
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -293,6 +293,69 @@ folderA/ | |||||
``` | ||||||
</details> | ||||||
|
||||||
## Replacements | ||||||
|
||||||
By using either command line replacement pairs or specifying a file with json defintions of replacements an arbitrary regexp ("pattern") can be detected and replaced with another string, including expanding captured groups in the pattern. | ||||||
|
||||||
The replacement phase is taking place just before upsert, so all other textual manipulations are done by that time. | ||||||
|
||||||
Replacements happen in a deterministic sequence. There are ample opportunities to get unexpected (but logically consistent) results by inadvertently result of a previous replacement. | ||||||
<details> | ||||||
<summary>Format of json file</summary> | ||||||
|
||||||
```json | ||||||
{ | ||||||
"environment": [ | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The name is not particularly intuitive. I would maybe use something like |
||||||
{ | ||||||
"import": "<module name>", | ||||||
"path": "<source file>" | ||||||
} | ||||||
], | ||||||
"replacements":[ | ||||||
{ | ||||||
"name": "<name - optional>", | ||||||
"pattern": "<regexp>", | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
"new_value": "<string with optional group expansions>" | ||||||
"evaluate": <true|false - optional> | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not particularly intuitive. I would suggest using two different fields instead of making the behaviour dependent on this flag (e.g. |
||||||
}, | ||||||
] | ||||||
} | ||||||
``` | ||||||
</details> | ||||||
|
||||||
### Advanced replacements | ||||||
|
||||||
The `environment` block is optional and used for very dynamic replacements. By specifying a python source file, it will be dynamically imported at run time. The `new_value` field can then specify a `<module>.<func>` that returns a string value. As an example, the following adds a replacement of "TODAY" to an iso-formatted datetime. | ||||||
|
||||||
```json | ||||||
{ | ||||||
"environment": [ | ||||||
{ | ||||||
"import": "funcs", | ||||||
"path": "funcs.py" | ||||||
} | ||||||
], | ||||||
"replacements":[ | ||||||
{ | ||||||
"name": "Todays date", | ||||||
"pattern": "TODAY", | ||||||
"new_value": "funcs.today" | ||||||
"evaluate": true | ||||||
}, | ||||||
] | ||||||
} | ||||||
``` | ||||||
|
||||||
Funcs.py | ||||||
```python | ||||||
import datetime | ||||||
|
||||||
def today(term): | ||||||
return datetime.datetime.now().isoformat() | ||||||
``` | ||||||
|
||||||
The parameter `term` is a Match object as per using [re.subn](https://docs.python.org/3/library/re.html#re.subn). | ||||||
|
||||||
## Terminal output format | ||||||
|
||||||
By default, `md2cf` produces rich output with animated progress bars that are meant for human consumption. If the output is redirected to a file, the progress bars will not be displayed and only the final result will be written to the file. Error messages are always printed to standard error. | ||||||
|
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -23,6 +23,7 @@ | |||||||||||||||||||||
minimal_output_console, | ||||||||||||||||||||||
) | ||||||||||||||||||||||
from md2cf.document import Page | ||||||||||||||||||||||
from md2cf.replacements import create_replacements | ||||||||||||||||||||||
from md2cf.tui import Md2cfTUI | ||||||||||||||||||||||
from md2cf.upsert import upsert_attachment, upsert_page | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
@@ -258,6 +259,20 @@ def get_parser(): | |||||||||||||||||||||
help="number of retry attempts if any API call fails", | ||||||||||||||||||||||
) | ||||||||||||||||||||||
|
||||||||||||||||||||||
parser.add_argument( | ||||||||||||||||||||||
"--replace", | ||||||||||||||||||||||
nargs="+", | ||||||||||||||||||||||
action="append", | ||||||||||||||||||||||
dest="replacements", | ||||||||||||||||||||||
help="Specify replacements on the form <from>=<to>. Can be repeated many times", | ||||||||||||||||||||||
) | ||||||||||||||||||||||
|
||||||||||||||||||||||
parser.add_argument( | ||||||||||||||||||||||
"--replacements", | ||||||||||||||||||||||
dest="replacementfile", | ||||||||||||||||||||||
help="Filename with replacement definition in json format", | ||||||||||||||||||||||
) | ||||||||||||||||||||||
Comment on lines
+270
to
+274
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||||||||||||||
|
||||||||||||||||||||||
return parser | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
||||||||||||||||||||||
|
@@ -299,6 +314,8 @@ def main(): | |||||||||||||||||||||
console.quiet = True | ||||||||||||||||||||||
json_output_console.quiet = False | ||||||||||||||||||||||
|
||||||||||||||||||||||
replacements = create_replacements(args.replacements, args.replacementfile) | ||||||||||||||||||||||
|
||||||||||||||||||||||
confluence = api.MinimalConfluence( | ||||||||||||||||||||||
host=args.host, | ||||||||||||||||||||||
username=args.username, | ||||||||||||||||||||||
|
@@ -398,6 +415,8 @@ def main(): | |||||||||||||||||||||
for page in pages_to_upload: | ||||||||||||||||||||||
pre_process_page(page, args, postface_markup, preface_markup, space_info) | ||||||||||||||||||||||
tui.start_item_task(page.original_title) | ||||||||||||||||||||||
for replacement in replacements: | ||||||||||||||||||||||
page = replacement.replace(page) | ||||||||||||||||||||||
upsert_page_result = None | ||||||||||||||||||||||
try: | ||||||||||||||||||||||
tui.set_item_progress_label(page.original_title, "Upserting") | ||||||||||||||||||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are no tests at all for this :( |
Original file line number | Diff line number | Diff line change | ||||||
---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,70 @@ | ||||||||
import importlib.util | ||||||||
import json | ||||||||
import re | ||||||||
from typing import List | ||||||||
|
||||||||
from md2cf.console_output import console | ||||||||
|
||||||||
|
||||||||
class Replacement: | ||||||||
def __init__( | ||||||||
self, name: str, pattern: str, new_value: str, evaluate: bool = False | ||||||||
) -> None: | ||||||||
self.name = name | ||||||||
self.pattern = pattern | ||||||||
self.new_value = new_value | ||||||||
self.evaluate = evaluate | ||||||||
|
||||||||
def replace(self, page): | ||||||||
console.print(f"Performing replacement '{self.name}'") | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't add console logging inside library classes/functions, please. This is better handled in |
||||||||
if self.evaluate: | ||||||||
new_value = eval(self.new_value) | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since this is implemented with an |
||||||||
else: | ||||||||
new_value = self.new_value | ||||||||
page.body, count = re.subn(f"({self.pattern})", new_value, page.body) | ||||||||
console.print(f">> {count} replacements made") | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here. |
||||||||
return page | ||||||||
|
||||||||
def __repr__(self) -> str: | ||||||||
return self.name | ||||||||
|
||||||||
|
||||||||
def create_replacements(replacements, replacementfile: str) -> List[Replacement]: | ||||||||
result = [] | ||||||||
commandline_replacements = ( | ||||||||
[item for sublist in replacements for item in sublist] if replacements else [] | ||||||||
) | ||||||||
|
||||||||
# Create Replacement objects for the commandline replacements | ||||||||
for i, r in enumerate(commandline_replacements): | ||||||||
result.append(Replacement(f"CLI replacement {i}", *r.split("=", 1))) | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This can go wrong pretty easily if somebody forgets the
|
||||||||
|
||||||||
# Opt out if no file specified | ||||||||
if not replacementfile: | ||||||||
return result | ||||||||
|
||||||||
file_replacements = json.load(open(replacementfile)) | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||
# Do we need to load any modules? | ||||||||
for env in file_replacements.get("environment", []): | ||||||||
if env.get("import"): | ||||||||
spec = importlib.util.spec_from_file_location( | ||||||||
env["import"], env.get("path") | ||||||||
) | ||||||||
globals()[env["import"]] = importlib.util.module_from_spec(spec) | ||||||||
spec.loader.exec_module(globals()[env["import"]]) | ||||||||
|
||||||||
# Get the replacement definitions | ||||||||
for i, r in enumerate(file_replacements["replacements"]): | ||||||||
new_value = r["new_value"] | ||||||||
if isinstance(new_value, list): | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This behaviour is not documented anywhere. Is it really necessary? What's a use case for this vs. a string with newlines? |
||||||||
new_value = "\n".join(new_value) | ||||||||
result.append( | ||||||||
Replacement( | ||||||||
r.get("name", f"File replacement {i}"), | ||||||||
r["pattern"], | ||||||||
new_value, | ||||||||
r.get("evaluate", False), | ||||||||
) | ||||||||
) | ||||||||
|
||||||||
return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably recommend doing this in YAML instead of JSON. Most other configuration files in the documentation-adjacent space tend to be YAML, and JSON is a valid subset of YAML anyway.