Skip to content

Commit

Permalink
add: generation of the includes map (#27)
Browse files Browse the repository at this point in the history
* added includes map generation

* fix

* update

* fix

* update and add debug

* update debug

* fix case 3

* update

* add option

* fix import

* remove uncessary comments

* update

* add test includes map

* update README.md

* update python-test.yml

* test

* add: anchors into include map

* test: enabled by default

* fix: tests and remove anchors

* update: README

* bump version
  • Loading branch information
TOsmanov authored Oct 16, 2024
1 parent 9af3eed commit e13da3a
Show file tree
Hide file tree
Showing 7 changed files with 148 additions and 14 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/python-test.yml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
name: Python package
name: Python package tests

on: [push]

Expand Down
8 changes: 7 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
[![](https://img.shields.io/pypi/v/foliantcontrib.includes.svg)](https://pypi.org/project/foliantcontrib.includes/) [![](https://img.shields.io/github/v/tag/foliant-docs/foliantcontrib.includes.svg?label=GitHub)](https://github.com/foliant-docs/foliantcontrib.includes)
[![](https://img.shields.io/pypi/v/foliantcontrib.includes.svg)](https://pypi.org/project/foliantcontrib.includes/) [![](https://img.shields.io/github/v/tag/foliant-docs/foliantcontrib.includes.svg?label=GitHub)](https://github.com/foliant-docs/foliantcontrib.includes) [![Tests](https://github.com/foliant-docs/foliantcontrib.includes/actions/workflows/python-test.yml/badge.svg)](https://github.com/foliant-docs/foliantcontrib.includes/actions/workflows/python-test.yml)

# Includes for Foliant

Expand Down Expand Up @@ -32,6 +32,7 @@ preprocessors:
- j2
aliases:
...
includes_map: true
```
`cache_dir`
Expand Down Expand Up @@ -79,6 +80,11 @@ Default `true`.

Note that in the second example the default revision (`develop`) will be overridden with the custom one (`master`).

`includes_map`
: Enables generation of the `includes_map.json` file containing information about files inserted using the includes preprocessor.

From this file, third-party services can receive information about the presence of inclusions in files, for example, to check links using a linter.

## Usage

The preprocessor allows two syntax variants for include statements.
Expand Down
7 changes: 6 additions & 1 deletion README_ru.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
[![](https://img.shields.io/pypi/v/foliantcontrib.includes.svg)](https://pypi.org/project/foliantcontrib.includes/) [![](https://img.shields.io/github/v/tag/foliant-docs/foliantcontrib.includes.svg?label=GitHub)](https://github.com/foliant-docs/foliantcontrib.includes)
[![](https://img.shields.io/pypi/v/foliantcontrib.includes.svg)](https://pypi.org/project/foliantcontrib.includes/) [![](https://img.shields.io/github/v/tag/foliant-docs/foliantcontrib.includes.svg?label=GitHub)](https://github.com/foliant-docs/foliantcontrib.includes) [![Tests](https://github.com/foliant-docs/foliantcontrib.includes/actions/workflows/python-test.yml/badge.svg)](https://github.com/foliant-docs/foliantcontrib.includes/actions/workflows/python-test.yml)

# Препроцессор Includes для Foliant

Expand Down Expand Up @@ -34,6 +34,7 @@ preprocessors:
- j2
aliases:
...
includes_map: true
```
`cache_dir`
Expand Down Expand Up @@ -64,6 +65,10 @@ preprocessors:
`aliases`
: Сопоставление псевдонимов с URL-адресами репозитория Git. После определения этого параметра псевдоним может использоваться для ссылки на репозиторий вместо его полного URL-адреса.

`includes_map`
: Включает генерацию файла `includes_map.json`, содержащего информацию о файлах, вставленных с помощью препроцессора includes.
Из этого файла сторонние сервисы могут получать информацию о наличии текста вставленного в файл с помощью препроцессора, например, для проверки ссылок с помощью линтера.

>**Внимание!**
>
> Псевдонимы доступны только в рамках устаревшего синтаксиса инструкций include (см. ниже)
Expand Down
4 changes: 4 additions & 0 deletions changelog.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# 1.1.18

- Add: option for generation of the includes map containing information about files inserted using the preprocessor.

# 1.1.17

- Fix: fixed a link processing error for files with diagrams
Expand Down
121 changes: 111 additions & 10 deletions foliant/preprocessors/includes.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@
from pathlib import Path
import socket
from subprocess import run, CalledProcessError, PIPE, STDOUT
from json import dump
from os import getcwd


from foliant.preprocessors.base import BasePreprocessor
Expand Down Expand Up @@ -43,6 +45,12 @@ def __init__(self, *args, **kwargs):

self._cache_dir_path = self.project_path / self.options['cache_dir']
self._downloaded_dir_path = self._cache_dir_path / '_downloaded_content'
self.src_dir = self.config.get("src_dir")
self.includes_map_enable = False
if 'includes_map' in self.options:
self.includes_map_enable = True
if self.includes_map_enable:
self.includes_map = []

self.logger = self.logger.getChild('includes')

Expand Down Expand Up @@ -162,7 +170,7 @@ def _download_file_from_url(self, url: str) -> Path:

for line in dict_new_link:
downloaded_content = downloaded_content.replace(line, dict_new_link[line])
# End of the conversion code block
# End of the conversion code block

with open(downloaded_file_path, 'w', encoding='utf8') as downloaded_file:

Expand Down Expand Up @@ -217,6 +225,8 @@ def _sync_repo(

except CalledProcessError as exception:
self.logger.warning(str(exception))
except Exception as exception:
self.logger.warning(str(exception))

else:
self.logger.error(str(exception))
Expand Down Expand Up @@ -684,7 +694,7 @@ def _get_included_file_path(
)

self.logger.debug(f'Finally, included file path: {included_file_path}')

return included_file_path

def _process_include(
Expand Down Expand Up @@ -723,8 +733,8 @@ def _process_include(
f'Included file path: {included_file_path}, from heading: {from_heading}, ' +
f'to heading: {to_heading}, sethead: {sethead}, nohead: {nohead}'
)


if included_file_path.exists():
included_file_path = included_file_path
else:
Expand Down Expand Up @@ -757,9 +767,9 @@ def _process_include(

old_found_link = regexp_find_link.findall(included_content)

for line in old_found_link:
for line in old_found_link:
relative_path = regexp_find_path.findall(line)

for ex_line in relative_path:
exceptions_characters = re.findall(r'https?://[^\s]+|@|:|\.png|\.jpeg|.svg', ex_line)
if exceptions_characters:
Expand All @@ -771,7 +781,7 @@ def _process_include(

for line in dict_new_link:
included_content = included_content.replace(line, dict_new_link[line])
# End of the conversion code block
# End of the conversion code block

if self.config.get('escape_code', False):
if isinstance(self.config['escape_code'], dict):
Expand Down Expand Up @@ -831,6 +841,50 @@ def _process_include(

return included_content

def _prepare_path_for_includes_map(self, path: Path) -> str:
donor_path = None
if path.as_posix().startswith(self.working_dir.as_posix()):
_path = path.relative_to(self.working_dir)
donor_path = f"{self.src_dir}/{_path.as_posix()}"
elif path.as_posix().startswith(getcwd()):
_path = path.relative_to(getcwd())
if _path.as_posix().startswith(self.working_dir.as_posix()):
_path = _path.relative_to(self.working_dir)
if _path.as_posix().startswith(self.working_dir.as_posix()):
donor_path = f"{self.src_dir}/{_path.relative_to(self.working_dir).as_posix()}"
else:
donor_path = f"{self.src_dir}/{_path.as_posix()}"
else:
donor_path = _path.as_posix()
return donor_path

def _exist_in_includes_map(self, map: list, path: str) -> bool:
for obj in map:
if obj["file"] == path:
return True
return False

def _find_anchors(self, content: str) -> list:
anchors_list = []

anchors = re.findall(r'\<anchor\>([\-\_A-Za-z0-9]+)\<\/anchor\>', content)
for anchor in anchors:
anchors_list.append(anchor)
custom_ids = re.findall(r'\{\#([\-A-Za-z0-9]+)\}', content)
for anchor in custom_ids:
anchors_list.append(anchor)
elements_with_ids = re.findall(r'id\=[\"\']([\-A-Za-z0-9]+)[\"\']', content)
for anchor in elements_with_ids:
anchors_list.append(anchor)
return anchors_list

def _add_anchors(self, l: list, content: str) -> list:
anchors = self._find_anchors(content)
if len(anchors) > 0:
for anchor in anchors:
l.append(anchor)
return l

def process_includes(
self,
markdown_file_path: Path,
Expand All @@ -850,6 +904,12 @@ def process_includes(
:returns: Markdown content with resolved includes
'''

if self.includes_map_enable:
if markdown_file_path.as_posix().startswith(self.working_dir.as_posix()):
recipient_md_path = f'{self.src_dir}/{markdown_file_path.relative_to(self.working_dir).as_posix()}'
else:
recipient_md_path = f'{self.src_dir}/{markdown_file_path.as_posix()}'

markdown_file_path = markdown_file_path.resolve()

self.logger.debug(f'Processing Markdown file: {markdown_file_path}')
Expand All @@ -867,6 +927,10 @@ def process_includes(
include_statement = self.pattern.fullmatch(content_part)

if include_statement:
if self.includes_map_enable:
donor_md_path = None
donor_anchors = []

current_project_root_path = project_root_path

body = self._tag_body_pattern.match(include_statement.group('body').strip())
Expand Down Expand Up @@ -950,6 +1014,10 @@ def process_includes(

included_file_path = repo_path / body.group('path')

if self.includes_map_enable:
donor_md_path = included_file_path.as_posix()
self.logger.debug(f'Set the repo URL of the included file to {recipient_md_path}: {donor_md_path} (1)')

if included_file_path.name.startswith('^'):
included_file_path = self._find_file(
included_file_path.name[1:], included_file_path.parent
Expand Down Expand Up @@ -1000,7 +1068,11 @@ def process_includes(
nohead=options.get('nohead')
)

else: # if body
if self.includes_map_enable:
donor_md_path = self._prepare_path_for_includes_map(included_file_path)
self.logger.debug(f'Set the path of the included file to {recipient_md_path}: {donor_md_path} (2)')

else: # if body is missing
self.logger.debug('Using the new syntax rules')

if options.get('repo_url') and options.get('path'):
Expand Down Expand Up @@ -1036,6 +1108,10 @@ def process_includes(
include_link=include_link
)

if self.includes_map_enable:
donor_md_path = include_link + options.get('path')
self.logger.debug(f'Set the link of the included file to {recipient_md_path}: {donor_md_path} (3)')

elif options.get('url'):
self.logger.debug('File to get by URL referenced')

Expand All @@ -1062,13 +1138,20 @@ def process_includes(
nohead=options.get('nohead')
)

if self.includes_map_enable:
donor_md_path = options['url']
self.logger.debug(f'Set the URL of the included file to {recipient_md_path}: {donor_md_path} (4)')

elif options.get('src'):
self.logger.debug('Local file referenced')

included_file_path = self._get_included_file_path(options.get('src'), markdown_file_path)

self.logger.debug(f'Resolved path to the included file: {included_file_path}')

if self.includes_map_enable:
donor_md_path = self._prepare_path_for_includes_map(included_file_path)
self.logger.debug(f'Set the path of the included file to {recipient_md_path}: {donor_md_path} (5)')

if options.get('project_root'):
current_project_root_path = (
markdown_file_path.parent / options.get('project_root')
Expand All @@ -1087,6 +1170,7 @@ def process_includes(
sethead=current_sethead,
nohead=options.get('nohead')
)

else:
self.logger.warning(
'Neither repo_url+path nor src specified, ignoring the include statement'
Expand Down Expand Up @@ -1144,6 +1228,15 @@ def process_includes(

processed_content_part = re.sub(r'\s+', ' ', processed_content_part).strip()

if self.includes_map_enable:
if donor_md_path:
if not self._exist_in_includes_map(self.includes_map, recipient_md_path):
self.includes_map.append({ 'file': recipient_md_path, "includes": []})

for i, f in enumerate(self.includes_map):
if f['file'] == recipient_md_path:
self.includes_map[i]['includes'].append(donor_md_path)

else:
processed_content_part = content_part

Expand Down Expand Up @@ -1179,7 +1272,7 @@ def _get_source_files_extensions(self) -> list:
return source_files_extensions

def apply(self):

self.logger.info('Applying preprocessor')

# Cleaning up downloads because the content of remote source may have modified
Expand All @@ -1202,4 +1295,12 @@ def apply(self):
with open(source_file_path, 'w', encoding='utf8') as processed_file:
processed_file.write(processed_content)

# Write includes map
if self.includes_map_enable:
output = f'{self.working_dir}/static/includes_map.json'
Path(f'{self.working_dir}/static/').mkdir(parents=True, exist_ok=True)
with open(f'{self.working_dir}/static/includes_map.json', 'w', encoding='utf8') as f:
dump(self.includes_map, f)
self.logger.debug(f'includes_map write to {output}')

self.logger.info('Preprocessor applied')
2 changes: 1 addition & 1 deletion setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@
description=SHORT_DESCRIPTION,
long_description=LONG_DESCRIPTION,
long_description_content_type='text/markdown',
version='1.1.17',
version='1.1.18',
author='Konstantin Molchanov',
author_email='[email protected]',
url='https://github.com/foliant-docs/foliantcontrib.includes',
Expand Down
18 changes: 18 additions & 0 deletions test/test_includes.py
Original file line number Diff line number Diff line change
Expand Up @@ -243,3 +243,21 @@ def test_extensions(self):
'index.j2': '# My title\n\nIncluded content',
'sub/sub.md': 'Included content'
}

def test_includes_map(self):
self.ptf.options = {'includes_map': True }
input_map = {
'index.md': '# My title\n\n<include src="sub/sub-1.md"></include>\n\n<include src="sub/sub-2.md"></include>',
'sub/sub-1.md': 'Included content 1',
'sub/sub-2.md': 'Included content 2'
}
expected_map = {
'index.md': '# My title\n\nIncluded content 1\n\nIncluded content 2',
'static/includes_map.json': "[{\"file\": \"__src__/index.md\", \"includes\": [\"__src__/sub/sub-1.md\", \"__src__/sub/sub-2.md\"]}]",
'sub/sub-1.md': 'Included content 1',
'sub/sub-2.md': 'Included content 2'
}
self.ptf.test_preprocessor(
input_mapping=input_map,
expected_mapping=expected_map,
)

0 comments on commit e13da3a

Please sign in to comment.