-
-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse .
-prefixed reference identifiers (relative identifiers)
#37
Conversation
…ing off the square brackets
…ckage does not support <3.8 any more)
…d thus remove sys import
As suggested, it looks like we can get the reference from An initial debugging session with the code has highlighted the existence of Click to show a pysnooper.snoop debug dump of the relevant source codeNote: (autoref-docs) louis 🌟 ~/lab/docs/autoref-test $ pdmdoc -a localhost:8001
INFO - Building documentation...
INFO - Cleaning site directory
Source path:... /home/louis/dev/autorefs/src/mkdocs_autorefs/plugin.py
Starting var:.. self = <mkdocs_autorefs.plugin.AutorefsPlugin object at 0x7fe2750e4eb0>
Starting var:.. output = '\n<!doctype html>\n<html lang="en" class="no-js....js"></script>\n \n \n </body>\n</html>'
Starting var:.. page = Page(title='Home', url='/')
Starting var:.. kwargs = {'config': {'config_file_path': '/home/louis/lab...absolute_links': 20, 'unrecognized_links': 20}}}}
16:06:23.371158 call 203 def on_post_page(
16:06:23.371416 line 226 log.debug(f"Fixing references in page {page.file.src_path}")
16:06:23.371673 line 228 url_mapper = functools.partial(
16:06:23.371825 line 229 self.get_item_url, from_url=page.url, fallback=self.get_fallback_anchor
16:06:23.371957 line 228 url_mapper = functools.partial(
New var:....... url_mapper = functools.partial(<bound method AutorefsPlugin.g...andlers.base.Handlers object at 0x7fe2750e4e20>>)
16:06:23.372166 line 231 fixed_output, unmapped = fix_refs(output, url_mapper)
New var:....... fixed_output = '\n<!doctype html>\n<html lang="en" class="no-js....js"></script>\n \n \n </body>\n</html>'
New var:....... unmapped = []
16:06:23.372318 line 233 if unmapped and log.isEnabledFor(logging.WARNING):
16:06:23.372569 line 240 return fixed_output
16:06:23.372830 return 240 return fixed_output
Return value:.. '\n<!doctype html>\n<html lang="en" class="no-js....js"></script>\n \n \n </body>\n</html>'
Elapsed time: 00:00:00.001986
Starting var:.. self = <mkdocs_autorefs.plugin.AutorefsPlugin object at 0x7fe2750e4eb0>
Starting var:.. output = '\n<!doctype html>\n<html lang="en" class="no-js....js"></script>\n \n \n </body>\n</html>'
Starting var:.. page = Page(title='APIs', url='/api/')
Starting var:.. kwargs = {'config': {'config_file_path': '/home/louis/lab...absolute_links': 20, 'unrecognized_links': 20}}}}
16:06:23.374654 call 203 def on_post_page(
16:06:23.374821 line 226 log.debug(f"Fixing references in page {page.file.src_path}")
16:06:23.375039 line 228 url_mapper = functools.partial(
16:06:23.375212 line 229 self.get_item_url, from_url=page.url, fallback=self.get_fallback_anchor
16:06:23.375376 line 228 url_mapper = functools.partial(
New var:....... url_mapper = functools.partial(<bound method AutorefsPlugin.g...andlers.base.Handlers object at 0x7fe2750e4e20>>)
16:06:23.375526 line 231 fixed_output, unmapped = fix_refs(output, url_mapper)
New var:....... fixed_output = '\n<!doctype html>\n<html lang="en" class="no-js....js"></script>\n \n \n </body>\n</html>'
New var:....... unmapped = []
16:06:23.375715 line 233 if unmapped and log.isEnabledFor(logging.WARNING):
16:06:23.375965 line 240 return fixed_output
16:06:23.376203 return 240 return fixed_output
Return value:.. '\n<!doctype html>\n<html lang="en" class="no-js....js"></script>\n \n \n </body>\n</html>'
Elapsed time: 00:00:00.001890
Starting var:.. self = <mkdocs_autorefs.plugin.AutorefsPlugin object at 0x7fe2750e4eb0>
Starting var:.. output = '\n<!doctype html>\n<html lang="en" class="no-js....js"></script>\n \n \n </body>\n</html>'
Starting var:.. page = Page(title='Foo', url='/api/foo/')
Starting var:.. kwargs = {'config': {'config_file_path': '/home/louis/lab...absolute_links': 20, 'unrecognized_links': 20}}}}
16:06:23.381089 call 203 def on_post_page(
16:06:23.381283 line 226 log.debug(f"Fixing references in page {page.file.src_path}")
16:06:23.381513 line 228 url_mapper = functools.partial(
16:06:23.381690 line 229 self.get_item_url, from_url=page.url, fallback=self.get_fallback_anchor
16:06:23.381873 line 228 url_mapper = functools.partial(
New var:....... url_mapper = functools.partial(<bound method AutorefsPlugin.g...andlers.base.Handlers object at 0x7fe2750e4e20>>)
16:06:23.382057 line 231 fixed_output, unmapped = fix_refs(output, url_mapper)
Source path:... /home/louis/dev/autorefs/src/mkdocs_autorefs/plugin.py
Starting var:.. self = <mkdocs_autorefs.plugin.AutorefsPlugin object at 0x7fe2750e4eb0>
Starting var:.. identifier = '.x'
Starting var:.. from_url = 'api/foo/'
Starting var:.. fallback = <bound method Handlers.get_anchors of <mkdocstri...handlers.base.Handlers object at 0x7fe2750e4e20>>
16:06:23.382299 call 105 def get_item_url(
16:06:23.382355 line 121 url = self._get_item_url(identifier, fallback)
> /home/louis/dev/autorefs/src/mkdocs_autorefs/plugin.py(90)_get_item_url()
-> try:
(Pdb) p identifier
'.x'
(Pdb) p fallback
<bound method Handlers.get_anchors of <mkdocstrings.handlers.base.Handlers object at 0x7fe2750e4e20>>
(Pdb) n
> /home/louis/dev/autorefs/src/mkdocs_autorefs/plugin.py(91)_get_item_url()
-> return self._url_map[identifier]
(Pdb) p self._url_map
{'foo': 'api/foo/#foo', 'foo.main': 'api/foo/#foo.main', 'foo.main.A': 'api/foo/#foo.main.A', 'foo.main.A.x': 'api/foo/#foo.main.A.x', 'foo.main.A.y': 'api/foo/#foo.main.A.y'} |
I presume that the right place to 'intervene' (process the relative path) is here, but don't want to rush in with a thoughtless implementation: autorefs/src/mkdocs_autorefs/plugin.py Lines 80 to 97 in a6e373b
e.g. in my example above the (Pdb) pp self._url_map
{'foo': 'api/foo/#foo',
'foo.main': 'api/foo/#foo.main',
'foo.main.A': 'api/foo/#foo.main.A',
'foo.main.A.x': 'api/foo/#foo.main.A.x',
'foo.main.A.y': 'api/foo/#foo.main.A.y'} Looking at it, we don't have the information available to compute a relative path here: we would need to have already expanded it at an earlier stage into a full identifier from the We could however use a lookup (to save time for repeated operations) if we only target the relative prefix (as we are only using one or more dots), and then we could just do a simple replacement. What we're looking for would be something like this:
We would take the length of the So in the example of
|
The data flow is not the easiest one 🙂 What I envisioned was to take advantage of the intermediate data format used by autorefs, i.e. the HTML span with arbitrary attributes. In there we could store the path of the origin object (the one that has the relative ref in its docstring), as well as other helpful debugging information (line numbers, etc.). Then, in the process of "fixing" refs, we could use this information to rebuild absolute refs (and use additional info in error messages if needed). The issue is, how do we pass this extra information to the spans? We could infer some from the surrounding HTML context, but we definitely do not want to parse HTML to extract it (super costly). Maybe we could have a pre-processing step during Markdown rendering, thanks to an additional Markdown processor that would act only on refs (no links, no diagrams, let me know if it's unclear. I'd like to elaborate anyway and sometimes put a clear explanation of the data flow in the docs 🙂 ) |
It's not at all bad! The state is relatively simple in the demo I've made and there are only 2 modules. 😇 I just traced back where to hook into the identifier lookup and as you rightly say it would have to be where the In this function we get access to the HTML span, I just stuck a breakpoint on it: > /home/louis/dev/autorefs/src/mkdocs_autorefs/references.py(167)inner()
-> try:
(Pdb) p identifier
'.x'
(Pdb) p title
'.x'
(Pdb) p kind
'autorefs-identifier'
(Pdb) p match
<re.Match object; span=(20120, 20165), match='<span data-autorefs-identifier=".x">.x</span>'> I like your suggestion: there may well be further benefits we can get from populating this span with the location. For now, all we need is the qualname of the object the docstring is on (its "target" shall we say).
I don't fully follow this: I was proposing to keep it entirely in the It sounds like you're saying that isn't entirely possible: the handler needs to provide some more information (the target's qualname, at least). Tracing the origin of this span, it began in When it did this, all it passed along was Therefore we need to retrace our steps further: the
Click to show full traceback to the handleMatch call(Pdb) w
/home/louis/miniconda3/envs/autoref-docs/bin/mkdocs(8)<module>()
-> sys.exit(cli())
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/click/core.py(1157)__call__()
-> return self.main(*args, **kwargs)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/click/core.py(1078)main()
-> rv = self.invoke(ctx)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/click/core.py(1688)invoke()
-> return _process_result(sub_ctx.command.invoke(sub_ctx))
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/click/core.py(1434)invoke()
-> return ctx.invoke(self.callback, **ctx.params)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/click/core.py(783)invoke()
-> return __callback(*args, **kwargs)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocs/__main__.py(270)serve_command()
-> serve.serve(**kwargs)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocs/commands/serve.py(86)serve()
-> builder(config)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocs/commands/serve.py(67)builder()
-> build(config, live_server=None if is_clean else server, dirty=is_dirty)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocs/commands/build.py(322)build()
-> _populate_page(file.page, config, files, dirty)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocs/commands/build.py(175)_populate_page()
-> page.render(config, files)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocs/structure/pages.py(271)render()
-> self.content = md.convert(self.markdown)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/core.py(357)convert()
-> root = self.parser.parseDocument(self.lines).getroot()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/blockparser.py(117)parseDocument()
-> self.parseChunk(self.root, '\n'.join(lines))
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/blockparser.py(136)parseChunk()
-> self.parseBlocks(parent, text.split('\n\n'))
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/blockparser.py(158)parseBlocks()
-> if processor.run(parent, blocks) is not False:
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings/extension.py(124)run()
-> html, handler, data = self._process_block(identifier, block, heading_level)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings/extension.py(220)_process_block()
-> rendered = handler.render(data, options)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/handler.py(334)render()
-> return template.render(
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/jinja2/environment.py(1299)render()
-> return self.environment.concat(self.root_render_func(ctx)) # type: ignore
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/module.html(15)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/module.html(77)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/module.html(129)block_contents()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/module.html(162)block_children()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/children.html(15)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/children.html(185)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/module.html(15)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/module.html(77)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/module.html(129)block_contents()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/module.html(162)block_children()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/children.html(15)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/children.html(125)root()
-> {% include attribute|get_template with context %}
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/class.html(15)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/class.html(79)root()
-> {% endfor -%}
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/class.html(186)block_contents()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/class.html(233)block_docstring()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/docstring.html(15)root()
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings_handlers/python/templates/material/_base/docstring.html(33)root()
-> {% endfor %}
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/mkdocstrings/handlers/base.py(276)do_convert_markdown()
-> return Markup(self._md.convert(text))
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/core.py(361)convert()
-> newRoot = treeprocessor.run(root)
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/treeprocessors.py(385)run()
-> self.__handleInline(text), child
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/treeprocessors.py(136)__handleInline()
-> data, matched, startIndex = self.__applyPattern(
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/treeprocessors.py(286)__applyPattern()
-> node, start, end = pattern.handleMatch(match, data)
> /home/louis/dev/autorefs/src/mkdocs_autorefs/references.py(66)handleMatch()
-> return self.makeTag(identifier, text), m.start(0), end ...
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/treeprocessors.py(385)run()
-> self.__handleInline(text), child
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/treeprocessors.py(136)__handleInline()
-> data, matched, startIndex = self.__applyPattern(
/home/louis/miniconda3/envs/autoref-docs/lib/python3.10/site-packages/markdown/treeprocessors.py(286)__applyPattern()
-> node, start, end = pattern.handleMatch(match, data) We can go read this code for It's not clear to me how to supply the required info, so I'm glad you can see! |
If you could give me a pointer to where to go next I'd appreciate it, I hit a bit of a dead end there. I'm going to take a break now and come back, I have other things to get onto, so please no rush! 😺 I modified the test repo in the demo repo (which broke the deployment build, obviously, as this feature isn't created yet): |
Of course, take your time 🙂
You got it right. I read again autorefs' code and had forgotten it already works with a Markdown inline processor. Put shortly, there are 4 steps:
So now I have a clearer idea where the hook between handlers and autorefs should happen: in the autorefs reference processor. Actually it already works with a hook to handlers in the post-page processing step, so we could do something similar in the Markdown reference processor: if handlers are hooked, use them to retrieve additional information, that is then stored in spans. If not, do as usual. Then, in the post-page processing step, use that information (as described in previous comments, to rebuild absolute refs). Or, instead of fetching info and storing it into spans, maybe we could ask the handlers themselves to directly resolve relative refs to absolute ones, so that we can simply store the absolute one in the span 🙂 |
Hi. I'd just like to point out one thing. The dot character has no significance to autorefs whatsoever. When moving in this direction, it may become best to put all the code of autorefs back into mkdocstrings and then proceed there. Because by now there's way too much data flow. And even still then, the dot will have no significance to mkdocstrings. I assume that you're assuming the Python handler, but why? What about the Crystal handler which uses Oh actually there's another thing. In Python you say |
Actually with the latter in mind, I think this should just be implemented in each handler itself. The Crystal handler implements relative refs directly and I saw no benefit from doing it any other way. Maybe only with some shared utilities. |
You may think that cross-language refs couldn't work that way.. but what would a relative ref from one language to another even mean? It is best that they don't work, that'd be a real surprise otherwise |
Thanks for sharing how this works in Crystal. I think with my latest suggestion
...this effectively moves the logic specific to each language into handlers themselves 👍 The benefit of "hooking" handlers to autorefs this way is that each handler does not have to re-implement a Markdown inline reference processor, or other less efficient parsing to transform references in Markdown docstrings. |
Regarding cross-language abilities: at some point it might be necessary to support it in autorefs (but not relative refs, you're right, it doesn't seem to make sense). For example a same project could declare stuff in a language, and have bindings in another, where both have the same identifier. In that case you'll get into trouble trying to reference the object from language A instead of B, depending in which order the headings were rendered. What I mean is that for projects with multiple languages, we'll possibly need a feature like Sphinx' domains, with a way to choose the domain (or other attributes) directly within the markup. |
Thanks for the response.
Actually that brings me to another point. Yes, this saves the need to re-implement the
So maybe sharing the implementation of the syntax is not even that good of a goal to have? The Python handler could consider a more convenient syntax as well, by the way. After all, it's already not pure Markdown, what with the Google-style sections and such. With this idea, the |
Thanks for chiming in @oprypin, I was aware there is also a Crystal handler too! 🙂
Wow! I didn’t intend to cause that sort of disruption to implement this feature, I saw it as an extension of existing functionality.
I was not: the use of
The preceding This therefore isn’t Python-specific. You would simply take the absolute path to the object being documented, which as you say might be
I am not familiar with Crystal syntax any further than this, but I don’t see any particular need to use a separate syntax. In any case it sounds like a per-handler approach is now being proposed (I just wanted to emphasise that this is not due to something I overlooked in the proposal above). I laid out a pseudo-code routine for how this would work in this comment above
This routine is language agnostic, for Crystal you’d just change the values in the dict of parent levels:
I see now why you thought that I’d overlooked Crystal, I just elided the separator as “+”. No matter what separator you want to use, it will always be possible to ascend a path, provided you have the separators to split it on, which for Python is { This was just meant to be a rough outline, so good to discuss a full spec for the work to be done. I’m happy to shift this into a per-handler if the data flow proves too taxing to do in a common way - though I don’t think it would be impossible or even difficult to do it for all languages (just specifying the separator in the handler would suffice from what I understand here). It is a bit of a shame to have to split this off per-language, I thought a simple universal approach would work. Hopefully this explains why If all languages have the If you feel strongly that’s unwise please say so, this is my first time developing with this code ecosystem. |
Hi @lmmx , thanks for the consideration. I have expressed why I think a language-specific approach could be better - with a more natural syntax for the target language and with relative lookup rules familiar to developers in that language. But yes, I definitely missed the fact that your proposal could still work universally.
I'll not block anything either way. Just to me seems that the design considerations are much harder even than the implementation considerations, so wanted to cover everything. |
@oprypin Interesting. I'm curious to see examples for many languages now.
It's nice and all that most languages have their way of writing cross-references, but I'd personally find it cumbersome to use different syntax depending on the language when the docs themselves are all written in Markdown. Or, in a more enthusiast way, it would be nice to have a single syntax based on Markdown for every language. @lmmx perfectly explained what are also my thoughts: the I also understand that if a language specifies documentation comments with non-Markdown markup, supporting just the cross-ref ability in a Markdown-like syntax does not make much sense. Writers will want to stay close to their language specs, to allow compatibility with other tools from their ecosystem, etc.. Unification of the syntax is therefore maybe just a sweet dream. Or maybe it's just too early, and we should reconsider when we have many more handlers. Even if touted as superior by design, rST did not manage to bring its cross-language abilities everywhere. Markdown is much, much more popular so maybe it has a chance.
I see "two birds with one stone" here. If we extend it to support language domains, it would be nice to also allow handlers to transform relative refs to absolute ones, without moving the entire responsibility on them. In the end, as you said @oprypin, it's a hard thing to design. I've not yet made up my mind and will need more exploration/discussion 🙂 |
🙂🙂 Nice discussion yes
The way I was thinking about it is .. you're writing the comments for the current programming language which surely you're familiar with so it can't be cumbersome to use its typical syntax. On the other side, having to bend to the particular doc generator in use seemed like the worse tradeoff to me, even if that doc generator is consistent with itself. I mean, it's easy to imagine a team lead putting a veto on a doc generator that forces developers to write comments in an un-idiomatic style. Whereas with what I'm proposing, you could freely switch between mkdocstrings and the native doc generator without needing to edit the original code comments.
But this is interesting actually. My thinking is biased by the fact that most languages these days already have Markdown-based documentation syntax so I started to think of mkdocstrings mainly as a way to insert docs into a Markdown page, not necessarily written in the exact same dialect of Markdown as the top-level page, or even Markdown at all. But of course for the case of Python that actually was whole point - mkdocstrings-python wouldn't want to be based on reStructuredText just because that's what the developers are used to - that would lose a huge point of the project. But still I think "mkdocstrings-python-rst" would be a valid and feasible project if someone really likes rst but not sphinx and wanted to develop such an alternative. It would render rst docs into an MkDocs site. In which case surely the refs would preferably use rst syntax and not this one, right? |
(continuing from the above) |
That's kinda part of the concept yes 😂 But I see your point 🙂 |
This PR intends to add support for relative references
evalId
produces in both cases, stripping off the square bracketsline-length
config in atool.ruff
block inpyproject.toml
so ruff-lsp will not auto-lint the lines incorrectly while editing [clash with pre-commit duty rule]Testing
I have set up a Vercel deployment to test the effect of the proposal as I work on it, using a branch on a Vercel/mkdocs "hello world" type demo repo:
docs-pkg
repo'sautoref-test
branch: https://docs-pkg-git-autoref-test-lmmx.vercel.app/foo.main.A