FeatureRequest/HelpNeeded: highlight is not an exact subset of the text content #179
Comments
Hi! Just a quick bump, as I would really like to wrap up my project while I have some free time :) But if you can't find the time to take a look, that's totally fine of course!
Hi, I think what you are seeing in the highlight text is raw text, or at least Markdown. Can you post a screenshot of the highlight itself?
Here's the highlighted section of the text. The article link is this one: https://sebastianraschka.com/blog/2023/llm-finetuning-lora.html
Hi, I decided to go the "most robust way" anyway and implement a function that finds the substring of a corpus that best matches the highlight. This is computationally intensive and will probably be an issue for very long texts, but at least I can move on towards finishing this. When I finish this project, if I think it's worth it, I'll come back to you to see if it deserves a mention in a blog post or whatever :) In the meantime, although I still think my request is legit and someone might have a real need for more precise filter access in the API, I'll let you decide whether you want to close this or not :) Have a nice day!
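For readers wondering what such a brute-force search could look like, here is a minimal sketch. It is not the code written for this project: the function name `find_best_substring`, the fixed window length, and the choice of `difflib.SequenceMatcher` from the standard library are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def find_best_substring(corpus: str, highlight: str, step: int = 1) -> tuple[int, int, float]:
    """Return (start, end, score) of the corpus window most similar to highlight.

    Hypothetical helper: slides a window of len(highlight) characters over
    the corpus and keeps the window with the highest similarity ratio.
    """
    if not corpus or not highlight:
        raise ValueError("corpus and highlight must be non-empty")

    window = min(len(highlight), len(corpus))
    # SequenceMatcher caches data about its second sequence, so the
    # highlight is set once and only the first sequence changes per window.
    matcher = SequenceMatcher(autojunk=False)
    matcher.set_seq2(highlight)

    best_start, best_score = 0, -1.0
    for start in range(0, len(corpus) - window + 1, step):
        matcher.set_seq1(corpus[start:start + window])
        score = matcher.ratio()
        if score > best_score:  # strict '>' keeps the earliest best window
            best_start, best_score = start, score
    return best_start, best_start + window, best_score


corpus = "Some intro text. For example, suppose the weight update is small. Some outro text."
start, end, score = find_best_substring(corpus, "suppose the weight updote")
print(corpus[start:end], round(score, 2))
```

Every window costs one comparison, so the cost grows quickly with the length of the corpus, which matches the concern above about very long texts; a larger `step` trades accuracy for speed. Because the window has the same length as the highlight, this sketch will undershoot when the raw text is longer than the rendered highlight (as with MathJax markup), so a more careful version would also try several window lengths.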
Hi,
I'm the dev behind LogseqMarkdownParser and am working on a small script to directly turn highlights into Anki flashcards.
It's not yet working because I'm running into an issue with text formats.
You see, I don't just want the highlight to be sent to Anki: I want to grab the 1000-ish characters before and after the highlight, make a cloze card with the highlight (i.e. put a hole in the text where the highlight was, so you have to guess its content), then send that to Anki.
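To make the goal concrete, here is a rough sketch of the cloze step, assuming the position of the highlight inside the full text is already known. The function name `make_cloze` and its parameters are hypothetical; the `{{c1::...}}` wrapper is Anki's standard cloze deletion syntax.

```python
def make_cloze(text: str, start: int, end: int, context: int = 1000) -> str:
    """Wrap text[start:end] in an Anki cloze, keeping `context` chars on each side.

    Hypothetical helper: start/end are the character offsets of the
    highlight inside the full article text.
    """
    if not 0 <= start < end <= len(text):
        raise ValueError("start/end must delimit a non-empty span inside text")

    before = text[max(0, start - context):start]  # clamp at the start of the text
    after = text[end:end + context]               # slicing clamps at the end
    return before + "{{c1::" + text[start:end] + "}}" + after


article = "LoRA is a finetuning method. The update is low-rank. This saves memory."
hl_start = article.index("The update is low-rank.")
hl_end = hl_start + len("The update is low-rank.")
print(make_cloze(article, hl_start, hl_end, context=20))
```

This only works if the highlight can be located in the text in the first place, which is exactly where the format mismatch described next gets in the way.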
The main issue I have is that for example I have this highlight:
For example, suppose ΔW is the weight update for a weight matrix W∈RA×B.
And the relevant section of text is this:
For example, suppose \\(\\Delta W\\) is the weight update for a weight matrix \\(W \\in \\mathbb{R}^{A \\times B}\\).
I'm guessing this is MathJax.
I can't seem to find a good Python lib to parse MathJax into text, or text into MathJax, let alone one that does it reliably.
So is it possible to get something like {{{rawText}}} for the highlight, that would not be parsed (so would still contain the MathJax)?
Thanks!