Discussion #1

Open
Kishlay-notabot opened this issue Apr 16, 2024 · 5 comments

@Kishlay-notabot

Hey there, Kishlay here.
What kind of features do you want to add to this webpage, and what's the roadmap?
I can help with the development if the domain matches my skill set.

@Kishlay-notabot
Author

Kishlay-notabot commented Apr 16, 2024

Edit: I think the PDF segmentation could be done without AI, but the approach would need to be tailored to the specific PDF format. Will discuss more soon.

@b4apple
Owner

b4apple commented Apr 16, 2024

Hi @Kishlay-notabot, thanks for your interest.

Immediate features:

  1. A 2 x 2 Matrix Match UI element and the subsequent response display.
  2. A split layout: the left 50% of the page shows the question UI, and the right 50% displays the PDF in a PDF viewer element.

If you can implement these two features, that would be extremely helpful! The roadmap beyond this is still in the making, and I shall get back to you in 2-3 days.

@b4apple b4apple self-assigned this Apr 16, 2024
@Kishlay-notabot
Author

My idea is to run OCR on the imported PDFs, then separate each question along with its own set of MCQ radio options (A, B, C, D) and present them in a "gallery", just like NTA exams, where you click Next to get to the next question.
I can work on the OCR part. The algorithm that separates the questions could perhaps be based on the question numbering. I haven't brainstormed much on that side yet, but I think OCR can help.
This would suit students better than a 50/50 split between content and PDF, because with the split you have to scroll both sides manually: the MCQs and the PDF.
Let's see how this goes. I'm just seeding an idea.
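
A rough sketch of the numbering-based split, to make the idea concrete. It assumes the OCR step has already produced plain text, that questions start with a number followed by `.` or `)`, and that options start with A-D. The names `split_questions`, `Question`, `QUESTION_START`, and `OPTION_START` are made up for illustration, and real PDFs will likely need different patterns.

```python
import re
from dataclasses import dataclass, field

# Assumed pattern: a question line starts with a number followed by "." or ")".
QUESTION_START = re.compile(r"^\s*(\d+)[.)]\s+(.*)$")
# Assumed pattern: an option line starts with A-D, e.g. "(A) ...", "A. ...", "A) ...".
OPTION_START = re.compile(r"^\s*\(?([A-D])[.)]\s+(.*)$")


@dataclass
class Question:
    number: int
    text: str
    options: dict[str, str] = field(default_factory=dict)


def split_questions(ocr_text: str) -> list[Question]:
    """Group OCR'd lines into questions with their A-D options."""
    questions: list[Question] = []
    current = None       # question currently being filled
    last_option = None   # label of the option a continuation line belongs to

    for line in ocr_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        q_match = QUESTION_START.match(line)
        o_match = OPTION_START.match(line)
        if q_match:
            current = Question(int(q_match.group(1)), q_match.group(2).strip())
            questions.append(current)
            last_option = None
        elif o_match and current is not None:
            last_option = o_match.group(1)
            current.options[last_option] = o_match.group(2).strip()
        elif current is not None:
            # Wrapped line: append to the option if one is open, else to the question.
            if last_option is None:
                current.text += " " + line.strip()
            else:
                current.options[last_option] += " " + line.strip()
    return questions
```

Each `Question` could then back one page of the gallery, with `options` rendered as the radio buttons. Text before the first numbered line is dropped, and a wrapped line that happens to start with a number would be misread as a new question, so this is only a starting point.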

@b4apple
Owner

b4apple commented Apr 17, 2024

sounds good, can you prototype the OCR part?

@Kishlay-notabot
Author

Running OCR on an imported PDF is dead easy. I've worked a lot with the Tesseract engine, and it can be done easily even in the browser.
But the algorithm that separates each and every question and assigns it to HTML elements will be tricky. That part is very specific to the type of PDFs students generally use. You should collect some samples from the sources they take them from and analyze the physical structure of how the elements are placed. It would be hard to generalize an algorithm to a variety of PDFs.
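
One possible way to start analyzing the samples is to count which numbering style each OCR'd PDF uses before writing any splitting logic. This is only a sketch: the candidate styles in `NUMBERING_STYLES` and the function names are assumptions for illustration, not patterns taken from real sample PDFs.

```python
import re
from collections import Counter
from typing import Optional

# Hypothetical candidate numbering styles; extend after looking at real samples.
NUMBERING_STYLES = {
    "1.": re.compile(r"^\s*\d+\.\s"),
    "1)": re.compile(r"^\s*\d+\)\s"),
    "(1)": re.compile(r"^\s*\(\d+\)\s"),
    "Q1.": re.compile(r"^\s*Q\.?\s*\d+[.):]?\s", re.IGNORECASE),
}


def count_numbering_styles(ocr_text: str) -> Counter:
    """Count how many lines start with each candidate numbering style."""
    counts: Counter = Counter()
    for line in ocr_text.splitlines():
        for name, pattern in NUMBERING_STYLES.items():
            if pattern.match(line):
                counts[name] += 1
                break  # a line is counted for at most one style
    return counts


def dominant_style(ocr_text: str) -> Optional[str]:
    """Return the most frequent numbering style, or None if nothing matched."""
    counts = count_numbering_styles(ocr_text)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Running `dominant_style` over text from each sample source would show whether a single pattern covers most PDFs or whether each source needs its own rules.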
