server: (UI) add syntax highlighting and latex math rendering #10808

Merged
merged 21 commits into from
Dec 15, 2024

Conversation

Contributor

@VJHack VJHack commented Dec 12, 2024

Make sure to read the contributing guidelines before submitting a PR

Fixes #10246 and #10758

The changes in this PR include:

  • Added support for syntax highlighting in the code fragments of the model output

Screenshots:

Syntax highlighting
image

Light theme:
image

Latex rendering:
image

Collaborator

@ngxson ngxson left a comment


This frontend code is simple enough that I don't want to add a formatter or linter; that would be overkill. Just take 2 to 5 minutes to read the code before committing. Is that difficult to do?

This is an open source project, not a school, so I can't spend the time fixing every possible thing. Because you've already done it this way many times before, next time I may not review your PR if it takes me too much time. Hope you understand that...

For now, this PR can be merged once the CI passes

Collaborator

@ngxson ngxson left a comment


Hmm, one problem: the bundle size gets too big after this change, going from 500 kB to more than 2 MB.

I'm temporarily blocking the merge of this PR, but I'll see how we can improve that.

Collaborator

ngxson commented Dec 13, 2024

katex takes 592 KB and highlight.js takes 1.55 MB, and no tree shaking can be done:

image

@ngxson ngxson changed the title server: (UI) add syntax highlighting and math rendering server: (UI) add syntax highlighting Dec 13, 2024
Collaborator

ngxson commented Dec 13, 2024

I removed math rendering (katex) because it is too big. We can add it back in the future once we find a way to reduce the bundle size.

Collaborator

ngxson commented Dec 13, 2024

merging this after #10803 to avoid conflict

Collaborator

ngxson commented Dec 13, 2024

Added back katex with gzip to reduce bundle size, as discussed: #10758 (comment)

Bundle size is now:

dist/index.html  2,266.28 kB │ gzip: 1,204.25 kB
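
The effect of gzip on a bundle like this can be checked with a few lines of code. The sketch below is an illustration only: it is written in Python rather than in the project's build tooling, the `bundle` content is a made-up stand-in for the real `index.html`, and `gzip_deterministic` is a hypothetical helper name. It also pins the timestamp stored in the gzip header, which is one common reason for byte-for-byte differences between otherwise identical builds; whether that was the cause addressed in this PR is an assumption.

```python
import gzip


def gzip_deterministic(data: bytes) -> bytes:
    """Compress data with a fixed header timestamp.

    By default the gzip header stores the current time, so two runs over
    the same input produce different bytes. mtime=0 pins that field.
    """
    return gzip.compress(data, compresslevel=9, mtime=0)


# Hypothetical stand-in for a bundled index.html (not the real bundle).
bundle = b"<script>function highlight(code) { return code; }</script>\n" * 2000
packed = gzip_deterministic(bundle)

print(f"raw:  {len(bundle) / 1000:,.2f} kB")
print(f"gzip: {len(packed) / 1000:,.2f} kB")
```

Compressing the same input twice with this helper yields identical bytes, which keeps a committed `.gz` artifact from changing on every rebuild.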

@ngxson ngxson changed the title server: (UI) add syntax highlighting server: (UI) add syntax highlighting and latex math rendering Dec 13, 2024
Collaborator

slaren commented Dec 13, 2024

I think this is missing some triggers for latex. Using Qwen2.5:

image

Sure! The softmax function can be mathematically formulated as follows:

Given an input vector \(\mathbf{x} = [x_1, x_2, \ldots, x_n]\), the softmax function computes the output vector \(\mathbf{y} = [y_1, y_2, \ldots, y_n]\) where each element \(y_i\) is given by:

\[
y_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}
\]

This ensures that the output vector \(\mathbf{y}\) is a probability distribution, i.e., all elements are non-negative and sum to 1.

Here is the LaTeX representation of the softmax function:

```latex
y_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}
```

For the entire vector \(\mathbf{y}\):

```latex
\mathbf{y} = \text{softmax}(\mathbf{x}) = \left[ \frac{e^{x_1}}{\sum_{j=1}^{n} e^{x_j}}, \frac{e^{x_2}}{\sum_{j=1}^{n} e^{x_j}}, \ldots, \frac{e^{x_n}}{\sum_{j=1}^{n} e^{x_j}} \right]
```

This formulation ensures that each element \(y_i\) is a valid probability, and the sum of all elements in \(\mathbf{y}\) is 1.

Collaborator

ngxson commented Dec 13, 2024

> I think this is missing some triggers for latex. Using Qwen2.5:

Yeah, it depends on the model. It must be trained to output \[ to start the latex (instead of [). Ref: https://github.com/SchneeHertz/markdown-it-katex-gpt

I'll add a switch in the settings for that, so the user can switch between \[ and [
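
One way to handle models that emit `\[ ... \]` and `\( ... \)` is to rewrite those delimiters into the `$$ ... $$` and `$ ... $` forms before the text reaches the math renderer. The following is a minimal sketch of that idea, not the code used in this PR: it is in Python for illustration (the web UI itself is JavaScript), `normalize_math_delimiters` is a hypothetical name, and it makes no attempt to skip code blocks.

```python
import re

# \[ ... \] is display math; DOTALL lets the body span several lines.
DISPLAY = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)
# \( ... \) is inline math and is expected to stay on one line.
INLINE = re.compile(r"\\\((.+?)\\\)")


def normalize_math_delimiters(text: str) -> str:
    """Rewrite backslash-bracket math delimiters into dollar delimiters."""
    # A function is used as the replacement so that backslashes in the
    # matched LaTeX body are copied literally and never treated as escapes.
    text = DISPLAY.sub(lambda m: "$$" + m.group(1) + "$$", text)
    text = INLINE.sub(lambda m: "$" + m.group(1) + "$", text)
    return text


sample = "Given \\(\\mathbf{x}\\):\n\\[\ny_i = \\frac{e^{x_i}}{\\sum_{j=1}^{n} e^{x_j}}\n\\]"
print(normalize_math_delimiters(sample))
```

Plain square brackets inside a formula, such as `[x_1, x_2]`, are left alone because the patterns only match a bracket that is preceded by a backslash.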

Collaborator

slaren commented Dec 13, 2024

Note that some of this output started with \[ or with ```latex, and it also failed to render as latex (see the details block for the raw text).

Collaborator

ngxson commented Dec 13, 2024

For the one that starts with ```latex, it should be an easy fix.

Could you point me to the one that starts with \[? I can't see it in your screenshot.

Collaborator

slaren commented Dec 13, 2024

You should have the full raw text in the details block below the screenshot:
image

Collaborator

ngxson commented Dec 14, 2024

This should support pretty much every case now:

image
This is the formula:
$\frac{e^{x_i}}{\sum_{j=1}^{n}e^{x_j}}$

Given an input vector \(\mathbf{x} = [x_1, x_2, \ldots, x_n]\)

\[
y_i = \frac{e^{x_i}}{\sum_{j=1}^n e^{x_j}}
\]

Code block latex:
```latex
\frac{e^{x_i}}{\sum_{j=1}^{n}e^{x_j}}
```

Test dollar sign: $1234 $4567

Invalid latex syntax: $E = mc^$ and $$E = mc^$$
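
The dollar-sign test case above is the tricky one: a price such as `$1234` must not be mistaken for the start of a formula. A heuristic used by some markdown math plugins is that an opening `$` may not be followed by whitespace, and a closing `$` may not be preceded by whitespace or followed by a digit. The sketch below applies that rule with a regular expression. It is an illustration in Python, `find_inline_math` is a hypothetical name, and whether this PR's renderer uses exactly these rules is an assumption.

```python
import re

# Opening "$": not escaped, not part of "$$", not followed by whitespace.
# Closing "$": not preceded by whitespace, not followed by a digit or "$".
INLINE_DOLLAR = re.compile(r"(?<![\\$])\$(?!\s|\$)([^$\n]+?)(?<!\s)\$(?![\d$])")


def find_inline_math(text: str) -> list:
    """Return the bodies of all inline $...$ spans accepted by the heuristic."""
    return INLINE_DOLLAR.findall(text)


for line in (
    "This is the formula: $\\frac{e^{x_i}}{\\sum_{j=1}^{n}e^{x_j}}$",
    "Test dollar sign: $1234 $4567",
    "Invalid latex syntax: $E = mc^$",
):
    print(repr(line), "->", find_inline_math(line))
```

With these rules the price line yields no math spans, because the second `$` is preceded by a space and so cannot close the first. The invalid formula is still detected as a span; rejecting or flagging bad syntax is left to the math renderer, not to delimiter detection.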

Collaborator

slaren commented Dec 14, 2024

Looking back, I think rendering the latex code block as latex is a mistake; it should just be shown as latex code, with syntax highlighting if available. I asked the model to show me softmax with latex, and that's what it did.

Collaborator

ngxson commented Dec 14, 2024

Yeah, that makes sense, as a user may ask the chatbot "how to write latex", for example. The code should be shown as-is in a code block in this case.

image

Also, in the future, we can have a switch to enable/disable rendering latex in code blocks if someone needs that.
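
Keeping fenced code blocks untouched while still processing math in the surrounding prose can be sketched as a split-and-rejoin step. The example below is a hedged illustration in Python, not the PR's implementation (a markdown parser would normally expose fences as separate tokens). `transform_outside_fences` and `to_dollars` are hypothetical names, and the fence marker is built from single backticks only so the example can itself sit inside a fenced block.

```python
import re

TICKS = "`" * 3  # the three-backtick fence marker

# A fenced block: a line starting with the marker, up to and including
# the next line that starts with the marker.
FENCE = re.compile(rf"(^{TICKS}.*?^{TICKS}[^\n]*$)", re.DOTALL | re.MULTILINE)


def transform_outside_fences(text: str, transform) -> str:
    """Apply transform to prose only, leaving fenced code blocks as-is."""
    parts = FENCE.split(text)
    # With one capturing group, re.split puts fenced blocks at odd indices.
    return "".join(p if i % 2 else transform(p) for i, p in enumerate(parts))


def to_dollars(segment: str) -> str:
    """Toy transform for the demo: rewrite \\( and \\) as $."""
    return segment.replace("\\(", "$").replace("\\)", "$")


text = "Math: \\(x\\)\n" + TICKS + "latex\n\\(x\\)\n" + TICKS + "\nDone \\(y\\)"
print(transform_outside_fences(text, to_dollars))
```

An unclosed fence is not matched by this pattern, so in that case the whole text is treated as prose; a real renderer would need to decide how to handle partially streamed output.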

Collaborator

ngxson commented Dec 15, 2024

@ggerganov I'm gonna merge this PR now. This will produce a conflict when you merge your PR.

To resolve it, you just need to delete examples/server/public/index.html (because it's now a .gz file) and re-run npm run build, which will regenerate index.html.gz

@ngxson ngxson merged commit 5478bbc into ggerganov:master Dec 15, 2024
48 checks passed
netrunnereve pushed a commit to netrunnereve/llama.cpp that referenced this pull request Dec 16, 2024
…nov#10808)

* add code highlighting and math formatting

* code cleanup

* build public/index.html

* rebuild public/index.html

* fixed coding style

* fixed coding style

* style fixes

* highlight: smaller bundle size, fix light & dark theme

* remove katex

* add bundle size check

* add more languages

* add php

* reuse some langs

* use gzip

* Revert "remove katex"

This reverts commit c0e5046.

* use better maintained @vscode/markdown-it-katex

* fix gzip non deterministic

* ability to add a demo conversation for dev

* fix latex rendering

* add comment

* latex codeblock as code

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
Development

Successfully merging this pull request may close these issues.

web UI : support syntax highlighting
3 participants