New Sampler #3068

huangzhengxiang · 2024-10-31T03:31:19Z

Change

jxt1234 · 2024-10-31T08:56:50Z

source/backend/cpu/CPUAttention.cpp

@@ -25,7 +25,7 @@
 #endif

 // reduce the value of 'query' to 'query * FP16_QSCALE', avoid fp16 overflow
-#define FP16_QSCALE 0.5
+#define FP16_QSCALE 0.25


Will this affect computational precision?

Judging from the test results, no. It has been tested on Llama3.2, Qwen2.5, and Qwen2-VL.
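To make the overflow concern concrete, here is a minimal sketch of the idea behind `FP16_QSCALE`, not the actual `CPUAttention.cpp` code. It models a single query-key dot product in plain `float` and checks the result against the largest finite fp16 value (65504). The helper names `scaledDot`, `fitsInFp16`, and `restoreScore` are hypothetical, and the assumption that the scale is undone later in higher precision is mine; the diff only states that the query is multiplied by the constant. Since 0.5 and 0.25 are powers of two, the scaling changes only the exponent of each value, so no mantissa bits are lost unless a value underflows.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest finite value representable in IEEE 754 half precision.
constexpr float kFp16Max = 65504.0f;

// Hypothetical helper: dot product of a pre-scaled query with a key.
// Computed in float here; it only models the magnitude an fp16
// accumulator would have to hold.
float scaledDot(const std::vector<float>& query,
                const std::vector<float>& key,
                float qScale) {
    assert(query.size() == key.size());
    float acc = 0.0f;
    for (std::size_t i = 0; i < query.size(); ++i) {
        acc += (query[i] * qScale) * key[i];
    }
    return acc;
}

// True if the value lies within the finite fp16 range.
bool fitsInFp16(float value) {
    return std::fabs(value) <= kFp16Max;
}

// Assumption: the scale is divided back out in higher precision.
float restoreScore(float scaledScore, float qScale) {
    return scaledScore / qScale;
}
```

With a head dimension of 128 and every element equal to 40, the unscaled dot product is 204800. A scale of 0.5 gives 102400, which is still out of range, while 0.25 gives 51200, which fits.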

jxt1234 · 2024-10-31T08:58:52Z

transformers/llm/engine/include/llmconfig.hpp

+    // }
+
+    // std::string prompt_template() const {
+    //     return llm_config_.value("prompt_template", "");


Remove the dead code.

jxt1234 · 2024-10-31T09:00:31Z

include/MNN/expr/NeuralNetWorkOp.hpp

@@ -58,6 +58,7 @@ MNN_PUBLIC VARP _Relu(VARP x, float slope = 0.0f);
 MNN_PUBLIC VARP _Relu6(VARP x, float minValue = 0.0f, float maxValue = 6.0f);
 MNN_PUBLIC VARP _PRelu(VARP x, std::vector<float> &&slopes);
 MNN_PUBLIC VARP _Softmax(VARP logits, int axis = -1);
+MNN_PUBLIC VARP _TempratureSoftmax(VARP logits, float temperature, int axis = -1);


Adding it here increases the size of MNN Express. If only the LLM uses it, move it into the llm module.
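For reference, temperature softmax divides the logits by a temperature before normalizing. The sketch below is a standalone illustration on `std::vector<float>`, not the MNN implementation: the function name `temperatureSoftmax` and its signature are hypothetical, and it ignores the `axis` parameter of the declaration in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the MNN API: softmax over logits / temperature.
std::vector<float> temperatureSoftmax(const std::vector<float>& logits,
                                      float temperature) {
    assert(!logits.empty());
    assert(temperature > 0.0f);
    // Subtract the maximum logit first so exp() cannot overflow.
    const float maxLogit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp((logits[i] - maxLogit) / temperature);
        sum += probs[i];
    }
    // Normalize so the probabilities sum to 1.
    for (float& p : probs) {
        p /= sum;
    }
    return probs;
}
```

A temperature above 1 flattens the distribution and a temperature below 1 sharpens it. A temperature of exactly 1 reduces to the ordinary softmax.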

jxt1234 · 2024-10-31T09:12:53Z

TODO.md

@@ -0,0 +1,59 @@
+## Change Log
+- [x] implement an independent `Sampler` Module.


Please move this file into transformer/llm.

huangzhengxiang · 2024-11-26T10:44:50Z

pull request has been revised to merge MNN-3.0.0

huangzhengxiang added 5 commits October 29, 2024 19:32

first commit for Sampler

93656a3

resolve conflicts

656dc18

update docs and remove penalize_ngram with penalty

beea99f

fix the bug of reset

06736c3

add android demo

215a21c

wangzhaode self-assigned this Oct 31, 2024

wangzhaode requested review from jxt1234 and wangzhaode October 31, 2024 08:38

jxt1234 reviewed Oct 31, 2024

Delete docs/transformers/optimizations.md

f8d0ddc

jxt1234 reviewed Oct 31, 2024

huangzhengxiang added 10 commits October 31, 2024 18:44

remove some commented lines:

69e1187

move _TemperatureSoftmax to sampler

17329be

add android demo

f960a9c

debug android

0be9c8e

refactor llm project to retain the previous interfaces for backward compatibility

f3dcb29

merge future code for perplexity

a16fc02

add perplexity and llm dataset processing, supports wikitext and shareGPT

028f09a

update time performance experiments and visualization modules

a6c6298

not runnable yet

5079f7e

merge MNN-3.0.0

74d63e1

huangzhengxiang added 3 commits December 2, 2024 22:35

merge MNN-3.0.1

9f116b8

merge MNN-3.0.1

f09908e

fix mps, onnx-slim bugs

3612945
