Sea AI Lab

73 repositories

oat
Public
🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
thompson-sampling alignment reasoning distributed-training ppo dueling-bandits dpo distributed-rl llm online-rl
Python
•
Apache License 2.0
•11•167•2•0•Updated Feb 8, 2025
zero-bubble-pipeline-parallelism
Public
Zero Bubble Pipeline Parallelism
Python
•
Other
•2.5k•334•19•0•Updated Feb 7, 2025
oat-zero
Public
A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
Python
•
MIT License
•2•97•1•0•Updated Feb 6, 2025
Rigging-ChatbotArena
Public
Improving Your Model Ranking on Chatbot Arena by Vote Rigging
Python
•0•16•0•0•Updated Jan 31, 2025
autofd
Public
Automatic Functional Differentiation in JAX
automatic-differentiation jax neural-operator variational-calculus
Python
•
Apache License 2.0
•1•63•6•0•Updated Jan 17, 2025
I-FSJ
Public
Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024)
Jupyter Notebook
•
MIT License
•9•53•1•0•Updated Jan 11, 2025
InfNeRF
Public
InfNeRF: Towards Infinite Scale NeRF Rendering with O(log n) Space Complexity
Python
•
Apache License 2.0
•1•5•1•0•Updated Jan 7, 2025
sailcraft
Public
🚢 Data Toolkit for Sailor Language Models
data-deduplication data-cleaning
Python
•8•85•0•0•Updated Dec 21, 2024
sailor-llm
Public
[EMNLP-2024] ⚓️ Sailor: Open Language Models for South-East Asia
indonesia thai language-model sea vietnam lao malay
Python
•
MIT License
•10•125•0•0•Updated Dec 21, 2024
sailor2
Public
🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
vietnamese indonesia thai tamil tagalog cebuano language-model burmese khmer lao
2•31•0•0•Updated Dec 16, 2024
Meta-Unlearning
Public
Python
•1•22•1•0•Updated Dec 8, 2024
closer-look-LLM-unlearning
Public
The official code of the paper "A Closer Look at Machine Unlearning for Large Language Models".
Python
•5•21•0•0•Updated Dec 4, 2024
sailcompass
Public
🧭 SailCompass: Towards Reproducible and Robust Evaluation for Southeast Asian Languages
Python
•0•12•0•0•Updated Dec 3, 2024
inceptionnext
Public
InceptionNeXt: When Inception Meets ConvNeXt (CVPR 2024)
convolutional-neural-networks
Python
•
Apache License 2.0
•22•275•14•1•Updated Dec 2, 2024
stde
Public
Official implementation of the Stochastic Taylor Derivative Estimator (STDE) (NeurIPS 2024)
Python
•5•98•2•0•Updated Nov 27, 2024
optim4rl
Public
Optim4RL is a JAX framework for learning to optimize for reinforcement learning.
reinforcement-learning optimization optimizer reinforcement-learning-algorithms optimization-algorithms meta-learning jax learning-to-learn optimizers meta-learning-algorithms
Python
•
Apache License 2.0
•2•24•0•0•Updated Nov 27, 2024
VocabularyParallelism
Public
Vocabulary Parallelism
Python
•
Other
•2.5k•16•0•0•Updated Nov 11, 2024
sdft
Public
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
language-model self-distillation supervised-finetuning
Shell
•4•107•4•0•Updated Nov 2, 2024
Cheating-LLM-Benchmarks
Public
[SafeGenAi @ NeurIPS 2024] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
Jupyter Notebook
•
MIT License
•0•68•0•0•Updated Oct 23, 2024
P-DoS
Public
[ArXiv 2024] Denial-of-Service Poisoning Attacks on Large Language Models
Python
•2•16•0•0•Updated Oct 22, 2024
CPO
Public
[NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs".
Python
•4•92•3•1•Updated Oct 18, 2024
SimLayerKV
Public
The official implementation of the paper "SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction".
Python
•0•43•3•0•Updated Oct 18, 2024
Attention-Sink
Public
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View
language-model attention-mechanism large-language-models attention-sink
Python
•
MIT License
•1•48•0•0•Updated Oct 17, 2024
regmix
Public
[ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
Jupyter Notebook
•
MIT License
•5•101•0•0•Updated Oct 3, 2024
scaling-with-vocab
Public
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
Python
•5•76•1•0•Updated Sep 26, 2024
envpool
Public
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
robotics gym high-performance-computing cpp17 box2d vizdoom parallel-processing threadpool pybind11 atari-games
C++
•
Apache License 2.0
•104•1.1k•63•11•Updated Aug 12, 2024
dice
Public
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
alignment preference-learning large-language-models rlhf
Python
•
MIT License
•3•42•0•0•Updated Jul 29, 2024
lorahub
Public
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Python
•
MIT License
•36•614•3•1•Updated Jul 22, 2024
Adan
Public
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
deep-learning optimizer pytorch artificial-intelligence moe resnet vit diffusion mae fairseq
Python
•
Apache License 2.0
•63•774•3•0•Updated Jul 2, 2024
zero-bubble-megatron-deepspeed
Public archive
Zero Bubble Pipeline Parallelism implemented on Megatron-DeepSpeed
Python
•
Other
•2.5k•3•0•0•Updated Jun 27, 2024