Date | Paper | Key Words |
---|---|---|
2023.4.17 | Visual Instruction Tuning | LLaVA |
2024.3.8 | DeepSeek-VL: Towards Real-World Vision-Language Understanding | DeepSeek-VL: Dense & VLM |
2024.7.10 | PaliGemma: A versatile 3B VLM for transfer | Google small VLM: PaliGemma |
2024.10.8 | Aria: An Open Multimodal Native Mixture-of-Experts Model | Open multimodal-native MoE VLM: Aria |
2024.12.6 | Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | VLM: InternVL 2.5 |
2024.12.13 | DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | DeepSeek-VL2: MoE & VLM |
2024.12.27 | DeepSeek-V3 Technical Report | DeepSeek-V3 Technical Report |

Date | Paper | Key Words |
---|---|---|
2021.1.11 | Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | Mixture of Experts (MoE) |
2024.1.11 | DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | DeepSeekMoE |

Date | Paper | Key Words |
---|---|---|
2020.10.22 | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Vision Transformer (ViT) |

Date | Paper | Key Words |
---|---|---|
2019.2.14 | Language Models are Unsupervised Multitask Learners | GPT-2 |
2019.10.2 | DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Distilled BERT & knowledge distillation |
2019.10.23 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | Unified Text-to-Text Transformer & T5 (Encoder-Decoder) |
2020.5.22 | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | Retrieval-Augmented Generation (RAG) |
2020.5.28 | Language Models are Few-Shot Learners | GPT-3 |
2021.4.20 | RoFormer: Enhanced Transformer with Rotary Position Embedding | RoPE |

Date | Paper | Key Words |
---|---|---|
1972 | Reducibility Among Combinatorial Problems | Karp's 21 NP-complete problems |
1973 | An n^{5/2} algorithm for maximum matchings in bipartite graphs | Hopcroft-Karp Algorithm |
2002 | A 27/26-Approximation Algorithm for the Chromatic Sum Coloring of Bipartite Graphs | Chromatic Sum Coloring of Bipartite Graphs |
2015.6.16 | An Efficient Data Structure for Processing Palindromes in Strings | Palindromic Tree |
2017.8.11 | An Introduction to Quantum Computing, Without the Physics | Quantum Computing, Without the Physics |
2018.7.30 | A Simple Near-Linear Pseudopolynomial Time Randomized Algorithm for Subset Sum | Subset Sum (randomized, near-linear pseudopolynomial time) |
2021.2.11 | Hybrid Neural Fusion for Full-frame Video Stabilization | Video Stabilization Algorithm |
2022.11.21 | The Berlekamp-Massey Algorithm revisited | Berlekamp-Massey Algorithm |

Date | Paper | Key Words |
---|---|---|