Pull requests: swiss-ai/nanotron


Pull requests list

Continued pretraining example
#23 opened Dec 4, 2024 by wendlerc
Validation step3
#21 opened Nov 19, 2024 by kylematoba
Adding SFT training
#14 opened Jul 30, 2024 by TJ-Solergibert
2 of 7 tasks
FA3 Tracking
#11 opened Jul 12, 2024 by TJ-Solergibert
MoEs in src/ and proper load balancing losses
#8 opened Jul 3, 2024 by haeggee