Feat/sort FSRSItem by length to speed up training #32
Conversation
Is the change I pushed what you had in mind? For a given review length, the items should now appear in a random order. |
Its effect on training is nuanced, because the order of items within the same batch doesn't matter. I plan to implement a shuffle at the batch level. https://github.com/burn-rs/burn/blob/main/burn-core/src/data/dataloader/strategy.rs |
So you want a batch of size 3, then a batch of size 10, then a batch of size 2, etc? |
I want a batch with seq_len = 3, then a batch with seq_len = 10, then a batch with seq_len = 2. |
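The idea discussed above can be sketched in plain Rust: group items by sequence length, cut each group into batches, then shuffle the order of the batches so every batch keeps a uniform seq_len while training still sees batches in a random order. This is a minimal illustration, not the actual burn dataloader; the `Item` type and the LCG-based shuffle are assumptions made for the sketch.

```rust
use std::collections::BTreeMap;

// A toy "item": a review history whose length is its sequence length.
type Item = Vec<u32>;

/// Group items by length, form fixed-size batches inside each group,
/// then shuffle the *order of batches* (not the items within a batch).
fn length_bucketed_batches(items: Vec<Item>, batch_size: usize, seed: u64) -> Vec<Vec<Item>> {
    // BTreeMap keeps the grouping itself deterministic.
    let mut buckets: BTreeMap<usize, Vec<Item>> = BTreeMap::new();
    for item in items {
        buckets.entry(item.len()).or_default().push(item);
    }
    let mut batches: Vec<Vec<Item>> = Vec::new();
    for (_len, group) in buckets {
        for chunk in group.chunks(batch_size) {
            batches.push(chunk.to_vec());
        }
    }
    // Batch-level Fisher-Yates shuffle with a tiny seeded LCG
    // (a stand-in for a real RNG, so the sketch has no dependencies).
    let mut state = seed;
    for i in (1..batches.len()).rev() {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let j = (state >> 33) as usize % (i + 1);
        batches.swap(i, j);
    }
    batches
}

fn main() {
    let items: Vec<Item> = vec![vec![1], vec![1, 2], vec![1], vec![1, 2, 3], vec![1, 2]];
    let batches = length_bucketed_batches(items, 2, 42);
    // Every batch contains items of a single length, so no padding is wasted.
    for b in &batches {
        assert!(b.iter().all(|it| it.len() == b[0].len()));
    }
    println!("{} batches", batches.len());
}
```

Re-running the shuffle with a fresh seed before each epoch would give the per-epoch randomness discussed later in this thread.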
Force-pushed from 0a1f15a to d99d925
Ok, how about now? A test is currently failing due to #31, as I rebased this PR on the main branch. |
I have seen your code, but I think we should write a new dataloader rather than manipulate the dataset before building the dataloader. I'm coding it now. |
The shuffle only applies once in your implementation (if I'm wrong, please feel free to correct me). Ideally, the shuffle should apply before each epoch of training. |
Ok, I see. Hopefully the code will be useful as a reference anyway. :-) |
@dae, I have a weird observation from my coding: the training results always vary from run to run, even when I remove the shuffle. Then I figured it out: it's caused by HashMap's non-deterministic iteration order. Then I implemented a fix for it. |
Wait for tracel-ai/burn#703 |
In most languages, HashMaps don't retain the order items are put into them - Python's dictionaries are a bit special in that regard. I think we can do this a bit more efficiently by sorting on the SQL end and then using .group_by() - I've pushed an update and some other tidyups. |
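The point about HashMap ordering can be shown with plain std types: grouping through a HashMap leaves key iteration order unspecified, while sorting upstream (on the SQL end) and then grouping sequentially, as itertools' `.group_by()` does, is deterministic. This is a sketch with made-up data, not the PR's actual code.

```rust
use std::collections::HashMap;

fn main() {
    // (card_id, review) rows, already sorted by card_id "on the SQL end".
    let rows = vec![(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e")];

    // HashMap grouping: key iteration order is unspecified, so two runs
    // can emit the groups in different orders -> non-reproducible training.
    let mut by_id: HashMap<i64, Vec<&str>> = HashMap::new();
    for (id, r) in &rows {
        by_id.entry(*id).or_default().push(r);
    }

    // Sorted input + sequential grouping: deterministic, no hashing needed.
    // This is what itertools-style group_by does over a sorted iterator.
    let mut groups: Vec<(i64, Vec<&str>)> = Vec::new();
    for (id, r) in rows {
        match groups.last_mut() {
            Some((last_id, revs)) if *last_id == id => revs.push(r),
            _ => groups.push((id, vec![r])),
        }
    }
    assert_eq!(groups.len(), 3);
    assert_eq!(groups[0], (1, vec!["a", "b"]));
    println!("{:?}", groups);
}
```

Sequential grouping also avoids building the whole map in memory before batching, which is why sorting in SQL first is the cheaper path.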
Nice work on the batch shuffling! And thanks @nathanielsimard for the upstream fix. |
Would you like me to rebase this or would you like to? |
Would you mind doing it? Thanks. I'm out right now. |
This allows easily running all tests in one file at once, and lets tests share code that is not used in production. Also removed the duplicate test_next_stability/difficulty tests.
Co-authored-by: Asuka Minato <[email protected]>
Force-pushed from f6e2c90 to d6c71f6
If you guys are happy with the current state of this, I'd suggest we merge it in, even though the PR mentioned above hasn't landed yet. |
Should we update the dependencies of burn in Cargo.toml before merging? |
Ignore my last comment. I confused this PR with another PR. |
Note: this code removes the shuffling of items, which would break stochastic gradient descent. I wonder whether it is possible to shuffle at the level of batches rather than items.