Support non-identical file names between .wav and .eaf, and recognise media offsets #215
I think a big part of the original tickets was a UI feature that highlights, in particular, when an audio file or an .eaf file is uploaded without its counterpart. The easiest way I can envisage to accomplish this is to align the audio files horizontally in the UI with their transcriptions, which would make it obvious when a pair is missing either component (you could also highlight rows with a missing file, or something to that effect). The unfortunate downside is that you'll have to replicate the verification on the front end (this might help: https://www.npmjs.com/package/elan-parser).
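A minimal sketch of the row-pairing idea, written in Python only to show the logic (the real check would live in the front end). The names `pair_uploads` and `incomplete_rows` are hypothetical, and pairing by identical file stem is a simplifying assumption; it does not cover the `RELATIVE_MEDIA_URL` fallback this PR adds.

```python
from pathlib import PurePath
from typing import Dict, Iterable, List, Optional

Row = Dict[str, Optional[str]]


def pair_uploads(names: Iterable[str]) -> Dict[str, Row]:
    """Group uploaded file names into rows keyed by stem (hypothetical helper)."""
    rows: Dict[str, Row] = {}
    for name in names:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix in (".wav", ".eaf"):
            # Each row has one slot for the audio and one for the transcription.
            row = rows.setdefault(path.stem, {".wav": None, ".eaf": None})
            row[suffix] = name
    return rows


def incomplete_rows(rows: Dict[str, Row]) -> List[str]:
    """Return the stems of rows that are missing either the .wav or the .eaf."""
    return sorted(stem for stem, row in rows.items() if None in row.values())
```

A UI could render one row per key of `pair_uploads(...)` and highlight every stem returned by `incomplete_rows(...)`.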
Ah okay, this might be a bit more work to do on the uploading side of things, as it won't just be a file drop anymore. But I can look into it 👌
This reverts commit 09247da.
Resolves #191, #193.
This implementation doesn't give the user any choice as to whether to match the file name of the corresponding `.eaf` file or to just get it from `RELATIVE_MEDIA_URL`. It defaults to the former behaviour and falls back to the latter.
This implementation also ignores `MEDIA_URL`, as it is difficult to wrestle it (e.g. `"file:///Users/bbb/Desktop/abui/abui-audio-1.wav"`) into a format that the rest of the application will be able to handle easily. In other words, it assumes that `RELATIVE_MEDIA_URL` is well formed.
This also fixes any `line = wer_lines[0] IndexError: list index out of range` errors that may have been happening before, although please double check they are actually fixed.
Offsets are directly `int()`-ed from the `.eaf` file.
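
The lookup order described above can be sketched roughly as follows. This is not the code in this PR: `resolve_media` is a hypothetical name, and reading the offset from a `TIME_ORIGIN` attribute on `MEDIA_DESCRIPTOR` is an assumption about where the `.eaf` file stores it.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePath, PurePosixPath
from typing import Optional, Set, Tuple


def resolve_media(eaf_name: str, eaf_xml: str, uploaded: Set[str]) -> Tuple[Optional[str], int]:
    """Return (matching .wav name or None, offset in milliseconds)."""
    root = ET.fromstring(eaf_xml)
    descriptor = root.find("HEADER/MEDIA_DESCRIPTOR")

    # Assumption: the offset lives in TIME_ORIGIN and is converted with int().
    offset = 0
    if descriptor is not None:
        offset = int(descriptor.get("TIME_ORIGIN", "0"))

    # 1. Default: a .wav whose name matches the .eaf file name.
    same_name = PurePath(eaf_name).with_suffix(".wav").name
    if same_name in uploaded:
        return same_name, offset

    # 2. Fallback: the file named by RELATIVE_MEDIA_URL (MEDIA_URL is ignored).
    if descriptor is not None:
        relative = descriptor.get("RELATIVE_MEDIA_URL")
        if relative:
            candidate = PurePosixPath(relative).name
            if candidate in uploaded:
                return candidate, offset

    return None, offset
```

Returning `None` for the file name leaves it to the caller to decide how to report an `.eaf` with no matching audio.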