forked from kachok/hitman
epavlick/hitman
Web - template and web app for creating and running Mechanical Turk HITs as External Questions
Backend - command line tools for interacting with MTurk API and creating HITs in MTurk

RUN ONCE:

load_data_to_db.py
    populates languages table in database
    loads data from wikilanguages-pipeline into database, to vocabulary and dictionary tables respectively

prepare_images_files.py
    generates image descriptor files for all vocabulary and dictionary items

    java -Djava.awt.headless=true generateimages sentences.txt segments.txt images
    #java -Djava.awt.headless=true generateimages test_sentences.txt test_segments.txt ../test_images
    #java -Djava.awt.headless=true generateimages pilot_sentences.txt pilot_segments.txt ../pilot_images
    renders PNG files
    10MB of images per 1000 words+sentences

obama4australia.py
    generates images for Barack Obama and Australia in all languages
    #java -Djava.awt.headless=true Str2ImgObama support_sentences.txt support_sequence.txt ../pilot_images/support

generate_voc_hits.py
    creates hittypes in MTurk and populates hittypes table in database
    generates batches of vocabulary HITs in database based on vocabulary data
    populates hits table and voc_hits_data tables in database

add_voc_hits_to_mturk.py
    generates HITs in MTurk and references them to batches of words in database
    creates HITs in MTurk for every HIT in database with empty mturk_hit_id column
    updates hits table with non-empty mturk_hit_id value

generate_synonyms_table.py
    generates list of synonyms to be used in Synonyms HIT as controls

generate_non_synonyms_table.py
    generates list of non-synonyms to be used in Synonyms HIT as controls

RUN CONTINUOUSLY:

get_assignments.py
    retrieves all completed assignments from all HITs and loads results into proper hits_results tables
    (works both for vocabulary and synonyms HITs)

generate_syn_hits.py
    takes processed results from Voc HIT (from vochitsresults table) and populates syn hits and syn hits data table
    task can be run multiple times without creating duplicates

add_syn_hits_to_mturk.py
    generates HITs in MTurk and references them to batches in database
    creates HITs in MTurk for every HIT in database with empty mturk_hit_id column
    updates hits table with non-empty mturk_hit_id value

get_assignments.py -- run again to get results of new syn hits

revew_step2.py
    reviews results of step2 (synonyms HITs)
    grade syn hits results (based on controls)
    select all assignments pending update in MTurk
    for each pending assignment:
        grade assignments one-by-one
        update status in MTurk and create extra assignment for related hit if assignment's data was bad (data_quality<=50%)
        close assignment in database

review_step1.py
    select all syn hits that have 3 assignments completed with data_status>50% each

        CREATE OR REPLACE VIEW syn_hits_completed AS
        -- select all syn HITs that have 3 assignments with good quality of data. Those HITs are considered completed
        -- and data can be pushed to Step 1 to Vocabulary HITs/Assignments

    for each hit get 3 assignments above (syn_hits_completed x assignments (by hit_id))
    weight results for each word translation and multiply it by performance of worker

        -- weight+result per hit per assignment
        SYN_HITS_RESULTS_WIGHTED_PAIRS

        select voc_assignment_id, pair_id, are_synonyms, max(weight)
        from (
            select voc_assignment_id, pair_id, are_synonyms, avg(worker_performance) as weight
            from (
                select shd.voc_assignment_id, a.hit_id, a.id as assignment_id, shr.quality, shr.pair_id,
                       shwp.quality as worker_performance, shr.are_synonyms
                from assignments a, syn_hits_completed shc, syn_hits_results shr, syn_hits_data shd,
                     syn_hits_workers_performance shwp
                where shd.hit_id=a.hit_id
                  and shd.id=shr.pair_id
                  and a.hit_id=shc.id
                  and shr.assignment_id=a.id
                  and shr.is_control=0 -- remove controls (don't need them)
                  and a.worker_id=shwp.id
                  and a.data_status>0.5 -- cut off nonperforming assignments
            ) as r
            group by voc_assignment_id, pair_id, are_synonyms
        ) as rr
        group by voc_assignment_id, pair_id, are_synonyms

    sort result by weight?
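
The weighting step above can be sketched in Python. This is a minimal illustration, not the repository's code: the function names `weigh_pairs` and `pick_answers` and the tuple layout of the input rows are hypothetical. The first function mirrors the query's `avg(worker_performance)` grouped by `voc_assignment_id, pair_id, are_synonyms`. The second function, which keeps the highest-weighted answer per pair, is only one possible reading of the open "sort result by weight?" note.

```python
from collections import defaultdict


def weigh_pairs(rows):
    """Average worker performance per (voc_assignment_id, pair_id, are_synonyms).

    rows: iterable of (voc_assignment_id, pair_id, are_synonyms, worker_performance)
    tuples, assumed to be already filtered (no controls, data_status > 0.5).
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [sum of performance, count]
    for voc_assignment_id, pair_id, are_synonyms, performance in rows:
        acc = sums[(voc_assignment_id, pair_id, are_synonyms)]
        acc[0] += performance
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def pick_answers(weights):
    """ASSUMPTION: keep the answer with the highest weight for each pair."""
    best = {}
    for (voc_assignment_id, pair_id, are_synonyms), weight in weights.items():
        key = (voc_assignment_id, pair_id)
        if key not in best or weight > best[key][1]:
            best[key] = (are_synonyms, weight)
    return best


# Made-up sample rows for illustration only.
rows = [
    (1, 10, "yes", 1.0),
    (1, 10, "yes", 0.5),
    (1, 10, "no", 0.5),
    (1, 11, "no", 0.75),
]
weights = weigh_pairs(rows)
answers = pick_answers(weights)
```

Ties between answers are resolved here by whichever answer is seen first; a real implementation would need an explicit rule for that case.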

-- backtracking database from step2 to step1
select * from hits
select * from assignments
select * from vocabulary
select * from dictionary
select * from voc_hits_data where hit_id in (2,3)
select * from voc_hits_results where assignment_id=3
select * from voc_hits_results where assignment_id=3 and is_control=0
select * from assignments where id in (11,12)
select * from syn_hits -- hit_id 41/42
select * from syn_hits_data where id in (11,12) -- hit_id=42
select * from syn_hits_data where is_control=0
select * from syn_hits_results

-- voc assignments that are completed
select * from assignments where id in (8,4) -- hit_id 36/29 for assg_id 8/4
select * from voc_hits_data where id in (36,29)
select * from syn_hits sh, assignments a, workers w, syn_hits_data shd
where sh.id=a.hit_id and a.worker_id=w.id and shd.is_control=0 and shd.hit_id=sh.id

-- trace from voc_assignment_id to
select * from assignments a, hits h where a.hit_id=h.id and a.id=6
select * from voc_hits_results
select * from voc_hits_data
select * from assignments
select * from syn_hits_data

*review-step2.py
review results of step2

HOW IT WORKS:

for all assignments:
+review if passed/failed based on control
+mark assignments as passed/failed
+go to MTurk and accept/reject assignments as needed
+add extra assignment for each rejected assignment to corresponding HITs

for all HITs:
+select all HITS if # of accepted assignments=3
-calculate passed/failed translations for each HIT for each word_id

    select *
    from synonymshits h, synonymshitassignments a, synonymshitresults r, synonymshitsdata d
    where a.mturk_hit_id=h.mturk_hit_id and r.assignment_id=a.id and r.pair_id=d.id

    -- select words and 1/0 for correct/incorrect (need to be grouped)
    select word_id, d.assignment_id, case WHEN are_synonyms='no' THEN 0 else 1 end as correct
    from synonymshits h, synonymshitassignments a, synonymshitresults r, synonymshitsdata d
    where a.mturk_hit_id=h.mturk_hit_id and r.assignment_id=a.id and r.pair_id=d.id and d.is_control=0

-push fail/pass status to step 1
do something with all the QA data and pass it to step 1

*review-step1.py
review results of step1

HOW IT WORKS:

for all assignments:
-review if passed/failed based on results pushed from step2
-mark assignments as passed/failed
-go to MTurk and accept/reject assignments as needed
-add extra assignment for each rejected assignment to corresponding HITs

for all HITs:
-select all HITS if # of accepted assignments=3
-calculate passed/failed translations for each HIT for each word_id
-build dictionary
-tada!

---
task estimates

100 langs
10,000 words
1250 HITs per language
125,000 step 1 HITs
375,000 step 1 assignments
375,000*2 synonym pairs = 750,000 pairs
100,000 step 2 HITs? (4 times fewer than step 1)
300,000 assignments for step 2

----
Rules for grading tasks:

First 10 assignments performed by a worker are auto-approved
After that, assignments are rated on this scale:
100%-70% approved, results good
70%-50% approved, results are questionable
50%-0% rejected, results are thrown away

give ? for questionable controls, see for majority vote
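
The grading rules above can be sketched as a small Python function. This is an illustration under stated assumptions, not the repository's implementation: the names `grade_assignment` and `AUTO_APPROVE_COUNT` are hypothetical, and the handling of the boundary values is a guess. Exactly 70% is treated as good, and exactly 50% as rejected, the latter to match the `data_quality<=50%` and `data_status>0.5` conditions used elsewhere in this README.

```python
AUTO_APPROVE_COUNT = 10  # first 10 assignments by a worker are auto-approved


def grade_assignment(quality, completed_before):
    """Return (decision, data_label) for one assignment.

    quality: fraction in [0, 1], e.g. share of controls answered correctly.
    completed_before: number of assignments this worker completed earlier.
    """
    if completed_before < AUTO_APPROVE_COUNT:
        return ("approved", "auto")
    if quality >= 0.7:  # ASSUMPTION: 70% itself counts as good
        return ("approved", "good")
    if quality > 0.5:  # ASSUMPTION: 50% itself is rejected
        return ("approved", "questionable")
    return ("rejected", "discarded")


# A worker's 11th assignment with 60% quality is approved but flagged.
decision, label = grade_assignment(0.6, 10)
```

Returning the decision and the data label separately keeps the MTurk action (accept or reject) apart from the choice of whether the results feed into the dictionary.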