Thanks for your interest in joining the Network Dynamics Lab! As part of the interview process, we ask each applicant to conduct a small data analysis project. Our goal here is to understand more about how you approach research and your familiarity with various techniques and tools. That said, we hope this exercise will also be fun!
Your goal is to perform a classification task on Jeopardy questions (and answers) and share an interesting insight that comes out of it. The dataset of questions was first announced on Reddit. For convenience, we have included it in this GitHub repository.
For example, you might build a classifier for question categories, or one that identifies answers mentioning world leaders. The specific classification task you undertake is totally up to you. What we want to see is:
- a machine learning system (doesn't have to be fancy)
- an analysis of classifier performance
- a critical analysis of what the classifier or its output tells us about the underlying classes you were studying
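To make the three items above concrete, here is a minimal sketch of what "not fancy" could look like: a bag-of-words naive Bayes classifier plus per-class precision and recall, in plain Python. Everything in it is an assumption made for illustration. The function names are hypothetical, the two classes ("leaders" and "geography") are arbitrary, and the clue strings are invented placeholders that are not drawn from the actual dataset. It is not a required or recommended approach.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


def train_naive_bayes(texts, labels):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def per_class_metrics(y_true, y_pred):
    """Compute precision and recall for every class seen in either list."""
    metrics = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        metrics[cls] = {"precision": precision, "recall": recall}
    return metrics


# Invented placeholder clues and labels, NOT taken from the real dataset.
train_texts = [
    "this president signed the treaty",
    "the prime minister led the country",
    "this river flows through egypt",
    "this mountain range crosses europe",
]
train_labels = ["leaders", "leaders", "geography", "geography"]
test_texts = ["this president led the nation", "this river crosses europe"]
test_labels = ["leaders", "geography"]

model = train_naive_bayes(train_texts, train_labels)
predictions = [predict(model, t) for t in test_texts]
report = per_class_metrics(test_labels, predictions)
print(report)
```

Scores on a toy set this small say nothing about real performance. The per-class breakdown is the useful part: which classes get confused with which, and why, is usually where the critical analysis starts.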
In terms of deliverables, please submit the following by email to Derek:
- all your source code
- a Makefile that will run your analysis from start to finish (you may assume that the Jeopardy data file is located in the directory right above your submission directory)
- a short (about a page) discussion that (1) analyzes your classifier performance and (2) reveals something new and interesting about the underlying classes you were studying
Final note: Your submission will be treated with complete confidentiality: it will never be shared with anyone beyond Derek and the NDL PhD students. That said, please do not submit any confidential or sensitive information.
Final, final note: While we hope you will take the project seriously, please don't spend too much time on it. If coding for the project is taking more than 4 hours, that's either a sign that you're being a bit too ambitious or that you need to spend more time building your core skillset (in which case, we'll look forward to your application in future years).