Teko is looking for a highly motivated, hands-on Analytics Engineer. This position is responsible for designing and implementing data analytics pipelines and ETL processes using both distributed and single-node libraries such as Spark, Keras, TensorFlow, scikit-learn, and Pandas. In addition, you will maintain, groom, and model data to best fit the ETL workload. You will collaborate closely with our engineers and must be able to work efficiently across various teams.
- Manage data objects, models, and formats
- Groom, document, and maintain the semantics of data objects
- Write SQL and ML queries and implement ETL processing
- Build analytics pipelines
- Build and deploy analytical dashboards and APIs
- BS in Computer Science or a similar field of study
- Experience with ETL pipelines that transform data from a variety of data sources into a normalized form
- Extensive experience in software development (Python, Scala)
- Experience with database design and SQL/NoSQL datastores
- Experience with API design (e.g., Flask)
- Experience with Cassandra
- Experience with Redis
- Experience with Pandas
- Experience with Spark
- Experience with Jupyter
- Experience with Kubernetes is a plus
- Experience with Agile and Scrum
- Excellent verbal and written communication skills; strong attention to detail
- Passion to automate more and to learn new software tools and technologies
- Natural aptitude for both teaching and learning from others in a collaborative team environment
Download the dataset of New York taxi trips from https://www.kaggle.com/kentonnlp/2014-new-york-city-taxi-trips
Write a notebook with descriptive queries in Spark and/or Pandas and visualize the results (matplotlib, bokeh, holoviews, seaborn). Bonus: expose the results in a web dashboard.
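The sketch below illustrates the kind of starting point we have in mind, using Pandas and matplotlib. It assumes the Kaggle CSV has been downloaded locally under a hypothetical filename (`nyc_taxi_trips_2014.csv`) and that it contains the commonly used 2014 NYC taxi columns (`pickup_datetime`, `trip_distance`, `fare_amount`); adapt the path and column names to the actual file.

```python
# Minimal sketch of a descriptive-analysis notebook cell.
# Assumptions: local file name and column names match the downloaded dataset.
import pandas as pd
import matplotlib.pyplot as plt

# Load only the columns needed; parse pickup timestamps for time-based grouping.
df = pd.read_csv(
    "nyc_taxi_trips_2014.csv",          # hypothetical local path
    usecols=["pickup_datetime", "trip_distance", "fare_amount"],
    parse_dates=["pickup_datetime"],
)

# Descriptive statistics for trip distance and fare.
print(df[["trip_distance", "fare_amount"]].describe())

# Count trips per hour of day.
trips_per_hour = df.groupby(df["pickup_datetime"].dt.hour).size()

# Visualize the hourly distribution.
ax = trips_per_hour.plot(kind="bar")
ax.set_xlabel("Pickup hour")
ax.set_ylabel("Number of trips")
ax.set_title("NYC taxi trips by hour of day (2014)")
plt.tight_layout()
plt.show()
```

The same queries can be expressed in Spark (e.g., with DataFrame `groupBy` aggregations) if you prefer a distributed approach; either is acceptable for the test.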
Provide the repo (GitHub, GitLab, etc.) containing your solution to the hiring test, attach your latest CV, and send your email to [email protected] with the hashtag #datateam in the subject line.