Responsibilities:
- Architect and implement the company's Big Data system using AWS Cloud services (Kinesis, EC2, S3, Auto Scaling, Elastic Load Balancing, CloudWatch).
- Implement data models developed by our learning scientists to gain a deeper understanding of learning in interactive digital spaces, including educational games, simulations, and technology-enhanced assessment items.
- Build and interpret probabilistic models of complex, high-dimensional datasets.
- Write semantic classification algorithms using a combination of languages and tools such as Python, Java, C, C++, Scala, Scalding, R, and Julia.
- Organize, analyze, retrieve, classify, and report on student JSON clickstream and log data streams.
- Design and run production-scale software that uses machine learning and data mining techniques to estimate learning efficacy and identify clusters of learning styles.
- Develop or assist in creating advanced data visualizations (using tools such as D3.js or Famo.us) that transform information into insights. Learning domain experts will use these visualizations to identify how to tackle student misconceptions and to discover opportunities to optimize emergent learning pathways.
- Oversee the entire data processing cycle (ETL), including data collection, extraction, normalization, and cleansing.
- Apply advanced data and parallel programming frameworks (Spark or MapReduce) to deliver real-world solutions. We seek a skilled practitioner who can also advance the state of the art.
- Present to all levels of the organization, including peers, senior management, and external customers.