What are the responsibilities and job description for the AI Engineer, Reinforcement Learning position at Mytra?
About Mytra
We're creating an entirely new way to solve the most ubiquitous problem in industry - moving and storing material. We're applying robotics and distributed software to create a new class of product for this $1T market. We're focused on the supply chain industry first. The industry is in a massive bind with the continued growth of e-commerce, sharp rise in costs, and supply chain disruptions. What has been a sleepy industry for decades is now at the epicenter of sustaining the global economy.
Role Overview
Mytra's AI team seeks pioneering engineers to architect the decision-making core of our distributed robotic intelligence system. As a Reinforcement Learning AI Engineer, you'll push the boundaries of multi-agent coordination by developing novel approaches for real-time adaptive control, hierarchical policy learning, and decision optimization. You'll tackle fascinating challenges like dynamic resource allocation in uncertain environments, multi-robot task scheduling, collective behavior emergence, and real-time policy adaptation with safety-critical constraints. Working at the intersection of reinforcement learning and robotics, you'll implement state-of-the-art algorithms for multi-agent coordination, efficient exploration strategies, and robust policy optimization while collaborating with a multidisciplinary team to bridge the gap between theoretical advances and production-grade systems that reliably operate in real-world industrial environments.
Example Projects
- Design and implement a multi-agent deep RL model for coordinating robot tasks.
- Implement a new algorithm or architecture from a newly published paper.
- Develop a simulation environment that accurately models real-world warehouse dynamics, including variable payload characteristics, battery management, and per-customer success metrics, to enable faster training and validation of RL policies.
- Implement and adapt state-of-the-art RL algorithms (like Proximal Policy Optimization or Soft Actor-Critic) to handle partial observability and high-dimensional state spaces.
- Devise a hierarchical RL system that decomposes complex tasks into manageable sub-tasks, enabling more efficient learning and better generalization across different topologies.
- Build a reward shaping mechanism that balances multiple competing objectives.
- Implement a transfer learning approach to leverage knowledge from simulation training to real-world deployment.
The Ideal Candidate
A Final Note
If this is your dream role, but you can't take on every example project in its totality, we encourage you to apply. We work closely and collaboratively at Mytra - no one takes on projects alone. We seek people who are eager to work together, learn new things, and bring unique perspectives.
The pay range for this role is:
180,000 - 220,000 USD per year (South San Francisco)