What are the responsibilities and job description for the Principal GPU/CUDA Engineer position at AI Startup in Stealth?
Principal CUDA Software Engineer
We are looking for a Principal CUDA Software Engineer to design and optimize GPU-based solutions for demanding AI/ML workloads. This role requires deep expertise in parallel computing, low-level hardware interactions, and high-level software frameworks. You will be responsible for maximizing efficiency and performance in large-scale AI deployments.
This opportunity is with a rapidly growing company dedicated to advancing AI and parallel computing. The leadership team has a strong track record of delivering cutting-edge technologies, with a focus on efficient GPU architectures. The environment prioritizes autonomy, rapid iteration, and direct engineer impact on product development. Here you can make a tangible difference in shaping next-generation AI systems.
You will work closely with hardware engineers to fine-tune performance at the driver and runtime levels. Your focus will be on optimizing CUDA-based components, ensuring seamless AI/ML functionality, and maintaining critical libraries, toolchains, and frameworks. Your expertise in benchmarking, profiling, and analysis will play a key role in delivering high-performance GPU solutions. This is a highly technical role that directly impacts the efficiency and scalability of AI-driven workloads.
What We Can Offer You
- A role at the forefront of next-generation AI and GPU computing
- Competitive salary, equity, and comprehensive benefits
- A collaborative and engineering-focused environment that values technical excellence
- Opportunities for professional development through training and industry conferences
- Direct influence on architectural decisions and product evolution
Key Responsibilities
- Develop and optimize CUDA-based components for AI/ML workloads
- Work closely with hardware engineers to implement low-level performance improvements
- Maintain drivers, runtime libraries, and compiler toolchains for GPU-based AI solutions
- Conduct benchmarking, profiling, and code path analysis to optimize performance
- Provide technical leadership and mentor junior engineers in CUDA development
Relevant keywords: CUDA, GPU computing, parallel processing, AI acceleration, compiler toolchains, runtime libraries, performance optimization, benchmarking, low-level programming.