GPU Research Engineer

Alldus
New York, NY Full Time
POSTED ON 2/8/2025
AVAILABLE BEFORE 5/7/2025

Role Overview

We are looking for a GPU Research Engineer to improve inference performance for large language models (LLMs) by developing and optimizing GPU kernels. This role involves low-level performance tuning, CUDA/Triton programming, and debugging deep learning workloads to maximize throughput and efficiency.

You will collaborate with ML engineers, systems researchers, and hardware teams to push the limits of GPU acceleration for AI workloads.

Responsibilities

  • Develop, optimize, and debug custom GPU kernels using CUDA, Triton, and other low-level performance libraries.
  • Profile and analyze deep learning inference workloads to identify bottlenecks and implement optimizations.
  • Improve memory bandwidth utilization, kernel fusion, tiling strategies, and tensor parallelism for efficient LLM execution.
  • Work closely with ML and infrastructure teams to enhance model execution across different GPU architectures (e.g., NVIDIA H100 and A100, AMD MI300).
  • Research and implement state-of-the-art techniques for reducing latency, improving throughput, and minimizing memory overhead.
  • Contribute to open-source deep learning frameworks or internal acceleration toolkits as needed.

Requirements

  • Strong experience in CUDA, Triton, or OpenCL for GPU programming.
  • Deep understanding of GPU architectures, memory hierarchy, and parallel computing.
  • Experience profiling and debugging GPU workloads using NVIDIA Nsight, cuDNN, TensorRT, or PyTorch/XLA.
  • Solid knowledge of ML frameworks such as PyTorch, JAX, or TensorFlow and their GPU execution models.
  • Familiarity with numerical precision trade-offs (FP16, BF16, INT8 quantization) and mixed-precision computation.
  • Proficiency in C and Python.
  • Prior experience working on inference optimizations for large-scale ML models is a plus.

Nice to Have

  • Experience with compiler optimizations, MLIR, or TVM.
  • Contributions to open-source deep learning libraries related to GPU acceleration.
  • Hands-on experience with distributed inference techniques (tensor / model parallelism).
  • Knowledge of hardware-specific optimizations for TPUs, NPUs, or FPGAs.

Why Join Us?

  • Work on cutting-edge AI infrastructure and shape the future of large-scale LLM inference.
  • Collaborate with world-class researchers and engineers optimizing AI workloads at scale.
  • Access to state-of-the-art hardware, including the latest GPUs and AI accelerators.
  • Competitive compensation, equity, and benefits package.

