What are the responsibilities and job description for the Senior Software Engineer, ML Performance & Systems position at fal?

Join our team at fal, where we are dedicated to pushing the boundaries of model performance for generative media models. You will play a vital role in designing and implementing innovative model serving architectures using our proprietary inference engine, with a clear focus on maximizing throughput while reducing latency and resource consumption.

As a key contributor, you will develop performance monitoring and profiling tools to pinpoint bottlenecks and discover optimization opportunities. You will work closely with our Applied ML team and with our customers at frontier labs in the media space, ensuring their workloads are optimized for our accelerator.

Key Responsibilities :

Drive the advancement of model performance for generative media models at fal.
Architect and implement cutting-edge solutions for model serving on our in-house inference engine, prioritizing throughput, latency, and resource efficiency.
Create tools for performance monitoring and profiling to detect bottlenecks and enhance optimization strategies.
Collaborate closely with our Applied ML team and customers, ensuring they derive maximum benefit from our accelerator solutions.

Requirements :

Robust background in systems programming with a proven track record of identifying and resolving performance bottlenecks.

Extensive knowledge of the latest ML infrastructure, including but not limited to PyTorch, TensorRT, TransformerEngine, and Nsight, with a keen interest in staying updated with developments in these areas.

Strong understanding of underlying hardware (currently Nvidia-based systems) and ability to dive deep into the stack to troubleshoot and optimize, including custom GEMM kernels with CUTLASS for common matrix shapes.

Experience with Triton, or a strong willingness to learn it, along with comparable expertise in lower-level accelerator programming.

Familiarity with multi-dimensional model parallelism, combining methods such as tensor parallelism and context / sequence parallelism.

Understanding of the internals of Ring Attention, FlashAttention-3 (FA3), and FusedMLP implementations.

Compensation :

$180,000 - $500,000, plus equity and a comprehensive benefits package

Location : San Francisco, CA

What we offer at fal :

Engaging and challenging projects.

Emphasis on work-life balance.

Attractive salary and equity options.

Employee-friendly equity terms, including early and extended exercise options.

Opportunity to work in our downtown San Francisco office, with remote options available for exceptional candidates.

Visa sponsorship available to assist with relocation to San Francisco.

Comprehensive health, dental, and vision insurance (US).

Regular team events and offsites.

Generous paid vacation policy of 4 weeks.

Salary : $180,000 - $500,000
