What are the responsibilities and job description for the Researcher, Machine Learning position at Samsung Research America?
Lab Summary:
Samsung Research America is looking for outstanding researchers to join our Emerging Technologies (ET) Group. The ET Group is uniquely positioned at the heart of Samsung Research America’s innovation engine, aimed at advancing Samsung’s product offerings across smartphones, wearables, TVs, and XR devices, and at identifying the next big growth drivers for Samsung across a range of emerging technologies.
Position Summary:
We are looking for highly skilled and motivated researchers/engineers who can contribute to the development of fast LLM inference techniques, AI technology stack optimization, neuro-symbolic AI models, temporal knowledge graphs, and multimodal reasoning.
Position Responsibilities:
- Develop speculative decoding algorithms tailored for LLMs (see the sketch after this list)
- Optimize GPU/TPU utilization to minimize latency in token-generation pipelines
- Analyze bottlenecks in model inference (e.g., memory bandwidth, compute constraints), propose solutions, and benchmark model performance pre- and post-optimization
- Lead integration of multiple projects at the intersection of Cognitive Modeling, Machine Learning, Knowledge Representation and Reasoning
- Collaborate with a multidisciplinary group of researchers across different teams
- Stay ahead of industry trends in LLM acceleration (e.g., dynamic batching, quantization, kernel fusion)
- Publish findings and contribute to open-source projects
- Generate creative solutions (patents) and publish research results at top conferences (papers)
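To give a concrete picture of the first responsibility, here is a minimal sketch of greedy speculative decoding: a small draft model proposes a few tokens, the target model verifies them in a single parallel forward pass, and the longest agreeing prefix is accepted. The `TinyLM` toy modules, vocabulary size, and draft length `k` are placeholders for illustration only; production work would use real decoder-only LLMs, KV caching, and the full rejection-sampling acceptance rule.

```python
import torch
import torch.nn as nn

VOCAB = 100  # toy vocabulary size (placeholder)

class TinyLM(nn.Module):
    """Toy stand-in for a decoder-only LM: (batch, seq) ids -> (batch, seq, vocab) logits."""
    def __init__(self, d_model):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_model)
        self.head = nn.Linear(d_model, VOCAB)

    def forward(self, ids):
        return self.head(self.embed(ids))

@torch.no_grad()
def speculative_decode(target, draft, ids, k=4, max_new_tokens=32):
    """Greedy speculative decoding: the draft proposes k tokens, the target verifies
    them in one parallel pass, the longest agreeing prefix is kept, and the target
    supplies one corrected token per round."""
    prompt_len = ids.shape[1]
    while ids.shape[1] - prompt_len < max_new_tokens:
        # 1. Draft proposes k tokens autoregressively (cheap, sequential).
        draft_ids = ids
        for _ in range(k):
            nxt = draft(draft_ids)[:, -1].argmax(-1, keepdim=True)
            draft_ids = torch.cat([draft_ids, nxt], dim=1)
        proposed = draft_ids[:, ids.shape[1]:]                   # (1, k)

        # 2. Target scores prompt + proposal in a single pass (expensive, parallel).
        logits = target(draft_ids)                               # (1, L + k, V)
        target_pred = logits[:, ids.shape[1] - 1:-1].argmax(-1)  # target's greedy picks

        # 3. Accept the longest prefix where draft and target agree.
        agree = (proposed == target_pred)[0].long()
        n_accept = int(agree.cumprod(0).sum())

        # 4. Append the accepted tokens plus one token chosen by the target.
        correction = logits[:, ids.shape[1] - 1 + n_accept].argmax(-1, keepdim=True)
        ids = torch.cat([ids, proposed[:, :n_accept], correction], dim=1)
    return ids  # may overshoot max_new_tokens by up to k tokens; trim if needed

target_lm, draft_lm = TinyLM(256), TinyLM(32)
prompt = torch.zeros(1, 1, dtype=torch.long)
print(speculative_decode(target_lm, draft_lm, prompt).shape)
```

Because every accepted token is one the target model would have chosen anyway under greedy decoding, the output matches plain target-only decoding while amortizing the expensive model over several tokens per forward pass.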
Required Skills:
- PhD in CS, EE, or a related field, or an equivalent combination of education, training, and experience
- 2 years of post-PhD work experience
- Expertise in knowledge graphs and retrieval-augmented generation (RAG) models
- Experience in contextual intelligence
- Experience with large language models (LLMs), including Transformer architectures, attention mechanisms, decoder-only LLMs, and autoregressive model optimization
- Proficiency in PyTorch, CUDA, and distributed training/inference frameworks (e.g., DeepSpeed, vLLM)
- Hands-on experience profiling and optimizing LLMs on GPUs/TPUs (see the profiling sketch after this list)
- A strong publication record in top-tier AI and NLP conferences is a plus
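For the profiling requirement above, the sketch below shows one hedged way to inspect per-operator time and memory with `torch.profiler`, which helps separate compute-bound kernels from memory-bandwidth-bound ones before and after an optimization pass. The `TransformerEncoder` stand-in and the input shapes are arbitrary placeholders; a real workload would profile the actual decoder-only LLM and its token-generation loop.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Placeholder model: any nn.Module works here; swap in the real decoder-only LLM.
model = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
    num_layers=4,
).eval()
x = torch.randn(8, 128, 512)  # (batch, seq, d_model) dummy activations

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()
    activities.append(ProfilerActivity.CUDA)

# Record operator-level time and memory; rerun after each optimization to compare.
with profile(activities=activities, record_shapes=True, profile_memory=True) as prof:
    with torch.no_grad():
        for _ in range(5):
            model(x)

print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```

The resulting table ranks operators by time and memory, which is typically the starting point for deciding whether dynamic batching, quantization, or kernel fusion is the right lever for a given bottleneck.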
Special Attributes:
- Ability to debug complex, latency-critical systems
- Strong analytical and problem-solving skills, with a keen attention to detail and a passion for pushing the boundaries of AI capabilities
- Excellent written and verbal communication skills, with the ability to present complex concepts and research findings in a clear and concise manner
- Demonstrated ability to work independently as well as collaboratively in a fast-paced research and development environment