What are the responsibilities and job description for the Generative AI Engineer LLM & Cloud position at Infinity Tech Group Inc?
Job Details
Job Title: Generative AI Engineer LLM & Cloud
Location: Weehawken, NJ (Hybrid, 3 days)
Duration: 12 Months
Job Summary:
We seek a skilled and forward-thinking Generative AI Engineer with deep experience in Large Language Models (LLMs) and LLM platforms such as ChatGPT, LLaMA, DeepSeek, and Vertex AI. The ideal candidate will have a solid understanding of prompt engineering, fine-tuning, embeddings, and model deployment, along with expertise in using cloud platforms (AWS, Google Cloud Platform, or Azure) to scale AI solutions.
This role is perfect for someone passionate about cutting-edge AI research and eager to build intelligent systems that shape the future of user interaction, content generation, automation, and more.
Key Responsibilities:
- Design, fine-tune, and optimize LLMs (e.g., OpenAI GPT, LLaMA, DeepSeek, or models served through Vertex AI) for specific business use cases.
- Develop and implement prompt engineering strategies for improved LLM performance.
- Build intelligent applications and APIs using NLP, embeddings, transformers, and vector databases.
- Collaborate with data and software engineering teams to deploy models using cloud-native tools (SageMaker, Vertex AI, Azure ML).
- Manage LLM-based workflows, including inference pipelines, cost optimization, and performance tuning.
- Evaluate open-source and proprietary LLMs to recommend the best fit for business requirements.
- Stay current with the latest developments in AI/LLMs and contribute ideas for innovation.
Required Skills & Qualifications:
- Bachelor's or Master's degree in Computer Science, AI, Machine Learning, or a related field.
- 4 years of experience in Machine Learning/AI with a focus on LLMs or Generative AI.
- Hands-on experience with models such as ChatGPT (OpenAI), LLaMA (Meta), and DeepSeek, or with models from Cohere, Anthropic, or similar providers.
- Solid grasp of NLP, embeddings, transformers, tokenization, and attention mechanisms.
- Strong Python programming skills; experience with frameworks such as LangChain or Hugging Face Transformers.
- Cloud experience: AWS, Google Cloud Platform, or Azure (working with tools like Vertex AI, SageMaker, or Azure OpenAI).
- Familiarity with vector databases (e.g., FAISS, Pinecone, or Weaviate).
- Excellent problem-solving skills and the ability to work independently or in a team setting.
Nice to Have:
- Experience building RAG (Retrieval-Augmented Generation) systems.
- Exposure to MLOps and CI/CD for AI workflows.
- Contributions to open-source AI projects or published research in NLP/LLM domains.