What are the responsibilities and job description for the LLM Model Evaluation Specialist position at Hexaware Technologies?
What Working at Hexaware Offers:
Hexaware is a dynamic and innovative IT organization committed to delivering cutting-edge solutions to our clients worldwide. We pride ourselves on fostering a collaborative and inclusive work environment where every team member is valued and empowered to succeed.
Hexaware provides access to a vast array of tools that enhance, revolutionize, and advance your professional profile. We complete the circle with excellent growth opportunities, chances to collaborate with highly visible customers and work alongside bright minds, and the perfect work-life balance.
With an ever-expanding portfolio of capabilities, we dig deep to identify the source of our motivation. Although technology is at the core of our solutions, it is still people and their passion that fuel Hexaware’s commitment to creating smiles.
“At Hexaware we encourage to challenge oneself to achieve full potential and propel growth. We trust and empower to disrupt the status quo and innovate for a better future. We encourage an open and inspiring culture that fosters learning and brings talented, passionate, and caring people together.”
We are always interested in, and want to support, the professional and personal you. We offer a wide array of programs to help you expand your skills and supercharge your career. We help you discover your passion: the driving force that makes you smile, innovate, create, and make a difference every day.
The Hexaware Advantage: Your Workplace Benefits
· Excellent health benefits with low-cost employee premiums.
· Wide range of voluntary benefits such as legal, identity theft, and critical care coverage.
· Unlimited training and upskilling opportunities through Udemy and Hexavarsity.
Role: LLM Model Evaluation Specialist
Location: McLean, VA
Job Description:
We are seeking a highly motivated and detail-oriented LLM Model Evaluation Specialist to join our team. The ideal candidate will play a critical role in assessing, benchmarking, and improving large language models (LLMs) for our internal review process. This position requires a strong understanding of natural language processing (NLP), machine learning (ML), and the ability to design and execute evaluation frameworks to measure the performance, accuracy, and usability of LLMs.
Key Responsibilities:
- Must have patronus.ai experience.
- Design and implement evaluation frameworks to assess the performance of large language models (LLMs) across various tasks and benchmarks.
- Conduct qualitative and quantitative analyses of LLM outputs, including accuracy, relevance, coherence, and ethical considerations.
- Develop automated testing pipelines to streamline the evaluation process for LLMs.
- Collaborate with cross-functional teams, including data scientists, ML engineers, and product managers, to align evaluation metrics with business goals.
- Identify gaps in LLM performance and provide actionable insights to improve model quality.
- Stay up to date with the latest advancements in NLP, LLMs, and evaluation methodologies.
- Create detailed reports and presentations to communicate evaluation results and recommendations to stakeholders.
- Ensure the ethical use of LLMs by monitoring for bias and fairness issues and for compliance with organizational and industry standards.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, or a related field.
- Strong knowledge of natural language processing (NLP) and machine learning (ML) concepts.
- Hands-on experience with LLMs (e.g., GPT, BERT, LLaMA) and familiarity with their architecture and functionality.
- Proficiency in programming languages such as Python, with experience in libraries like TensorFlow, PyTorch, or Hugging Face Transformers.
- Experience in designing and implementing model evaluation metrics (e.g., BLEU, ROUGE, perplexity).
- Familiarity with ethical considerations in AI, including bias detection and mitigation.
- Strong analytical skills and the ability to process and interpret large datasets.
- Excellent communication skills, both written and verbal, to effectively present findings and recommendations.
Preferred Skills:
- Experience working with APIs for LLMs and fine-tuning pre-trained models.
- Knowledge of A/B testing and user feedback collection for model evaluation.
- Familiarity with prompt engineering and prompt evaluation techniques.
- Experience with data visualization tools to present evaluation results clearly.
If you're passionate about advancing the field of NLP and contributing to the development of cutting-edge LLMs, we encourage you to apply for this exciting opportunity!