
LLM Model Evaluation Specialist

Hexaware Technologies
McLean, VA (Contractor)
POSTED ON 3/25/2025
AVAILABLE BEFORE 5/2/2025

What Working at Hexaware Offers:

Hexaware is a dynamic and innovative IT organization committed to delivering cutting-edge solutions to our clients worldwide. We pride ourselves on fostering a collaborative and inclusive work environment where every team member is valued and empowered to succeed.

Hexaware provides access to a vast array of tools that enhance and advance your professional profile. We complete the circle with excellent growth opportunities, the chance to collaborate with highly visible customers and work alongside bright minds, and a healthy work-life balance.

With an ever-expanding portfolio of capabilities, we look closely at what motivates us. Although technology is at the core of our solutions, it is our people and their passion that fuel Hexaware’s commitment to creating smiles.

“At Hexaware, we encourage you to challenge yourself to achieve your full potential and propel your growth. We trust and empower you to disrupt the status quo and innovate for a better future. We encourage an open and inspiring culture that fosters learning and brings talented, passionate, and caring people together.”

We are always interested in, and want to support, the professional and personal you. We offer a wide array of programs to help you expand your skills and supercharge your career. We help you discover your passion, the driving force that makes you smile, innovate, create, and make a difference every day.


The Hexaware Advantage: Your Workplace Benefits

· Excellent health benefits with low-cost employee premiums.

· Wide range of voluntary benefits such as legal, identity theft, and critical care coverage.

· Unlimited training and upskilling opportunities through Udemy and Hexavarsity.


Role: LLM Model Evaluation Specialist

Location: McLean, VA


Job Description:

We are seeking a highly motivated and detail-oriented LLM Model Evaluation Specialist to join our team. The ideal candidate will play a critical role in assessing, benchmarking, and improving large language models (LLMs) for our internal review process. This position requires a strong understanding of natural language processing (NLP), machine learning (ML), and the ability to design and execute evaluation frameworks to measure the performance, accuracy, and usability of LLMs.


Key Responsibilities:

  • Must have patronus.ai experience.
  • Design and implement evaluation frameworks to assess the performance of large language models (LLMs) across various tasks and benchmarks.
  • Conduct qualitative and quantitative analyses of LLM outputs, including accuracy, relevance, coherence, and ethical considerations.
  • Develop automated testing pipelines to streamline the evaluation process for LLMs.
  • Collaborate with cross-functional teams, including data scientists, ML engineers, and product managers, to align evaluation metrics with business goals.
  • Identify gaps in LLM performance and provide actionable insights to improve model quality.
  • Stay up to date with the latest advancements in NLP, LLMs, and evaluation methodologies.
  • Create detailed reports and presentations to communicate evaluation results and recommendations to stakeholders.
  • Ensure the ethical use of LLMs by monitoring for bias, assessing fairness, and verifying compliance with organizational and industry standards.


Qualifications:

  • Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, or a related field.
  • Strong knowledge of NLP and ML concepts.
  • Hands-on experience with LLMs (e.g., GPT, BERT, LLaMA) and familiarity with their architecture and functionality.
  • Proficiency in programming languages such as Python, with experience in libraries and frameworks such as TensorFlow, PyTorch, or Hugging Face Transformers.
  • Experience in designing and implementing model evaluation metrics (e.g., BLEU, ROUGE, perplexity).
  • Familiarity with ethical considerations in AI, including bias detection and mitigation.
  • Strong analytical skills and the ability to process and interpret large datasets.
  • Excellent communication skills, both written and verbal, to effectively present findings and recommendations.


Preferred Skills:

  • Experience working with APIs for LLMs and fine-tuning pre-trained models.
  • Knowledge of A/B testing and user feedback collection for model evaluation.
  • Familiarity with prompt engineering and prompt evaluation techniques.
  • Experience with data visualization tools to present evaluation results clearly.


If you're passionate about advancing the field of NLP and contributing to the development of cutting-edge LLMs, we encourage you to apply for this exciting opportunity!


