AI Evaluation Engineer

Trunk Tools
Austin, TX Full Time
POSTED ON 3/8/2025
AVAILABLE BEFORE 4/6/2025
At Trunk Tools, we are tackling the massive $13 trillion construction industry. We’re an exceptional team of serial entrepreneurs, brought together by our shared mission: automating construction. Our founding team (SpaceX, Stanford, MIT, Carta, etc.) has built and deployed software for 140k users in construction and millions of users beyond the construction space, and has worked on $2 billion of built-environment projects. We aren’t another out-of-touch tech startup: most of our team comes from construction.

We spent the last few years building the brain behind construction. Now we are deploying workflows and agents, starting with a Q&A document chatbot, that become ingrained in construction teams’ workflows, with the ultimate goal of automating construction. Given our immense traction with several Fortune 500 construction companies, we are doubling our team (currently 45 FTE) in order to deploy several more agents this year. You will have an opportunity to drive the transformation of a multi-trillion-dollar industry full of waste, risk, and inefficiency.

What you will do and achieve:

  • Design and implement rigorous evaluation frameworks and performance metrics for AI systems (including RAG and agent-based architectures)
  • Develop tools, dashboards, and processes that bring observability to every step of the AI development lifecycle
  • Collaborate cross-functionally to embed best-in-class monitoring and testing methodologies into production workflows
  • Identify bottlenecks and propose solutions to ensure high accuracy and reliability across all AI components
  • Stay at the forefront of industry trends in LLMs, measurement techniques, and agent architectures to enhance system evaluation capabilities

Who you are:

  • MS/PhD in Computer Science, Machine Learning, Artificial Intelligence or a related field
  • 2 years of experience evaluating AI and/or ML systems, with a focus on performance metrics and validation
  • Hands-on experience with observability, analytics platforms, or data engineering to create robust monitoring pipelines
  • Proficiency in Python and strong experience with machine learning frameworks such as scikit-learn, TensorFlow, PyTorch
  • Knowledge of retrieval-augmented generation (RAG) and agent-based workflows, including best practices for measuring their performance
  • Experience with synthetic data generation or test automation to validate model robustness
  • Strong problem-solving skills and a collaborative mindset, eager to work in a fast-paced environment

Preferred but not required:

  • Experience with reinforcement learning, reward function design, and policy optimization
  • Construction industry knowledge or an interest in automating complex, large-scale processes

What We Offer 😎

  • 🎖️ A close-knit and collaborative early-stage startup environment where every voice is heard and every opinion matters.
  • 💰 Competitive salary and stock option equity packages.
  • 🏥 3 medical plans to choose from, including a 100% covered option. Plus dental and vision insurance!
  • 💰 401K
  • 🤓 Learning & Growth stipend.
  • 🥨 Free lunch provided in our NYC and Austin offices - you’ll never go hungry with us!
  • 🛫 Unlimited PTO: we truly believe in work-life balance and that hard work should be balanced with time for rest and rejuvenation.
  • 🏝 IRL / In-Person retreats throughout the year.

We realize applying for jobs can feel daunting at times. We don’t expect you to check all the qualification boxes and encourage you to apply if you have experience in some of the areas.

At Trunk Tools, we’re working hard to build a more productive and safer environment within the construction industry, and we strive to live by these same values here at Trunk Tools HQ. As an equal-opportunity employer, we are committed to building an inclusive environment where you can be you. We work hard to evaluate all employees and job applicants consistently, without regard to race, color, religion, gender, national origin, age, disability, pregnancy, gender expression or identity, sexual orientation, or any other legally protected class.




