What are the responsibilities and job description for the Lead AI Safety / AI/ML Testing Lead/GenAI Safety Tester position at Trilyon, Inc.?
The Responsible AI Scaled Testing Team within Trust & Safety performs pre-launch structured testing for Google’s AI applications against safety, fairness and neutrality policies and standards. It is a global team with Responsible AI domain expertise and diverse backgrounds in operations, strategy, ethics, risk management, product management, and program management.
Overall Responsibilities:
Lead the end-to-end technical assessment of Google's GenAI products, focusing on pre-launch testing of safety, neutrality, and fairness. Develop and implement rigorous testing methodologies, including automated prompt generation and response analysis, to ensure compliance with defined standards. Leverage data-driven insights to identify potential risks and inform product development iterations, ensuring robust and reliable GenAI deployments.
Top 3 Daily Responsibilities:
- Automated Testing and Data Pipeline Management: Design and implement automated prompt generation strategies and data analysis solutions to efficiently collect and analyze GenAI model responses. Manage and optimize data pipelines for efficient processing and analysis of large datasets.
- Quantitative and Qualitative Analysis of Model Behavior: Conduct in-depth statistical analysis and qualitative evaluations of model outputs to identify deviations from defined standards. Develop and apply metrics for evaluating safety, neutrality, and fairness, and generate detailed reports with actionable insights.
- Technical Guideline Development and Execution: Translate abstract safety, neutrality, and fairness standards into precise technical guidelines and evaluation criteria. Develop and maintain clear documentation for vendor teams, including detailed instructions, edge case clarifications, and quality calibration protocols.
Mandatory Skills/Qualifications:
- Bachelor's degree in Computer Science, Data Science, Statistics, or a related technical field, or equivalent practical experience.
- 4 years of experience in data analysis, AI/ML testing, cybersecurity, or related technical domains.
- Proficiency in data analysis tools and languages (e.g., SQL, Python, R) for processing and analyzing large datasets.
- Experience developing and implementing automated testing frameworks.
- Strong analytical and problem-solving skills, with the ability to interpret complex data and identify patterns.
- Excellent technical communication skills, with the ability to clearly articulate complex technical concepts to both technical and non-technical audiences.
This role may involve exposure to graphic, controversial, and/or upsetting content.
Non-Essential Skills/Qualifications:
- 2 years of experience in AI testing, adversarial testing, red teaming, or related areas.
- Experience with LLM-based prompt generation and evaluation tools.
- Familiarity with machine learning concepts and algorithms.
- Experience with bug tracking systems (e.g., Buganizer).
- Experience in developing and maintaining technical documentation.
- Experience in defining and implementing process improvements.
- Ability to think strategically about emerging AI threats and vulnerabilities.