What are the responsibilities and job description for the AI/ML Engineer position at Mitchell Martin, Inc.?
Job Details
Location: Southeastern Region
Description:
The AI/ML Engineer will play a key role in developing AI-driven observability solutions that optimize incident management and system monitoring. This role involves designing and deploying machine learning models to enhance real-time monitoring, automate root cause analysis, and improve system performance. The ideal candidate will have experience with AI-powered observability tools, anomaly detection, and predictive analytics in production environments.
Responsibilities:
AI-Powered Observability:
Design and implement AI-driven tools to enhance real-time system monitoring and incident management.
Reduce false-positive alerts and provide actionable insights using AI models.
Integrate machine learning workflows with telemetry data sources such as Splunk, Dynatrace, and data lakes.
Automation and Innovation:
Automate root cause analysis (RCA) workflows using AI to reduce Mean Time to Identify (MTTI) and Mean Time to Restore (MTTR).
Collaborate with cross-functional teams to identify AI use cases and develop scalable solutions.
Model Development and Deployment:
Build and fine-tune machine learning models for anomaly detection, predictive maintenance, and platform restoration.
Deploy AI models in production environments using tools such as NVIDIA Triton, TensorFlow Serving, and Kubernetes.
Data Engineering and Optimization:
Develop efficient data pipelines for ingesting and preprocessing logs, metrics, and traces from observability platforms.
Leverage vector databases for hybrid search, retrieval-augmented generation (RAG), and AI agent workflows.
Technical Skills:
Proficiency in Python and machine learning frameworks such as TensorFlow and PyTorch.
Experience with AI frameworks like LangChain and LlamaIndex.
Hands-on experience with observability tools like Splunk, Dynatrace, and Prometheus.
Strong knowledge of vector databases and AI model integration.
Familiarity with NVIDIA Triton or TensorRT for inference optimization.
AI/ML Expertise:
Proven ability to develop and deploy ML models in production environments.
Experience with anomaly detection, predictive analytics, and NLP-based solutions.
DevOps and Site Reliability Engineering (SRE):
Understanding of CI/CD pipelines and modern DevOps practices.
Familiarity with SRE best practices.
Collaboration:
Strong problem-solving skills and ability to work in cross-functional teams.
Ability to communicate technical concepts to non-technical stakeholders.
Employment Type:
Contract
Compensation:
$48.87-$69.82 per hour
Benefits:
Learn more about our benefits offerings.
EEO Statement
Learn more about our EEO policy.
Salary: $49-$70 per hour