What are the responsibilities and job description for the Sr. Systems Engineer position at Laine Recruiting?
Laine Recruiting has been engaged by one of the most respected higher education institutions and largest employers in the Rochester area! We've partnered to fill several roles within their Center for Integrated Research Computing (CIRC). This team provides hardware, software, training and support to over 110 departments across the organization.
About the Role
The Sr. Systems Engineer manages and administers advanced on-premises and cloud-based computing, networking, and storage for research. In addition to administering servers and virtual systems, the position requires specialized skills for managing advanced computing architectures (e.g. high-performance computing systems and accelerators such as GPUs), specialized high-speed network technology (e.g. low-latency InfiniBand networks and research cluster topologies), and massive parallel file systems engineered for high-volume and high-velocity research data. Responsible for deploying and managing specialized tools for configuring and controlling research computing environments (e.g. SLURM, service nodes, etc.). Responsible for the design, setup, and maintenance of a University-wide research computing infrastructure with monitoring and security, and for communicating about advanced research technology solutions with faculty from several departments, centers, and schools.
Overview of Responsibilities
- Leads the analysis of research use-case requirements, the design of solutions, and the deployment of on-premises or cloud-based advanced infrastructure for University-wide research computing, covering existing and emerging application areas of high-performance computing, including artificial intelligence, big data, modeling, and simulation.
- Leads the design, development, deployment, and maintenance of systems that require creative assembly of specialized computing (e.g. GPUs and accelerators), advanced network (e.g. InfiniBand) and parallel file system (e.g. GPFS) infrastructure. Leverages specialized infrastructure to deploy research computing solutions that are typically unconventional or unorthodox in traditional information technology environments.
- Leads, advises on, and performs proactive performance monitoring of high-performance computing and supporting resources, including the analysis, alerting, reporting, and tuning of computational accelerators, high-bandwidth and low-latency networks, parallel file systems, and scheduling / resource management software. Provides capacity analysis, maintenance, and troubleshooting for advanced infrastructure in the research computing environment. Thinks creatively to respond to performance issues, system errors, and maintenance needs that fall outside standard information technology process controls and procedures.
- Leads the creation, review, and maintenance of technical documentation, including solution designs and reference guides for institution-wide research computing infrastructure. Prepares necessary paperwork and documentation to ensure compliance with federal funding agency standards and procedures. Leads discussions of new products and services to enhance the delivery of research computing infrastructure; engages vendors as appropriate.
- Maintains a broad knowledge of advanced technology, specialized equipment, and security and research compliance requirements. Remains mindful and vigilant of risks to the research enterprise while consulting with faculty, performing work, and planning activities.
Qualifications