Principal HPC System Administrator

Consortium for School Networking
Chicago, IL Full Time
POSTED ON 3/26/2025
AVAILABLE BEFORE 4/18/2025

Location : Chicago, IL

Check out the role overview below. If you are confident you have the right skills and experience, apply today.

Job Description :

Design, configure, deploy, and maintain large computer clusters, servers and software.

Lead day-to-day operations, including systems administration, monitoring, and storage performance, up to and including network components. Manage the system's network switch, parallel file system, and HPC software stack and tools.

Monitor, maintain, and optimize HPC systems and software to improve performance and resource utilization.

Serve as the technical lead on complex projects and system related tasks, as needed.

Configure, install, and maintain the job scheduler / workload manager.

Diagnose and resolve system operational problems promptly and effectively, coordinating with vendors to address hardware and software issues.

Use scripting / programming to enable system-level automation, monitoring, and problem detection.

Build and deploy open-source software as well as software from vendors / partners.

Develop and implement strategies for HPC data management, backup, disaster recovery, and security, ensuring reliable and efficient backup and restores for all managed systems.

Create standard operating procedures for routine and complex system tasks.

Maintain and monitor the security of HPC systems and servers, implementing robust security measures, as applicable.

Troubleshoot and identify failed hardware, implement parts replacement, and resolve system failures.

Stay up to date with the latest developments in HPC technologies and apply this knowledge to improve Research Computing Center (RCC) systems.

Solve complex problems to configure, install, upgrade, and maintain server applications and hardware. Safeguard the integrity of computer software. Implement operating system enhancements to improve the reliability and performance of the system.

Provide expertise in planning and installing necessary patches and upgrades for servers and their associated storage, network, communications, and peripheral subsystems. Install and maintain an appropriate level of intrusion detection, monitoring, and auditing software as required.

Perform other related work as needed.

Preferred Qualifications

Education :

Bachelor's degree in Computer Science or closely related field.

Experience :

A minimum of seven years of full-time Linux system administration experience in a large distributed computing environment.

Technical Skills or Knowledge :

Experience with Linux system administration (e.g., RHEL, Rocky, CentOS).

Proficiency in the installation, maintenance, operation, tuning and troubleshooting of Linux and related systems and software.

Experience in installing, configuring, and maintaining a job scheduler / workload manager (such as SLURM, TORQUE, or PBS).

Experience configuring, installing and troubleshooting MPI and OpenMP.

Experience with at least one HPC cluster management tool (e.g., xCAT, Confluent, Warewulf, or Bright).

Experience in configuring, administering, and supporting network storage subsystems.

Hands-on experience with at least one parallel file system (e.g., Spectrum Scale-GPFS, Lustre, BeeGFS, or Ceph).

Direct experience working with InfiniBand, including a working knowledge of InfiniBand concepts, OFED layers, and subnet managers, as well as Gigabit Ethernet.

Experience with networking and security.

Experience with systems automation tools such as Ansible or Puppet.

Experience with versioning tools such as Git or Subversion.

Experience configuring, installing, maintaining and using monitoring and optimization tools.

Strong knowledge of scripting languages such as Python or Bash.

Preferred Competencies

Ability to work well with faculty and researchers.

Ability to identify and gain expertise in appropriate new technologies and / or software tools.

Ability to function as part of an interactive team while demonstrating self-initiative to achieve project goals and the Research Computing Center's mission.

Strong analytical skills and problem-solving ability.

Application Documents

Cover letter (preferred)

Resume (required)




