What are the responsibilities and job description for the Infrastructure Solutions Architect - Locals only position at Skywalk Global?
The State of Michigan is looking for: Infrastructure Solutions Architect 5
Remote or On-site: Accepting local candidates ONLY within 90 minutes from Lansing, MI. Position will be hybrid, in office 3 days a week upon start and there is NO REMOTE ONLY option.
Please attach a separate Reference Page to your bid (not within resume) that includes at least 2 professional references! Be sure to include the reference’s full name, phone number, email, affiliation to the candidate (Company Name, Title, Relationship, etc).
Interview Process: Virtual interview via MS Teams. A screenshot photo of the candidate will be required for any interview, and a vendor representative must be present at the beginning of the virtual interview to validate the candidate.
Duration: 1 year with possible extension.
Top Skills & Years of Experience:
- 10 years in Linux system administration (Ubuntu, CLI, security, networking)
- 10 years in Bash & Python scripting, with pipeline automation experience (e.g., Nextflow)
- 10 years in Slurm workload manager – installation, configuration, and job troubleshooting
- 10 years in cluster and storage management, including NAS (Qumulo), rsync, and mount strategies
Infrastructure Architect – JD 141965
- Basic HPC security
- Implementing and maintaining data management infrastructure
- Providing support for SAN and NAS storage, backup/recovery environments and virtualization infrastructure by implementing, managing, and monitoring the hardware and software
- Playing a major role in the security, disaster recovery and services continuity of a highly available enterprise storage and backup infrastructure by following established procedures and compliance requirements
- Technical support (installation, configuration, maintenance, upgrade, retirement, troubleshooting).
- Configuration management using frameworks such as Ansible, Puppet, and Chef.
- Administration of high-speed network storage systems, including Mellanox switches and NAS clusters.
- Managing, configuring, and supporting cloud systems such as setting up, maintaining, and troubleshooting cloud compute engines and storage buckets
- Managing databases (e.g., SQL Server, PostgreSQL, MySQL, Oracle)
- Assisting staff to access and utilize computing resources
- Coordinating with Labs and DTMB staff to maintain and manage the computational resources
Skills & Experiences
- 10 years experience with the Linux CLI environment and coding languages such as R, Python, Bash
- 10 years experience with workload management systems such as SLURM
- 10 years experience with setting up HPC systems including identifying suitable hardware and software needs
- 10 years experience with setting up and managing databases such as PostgreSQL
- 10 years experience performing system administration, including installation, configuration, and support of software, packages, and libraries in various environments
- 10 years experience with Network Appliance clustered servers and applicable software
- 10 years experience with hands-on troubleshooting, issue resolution, discrepancy tracking, and report generation
- 10 years experience with Linux configuration regarding Storage, Networking, Load Balancing, Memory Management, VMs, Firewalls, and System Monitoring
- 10 years experience with computer security
- Knowledge of package management and container tools such as conda, Docker, and Singularity
- Knowledge of automation tools such as Ansible, Puppet, and Nextflow
- Experience with cloud computing (setting up compute engines, storage buckets)
- Strong knowledge of enterprise storage solutions
- Familiar with software frameworks used for searching, monitoring, and analyzing big data
- Ability to provide sound recommendations and guidance on storage and cost savings for Labs
- Knowledge and experience in HL7 messaging
- Ability to review and interpret web.config files for plugins
- Ability to review logs (e.g., IIS logs, Dynatrace logs) to make sure that there is no excess resource utilization and no peaks or spikes occurring on the web/app server
- Knowledge of Cloudflare, Forcepoint, and the related rules (e.g., the C86 rule) and policies
- Ability to understand the existing junction configuration to the application and review those settings in case of a break
- Help the team set up a failover environment for apps
- Help the team complete the Disaster Recovery (DR) Plan and DR testing
- Knowledge of CDC-hosted apps (preferred, not a requirement)