What are the responsibilities and job description for the Infrastructure Architect position at Calsoft Labs?
Job Details
Please attach a separate Reference Page to your bid (not within the resume) that includes at least 2 professional references. Be sure to include each reference's full name, phone number, email, and affiliation to the candidate (company name, title, relationship, etc.).
Top Skills & Years of Experience:
10 years in Linux system administration (Ubuntu, CLI, security, networking)
10 years in Bash & Python scripting, with pipeline automation experience (e.g., Nextflow)
10 years in Slurm workload manager installation, configuration, and job troubleshooting
10 years in cluster and storage management, including NAS (Qumulo), rsync, and mount strategies
- Basic HPC security
- Implementing and maintaining data management
- Providing support for SAN and NAS storage, backup/recovery environments and virtualization infrastructure by implementing, managing, and monitoring the hardware and software
- Playing a major role in the security, disaster recovery and services continuity of a highly available enterprise storage and backup infrastructure by following established procedures and compliance requirements
- Technical support (installation, configuration, maintenance, upgrade, retirement, troubleshooting)
- Configuration management using frameworks such as Ansible, Puppet, and Chef
- Administration of high-speed network storage systems, including Mellanox switches and NAS clusters
- Managing, configuring, and supporting cloud systems such as setting up, maintaining, and troubleshooting cloud compute engines and storage buckets
- Managing databases (e.g., SQL Server, PostgreSQL, MySQL, Oracle)
- Assisting staff to access and utilize computing resources
- Coordinating with Labs and DTMB staff on maintaining and managing the computational resources
Skills & Experiences
- 10 years of experience with the Linux CLI environment and coding languages such as R, Python, and Bash
- 10 years of experience with workload management systems such as Slurm
- 10 years of experience setting up HPC systems, including identifying suitable hardware and software needs
- 10 years of experience setting up and managing databases such as PostgreSQL
- 10 years of experience performing system administration, including installation, configuration, and support of software, packages, and libraries in various environments
- 10 years of experience with Network Appliance clustered servers and applicable software
- 10 years of experience with hands-on troubleshooting, issue resolution, discrepancy tracking, and report generation
- 10 years of experience with Linux configuration for storage, networking, load balancing, memory management, VMs, firewalls, and system monitoring
- 10 years of experience with computer security
- Knowledge of package management and container systems such as Conda, Docker, and Singularity
- Knowledge of automation tools such as Ansible, Puppet, and Nextflow
- Experience with cloud computing (setting up compute engines, storage buckets)
- Strong knowledge of enterprise storage solutions
- Familiar with software frameworks used for searching, monitoring, and analyzing big data
- Ability to provide sound recommendations and guidance on storage and cost savings for Labs
- Knowledge and experience in HL7 messaging
- Ability to review and interpret web.config files for plugins
- Ability to review logs (e.g., IIS logs, Dynatrace logs) to make sure that there is no excess resource utilization and that no peaks or spikes are occurring on the web/app server
- Knowledge of Cloudflare, Forcepoint, and the related rules (e.g., the C86 rule) and policies
- Ability to understand the existing junction configuration for the application and review those settings in case of a break
- Help the team set up a failover environment for apps
- Help the team complete the Disaster Recovery (DR) Plan and DR testing
- Knowledge of CDC-hosted apps (preferred, not a requirement)
Salary: $70 - $80