What are the responsibilities and job description for the Lead Systems Engineer (DataDog) - W2 only position at nTech Solutions?
Job Details
Term of Employment
- W2 Contract, 9 months
- This position is primarily remote; however, the resource will be required to come into the office (Reston, VA, or D.C.) for collaboration meetings, potentially up to once a week. There is no set schedule.
- The office is located in Washington, D.C.
- This position is full-time, 40 hours per week.
Overview
Our client is seeking a Lead Systems Engineer to support a DataDog implementation. The team is responsible for monitoring all applications across the enterprise. As the Lead Systems Engineer, you will administer the software tools used for systems and applications monitoring, supporting the Operations Center's Systems Monitoring initiatives for several SOWs in 2025 and beyond. Core responsibilities include writing scripts; installing, managing, and maintaining the monitoring tools as needed; integrating them with other tools; and collaborating with other groups and their tools.
Required Skills & Experience
- Bachelor of Science in Computer Science or a related field (e.g., Engineering, Applied Science, Math) or equivalent experience.
- DataDog Administration experience on the Linux platform, instrumenting Java-based applications running on Tomcat Application Server.
- Configuration experience in Infrastructure Monitoring, Network Monitoring, and Centralized Logging, or similar Administration experience with the ELK Stack: Elasticsearch (search and analytics engine), Logstash (ingest pipeline), and Kibana (visualization and creating dashboards).
- Strong Linux platform (Red Hat) background.
- Understanding of SSL setup on Linux servers, including installing CA certificates.
- Experience with Network Monitoring and knowledge of Network components like Switches, Routers, Palo Alto Network utilization SNMP, F5 Load Balancers, WebSEAL, Infoblox, Gigamon, and Network Mapping is a plus.
- Working knowledge of other monitoring tools like BigPanda and CloudBeat (Synthetic Monitoring) is desired. These tools are currently used to monitor applications and business transactions that impact the business and customers.
- 8+ years of strong IT experience and good working knowledge of a variety of technology platforms in a distributed environment, including Microsoft systems (e.g., Windows Server 2012 and 2016, Active Directory, Exchange, SharePoint), Linux/Unix, VMware, SQL Server, database architectures, TCP/IP, VPNs, Mainframe, and LAN/WAN technologies and architectures.
- Minimum of 3 years of hands-on experience installing, integrating, managing, and maintaining monitoring tools like DataDog, including administration and support.
- Similar Log Management experience with the ELK Stack: Elasticsearch (search and analytics engine), Logstash (ingest pipeline), and Kibana (visualization and creating dashboards).
- Experience writing Shell, Python, Selenium, and VuGen scripts.
- Experience with SSL certificates and encryption methods on Linux.
- Experience in developing and implementing systems monitoring and alerting strategies in diverse, large-scale environments.
- Experience developing and documenting processes, procedures, and policies for tool usage and integration.
- Ability to author tool maintenance and training documentation, as well as support requests for training on tool usage.
- Knowledge of and experience with configuring alerts, dashboards, and ad-hoc reports.
- Strong understanding of service level management (SLAs, SLRs, etc.).
- Ability to determine and document tool backup and recovery procedures.
- Experience with data management tools and databases (e.g., DB2, SQL); familiarity desired.
- Experience in systems and Java applications troubleshooting using monitoring tools like DataDog.
- Understanding of and experience with both waterfall and agile Software Development Life Cycles (SDLC).
- Experience with SAFe agile methodologies.
Preferred Skills & Experience
- Experience with Network Monitoring and knowledge of Network components like Switches, Routers, Palo Alto Network utilization SNMP, F5 Load Balancers, WebSEAL, Infoblox, Gigamon, and Network Mapping.
- Automation experience with scripting (Python, Shell, Ansible).
- Working knowledge of other monitoring tools like BigPanda, CloudBeat (Synthetic Monitoring), or Moogsoft.