Incident Response Manager - Data Center

TikTok
San Jose, CA Full Time
POSTED ON 3/25/2025
AVAILABLE BEFORE 4/23/2025

Description

TikTok is the leading destination for short-form mobile video. At TikTok, our mission is to inspire creativity and bring joy. TikTok's global headquarters are in Los Angeles and Singapore, and its offices include New York, London, Dublin, Paris, Berlin, Dubai, Jakarta, Seoul, and Tokyo.

Why Join Us

Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.

Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.

To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always.

At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve.

Join us.

About the team

The Data Systems Infrastructure (DSI) team sits within the ByteDance global technology structure and supports the company's fast growth by building and operating hyper-scale datacenters, managing the life cycle of the server fleet, providing cloud solutions, and developing various infrastructure services, making sure they are scalable and reliable.

The Incident Response Center (IRC) is the first layer of defense, responsible for quick detection and incident response using various monitoring and automation tools, and for conducting thorough investigation, classification, and triage of alerts. The Incident Response Manager is responsible for delivering operations within the Incident Response Operation Center (IROC) across all ByteDance datacenter sites in the respective regions. The IRC team is expected to respond to all alarms and alerts set in the Server Automation Operations System (SAOS) and Data Center Infrastructure Management (DCIM) to quickly discover anomalies and engage Subject Matter Expert (SME) teams to start issue triage. The IRC team provides business intelligence through rigorous analysis of alerts and issues, which reduces and prevents recurring incidents.

Responsibilities

- Deliver global operations within the IROC (Incident Response Operation Center) for ByteDance datacenters.
- Act as first responder and first layer of defense, responsible for quick detection and incident response using various monitoring and automation tools; conduct thorough investigation, classification, and triage of alerts.
- Respond to all infrastructure, facilities, security, and safety events notified via various means, such as alarms and alerts set in Server Operations and Maintenance, Datacenter Infrastructure Management, Network & Grafana, and other functions.
- Respond to incidents and critical situations in a problem-solving manner, and conduct in-depth investigation of alerts.
- Provide insights into the effectiveness of the incident response and recovery process through regular reports.
- Analyze trends and patterns in events to identify opportunities for improvement and optimization.
- Monitor the performance of incident response against the agreed-upon SLAs by alerting and notifying stakeholders.
- Manage escalations by notifying or initiating discussions with higher-level support teams engaged in resolution processes.
- Identify, assess, and communicate potential risks arising through event monitoring that could affect a customer's service.
- Support program managers and facilitate project deliverables; improve overall operational security and engineering initiatives.
- The Incident Response team is expected to work at a ByteDance datacenter site. This is an on-site role.

Qualifications

Minimum Qualifications

- Knowledge of technical elements associated with systems such as Server Health, Datacenter Environment, and IP Networks.
- Outstanding verbal and written communication skills; ability to work with minimal direction and meet goals; attention to detail and an eye for continuous improvement.

Preferred Qualifications

- Degree in Information Technology.
- 5 years of experience in a service center or similar 24x7 operations center environment.
- 3 years of experience in a technology company or experience as a team lead, and experience in operation program management.
- 5 years of experience as an incident and problem manager.
- Good data analytics and presentation skills.
- Ability to successfully interact at all levels of the organization, including with clients, while functioning as a team player.
- Basic working knowledge of data protection policies such as GDPR and the need to keep sensitive information secure.
- Working knowledge and/or certifications in ITIL, CompTIA Server+, Schneider Electric Data Center Certified Associate (DCCA), Data Analytics and Visualization.
- Willingness to be on call, including weekends, nights, and holidays.
- Works well under pressure and within time constraints to solve problems and complete deliverables.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://shorturl.at/cdpT2




