Head of System Ops
Our client seeks a Head of System Ops to join their team.
- Virtualization and Container Management: Oversee the administration and optimization of Hyper-V and Docker Swarm environments, including the design, implementation, and maintenance of virtual infrastructure, storage, and networks. Develop, implement, and manage disaster recovery procedures for these environments.
- Linux Administration: Supervise the operation and maintenance of Linux servers, troubleshooting and resolving issues as they arise. Ensure the stability, integrity, and efficient operation of Linux systems that support core organizational functions.
- Traefik Management: Manage and optimize the use of the Traefik load balancer for Docker configurations to facilitate the reliable operation of services and applications.
- System Maintenance and Monitoring: Conduct regular system maintenance tasks, such as updating and patching systems, and ensure optimal performance and uptime of production systems and applications. Implement and oversee a comprehensive system monitoring strategy that covers the Docker Swarm and Traefik environments.
- Security Oversight: Develop and enforce security protocols to protect the organization's IT infrastructure, including Hyper-V, Docker Swarm, and Linux systems. Ensure systems are safeguarded against known and potential vulnerabilities.
- Infrastructure Planning: In collaboration with Technical Leadership and other stakeholders, participate in long-term infrastructure strategy and planning, encompassing both on-premises and cloud-based elements, with a strong focus on containerization and load balancing strategies.
- Vendor Management: Manage relationships with hardware and software vendors, and ensure that the organization is getting optimal value and support from these partnerships.
- Documentation and Compliance: Maintain accurate and up-to-date system documentation, including architecture diagrams, procedures, and policies. Ensure compliance with relevant industry standards and regulatory requirements, particularly implementation of the Telecommunications Security Act requirements by the 2025 deadlines.
- Incident Management: Manage incident response protocols, ensuring swift resolution of issues and minimal disruption to operations. Conduct post-incident reviews to identify root causes and prevent recurrence.
Key Skills & Experience:
- A minimum of 5 years' experience in a similar role
- A hands-on approach to running the team
- A willingness to tackle technical issues directly
- High-level strategic planning ability, with a desire to succeed and grow
- The determination and skills to build, motivate and inspire a team
- A sense of urgency and the ability to work in a fast-paced office environment
- Proficiency in the use of Microsoft Office applications, specifically MS Word, MS Excel and PowerPoint
Your specialist: Debbie Amankwa
Quote job ref: 13782
Hi, I'm Debbie and I look forward to receiving your application for this fantastic opportunity.