About The Company
My client employs the best talent in the world to design, run, and manage the most advanced and dependable technology infrastructure. They consider the health of these critical technology ecosystems holistically. They are a focused, independent firm that builds their foundation of excellence by developing new systems.
They also take pride in embracing a sustainable environment and are taking immediate action to achieve this in the coming years.
About the Job
My client is seeking an experienced Security Site Reliability Engineer who will be responsible for keeping all CISO infrastructure and SOC solutions running smoothly at a high level of availability. The role also calls for an expert who will be responsible for the support, automation, documentation, and continuous improvement of all aspects of a large-scale CISO infrastructure.
We're looking for people with a mix of practical operations and software development skills who can apply sound engineering principles, operational discipline, and mature automation to our systems.
Primary Job Responsibilities
* Join the SRE on-call rotation (PagerDuty) and respond to incidents that impact service availability.
* Prevent incidents from happening again through blameless post-mortems.
* Manage infrastructure on Azure and AWS using IaC tools, including Terraform and Ansible.
* Build monitoring that alerts on symptoms before they become outages.
* Document every action so your findings turn into repeatable actions and then into automation.
* Improve operational processes (such as deployments and upgrades) to make them as simple and streamlined as possible.
* Design, build and maintain core infrastructure that enables scaling to many terabytes of data.
* Debug production issues across all services and levels of the stack.
* Plan the growth of our infrastructure.
* Think about systems: edge cases, failure modes, behaviours, specific implementations.
* Stay up to date with emerging technologies, business requirements, and enhancements, and develop proposals for any changes that may be required.
Skills and Experience
* A manager of one: able to self-organise and report in a timely manner.
* Familiarity with agile methodologies.
* Strong understanding of Linux, network troubleshooting and analysis, and current security methodologies.
* Strong understanding of cybersecurity technologies, protocols, and applications.
* Detailed technical experience in the installation, configuration, and operation of high-end security solutions.
* Experience with log management platforms, including Splunk and the Elastic Stack (ELK: Elasticsearch, Logstash, Kibana).
* Experience with container services, including Docker and Kubernetes.
* Experience with IDS/IPS, SIEM, and endpoint solutions and technologies.
* Experience completing Root Cause Analysis (RCA) investigations and performing operational readiness reviews.
* Thorough (advanced to expert) understanding of IT security, the implementation of security-related guidelines, and their impact on IT infrastructure.
* Problem-solving abilities across multiple enterprise technology environments with complex integrations.
* Effective time management skills.
* Strong verbal and written communication skills; must be able to communicate effectively with a wide variety of audiences, both business and technical.
* Ability to work collaboratively and cooperatively with geographically and culturally diverse groups.
If you are interested in this role, kindly apply now or send me an email at [email protected]