Support and maintain Kubernetes-based infrastructure, primarily on AWS EKS.
Build and enhance automation for provisioning, configuration, monitoring, and scaling of cloud-native environments.
Collaborate closely with engineering teams to ensure platform reliability, performance, and operational excellence.
Implement and manage secure processes for data and secret rotation across environments.
Develop tools and practices to improve observability, reliability, and incident response.
Provide technical leadership and mentorship, and promote best practices in Kubernetes, automation, and cloud operations.
Manage project priorities, milestones, and deliverables in a fast-paced environment.

Qualifications:
Deep expertise with Kubernetes (EKS preferred) in production environments.
Strong hands-on experience with AWS services, including IAM, EKS, EC2, and S3.
Proficiency in data and secret rotation strategies and tooling.
Proficiency in scripting and automation with Python and Bash.
Solid understanding of Linux fundamentals, including OS-level troubleshooting and performance tuning.
Experience with infrastructure as code tools such as Terraform, Helm, or Argo CD.
Familiarity with container networking, observability tooling, and CI/CD best practices.
Proven ability to architect, develop, and troubleshoot distributed systems.
Strong problem-solving mindset, ownership, and communication skills.
Experience in high-scale, low-latency, or mission-critical environments is a plus.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Site Reliability Engineer
1,000,000 HK$
Hong Kong, Hong Kong, Hong Kong Island
Modified May 18, 2025