Lead/Senior Site Reliability Engineer

Zalo
Thành phố Hồ Chí Minh
Full time
4 weeks ago

We provide services that serve millions of customers, such as Zalo, ZMP3, BaoMoi, and Kiki. We are looking for an experienced SRE who brings a unique perspective, a passion for collaborating with cross-functional teams, and the ability to derive real-time insights from massive-scale data to build practical solutions and deliver exceptional user experiences at every touchpoint.
  • Run the production environment by monitoring availability and taking a holistic view of system health;
  • Build software and systems to manage platform infrastructure and applications;
  • Improve reliability, quality, and time-to-market of our suite of software solutions;
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement;
  • Provide primary operational support and engineering for multiple large-scale distributed software applications.

What you will do

  • Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding;
  • Partner with development teams to improve services through rigorous testing and release procedures;
  • Participate in system design consulting, platform management, and capacity planning;
  • Create sustainable systems and services through automation and uplifts;
  • Balance feature development speed and reliability with well-defined service-level objectives.

What you will need

  • Ability to program (structured and OOP) in one or more high-level languages, such as Python or Golang;
  • Experience with dynamic resource management frameworks (Kubernetes, Nomad, YARN);
  • Experience managing infrastructure as code (Terraform, etc.);
  • Experience with source version control (Git, SVN, etc.) as well as configuration management (Ansible, Puppet, SaltStack, etc.);
  • Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3;
  • Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.

Preferred skills and qualifications
  • Previous success in technical engineering;
  • Coding experience beyond simple scripts.
