Hồ Chí Minh
Full-time
As a communication platform, Zalo makes it easier for family, friends, and co-workers to connect. As a Data Engineer working on Zalo products, you will have the chance to make a positive impact on more than 50M users by leveraging the power of our data. This role focuses specifically on data pipelines for analytics and machine learning projects.
What you will do
Data Platform Development (70%)
1. Build and Scale Data Pipelines:
- Design, build, and optimize robust ETL/ELT pipelines to ensure high-quality, reliable, and timely data.
- Work with large-scale data processing frameworks such as Spark and Kafka to enable real-time and batch data workflows.
- Leverage orchestration tools like Airflow to manage complex workflows.
2. Integrate and Maintain Data Systems:
- Connect diverse data sources to support analytics, product intelligence, and machine learning needs.
- Develop scalable and maintainable data architecture for data lakes and data warehouses.
- Implement data quality monitoring, validation, and alerting mechanisms using best practices and modern tools.
Data Insight & Collaboration (30%)
1. Translate Business Needs into Data Solutions:
- Collaborate closely with Product Managers, Analysts, and Data Scientists to understand business rules and translate them into efficient data models and logic.
- Build and maintain metrics, dashboards, and reports that provide clear visibility into product and user performance.
2. Enable Stakeholder Self-service:
- Empower internal teams by building data marts, documentation, and scalable solutions that reduce dependency on engineering for basic queries.
What you will need
- 3+ years of hands-on experience in data engineering involving large-scale data.
- Proven track record of delivering scalable and maintainable data solutions in a production environment.
Technical Skills:
- Advanced proficiency in Python (PySpark) and SQL (Spark SQL).
- Strong experience with distributed data processing tools (e.g., Apache Spark, Kafka).
- Experience building data workflows with orchestration tools like Apache Airflow.
- Familiarity with cloud platforms (AWS, GCP, or Azure) and modern data warehousing solutions (e.g., BigQuery, Snowflake, Redshift).
- Understanding of CI/CD pipelines, containerization (Docker), and version control (Git).
- Solid understanding of supervised and unsupervised learning algorithms, feature engineering techniques, and model evaluation metrics is a plus.
Soft Skills:
- Excellent communication and stakeholder management skills.
- Ability to work independently and proactively in a fast-paced, dynamic environment.
- Strong analytical thinking, attention to detail, and problem-solving ability.