- Develop and maintain server-hosted infrastructure for data processing tools like Apache Spark, Apache Airflow, and DataHub.
- Build, monitor, and maintain ETL/ELT pipelines to process, clean, and transform data from various systems.
- Collaborate in the design and development of a data feature store, enabling consistent and reusable features for analytics and machine learning applications.
- Work closely with cross-functional teams to ensure data availability, discoverability, and quality.
- Write and maintain documentation for infrastructure components, workflows, and data assets.
Requirements
- Bachelor’s degree in computer science, engineering, mathematics, or a related technical discipline.
- 1–2 years of experience in a data engineering or related role.
- Experience working with on-premises environments and tools such as Apache Spark, Airflow, and DataHub.
- Proficiency in SQL and experience with database management systems.
- Strong programming skills in Python.
- Familiarity with ETL/ELT processes, distributed systems, and Linux-based server operations.
- Good understanding of data modeling, schema management, and data lifecycle.
- Good organizational and communication skills.
- Experience with or knowledge of building a data feature store is a plus.