We are looking for a Junior Data Engineer to join our team and contribute to building and scaling our Customer Data Platform (CDP) infrastructure. This role offers the opportunity to work on real-world, large-scale data challenges while gaining exposure to modern data engineering practices and technologies. You will collaborate closely with senior engineers and take ownership of specific components, pipelines, and initiatives that support multiple critical business functions.
Key Responsibilities
- Maintain and monitor the Customer Data Platform infrastructure to ensure reliability and performance
- Develop, optimize, and document data pipelines for Apollo Internal Products, enabling user segmentation and targeting features
- Ensure data availability and integrity for internal portals and customer service operations
- Contribute to data infrastructure that powers AI/ML initiatives across the organization
- Implement and follow data privacy best practices when handling sensitive data and PII
- Apply security measures and manage access controls for data systems
- Produce and maintain documentation for compliance and operational processes
- Support and independently fulfill internal data requests from multiple departments
- Create, maintain, and improve data extraction scripts and ETL processes
- Monitor data quality, detect anomalies, and contribute to validation frameworks
- Communicate technical insights effectively with both technical and non-technical stakeholders
Requirements
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field
- 1–2 years of experience in data engineering, software engineering, or a related field
- Solid understanding of data structures, algorithms, and distributed systems fundamentals
- Hands-on experience with at least 3 of the following technologies:
- Python programming (must have)
- Golang
- Apache Spark (data processing, optimization techniques)
- Hadoop ecosystem components (HDFS, Hive, etc.)
- Apache Kafka (streaming concepts, basic setup)
- SQL (queries, optimization, schema design) and relational databases
- Familiarity with Linux/Unix environments and shell scripting
- Basic containerization knowledge (Docker; Kubernetes is a plus)
- Familiarity with version control systems (e.g., Git) and CI/CD workflows is a plus
- Strong analytical and problem-solving skills with attention to detail
- Proactive learning mindset and curiosity for new data technologies
- Ability to work both independently and in collaborative team settings