emagine is looking for a Data Engineer / ETL Developer for one of our clients in the pharmaceutical industry.
Job description:
Technological Facilitation of Data Access:
Develop ETL pipelines and workflows, shipped through continuous delivery pipelines, within the data product teams.
Use Infrastructure as Code (IaC) and CI/CD to deliver infrastructure components to AWS environments.
Improve data availability and validation by acting as a liaison between Data Science and IT teams through data products.
Data Wrangling:
Ingest, integrate, and curate data from various internal and external sources with varying degrees of maturity in availability, access requirements, data formats, and underlying technologies.
Identify and share best practices, and increase awareness of data management standards and validation.
Create improvements in methods, techniques, and approaches to optimize the way we work and strive for simplicity.
Description of knowledge and experience:
Required Skills:
Proficiency in developing ETL pipelines with Apache Spark, AWS Glue, and Delta Lake.
Strong SQL knowledge and experience with Data Warehouse concepts.
Experience with AWS services, including S3, Lambda functions, and Athena.
Hands-on experience with Python programming.
Working experience with Git, Azure Pipelines, and Infrastructure as Code (IaC) using CDK.
Start: ASAP
Duration: Rest of the year
Allocation: Full-time, time & materials
Location: Måløv, hybrid (3 days onsite & 2 days remote)