
Data Engineer

Remote: Full Remote
Experience: Junior (1-2 years)

Offer summary

Qualifications:

  • Proficiency in PySpark and Spark SQL
  • Experience with Databricks and Delta Live Tables
  • Familiarity with Azure Data Lake Storage
  • At least 1 year of experience with Terraform and GitOps practices

Key responsibilities:

  • Analyze business and technical challenges
  • Design, build, and deploy data pipelines
emagine (https://www.emagine.org/)
501 - 1000 Employees

Job description

Industry: E-commerce (Fashion & Home)

Location: Portugal

Work Model: 100% Remote

Assignment Type: B2B or Recibos Verdes

Start Date: ASAP

Project Language: English

Project Overview:

Join a leading e-commerce company in fashion and home décor as it undertakes critical data engineering projects to support its evolving business needs. This is an exciting opportunity to contribute to high-impact data solutions for an innovative, customer-focused brand while applying current data engineering best practices.

Responsibilities:

  • Project Understanding and Communication:
    • Analyze business and technical challenges from a user perspective.
    • Collaborate with Data Architects and Project Managers to ensure solutions align with the client's data architecture.
  • Data Pipeline Development:
    • Design, build, and deploy efficient data pipelines according to project requirements (a brief sketch follows this list).
    • Apply best practices for performance, scalability, and maintainability.
    • Use Terraform to deploy and manage infrastructure efficiently.
  • Testing and Deployment:
    • Define test cases and conduct testing in collaboration with the Project Manager.
    • Present completed developments to Data Architects and Lead DataOps, ensuring smooth deployment and active monitoring post-deployment.
  • Documentation and Peer Review:
    • Document processes, tests, and results thoroughly.
    • Conduct peer reviews and participate in code reviews for quality assurance.
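
For illustration only, here is a minimal PySpark sketch of the kind of pipeline this role builds: read raw events from lake storage, clean them, and write a partitioned Delta table. The storage paths, column names, and table layout are hypothetical, and the sketch assumes a Databricks-style environment with Delta Lake available.

    from pyspark.sql import SparkSession, functions as F

    # Illustrative pipeline; paths and column names are hypothetical.
    spark = SparkSession.builder.appName("orders-pipeline").getOrCreate()

    # Read raw JSON order events from Azure Data Lake Storage.
    raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/orders/")

    # Basic cleaning: deduplicate, derive a date column, drop bad rows.
    cleaned = (
        raw.dropDuplicates(["order_id"])
           .withColumn("order_date", F.to_date("order_ts"))
           .filter(F.col("amount") > 0)
    )

    # Write a Delta table partitioned by date; date partitions keep
    # downstream scans cheap and support lifecycle rules on old data.
    (cleaned.write.format("delta")
            .mode("append")
            .partitionBy("order_date")
            .save("abfss://curated@examplelake.dfs.core.windows.net/orders/"))
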
Requirements:

Hard Skills:

  • Proficiency in PySpark and Spark SQL for data processing.
  • Experience with Databricks and Delta Live Tables for ETL and workflow orchestration.
  • Familiarity with Azure Data Lake Storage for data storage and management.
  • At least 1 year of experience with Terraform and GitOps practices for infrastructure deployment.
  • Strong understanding of ETL/ELT processes, data warehousing, data lakes, and data modeling.
  • Knowledge of orchestration tools (e.g., Apache Airflow) for pipeline scheduling and management (see the sketch after this list).
  • Experience with data partitioning and lifecycle management in cloud storage.
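
As a sketch of the orchestration side, the following Airflow DAG schedules a daily run of a pipeline like the one sketched above. The DAG id, schedule, and run_pipeline callable are hypothetical, and it assumes Airflow 2.4+ for the schedule argument.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def run_pipeline():
        # Placeholder for the actual pipeline trigger, e.g. submitting
        # a Databricks job run; the real call depends on the project.
        pass

    # Daily run with backfills disabled; ids and dates are hypothetical.
    with DAG(
        dag_id="orders_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        run = PythonOperator(
            task_id="run_orders_pipeline",
            python_callable=run_pipeline,
        )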

Optional: Experience with Databricks Asset Bundles, Kubernetes, Apache Kafka, and Vault is a plus.

Soft Skills:

  • English Fluency: Strong written and spoken English in a professional working environment.
  • Communication: Ability to convey technical concepts effectively and understand user needs.
  • Organizational Skills: Detail-oriented with the ability to maintain structured documentation.
  • Problem-Solving: Proactive approach to understanding and addressing data challenges.

Required profile

Experience

Level of experience: Junior (1-2 years)
Spoken language(s): English

Other Skills

  • Communication
  • Problem Solving
  • Organizational Skills
