
Data Engineer (Spark)

Remote: Full Remote
Work from: fully flexible

Offer summary

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Proficiency in data processing technologies such as Spark, Cloudera, and Airflow.
  • Experience with cloud services, particularly AWS, for infrastructure management.
  • Strong programming skills in Python, Java, or Scala for data processing tasks.

Key responsibilities:

  • Develop and maintain a high-performance data processing platform for automotive data.
  • Design and implement data pipelines for processing large volumes of data in streaming and batch modes.
  • Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of data sources.
  • Monitor and troubleshoot the platform to ensure high availability and performance of data processing.

Addepto Startup http://www.addepto.com
51 - 200 Employees

Job description

Addepto is a leading consulting and technology company specializing in AI and Big Data, helping clients deliver innovative data projects. We partner with top-tier global enterprises and pioneering startups, including Rolls Royce, Continental, Porsche, ABB, and WGU. Our exclusive focus on AI and Big Data has earned us recognition by Forbes as one of the top 10 AI companies.


As a Data Engineer, you will have the exciting opportunity to work with a team of technology experts on challenging projects across various industries, leveraging cutting-edge technologies. Here are some of the projects we are seeking talented individuals to join:

  • Development and maintenance of a large platform for processing automotive data. A significant amount of data is processed in both streaming and batch modes. The technology stack includes Spark, Cloudera, Airflow, Iceberg, Python, and AWS.
  • Design and development of a universal data platform for global aerospace companies. This Azure- and Databricks-powered initiative combines diverse enterprise and public data sources. The platform is in the early stages of development, covering the design of its architecture and processes and offering freedom in technology selection.
  • Centralized reporting platform for a growing US telecommunications company. This project involves implementing BigQuery and Looker as the central platform for data reporting. It focuses on centralizing data, integrating various CRMs, and building executive reporting solutions to support decision-making and business growth.


🚀 Your main responsibilities:

  • Develop and maintain a high-performance data processing platform for automotive data, ensuring scalability and reliability.
  • Design and implement data pipelines that process large volumes of data in both streaming and batch modes.
  • Optimize data workflows to ensure efficient data ingestion, processing, and storage using technologies such as Spark, Cloudera, and Airflow.
  • Work with data lake technologies (e.g., Iceberg) to manage structured and unstructured data efficiently.
  • Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of data sources.
  • Monitor and troubleshoot the platform, ensuring high availability, performance, and accuracy of data processing.
  • Leverage cloud services (AWS) for infrastructure management and scaling of processing workloads.
  • Write and maintain high-quality Python (or Java/Scala) code for data processing tasks and automation.

Required profile

Experience

Spoken language(s):
English

Other Skills

  • Decision Making
  • Collaboration
  • Problem Solving
