
Data Engineer - Optimization Solutions (Remote)

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

3+ years of experience as a data engineer; high proficiency in Python (required); extensive SQL and database experience; experience with data pipeline tools; hands-on experience with cloud platforms.

Key responsibilities:

  • Design and maintain data pipelines
  • Collaborate with data scientists on needs
  • Develop data quality checks and systems
  • Optimize storage and retrieval processes
  • Stay updated on industry best practices
codvo.ai
51 - 200 Employees

Job description

Company Overview:

At Codvo, software and people transformations go together. We are a global, empathy-led technology services company with a core DNA of product innovation and mature software engineering. We uphold the values of Respect, Fairness, Growth, Agility, and Inclusiveness in everything we do.

Data Engineer - Optimization Solutions

About the Role:

Responsibilities:

Design, build, and maintain robust and scalable data pipelines to support the development and deployment of mathematical optimization models.

Collaborate closely with data scientists to deeply understand the data requirements for optimization models.

This includes:

  • Data preprocessing and cleaning
  • Feature engineering and transformation
  • Data validation and quality assurance

Develop and implement comprehensive data quality checks and monitoring systems to guarantee the accuracy and reliability of the data used in our optimization solutions.

Optimize data storage and retrieval processes for highly efficient model training and execution.

Work effectively with large-scale datasets, leveraging distributed computing frameworks when necessary to handle data volume and complexity.

Stay up to date on the latest industry best practices and emerging technologies in data engineering, particularly in the areas of optimization and machine learning.

Experience:

3+ years of demonstrable experience working as a data engineer, specifically focused on building and maintaining complex data pipelines.

Proven track record of successfully working with large-scale datasets, ideally in environments utilizing distributed systems.

Technical Skills - Essential:

Programming: High proficiency in Python is essential. Experience with additional scripting languages (e.g., Bash) is beneficial.

Databases: Extensive experience with SQL and relational database systems (PostgreSQL, MySQL, or similar). You should be very comfortable with:

  • Writing complex and efficient SQL queries
  • Understanding performance optimization techniques for databases
  • Applying schema design principles

  • Data Pipelines: Solid understanding and practical experience in building and maintaining data pipelines using modern tools and frameworks. Experience with the following is highly desirable:
      • Workflow management tools like Apache Airflow
      • Data streaming systems like Apache Kafka
  • Cloud Platforms: Hands-on experience working with major cloud computing environments such as AWS, Azure, or GCP. You should have a strong understanding of:
      • Cloud-based data storage solutions (Amazon S3, Azure Blob Storage, Google Cloud Storage)
      • Cloud compute services
      • Cloud-based data warehousing solutions (Amazon Redshift, Google BigQuery, Snowflake)

Technical Skills - Advantageous (Not Required, But Highly Beneficial):

  • NoSQL Databases: Familiarity with NoSQL databases like MongoDB, Cassandra, and DynamoDB, along with an understanding of their common use cases.
  • Containerization: Understanding of containerization technologies such as Docker and container orchestration platforms like Kubernetes.
  • Infrastructure as Code (IaC): Experience using IaC tools such as Terraform or CloudFormation.
  • Version Control: Proficiency with Git or similar version control systems.

Additional Considerations:

Industry Experience: While not a strict requirement, experience working in industries with a focus on optimization, logistics, supply chain management, or similar domains would be highly valuable.

Machine Learning Operations (MLOps): Familiarity with MLOps concepts and tools is increasingly important for data engineers in machine learning-focused environments.

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Spoken language(s):
English
