
Lead Data Engineer - Remote for candidates in Serbia

Remote: Full Remote
Contract:
Experience: Senior (5-10 years)
Work from: Serbia

Offer summary

Qualifications:

  • 8+ years of experience in Data Engineering and Big Data
  • Proven work experience with ETL Development or similar areas
  • Advanced knowledge of Database Engines, Data Lakes, and data practices
  • Hands-on experience with programming languages such as Python, Java, Scala, and SQL
  • Bachelor's degree in a related field

Key responsibilities:

  • Create Extraction Models using document processing products
  • Conduct comprehensive document analysis for improved extraction accuracy
  • Define labeling requirements and manage the labeling process
  • Utilize metalanguage to create flexible document descriptions
  • Possibly train neural networks to enhance performance of extraction models
Antal International | Human Resources, Staffing & Recruiting | 1001 - 5000 Employees | https://www.antal.com/

Job description


Your missions

About the company:



Founded in the early 2000s, our client is a digital healthcare marketing agency dedicated to enhancing HCP engagement. Its team of over 120 marketing experts has in-depth knowledge of the ever-evolving healthcare landscape, intricate content, and the strategic use of data to achieve significant results. The agency's continued success rests on its proficiency in omnichannel strategies and a comprehensive range of digital services, including Account Services, Creative, Digital and Brand Strategy, Digital Production and Project Management, 3D Animation, Digital Development, and Video Production.





Job Responsibilities:




  • Create Extraction Models using document processing products (Advanced Designer, Vantage): This involves using software to design and implement extraction models for various document types.

  • Comprehensive Document Analysis: Conduct in-depth analysis of documents based on provided requirements. This may include classifying, sorting, and generalizing document layouts to improve extraction accuracy.

  • Creating Requirements for Document Labelling: Define the labeling requirements for documents and manage the labeling process. This is crucial for training and improving the accuracy of extraction models.

  • Creating Flexible Descriptions Using Metalanguage: Use the document processing tool's metalanguage to create flexible document descriptions. This is a key part of designing extraction models.

  • Training Neural Networks: May be involved in training neural networks to enhance the performance of document extraction models.

Required Skills:

  • Strong Analytical Skills: Ability to analyze documents, requirements, and data to design effective extraction models.

  • Basic Understanding of Data Formats (XML, JSON, and CSV): Familiarity with common data formats used in document processing (a brief sketch follows this list).

  • Command Line and PowerShell Proficiency: Proficiency in using command line interfaces and PowerShell for automation and scripting.

  • Experience with Git: Knowledge of version control systems like Git for collaborative software development.

  • Readiness to Acquire New Information: Willingness and ability to learn new information and adapt to changing technologies and requirements.

  • Capacity to Comprehend Novel Content: Ability to understand complex and unfamiliar content.

  • Ability to Organize Substantial Volumes of Data: Effective organization and management of large datasets.

  • Team Collaboration: Ability to work effectively within a team, as indicated by the requirement for collaboration.
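
Since the role calls for familiarity with XML, JSON, and CSV, the following minimal Python sketch shows one way extracted document records might be read from each format using only the standard library. The field names, file paths, and flat-XML layout are illustrative assumptions, not part of the job description.

    # Minimal sketch (assumptions only): reading a hypothetical extracted
    # record from JSON, CSV, and XML with the Python standard library.
    import csv
    import json
    import xml.etree.ElementTree as ET

    def read_json(path: str) -> dict:
        # e.g. {"invoice_id": "A-1", "total": "99.50"}
        with open(path, encoding="utf-8") as f:
            return json.load(f)

    def read_csv(path: str) -> list:
        # One dict per row, keyed by the header line.
        with open(path, newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))

    def read_xml(path: str) -> dict:
        # Assumes a flat layout such as <invoice><invoice_id>...</invoice_id></invoice>.
        root = ET.parse(path).getroot()
        return {child.tag: child.text for child in root}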





As an advantage:






  • Basic Knowledge of Programming: Familiarity with programming concepts may be advantageous for working with extraction models and automation.

Requirements:

  • 8+ years of experience in Data Engineering and Big Data: The ideal candidate should have substantial hands-on experience in the field.

  • Proven work experience with ETL Development or similar areas: Experience in Extract, Transform, Load (ETL) processes or related data integration tasks is necessary.

  • Advanced knowledge of Database Engines, Data Lakes, data practices, and policies: A deep understanding of various database technologies, data lakes, and industry best practices is expected.

  • In-depth understanding of database structure principles, integration, ETL, ELT, data processing, and data management: Proficiency in database principles, ETL/ELT concepts, data processing, and data management is crucial.

  • Hands-on experience with programming languages: Proficiency in programming languages such as Python, Java, Scala, and SQL is necessary for building data pipelines and performing data analysis.

  • Experienced in assembling large, complex data sets: The candidate should be able to gather and work with large and complex datasets that fulfill both functional and non-functional business requirements.

  • Experienced in building the infrastructure for data extraction, transformation, and loading: Building the necessary infrastructure for efficient data extraction, transformation, and loading from various data sources is a key skill (a minimal pipeline sketch follows this list).

  • Bachelor's degree in a related field: A bachelor's degree in a relevant field is typically a minimum educational requirement.

  • Analytical skills and strong organizational abilities: Strong analytical skills and organizational abilities are essential for effective data engineering.
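
As a rough illustration of the ETL experience described above, here is a minimal Python sketch that extracts rows from a CSV file, applies a small transformation, and loads them into a local SQLite table. The file name, table name, and column names are hypothetical assumptions and are not taken from the job description.

    # Minimal ETL sketch (illustrative assumptions only).
    import csv
    import sqlite3

    def extract(path: str):
        # Stream rows from the source CSV as dictionaries.
        with open(path, newline="", encoding="utf-8") as f:
            yield from csv.DictReader(f)

    def transform(rows):
        # Normalise a few hypothetical columns before loading.
        for row in rows:
            yield (row["id"], row["name"].strip().lower(), float(row["amount"]))

    def load(records, db_path: str = "warehouse.db"):
        # Create the target table if needed and bulk-insert the records.
        with sqlite3.connect(db_path) as conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS sales (id TEXT, name TEXT, amount REAL)"
            )
            conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)

    if __name__ == "__main__":
        load(transform(extract("sales.csv")))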






Additional Perks:






  • Flexible and remote working options to accommodate different lifestyles.

  • Flexible working hours to support work-life balance.

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Human Resources, Staffing & Recruiting
Spoken language(s):

Soft Skills

  • Analytical Skills
  • Organizational Skills
  • Reading Comprehension
  • Willingness To Learn
