Title: Senior Data Engineer
Duration: 6 months plus extension (multiyear)
Location: Remote – client in CA
Description: We are looking for a talented Data Engineer to assist in migration, ETL and Data Integration.
Collaborate with product managers, data scientists, engineering, and program management teams to define a migration strategy, business deliverables and strategies for data products.
Collaborate with business partners, operations, senior management, etc. on day-to-day operational support
Support operational reporting, self-service data engineering efforts, production data pipelines, and business intelligence suite
Interface with multiple diverse stakeholders and gather/understand business requirements, assess feasibility and impact, and deliver on time with high quality
Help shape the technical direction of a domain within the Data organization
Build a reliable and scalable single source of truth data product for internal and external analytics and to enable self-service
Partner with product, data analytics, data science, engineering teams, and other data engineers to translate business requirements into data solutions
Develop and automate large-scale, high-performance data processing systems and visualization to ensure reliability and meet critical business requirements
Lead data engineering projects and the overall strategy for data governance, security, privacy, quality, and retention
Help hire and mentor data engineers and continue promoting data engineering and analytics tooling & standards
Use state of the art technologies to acquire, ingest and transform big datasets
Qualifications
5 to 8+ years of experience within the field of data engineering or related technical work, including business intelligence and analytics
Experience and comfort solving problems in an ambiguous environment where there is constant change. Have the tenacity to thrive in a dynamic and fast-paced environment, inspire change, and collaborate with a variety of individuals and organizational partners
Experience designing and building scalable and robust data pipelines to enable data-driven decisions for the business
Very good understanding of the full software development life cycle
Very good understanding of Data warehousing concepts and approaches
Experience in building Data pipelines and ETL approaches
Experience in building high-volume data workflows in a cloud environment
Experience in building Data warehouse and Business intelligence projects
Experience in data cleansing, data validation and data wrangling
Hands-on experience in AWS cloud and AWS native technologies such as Glue, Lambda, Kinesis, Lake Formation, S3, Redshift
Hands-on experience with Snowflake
Experience with Business Intelligence tools like Tableau, Cognos, ThoughtSpot, etc. is a plus
Hands-on experience building complex business logic and ETL workflows
Proficient in SQL, PL/SQL, relational databases (RDBMS), database concepts and dimensional modeling
Deep experience in data modeling, data architecture, and other areas directly relevant to data engineering - sometimes this is measured in years (7+), sometimes in the quality of the experience itself solving complex data challenges.
Technical leadership; capable of handling mentorship, cross functional project execution, and solid individual contributions
Proficiency with programming languages (e.g. Python), Data Warehouse technologies (NoSQL, logging, columnar, Snowflake, dbt, etc.), Big Data technologies (e.g. Hadoop, Spark, etc.), analytics (Metabase, Looker, Tableau, etc.), and data orchestration and schema management technologies (e.g. Avro, etc.). These are example technologies and not at all prescriptive.
Bachelor's degree in Engineering, Computer Science, Statistics, Economics, Mathematics, Finance, or a related quantitative field
Role and Responsibilities
Foundation and Infrastructure – Assist, where needed, in system configuration and provisioning of the AWS technical environment
Data Discovery and Ingestion – Identify tables and fields from legacy environments, extract and load into base storage in Amazon Studios landscape, define relevancy rules and filters based on business definitions
Data Profiling – Perform data profiling on ingested fields, determine and define candidates for identifying duplicate records
Deduplication – Analyze and score base tables to recommend a final dataset with redundant records removed
Assessment – Reporting and reviews of proposed dataset, final resolution of deduplication logic and relevancy rules
Enrichment – Enrich final dataset with additional 3rd party details or determine custom grouping or classifications
Storage – Populate the final result into a target AWS Redshift database for integration into other downstream consumers or analytic applications
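For candidates who want a concrete picture of the profiling and deduplication steps above, here is a minimal Python sketch. It is illustrative only: the field names, the sample records, and the "keep the most complete record" scoring rule are assumptions made for this example, not the client's actual logic. Ingestion from legacy systems and the load into Redshift are not shown.

```python
from collections import defaultdict

EMPTY = (None, "")

def profile(records, fields):
    """Return per-field fill rate and distinct-value count (raw values)."""
    total = len(records)
    stats = {}
    for f in fields:
        values = [r.get(f) for r in records if r.get(f) not in EMPTY]
        stats[f] = {
            "fill_rate": len(values) / total if total else 0.0,
            "distinct": len(set(values)),
        }
    return stats

def normalize(value):
    """Lowercase and collapse whitespace so near-identical values match."""
    return " ".join(str(value or "").lower().split())

def completeness(record, fields):
    """Hypothetical score: number of populated fields in the record."""
    return sum(1 for f in fields if record.get(f) not in EMPTY)

def deduplicate(records, key_fields, score_fields):
    """Group records on normalized key fields; keep the best-scored one per group."""
    groups = defaultdict(list)
    for r in records:
        key = tuple(normalize(r.get(f)) for f in key_fields)
        groups[key].append(r)
    return [
        max(group, key=lambda r: completeness(r, score_fields))
        for group in groups.values()
    ]

# Hypothetical sample data standing in for an ingested base table.
records = [
    {"id": 1, "name": "Acme Corp", "city": "Los Angeles", "phone": None},
    {"id": 2, "name": "acme  corp", "city": "los angeles", "phone": "555-0100"},
    {"id": 3, "name": "Globex", "city": "Burbank", "phone": ""},
]

stats = profile(records, ["name", "city", "phone"])
deduped = deduplicate(records, ["name", "city"], ["name", "city", "phone"])
```

In this sketch, records 1 and 2 share a normalized key, and record 2 is kept because it has more populated fields. In practice the key fields and scoring rule would come out of the profiling and assessment steps.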
Required profile
Experience
Level of experience: Senior (5-10 years)
Spoken language(s): English