Senior Cloud Data Developer

Remote: Full Remote

Offer summary

Qualifications:

  • Minimum of 10 years of relevant experience in data engineering or a related field.
  • Strong skills in cloud-based data warehousing platforms, particularly Databricks and AWS.
  • Proficiency in SQL (Databricks dialect preferred) and Python/PySpark for data processing.
  • In-depth knowledge of data modeling, monitoring, and logging practices.

Key responsibilities:

  • Design, develop, and maintain PySpark data pipelines for genomic and healthcare data.
  • Optimize data warehousing schemas and conduct performance optimizations for efficient querying.
  • Implement and monitor data quality checks to ensure high data integrity across the pipeline.
  • Collaborate with team members to resolve issues and improve system performance, providing training and documentation as needed.

Geisinger XLarge https://www.geisinger.org/
10001 Employees

Job description

Location:

Work from home (Pennsylvania)

Shift:

Days (United States of America)

Scheduled Weekly Hours:

40

Worker Type:

Regular

Exemption Status:

Yes

Job Summary:

Geisinger Genomic Science is developing new software tools to support both research and clinical applications of population-scale genomic data with the goal of integrating genomic data into routine care. As part of a collaborative and innovative team, this is an opportunity to build systems from the ground up in a dynamic environment that marries the energy of a start-up to the stability and resources of a large healthcare system.

We are seeking a skilled Data Engineer with experience in cloud-based data warehouse solutions, particularly the Databricks platform and AWS. The ideal candidate will be responsible for modeling our genomic and other related data within the Databricks environment, setting up monitoring and logging, optimizing Spark operations, database schemas, and warehouse performance, and providing general Databricks computing support for scientific and AI/ML research.


Responsibilities:

Reviews, analyzes, modifies, creates, debugs, designs, and tests data ingestion and processing pipelines using modern development methodologies and tools. Provides key insight in design and application architecture discussions. Serves as a key player on a multidisciplinary team and engages in all aspects of the application development life cycle as needed for each project to succeed.

Job Duties:

  • Design, develop, and maintain PySpark data pipelines for modeling genomic and healthcare data.

  • Define and optimize data warehousing schemas to support efficient querying, storage, and analytical workflows.

  • Conduct performance optimizations of the data warehouse by utilizing best practices for schema design, indexing, query optimization, and resource management.

  • Leverage Databricks for data engineering tasks, including building, managing, and orchestrating data pipelines and optimizing data processing workflows.

  • Ensure scalability and reliability of the data architecture to handle large datasets in genomic and healthcare domains.

  • Implement and monitor data quality checks to ensure high data integrity across the pipeline and data warehouse.

  • Provide general Databricks computing support, including helping users determine compute resource needs, debugging, and optimizing Spark/SQL code.

  • Stay up-to-date with advancements in data warehousing technologies and tools to drive continuous improvements in performance and efficiency.

  • Implements code and documents all system changes based on assignments and timelines.

  • Provides code review, testing, debugging, technical documentation, general testing instructions, go-live planning, assistance in go-live moves, and post-live support.

  • Communicates all progress, roadblocks, and issues to the team and management in a timely manner.

  • Collaborates with team members to identify and resolve issues, improve system performance, and ensure code quality.

  • Provides training and documentation to team members on new development assignments.

  • Trains and mentors staff members.

  • Assists in the development and documentation of a technical approach for a solution to the team stakeholders.

Work is typically performed in an office environment. Accountable for satisfying all job specific obligations and complying with all organization policies and procedures. The specific statements in this profile are not intended to be all-inclusive. They represent typical elements considered necessary to successfully perform the job.


*Relevant experience may be a combination of related work experience and degree obtained (Associate’s Degree = 2 years; Bachelor’s Degree = 4 years; Master’s Degree = 6 years).

Position Details:

Highly Preferred:

  • Experience with cloud-based data warehousing platforms (Databricks preferred).
  • Strong skills with data analytics engines for large-scale data (Apache Spark preferred).
  • SQL (Databricks dialect preferred) and Python/PySpark.
  • In-depth knowledge of data modeling, monitoring, and logging practices.
  • Familiarity with AWS or other cloud services.
  • Excellent communication skills to convey progress, roadblocks, and issues to the team and management.
  • Ability to work effectively as part of a multidisciplinary team.

 

Education:

High School Diploma or Equivalent (GED) (Required)

Experience:

Minimum of 10 years of relevant experience* (Required)

Certification(s) and License(s):

Skills:

OUR PURPOSE & VALUES: Everything we do is about caring for our patients, our members, our students, our Geisinger family and our communities.

  • KINDNESS: We strive to treat everyone as we would hope to be treated ourselves.
  • EXCELLENCE: We treasure colleagues who humbly strive for excellence.
  • LEARNING: We share our knowledge with the best and brightest to better prepare the caregivers for tomorrow.
  • INNOVATION: We constantly seek new and better ways to care for our patients, our members, our community, and the nation.
  • SAFETY: We provide a safe environment for our patients and members and the Geisinger family. 

We offer healthcare benefits for full-time and part-time positions from day one, including vision, dental and domestic partners. Perhaps just as important, we encourage an atmosphere of collaboration, cooperation and collegiality.

We know that a diverse workforce with unique experiences and backgrounds makes our team stronger. Our patients, members and community come from a wide variety of backgrounds, and it takes a diverse workforce to make better health easier for all. We are proud to be an affirmative action, equal opportunity employer, and all qualified applicants will receive consideration for employment regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, disability or status as a protected veteran.

Required profile

Experience

Spoken language(s):
English

Other Skills

  • Teamwork
  • Communication
