Requirements
Job Overview
The primary goals of our developers are efficiency, consistency, scalability and reliability.
We are responsible for the Platform … all the tooling integrations, security, access control, data classification/management, orchestration, the self-service “lab” concept, observability and reliability … as well as data availability (data ingestion).
We are NOT responsible for Data Modeling, Data Warehousing or Reporting (Power BI) … although we do work with the Power BI team on access control from Power BI to Snowflake.
Everything we do is achieved through code: nothing is manual (no ClickOps), everything is automated through our CI/CD framework of GitHub, GitHub Actions, Terraform and Python.
Orchestration is centrally managed using Managed Airflow (MWAA).
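For illustration only, a minimal DAG of the kind that runs on Managed Airflow might look like the sketch below; the DAG id, schedule and task are assumptions made for this example, not real platform pipelines.

```python
# Illustrative sketch (Airflow 2.4+ style): the DAG id, schedule and task
# below are assumptions, not actual platform pipelines.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def check_ingestion_landed() -> None:
    """Placeholder task: in practice this might reconcile row counts
    between source and target (data observability)."""
    print("reconciliation check goes here")


with DAG(
    dag_id="example_ingestion_check",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="check_ingestion_landed",
        python_callable=check_ingestion_landed,
    )
```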
We manage RBAC / Access Control.
We’re responsible for Tooling Integrations and all the connectivity and authentication requirements.
- Ingestion methods/patterns:
  - Fivetran
  - Snowflake Snowpipe for file-based sources (a minimal sketch follows this list)
  - Snowflake Secure Data Sharing
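As a hedged illustration of the Snowpipe pattern only (the connection details, stage, table and pipe names below are assumptions, not real platform objects), file-based ingestion is typically wired up as a pipe over an external stage:

```python
# Sketch only: connection details and object names (RAW.PUBLIC.EVENTS,
# S3_EVENTS_STAGE, EVENTS_PIPE) are assumptions for illustration.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",              # placeholder account
    user="svc_ingest",                 # placeholder user
    authenticator="externalbrowser",   # real auth method may differ
)

create_pipe_sql = """
CREATE PIPE IF NOT EXISTS RAW.PUBLIC.EVENTS_PIPE
  AUTO_INGEST = TRUE
AS
  COPY INTO RAW.PUBLIC.EVENTS
  FROM @RAW.PUBLIC.S3_EVENTS_STAGE
  FILE_FORMAT = (TYPE = 'PARQUET')
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
"""

cur = conn.cursor()
cur.execute(create_pipe_sql)  # Snowpipe then auto-loads new files as they land
cur.close()
conn.close()
```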
- Solid software development (full SDLC) experience with excellent coding skills:
  - Python (required)
  - Good knowledge of Git and GitHub (required)
  - Good code-management experience and best practices (required)
- Understanding of CI/CD to automate and improve the efficiency, speed and reliability of software delivery:
  - Best practices and principles
  - GitHub Actions, used to automate workflows directly from GitHub repositories
  - Automation of building, testing and deploying code, including code linting, security scanning and version management
  - Experience with testing frameworks (a short pytest-style sketch follows this list)
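As a sketch of the style of unit testing we expect CI to run on every push (the helper normalise_column_name below is invented purely for illustration, not a real platform function):

```python
# Hypothetical example of the style of unit test run by CI on every push;
# normalise_column_name() is an invented helper, not a real platform function.
import pytest


def normalise_column_name(name: str) -> str:
    """Lower-case a source column name and replace spaces with underscores."""
    if not name or not name.strip():
        raise ValueError("column name must not be empty")
    return name.strip().lower().replace(" ", "_")


@pytest.mark.parametrize(
    "raw, expected",
    [
        ("Order ID", "order_id"),
        ("  Created At  ", "created_at"),
        ("amount", "amount"),
    ],
)
def test_normalise_column_name(raw, expected):
    assert normalise_column_name(raw) == expected


def test_empty_name_rejected():
    with pytest.raises(ValueError):
        normalise_column_name("   ")
```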
- Good knowledge of IaC (Infrastructure as Code) using Terraform (required)
- Strong verbal and written communication skills are a must, ideally with the ability to communicate in both technical and business language
- A good level of experience with cloud technologies, specifically AWS: S3, Lambda, SQS, SNS, API Gateway (API development), networking (VPCs), PrivateLink and Secrets Manager (a minimal sketch of how some of these fit together follows below)
- Extensive hands-on experience engineering data pipelines, and a solid understanding of the full data supply chain: from discovery and analysis, through data ingestion, processing and transformation, to consumption and downstream data integration
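Purely as a hedged sketch of how some of those AWS services typically fit together in an ingestion flow (the queue environment variable and the handler below are assumptions): a Lambda handler that forwards S3 object-created notifications onto SQS for a downstream loader.

```python
# Illustrative Lambda handler only: the queue URL variable and this flow are
# assumptions. It forwards S3 object-created notifications onto SQS so a
# downstream loader can pick them up.
import json
import os

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = os.environ.get("LANDED_FILES_QUEUE_URL", "")  # hypothetical queue


def handler(event, context):
    """Triggered by an S3 ObjectCreated notification."""
    records = event.get("Records", [])
    for record in records:
        payload = {
            "bucket": record["s3"]["bucket"]["name"],
            "key": record["s3"]["object"]["key"],
            "size": record["s3"]["object"].get("size", 0),
        }
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(payload))
    return {"forwarded": len(records)}
```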
- A passion for continuous improvement and learning, and for optimisation in terms of cost, efficiency and ways of working; obsessed with data observability (aka data reconciliation), ensuring pipeline and data integrity
- Experience working with large structured and semi-structured datasets
- A good understanding of Parquet, Avro, JSON and XML (see the short Parquet example below)
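For example (the file path and the idea of a metadata check below are made up for illustration), a quick structural check on a Parquet extract might look like:

```python
# Sketch only: the file path is invented for illustration.
import pyarrow.parquet as pq

parquet_file = pq.ParquetFile("landing/orders/2024-01-01.parquet")  # hypothetical path
print(parquet_file.schema_arrow)                 # column names and types
expected_rows = parquet_file.metadata.num_rows   # row count held in file metadata

# Stream the file in batches rather than loading it all at once, and
# reconcile the streamed row count against the metadata (basic observability).
streamed_rows = 0
for batch in parquet_file.iter_batches(batch_size=10_000):
    streamed_rows += batch.num_rows

assert streamed_rows == expected_rows
```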
- Experience with Apache Airflow / MWAA or similar orchestration tooling
- Experience with Snowflake as a data platform:
  - Solid understanding of Snowflake architecture: compute, storage, micro-partitioning, etc.
  - Key features such as COPY INTO, Snowpipe, object-level tagging and masking policies (a hedged sketch follows this list)
  - RBAC (security model) design and administration – intermediate skill required
  - Query performance tuning and zero-copy cloning – nice to have
  - Virtual warehouse (compute) sizing
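As a hedged illustration of the masking-policy and RBAC administration items above (the role, policy, schema and table names are assumptions, not real platform objects), applied through the Python connector in keeping with the everything-as-code approach:

```python
# Sketch only: role, policy, schema and table names are invented to
# illustrate masking policies and RBAC grants via the Python connector.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",              # placeholder account
    user="svc_platform",               # placeholder user
    authenticator="externalbrowser",   # real auth method may differ
)
cur = conn.cursor()

# Column-level masking: unmask only for an approved role.
cur.execute("""
CREATE MASKING POLICY IF NOT EXISTS GOVERNANCE.POLICIES.EMAIL_MASK
  AS (val STRING) RETURNS STRING ->
  CASE WHEN CURRENT_ROLE() IN ('PII_READER') THEN val ELSE '***MASKED***' END
""")
cur.execute("""
ALTER TABLE RAW.PUBLIC.CUSTOMERS
  MODIFY COLUMN EMAIL SET MASKING POLICY GOVERNANCE.POLICIES.EMAIL_MASK
""")

# RBAC: grant read access to a functional role.
cur.execute("GRANT USAGE ON DATABASE RAW TO ROLE ANALYST_READ")
cur.execute("GRANT USAGE ON SCHEMA RAW.PUBLIC TO ROLE ANALYST_READ")
cur.execute("GRANT SELECT ON ALL TABLES IN SCHEMA RAW.PUBLIC TO ROLE ANALYST_READ")

cur.close()
conn.close()
```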
- T-SQL experience – ability to understand complex queries and think about optimisation – advantageous
- Data Modelling experience – advantageous
- Exposure to dbt (data build tool) for data transformations – advantageous
- Exposure to Alation or other Enterprise Metadata Management (EMM) tooling – advantageous
- Documentation: architectural designs, operational procedures and platform configurations, to ensure smooth onboarding and troubleshooting for team members