Job Description: Spark Developer
Experience: 3 to 5 years
Work Location: Bangalore
Responsibilities
• As a Spark developer, you will manage the development of the scalable distributed architecture defined by the Architect or Tech Lead on our team.
• Analyse and assemble large data sets to meet functional and non-functional requirements.
• Develop ETL scripts for big data sources.
• Identify, design, and optimise data processing, and automate reports and dashboards.
• Own workflow, data, and ETL optimisation as per the requirements set out by the team.
• Work with stakeholders such as Product Managers, Technical Leads, and Service Layer Engineers to ensure end-to-end requirements are addressed.
• Be a strong team player who adheres to the Software Development Life Cycle (SDLC) and produces the documentation needed to represent every stage of it.
General Qualifications:
• BE/BTech/MCA/MTech/MSc with 3-5 years of experience in Apache Spark development or in alternative data engineering and analytics frameworks.
• Overall experience as a Java/Scala backend developer should be at least 3 years.
• Working experience in big data is important.
Technical Skills:
• Programming experience in Java/Scala is required; experience in Python is an added advantage.
• Experience writing an ETL stack in a scalable, optimized fashion using Apache Spark, Hadoop, Kafka, etc. (a sketch of such a job follows this list).
• Working experience writing distributed, optimized Apache Spark jobs for various machine learning algorithms.
• Experience building and optimizing data pipelines and data sets is essential.
• Should have worked with Kafka, at least one NoSQL database (e.g. Cassandra, MongoDB, Elasticsearch), and at least one RDBMS (e.g. MySQL).
• Working experience with caches such as Redis, Apache Ignite, or Hazelcast is an advantage.
• Working knowledge of Docker and Kubernetes is a plus.
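To make the ETL expectation above concrete, here is a minimal sketch of the kind of batch Spark ETL job a candidate should be comfortable writing. The input path, column names (region, amount, sale_date), and output location are hypothetical, chosen only to illustrate the extract-transform-load shape:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object SalesEtlJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sales-etl")
      .getOrCreate()

    // Extract: read raw CSV events from a distributed store (assumed path).
    val raw = spark.read
      .option("header", "true")
      .csv("hdfs:///data/raw/sales/")

    // Transform: drop bad rows, cast types, aggregate per region and day.
    val daily = raw
      .filter(col("amount").isNotNull)
      .withColumn("amount", col("amount").cast("double"))
      .groupBy(col("region"), col("sale_date"))
      .agg(sum("amount").as("daily_total"))

    // Load: write partitioned Parquet for downstream reports and dashboards.
    daily.write
      .mode("overwrite")
      .partitionBy("sale_date")
      .parquet("hdfs:///data/curated/daily_sales/")

    spark.stop()
  }
}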
Kindly find below the inputs from the customer to help source quality profiles:
Include these criteria apart from Spark → experience in NoSQL databases (Elasticsearch OR Cassandra OR MongoDB) and Apache Kafka.
Look for Spark Streaming (candidates who have used Sqoop likely don't know Spark Streaming).
Scala- or Java-based work experience using Spark is a MUST. Python usage with Spark is optional, not a must.
Spark Streaming and Kafka experience is mandatory for this requirement. Kindly add more profiles with Spark Streaming, Kafka, and NoSQL experience (see the sketch below).
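Since Spark Streaming with Kafka is the mandatory screening criterion, below is a minimal Structured Streaming sketch of the kind of job we would probe for. The broker address, topic name, sink, and checkpoint path are assumptions for illustration, and it presumes the spark-sql-kafka connector is on the classpath:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ClickstreamStreamingJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clickstream-streaming")
      .getOrCreate()

    // Source: subscribe to a Kafka topic (broker and topic are assumed).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "clickstream")
      .load()

    // Kafka delivers key/value as binary; decode the value payload.
    val decoded = events.selectExpr(
      "CAST(value AS STRING) AS payload",
      "timestamp")

    // Windowed count with a watermark, as a stand-in for real business logic.
    val counts = decoded
      .withWatermark("timestamp", "10 minutes")
      .groupBy(window(col("timestamp"), "5 minutes"))
      .count()

    // Sink: stream results out with checkpointing for fault tolerance.
    val query = counts.writeStream
      .outputMode("update")
      .format("console")                         // swap for a real sink
      .option("checkpointLocation", "/tmp/chk")  // assumed path
      .start()

    query.awaitTermination()
  }
}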