Python Gen PySpark API and SQL

Remote: Full Remote
Contract:
Work from:

Offer summary

Qualifications:

  • Any graduation with 3-10 years of experience.
  • Proficiency in Python, especially with data libraries like Pandas and NumPy.
  • Solid hands-on experience with PySpark and Spark SQL for distributed data processing.
  • Expertise in writing complex SQL queries and designing scalable data pipelines.

Key responsibilities:

  • Develop and optimize PySpark jobs for big data processing.
  • Build scalable batch or streaming data pipelines using PySpark.
  • Develop REST APIs for data access and automation using frameworks like FastAPI or Flask.
  • Automate ETL workflows and integrate them into orchestration tools like Airflow.

Black and White Business Solutions Private Ltd | Human Resources, Staffing & Recruiting | SME | https://www.blackwhite.in
51 - 200 Employees

Job description

Company Name :

Black and White Business Solutions Private Ltd

Job Title :

Python Gen PySpark API and SQL

Qualification :

Any graduation

Experience :

3-10 YEARS

Must Have Skills :

  • Python (Core + Data Libraries) – Proficiency in Python, especially for data manipulation using Pandas, NumPy, etc.

  • PySpark – Solid hands-on experience with distributed data processing using PySpark and Spark SQL (see the illustrative sketch after this list).

  • RESTful API Development – Ability to build and consume APIs using Flask, FastAPI, or Django.

  • SQL (Advanced) – Expertise in writing complex SQL queries, tuning, and data modeling.

  • ETL and Data Pipelines – Experience in designing and implementing scalable data pipelines in distributed environments.
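
Because the items above lean heavily on PySpark and Spark SQL, here is a minimal, illustrative sketch of that kind of distributed aggregation; the input path, table, and column names are assumptions for the example, not details from this posting.

from pyspark.sql import SparkSession, functions as F

# Start a Spark session (cluster-specific configuration would differ in production).
spark = SparkSession.builder.appName("daily_revenue").getOrCreate()

# Hypothetical input: an orders dataset with order_date, region, status, and amount columns.
orders = spark.read.parquet("s3://example-bucket/orders/")

# DataFrame API: daily revenue per region for completed orders.
daily_revenue = (
    orders
    .filter(F.col("status") == "COMPLETED")
    .groupBy("order_date", "region")
    .agg(F.sum("amount").alias("revenue"))
)

# The same aggregation expressed in Spark SQL via a temporary view.
orders.createOrReplaceTempView("orders")
daily_revenue_sql = spark.sql("""
    SELECT order_date, region, SUM(amount) AS revenue
    FROM orders
    WHERE status = 'COMPLETED'
    GROUP BY order_date, region
""")

# Persist the result, partitioned by date, for downstream reporting.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/reports/daily_revenue/"
)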


Good to Have Skills :

  • Apache Airflow or Other Workflow Orchestration Tools – Knowledge of scheduling and monitoring data pipelines (a brief illustrative DAG sketch follows this list).

  • Delta Lake / Apache Hudi / Data Lakehouse Architecture – Familiarity with modern data storage formats.

  • Cloud Platforms (AWS/GCP/Azure) – Experience working with cloud-based data services like AWS EMR, Azure Synapse, or GCP Dataproc.

  • Data Quality and Validation Frameworks – Use of tools like Great Expectations or custom validations.

  • Containerization (Docker/Kubernetes) – Understanding of containerizing Spark or API services for deployment.
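
For the Airflow item above, a minimal DAG sketch showing how a daily PySpark ETL step could be scheduled; the DAG id, schedule, and callable are hypothetical and only illustrate the orchestration pattern.

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def run_daily_etl(**context):
    # Placeholder: in practice this step would submit the PySpark job,
    # for example via spark-submit or a managed cluster API.
    print(f"Running ETL for logical date {context['ds']}")

with DAG(
    dag_id="daily_orders_etl",        # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="run_daily_etl",
        python_callable=run_daily_etl,
    )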


Roles and Responsibilities :

  • Develop and Optimize PySpark Jobs
    Build scalable batch or streaming data pipelines using PySpark for big data processing.

  • API Design and Integration
    Develop REST APIs for data access and automation using Python frameworks like FastAPI or Flask (see the sketch after this list).

  • SQL Development and Tuning
    Write, optimize, and maintain complex SQL queries for data extraction, transformation, and reporting.

  • Data Pipeline Automation
    Build automated ETL workflows and integrate them into orchestration tools like Airflow or cloud-native solutions.

  • Collaboration and Documentation
    Work closely with data engineers, analysts, and business stakeholders; maintain clear documentation for code, APIs, and processes.
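
To make the API responsibility concrete, a minimal FastAPI sketch that exposes aggregated data over REST; the endpoint path, response model, and in-memory sample data are assumptions standing in for a real warehouse query layer.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Data Access API")  # hypothetical service name

class RevenueRecord(BaseModel):
    order_date: str
    region: str
    revenue: float

# A small in-memory sample stands in for the warehouse or cached table a real service would query.
SAMPLE_DATA = [
    RevenueRecord(order_date="2024-01-01", region="south", revenue=120000.0),
    RevenueRecord(order_date="2024-01-01", region="north", revenue=95000.0),
]

@app.get("/revenue/{region}", response_model=list[RevenueRecord])
def get_revenue(region: str):
    # Filter the sample data by region and return 404 when nothing matches.
    records = [r for r in SAMPLE_DATA if r.region == region]
    if not records:
        raise HTTPException(status_code=404, detail="No data for region")
    return records

Such a service could be run locally with, for example, uvicorn; Flask would serve the same purpose with a slightly different routing style.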


Location :

Hyderabad, Bangalore, and Chennai

CTC Range :

20-30 LPA

Notice period :

Immediate

Shift Timings :


Mode of Interview :

VIRTUAL

Mode of Work :


Mode of Hire :


Note :




Required profile

Experience

Industry :
Human Resources, Staffing & Recruiting
Spoken language(s):
English

Other Skills

  • Collaboration
