REQ-10075895
Apr 21, 2026

About the Role

Key Responsibilities:

  • Design, develop, and optimize data pipelines using Python / PySpark to process and analyze large datasets.
  • Write complex SQL queries for data extraction, transformation, and loading (ETL).
  • Work with Databricks to build and maintain collaborative and scalable data solutions.
  • Implement and manage CI/CD processes for data pipeline deployments to ensure reliable, efficient integration and delivery.
  • Collaborate with data scientists and business analysts to understand data requirements and deliver appropriate solutions.
  • Ensure data quality, integrity, and security across all data processes.
  • Monitor and troubleshoot data pipelines and workflows to resolve issues promptly.
  • Continuously improve data and code quality through automation and best practices.
  • Ensure projects are delivered on schedule.
  • Aid in the creation and maintenance of Standard Operating Procedures (SOPs). Support the development and upkeep of knowledge repositories that capture both qualitative and quantitative reports.

Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related field, with 4+ years of relevant work experience.
  • Proven experience with PySpark, including developing and tuning data processing applications.
  • Advanced proficiency in SQL and experience in writing complex queries and optimizing them for performance.
  • Hands-on experience with Databricks, including notebooks, clusters, and integration with other data tools.
  • Strong understanding of CI/CD pipelines and experience with tools such as Jenkins, GitLab CI/CD, or Azure DevOps.
  • Familiarity with cloud platforms (e.g., AWS, Azure, Google Cloud) and related data services.
  • Good understanding of data quality management concepts.
  • Ability to lead and own engagements independently with excellent problem-solving skills and attention to detail.
  • Strong communication and collaboration skills, with the ability to work effectively in an Agile team environment.

Preferred Skills:

  • Understanding of healthcare/life sciences domain data and familiarity with the pharma ecosystem.
  • Knowledge of data warehousing concepts and tools (e.g., Snowflake, Redshift).
  • Knowledge of the Kedro framework is a plus.
  • Understanding and application of effective data governance methods.

Why Novartis: Helping people with disease and their families takes more than innovative science. It takes a community of smart, passionate people like you. Collaborating, supporting and inspiring each other. Combining to achieve breakthroughs that change patients’ lives. Ready to create a brighter future together? https://www.novartis.com/about/strategy/people-and-culture

Benefits and Rewards: Learn about all the ways we’ll help you thrive personally and professionally.
Read our handbook (PDF 30 MB)

Division: Marketing
Location: Hyderabad (Office)
Full time
Regular

Data Steward
