Data Steward
REQ-10075895
Apr 21, 2026
LOC_IN
About the Role
Key Responsibilities:
- Design, develop, and optimize data pipelines using Python / PySpark to process and analyze large datasets.
- Write complex SQL queries for data extraction, transformation, and loading (ETL).
- Work with Databricks to build and maintain collaborative and scalable data solutions.
- Implement and manage CI/CD processes so that data pipeline changes are integrated and deployed seamlessly and efficiently.
- Collaborate with data scientists and business analysts to understand data requirements and deliver appropriate solutions.
- Ensure data quality, integrity, and security across all data processes.
- Monitor and troubleshoot data pipelines and workflows to resolve issues promptly.
- Continuously improve data and code quality through automation and best practices.
- Ensure projects are delivered on schedule.
- Aid in the creation and maintenance of Standard Operating Procedures (SOPs).
- Support the development and upkeep of knowledge repositories that capture both qualitative and quantitative reports.
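For a flavor of the ETL work the responsibilities above describe, here is a minimal sketch in plain Python. The standard-library sqlite3 module stands in for a real SQL engine such as Databricks SQL, and the table and column names are hypothetical:

```python
import sqlite3

# Illustrative ETL sketch: extract raw rows, transform with SQL, load a target table.
# sqlite3 is a stand-in for a production warehouse; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_sales VALUES (?, ?)",
    [("APAC", 120.0), ("EMEA", 80.0), ("APAC", 30.0)],
)

# Transform: aggregate sales per region and load the result into a new table.
conn.execute(
    """
    CREATE TABLE region_totals AS
    SELECT region, SUM(amount) AS total_amount
    FROM raw_sales
    GROUP BY region
    """
)

# Verify the load by reading the transformed table back.
rows = dict(conn.execute("SELECT region, total_amount FROM region_totals"))
print(rows)
```

In a PySpark pipeline the same aggregation would typically be a `groupBy(...).agg(...)` over a DataFrame, with the query engine handling distribution across the cluster.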
Qualifications:
- Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related field with 4+ years of relevant work experience.
- Proven experience with PySpark, including developing and tuning data processing applications.
- Advanced proficiency in SQL and experience in writing complex queries and optimizing them for performance.
- Hands-on experience with Databricks, including notebooks, clusters, and integration with other data tools.
- Strong understanding of CI/CD pipelines and experience with tools such as Jenkins, GitLab CI/CD, or Azure DevOps.
- Familiarity with cloud platforms (e.g., AWS, Azure, Google Cloud) and related data services.
- Good understanding of data quality management concepts.
- Ability to lead and own engagements independently with excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills, with the ability to work effectively in an Agile team environment.
Preferred Skills:
- Understanding of healthcare / life sciences domain data and knowledge of the pharma ecosystem.
- Knowledge of data warehousing concepts and tools (e.g., Snowflake, Redshift).
- Familiarity with the Kedro framework is a plus.
- Understanding of and ability to apply effective data governance methods.
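To make the data quality and governance skills above concrete, a minimal rule-based validation sketch in plain Python (the rules, field names, and sample records are hypothetical):

```python
# Hypothetical dataset and quality rules, sketching rule-based data-quality checks.
records = [
    {"patient_id": "P001", "age": 34},
    {"patient_id": "", "age": 34},      # missing identifier
    {"patient_id": "P003", "age": -5},  # out-of-range value
]

# Each rule maps a record to True (passes) or False (violates).
rules = {
    "patient_id_present": lambda r: bool(r["patient_id"]),
    "age_in_range": lambda r: 0 <= r["age"] <= 120,
}

# Count violations per rule across the dataset.
violations = {
    name: sum(1 for r in records if not check(r))
    for name, check in rules.items()
}
print(violations)
```

In practice such checks are usually expressed in a dedicated framework (e.g. Delta Lake constraints or a validation library) rather than hand-rolled, but the rule-per-field structure is the same.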
Role Requirements
Why Novartis: Helping people with disease and their families takes more than innovative science. It takes a community of smart, passionate people like you. Collaborating, supporting and inspiring each other. Combining to achieve breakthroughs that change patients’ lives. Ready to create a brighter future together? https://www.novartis.com/about/strategy/people-and-culture
Benefits and Rewards: Learn about all the ways we’ll help you thrive personally and professionally.
Read our handbook (PDF 30 MB)
DIV_IM
Marketing
Hyderabad (Office)
IN10 (FCRS = IN010) Novartis Healthcare Private Limited
FCT_MM
Full time
Regular
No