Senior Data Engineer
As a Senior Data Engineer at SentraAI, you are responsible for designing, building, and owning production-grade data pipelines that underpin enterprise analytics, AI, and regulatory workloads.
You operate in complex, governed enterprise environments where reliability, data quality, performance, and auditability are non-negotiable. This is a senior hands-on engineering role with clear ownership expectations. You will lead by example, shape engineering standards, and act as a technical reference point for less experienced engineers.
This role is for engineers who want end-to-end accountability, not just implementation tasks.
About SentraAI
SentraAI is a specialist enterprise AI and data services firm, focused on helping large, regulated organisations move AI and data platforms from experimentation into production, safely and sustainably.
We work inside enterprise run-states, where governance, operational risk, change control, and long-term ownership are integral to delivery. Our teams are trusted to build platforms, systems, and operating models that can be run, audited, and evolved, not just launched.
We prioritise engineering discipline, clarity of responsibility, and delivery quality over speed theatre or hype.
Key Responsibilities
Data Engineering & Technical Ownership
• Design, develop, and optimise Spark-based data pipelines (PySpark preferred) for enterprise-scale workloads
• Own the implementation of Bronze, Silver, and Gold data layers, ensuring clear contracts between layers
• Lead the modernisation of legacy data workflows into robust, cloud-native pipelines
• Design and implement incremental, CDC, and event-driven ingestion patterns where appropriate (a minimal sketch follows)
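For orientation, here is a minimal sketch of the kind of incremental, layered pipeline work this covers. It assumes Delta Lake on Spark, and every table, column, and checkpoint value (bronze.orders, silver.orders, order_id, _ingested_at) is illustrative rather than part of this posting:

```python
# A minimal sketch, assuming Delta Lake on Spark; all names are hypothetical.
from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Incremental pattern: pick up only records that arrived since the
# last successful run (the watermark would come from stored job state).
last_run = "2024-01-01T00:00:00"
updates = (
    spark.table("bronze.orders")
    .where(F.col("_ingested_at") > F.lit(last_run))
    .dropDuplicates(["order_id"])  # layer contract: one row per key
)

# Upsert into Silver so downstream consumers always see a clean,
# deduplicated view, however messy the raw arrivals were.
(
    DeltaTable.forName(spark, "silver.orders")
    .alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```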
Data Quality, Performance & Reliability
• Define and enforce data quality rules, validation frameworks, and testing strategies
• Ensure pipelines are performant, cost-aware, and resilient under load
• Diagnose complex pipeline failures and performance issues using a root-cause approach
• Ensure all data assets are traceable, explainable, and suitable for regulated consumption (illustrated below)
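Likewise, a minimal illustration of a pipeline-embedded quality gate in plain PySpark; the rules and table name are hypothetical, and a dedicated framework such as Great Expectations or dbt tests would play the same role in a fuller setup:

```python
# A minimal quality gate in plain PySpark; rules and table name are invented.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-gate").getOrCreate()
df = spark.table("silver.orders")

checks = {
    "order_id is never null": df.where(F.col("order_id").isNull()).limit(1).count() == 0,
    "amount is non-negative": df.where(F.col("amount") < 0).limit(1).count() == 0,
}

failures = [name for name, passed in checks.items() if not passed]
if failures:
    # Fail fast so bad data never reaches the Gold layer.
    raise ValueError(f"Data quality gate failed: {failures}")
```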
Engineering Standards & Leadership
• Set and uphold coding, testing, and documentation standards across the squad
• Lead code reviews with a focus on correctness, maintainability, and long-term operability
• Mentor Data Engineers, providing technical guidance and constructive feedback
• Contribute to architectural discussions with Tech Leads and Architects
• Proactively identify technical risk and escalate with clear recommendations
Required Qualifications
Core Technical Capability
• Strong proficiency in Python for data engineering and automation
• Advanced SQL skills, including complex analytical queries and optimisation techniques (see the example after this list)
• Deep practical experience with Apache Spark (PySpark or Scala) and distributed processing
• Strong understanding of data modelling and layered data architectures
• Proficient with Git-based workflows, code reviews, and collaborative development
• Confident working in Linux-based environments
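As one illustration of the Python-plus-SQL capability above, a short sketch that reduces a table to its latest record per key with a window function; the tables and columns are invented for the example:

```python
# An illustrative Python-plus-SQL pattern on Spark; names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("latest-per-key").getOrCreate()

# Keep only the most recent version of each customer record.
latest = spark.sql("""
    SELECT *
    FROM (
        SELECT c.*,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY updated_at DESC
               ) AS rn
        FROM silver.customers AS c
    ) ranked
    WHERE rn = 1
""").drop("rn")

latest.write.mode("overwrite").saveAsTable("gold.customers_current")
```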
Data Platform & Streaming
• Hands-on experience with event streaming platforms (e.g. Kafka)
• Experience working with lakehouse architectures and modern table formats
• Strong understanding of batch vs streaming trade-offs and data consistency patterns (sketched below)
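A brief sketch of the streaming side, assuming Spark Structured Streaming with the Kafka connector on the classpath; broker, topic, schema, and sink are all hypothetical:

```python
# A streaming-ingest sketch; every external name here is illustrative.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
    # Kafka delivers raw bytes; parse the JSON payload into typed columns.
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Checkpointing gives restart consistency, the kind of trade-off the
# batch-vs-streaming point above turns on.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/chk/orders")
    .toTable("bronze.orders_stream")
)
query.awaitTermination()
```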
Cloud Experience
• Strong hands-on experience with at least one major cloud platform (AWS, Azure, or GCP)
• Comfortable designing and operating cloud-native data pipelines in production
Advantageous (Not Mandatory)
• Experience with orchestration tools such as Airflow or Dagster (sketched after this list)
• Experience with data quality frameworks (Great Expectations, dbt tests, Soda)
• Exposure to CI/CD pipelines for data assets
• Experience in regulated or highly governed enterprise environments
• Familiarity with containerisation (Docker)
• Cloud or data engineering certifications
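For the orchestration point above, a minimal sketch of how such a pipeline might be scheduled, assuming Airflow 2.x (listed here as advantageous, not required); the DAG id and task callables are placeholders:

```python
# A minimal Airflow 2.x orchestration sketch; DAG id and tasks are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def run_bronze_to_silver():
    ...  # would submit the Spark ingestion job

def run_quality_gate():
    ...  # would run the validation step sketched earlier

with DAG(
    dag_id="orders_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="bronze_to_silver", python_callable=run_bronze_to_silver)
    validate = PythonOperator(task_id="quality_gate", python_callable=run_quality_gate)
    # The quality gate runs only after ingestion succeeds.
    ingest >> validate
```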
Benefits
• Enterprise AI, done properly.
We exist to take AI and data out of experimentation and into production environments that are regulated, scrutinised, and expected to work every day.
• Quality is not optional.
SentraAI is built on the belief that engineering discipline, governance by design, and delivery rigour are competitive advantages, not overhead.
• Clear ownership and accountability.
You will be trusted with real responsibility, clear mandates, and meaningful outcomes, not diluted roles or performative activity.
• Work that survives contact with reality.
We design systems, operating models, and decisions that still stand up months and years after go-live, not just at demo time.
• Run-state matters as much as build-state.
We optimise for operability, auditability, and change control from day one, because that is where enterprise value is won or lost.
• Substance over hype.
We deliberately avoid delivery theatre, buzzwords, and novelty for novelty’s sake. Credibility is earned through execution.
• Learn from experienced practitioners.
You will work alongside people who have built, broken, fixed, and run enterprise systems, and who care deeply about doing the work properly.
• A firm with a point of view.
SentraAI is opinionated by design. We stand for doing fewer things, better, and we expect our people to take pride in that standard.