Senior Data Engineer - Hybrid - KSA - (10-12 Months) - RTG
Robusta assists organizations in transitioning to a digital-first approach, crafting unforgettable experiences for their customers. We provide strategy, design, product, and technology services to prominent businesses and brands, utilizing our go-to-market expertise to facilitate seamless customer experiences and enhance conversion rates.
About the Role
We are seeking a highly experienced Senior Data Engineer to lead the technical design, implementation, and delivery of an enterprise-grade, AI-ready Data Lakehouse platform. This role is critical in building the foundational data layer for a large-scale digital transformation initiative that will support AI agents, digital workers, and knowledge graph (ontology) systems.
The ideal candidate will have strong software engineering experience with a focus on data pipeline development, data architecture, and scalable distributed systems. You will play a key role in designing and maintaining robust data infrastructure that enables advanced analytics and AI capabilities.
This position also involves technical leadership, mentoring engineering teams in a collaborative co-building model, and ensuring the client's long-term operational ownership of the platform.
Key Responsibilities
• Lakehouse Architecture & Implementation: Design and deploy a unified Data Lakehouse using the Medallion architecture (Bronze, Silver, Gold) and open table formats (e.g., Delta Lake, Apache Iceberg) on cloud infrastructure hosted within Saudi Arabia (a minimal Bronze-to-Silver sketch follows this list).
• Data Ingestion & Pipeline Engineering: Build reusable, automated ingestion frameworks (batch and streaming) capable of processing both structured data (RDBMS, APIs) and unstructured data (PDFs, policy documents) to feed downstream AI models and semantic reasoning engines.
• Data Quality & Governance: Implement automated data quality "circuit breakers" (completeness, uniqueness, referential integrity) and end-to-end data lineage tracking frameworks (see the validation sketch after this list).
• Optimization: Optimize data processing workflows for performance, scalability, and cost-efficiency.
• System Monitoring and Maintenance: Monitor and maintain data systems, responding to SEVs (high-severity incidents) and other urgent issues to ensure continuous operations.
• Security & Compliance: Ensure the platform adheres strictly to NCA (National Cybersecurity Authority) and NDMO (National Data Management Office) standards. Implement AES-256 encryption at rest, TLS 1.2+ in transit, robust Key Management Systems (KMS), and centralized audit logging.
• Access Control Integration: Design and deploy granular Role-Based Access Control (RBAC) and Attribute-Based Access Control (ABAC), integrating seamlessly with existing enterprise Identity Providers (e.g., Active Directory).
• Capability Building & Handover: Lead hands-on knowledge-transfer sessions, pair-program with client engineers, create operational runbooks, and conduct "Game Day" failure simulations to ensure the client's team is fully ready to operate the platform independently.
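To make the Lakehouse and ingestion responsibilities above concrete, here is a minimal PySpark sketch of a Bronze-to-Silver hop in a Medallion-style Delta Lake pipeline. The table paths, column names, and cleansing rules are hypothetical illustrations under assumed conventions, not part of any actual client codebase.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical example: promote raw ingested records (Bronze) to a
# cleansed, deduplicated Silver table in a Medallion-style lakehouse.
spark = (
    SparkSession.builder
    .appName("bronze-to-silver")
    # Delta Lake extensions; assumes the delta-spark package is installed.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Hypothetical Bronze path; in practice this would be the raw landing zone.
bronze = spark.read.format("delta").load("/lake/bronze/policies")

silver = (
    bronze
    .dropDuplicates(["record_id"])                     # uniqueness
    .filter(F.col("record_id").isNotNull())            # completeness
    .withColumn("ingested_at", F.current_timestamp())  # audit column for lineage
)

silver.write.format("delta").mode("overwrite").save("/lake/silver/policies")
```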
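The data-quality "circuit breaker" responsibility can be sketched the same way: a validation gate that halts promotion to the Gold layer when completeness, uniqueness, or referential-integrity checks fail. All names and thresholds below are assumptions for illustration only.

```python
from pyspark.sql import DataFrame, functions as F

class DataQualityError(RuntimeError):
    """Raised to 'trip the breaker' and stop a bad batch from promoting."""

def circuit_breaker(df: DataFrame, key: str, ref_keys: DataFrame) -> None:
    total = df.count()
    # Completeness: no null business keys.
    nulls = df.filter(F.col(key).isNull()).count()
    # Uniqueness: the business key must appear exactly once.
    dupes = total - df.dropDuplicates([key]).count()
    # Referential integrity: every key must exist in the reference table.
    orphans = df.join(ref_keys, on=key, how="left_anti").count()
    failures = {"null_keys": nulls, "duplicate_keys": dupes, "orphan_keys": orphans}
    if any(count > 0 for count in failures.values()):
        raise DataQualityError(f"Promotion blocked: {failures}")

# Usage (hypothetical tables): run before writing Silver data to Gold.
# circuit_breaker(silver_df, key="record_id",
#                 ref_keys=dim_policies.select("record_id"))
```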
Requirements
• Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
• Experience: 5+ years of proven experience in Data Engineering, Distributed Systems, or Big Data Architecture, with at least 2+ years specifically leading Data Lakehouse or Cloud Data Platform implementations.
Technical Skills & Core Technologies:
• Programming Languages: Proficiency in programming languages such as Python, Java, or Scala.
• Data Architecture & System Design: Strong expertise in designing data-intensive applications, complex data modeling, and schema design for enterprise environments.
• Distributed Systems & Lakehouse Technologies: Deep, hands-on experience with distributed processing engines (e.g., Apache Spark, Kafka, Hadoop) and modern open table formats (e.g., Delta Lake, Apache Iceberg, Apache Hudi).
• ETL/ELT & Orchestration: Experience designing and building robust data pipelines using modern transformation and orchestration tools (e.g., Apache Airflow, Prefect, dbt).
• Database Ecosystems: Proven track record with relational databases (e.g., PostgreSQL, MySQL), NoSQL platforms (e.g., MongoDB, Cassandra), and distributed SQL query engines like Hive and Trino.
• Cloud Infrastructure: Proven experience deploying enterprise data solutions on major cloud providers, specifically within localized Saudi cloud regions. Expertise in Oracle Cloud Infrastructure (OCI) or Google Cloud Platform (GCP) is highly preferred, though experience with AWS or Azure is acceptable.
• Analytical Skills: Strong problem-solving skills with a keen eye for detail and a passion for data.
• AI/Data Science Enablement: Prior experience building data pipelines optimized for Machine Learning, Natural Language Processing (NLP), vector embeddings, or Knowledge Graphs/Ontologies is highly desirable.
• Security & Networking: Strong understanding of enterprise network security, Private Endpoints, Identity & Access Management (IAM), and cryptographic key management.
• Communication: Excellent written and verbal communication skills, with the ability to articulate complex technical concepts to non-technical stakeholders.
• Leadership Skills: Demonstrated ability to lead technical teams, manage stakeholder expectations, and successfully transition complex systems to internal IT/Data teams.
• Regulatory Knowledge: Familiarity with Saudi Arabian data compliance frameworks (NCA CCC, NDMO, SDAIA) is highly preferred.