Data Engineer
<h3><span style="font-family: arial, helvetica, sans-serif;"><strong>About AI71:</strong></span></h3>
<p><span style="font-family: arial, helvetica, sans-serif;">AI71 is an industry leader in artificial intelligence, delivering innovative solutions that empower developers, businesses and governments to solve complex challenges. AI71 builds secure, enterprise-ready applications powered by cutting-edge technology—tailored for knowledge workers and sector-specific needs. AI71 bridges the gap between advanced AI and real-world impact. Guided by a strong commitment to research and responsibility, we create transformative solutions that drive progress and empower communities.</span></p>
<h3><span style="font-family: arial, helvetica, sans-serif;"><strong>The Role:</strong></span></h3>
<p><span style="font-family: arial, helvetica, sans-serif;">As a Senior Data Engineer, you will be responsible for designing, developing, and maintaining advanced, scalable data systems that power critical business decisions. You will lead the development of robust data pipelines, ensure data quality and governance, and collaborate across cross-functional teams to deliver high-performance data platforms in production environments. This role requires a deep understanding of modern data engineering practices, real-time processing, and cloud-native solutions.</span></p>
<h3><span style="font-family: arial, helvetica, sans-serif;"><strong>What You'll Do:</strong></span></h3>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Data Pipeline Development & Management: </span></li>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Design, implement, and maintain scalable and reliable data pipelines to ingest, transform, and load structured, unstructured, and real-time data feeds from diverse sources. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Manage data pipelines for analytics and operational use, ensuring data integrity, timeliness, and accuracy across systems.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Implement data quality tools and validation frameworks within transformation pipelines. </span></li>
</ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Data Processing & Optimization: </span></li>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Build efficient, high-performance systems by leveraging techniques like data denormalization, partitioning, caching, and parallel processing. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Develop stream-processing applications using Apache Kafka and optimize performance for large-scale datasets. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Enable data enrichment and correlation across primary, secondary, and tertiary sources. </span></li>
</ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Cloud, Infrastructure, and Platform Engineering: </span></li>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Develop and deploy data workflows on AWS or GCP, using services such as S3, Redshift, Pub/Sub, or BigQuery. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Containerize data processing tasks using Docker, orchestrate with Kubernetes, and ensure production-grade deployment. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Collaborate with platform teams to ensure scalability, resilience, and observability of data pipelines. </span></li>
</ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Database Engineering: </span></li>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Write and optimize complex queries on relational (Redshift, PostgreSQL) and NoSQL (MongoDB) databases. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Work with ELK stack (Elasticsearch, Logstash, Kibana) for search, logging, and real-time analytics. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Support Lakehouse architectures and hybrid data storage models for unified access and processing. </span></li>
</ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Data Governance & Stewardship: </span></li>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Implement robust data governance, access control, and stewardship policies aligned with compliance and security best practices. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Establish metadata management, data lineage, and auditability across pipelines and environments. </span></li>
</ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Machine Learning & Advanced Analytics Enablement: </span></li>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Collaborate with data scientists to prepare and serve features for ML models. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Maintain awareness of ML pipeline integration and ensure data readiness for experimentation and deployment. </span></li>
</ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Documentation & Continuous Improvement: </span></li>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Maintain thorough documentation including technical specifications, data flow diagrams, and operational procedures.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Continuously evaluate and improve the data engineering stack by adopting new technologies and automation strategies.</span></li>
</ul>
</ul>
<h3><span style="font-family: arial, helvetica, sans-serif;"><strong>What You'll Bring:</strong></span></h3>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">8+ years of experience in data engineering within a production environment.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Advanced knowledge of Python and Linux shell scripting for data manipulation and automation.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Strong expertise in SQL/NoSQL databases such as PostgreSQL and MongoDB.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Experience building stream processing systems using Apache Kafka.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Proficiency with Docker and Kubernetes in deploying containerized data workflows.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Good understanding of cloud services (AWS or Azure).</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Hands-on experience with ELK stack (Elasticsearch, Logstash, Kibana) for scalable search and logging.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Familiarity with AI models supporting data management.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Experience working with Lakehouse systems, data denormalization, and data labeling practices.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Demonstrated success in designing, scaling, and operating data systems in cloud-native and distributed environments. </span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Proven ability to work collaboratively with cross-functional teams including product managers, data scientists, and DevOps.</span></li>
</ul>
<h3><span style="font-family: arial, helvetica, sans-serif;"><strong>Great Pluses / Preferred Experience:</strong></span></h3>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Working knowledge of data quality tools, lineage tracking, and data observability solutions.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Experience in data correlation, enrichment from external sources, and managing data integrity at scale.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Understanding of data governance frameworks and enterprise compliance protocols.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;">Exposure to CI/CD pipelines for data deployments and infrastructure-as-code.</span></li>
</ul>
<h3><span style="font-family: arial, helvetica, sans-serif;"><strong>Why AI71:</strong></span></h3>
<ul>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;"><strong>Mission-Driven Work:</strong> Work on cutting-edge AI applications with a talented and passionate team, solving real-world challenges in critical sectors.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;"><strong>Unparalleled Opportunity:</strong> This is a chance to innovate and solve real-world challenges using AI at a company with unique access to world-leading models and resources.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;"><strong>Career Growth:</strong> We offer competitive compensation, benefits, and significant career growth opportunities as a foundational member of the team.</span></li>
<li style="font-family: arial, helvetica, sans-serif;"><span style="font-family: arial, helvetica, sans-serif;"><strong>World-Class Environment:</strong> Enjoy a flexible working environment and the latest tools & technologies needed to do your best work.</span></li>
</ul>
Requirements
- Design, implement, and maintain scalable data pipelines
- Manage data pipelines for analytics and operational use
- Implement data quality tools and validation frameworks
- Build efficient, high-performance systems
- Develop stream-processing applications using Apache Kafka
- Optimize performance for large-scale datasets
- Collaborate across cross-functional teams
- Deep understanding of modern data engineering practices
Responsibilities
- Designing, developing, and maintaining advanced, scalable data systems
- Leading the development of robust data pipelines
- Ensuring data quality and governance
- Delivering high-performance data platforms in production environments
AI71 offers a platform for creating and deploying advanced AI models. It serves businesses and developers seeking to integrate sophisticated artificial intelligence into their products.