Data Engineer
Role Summary
AI71 is seeking a Senior Data Engineer to architect and build the foundational data infrastructure for EDGE Group’s AI transformation. You will move beyond standard ETL tasks to design and deploy defense-grade Data Lakehouses that serve as the "Single Source of Truth" for our AI agents.
You will lead the technical execution of "Workstream 1: Data & Platform Foundations" for our flagship initiatives. Your mandate is to map rigid enterprise systems (e.g., SAP S/4HANA and Ariba) and collaborate with teams to make sense of complex unstructured data (technical drawings, regulatory text), building and improving the high-performance pipelines that power Intelligent Supply Chain forecasting and LeverEDGE generative AI tools. You will operate within a structured "Sprint Zero" environment, ensuring data lineage and security meet strict defense standards.
Key Responsibilities
Data Lakehouse Architecture (Supply Chain)
• ERP Integration: Architect and deploy ingestion pipelines to extract high-volume transactional data from SAP S/4HANA, Ariba, and PLM systems, ensuring near real-time availability for forecasting models (an illustrative extraction sketch follows this group).
• External Feed Integration: Build connectors for external market intelligence feeds (e.g., S&P Global, Orbis, EcoVadis) to enrich internal procurement data with macroeconomic and geopolitical signals.
• Unified Data Model: Design and implement a standardized procurement data model and taxonomy across multiple entities, harmonizing fragmented datasets into a cohesive layer for analytics.
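By way of illustration, here is a minimal sketch of what such an ingestion job might look like against a standard S/4HANA OData service. The host, service account, delta-filter field, and paging strategy are assumptions for illustration, not a prescribed implementation:

```python
import requests

# Hypothetical on-premise host; API_PURCHASEORDER_PROCESS_SRV is a standard
# SAP OData service, but the URL below is illustrative only.
BASE_URL = "https://s4hana.example.internal/sap/opu/odata/sap/API_PURCHASEORDER_PROCESS_SRV"
ENTITY = "A_PurchaseOrder"

def extract_purchase_orders(since_iso: str, page_size: int = 5000):
    """Page through the entity set, yielding raw purchase-order records.

    The timestamp filter is a simple delta strategy for illustration; a
    production pipeline would likely use SAP's ODP/CDC extractors instead.
    """
    session = requests.Session()
    session.auth = ("SVC_DATA_ENG", "********")  # placeholder service account
    skip = 0
    while True:
        resp = session.get(
            f"{BASE_URL}/{ENTITY}",
            params={
                "$format": "json",
                "$top": page_size,
                "$skip": skip,
                # Field name and literal syntax are assumptions (OData v2 style).
                "$filter": f"LastChangeDateTime ge datetime'{since_iso}'",
            },
            timeout=60,
        )
        resp.raise_for_status()
        batch = resp.json()["d"]["results"]
        if not batch:
            return
        yield from batch
        skip += page_size

if __name__ == "__main__":
    for record in extract_purchase_orders("2024-01-01T00:00:00"):
        ...  # land the raw JSON in the lakehouse bronze layer
```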
Unstructured Data Pipelines (LeverEDGE)
• Complex Ingestion: Engineer pipelines to ingest and process unstructured technical data, including PDF tender documents, CAD metadata, and historical CONOPS, transforming them into vector-ready formats for RAG (Retrieval-Augmented Generation) applications (see the sketch after this group).
• Vector Database Management: Manage and optimize Vector Databases (e.g., Weaviate) to store embeddings of archival proposals and engineering snippets, ensuring high-speed retrieval for AI drafting assistants.
• Digital Thread Implementation: Establish data lineage and traceability protocols that link requirements to physical components, supporting the Model-Based Systems Engineering (MBSE) "Digital Thread".
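A minimal sketch of this kind of pipeline, assuming pypdf for text extraction, a placeholder embed() standing in for whatever on-premise embedding model is approved, and a v3-style Weaviate Python client with a hypothetical ProposalChunk class:

```python
import weaviate
from pypdf import PdfReader

client = weaviate.Client("http://localhost:8080")  # on-prem Weaviate instance

def embed(text: str) -> list[float]:
    """Placeholder: call the approved on-premise embedding model here."""
    raise NotImplementedError

def chunk(text: str, size: int = 800, overlap: int = 100):
    """Naive fixed-window chunking; a real pipeline would split on sections."""
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield text[start:start + size]

def ingest_tender(path: str, doc_id: str) -> None:
    """Extract text from a PDF tender, chunk it, and store embeddings."""
    reader = PdfReader(path)
    full_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    with client.batch as batch:
        for i, piece in enumerate(chunk(full_text)):
            batch.add_data_object(
                data_object={"doc_id": doc_id, "chunk_index": i, "text": piece},
                class_name="ProposalChunk",  # hypothetical schema class
                vector=embed(piece),
            )

def retrieve(query: str, k: int = 5):
    """Nearest-neighbour lookup backing a RAG drafting assistant."""
    return (
        client.query.get("ProposalChunk", ["doc_id", "text"])
        .with_near_vector({"vector": embed(query)})
        .with_limit(k)
        .do()
    )
```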
Governance & Security
• Defense-Grade Security: Implement Role-Based Access Control (RBAC), audit logging, and data redaction policies to ensure compliance with export controls and strict on-premise security requirements.
• Quality Control: Deploy automated data quality frameworks to validate BOM (Bill of Materials) completeness and cost data accuracy before it reaches AI models (a quality-gate sketch follows this group).
• Infrastructure Optimization: Optimize pipelines for on-premise GPU clusters and air-gapped environments, ensuring efficiency within existing infrastructure limits.
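A minimal pandas sketch of such a quality gate; the column names, rules, and thresholds are assumptions for illustration, and a production deployment might express the same checks as dbt tests or Great Expectations suites:

```python
import pandas as pd

# Hypothetical BOM extract schema; column names are illustrative only.
REQUIRED_COLUMNS = ["part_number", "parent_assembly", "quantity", "unit_cost"]

def bom_quality_gate(bom: pd.DataFrame) -> list[str]:
    """Return a list of violations; an empty list lets the batch proceed
    to the model-facing layer."""
    missing = [c for c in REQUIRED_COLUMNS if c not in bom.columns]
    if missing:
        return [f"missing columns: {missing}"]
    violations = []
    if bom["part_number"].isna().any():
        violations.append("null part_number values")
    if (bom["quantity"] <= 0).any():
        violations.append("non-positive quantities")
    if (bom["unit_cost"] < 0).any():
        violations.append("negative unit costs")
    dupes = bom.duplicated(subset=["part_number", "parent_assembly"]).sum()
    if dupes:
        violations.append(f"{dupes} duplicate part/assembly rows")
    return violations

# Usage: fail the pipeline run before any data reaches the AI models.
# problems = bom_quality_gate(batch_df)
# if problems: raise ValueError(f"BOM quality gate failed: {problems}")
```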
Technical Requirements
• Core Stack: Expert proficiency in Python, SQL, and modern data engineering frameworks (Apache Spark, Kafka, Airflow); an orchestration sketch follows this list.
• Enterprise ERP: Strong experience extracting data from complex ERP environments, specifically SAP S/4HANA and SAP Ariba. Familiarity with SAP BTP is a plus.
• Database Technologies: Deep understanding of Data Lakehouse architectures (Databricks/Delta Lake), Relational Databases (PostgreSQL), and Vector Databases (Weaviate/Milvus).
• Data Pipeline Development: Experience building pipelines for RAG solutions, conversational agents, and classical ML models with tools such as dbt, Dagster, or Prefect.
• DevOps/DataOps: Proficiency with containerization (Docker, Kubernetes) and CI/CD pipelines for deploying data workflows in secure environments.
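For orientation, a hedged sketch of how an extract-validate-load flow might be orchestrated with Airflow's TaskFlow API (Airflow 2.x); the DAG name, schedule, and task bodies are hypothetical:

```python
from datetime import datetime

from airflow.decorators import dag, task

@dag(
    schedule="@hourly",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    tags=["supply-chain"],
)
def procurement_ingest():
    """Hourly ERP extract -> quality gate -> lakehouse load (illustrative)."""

    @task
    def extract() -> str:
        # Pull the latest delta from the ERP and stage it (path is made up).
        return "/staging/purchase_orders.parquet"

    @task
    def validate(path: str) -> str:
        # Run the BOM/cost quality gate; raise on violations to fail the run.
        return path

    @task
    def load(path: str) -> None:
        # Merge the staged batch into the lakehouse bronze/silver tables.
        ...

    load(validate(extract()))

procurement_ingest()
```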
Professional Qualifications
• Experience: 5+ years of experience in Data Engineering, with at least 2 years focused on building pipelines for Machine Learning or Generative AI applications in an enterprise setting.
• Domain Knowledge: Experience in Supply Chain, Manufacturing, or Defense sectors is highly desirable. Ability to understand "Bill of Materials" (BOM) structures and procurement lifecycles.
• Problem Solving: Ability to navigate the "Governance Collision" between agile data work and rigid systems engineering requirements, ensuring data deliverables meet formal Stage Gate reviews.
• Collaboration: Proven ability to work alongside Data Scientists and Backend Engineers to define data schemas that support predictive modeling and AI agents.
Why This Role?
You will be the architect of the data foundation that secures national defense capabilities. Your work will directly enable AI agents to negotiate procurement contracts and assist engineers in designing next-generation systems. If you are ready to build the robust infrastructure that turns raw data into strategic advantage, join AI71.
About AI71
AI71 offers a platform for creating and deploying advanced AI models. It serves businesses and developers seeking to integrate sophisticated artificial intelligence into their products.