AI Data Engineer / Knowledge Graph Architect (m/f/d) – B2B AI SaaS
Upwork · DE · Not specified · Expert
Skills: Python, Neo4j, Cypher, JSON/XML, Vector Search, LLM-Chunking, FastAPI, MCP
Location: Remote / Hybrid
Type: Full-time / Freelance (negotiable)
About Us & Our Vision:
We are revolutionizing how large enterprises understand their mission-critical ERP systems and complex, domain-specific business rules. Instead of relying on generic LLM guesses and hallucinations, we are building a high-precision, domain-specific Knowledge Graph that is queried by specialized AI agents.
Your Role:
In the initial phase of our product, we work with massive, static data extracts (JSON/XML dumps of enterprise codebases, configuration tables, and complex regulatory manuals). Your job is to transform this data chaos into a structured, highly interconnected, and queryable Knowledge Graph (Neo4j) and to set up a robust RAG pipeline for our AI agents.
Your Core Responsibilities:
Data Parsing: Develop robust Python pipelines to ingest, clean, and structure complex, nested JSON/XML data extracts from legacy enterprise systems.
Graph Engineering: Design the data model and build a Neo4j Knowledge Graph (Cypher) that maps the relationships between system components, database schemas, and business configurations.
Mock-API Development: Build a Python service (FastAPI) that provides data to our LLM agents via the Model Context Protocol (MCP) – simulating a live connection to a real enterprise backend.
Vector Search / RAG: Chunk and embed domain-specific rulebooks and documents (PDF/XML) into a vector database (e.g., AlloyDB with pgvector, or Pinecone).
Your Profile (Must-Haves):
Strong Python skills for complex data manipulation and backend development.
Deep, practical knowledge of Neo4j and the Cypher query language.
Experience in Data Engineering (ETL pipelines, handling large JSON/XML files).
Solid understanding of RAG architectures (Retrieval-Augmented Generation) and Vector Databases.
Analytical mindset: You can quickly grasp highly complex data structures and model them into efficient graphs.
Bonus (Nice-to-Have):
Basic understanding of general ERP systems (relational databases, foreign keys, system configurations, object-oriented code structures).
Experience with the Model Context Protocol (MCP).
What We Offer:
Greenfield Project: No legacy systems to maintain. You make the technological decisions from Day 1.
State-of-the-Art AI: Work at the leading edge of current AI development (multi-agent systems, 1M+ token contexts, GraphRAG).
Flexibility: 100% remote work and flexible hours, focused on output and results, not presence.
Impact: Build a product that solves a massive, expensive pain point in critical enterprise infrastructure.