Create a cron task on a SQL DB + API call implementation

Upwork | PL | Not specified | Intermediate
SQL, Python
It's a system to categorise invoices.
The prompt for the LLM will be provided by us, as it's in a foreign language (so you're not responsible for polishing up the prompt).

Your part is to write the Python script and do all the connections.

Connect to SQL Server (creds from config file, no hardcoding) living on OUR OWN servers.
Query uncategorised invoices (SELECT where scheme IS NULL, filter by date 'greater than or equal to' last run, return: invoice ID, supplier name, tax ID, line items, amounts, doc type)
Persist last-run timestamp (flat file or single DB row, prevents reprocessing on restart)
Build prompt constructor (invoice fields → system prompt template loaded from external file, few-shot examples injected)
Call LLM API (probably Gemini, TBD; API key from env var; structured JSON response: scheme, description, confidence 0–1, reasoning)
Parse + validate LLM response (handle malformed JSON, validate all fields present, retry once on bad response then flag)
Retry + rate limit handling (exponential backoff, max 3 retries per invoice, on final fail: skip + log, never write partial data)
Write approved results to DB (confidence 'greater than or equal to' 0.85: UPDATE scheme + description fields only, nothing else touched)
Flag low-confidence invoices (confidence UNDER 0.85: set review_flag = TRUE, do NOT write scheme/description)
Transactional safety (each invoice committed independently, failure on invoice N doesn't roll back previous ones)
Main runner loop (iterate over list of company DBs from config, sequential not parallel)
Windows Task Scheduler job (daily 06:00, lock file to prevent overlap with previous run)
Config file (DB connection, API key, confidence threshold, company list, log path, SMTP — nothing hardcoded)
Structured logging (per-invoice log: ID, result, confidence, timestamp — rotating, 90-day retention)
Daily summary email (counts: processed/auto/flagged/errors + list of flagged invoice IDs + run duration, sent via SMTP)
DB connection failure handling (log critical, send alert, abort — don't proceed without DB)
LLM API failure handling (skip invoice, log, continue run — don't abort everything)
Schema mismatch handling (expected columns missing → abort + alert, fail loudly)
Accuracy validation script (one-off: run LLM on ~300 pre-labelled invoices, compare vs human labels, output accuracy % — used to tune prompt before go-live)
Integration test on test DB (full pipeline run against empty test company ABC, verify correct fields updated, flags set, report sent)
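
To make the parse/validate and confidence steps above a bit more concrete, here is a rough sketch of what we have in mind. It is only an illustration: the function names `parse_llm_response` and `decide_action` and the shape of the returned dicts are assumptions, not a required design, and the "retry once on bad response" part is left to the caller.

```python
import json

# Fields the LLM response must contain, per the task list above.
REQUIRED_FIELDS = ("scheme", "description", "confidence", "reasoning")


def parse_llm_response(raw_text):
    """Return the validated response dict, or None if it is unusable."""
    try:
        data = json.loads(raw_text)
    except (json.JSONDecodeError, TypeError):
        return None  # malformed JSON or no text at all
    if not isinstance(data, dict):
        return None
    if any(field not in data for field in REQUIRED_FIELDS):
        return None
    confidence = data["confidence"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    return data


def decide_action(result, threshold=0.85):
    """Map a validated result (or None) to what should happen in the DB."""
    if result is None:
        return {"action": "flag", "reason": "bad_response"}
    if result["confidence"] >= threshold:
        # Only scheme + description get written, nothing else.
        return {
            "action": "update",
            "scheme": result["scheme"],
            "description": result["description"],
        }
    # Below threshold: set review_flag only, no scheme/description.
    return {"action": "flag", "reason": "low_confidence"}
```

The threshold defaults to 0.85 here only for readability; in the real script it would come from the config file.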

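For the retry + rate limit item, this is roughly the behaviour we expect, as a minimal sketch. `RetryableError` is a made-up exception standing in for whatever the chosen LLM client raises on rate limits or transient errors, and reading "max 3 retries" as one initial attempt plus up to three retries is our assumption.

```python
import time


class RetryableError(Exception):
    """Hypothetical marker for rate-limit or transient API errors."""


def call_with_backoff(call, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RetryableError wait base_delay * 2**attempt, then retry.

    Returns the call's result, or None once all retries have failed, so the
    caller can skip the invoice and log it without writing partial data.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RetryableError:
            if attempt == max_retries:
                return None  # final failure: skip + log upstream
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s with the defaults
```

The `sleep` parameter is there only so the waiting can be swapped out in tests.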

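For the last-run timestamp and the lock file, something along these lines would be fine if you go the flat-file route. Again a sketch under assumptions: the helper names and file layout are invented for illustration, and a stale lock left behind by a crashed run is not handled here.

```python
import os
from datetime import datetime, timezone


def utc_now():
    """Current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def read_last_run(path):
    """Return the stored last-run time, or None on the very first run."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            return datetime.fromisoformat(handle.read().strip())
    except FileNotFoundError:
        return None


def write_last_run(path, moment):
    """Write to a temp file and swap it in, so a crash can't leave half a file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write(moment.isoformat())
    os.replace(tmp_path, path)


def acquire_lock(path):
    """Create the lock file atomically; return False if a run already holds it."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return True


def release_lock(path):
    """Remove the lock file at the end of a run."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

The scheduled job would call `acquire_lock` first and exit straight away if it returns False, then call `release_lock` in a `finally` block once the run is done.
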
Excellent English is a must. We're only collecting applications for the next 2-3h.

Please give me your time and cost estimate (PER PROJECT please; that also shows me your experience: if you know how to estimate a project, it means you actually know what I'm talking about and how to do it. If you need more info, simply ask) and let me know if you've built anything similar.


Please don't generate your answer with an LLM, I appreciate a casual style response.