Python Data Engineer — API-driven MLB Hitter Scoring Engine (No ML, Daily Automation)
Upwork | US | Not specified | Expert | Score: 45
Skills: API Integration, Python, Data Engineering, ETL, GitHub Actions, Google Sheets API, Cron, Airtable API, Data Validation, Baseball Analytics (Statcast)
I’m hiring a strong Python data engineer to build an automated MLB hitter scoring engine that refreshes daily.
This is a backend automation project only: no frontend, no UI, and no machine learning research.
The system must pull daily hitter and pitcher inputs via APIs (not large-scale scraping).
Metrics include contact quality (e.g., Barrel%, Hard Hit%, Fly Ball%, HR/FB), recent HR totals/trends, pitcher pitch mix and splits, park HR factors, and weather (wind direction/speed).
You will implement a transparent weighted scoring model that outputs a 1–100 composite HR opportunity score.
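As a rough illustration of what a transparent weighted model could look like, the sketch below maps normalized components onto a 1–100 score. The component names, the weight values, and the assumption that each input is already scaled to the range 0 to 1 are all hypothetical; the posting does not specify them.

```python
from typing import Dict

# Hypothetical weights; the posting does not specify any values.
WEIGHTS: Dict[str, float] = {
    "contact_quality": 0.35,
    "recent_hr_trend": 0.15,
    "pitch_mix_edge": 0.20,
    "park_factor": 0.15,
    "wind": 0.15,
}


def composite_score(components: Dict[str, float],
                    weights: Dict[str, float] = WEIGHTS) -> int:
    """Map normalized component values (each in [0, 1]) to a 1-100 score."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Missing components default to a neutral 0.5; values are clamped to [0, 1].
    weighted = sum(
        w * min(1.0, max(0.0, components.get(name, 0.5)))
        for name, w in weights.items()
    )
    fraction = weighted / total_weight
    # Linear map from [0, 1] onto [1, 100].
    return round(1 + 99 * fraction)
```

Because the score is a plain weighted average, each component's contribution can be reported alongside the total, which keeps the model auditable.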
Rankings must be dynamic day-to-day based on matchup + environment, not a static “best hitters” list.
Wind, park, and pitch-mix advantages must materially move scores and rankings when conditions warrant.
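One way the wind input might be turned into a number that can move scores is to project the wind onto the line from home plate to center field. This is only a sketch: it assumes the weather source reports direction in the meteorological "blowing from" convention in degrees, and that a per-park center-field compass bearing is available. Both are assumptions, and the function name is hypothetical.

```python
import math


def wind_out_component(wind_from_deg: float, wind_speed_mph: float,
                       cf_bearing_deg: float) -> float:
    """Wind speed along the home-plate-to-center-field line, in mph.

    Positive means blowing out toward center field, negative means blowing in.
    wind_from_deg is assumed to be the direction the wind blows FROM.
    cf_bearing_deg is an assumed compass bearing from home plate to center field.
    """
    # Convert "blowing from" into the direction the air is moving toward.
    wind_toward_deg = (wind_from_deg + 180.0) % 360.0
    angle = math.radians(wind_toward_deg - cf_bearing_deg)
    return wind_speed_mph * math.cos(angle)
```

For example, in a park whose center field lies due north (bearing 0), a 10 mph wind from the south gives a component of +10, while the same wind from the north gives -10.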
Add regression ("due") flags using simple statistical logic: flag hitters who show strong contact quality but whose recent HR conversion falls below expected baselines.
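The flag logic could be as simple as comparing recent home runs with what a baseline HR/FB rate would predict. In the sketch below, every threshold (the baseline rate, the minimum Barrel%, the minimum sample size, the shortfall ratio) is an illustrative placeholder rather than a real league figure, and the `RecentForm` structure is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecentForm:
    barrel_pct: float  # barrels per batted ball, as a fraction (0.12 = 12%)
    fly_balls: int     # fly balls hit in the lookback window
    home_runs: int     # home runs hit in the same window


def is_due(form: RecentForm,
           baseline_hr_per_fb: float = 0.12,  # placeholder baseline
           min_barrel_pct: float = 0.10,      # placeholder threshold
           min_fly_balls: int = 10,           # avoid flagging tiny samples
           shortfall: float = 0.5) -> bool:
    """Flag strong contact quality paired with low recent HR conversion."""
    if form.fly_balls < min_fly_balls:
        return False
    if form.barrel_pct < min_barrel_pct:
        return False
    expected_hr = form.fly_balls * baseline_hr_per_fb
    # Due when actual HR is below a set fraction of the expected total.
    return form.home_runs < shortfall * expected_hr
```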
Auto-generate a short reason string per player from the top 2–3 drivers (e.g., “High Barrel + Wind Out + Fastball Edge”).
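A reason string like the one above can be derived directly from per-component contributions to the score. The following is a minimal sketch; the contribution keys and the label mapping are hypothetical names chosen for illustration.

```python
from typing import Dict


def reason_string(contributions: Dict[str, float],
                  labels: Dict[str, str],
                  top_n: int = 3) -> str:
    """Join the labels of the largest positive score drivers with ' + '."""
    # Only drivers that raise the score are worth citing as reasons.
    positive = [(name, value) for name, value in contributions.items() if value > 0]
    positive.sort(key=lambda item: item[1], reverse=True)
    # Fall back to the raw key when no display label is configured.
    return " + ".join(labels.get(name, name) for name, _ in positive[:top_n])


# Hypothetical contributions (weight times normalized value) for one hitter.
example_contributions = {
    "barrel": 0.30,
    "wind": 0.20,
    "fastball_edge": 0.15,
    "park": 0.05,
    "recent_hr": -0.02,
}
example_labels = {
    "barrel": "High Barrel",
    "wind": "Wind Out",
    "fastball_edge": "Fastball Edge",
    "park": "HR-Friendly Park",
}
print(reason_string(example_contributions, example_labels))
```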
Generate the top 10–15 three-player combinations, ranked by composite score, mixing top setups with regression-flagged candidates.
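The combination step might look something like the sketch below, which ranks trios by summed score and reserves a few slots for trios containing a flagged hitter. The pool size, the number of reserved slots, and the use of a simple sum as the combination score are assumptions; the posting leaves the exact mixing rule open.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Hitter(NamedTuple):
    name: str
    score: int
    due: bool


def top_trios(hitters: List[Hitter], n_total: int = 12,
              n_with_due: int = 4, pool_size: int = 10) -> List[Tuple[Hitter, ...]]:
    """Return up to n_total trios, at least n_with_due of them with a due hitter."""
    ranked = sorted(hitters, key=lambda h: h.score, reverse=True)
    # Keep the pool small: top scorers plus a few flagged hitters outside it.
    due_extra = [h for h in ranked[pool_size:] if h.due][:3]
    pool = ranked[:pool_size] + due_extra

    def total(trio: Tuple[Hitter, ...]) -> int:
        return sum(h.score for h in trio)

    trios = sorted(combinations(pool, 3), key=total, reverse=True)
    with_due = [t for t in trios if any(h.due for h in t)][:n_with_due]
    reserved = set(with_due)
    # Fill the remaining slots with the best trios not already reserved.
    rest = [t for t in trios if t not in reserved][: n_total - len(with_due)]
    return sorted(rest + with_due, key=total, reverse=True)
```

Limiting the pool keeps the number of candidate trios manageable, since the count grows with the cube of the pool size.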
Output clean, ranked results to Google Sheets or Airtable with a stable schema and daily automated runs on a lightweight scheduler (GitHub Actions or cron).
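A stable schema can be enforced by building every output row from one fixed column list, regardless of which destination is used. The sketch below only prepares rows and a CSV preview; the column names are hypothetical, and the actual Google Sheets or Airtable write call is deliberately left out.

```python
import csv
import io
from typing import Any, Dict, List

# Hypothetical fixed column order; downstream sheets would depend on it.
COLUMNS = ["run_date", "rank", "player", "team", "opp_pitcher",
           "score", "due_flag", "reason"]


def to_rows(run_date: str, results: List[Dict[str, Any]]) -> List[List[Any]]:
    """Rank results by score and emit rows in the fixed column order."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    rows = []
    for rank, result in enumerate(ranked, start=1):
        record = {**result, "run_date": run_date, "rank": rank}
        missing = [c for c in COLUMNS if c not in record]
        if missing:
            # Fail loudly rather than silently shifting columns.
            raise KeyError(f"missing fields: {missing}")
        rows.append([record[c] for c in COLUMNS])
    return rows


def to_csv(rows: List[List[Any]]) -> str:
    """Render the header plus rows as CSV text for inspection or upload."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()
```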
The scoring weights must be configurable (YAML/JSON) so adjustments can be made without rewriting core code.
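To illustrate the configuration requirement, here is a minimal sketch that loads weights from JSON, validates them, and normalizes them to sum to 1. The component names and the `weights` key are assumptions; a YAML file could be handled the same way with a YAML parser in place of `json`.

```python
import json
from typing import Dict

# Hypothetical component names the scoring code would expect.
EXPECTED_COMPONENTS = {"contact_quality", "recent_hr_trend",
                       "pitch_mix_edge", "park_factor", "wind"}

# Example config text; in practice this would be read from a file.
EXAMPLE_CONFIG = """
{
  "weights": {
    "contact_quality": 35,
    "recent_hr_trend": 15,
    "pitch_mix_edge": 20,
    "park_factor": 15,
    "wind": 15
  }
}
"""


def load_weights(config_text: str) -> Dict[str, float]:
    """Parse, validate, and normalize scoring weights from JSON text."""
    config = json.loads(config_text)
    weights = {name: float(value) for name, value in config["weights"].items()}
    unknown = set(weights) - EXPECTED_COMPONENTS
    missing = EXPECTED_COMPONENTS - set(weights)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Normalizing lets the config use any convenient scale (e.g. points of 100).
    return {name: w / total for name, w in weights.items()}
```

Validating on load means a typo in the config fails the daily run immediately instead of quietly dropping a component from the score.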
Client: Spent $941.75 | Rating: 5.0 | Verified