Texas K-12 board/RFP monitor: scrape portals, OCR PDFs, extract keywords, weekly JSON output.

Upwork | US | Not specified | expert | Score: 93
Data Scraping, Python, Data Extraction, Automation, JSON, API
Project: Texas K-12 Board Agenda & RFP Monitoring Tool (Education / Safety / Audio Technology Focus)

I work for a company that provides technology solutions to K-12 school districts, including:
• Classroom audio systems
• Intercom / paging systems
• Emergency notification and safety solutions
• Translation and communication tools

I want to build an automated monitoring tool that scans Texas school district board agendas, meeting minutes, and procurement portals to identify opportunities relevant to our solutions. The goal is to detect early signals of projects, funding, or RFPs before they are widely announced.

What the System Should Do

The tool should:
• Monitor multiple district board websites and procurement portals
• Detect newly posted PDFs or documents automatically
• Download and store documents
• Run OCR on scanned PDFs when needed
• Extract keywords and relevant topics
• Output structured data (district, date, keywords, summary)
• Provide a weekly report or alerts when relevant items are found

This should run automatically on a schedule with minimal manual effort.

Keywords / Topics to Detect

The system should identify documents related to:
• Intercom or paging systems
• School safety or emergency communication
• Audio or classroom technology
• Construction / new campuses / renovations
• Bond programs or funding approvals
• Technology infrastructure upgrades
• Security systems or emergency response
• Translation or bilingual communication programs
• Grants or funding awards
• RFP / RFQ / bid announcements

Bonus if AI/NLP can classify relevance automatically.
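As a rough illustration of the keyword-extraction and structured-output steps listed above, the sketch below tags already-extracted document text against a few of the topics and emits a JSON record with the requested fields (district, date, keywords, summary). The topic names, regex patterns, the `tag_document` helper, and the "Example ISD" input are all hypothetical choices made for this example. The summary is only a truncated excerpt standing in for a real summarizer.

```python
import json
import re
from datetime import date

# Hypothetical topic -> pattern map; these patterns are illustrative
# assumptions, not a vetted keyword list from the listing.
TOPIC_PATTERNS = {
    "intercom_paging": r"\b(intercoms?|paging|public address)\b",
    "safety_emergency": r"\b(school safety|emergency (communication|notification|response)|security systems?)\b",
    "audio_classroom": r"\b(classroom audio|audio systems?|classroom technology)\b",
    "construction": r"\b(new campus(es)?|renovations?|construction)\b",
    "bond_funding": r"\b(bond (program|election|proposal)s?|grants?|funding)\b",
    "procurement": r"\b(RFPs?|RFQs?|request for (proposals?|qualifications)|bids?)\b",
}
COMPILED = {name: re.compile(pat, re.IGNORECASE) for name, pat in TOPIC_PATTERNS.items()}


def tag_document(district: str, doc_date: date, text: str) -> dict:
    """Return a structured record listing which topics appear in the text."""
    keywords = sorted(name for name, pat in COMPILED.items() if pat.search(text))
    # Placeholder summary: the first 200 characters, whitespace-normalized.
    summary = " ".join(text.split())[:200]
    return {
        "district": district,
        "date": doc_date.isoformat(),
        "keywords": keywords,
        "summary": summary,
        "relevant": bool(keywords),
    }


if __name__ == "__main__":
    sample = "The board approved an RFP for a new intercom and paging system."
    print(json.dumps(tag_document("Example ISD", date(2024, 1, 15), sample), indent=2))
```

Plain regex matching like this is easy to audit and tune. An AI/NLP relevance classifier, mentioned above as a bonus, could later replace or re-rank the `keywords` field without changing the record shape.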
Deliverables

• Automated monitoring + scraping system
• OCR integration for scanned documents
• Keyword extraction or classification logic
• Structured output (JSON, database, or spreadsheet)
• Scheduled automation (daily or weekly)
• Deployment instructions
• Basic documentation for maintenance

Optional but preferred:
• Email or Slack alerts
• Change detection (alert only when new documents appear)
• Cloud deployment

Required Skills

• Python
• Web scraping (BeautifulSoup, Playwright, Selenium, etc.)
• OCR tools (Tesseract, AWS Textract, Google Vision)
• PDF parsing / document processing
• Automation / scheduling
• API or data structuring

Preferred:
• NLP / AI experience
• Cloud platforms (AWS, GCP, Azure)
• Experience with government or education websites

To Apply

Please include:
• Examples of similar monitoring or scraping projects
• Recommended tech stack for this project
• Estimated timeline to MVP

Applications without relevant experience will not be considered.
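The optional change-detection item (alert only when new documents appear) could be approached by hashing each downloaded file and comparing it against the hashes seen on earlier runs. The following is a minimal sketch under that assumption. The helper names (`content_hash`, `load_seen`, `save_seen`, `filter_new`) and the JSON state file are hypothetical, and the download step is assumed to have already produced `(url, bytes)` pairs.

```python
import hashlib
import json
from pathlib import Path


def content_hash(data: bytes) -> str:
    """SHA-256 hex digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def load_seen(path: str) -> set:
    """Load previously seen digests from a JSON file (empty set if absent)."""
    p = Path(path)
    return set(json.loads(p.read_text())) if p.exists() else set()


def save_seen(path: str, seen: set) -> None:
    """Persist the digests so the next scheduled run can skip known files."""
    Path(path).write_text(json.dumps(sorted(seen)))


def filter_new(documents, seen: set) -> list:
    """Return (url, digest) for documents whose content is not yet in `seen`.

    `documents` is an iterable of (url, bytes) pairs; `seen` is updated in place.
    """
    new_items = []
    for url, data in documents:
        digest = content_hash(data)
        if digest not in seen:
            seen.add(digest)
            new_items.append((url, digest))
    return new_items
```

Hashing content rather than URLs means a re-uploaded agenda at a new address is not reported twice, while a revised PDF at the same address is treated as new. Whether that trade-off fits depends on how the districts publish updates.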