# Changelog

## 0.8.0 — 2026-04-20 (Manual Company Entry)
Standalone company creation and safer manual entity linking.
### Added
- Add Company form: New card on the Companies list page lets users create a standalone company with an optional type. Duplicate names redirect to the existing record; new entries redirect to the freshly created company's detail page.
- POST `/companies/add` endpoint backing the new form.
### Fixed
- Manual company/person linking on project pages: The "Add Company" and "Add Person" forms on the project detail page previously ran typed names through the fuzzy entity resolver (`resolve_company`), which could silently map a new name to an unrelated existing company via short substring matches (`partial_ratio` × 0.9 crossing the 90% threshold). Manual entry now uses exact-name lookup, falling back to inserting a new record — the resolver remains in place for LLM-extracted names, where fuzzy matching is desired.
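The exact-name fallback described above can be sketched as follows — a minimal version assuming a simple `companies(id, name, type)` table; the function name and schema here are illustrative, not the actual implementation:

```python
import sqlite3

def get_or_create_company(conn, name, company_type=None):
    """Exact-name lookup with insert-new fallback (sketch).

    Unlike the fuzzy resolver used for LLM-extracted names, manual entry
    matches only on the exact (case-insensitive) name, so a typed name can
    never be silently mapped to an unrelated existing record.
    Returns (company_id, created) where created is True for a new row.
    """
    row = conn.execute(
        "SELECT id FROM companies WHERE name = ? COLLATE NOCASE", (name,)
    ).fetchone()
    if row:
        return row[0], False  # redirect to the existing record
    cur = conn.execute(
        "INSERT INTO companies (name, type) VALUES (?, ?)", (name, company_type)
    )
    conn.commit()
    return cur.lastrowid, True  # redirect to the freshly created record
```

Calling it twice with the same name (in any case) returns the same id the second time, which is what lets the form redirect duplicates to the existing detail page.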
## 0.7.0 — 2026-04-16 (Article Management)
Manual article import, article-project linking, video content detection, and pagination.
### Added
- Manual URL import: Paste an article URL on the Articles page to fetch, extract, and analyze it with Sonnet — skips keyword filter and Haiku classification (user has already vetted relevance). Shows project creation/update summary on completion.
- Article-project linking: Search and link articles to existing projects from the article detail page. Includes autocomplete project search and unlink controls.
- Video content detection: Detects YouTube/Vimeo embeds and video tags during article processing. New `content_type` field on articles (`article`, `video`, `mixed`). Filterable on the articles page.
- Articles pagination: Replaced hardcoded 100-article limit with paginated browsing (50 per page) with prev/next navigation.
- Articles relevance filter: Toggle between "All Articles" and "Linked to Projects" on the articles page.
- Project search API: `GET /articles/api/search-projects?q=...` endpoint for autocomplete lookups.
### Changed
- Extraction refactor: Shared `_process_extraction()` helper in the articles routes — used by both manual import and re-analyze, eliminating code duplication.
- Pipeline content_type: RSS collector and pipeline runner now detect and store `content_type` for all collected articles.
### Schema
- Migration 003: `content_type TEXT DEFAULT 'article'` column on the articles table.
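A guarded version of such a migration might look like the sketch below — it checks `PRAGMA table_info` first so re-running it is harmless (the function name is illustrative; the actual migration runner may differ):

```python
import sqlite3

def migrate_003(conn):
    """Add the content_type column to articles if not already present (sketch).

    Checking existing columns first makes the migration idempotent, which
    matters for SQLite since it has no ADD COLUMN IF NOT EXISTS.
    """
    cols = {row[1] for row in conn.execute("PRAGMA table_info(articles)")}
    if "content_type" not in cols:
        conn.execute(
            "ALTER TABLE articles ADD COLUMN content_type TEXT DEFAULT 'article'"
        )
        conn.commit()
```

Existing rows pick up the `'article'` default automatically, so the filter on the articles page works immediately after migrating.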
### Housekeeping
- Added `.pytest_cache/` and `data/backups/` to `.gitignore`
- Removed stale pip install artifacts (`=22.0`, `=3.10`)
- Archived completed Session 4 plan to `sessions/archive/`
## 0.6.0 — 2026-03-20 (Branding & Public Pages)
Parjana branding, GaiaOps attribution, public homepage, and changelog page.
### Added
- Parjana favicon: Downloaded from parjanaengineering.com, referenced in all pages via `base.html`
- Parjana logo: Icon in dashboard nav bar (24px) and login page header (64px)
- GaiaOps footer: "Created by GaiaOps — Multiply Your Environmental Impact" on all pages with link to gaiaops.io
- Public homepage: Anonymous visitors to `/` see a landing page with SEO meta tags (title, description), Parjana logo, project description, and login link. Authenticated users still see the dashboard.
- Changelog page: `/changelog` route (public, no auth) renders CHANGELOG.md to HTML via the markdown library. Linked from footer.
- SEO controls: `<meta name="robots" content="noindex">` on all dashboard pages. Public homepage is the only indexed page.
### Changed

- `/` route: No longer requires authentication — shows public homepage for anonymous visitors, dashboard for authenticated users
- Context processor: Skips DB query for nav counts on anonymous requests
### Dependencies

- Added `markdown>=3.3` for changelog rendering
### Assets

- `dashboard/static/img/favicon.png` — Parjana stacked layers icon
- `dashboard/static/img/parjana-logo-icon.jpg` — Parjana icon (standalone)
- `dashboard/static/img/parjana-logo-full.webp` — Parjana full logo with text
## 0.5.0 — 2026-03-20 (Production Hardening)
Production reliability improvements: gunicorn WSGI server, daily database backups.
### Added
- Gunicorn WSGI server: Replaced the Flask dev server with gunicorn (`--workers 1 --timeout 120`). Eliminates the dev-server warning in production.
- Daily database backups: APScheduler job at 04:00 UTC copies the SQLite DB to `/data/backups/solarscout-YYYY-MM-DD.db` with 7-day retention and auto-pruning.
- Pre-deploy testing checklist: Documented in session notes.
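The backup-and-prune job above can be sketched with only the standard library — a minimal version assuming the dated-filename scheme noted in the backup bullet; the function name and `today` parameter are illustrative, not the actual code:

```python
import shutil
from datetime import date, timedelta
from pathlib import Path

RETENTION_DAYS = 7  # matches the 7-day retention described above

def backup_database(db_path, backup_dir, today=None):
    """Copy the SQLite DB to a dated file, then prune old backups (sketch).

    The `today` parameter exists only to make the pruning logic testable;
    a scheduler job would call this with the default.
    """
    today = today or date.today()
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / f"solarscout-{today.isoformat()}.db"
    shutil.copy2(db_path, dest)

    # Auto-prune: delete any backup older than the retention window.
    cutoff = today - timedelta(days=RETENTION_DAYS)
    for old in backup_dir.glob("solarscout-*.db"):
        stamp = old.stem.split("solarscout-")[1]
        if date.fromisoformat(stamp) < cutoff:
            old.unlink()
    return dest
```

Encoding the date in the filename keeps pruning trivial: the retention check parses the name rather than trusting filesystem mtimes, which can change on copy.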
### Changed

- App initialization: `run_dashboard.py` refactored so DB init, app creation, and scheduler start happen at module load — works for both gunicorn import and direct execution.
- Scheduler: Now runs two jobs — daily pipeline (12:00 UTC) and daily backup (04:00 UTC).
## 0.4.0 — 2026-03-20 (Production Deployment)
First production deployment to Railway with custom domain.
### Added
- Production hosting: Deployed to Railway Hobby Plan with SQLite on persistent volume
- Custom domain: Live at parjanasolarscout.app with auto-provisioned SSL via Railway/Let's Encrypt
- Run Pipeline button: Dashboard home page has a "Run Pipeline Now" button with status polling, concurrent-run prevention, and auto-refresh on completion
- APScheduler integration: Daily pipeline runs at 12:00 UTC via in-process background scheduler — no separate cron service needed
- Pipeline HTTP endpoint: `/run-pipeline` (POST, auth required) for triggering pipeline runs programmatically
- Pipeline status endpoint: `/pipeline-status` returns whether a pipeline run is currently in progress
- BACKLOG.md: Backlog file for tracking planned features and known issues across sessions
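One plain way to get the concurrent-run prevention and status reporting described above is a non-blocking lock around a background thread — a sketch under that assumption (function names are illustrative; the actual guard may differ):

```python
import threading

_pipeline_lock = threading.Lock()

def try_start_pipeline(run_fn):
    """Start a pipeline run unless one is already active (sketch).

    Returns False without blocking when a run is in progress — the kind of
    answer a /run-pipeline handler would translate into an error response.
    """
    if not _pipeline_lock.acquire(blocking=False):
        return False  # a run is already in progress

    def _worker():
        try:
            run_fn()
        finally:
            _pipeline_lock.release()  # always free the lock, even on failure

    threading.Thread(target=_worker, daemon=True).start()
    return True

def pipeline_in_progress():
    """What a /pipeline-status handler would report (sketch)."""
    return _pipeline_lock.locked()
```

With one gunicorn worker, a single in-process lock is sufficient; multiple workers would need a shared flag (e.g. a DB row) instead.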
### Changed

- Default Sonnet model: Updated from `claude-sonnet-4-5-20250514` to `claude-sonnet-4-6`
### Infrastructure

- GitHub repo: `ross-gaiaops852/solar-scout` (private)
- Railway web service auto-deploys from `main` branch
- Volume mounted at `/data` for SQLite persistence across deploys
- Environment variables managed via Railway dashboard (not `.env` in production)
### First Production Pipeline Run
- 305 articles collected from 7 feeds
- 142 passed keyword filter, 54 passed Haiku classification
- 48 new projects created, 51 companies identified, 11 people tracked
- Estimated API cost: $1.41
- Duration: ~31 minutes
## 0.3.0 — 2026-03-19 (Dashboard v2)
Dashboard usability improvements and activity tracking.
### Added
- Activity log: All manual edits to project fields and outreach status changes are logged chronologically. Displayed at the bottom of each project detail page.
- People page: New top-level "People" tab — browse all tracked people with their company and project associations, with search/filter.
- Review queue redesign: Side-by-side comparison of existing record vs incoming data. Differing fields highlighted. Links to view full existing record before deciding.
### Improved
- Key People company links: Company names in project Key People section now hyperlink to the company record.
- Company People → Project links: Key People on company detail pages now show which project they're associated with, hyperlinked.
- Review queue context: Shows structured field comparison instead of raw JSON blob.
### Fixed

- Intelligence extractor crash: `.format()` on the extraction prompt was interpreting JSON schema braces as template variables (`KeyError`). Switched to `.replace()`.
- Article text lost on re-analysis failure: Text updates and API calls shared a transaction — if re-analysis failed, pasted text was rolled back. Text is now committed before attempting extraction.
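The brace problem above is easy to reproduce: `str.format()` treats every `{...}` in the template as a replacement field, so a JSON schema embedded in a prompt raises `KeyError`. A sketch of the safer pattern (the prompt text and placeholder name are illustrative):

```python
EXTRACTION_PROMPT = """Extract project intelligence from the article below.
Respond with JSON matching this schema: {"project_name": "...", "capacity_mw": 0}

Article:
ARTICLE_TEXT
"""

def build_prompt(article_text):
    """Interpolate the article with .replace() instead of .format() (sketch).

    .replace() touches only the literal ARTICLE_TEXT marker and leaves the
    schema's braces alone; .format() would try to resolve {"project_name"...}
    as a template field and raise KeyError.
    """
    return EXTRACTION_PROMPT.replace("ARTICLE_TEXT", article_text)
```

The alternative fix is doubling every literal brace (`{{` / `}}`), but that makes long JSON schemas hard to read and easy to break on edit.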
### Schema

- Migration 002: `activity_log` table with `project_id`, `field_name`, old/new values, timestamp.
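Writing to that table can be sketched as a small helper — note the `old_value`/`new_value`/`changed_at` column names below are assumptions (the migration note only says "old/new values, timestamp"), and the helper name is illustrative:

```python
import sqlite3
from datetime import datetime, timezone

def log_field_change(conn, project_id, field_name, old, new):
    """Append one chronological row to activity_log (sketch).

    Every manual edit to a project field records what changed, from what,
    to what, and when — which is all the project detail page needs to
    render the activity log in order.
    """
    conn.execute(
        "INSERT INTO activity_log "
        "(project_id, field_name, old_value, new_value, changed_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (project_id, field_name, old, new,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
```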
## 0.1.0 — 2026-03-03 (MVP Alpha)
First working end-to-end pipeline run.
### Added
- Data collection: RSS collector pulling from 7 feeds (Solar Power World, PV Magazine, Utility Dive, PR Newswire, GlobeNewsWire, 2 Google News queries)
- Keyword filter: Two-tier regex matching (solar terms + development signals) eliminates irrelevant articles before any API calls
- Haiku relevance classifier: Sends filtered articles to Claude Haiku for relevance scoring (50MW+ US solar projects)
- Sonnet intelligence extractor: Full structured extraction — project details, companies, people, timelines, stormwater relevance, outreach urgency
- Entity resolution: Fuzzy matching for projects (85% threshold) and companies (90% threshold) with ambiguous matches (70-85%) routed to review queue
- Pipeline runner: Orchestrates collect → filter → classify → extract → resolve → store with per-run stats logging
- Dashboard: Flask app with login, pipeline status, projects list (filterable/sortable), project detail with stormwater design window, companies, articles, review queue
- Health endpoint: `/health` returns JSON stats without auth (for Railway health checks)
- Database: SQLite schema with 9 tables, indexes on common query columns
- Tests: Keyword filter and entity resolver test suites
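The two-tier keyword filter above can be sketched as a pair of regex gates — the term lists below are tiny illustrative samples, not the real vocabularies:

```python
import re

# Tier 1: is this about solar at all? (illustrative subset)
SOLAR_TERMS = re.compile(r"\b(solar|photovoltaic|pv)\b", re.IGNORECASE)

# Tier 2: is there a development signal? (illustrative subset)
DEVELOPMENT_SIGNALS = re.compile(
    r"\b(\d+\s*mw|megawatt|groundbreaking|permit|construction|approved)\b",
    re.IGNORECASE,
)

def passes_keyword_filter(text):
    """Two-tier regex gate (sketch): an article must match both a solar
    term and a development signal before any paid API call is made."""
    return bool(SOLAR_TERMS.search(text) and DEVELOPMENT_SIGNALS.search(text))
```

Requiring both tiers is what makes the filter cheap and precise: "solar eclipse" news fails tier 2, and generic construction news fails tier 1, so neither reaches the Haiku classifier.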
### Fixed

- Relevance classifier prompt crash — `.format()` was interpreting JSON braces as template variables
- Entity resolver crash — `sqlite3.Row` objects don't support `.get()`; converted to dicts at the resolver boundary
- Entity resolver developer matching — `_developer` field was referenced but never populated; now queries the `project_companies` table
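The `sqlite3.Row` fix above boils down to one conversion at the boundary — `Row` supports `row["name"]` but not `row.get("name")`, so code expecting dict semantics crashes. A sketch (the helper name is illustrative):

```python
import sqlite3

def rows_as_dicts(cursor):
    """Convert sqlite3.Row results to plain dicts at the resolver boundary
    (sketch). dict(row) works because Row exposes keys() and item access,
    and the resulting dicts support .get() with a default for missing keys.
    """
    return [dict(row) for row in cursor.fetchall()]
```

Converting once at the boundary is simpler than sprinkling `try/except KeyError` through the resolver, and downstream code stays oblivious to the storage layer.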
### Improved
- Full-text fetching parallelized across domains (ThreadPoolExecutor) — previously sequential with 2s delay per article
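The parallelization above — concurrent across domains, sequential within each — can be sketched like this (names are illustrative; `fetch_one` stands in for whatever HTTP call the collector actually makes):

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse

def fetch_all(urls, fetch_one, max_workers=8):
    """Fetch article bodies with one worker lane per domain (sketch).

    URLs are grouped by host so different domains proceed in parallel
    while each domain's URLs stay sequential — preserving per-host
    politeness (e.g. a delay inside fetch_one) without paying the old
    cost of one global sequential queue.
    """
    by_domain = {}
    for url in urls:
        by_domain.setdefault(urlparse(url).netloc, []).append(url)

    def fetch_domain(domain_urls):
        return [fetch_one(u) for u in domain_urls]  # sequential per host

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(fetch_domain, by_domain.values())
    return [item for batch in results for item in batch]
```

With N domains the wall-clock time drops from the sum of all per-article delays to roughly the slowest single domain's queue.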