FinSight
Quant Research OS
End-to-end ML · Walk-forward validated · Live demo available

An end-to-end machine learning pipeline that extracts alpha signals from S&P 500 earnings call transcripts using FinBERT, RAG, and gradient-boosted models.

Key insights

Signal quality is highly uneven across sectors

Why it matters

Energy and Industrials show strong predictability while Technology is near-noise.

Deploy sector-aware capital allocation instead of uniform market-wide exposure.
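
One way to act on this, sketched under stated assumptions: weight each sector by how far its information coefficient (IC) sits above a noise floor, so near-noise sectors receive no capital. The `sector_weights` helper, the `min_ic` floor of 0.05, and the Industrials and Technology IC values are hypothetical illustrations. Only Energy IC = 0.31 is a figure reported for this pipeline.

```python
def sector_weights(sector_ic, min_ic=0.05):
    """Allocate capital in proportion to each sector's IC above a floor.

    min_ic is a hypothetical noise threshold, not a value from FinSight.
    """
    # Keep only the part of each sector's IC that exceeds the floor.
    excess = {s: max(ic - min_ic, 0.0) for s, ic in sector_ic.items()}
    total = sum(excess.values())
    if total == 0:
        # No sector clears the floor: stay flat everywhere.
        return {s: 0.0 for s in sector_ic}
    return {s: e / total for s, e in excess.items()}


# Energy IC is the reported figure; the other two values are placeholders.
example_ic = {"Energy": 0.31, "Industrials": 0.20, "Technology": 0.02}
weights = sector_weights(example_ic)

for sector, w in weights.items():
    print(f"{sector}: {w:.1%}")
```

Proportional weighting is only one possible rule. A rank-based or capped scheme may suit a live book better if the IC estimates themselves are noisy.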

Model choice affects reliability more than headline accuracy

Why it matters

LightGBM is materially more stable than the baseline alternatives.

Prioritize low-variance model behavior for live risk management.
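
A minimal sketch of how stability can be separated from headline performance, assuming stability is measured as the dispersion of per-fold IC. The `ic_stability` helper and both fold series are hypothetical. They are not FinSight results, and the source does not state which stability metric it uses.

```python
from statistics import mean, stdev


def ic_stability(fold_ics):
    """Summarize per-fold ICs: mean, sample std, and their ratio (ICIR)."""
    m = mean(fold_ics)
    s = stdev(fold_ics)
    # ICIR = mean IC / std of IC; higher means a steadier signal.
    icir = m / s if s > 0 else float("inf")
    return {"mean_ic": m, "ic_std": s, "icir": icir}


# Hypothetical fold ICs: identical mean, very different dispersion.
steady = ic_stability([0.05, 0.06, 0.04, 0.05])
erratic = ic_stability([0.20, -0.10, 0.15, -0.05])

print(steady)
print(erratic)
```

Both series share the same mean IC, so a headline comparison would rank them equally. The ratio of mean to standard deviation is what tells them apart.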

Short holding windows remain fragile

Why it matters

Backtests indicate weak profitability net of costs at the 5-day horizon.

Default to longer holding windows until execution edge improves.
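
The arithmetic behind the horizon sensitivity can be sketched as follows. This assumes the 10 bps cost applies per unit of traded notional and that the book is fully replaced at each rebalance (turnover of 2.0: sell the old positions, buy the new ones). Neither assumption is confirmed by the source, and `annual_cost_drag` is a hypothetical helper.

```python
TRADING_DAYS = 252  # conventional trading-day count per year


def annual_cost_drag(holding_days, cost_bps=10.0, turnover_per_rebalance=2.0):
    """Approximate yearly transaction-cost drag as a fraction of capital.

    Assumes full replacement of the book at each rebalance (hypothetical).
    """
    rebalances_per_year = TRADING_DAYS / holding_days
    return rebalances_per_year * turnover_per_rebalance * cost_bps / 10_000


for days in (5, 10, 20):
    print(f"{days}-day horizon: {annual_cost_drag(days):.2%} per year")
```

Under these assumptions a 5-day horizon must earn roughly 10% a year gross just to break even, about four times the hurdle of a 20-day horizon.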

System Pipeline

Stage 1: Data Ingestion (14,584 transcripts · yfinance)
Stage 2: NLP Pipeline (FinBERT + RAG · 34 features)
Stage 3: ML Models (XGBoost · LightGBM · LSTM)
Stage 4: Backtesting (Long-short · 10 bps TC)
Stage 5: Sector Analysis (Energy IC = 0.31)

[Chart] Transcript Coverage: earnings calls by year
[Chart] Sentiment Trend: management vs Q&A net sentiment
[Chart] Model IC Comparison: mean IC across walk-forward folds

Key Findings