Capstone · AllianceBernstein

Market Signal Extraction

Lead analyst · Jan–May 2024

⚡ Tech sector signal: 14% annualized backtested return

Problem

AllianceBernstein's data science and investment teams read through enormous volumes of financial news, but turning that reading into something systematically useful is hard. The challenge is not gathering the data. It is building signals you can actually validate against real market returns rather than assuming they are informative.

System Design

We started with 93,845 financial articles from Perigon covering S&P 500 companies between September 2022 and December 2023. After deduplication and a news purity filter (keeping only articles where the primary company accounted for at least 30% of entity mentions), we worked with 68,482 unique pieces.
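The filtering step can be sketched in a few lines of plain Python. This is an illustration only: the field names (`ticker`, `text`, `entities`) are hypothetical, and deduplicating on whitespace- and case-normalized text is an assumption, since the write-up does not say how duplicates were detected. Only the 30% threshold comes from the project.

```python
from collections import Counter

def purity(entity_mentions, primary):
    """Share of entity mentions that refer to the primary company."""
    counts = Counter(entity_mentions)
    total = sum(counts.values())
    return counts[primary] / total if total else 0.0

def clean(articles, min_purity=0.30):
    """Drop duplicate texts, then keep articles that meet the purity threshold."""
    seen, kept = set(), []
    for art in articles:
        # Assumed dedup key: lowercase text with collapsed whitespace.
        key = " ".join(art["text"].lower().split())
        if key in seen:
            continue
        seen.add(key)
        if purity(art["entities"], art["ticker"]) >= min_purity:
            kept.append(art)
    return kept

# Hypothetical sample records, not project data.
articles = [
    {"ticker": "AAPL", "text": "Apple beats estimates", "entities": ["AAPL", "AAPL", "MSFT"]},
    {"ticker": "AAPL", "text": "apple  beats estimates", "entities": ["AAPL", "AAPL", "MSFT"]},
    {"ticker": "MSFT", "text": "Tech roundup", "entities": ["AAPL", "GOOG", "AMZN", "MSFT", "NVDA"]},
]
cleaned = clean(articles)
```

In this toy input the second record is a duplicate of the first, and the third has only one in five mentions on its primary company, so a single article survives.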

Topic discovery ran in three passes. LDA and BERTopic surfaced themes first at the article and company level, then at the sector level, then across the full S&P 100. GPT-3.5 ran the same process in parallel, using structured prompts via LangChain. We evaluated both approaches against human-labeled articles using BLEU and ROUGE scores. GPT-3.5 produced more coherent topics at the broad level (ROUGE 0.533 for Level 1 topics), though it struggled with finer subtopic distinctions. That comparison, rather than an assumption that the LLM would win, justified using GPT-generated topics as the primary labeling approach.
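To make the scoring concrete, here is a minimal sketch of a unigram-overlap ROUGE score between a generated topic label and a human label. The write-up does not state which ROUGE variant was used, so ROUGE-1 F1 with whitespace tokenization is an assumption, and the example labels are invented.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate label and a reference label."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: two of three candidate words appear in the reference.
score = rouge1_f1("quarterly earnings results", "earnings results and guidance")
```

Here precision is 2/3 and recall is 2/4, which gives an F1 of 4/7, or roughly 0.57.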

The final output was 10 unified topics labeled across all 68,482 articles. We then ran FinBERT sentiment classification on each article. In comparison, GPT-3.5 outperformed FinBERT on ambiguous and forward-looking financial language: FinBERT was solid on clear-cut cases but struggled with the nuanced phrasing common in real analyst reports. Article-level sentiment scores were averaged into daily company-level signals, forward-filled with a decay factor on days without coverage, then rolled up into scores for each of the 11 S&P 500 sectors.
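The aggregation chain (article scores to daily company signals, decayed forward-fill, sector roll-up) might look like the sketch below. Several details are assumptions rather than facts from the project: the decay value, filling over calendar days rather than trading days, equal weighting within a sector, and all function and variable names.

```python
from collections import defaultdict
from datetime import date, timedelta

def daily_mean(scores):
    """Average article scores per (ticker, day); scores holds (ticker, day, score) tuples."""
    buckets = defaultdict(list)
    for ticker, day, score in scores:
        buckets[(ticker, day)].append(score)
    return {key: sum(v) / len(v) for key, v in buckets.items()}

def fill_with_decay(daily, ticker, start, end, decay=0.9):
    """Carry the last signal forward on uncovered days, shrinking it by `decay` per day."""
    series, last = {}, None
    day = start
    while day <= end:
        if (ticker, day) in daily:
            last = daily[(ticker, day)]   # fresh coverage resets the signal
        elif last is not None:
            last *= decay                 # no coverage: decay the previous value
        if last is not None:
            series[day] = last
        day += timedelta(days=1)
    return series

def sector_mean(series_by_ticker, sector_of):
    """Equal-weight average of company signals per (sector, day)."""
    buckets = defaultdict(list)
    for ticker, series in series_by_ticker.items():
        for day, value in series.items():
            buckets[(sector_of[ticker], day)].append(value)
    return {key: sum(v) / len(v) for key, v in buckets.items()}

# Hypothetical scores: both companies are covered on day 1 only.
d1, d2, d3 = date(2023, 1, 2), date(2023, 1, 3), date(2023, 1, 4)
daily = daily_mean([("AAPL", d1, 1.0), ("AAPL", d1, 0.0), ("MSFT", d1, 1.0)])
series = {t: fill_with_decay(daily, t, d1, d3, decay=0.5) for t in ("AAPL", "MSFT")}
sectors = sector_mean(series, {"AAPL": "Technology", "MSFT": "Technology"})
```

With a decay of 0.5, the AAPL signal goes 0.5, 0.25, 0.125 over the three days, and the Technology score on the first day is the mean of 0.5 and 1.0.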

Alphalens provided the validation layer, testing whether the signals actually predicted forward returns rather than just correlating with past prices.
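One of the statistics Alphalens reports is the information coefficient: the Spearman rank correlation between a signal and the returns that follow it. The dependency-free sketch below shows that idea only. It is not the Alphalens API, the numbers are invented, and the helper names are hypothetical.

```python
def forward_returns(prices, horizon=1):
    """Return over the next `horizon` periods, aligned to the date the signal is observed."""
    return [prices[t + horizon] / prices[t] - 1 for t in range(len(prices) - horizon)]

def ranks(values):
    """1-based ranks, with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def information_coefficient(signal, fwd):
    """Spearman rank correlation between a signal and the subsequent returns."""
    return pearson(ranks(signal), ranks(fwd))

# Hypothetical cross-section: four sector signals on one date and their next-period returns.
signal = [0.2, -0.1, 0.5, 0.0]
fwd = [0.01, -0.02, 0.03, 0.00]
ic = information_coefficient(signal, fwd)
```

Because the returns are taken after the signal date, a high coefficient means the signal ordered future outcomes correctly. In this toy case the two rankings match exactly, so the coefficient is 1.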

Architecture

~68K S&P 500 financial news articles
  → LDA + BERTopic topic modeling (baseline comparison)
  → GPT-3.5 + LangChain prompt engineering (primary)
  → FinBERT sentiment classification
  → Sector-level signal aggregation (11 sectors)
  → Alphalens factor validation against real market returns

Results

The technology sector signal produced 14% in annualized backtested returns, with a clear relationship between topic-level sentiment and subsequent sector performance. The full factor analysis was delivered to AllianceBernstein's portfolio team, showing that LLM-extracted signals can generate statistically meaningful alpha when the methodology is validated bottom-up rather than assumed.

Stack

Python · LDA · BERTopic · GPT-3.5 API · LangChain · FinBERT · Alphalens · Pandas
