How does StockMarketSignals score a stock?

Each ticker's score combines every creator's reliability weight with the stance and confidence of their qualifying mentions over a rolling 14-day window. The result is normalized to a 0–100 index, with a consensus bonus when three or more independent channels mention the same name.

What is a reliability weight?

Every tracked YouTuber carries an editorial reliability weight from 1 to 100 based on their track record. Higher-weight creators move a ticker's score more than lower-weight ones.

Why are some stocks not shown?

Signals must score at least 40 to be published. Mentions that are too thin, too neutral, or low-confidence are automatically discarded and never reach the public feed.

How are stock mentions extracted from videos?

Video transcripts (native captions, or AI speech-to-text as a fallback) are parsed by Google Gemini to isolate tickers and label each mention with a stance — bullish, neutral, or bearish — plus a confidence score.

How current is the data?

Scores recompute every 6 hours over a rolling 14-day window, so the board reflects recent creator sentiment rather than outdated calls.

Engine Internals

Technical Methodology

Every formula, gate, and pipeline step that turns a YouTube transcript into a published score. No hand-waving — this is what actually runs. For the high-level overview, see Methodology.

Video ingestion

Every 6 hours we poll the YouTube Data API for new uploads across our curated channel list. Each channel has a stable channel_id and an admin-tuned reliability weight (1–100) that reflects the creator's track record.

New videos land in the videos table with metadata only — transcript and mention extraction run as separate stages so a failure in one doesn't block the next channel.

Transcript pipeline

Two-pass strategy, every 5 minutes, batched per video:

Pass 1 — native captions. YouTube auto-captions (or creator-provided) via youtube-transcript. Free, fast, ~80% coverage.
Pass 2 — AI speech-to-text. For the ~20% with no captions, Google Gemini transcribes the audio directly. Slower and metered, but no video is left behind.

Both paths produce a single transcript column with the source tagged (youtube / ai / none) so we can audit quality later.
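
The two-pass fallback can be sketched as below. This is a minimal illustration, not the production worker: `fetch_captions` and `transcribe_audio` are hypothetical callables standing in for the youtube-transcript and Gemini steps, and each is assumed to return `None` when it has nothing.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: takes a video id, returns transcript text or None.
Fetcher = Callable[[str], Optional[str]]


def get_transcript(
    video_id: str,
    fetch_captions: Fetcher,
    transcribe_audio: Fetcher,
) -> Tuple[Optional[str], str]:
    """Return (transcript, source) where source is 'youtube', 'ai', or 'none'."""
    # Pass 1: native captions (free and fast).
    text = fetch_captions(video_id)
    if text:
        return text, "youtube"
    # Pass 2: AI speech-to-text, only when captions are missing.
    text = transcribe_audio(video_id)
    if text:
        return text, "ai"
    # Neither pass produced text; tag it so the gap is auditable.
    return None, "none"
```

Injecting the two fetchers keeps the fallback order testable without touching either external service.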

Mention extraction

Each transcript is passed to Gemini with a structured-output schema. The model returns a JSON array of mentions, each carrying:

{
  "ticker": "NVDA",
  "stance": "bullish" | "neutral" | "bearish",
  "confidence": 0.0 - 1.0,
  "timestamp_seconds": 412,
  "excerpt": "...verbatim quote..."
}

Stance reflects what the creator said about the ticker — not whether we agree. Confidence is the model's certainty that the ticker was actually being discussed (vs. a passing reference or a misheard word).
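
A minimal sketch of checking one returned mention against the shape above before it enters the pipeline. The `Mention` class and `parse_mention` function are hypothetical names used for illustration; the actual validation code is not shown in this document.

```python
from dataclasses import dataclass

VALID_STANCES = {"bullish", "neutral", "bearish"}


@dataclass(frozen=True)
class Mention:
    ticker: str
    stance: str
    confidence: float
    timestamp_seconds: int
    excerpt: str


def parse_mention(raw: dict) -> Mention:
    """Validate one mention dict; raise ValueError if it is out of shape."""
    stance = raw["stance"]
    if stance not in VALID_STANCES:
        raise ValueError(f"unknown stance: {stance!r}")
    confidence = float(raw["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Mention(
        ticker=str(raw["ticker"]),
        stance=stance,
        confidence=confidence,
        timestamp_seconds=int(raw["timestamp_seconds"]),
        excerpt=str(raw["excerpt"]),
    )
```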

Ticker validation

Models hallucinate. Before any mention can move forward, its ticker symbol is checked against Yahoo Finance. A symbol that doesn't resolve to a real listed instrument is quarantined, then removed by a weekly cleanup-invalid-tickers cron job. Validated symbols are cached in the tickers table with exchange, name, and logo.
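
The validate-then-cache flow might look roughly like this. It is a sketch under assumptions: `resolve` is a hypothetical callable standing in for the Yahoo Finance lookup (returning instrument details, or `None` when the symbol doesn't resolve), and a plain dict stands in for the tickers table.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def validate_tickers(
    symbols: Iterable[str],
    resolve: Callable[[str], Optional[dict]],
    cache: Dict[str, dict],
) -> Tuple[List[str], List[str]]:
    """Split symbols into (validated, quarantined), caching validated ones."""
    valid: List[str] = []
    quarantined: List[str] = []
    for symbol in symbols:
        if symbol in cache:  # already validated on an earlier run
            valid.append(symbol)
            continue
        info = resolve(symbol)  # hypothetical lookup; None means "not listed"
        if info is None:
            quarantined.append(symbol)
        else:
            cache[symbol] = info
            valid.append(symbol)
    return valid, quarantined
```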

Scoring formula

This is the real math, run on every recompute (auto-chained after every new transcript batch). For each ticker T across a rolling 14-day window:

Step 1 — sign each mention

sign = +1   if stance == "bullish"
       -1   if stance == "bearish"
       +0.2 if stance == "neutral"

mention_value = sign × confidence

Neutral mentions count, but barely — they nudge the score without dominating it. A creator saying "I'm watching NVDA" shouldn't move the needle the same as "I'm buying NVDA."
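
Step 1 translates directly into a lookup table. A minimal sketch, assuming stance strings exactly as shown in the extraction schema; `mention_value` is named after the pseudocode above.

```python
# Sign applied to each stance, as in Step 1.
SIGN = {"bullish": 1.0, "bearish": -1.0, "neutral": 0.2}


def mention_value(stance: str, confidence: float) -> float:
    """Signed, confidence-scaled value of a single mention."""
    return SIGN[stance] * confidence


# At the same confidence, a neutral mention carries one fifth of the
# weight of a bullish one: roughly 0.18 versus 0.9.
watching = mention_value("neutral", 0.9)
buying = mention_value("bullish", 0.9)
```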

Step 2 — collapse per creator, clamp

for each creator C who mentioned T:
  creator_avg[C] = mean(mention_value over C's mentions of T)
  creator_avg[C] = clamp(creator_avg[C], -1, +1)

Averaging keeps a single creator with many high-confidence bullish mentions from running away with the score, and the clamp guarantees the result stays within [-1, +1], so one creator can move the needle by at most ± their weight.
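
A sketch of Step 2, assuming each mention has already been reduced to a `(creator, mention_value)` pair for one ticker. The input shape is an assumption made for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def creator_averages(mentions: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Collapse (creator, mention_value) pairs to one clamped average per creator."""
    by_creator = defaultdict(list)
    for creator, value in mentions:
        by_creator[creator].append(value)
    # Mean per creator, then clamp to [-1, +1] as in Step 2.
    return {c: max(-1.0, min(1.0, mean(vs))) for c, vs in by_creator.items()}


# Three bullish mentions at 0.9 from one creator still collapse to about 0.9,
# the same as a single mention at that confidence.
example = creator_averages([("a", 0.9), ("a", 0.9), ("a", 0.9), ("b", -0.5), ("b", 0.1)])
```

Because every `mention_value` already lies in [-1, +1], the mean does too; in this sketch the clamp only matters if out-of-range values slip in upstream.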

Step 3 — weight and sum

raw = Σ ( weight[C] × creator_avg[C] )   for all creators C of T

theoretical_max = max_weight × creator_count
consensus_bonus = 1.15 if creator_count >= 3 else 1.00

Step 4 — normalize to 0–100

deviation = (raw / theoretical_max) × 50 × consensus_bonus
score     = round( clamp( 50 + deviation, 0, 100 ) )

50 is neutral. Pure bullish consensus across high-weight creators pushes toward 100; bearish pushes toward 0. The 15% consensus bonus rewards independent corroboration — three creators agreeing is meaningfully different from one creator shouting.
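
Steps 3 and 4 can be worked through in a few lines. This is a sketch under stated assumptions, not the production code: `max_weight` is taken to be 100 (the top of the 1–100 weight scale), which the text above does not spell out, and an empty creator set raises an error because the source does not define that case. Python's `round` also resolves exact .5 ties to the nearest even integer, which may differ from the rounding used in production.

```python
from typing import Dict


def score_ticker(
    creator_avg: Dict[str, float],
    weight: Dict[str, float],
    max_weight: float = 100.0,  # assumption: top of the 1-100 weight scale
) -> int:
    """Weighted sum of clamped creator averages, normalized to 0-100."""
    if not creator_avg:
        raise ValueError("need at least one creator")  # undefined in the source
    raw = sum(weight[c] * avg for c, avg in creator_avg.items())
    creator_count = len(creator_avg)
    theoretical_max = max_weight * creator_count
    consensus_bonus = 1.15 if creator_count >= 3 else 1.00
    deviation = (raw / theoretical_max) * 50 * consensus_bonus
    return round(max(0.0, min(100.0, 50 + deviation)))


# Worked example: weights 80, 60, 40 with averages 0.9, 0.5, -0.2.
# raw = 72 + 30 - 8 = 94; theoretical_max = 300
# deviation = (94 / 300) * 50 * 1.15, about 18.02, so the score is 68.
score = score_ticker(
    {"a": 0.9, "b": 0.5, "c": -0.2},
    {"a": 80, "b": 60, "c": 40},
)
```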

Coverage gates

A score alone doesn't get published. The pick must clear:

≥ 2 distinct creators, OR
≥ 3 total mentions from the same creator

This kills single-mention noise. A ticker name-dropped once by one creator never becomes a published recommendation, regardless of how confidently the model extracted it.
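
The gate reduces to a count per creator. A minimal sketch, assuming the input is one creator id per mention of the ticker inside the window; the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable


def clears_coverage_gate(mention_creators: Iterable[str]) -> bool:
    """True if >= 2 distinct creators, or >= 3 mentions from a single creator."""
    counts = Counter(mention_creators)
    return len(counts) >= 2 or any(n >= 3 for n in counts.values())
```

For example, `["a"]` and `["a", "a"]` fail the gate, while `["a", "b"]` and `["a", "a", "a"]` pass.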

The previously published score remains live until the next recompute changes it by ≥ 5 points or the contributing mentions change, at which point the AI thesis is invalidated and regenerated.

AI reasoning regeneration

Every published pick carries a written thesis — the "why" you see on the stock page. It's generated by a separate Gemini pass that ingests:

The ticker's verbatim mention excerpts (with timestamps)
Each contributing creator's name and weight
The final score and stance distribution

When the underlying mentions or score shift materially, the thesis is marked stale and re-queued. A worker drains the queue every 5 minutes, so the public site never shows a thesis that contradicts the current score.
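
One way to express the staleness check, using the ≥ 5 point threshold stated under Coverage gates. The mention-id sets are a hypothetical way of detecting that the contributing mentions changed; the source does not say how that comparison is made.

```python
from typing import Iterable


def thesis_is_stale(
    old_score: int,
    new_score: int,
    old_mention_ids: Iterable[int],
    new_mention_ids: Iterable[int],
    threshold: int = 5,
) -> bool:
    """True when the score moved by >= threshold or the mention set changed."""
    score_moved = abs(new_score - old_score) >= threshold
    mentions_changed = set(old_mention_ids) != set(new_mention_ids)
    return score_moved or mentions_changed
```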

Backtest engine

We measure ourselves. Every published pick with score ≥ 70 is auto-enrolled into a hypothetical $100 position, captured at the next trading day's open after publication.

entry_date  = first trading day after published_at
entry_price = open[entry_date]
spy_entry   = open[entry_date] for SPY

window prices captured at +1M, +3M, +6M, +1Y:
  pick_return  = (price[window] / entry_price) - 1
  spy_return   = (spy[window]   / spy_entry)   - 1
  alpha        = pick_return - spy_return

Snapshots fill in automatically as windows mature (daily cron). Results are aggregated per creator and site-wide into a leaderboard. The dashboard is currently admin-only until the sample size is large enough to publish honestly: back-filling old picks with synthetic entry prices would distort the record, so we're letting it accumulate forward-only.

Assumes no fees, slippage, taxes, or dividends. α is purely illustrative — past performance doesn't guarantee future results.
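
The window arithmetic above, as a small function with made-up prices for illustration. The function name and the returned `position_value` field (the hypothetical $100 position marked to the window price) are assumptions; the figures are not real market data.

```python
from typing import Dict


def window_result(
    entry_price: float,
    window_price: float,
    spy_entry: float,
    spy_window: float,
    stake: float = 100.0,  # the hypothetical $100 position
) -> Dict[str, float]:
    """Pick return, SPY return, alpha, and position value for one window."""
    pick_return = window_price / entry_price - 1
    spy_return = spy_window / spy_entry - 1
    return {
        "pick_return": pick_return,
        "spy_return": spy_return,
        "alpha": pick_return - spy_return,
        "position_value": stake * (1 + pick_return),
    }


# Illustrative numbers only: pick goes 100 -> 112 (+12%), SPY goes
# 400 -> 420 (+5%), so alpha is about +7 percentage points.
result = window_result(100.0, 112.0, 400.0, 420.0)
```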

Limits & honesty

What we explicitly do not do:

No price targets. We don't predict where a stock will go — we surface what high-conviction creators are saying, weighted by their reliability.
No recency decay yet. Within the 14-day window all mentions count equally. A future revision may weight more recent mentions higher.
No sector or macro weighting. A 100 score on a microcap means the same thing as a 100 score on a megacap. Position-sizing is on you.
No emphasis on short signals. Bearish stances are tracked symmetrically, but in practice most viewers act on long ideas.
Not investment advice. This is a signal aggregator. Read the disclaimer.
