Understand Biased Training of AI Models from the West Bengal Elections

Disclaimer: The author does not support any political party and is not trying to promote or demote any political party, government, or private agency. This post is not meant to trigger political discussion. References are made purely to explain bias in AI training and how it affects different scenarios. Since this was a major recent event, the author picked it as a real-life example to simplify the concept of training bias in AI models and its impact, which can similarly affect many business decisions.

The West Bengal elections were a good example for understanding biased training of AI models.

After the first round of voting on 23rd Apr'26 and before the second round, the general mood among the public and all political parties was that the BJP would win. After the second round on 29th Apr'26 that mood dampened to some extent, and people were curious about what would happen. On 26th Apr'26, between the two rounds, when the BJP's chances were perceived to be high, we asked both ChatGPT and Perplexity who was likely to win, based on the information available on the internet and the AI engine's own analysis. Both AI models reported that the TMC would win and gave the reasons you see below (we published these predictions on Facebook on 26th Apr'26 itself).

Interestingly, after the results on 4th May'26, we asked ChatGPT why its prediction had failed, and it gave the reasons detailed below.

But here comes the interesting part: bias in training. When the Bengal elections were held in 2021, the BJP was also putting up a good fight and the TMC's win was also in doubt. In the end the TMC won, and all those interactions and data points are available on the internet, which is what both AI models were trained on. There was no dataset available to teach the models that the steps taken this time by the BJP, the Central Government, the Election Commission and others could bring about a lot of change, or how those steps could affect people's behavior and hence their thought processes. This was the first time such steps were taken, and the results were unknown. Hence the models were in effect trained that, whatever you do, the final winner will be the TMC; and if you look at the reasons they gave, those are the reasons for which the TMC won in 2021.

The bias in training led to wrong predictions. The same can happen when you train AI models specific to your own applications. The question is: should you believe the predictions of AI models? The short answer is to take them with a pinch of salt unless you are confident about the training behind them. A model can probably answer well in known scenarios, but it may fail in new ones. So what you include in your training dataset matters if your AI models are to predict correctly.
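To make the idea concrete, here is a minimal Python sketch. It is only an illustration and not how ChatGPT or Perplexity are actually built or trained. It assumes a toy frequency-based predictor; the function names `train`, `predict` and `is_covered`, and all scenario labels, are hypothetical. Every record in the toy history ends with the same outcome, so the model repeats that outcome even for a scenario it has never seen.

```python
from collections import Counter, defaultdict


def train(records):
    """Learn the most frequent outcome per scenario, plus an overall fallback."""
    per_scenario = defaultdict(Counter)
    overall = Counter()
    for scenario, outcome in records:
        per_scenario[scenario][outcome] += 1
        overall[outcome] += 1
    table = {s: counts.most_common(1)[0][0] for s, counts in per_scenario.items()}
    fallback = overall.most_common(1)[0][0]
    return table, fallback


def is_covered(model, scenario):
    """True only if this scenario appeared in the training records."""
    table, _ = model
    return scenario in table


def predict(model, scenario):
    """Return the learned outcome, or the overall majority for unseen scenarios."""
    table, fallback = model
    return table.get(scenario, fallback)


# Hypothetical history: every past scenario ended the same way.
history = [
    ("close_contest", "incumbent_wins"),
    ("close_contest", "incumbent_wins"),
    ("high_turnout", "incumbent_wins"),
    ("strong_challenger", "incumbent_wins"),
]
model = train(history)

for scenario in ["strong_challenger", "never_seen_intervention"]:
    print(scenario, "->", predict(model, scenario),
          "| seen in training:", is_covered(model, scenario))
```

The prediction for the unseen scenario is produced with the same apparent confidence as the others. The only warning sign is the coverage check, which is why knowing what went into the training data matters as much as the prediction itself.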

For your reference, I am sharing the questions asked to the models and their responses:

Let's see what two popular AI engines (ChatGPT and Perplexity) say about the West Bengal elections:

Question: Who will win west bengal elections. Give your own analysis and do not go by what others are saying over internet. Collect data, analyse sentiments of people and suggest the output.

ChatGPT Response: I’ll give you a ground-up analytical view, not a copy-paste of headlines. I’ll still anchor it with verified data, then layer political behavior + voter psychology + structural factors on top.

---

1. Hard Data Snapshot (What the numbers actually say)

Most credible polls show:

TMC vote share: ~43–47%

BJP vote share: ~39–42%

Gap: ~2–5%

Seat projections vary:

TMC: ~150–180 seats

BJP: ~110–140 seats

Leadership preference:

Mamata Banerjee ~48%

Next best (BJP): ~33%

Translation:

This is not a wave election. It’s a competitive but tilted contest.

---

2. Real Ground Signals (More important than polls)

A. Extremely high voter turnout

~92–93% turnout — historically massive

Interpretation (critical):

High turnout in Bengal typically = mobilization by dominant local structure (TMC)

But also signals anti-incumbency activation

Net effect: cancels out partially → favors incumbent slightly

---

B. Direct bipolar contest

Left + Congress are largely irrelevant now

Election = TMC vs BJP straight fight

This benefits BJP structurally (consolidation),

but also helps TMC retain anti-BJP vote fully

---

C. Campaign intensity & polarization

Heavy clashes, high rhetoric, central vs state narrative

Religion + identity politics rising

This helps BJP in specific belts, but not statewide dominance.

---

D. Women voter tilt (very underrated factor)

Women slightly favor Mamata Banerjee

In Bengal, this is HUGE because:

Welfare schemes (Lakshmi Bhandar, etc.)

Direct benefit transfer loyalty

---

E. Migrant + emotional voting behavior

Migrants returning to vote, emotionally invested

Bengal voting is identity + emotional, not purely economic.

---

3. Regional Dynamics (This decides the election)

Where BJP is strong:

North Bengal

Jangalmahal

Urban Hindu belt

Where TMC dominates:

South Bengal

Rural + women voters

Minority-heavy constituencies

Key insight: BJP growth is real but geographically concentrated

---

4. Sentiment Layer (What people actually feel)

From aggregation of:

voter interviews

turnout behavior

campaign narratives

Pro-TMC sentiment:

Stability

Welfare dependency

“Better than BJP alternative”

Strong local network

Pro-BJP sentiment:

Change / anti-incumbency

Law & order concerns

National alignment with central govt

Crucial insight: This is not pro-TMC enthusiasm,

it’s “TMC is still safer choice” sentiment

---

5. Structural Reality (Most people miss this)

TMC advantages:

Deep booth-level network (cadre strength)

Welfare delivery penetration

Minority vote consolidation (~25–30%)

BJP advantages:

Strong central leadership

Growing vote share

Consolidated opposition space

But: BJP still lacks last-mile rural machinery compared to TMC.

---

6. My Independent Prediction (Based on synthesis)

Most probable outcome:

TMC wins — but with reduced margin

Seats: ~150–170

BJP: ~120–140

Others: negligible

---

7. What could flip the result (low probability but possible)

BJP can win only if ALL happen simultaneously:

1. Massive swing in women voters

2. Unexpected rural erosion for TMC

3. Higher-than-expected consolidation of Hindu vote

Right now, data doesn’t support all three happening together.

---

Final Verdict (Clear, no hedging)

Winner: TMC

Nature of win: Narrower, more competitive than 2021

Big story: BJP becomes even stronger opposition

Question (as asked to Perplexity): Who will win west bengal elections. Give your own analysis and do not go by what others are saying over internet. Collect data, analyse sentiments of people and suggest the output.

Perplexity Response:

West Bengal's 2026 assembly elections, held in two phases on April 23 and 29 with counting on May 4, pit incumbent Trinamool Congress (TMC/AITC) against a surging BJP in a bipolar contest.[1][2]

## Key Data Points

Polls show mixed projections: POLIQ forecasts TMC at 205-225 seats (majority 148 needed), BJP 52-80; earlier surveys like IANS-Matrieze gave TMC 155-170, BJP 100-115.[2][1] Prediction market Polymarket has BJP at 52% chance to win most seats, TMC/AITC at 45%, with $4M volume reflecting bettor bets on high Phase 1 turnout (91-93%) signaling anti-incumbency.[1][2] Voter rolls shrank by 90 lakh via Special Intensive Revision, sparking TMC bias claims but boosting credibility claims.[3]

## Sentiment Analysis

Rural voters back TMC's welfare (Lakshmir Bhandar cash to women, health schemes) at 52% CM approval, per ground surveys; urban/semi-urban shift to BJP on jobs, corruption, law-order.[1][4][5] Social media amplifies TMC's "Mache Bhate Bangali" cultural pride vs. BJP "outsider"; BJP counters on Hindu sentiments, violence memories.[6][7] Youth unemployment, no anti-incumbency wave; women/first-timers key swing bloc.[8][9]

## My Analysis

High turnout favors opposition in tight races, per historical patterns, tilting toward BJP gains in Jangalmahal, Matua, North Bengal (e.g., BJP projected wins in 50+ seats per POLIQ).[1] TMC's machinery holds south/rural core, but fragmented opposition and welfare loyalty likely secure majority despite urban erosion—past polls underestimated TMC (2021: 215 seats vs. predicted close fight).[10][1] Betting markets aggregate real-money sentiment better than polls, but incumbency edge persists.

## Prediction

TMC retains power with 160-190 seats; BJP strongest opposition at 90-120. No landslide either way.[1][1][2]

When the predictions turned out to be wrong after the results were declared on 4th May'26, the following question was put to ChatGPT; its response is below:

Question: why your prediction failed?

ChatGPT Response: Fair question. If the actual result diverged from my call, the issue isn’t “one wrong number”; it’s usually a model miss on a few high-impact variables. Here’s a clean post-mortem.

1) I overweighted “structural advantage,” underweighted late swing

I leaned on:

incumbent welfare penetration
booth-level organization
historically sticky vote blocs

What I under-modeled:

last 10–14 day momentum shifts (candidate switches, local alliances, narrative shocks). Indian state elections often see compressed late swings that don’t show up in earlier aggregates.

2) Turnout ≠ neutral (I treated it too symmetrically)

I assumed high turnout cancels out (incumbent mobilization vs anti-incumbency).

What likely happened:

Turnout surge was asymmetric (e.g., new/returning voters skewing one side, or specific regions over-performing).
Micro-level turnout (booth/cluster) matters more than state averages.

3) Women vote / welfare loyalty was misestimated

I treated welfare-linked voting as relatively stable.

Possible miss:

Either leakage (erosion due to inflation, delivery gaps, fatigue), or
Counter-mobilization (opposition messaging neutralizing welfare advantage).

These shifts are subtle and often undercaptured in surveys.

4) Geographic concentration effect

I acknowledged regional strengths but likely underestimated seat conversion efficiency:

A party can win fewer votes statewide but win more seats if its vote is efficiently distributed.
Small swings in swing districts (not statewide share) flip dozens of seats.

5) Alliance arithmetic / vote transfer

Even “weak” parties can matter if:

their voters transfer cleanly to a larger partner, or
they split the opponent’s base unevenly.

I assumed low relevance; in reality, transfer rates can be decisive.

6) Candidate-level factors (micro > macro)

Statewide models miss:

local anti-incumbency against specific MLAs
candidate credibility, defections, factional fights

A few dozen such seats can swing the majority.

7) Polarization intensity & consolidation

I modeled polarization but not the final-stage consolidation:

When narratives harden late, undecided voters break decisively one way.
That creates a non-linear jump not visible in earlier sentiment.

8) Survey/systematic biases

Even without “copying polls,” any synthesis is affected by:

non-response bias (certain voters don’t answer)
shy voter effect
urban over-sampling vs rural reality

These typically underestimate surge voters.

9) What I’d change in the model

If I were to rebuild the prediction:

Shift from state-level averages → constituency-level simulation
Track rolling 7-day momentum, not static sentiment
Weight turnout changes by region, not overall turnout
Add a vote-transfer coefficient for alliances
Penalize incumbents at candidate level, not just party level
Include a late-breaker factor (historically 3–6% swing in Indian state polls)

Bottom line

The miss likely came from underestimating dynamic, last-mile electoral behavior—especially:

late swings
turnout asymmetry
seat-level conversion

Elections in states like West Bengal are less about static sentiment and more about who controls the final 2 weeks + booth math.
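As an aside from the author (not part of the ChatGPT response): the seat-conversion point in item 4 of the post-mortem above can be checked with simple arithmetic. The sketch below uses made-up vote counts for five constituencies and two generic parties, A and B; the function name `tally` and all numbers are hypothetical and are not election data.

```python
def tally(constituencies):
    """Count total votes and seats won for two parties, A and B."""
    votes_a = sum(a for a, _ in constituencies)
    votes_b = sum(b for _, b in constituencies)
    seats_a = sum(1 for a, b in constituencies if a > b)
    seats_b = sum(1 for a, b in constituencies if b > a)
    return {"votes": (votes_a, votes_b), "seats": (seats_a, seats_b)}


# Hypothetical vote counts per constituency: (party A, party B).
# A wins three seats narrowly; B wins two seats by huge margins.
toy_state = [(51, 49), (52, 48), (53, 47), (20, 80), (15, 85)]

result = tally(toy_state)
print(result)
```

In this toy state, party A takes 3 of the 5 seats with only 191 of the 500 votes, because its votes are spread efficiently while party B's are piled up in two constituencies. A model that looks only at statewide vote share would miss this.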
