Certified AI & Engineering Partner · Est. 2012
Hugging Face
Kaggle
CatBoost
MIT Licensed
v2026.04.1
One Model, Open-Sourced. The Architecture Powering ML for 6+ Industries.
Published openly on Hugging Face and Kaggle — a 48-signal CatBoost predictor for Spanish La Liga. The same engineering powers production ML for our fintech, healthcare, eCommerce, SaaS, logistics, and EdTech clients. Read the model. Audit the code. Then let’s discuss your prediction problem.
Model Type
CatBoost · Tabular
Public Version
2026.04.1
License
MIT · Open
Published On
Hugging Face · Kaggle
Audited By the ML Community
Published on the two leading open ML platforms. Inspectable, downloadable, and versioned.
Hugging Face
addweb-solution / la-liga-score-predictor
The home of open ML. Model card, evaluation summary, training artifacts, SHA256 verification, and full Python package distribution.
Tabular-Classification · Catboost · La-Liga · Machine-Learning

Kaggle
addwebsolutionpvtltd / la-liga-score-predictor
The home of competitive data science. Public dataset distribution, notebook integration, and discoverability across millions of data scientists worldwide.
Sports · Prediction · Football · Dataset
THE WHY
Most Agencies Talk AI. We Ship It.
Every offshore development agency claims AI capability. Few have shipped a trained model anyone can audit. We built and open-sourced this La Liga Score Predictor for one reason: to make our ML capability verifiable, not just visible.
If a CTO wants to know whether AddWeb really does ML, they don’t have to take our word for it. They read the code, audit the model card, run the smoke test, inspect the SHA256 artifacts.
This is one model. We have built dozens. The rest live inside client production systems under NDA. This one is yours to inspect, fork, learn from, and use as proof of what we can build for you.
Inference Pipeline Architecture
Historical Match CSV
scores · Elo · tactics · player aggregates
Feature Builder
rolling form · 35-column normalization · fallback defaults
48-Signal Feature Vector
team form · Elo deltas · player minutes · tactic stability
CatBoost Ensemble
home_goals · away_goals · outcome models
Calibration Layer
temperature scaling · abstain logic · score range
Predicted Score + Probabilities + Confidence
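Read as code, that flow is short. A minimal sketch, assuming a hypothetical artifacts bundle holding the feature builder, the three CatBoost models, and the calibration step; none of these names are the package's real API:

```python
# Illustrative end-to-end flow for the stages above. Names are placeholders,
# not the published package's API; the real wiring lives inside the package.
import pandas as pd

def predict_fixture(history_csv: str, home: str, away: str, artifacts) -> dict:
    history = pd.read_csv(history_csv)                       # Historical Match CSV
    x = artifacts.feature_builder(history, home, away)       # 48-signal feature vector
    home_goals = artifacts.home_goals_model.predict(x)       # CatBoost ensemble:
    away_goals = artifacts.away_goals_model.predict(x)       #   three independent models
    outcome_probs = artifacts.outcome_model.predict_proba(x)
    return artifacts.calibrate(home_goals, away_goals, outcome_probs)  # temperature scaling, abstain, score range
```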
ADDWEB ML PRACTICE
Open-Source Is the Sample. Here’s the Practice Behind It.
ENGINEERING DEEP DIVE
What’s Actually Inside
Six engineering decisions that separate this from a wrapper around a public API. Each one matters when shipping ML to production.
01
Three-Model Ensemble
Predicting a football score is not one prediction. It’s three: how many goals the home side scores, how many the away side scores, and the outcome direction. We trained three independent CatBoost models and reconcile them through a calibrated decoder.
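To make the reconciliation concrete, here is a toy decoder with invented rules. It only illustrates the idea of forcing the rounded goal predictions to agree with the outcome model; the shipped decoder's actual rules are documented in the model card:

```python
import numpy as np

def reconcile(home_goals: float, away_goals: float, outcome_probs: np.ndarray) -> tuple[int, int]:
    """Illustrative decoder: round the goal regressions, then nudge the scoreline
    so it agrees with the most probable outcome (home / draw / away)."""
    h, a = round(home_goals), round(away_goals)
    outcome = int(np.argmax(outcome_probs))      # 0 = home win, 1 = draw, 2 = away win
    if outcome == 1 and h != a:
        h = a = round((home_goals + away_goals) / 2)
    elif outcome == 0 and h <= a:
        h = a + 1
    elif outcome == 2 and a <= h:
        a = h + 1
    return h, a

print(reconcile(1.6, 1.4, np.array([0.34, 0.38, 0.28])))   # draw enforced -> (2, 2)
```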
02
Calibrated Probabilities
Raw model probabilities lie. They overstate confidence. We apply temperature scaling to produce probabilities that reflect actual outcome frequencies — meaning a 60% confidence prediction is right roughly 60% of the time.
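Temperature scaling itself is a one-parameter transform fit on held-out data. A generic sketch, with an invented temperature value for illustration:

```python
import numpy as np

def temperature_scale(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Divide raw logits by a temperature fit on held-out data, then softmax.
    T > 1 softens overconfident probabilities; T = 1 leaves them untouched."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Raw logits for (home win, draw, away win); the temperature here is invented.
raw = np.array([2.1, 0.3, -0.8])
print(temperature_scale(raw, 1.0))   # uncalibrated softmax, overconfident
print(temperature_scale(raw, 1.8))   # calibrated: confidence pulled toward observed frequencies
```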
03
Abstain Logic
Not every match is predictable. Tightly contested derbies, fragile tactical matchups, and historically volatile pairings produce ambiguous probabilities. The model knows when to flag a fixture as fragile rather than overcommit.
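In code, abstention is a thresholding decision over the calibrated probabilities. A minimal sketch with invented thresholds; the shipped fragility rules are richer than this:

```python
def should_abstain(probs: dict[str, float],
                   min_top: float = 0.45,
                   min_margin: float = 0.10) -> bool:
    """Flag a fixture as fragile when no outcome is clearly favoured.
    Thresholds here are illustrative, not the published model's values."""
    ranked = sorted(probs.values(), reverse=True)
    return ranked[0] < min_top or (ranked[0] - ranked[1]) < min_margin

print(should_abstain({"home": 0.38, "draw": 0.33, "away": 0.29}))  # True  -> return a score range
print(should_abstain({"home": 0.62, "draw": 0.22, "away": 0.16}))  # False -> commit to a scoreline
```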
04
Resilient Feature Builder
Real-world data is messy. The wrapper handles thin CSVs, missing optional columns, inconsistent team names, and partial player aggregates. Quality scales with data richness — but the package never breaks because of imperfect input.
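The pattern is simple: every optional column gets a safe default, every identifier gets normalized. A sketch with hypothetical column names, not the package's actual schema:

```python
import pandas as pd

# Illustrative schema; the real package ships its own column list and defaults.
OPTIONAL_DEFAULTS = {"injuries_last5": 0, "suspensions_last5": 0, "tactic_id": -1}

def normalize_input(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing optional columns and tidy team names so a thin CSV still predicts."""
    out = df.copy()
    for col, default in OPTIONAL_DEFAULTS.items():
        if col not in out.columns:
            out[col] = default                                        # fallback default, never a crash
    for col in ("home_team", "away_team"):
        if col in out.columns:
            out[col] = out[col].astype(str).str.strip().str.lower()   # consistent team names
    return out
```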
05
Two-Tier Public API
Different consumers need different shapes. Front-end product cards need the simple response. Power users need the full debug payload. Both interfaces sit on the same model artifact.
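The two tiers expose the same prediction at different depths. A sketch with hypothetical field names that mirrors the idea rather than the package's actual classes:

```python
from dataclasses import dataclass, asdict

# Hypothetical response shapes; the published package defines its own classes.
@dataclass
class SimplePrediction:
    home_goals: int
    away_goals: int
    confidence: str                    # "high" / "medium" / "low"

@dataclass
class DetailedPrediction(SimplePrediction):
    probabilities: dict                # calibrated home / draw / away
    abstain: bool
    diagnostics: dict                  # decoder rule triggers, xG delta, ...

def to_card(p: DetailedPrediction) -> dict:
    """Front-end product cards only need the simple shape; both tiers come from one artifact."""
    return asdict(SimplePrediction(p.home_goals, p.away_goals, p.confidence))
```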
06
Production-Grade Packaging
This is not a notebook dumped on the internet. It is a real Python package — pyproject.toml, semantic versioning, SHA256-verified artifacts, model card, evaluation summary, FAQ, smoke test, batch inference, CLI demo, and notebook starter all included.
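The SHA256 verification needs nothing beyond the Python standard library. The artifact file name below is a placeholder; the real names and digests are listed in the model card:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a downloaded artifact and return its SHA256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the digest listed in the model card before loading anything.
print(sha256_of("model_home_goals.cbm"))
```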
FEATURE ENGINEERING
The 48 Signals
No magic. Just disciplined feature engineering. Every signal earns its place in the model.
12
Form & Goals
Last-5 and last-10 average goals scored, conceded, and goal differentials — for both home and away contexts
06
Win Rates
Home, away, and overall win rates plus draw rates over the prior 10 matches
04
Elo Strength
Pre-match Elo ratings and Elo differential — capturing strength asymmetry between sides
02
Rest Days
Days of recovery since each team’s last competitive fixture
14
Player Aggregates
Minutes, goals, assists, cards, starters, used players, injuries, suspensions over prior 5 matches
06
Tactic & Coach IDs
Tactic identifier, coach identifier, and tactic-stability indicators per side
02
Team IDs
Stable identifiers for each side, enabling team-specific embeddings and patterns
02
Matchup Codes
Tactic-matchup composite codes capturing how specific tactical pairings historically resolve
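Grouped by the categories above, the counts reconcile to exactly 48. Only the group sizes below come from this page; the individual column names belong to the published feature builder:

```python
# Signal groups and their sizes, as described above; column names are omitted on purpose.
SIGNAL_GROUPS = {
    "form_and_goals": 12,      # last-5 / last-10 goals for and against, differentials
    "win_rates": 6,            # home, away, overall win rates plus draw rates
    "elo": 4,                  # pre-match Elo per side plus differential
    "rest_days": 2,
    "player_aggregates": 14,   # minutes, goals, assists, cards, injuries, suspensions
    "tactic_and_coach": 6,
    "team_ids": 2,
    "matchup_codes": 2,
}
assert sum(SIGNAL_GROUPS.values()) == 48
```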
TRY IT YOURSELF
From Install to Prediction in Five Lines
No API keys. No paywall. No registration. Open-source, MIT licensed, ready to fork.
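Roughly what that looks like in practice. The package name follows the repository slug above, but the import path and the predict_match call are assumptions; the model card on Hugging Face documents the exact entry point:

```python
# Shell: pip install la-liga-score-predictor      (package name assumed from the repo slug)
# Python (import path and function name below are assumptions; see the model card):
from la_liga_score_predictor import predict_match

result = predict_match(home="Real Madrid", away="Sevilla", history_csv="matches.csv")
print(result["predicted_score"], result["probabilities"], result["confidence"])
```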

CAPABILITIES
What the Model Returns
Predicted Score
Final exact scoreline chosen by the model — both home and away goal counts.
Calibrated Probabilities
Home win, draw, and away win probabilities — temperature-scaled for accuracy.
Confidence Level
Simple high / medium / low label with underlying confidence score and margin.
Abstain Signal
Boolean flag when the model believes the fixture is too fragile to commit to an exact score.
Score Range
Optional home / away score band returned for fragile fixtures instead of a single scoreline.
Decoder Diagnostics
Specialist rule triggers, outcome enforcement flags, and xG delta for advanced users.
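Put together, a single prediction payload looks something like the sketch below. Field names and values are illustrative only; the authoritative schema lives in the model card and package docs:

```python
# Hypothetical payload mirroring the six return fields described above.
example_prediction = {
    "predicted_score": {"home": 2, "away": 1},
    "probabilities": {"home_win": 0.51, "draw": 0.27, "away_win": 0.22},  # temperature-scaled
    "confidence": {"level": "medium", "score": 0.51, "margin": 0.24},
    "abstain": False,                    # True when the fixture is too fragile to call exactly
    "score_range": None,                 # e.g. {"home": [1, 2], "away": [0, 1]} when abstaining
    "diagnostics": {"rules_triggered": [], "outcome_enforced": False, "xg_delta": 0.4},
}
```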
OUR STANCE
Most agencies sell you a service. We sell you an outcome — then we don’t sleep until we deliver it.
We are not a body shop. We are not a feature factory. When we ship ML, we ship the model card, the evaluation summary, the SHA256 artifacts, and the responsible-use boundaries. Engineering integrity is not optional.
FOR YOUR INDUSTRY
Beyond Football — Custom ML We Build for Clients
The same 48-signal architecture you saw above generalizes to any data-rich domain. Six prediction problems we ship into production for clients today — and the value each one delivers.
FinTech
Credit Risk Scoring
Calibrated default-probability models for loan origination, underwriting decisions, and portfolio risk monitoring.
INPUTS
Transaction history · payment patterns · demographic signals · macro indicators · alternative data
OUTPUTS
Calibrated default probability · risk band · abstain flag for thin-file applicants
CLIENT VALUE
Lower default rates while expanding approval coverage. Better risk-adjusted returns.
Healthcare
Patient Risk Prediction
HIPAA-aware risk models for readmission, deterioration detection, and treatment response — built with proper PHI handling.
INPUTS
Clinical signals · lab trajectories · vital trends · intervention history · medication adherence
OUTPUTS
30-day readmission risk · deterioration signal · treatment response probability · BAA available
CLIENT VALUE
Earlier intervention. Better patient outcomes. Optimized resource allocation across care teams.
eCommerce
Demand Forecasting
SKU-level demand prediction with calibrated confidence intervals — for inventory planning, promotional sizing, and stockout prevention.
INPUTS
SKU sales velocity · price elasticity · seasonal patterns · promo calendar · external signals
OUTPUTS
Demand quantity intervals · stockout probability · replenishment recommendations · cold-start abstain
CLIENT VALUE
Reduced stockouts. Lower carrying costs. Higher fulfillment rates and margin.
SaaS
Churn Prediction
User-level churn probability across 30, 60, and 90-day windows — with intervention recommendations and abstain logic for cold-start accounts.
INPUTS
Engagement signals · feature-usage trajectories · support interaction patterns · billing history
OUTPUTS
Churn probability per window · intervention recommendation · cohort segmentation · drift signals
CLIENT VALUE
Targeted retention spend. Higher net revenue retention. Faster CAC payback.
Logistics
ETA & Route Modeling
Calibrated arrival-time distributions for last-mile delivery, freight scheduling, and fleet optimization — with confidence bands per delivery.
INPUTS
Historical traffic · driver patterns · weather signals · vehicle telemetry · route geometry
OUTPUTS
Calibrated ETA distributions · delay probability · route optimization · driver-level signals
CLIENT VALUE
Improved on-time delivery. Lower fuel costs. Higher customer satisfaction scores.
EdTech
Learner Outcome Prediction
Course-level completion probability and at-risk classification — driving personalized intervention and content recommendations.
INPUTS
Engagement signals · assessment trajectories · content consumption · interaction patterns
OUTPUTS
Course completion probability · at-risk classification · content recommendations · cohort analytics
CLIENT VALUE
Higher completion rates. Lower dropout. Personalized learning paths at scale.
Don’t see your industry?
If your business has historical data and a prediction problem, the same architecture probably applies.
THE PROCESS
How AddWeb ML Engagements Work
Six phases. Each with defined deliverables, sign-off gates, and clear go/no-go decisions. No black boxes, no perpetual scope creep.
Week 1-2
01
Discovery & Data Audit
Feasibility assessment, data readiness review, problem framing. Output: a calibrated go/no-go recommendation. If ML isn’t the right tool, we tell you.
Week 2-4
02
Feature Engineering
Design the signal architecture. Build the feature builder. Establish baseline performance. This is where 80% of model quality is determined.
Week 4-8
03
Model Training & Calibration
Train candidate models. Apply temperature scaling. Validate on held-out data. Engineer abstain logic. Document evaluation summary with calibrated metrics.
Week 8-10
04
Deployment Integration
Plug into your stack — REST API, batch pipeline, or embedded inference. Instrument tracking. Set up monitoring. Production-ready hand-off.
Ongoing
05
Monitoring & Drift Detection
Track real-world accuracy. Detect concept drift. Schedule retraining. Maintain model card. The work isn’t done at deployment — it’s done when the model still works two years in.
Continuous
06
Knowledge Transfer
Documentation, runbooks, training data pipelines, model card, evaluation summary. Your team can maintain this. We make leaving easy — that’s why nobody does.
How You Hire Us
Custom ML Engagement Models
Three ways to start, ranked by commitment level. Pick the one that matches your stage and risk tolerance.
ML Discovery Sprint
“Should we build this?”
2-week fixed-scope feasibility audit. Best for teams unsure whether ML is the right tool — or which prediction problem to solve first.
From $15K
2 weeks · fixed price
Custom ML Build
“One prediction problem, solved end to end.”
Fixed-scope build from data audit to deployed production model: feature engineering, training, calibration, and integration into your stack.
From $40K
8-16 weeks · scoped in Discovery
ML Practice Embed
“Ongoing ML capability.”
Monthly retainer. Senior ML engineers embedded with your team. Ideal for businesses building ML as a core capability across multiple use cases.
$20K – $40K/mo
Monthly · 3-month minimum
Pricing ranges shown for transparency. Exact rates depend on data complexity, integration scope, and team composition. Discussed in your free Discovery call.
AddWeb vs. In-House ML vs. AutoML Platforms
Three paths to a production ML model. Each has trade-offs. Here’s the honest breakdown.
Factor
AddWeb ML Practice
Build In-House Team
AutoML Platforms
Low-Risk ML
How We Make ML Engagements Low-Risk
ML projects fail more often than they succeed. We’ve engineered five concrete safeguards to flip those odds.
01
Discovery Sprint Truth
If our 2-week Discovery determines ML isn’t the right tool, we tell you. You pay only for the Discovery — never for the wrong build.
02
NDA-First Engagement
Mutual NDA before any data review. Every engineer signs an individual NDA. ISO 27001 certified for information security.
03
Calibrated Honesty
We share confidence intervals on our own delivery estimates. No vendor over-promising. We tell you the realistic accuracy ceiling before training begins.
04
Full IP & Model Transfer
Source code · trained models · training pipelines · model card · eval summary · runbooks. All yours on completion. No retention.
05
No Lock-In Exit
Documentation complete enough that another team could maintain it. We make leaving easy — that’s why nobody does.
AddWeb AI Suite
This Is One Model. We Run an AI Practice.
Three owned AI products. Four years of production AI work. The La Liga Score Predictor is a sample — not the ceiling.
Beyond what’s open-sourced, AddWeb operates a full AI Solutions practice covering Generative AI & LLM development, AI Agents & Automation, Custom ML, AI for eCommerce, AI Voice Agents, and Computer Vision. Three of our AI products are in production today:
AddWeb AI
Customizable AI Platform for Business Workflows
EcomSupport360
AI-Powered eCommerce Automation
WeWP
AI-Driven WordPress Hosting
Six Commitments
How We Engineer ML
AI-native since 2022 — embedded into your stack, not bolted on top.
4.2 year average relationship. 98% retention. We grow with you.
Model cards. Evaluation summaries. SHA256 artifacts. No black boxes.
Code reviewed, architecture justified, models calibrated and abstain-aware.
Open-source proof on Hugging Face + Kaggle. ISO 9001 & 27001 certified.
13+ years shipping. 1000+ projects. Real ML in production today.

Book Your Call with a
Full Stack Expert

Ravi Maniyar
Director – Full Stack Development
Ravi brings 13+ years of experience in JavaScript, TypeScript, and React Native, building high-performance web and mobile applications with scalable, clean architecture.
Free Resource
The 27-Point ML Project Audit Checklist
Before you commit budget to a custom ML project, audit your readiness across 27 critical factors — data quality, problem framing, deployment fit, ROI viability, team capability. Used by senior CTOs to de-risk ML investments.
FAQs
Questions Engineers & Buyers Ask
What is the La Liga Score Predictor?
It’s an open-source machine learning model built by AddWeb Solution that predicts pre-match outcomes for Spanish La Liga fixtures. The 48-signal CatBoost architecture behind it is the same approach we deploy for production ML across fintech, healthcare, eCommerce, SaaS, logistics, and EdTech clients.
Why did AddWeb open-source a football model?
To make our ML capability verifiable, not just visible. Anyone evaluating AddWeb for AI work can audit the actual code, architecture, documentation, and engineering rigor. It is shipped under MIT license on both Hugging Face and Kaggle with full model card, evaluation summary, and SHA256-verified artifacts.
Can the same architecture work for my industry?
Yes. The 48-signal feature engineering, ensemble classification, calibrated probabilities, and abstain logic translate directly to fintech credit scoring, healthcare risk prediction, eCommerce demand forecasting, SaaS churn modeling, logistics ETA, and EdTech outcome prediction. We engineer custom variants per client engagement.
How long does a custom ML project take?
A 2-week Discovery Sprint validates feasibility. A full Custom ML Build typically runs 8 to 16 weeks from data audit to deployed production model — depending on data complexity, integration scope, and required model variants. Discovery outputs a calibrated timeline before commitment.
How much does a custom ML engagement cost?
Discovery Sprints start at $15,000. Custom ML Builds typically range $40,000 to $150,000 depending on scope. ML Practice Embeds (ongoing retainer) start at $20,000 per month. Exact pricing depends on data complexity, deployment requirements, and timeline. Discussed in your free Discovery call.
What if we don’t have enough data?
Our Discovery Sprint includes a data readiness audit. If your data isn’t sufficient for the prediction problem, we tell you directly — and recommend either data collection strategies, simpler heuristic alternatives, or pausing the project. We don’t build models on inadequate data.
Do you sign NDAs?
Yes. NDA-first is our standard engagement model. Mutual NDA before any data review or technical conversation. Every engineer assigned to your engagement signs an individual NDA. We hold ISO 27001 certification for information security.
What happens when the model’s accuracy drifts after launch?
This is called concept drift, and it’s why we include monitoring and retraining in every production engagement. We instrument drift detection, set retraining schedules, and provide runbooks for ongoing maintenance. The Discovery Sprint defines the monitoring strategy upfront.
Do we own the model and the IP at the end?
Yes — fully. On final payment, all source code, trained model artifacts, training pipelines, model card, evaluation summary, and runbooks transfer to you. No retention, no lock-in. Our exit clause makes the handover comprehensive.
How is this different from an AutoML platform?
AutoML platforms automate model training but fall short on industry-specific feature engineering, calibration rigor, abstain logic, and production deployment integration. We do that work by hand with senior ML engineers — meaning better calibration, deeper feature engineering, custom integration with your stack, and an ongoing maintenance partnership.
What technology stack is the model built on?
CatBoost gradient boosting for the three trained artifacts. Python package following the pyproject.toml standard. Calibrated probabilities via temperature scaling. MIT license. Semantic versioning (current: 2026.04.1). For client engagements we adapt the stack to the use case — XGBoost, LightGBM, scikit-learn, PyTorch, Hugging Face Transformers as appropriate.
Can we talk to an ML engineer before committing?
Yes. Schedule a 30-minute Discovery Call with our ML practice. We’ll assess your data, identify the highest-leverage prediction problem, and tell you honestly whether ML is the right tool. If we determine it’s not, we’ll recommend simpler alternatives. No sales pitch.
Your Move
Three Ways to Start. Pick Your Commitment Level.
From a 30-minute Discovery Call to a free audit checklist, every entry point is designed to give you something useful before you commit a dollar.
We respond within 4 business hours · NDA available on request · ISO 27001 certified
