Certified AI & Engineering Partner · Est. 2012
Hugging Face
Kaggle
CatBoost
MIT Licensed
v2026.04.1
One Model, Open-Sourced. The Architecture Powering ML for 6+ Industries.
Published openly on Hugging Face and Kaggle — a 48-signal CatBoost predictor for Spanish La Liga. The same engineering powers production ML for our fintech, healthcare, eCommerce, SaaS, logistics, and EdTech clients. Read the model. Audit the code. Then let’s discuss your prediction problem.
Model Type
CatBoost · Tabular
Public Version
2026.04.1
License
MIT · Open
Published On
Hugging Face · Kaggle
Audited By the ML Community
Published on the two leading open ML platforms. Inspectable, downloadable, and versioned.
Hugging Face
addweb-solution / la-liga-score-predictor
The home of open ML. Model card, evaluation summary, training artifacts, SHA256 verification, and full Python package distribution.
Tabular-Classification · Catboost · La-Liga · Machine-Learning

Kaggle
addwebsolutionpvtltd / la-liga-score-predictor
The home of competitive data science. Public dataset distribution, notebook integration, and discoverability across millions of data scientists worldwide.
Sports · Prediction · Football · Dataset
THE WHY
Most Agencies Talk AI. We Ship It.
Every offshore development agency claims AI capability. Few have shipped a trained model anyone can audit. We built and open-sourced this La Liga Score Predictor for one reason: to make our ML capability verifiable, not just visible.
If a CTO wants to know whether AddWeb really does ML, they don’t have to take our word for it. They read the code, audit the model card, run the smoke test, inspect the SHA256 artifacts.
This is one model. We have built dozens. The rest live inside client production systems under NDA. This one is yours to inspect, fork, learn from, and use as proof of what we can build for you.
Inference Pipeline Architecture
Historical Match CSV
scores · Elo · tactics · player aggregates
Feature Builder
rolling form · 35-column normalization · fallback defaults
48-Signal Feature Vector
team form · Elo deltas · player minutes · tactic stability
CatBoost Ensemble
home_goals · away_goals · outcome models
Calibration Layer
temperature scaling · abstain logic · score range
Predicted Score + Probabilities + Confidence
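Read as code, that flow is short. A minimal sketch, assuming a hypothetical artifacts bundle holding the feature builder, the three CatBoost models, and the calibration step; none of these names are the package's real API:

```python
# Illustrative end-to-end flow for the stages above. Names are placeholders,
# not the published package's API; the real wiring lives inside the package.
import pandas as pd

def predict_fixture(history_csv: str, home: str, away: str, artifacts) -> dict:
    history = pd.read_csv(history_csv)                       # Historical Match CSV
    x = artifacts.feature_builder(history, home, away)       # 48-signal feature vector
    home_goals = artifacts.home_goals_model.predict(x)       # CatBoost ensemble:
    away_goals = artifacts.away_goals_model.predict(x)       #   three independent models
    outcome_probs = artifacts.outcome_model.predict_proba(x)
    return artifacts.calibrate(home_goals, away_goals, outcome_probs)  # temperature scaling, abstain, score range
```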
ADDWEB ML PRACTICE
Open-Source Is the Sample. Here’s the Practice Behind It.
ENGINEERING DEEP DIVE
What’s Actually Inside
Six engineering decisions that separate this from a wrapper around a public API. Each one matters when shipping ML to production.
01
Three-Model Ensemble
Predicting a football score is not one prediction. It’s three: how many goals the home side scores, how many the away side scores, and the outcome direction. We trained three independent CatBoost models and reconcile them through a calibrated decoder.
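To make the reconciliation concrete, here is a toy decoder with invented rules. It only illustrates the idea of forcing the rounded goal predictions to agree with the outcome model; the shipped decoder's actual rules are documented in the model card:

```python
import numpy as np

def reconcile(home_goals: float, away_goals: float, outcome_probs: np.ndarray) -> tuple[int, int]:
    """Illustrative decoder: round the goal regressions, then nudge the scoreline
    so it agrees with the most probable outcome (home / draw / away)."""
    h, a = round(home_goals), round(away_goals)
    outcome = int(np.argmax(outcome_probs))      # 0 = home win, 1 = draw, 2 = away win
    if outcome == 1 and h != a:
        h = a = round((home_goals + away_goals) / 2)
    elif outcome == 0 and h <= a:
        h = a + 1
    elif outcome == 2 and a <= h:
        a = h + 1
    return h, a

print(reconcile(1.6, 1.4, np.array([0.34, 0.38, 0.28])))   # draw enforced -> (2, 2)
```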
02
Calibrated Probabilities
Raw model probabilities lie. They overstate confidence. We apply temperature scaling to produce probabilities that reflect actual outcome frequencies — meaning a 60% confidence prediction is right roughly 60% of the time.
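Temperature scaling itself is a one-parameter transform fit on held-out data. A generic sketch, with an invented temperature value for illustration:

```python
import numpy as np

def temperature_scale(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Divide raw logits by a temperature fit on held-out data, then softmax.
    T > 1 softens overconfident probabilities; T = 1 leaves them untouched."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Raw logits for (home win, draw, away win); the temperature here is invented.
raw = np.array([2.1, 0.3, -0.8])
print(temperature_scale(raw, 1.0))   # uncalibrated softmax, overconfident
print(temperature_scale(raw, 1.8))   # calibrated: confidence pulled toward observed frequencies
```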
03
Abstain Logic
Not every match is predictable. Tightly contested derbies, fragile tactical matchups, and historically volatile pairings produce ambiguous probabilities. The model knows when to flag a fixture as fragile rather than overcommit.
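In code, abstention is a thresholding decision over the calibrated probabilities. A minimal sketch with invented thresholds; the shipped fragility rules are richer than this:

```python
def should_abstain(probs: dict[str, float],
                   min_top: float = 0.45,
                   min_margin: float = 0.10) -> bool:
    """Flag a fixture as fragile when no outcome is clearly favoured.
    Thresholds here are illustrative, not the published model's values."""
    ranked = sorted(probs.values(), reverse=True)
    return ranked[0] < min_top or (ranked[0] - ranked[1]) < min_margin

print(should_abstain({"home": 0.38, "draw": 0.33, "away": 0.29}))  # True  -> return a score range
print(should_abstain({"home": 0.62, "draw": 0.22, "away": 0.16}))  # False -> commit to a scoreline
```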
04
Resilient Feature Builder
Real-world data is messy. The wrapper handles thin CSVs, missing optional columns, inconsistent team names, and partial player aggregates. Quality scales with data richness — but the package never breaks because of imperfect input.
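The pattern is simple: every optional column gets a safe default, every identifier gets normalized. A sketch with hypothetical column names, not the package's actual schema:

```python
import pandas as pd

# Illustrative schema; the real package ships its own column list and defaults.
OPTIONAL_DEFAULTS = {"injuries_last5": 0, "suspensions_last5": 0, "tactic_id": -1}

def normalize_input(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing optional columns and tidy team names so a thin CSV still predicts."""
    out = df.copy()
    for col, default in OPTIONAL_DEFAULTS.items():
        if col not in out.columns:
            out[col] = default                                        # fallback default, never a crash
    for col in ("home_team", "away_team"):
        if col in out.columns:
            out[col] = out[col].astype(str).str.strip().str.lower()   # consistent team names
    return out
```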
05
Two-Tier Public API
Different consumers need different shapes. Front-end product cards need the simple response. Power users need the full debug payload. Both interfaces sit on the same model artifact.
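The two tiers expose the same prediction at different depths. A sketch with hypothetical field names that mirrors the idea rather than the package's actual classes:

```python
from dataclasses import dataclass, asdict

# Hypothetical response shapes; the published package defines its own classes.
@dataclass
class SimplePrediction:
    home_goals: int
    away_goals: int
    confidence: str                    # "high" / "medium" / "low"

@dataclass
class DetailedPrediction(SimplePrediction):
    probabilities: dict                # calibrated home / draw / away
    abstain: bool
    diagnostics: dict                  # decoder rule triggers, xG delta, ...

def to_card(p: DetailedPrediction) -> dict:
    """Front-end product cards only need the simple shape; both tiers come from one artifact."""
    return asdict(SimplePrediction(p.home_goals, p.away_goals, p.confidence))
```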
06
Production-Grade Packaging
This is not a notebook dumped on the internet. It is a real Python package — pyproject.toml, semantic versioning, SHA256-verified artifacts, model card, evaluation summary, FAQ, smoke test, batch inference, CLI demo, and notebook starter all included.
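The SHA256 verification needs nothing beyond the Python standard library. The artifact file name below is a placeholder; the real names and digests are listed in the model card:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a downloaded artifact and return its SHA256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the digest listed in the model card before loading anything.
print(sha256_of("model_home_goals.cbm"))
```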
FEATURE ENGINEERING
The 48 Signals
No magic. Just disciplined feature engineering. Every signal earns its place in the model.
12
Form & Goals
Last-5 and last-10 average goals scored, conceded, and goal differentials — for both home and away contexts
06
Win Rates
Home, away, and overall win rates plus draw rates over the prior 10 matches
04
Elo Strength
Pre-match Elo ratings and Elo differential — capturing strength asymmetry between sides
02
Rest Days
Days of recovery since each team’s last competitive fixture
14
Player Aggregates
Minutes, goals, assists, cards, starters, used players, injuries, suspensions over prior 5 matches
06
Tactic & Coach IDs
Tactic identifier, coach identifier, and tactic-stability indicators per side
02
Team IDs
Stable identifiers for each side, enabling team-specific embeddings and patterns
02
Matchup Codes
Tactic-matchup composite codes capturing how specific tactical pairings historically resolve
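Grouped by the categories above, the counts reconcile to exactly 48. Only the group sizes below come from this page; the individual column names belong to the published feature builder:

```python
# Signal groups and their sizes, as described above; column names are omitted on purpose.
SIGNAL_GROUPS = {
    "form_and_goals": 12,      # last-5 / last-10 goals for and against, differentials
    "win_rates": 6,            # home, away, overall win rates plus draw rates
    "elo": 4,                  # pre-match Elo per side plus differential
    "rest_days": 2,
    "player_aggregates": 14,   # minutes, goals, assists, cards, injuries, suspensions
    "tactic_and_coach": 6,
    "team_ids": 2,
    "matchup_codes": 2,
}
assert sum(SIGNAL_GROUPS.values()) == 48
```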
TRY IT YOURSELF
From Install to Prediction in Five Lines
No API keys. No paywall. No registration. Open-source, MIT licensed, ready to fork.
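Roughly what that looks like in practice. The package name follows the repository slug above, but the import path and the predict_match call are assumptions; the model card on Hugging Face documents the exact entry point:

```python
# Shell: pip install la-liga-score-predictor      (package name assumed from the repo slug)
# Python (import path and function name below are assumptions; see the model card):
from la_liga_score_predictor import predict_match

result = predict_match(home="Real Madrid", away="Sevilla", history_csv="matches.csv")
print(result["predicted_score"], result["probabilities"], result["confidence"])
```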

CAPABILITIES
What the Model Returns
Predicted Score
Final exact scoreline chosen by the model — both home and away goal counts.
Calibrated Probabilities
Home win, draw, and away win probabilities — temperature-scaled for accuracy.
Confidence Level
Simple high / medium / low label with underlying confidence score and margin.
Abstain Signal
Boolean flag when the model believes the fixture is too fragile to commit to an exact score.
Score Range
Optional home / away score band returned for fragile fixtures instead of a single scoreline.
Decoder Diagnostics
Specialist rule triggers, outcome enforcement flags, and xG delta for advanced users.
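Put together, a single prediction payload looks something like the sketch below. Field names and values are illustrative only; the authoritative schema lives in the model card and package docs:

```python
# Hypothetical payload mirroring the six return fields described above.
example_prediction = {
    "predicted_score": {"home": 2, "away": 1},
    "probabilities": {"home_win": 0.51, "draw": 0.27, "away_win": 0.22},  # temperature-scaled
    "confidence": {"level": "medium", "score": 0.51, "margin": 0.24},
    "abstain": False,                    # True when the fixture is too fragile to call exactly
    "score_range": None,                 # e.g. {"home": [1, 2], "away": [0, 1]} when abstaining
    "diagnostics": {"rules_triggered": [], "outcome_enforced": False, "xg_delta": 0.4},
}
```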
OUR STANCE
Most agencies sell you a service. We sell you an outcome — then we don’t sleep until we deliver it.
We are not a body shop. We are not a feature factory. When we ship ML, we ship the model card, the evaluation summary, the SHA256 artifacts, and the responsible-use boundaries. Engineering integrity is not optional.
FOR YOUR INDUSTRY
Beyond Football — Custom ML We Build for Clients
The same 48-signal architecture you saw above generalizes to any data-rich domain. Six prediction problems we ship into production for clients today — and the value each one delivers.
FinTech
Credit Risk Scoring
Calibrated default-probability models for loan origination, underwriting decisions, and portfolio risk monitoring.
INPUTS
Transaction history · payment patterns · demographic signals · macro indicators · alternative data
OUTPUTS
Calibrated default probability · risk band · abstain flag for thin-file applicants
CLIENT VALUE
Lower default rates while expanding approval coverage. Better risk-adjusted returns.
Healthcare
Patient Risk Prediction
HIPAA-aware risk models for readmission, deterioration detection, and treatment response — built with proper PHI handling.
INPUTS
Clinical signals · lab trajectories · vital trends · intervention history · medication adherence
OUTPUTS
30-day readmission risk · deterioration signal · treatment response probability · BAA available
CLIENT VALUE
Earlier intervention. Better patient outcomes. Optimized resource allocation across care teams.
eCommerce
Demand Forecasting
SKU-level demand prediction with calibrated confidence intervals — for inventory planning, promotional sizing, and stockout prevention.
INPUTS
SKU sales velocity · price elasticity · seasonal patterns · promo calendar · external signals
OUTPUTS
Demand quantity intervals · stockout probability · replenishment recommendations · cold-start abstain
CLIENT VALUE
Reduced stockouts. Lower carrying costs. Higher fulfillment rates and margin.
SaaS
Churn Prediction
User-level churn probability across 30, 60, and 90-day windows — with intervention recommendations and abstain logic for cold-start accounts.
INPUTS
Engagement signals · feature-usage trajectories · support interaction patterns · billing history
OUTPUTS
Churn probability per window · intervention recommendation · cohort segmentation · drift signals
CLIENT VALUE
Targeted retention spend. Higher net revenue retention. Faster CAC payback.
Logistics
ETA & Route Modeling
Calibrated arrival-time distributions for last-mile delivery, freight scheduling, and fleet optimization — with confidence bands per delivery.
INPUTS
Historical traffic · driver patterns · weather signals · vehicle telemetry · route geometry
OUTPUTS
Calibrated ETA distributions · delay probability · route optimization · driver-level signals
CLIENT VALUE
Improved on-time delivery. Lower fuel costs. Higher customer satisfaction scores.
EdTech
Learner Outcome Prediction
Course-level completion probability and at-risk classification — driving personalized intervention and content recommendations.
INPUTS
Engagement signals · assessment trajectories · content consumption · interaction patterns
OUTPUTS
Course completion probability · at-risk classification · content recommendations · cohort analytics
CLIENT VALUE
Higher completion rates. Lower dropout. Personalized learning paths at scale.
Don’t see your industry?
If your business has historical data and a prediction problem, the same architecture probably applies.
THE PROCESS
How AddWeb ML Engagements Work
Six phases. Each with defined deliverables, sign-off gates, and clear go/no-go decisions. No black boxes, no perpetual scope creep.
Week 1-2
01
Discovery & Data Audit
Feasibility assessment, data readiness review, problem framing. Output: a calibrated go/no-go recommendation. If ML isn’t the right tool, we tell you.
Week 2-4
02
Feature Engineering
Design the signal architecture. Build the feature builder. Establish baseline performance. This is where 80% of model quality is determined.
Week 4-8
03
Model Training & Calibration
Train candidate models. Apply temperature scaling. Validate on held-out data. Engineer abstain logic. Document evaluation summary with calibrated metrics.
Week 8-10
04
Deployment Integration
Plug into your stack — REST API, batch pipeline, or embedded inference. Instrument tracking. Set up monitoring. Production-ready hand-off.
Ongoing
05
Monitoring & Drift Detection
Track real-world accuracy. Detect concept drift. Schedule retraining. Maintain model card. The work isn’t done at deployment — it’s done when the model still works two years in.
Continuous
06
Knowledge Transfer
Documentation, runbooks, training data pipelines, model card, evaluation summary. Your team can maintain this. We make leaving easy — that’s why nobody does.
How You Hire Us
Custom ML Engagement Models
Three ways to start, ranked by commitment level. Pick the one that matches your stage and risk tolerance.
ML Discovery Sprint
“Should we build this?”
2-week fixed-scope feasibility audit. Best for teams unsure whether ML is the right tool — or which prediction problem to solve first.
From $15K
2 weeks · fixed price
Custom ML Build
“One prediction problem, solved end to end.”
Fixed-scope build from data audit to deployed production model: feature engineering, training, calibration, and integration into your stack.
From $40K
8-16 weeks · scoped in Discovery
ML Practice Embed
“Ongoing ML capability.”
Monthly retainer. Senior ML engineers embedded with your team. Ideal for businesses building ML as a core capability across multiple use cases.
$20K – $40K/mo
Monthly · 3-month minimum
Pricing ranges shown for transparency. Exact rates depend on data complexity, integration scope, and team composition. Discussed in your free Discovery call.
AddWeb vs. In-House ML vs. AutoML Platforms
Three paths to a production ML model. Each has trade-offs. Here’s the honest breakdown.
Factor
AddWeb ML Practice
Build In-House Team
AutoML Platforms
Low-Risk ML
How We Make ML Engagements Low-Risk
ML projects fail more often than they succeed. We’ve engineered five concrete safeguards to flip those odds.
01
Discovery Sprint Truth
If our 2-week Discovery determines ML isn’t the right tool, we tell you. You pay only for the Discovery — never for the wrong build.
02
NDA-First Engagement
Mutual NDA before any data review. Every engineer signs an individual NDA. ISO 27001 certified for information security.
03
Calibrated Honesty
We share confidence intervals on our own delivery estimates. No vendor over-promising. We tell you the realistic accuracy ceiling before training begins.
04
Full IP & Model Transfer
Source code · trained models · training pipelines · model card · eval summary · runbooks. All yours on completion. No retention.
05
No Lock-In Exit
Documentation complete enough that another team could maintain it. We make leaving easy — that’s why nobody does.
AddWeb AI Suite
This Is One Model. We Run an AI Practice.
Three owned AI products. Four years of production AI work. The La Liga Score Predictor is a sample — not the ceiling.
Beyond what’s open-sourced, AddWeb operates a full AI Solutions practice covering Generative AI & LLM development, AI Agents & Automation, Custom ML, AI for eCommerce, AI Voice Agents, and Computer Vision. Three of our AI products are in production today:
AddWeb AI
Customizable AI Platform for Business Workflows
EcomSupport360
AI-Powered eCommerce Automation
WeWP
AI-Driven WordPress Hosting
Six Commitments
How We Engineer ML
AI-native since 2022 — embedded into your stack, not bolted on top.
4.2 year average relationship. 98% retention. We grow with you.
Model cards. Evaluation summaries. SHA256 artifacts. No black boxes.
Code reviewed, architecture justified, models calibrated and abstain-aware.
Open-source proof on Hugging Face + Kaggle. ISO 9001 & 27001 certified.
13+ years shipping. 1000+ projects. Real ML in production today.

Book Your Call with a
Full Stack Expert

Ravi Maniyar
Director – Full Stack Development
Ravi brings 13+ years of experience in JavaScript, TypeScript, and React Native, building high-performance web and mobile applications with scalable, clean architecture.
Free Resource
The 27-Point ML Project Audit Checklist
Before you commit budget to a custom ML project, audit your readiness across 27 critical factors — data quality, problem framing, deployment fit, ROI viability, team capability. Used by senior CTOs to de-risk ML investments.
FAQs
Questions Engineers & Buyers Ask
What is the La Liga Score Predictor?
It’s an open-source machine learning model built by AddWeb Solution that predicts pre-match outcomes for Spanish La Liga fixtures. The 48-signal CatBoost architecture behind it is the same approach we deploy for production ML across fintech, healthcare, eCommerce, SaaS, logistics, and EdTech clients.
Why did AddWeb open-source a football model?
To make our ML capability verifiable, not just visible. Anyone evaluating AddWeb for AI work can audit the actual code, architecture, documentation, and engineering rigor. It is shipped under MIT license on both Hugging Face and Kaggle with full model card, evaluation summary, and SHA256-verified artifacts.
Can the same architecture work for my industry?
Yes. The 48-signal feature engineering, ensemble classification, calibrated probabilities, and abstain logic translate directly to fintech credit scoring, healthcare risk prediction, eCommerce demand forecasting, SaaS churn modeling, logistics ETA, and EdTech outcome prediction. We engineer custom variants per client engagement.
How long does a custom ML project take?
A 2-week Discovery Sprint validates feasibility. A full Custom ML Build typically runs 8 to 16 weeks from data audit to deployed production model — depending on data complexity, integration scope, and required model variants. Discovery outputs a calibrated timeline before commitment.
How much does a custom ML engagement cost?
Discovery Sprints start at $15,000. Custom ML Builds typically range $40,000 to $150,000 depending on scope. ML Practice Embeds (ongoing retainer) start at $20,000 per month. Exact pricing depends on data complexity, deployment requirements, and timeline. Discussed in your free Discovery call.
What if we don’t have enough data?
Our Discovery Sprint includes a data readiness audit. If your data isn’t sufficient for the prediction problem, we tell you directly — and recommend either data collection strategies, simpler heuristic alternatives, or pausing the project. We don’t build models on inadequate data.
Do you sign NDAs?
Yes. NDA-first is our standard engagement model. Mutual NDA before any data review or technical conversation. Every engineer assigned to your engagement signs an individual NDA. We hold ISO 27001 certification for information security.
What happens when the model’s accuracy drifts after launch?
This is called concept drift, and it’s why we include monitoring and retraining in every production engagement. We instrument drift detection, set retraining schedules, and provide runbooks for ongoing maintenance. The Discovery Sprint defines the monitoring strategy upfront.
Do we own the model and the IP at the end?
Yes — fully. On final payment, all source code, trained model artifacts, training pipelines, model card, evaluation summary, and runbooks transfer to you. No retention, no lock-in. Our exit clause makes the handover comprehensive.
How is this different from an AutoML platform?
AutoML platforms automate model training but fall short on industry-specific feature engineering, calibration rigor, abstain logic, and production deployment integration. We do that work by hand with senior ML engineers — meaning better calibration, deeper feature engineering, custom integration with your stack, and an ongoing maintenance partnership.
What technology stack is the model built on?
CatBoost gradient boosting for the three trained artifacts. Python package following the pyproject.toml standard. Calibrated probabilities via temperature scaling. MIT license. Semantic versioning (current: 2026.04.1). For client engagements we adapt the stack to the use case — XGBoost, LightGBM, scikit-learn, PyTorch, Hugging Face Transformers as appropriate.
Can we talk to an ML engineer before committing?
Yes. Schedule a 30-minute Discovery Call with our ML practice. We’ll assess your data, identify the highest-leverage prediction problem, and tell you honestly whether ML is the right tool. If we determine it’s not, we’ll recommend simpler alternatives. No sales pitch.
Your Move
Three Ways to Start. Pick Your Commitment Level.
From a 30-minute Discovery Call to a free audit checklist, every entry point is designed to give you something useful before you commit a dollar.
We respond within 4 business hours · NDA available on request · ISO 27001 certified
