FAQ
Is this model for all football leagues?
No. This public release is intended only for Spanish La Liga.
Can I use it for Premier League or Champions League?
Not as a validated release. You can experiment privately, but the published bundle is not documented or evaluated for those competitions.
Why do I need a history CSV?
Because predict_match(...) builds features from past match data.
The model does not fetch live match history by itself.
In simple terms:
- the model contains learned prediction patterns
- the history CSV provides the current football context needed for a future fixture
Without that context, the package cannot know:
- each team's recent form
- recent goals for and against
- recent rest timing
- recent tactical identity signals
This package should be understood as:
- a model-and-inference bundle
- not a bundled football data service
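The rolling-context idea above can be sketched with pandas. This is only an illustration of how a wrapper might derive form and rest signals from a history CSV; the column names and window size here are invented, not the bundle's actual schema.

```python
import pandas as pd

# Hypothetical history rows; real column names may differ from the bundle's schema.
history = pd.DataFrame({
    "date": pd.to_datetime(["2024-08-10", "2024-08-18", "2024-08-25", "2024-09-01"]),
    "team": ["Sevilla"] * 4,
    "goals_for": [1, 0, 2, 3],
    "goals_against": [2, 1, 1, 0],
})
history = history.sort_values("date")

# Recent form: goals scored over the previous matches (window of 3 here for
# brevity; the bundle's field names suggest windows like the previous 5).
history["gf_prev3"] = history.groupby("team")["goals_for"].transform(
    lambda s: s.shift(1).rolling(3, min_periods=1).sum()
)

# Rest timing: days since each team's previous match.
history["rest_days"] = history.groupby("team")["date"].diff().dt.days
```

Note that `shift(1)` keeps each row's features strictly based on earlier matches, which is what makes the history CSV necessary: without past rows, there is nothing to shift over.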
Do I need to manually enter fields like home_player_red_cards_total_prev5?
No, not for the normal public flow. That is exactly why the bundle includes a feature-building wrapper. Most users should use:
predict_match(...) or predict_match_simple(...)
The raw feature interface is only for advanced users who already manage engineered features themselves.
What does abstain_recommended mean?
It means the fixture is fragile enough that the exact score should be treated cautiously. In those cases, the scoreline is less trustworthy than the overall probability shape.
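A consumer of the output might branch on this flag roughly as follows. The field names in this result dict are illustrative stand-ins, not the bundle's documented schema.

```python
# Hypothetical result shape; key names here are invented for illustration.
result = {
    "home_win_prob": 0.48,
    "draw_prob": 0.29,
    "away_win_prob": 0.23,
    "predicted_score": (1, 1),
    "abstain_recommended": True,
}

if result["abstain_recommended"]:
    # Fragile fixture: report the probability shape, not the exact scoreline.
    summary = (
        f"Home {result['home_win_prob']:.0%} / "
        f"Draw {result['draw_prob']:.0%} / "
        f"Away {result['away_win_prob']:.0%}"
    )
else:
    home, away = result["predicted_score"]
    summary = f"Predicted score: {home}-{away}"
```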
What is xg_delta?
xg_delta means:
expected_home_goals - expected_away_goals
How to read it:
- positive: the home side has the stronger expected scoring outlook
- negative: the away side has the stronger expected scoring outlook
- near zero: the match is more balanced
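The definition and the reading rules above fit in a few lines. The `balanced_band` threshold below is an illustrative choice, not a value defined by the bundle.

```python
def xg_delta(expected_home_goals: float, expected_away_goals: float) -> float:
    """Positive favours the home side; negative favours the away side."""
    return expected_home_goals - expected_away_goals

def read_xg_delta(delta: float, balanced_band: float = 0.15) -> str:
    # 0.15 is an arbitrary illustrative band for "near zero".
    if abs(delta) <= balanced_band:
        return "balanced"
    return "home edge" if delta > 0 else "away edge"
```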
Why are there simple and advanced outputs?
Because many products only need a few fields, while developers may want more diagnostics.
Use these for a simple product-facing response:
predict_match_simple(...) or predict_features_simple(...)
Use the full methods if you want richer diagnostics.
What if my CSV only has the minimum columns?
The bundle will still run, but prediction quality may be weaker because the richer optional fields will fall back to default values.
Why does the model talk about 48 signals if the sample CSV has fewer raw columns?
Because the wrapper builds the final model input row before inference.
Some signals come from:
- raw columns already present in the history CSV
- rolling features derived from past match results
- fallback defaults when richer optional columns are missing
So:
- the model expects 48 numeric signals at prediction time
- your history CSV does not need to contain all 48 as raw columns
- but a richer CSV helps the wrapper build a stronger feature row
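The three sources listed above can be sketched as a single assembly step. Everything here (signal names, defaults) is invented for illustration; the real wrapper has its own schema and fallback rules.

```python
# Illustrative only: these names and defaults are NOT the bundle's real schema.
RAW_SIGNALS = ["home_elo", "away_elo"]
ROLLING_SIGNALS = ["home_goals_for_prev5", "home_goals_against_prev5"]
DEFAULTS = {"home_player_red_cards_total_prev5": 0.0}

def build_feature_row(raw_row: dict, rolling: dict) -> dict:
    """Assemble the final model input row from three sources."""
    row = {}
    for name in RAW_SIGNALS:
        row[name] = raw_row[name]               # taken directly from the history CSV
    for name in ROLLING_SIGNALS:
        row[name] = rolling.get(name, 0.0)      # derived from past match results
    for name, default in DEFAULTS.items():
        row[name] = raw_row.get(name, default)  # fallback when the column is absent
    return row
```

The point is that the CSV feeds the row but is not the row itself: the wrapper fills the gap between raw columns and the full signal vector.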
Will I get the exact same answers as your internal environment?
Not necessarily.
You should expect the same model logic, but identical predictions are not guaranteed unless the input history data is also effectively identical.
Differences in:
- historical rows
- team IDs
- Elo values
- tactical IDs
- coach IDs
- rolling-form inputs
can all change the final prediction.
How should I name teams in my CSV?
Keep naming consistent. The wrapper normalizes common variations, including accent and case differences, but stable naming is still best.
Examples:
- Atletico Madrid and Atlético de Madrid
- Mallorca and Real Mallorca
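A minimal accent-and-case normalizer looks like the sketch below. This only covers the accent and case variations mentioned above; the bundle's actual normalization rules may be broader (e.g. handling prefixes like "Real").

```python
import unicodedata

def normalize_team_name(name: str) -> str:
    """Strip accents, lowercase, and collapse whitespace (illustrative only)."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())
```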
Are the sample CSV files real production data?
No. They are synthetic examples included only to make the bundle runnable out of the box.
They are useful for:
- learning the package shape
- testing integration
- understanding the expected CSV schema
They are not meant to represent:
- a full production dataset
- a complete public La Liga historical archive
- the exact private environment used during internal experimentation
Can I use the package without providing historical data?
Not for predict_match(...).
If you do not want to provide historical data, your alternative is the advanced path:
predict_features(...) or predict_features_simple(...)
Those methods require you to provide the engineered numeric features directly.
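The shape of that advanced path is roughly the following. The feature names are invented stand-ins, and the validation helper is hypothetical; the real bundle documents its own expected signal list.

```python
# Illustrative engineered-feature dict; names are NOT the bundle's real schema.
engineered_features = {
    "home_elo": 1612.0,
    "away_elo": 1548.0,
    "home_goals_for_prev5": 9.0,
    "away_goals_for_prev5": 6.0,
}

def validate_features(features: dict, expected: list) -> list:
    """Return the names of any expected signals missing from the caller's dict."""
    return [name for name in expected if name not in features]

# A caller would check completeness before handing the dict to the raw interface.
missing = validate_features(engineered_features,
                            ["home_elo", "away_elo", "rest_days_home"])
```

If `missing` is non-empty, you either supply the values yourself or accept whatever fallback behaviour the raw interface defines.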
Are we expected to publish large historical CSV files too?
No. In most cases you should not publish large internal historical datasets unless redistribution rights are explicit. The safer pattern is:
- publish the model bundle
- publish schema docs
- publish synthetic samples
- let users bring their own historical CSV