How to use thinktecture/intent-logreg-nextera with Scikit-learn:

```python
from huggingface_hub import hf_hub_download
import joblib

# Only load pickle files from sources you trust.
# Read more about it here: https://skops.readthedocs.io/en/stable/persistence.html
model = joblib.load(
    hf_hub_download("thinktecture/intent-logreg-nextera", "sklearn_model.joblib")
)
```
Model Licenses: Read Before Redistributing
The Apache-2.0 LICENSE at the repo root covers this repo's code.
It does not cover the model weights you download, train against, or merge into GGUFs. Those follow their respective base-model licenses. If you fine-tune one of these models and publish the result (e.g. push to HuggingFace, ship in a product), the base-model terms come with it.
Base models used by this pipeline
| Model | HF ID | License | Click-through? |
|---|---|---|---|
| Gemma 3 (1B, 4B, EmbeddingGemma 300M) | `google/gemma-3-*`, `google/embeddinggemma-300m` | Gemma Terms of Use | Yes, once per HF account |
| Qwen 3.5-4B | `Qwen/Qwen3.5-4B` | Tongyi Qianwen License Agreement | No |
| GLM-OCR (optional, for OCR upload path) | `zai-org/GLM-OCR` | MIT (check current README) | No |
The `ggml-org/*-GGUF` repos referenced in `setup.sh` are official llama.cpp quantizations of the same upstream models above, so the same license terms apply.
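The table above can also be encoded as a small lookup, so a release script can tell which license applies to a given base-model ID. This is only a minimal sketch and not part of the repo: the `LicenseInfo` type, the `BASE_MODEL_LICENSES` mapping, and the `license_for` helper are hypothetical names, and the entries simply mirror the table rows.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional


class LicenseInfo(NamedTuple):
    name: str
    click_through: bool  # True if the license must be accepted on the HF page first


# Hypothetical mapping that mirrors the table above; re-check the upstream
# license texts before relying on it.
BASE_MODEL_LICENSES = {
    "google/gemma-3-*": LicenseInfo("Gemma Terms of Use", True),
    "google/embeddinggemma-300m": LicenseInfo("Gemma Terms of Use", True),
    "Qwen/Qwen3.5-4B": LicenseInfo("Tongyi Qianwen License Agreement", False),
    "zai-org/GLM-OCR": LicenseInfo("MIT (check current README)", False),
}


def license_for(hf_id: str) -> Optional[LicenseInfo]:
    """Return the entry whose pattern matches hf_id, or None if it is unknown."""
    for pattern, info in BASE_MODEL_LICENSES.items():
        if fnmatchcase(hf_id, pattern):
            return info
    return None
```

An ID such as `google/gemma-3-1b-it` matches the `google/gemma-3-*` pattern, while an ID that is not in the table yields `None`, which a script should treat as "look up the license by hand".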
What "derivative work" means here
When you run `python -m finetune.train_gemma3` and produce
`models/gemma3-1b-ft-merged/gemma3-ft-<scenario>.gguf`, that file is a derivative
work of Gemma 3. Anyone who downloads it must accept the Gemma Terms of Use
the same way you did when you first pulled the base weights from HuggingFace.
Practical implications:
- Inside this repo only: nothing to do. The base weights are gitignored (`models/*.gguf`), the FT outputs are gitignored, and the training data is scenario-specific synthetic content owned by you.
- Publishing a fine-tuned GGUF: include the Gemma / Qwen license file in the release, name the base model in the model card, and follow the upstream attribution requirements.
- Building a product on top: the model is "used", not redistributed, and most permissive license terms allow that. Read the actual license; this is not legal advice.
Gemma 3: quick reference
The Gemma Terms of Use (you accept them when you click "Agree" on the HuggingFace page) permit:
- Commercial use, distribution, modification, fine-tuning
- Creating derivative models (including merged + quantized GGUFs)
And require:
- Including the prohibited-use policy in any redistribution
- Marking outputs from Gemma as AI-generated when relevant
- Not redistributing without including the Terms
There's no patent grant in the Gemma Terms (unlike Apache-2.0). For most AI-application scenarios this is benign, but it's the main thing that distinguishes "Gemma derivative" from "Apache-2.0 derivative."
Qwen 3.5: quick reference
The Tongyi Qianwen License permits commercial use up to a monthly active user (MAU) threshold (re-check the current text; Alibaba has revised this between Qwen versions). Above the threshold a separate commercial license is required.
For typical local-AI / on-prem deployments well under that MAU bar, the license behaves like an Apache-2.0-equivalent. Attribution to Qwen / Alibaba Cloud is required in any distribution.
Your fine-tuned outputs
The training data, prompts, and scenario configurations in this repo are Apache-2.0
licensed (per the repo LICENSE). The merged GGUF model files that
the training pipeline produces are:
- a derivative of the base model, and so covered by the Gemma Terms / Qwen License
- AND contain weights influenced by your training data, and so covered by your own license
In practical terms, when you publish a fine-tuned GGUF you should ship:
- The base-model license (Gemma Terms or Qwen License, alongside the GGUF)
- A model card describing your training data + intended use (you can use the existing `data/training-data/*.jsonl` as the data card)
- Your own license for the training-data-derived artifacts (Apache-2.0 inherited from this repo's `LICENSE` is the default)
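The checklist above can be turned into a pre-publish check. The sketch below assumes a release directory layout that this document does not specify: the file names `LICENSE-BASE-MODEL`, `README.md`, and `LICENSE`, and the `missing_release_files` helper, are hypothetical placeholders to adapt to your own release.

```python
from pathlib import Path

# Hypothetical file names; adapt them to however your release is laid out.
REQUIRED_FILES = {
    "LICENSE-BASE-MODEL": "base-model license (Gemma Terms or Qwen License)",
    "README.md": "model card describing training data and intended use",
    "LICENSE": "your own license for the training-data-derived artifacts",
}


def missing_release_files(release_dir: Path) -> list[str]:
    """Return descriptions of checklist items that are absent from release_dir."""
    missing = []
    # The release should contain at least one fine-tuned GGUF file.
    if not any(release_dir.glob("*.gguf")):
        missing.append("fine-tuned GGUF file")
    # Each required companion file must exist next to the GGUF.
    for filename, description in REQUIRED_FILES.items():
        if not (release_dir / filename).is_file():
            missing.append(description)
    return missing
```

An empty result only means the files exist. It does not confirm that their contents satisfy the upstream license terms, so the check complements reading the licenses and does not replace it.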
Not legal advice
This is a one-page operator's summary. For anything material (commercial release, public model card, enterprise deployment) consult an actual lawyer who reads AI licenses for a living. The Gemma and Qwen license texts have evolved multiple times, and the obligations may have changed since this document was written.