EN

I'm continuing to merge models on my own device without knowing their exact benchmark scores. I will probably need to start running models through benchmarks before merging them further, but I haven't done that yet.

This time the merge uses the Arcee Fusion method.

Some information on the merge can be found here: https://huggingface.co/zelk12/Mergekit_Gemma-4-E2B

GGUF

If I don't delete it and don't forget to upload it, a GGUF of this model (probably Q6) can be found here: https://huggingface.co/zelk12/MT2_gemma-4-E2B-Q6_K-GGUF
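Since the exact file name inside the GGUF repository is not given here, one cautious way to fetch the quant is to list the repository files first and pick the one that matches the quantization tag. This is only a sketch: it assumes the `huggingface_hub` package is installed, `pick_gguf` and `download_quant` are hypothetical helper names, and the repository may not exist if the upload never happened.

```python
def pick_gguf(filenames, quant="Q6_K"):
    """Return the first .gguf file whose name contains the quant tag, or None."""
    tag = quant.lower()
    for name in filenames:
        lower = name.lower()
        if lower.endswith(".gguf") and tag in lower:
            return name
    return None


def download_quant(repo_id="zelk12/MT2_gemma-4-E2B-Q6_K-GGUF", quant="Q6_K"):
    """Download the matching GGUF file and return its local path."""
    # Imported lazily so pick_gguf stays usable without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    filename = pick_gguf(list_repo_files(repo_id), quant)
    if filename is None:
        raise FileNotFoundError(f"no {quant} .gguf file found in {repo_id}")
    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Calling `download_quant()` needs network access; the returned path can then be passed to whichever GGUF runtime you prefer.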

GGUF and imatrix GGUF quantizations can also often be found here: mradermacher. I mostly use their quants myself when possible.

Sometimes Otakadelic also posts my models. He also merges models himself.

Information

What does the model name MT-Gen_gemma-4-E2B mean?

MT = merge test: simply a test merge, followed by its number within the generation.
Gen = the generation of merges. It is not really tied to anything and is just an additional number, but the generation usually changes when I test completely new options. I also try not to include models of the same generation in a merge.

From the name of the original model, Gemma-4-E2B-it:

Gemma = Google's family of open models.
4 = essentially the generation of the model.
E2B = the model uses effective parameters, which in computation are roughly equivalent to 2 billion ordinary parameters.
it = instruction tuned, meaning the model is trained to follow instructions, which is what allows it to be used for chat.
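The naming scheme above can be sketched as a small parser. The exact pattern is an assumption read off the examples in this card (`MT<number>`, an optional `-Gen<number>`, an underscore, then the base model name); `NAME_RE` and `parse_name` are hypothetical names, not part of any library.

```python
import re

# Assumed pattern: MT<number>[-Gen<number>]_<family>-<version>-<size>[-it]
NAME_RE = re.compile(
    r"^MT(?P<merge>\d+)(?:-Gen(?P<gen>\d+))?_"
    r"(?P<family>[A-Za-z]+)-(?P<version>\d+)-(?P<size>E?\d+B)(?P<it>-it)?$"
)


def parse_name(name):
    """Split a merge name into the parts described in the list above."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"name does not follow the assumed scheme: {name!r}")
    return {
        "merge_number": int(m["merge"]),
        "generation": int(m["gen"]) if m["gen"] else None,
        "family": m["family"],
        "version": int(m["version"]),
        "size": m["size"],
        "instruction_tuned": m["it"] is not None,
    }


print(parse_name("MT2_gemma-4-E2B"))
```

For this model the generation field comes back as `None`, because the name carries only the merge number.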

In theory, the model itself can work with a 128k context. The sliding attention window has a size of 512 tokens. Support for variable aspect ratios has been introduced, along with image encoding options of 70, 140, 280, 560, or 1120 tokens. Additionally, Gemma-4-E2B-it models are usually capable of working with audio data.
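To make the sliding-window figure more concrete, here is a minimal sketch of a causal sliding-window attention mask in plain Python. It assumes the window counts the current token (conventions differ between implementations), and both function names are hypothetical. It illustrates the windowed layers only and says nothing about any full-attention layers the model may also have.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j."""
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def visible_tokens(position, window=512):
    """Number of keys visible from a 0-based position in a windowed layer."""
    return min(position + 1, window)


# Tiny demonstration with a window of 3 instead of 512.
for row in sliding_window_mask(seq_len=5, window=3):
    print("".join("x" if allowed else "." for allowed in row))
```

With a window of 512, a token deep inside a long prompt still sees only the most recent 512 positions in such a layer, no matter how long the full context is.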

These token figures may differ for this model, because it is not the pure Gemma-4-E2B but a merge of models derived from it.


MT2_gemma-4-E2B

MT2_gemma-4-E2B is a merge of the following models using LazyMergekit:

🧩 Configuration

models:
  - model: TrevorJS/gemma-4-E2B-it-uncensored
    parameters:
      density: 0.8
      weight: 0.4

  - model: MrHurro/Caveman_gemma-4-E2B_checkpoint-5550
    parameters:
      density: 0.5
      weight: 0.6

merge_method: arcee_fusion
base_model: TrevorJS/gemma-4-E2B-it-uncensored
parameters:
  normalize: true
dtype: bfloat16
tokenizer_source: base
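For anyone who wants to reproduce the merge locally instead of through the LazyMergekit notebook, the configuration above can be passed to mergekit's `mergekit-yaml` command. The sketch below only builds and runs that command; the file name `config.yaml`, the output directory, and the helper names are assumptions for illustration, and mergekit must already be installed.

```python
import subprocess


def build_merge_command(config_path, output_dir, use_cuda=False):
    """Return the mergekit-yaml invocation as an argument list."""
    cmd = ["mergekit-yaml", config_path, output_dir]
    if use_cuda:
        cmd.append("--cuda")  # run the merge on the GPU when one is available
    return cmd


def run_merge(config_path="config.yaml", output_dir="./MT2_gemma-4-E2B", use_cuda=False):
    """Run the merge; raises CalledProcessError if mergekit exits with an error."""
    subprocess.run(build_merge_command(config_path, output_dir, use_cuda), check=True)
```

Save the YAML above as `config.yaml` first, then call `run_merge()`; expect the merge to download both source models.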

💻 Usage

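# Notebook cell syntax; in a terminal, run the same command without the leading "!".
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "zelk12/MT2_gemma-4-E2B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat messages into one prompt string using the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline; device_map="auto" relies on accelerate to place the weights.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens and print the prompt together with the completion.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])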

Safetensors

Model size: 5B params
Tensor type: F32
