---
tags:
- model-merge
- hermite-interpolation
- deepseek
base_model:
- deepseek-ai/deepseek-math-7b-instruct
- jahyungu/deepseek-math-7b-instruct_hendrycks_math
---

# deepseek-7b-math-hendrycksmath-lambda06

A merged model produced by linearly interpolating the parameters of two models.

## Merge Configuration

| Parameter | Value |
|-----------|-------|
| Model A | `deepseek-ai/deepseek-math-7b-instruct` |
| Model B | `jahyungu/deepseek-math-7b-instruct_hendrycks_math` |
| λ_a | 0.60 |
| λ_b | 0.40 |
| Formula | θ* = 0.60 × θ_a + 0.40 × θ_b |
| dtype | torch.float16 |

## Tokenizer

Union tokenizer (mergekit-style): the vocabularies of both models are merged.

- Union vocab size: 100002
- Tokens added from Model B: 0
- Tokens only in Model A: 0

For a token missing from one model, the other model's embedding is used as a fallback before linear interpolation. Both counts above are 0, so the two vocabularies coincide and no token in this merge required the fallback.

## Description

This model was created by linearly interpolating the parameters of two models:

- **Model A** (`deepseek-ai/deepseek-math-7b-instruct`): weight = 0.60
- **Model B** (`jahyungu/deepseek-math-7b-instruct_hendrycks_math`): weight = 0.40
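
The interpolation formula and the embedding fallback described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the script used to build this model: the names `lerp` and `merge_embeddings` and the toy vectors are hypothetical, and plain lists stand in for real tensors.

```python
from typing import Dict, List

Vector = List[float]

LAMBDA_A = 0.60  # weight for Model A, taken from the merge configuration
LAMBDA_B = 0.40  # weight for Model B, taken from the merge configuration


def lerp(a: Vector, b: Vector,
         lam_a: float = LAMBDA_A, lam_b: float = LAMBDA_B) -> Vector:
    """Element-wise lam_a * a + lam_b * b (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [lam_a * x + lam_b * y for x, y in zip(a, b)]


def merge_embeddings(emb_a: Dict[str, Vector],
                     emb_b: Dict[str, Vector]) -> Dict[str, Vector]:
    """Interpolate per-token embeddings over the union vocabulary.

    A token missing from one model falls back to the other model's vector.
    Because the weights sum to 1, such a token keeps that vector unchanged.
    """
    merged: Dict[str, Vector] = {}
    for token in sorted(set(emb_a) | set(emb_b)):
        vec_a = emb_a.get(token, emb_b.get(token))  # fallback to B if absent in A
        vec_b = emb_b.get(token, emb_a.get(token))  # fallback to A if absent in B
        merged[token] = lerp(vec_a, vec_b)
    return merged


# Toy data (made up for illustration): "y" exists only in the first model.
toy = merge_embeddings(
    {"x": [1.0, 0.0], "y": [2.0, 2.0]},
    {"x": [0.0, 1.0]},
)
```

In this toy run, the shared token `x` is a 0.60/0.40 blend of its two vectors, while `y`, present in only one vocabulary, keeps its original vector up to floating-point rounding. For real checkpoints the same arithmetic would be applied tensor by tensor to every parameter.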