chewwt/dm_qwen4b_noise_emulator

chewwt committed on Apr 29

Commit 89d3393 (verified) · 1 Parent(s): db27964

Upload README.md with huggingface_hub

Files changed (1)

README.md ADDED

+---
+tags:
+  - pytorch
+  - safetensors
+license: mit
+---
+# dm_qwen4b_noise_emulator
+A Laplacian kernel regression model that predicts per-configuration `std_math` (computed across seeds)
+from data mixture proportions. It is used as a heteroscedastic noise model in Bayesian optimization
+of data mixtures.
+## Architecture
+- Kernel: Laplacian, `K(x, x') = exp(-γ · ||x - x'||₁)` (the usage code defaults to `gamma=0.1`)
+- Support points: 50 training configs
+- Input features (3): `[if_prop1, math_prop1, math_prop2]`  (values in [0, 1])
+- Output: predicted std_math (scalar)
+## Usage
+```python
+import torch
+from huggingface_hub import hf_hub_download
+from safetensors.torch import load_file
+
+
+class KernelRegressionModel(torch.nn.Module):
+    """Laplacian kernel regression: f(x) = sum_i dual_coef[i] * K(x, X_fit[i])."""
+
+    def __init__(self, dual_coef, X_fit, gamma=0.1):
+        super().__init__()
+        self.gamma = gamma
+        # Buffers move with the module across devices but are not trainable.
+        self.register_buffer("dual_coef", dual_coef)
+        self.register_buffer("X_fit", X_fit)
+
+    def forward(self, x):
+        # Pairwise L1 distances between queries and support points: (batch, n_support)
+        dist = torch.cdist(x, self.X_fit, p=1)
+        K = torch.exp(-self.gamma * dist)
+        return K @ self.dual_coef
+
+
+# Download the stored coefficients and support points, then rebuild the model.
+path = hf_hub_download("chewwt/dm_qwen4b_noise_emulator", "noise_model.safetensors")
+tensors = load_file(path)
+model = KernelRegressionModel(tensors["dual_coef"], tensors["X_fit"])
+model.eval()
+
+# x: (batch, 3) float64 tensor, features in [0, 1],
+# ordered as [if_prop1, math_prop1, math_prop2]
+x = torch.tensor([[0.3, 0.4, 0.2]], dtype=torch.float64)
+with torch.no_grad():
+    sigma = model(x)   # predicted std_math
+```
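
The `forward` method in the README evaluates a weighted sum of Laplacian kernels, one term per support point. The dependency-free sketch below spells out that arithmetic for a single query. The function names, the two support points, and the coefficients are made up for illustration; the real values are stored in `noise_model.safetensors` and are not reproduced here.

```python
import math


def laplacian_kernel(x, x_prime, gamma=0.1):
    """K(x, x') = exp(-gamma * ||x - x'||_1), as in the Architecture section."""
    l1 = sum(abs(a - b) for a, b in zip(x, x_prime))
    return math.exp(-gamma * l1)


def predict_std(x, support_points, dual_coef, gamma=0.1):
    """Weighted kernel sum: sum_i dual_coef[i] * K(x, support_points[i])."""
    return sum(
        alpha * laplacian_kernel(x, s, gamma)
        for alpha, s in zip(dual_coef, support_points)
    )


# Hypothetical toy values (NOT the released model's parameters).
support_points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
dual_coef = [0.5, 0.25]

# Query at the first support point: its kernel value is exactly 1,
# and the second point sits at L1 distance 1.
sigma_at_first = predict_std([0.0, 0.0, 0.0], support_points, dual_coef)
print(sigma_at_first)
```

Because the dual coefficients of a kernel regression can be negative, a weighted sum of this form is not guaranteed to be non-negative. If the prediction feeds a variance term in Bayesian optimization, it may be worth checking the sign of the output before using it.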