chewwt committed
Commit 89d3393 · verified · 1 Parent(s): db27964

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ tags:
+ - pytorch
+ - safetensors
+ license: mit
+ ---
+
+ # dm_qwen4b_noise_emulator
+
+ Laplacian kernel regression noise model for `std_math` prediction in data mixture optimization.
+
+ Predicts the per-configuration `std_math` (across seeds) from data mixture proportions.
+ It is used as a heteroscedastic noise model in Bayesian optimization.
+
+ ## Architecture
+
+ - Kernel: Laplacian `K(x, x') = exp(-γ · ||x - x'||₁)`
+ - Support points: 50 training configs
+ - Input features (3): `[if_prop1, math_prop1, math_prop2]` (values in [0, 1])
+ - Output: predicted `std_math` (scalar)
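
To make the kernel and prediction rule above concrete, here is a minimal NumPy sketch. It assumes the prediction is the usual kernel regression sum `f(x) = Σᵢ αᵢ · K(x, xᵢ)` over the support points, which matches the `forward` method in the README's usage snippet. The helper names `laplacian_kernel` and `predict`, and all numeric values, are hypothetical toy choices and not the released weights.

```python
import numpy as np


def laplacian_kernel(x, support, gamma):
    """K(x, x') = exp(-gamma * ||x - x'||_1) for every (query, support) pair."""
    # Pairwise L1 distances, shape (n_query, n_support).
    dist = np.abs(x[:, None, :] - support[None, :, :]).sum(axis=-1)
    return np.exp(-gamma * dist)


def predict(x, support, dual_coef, gamma):
    """Kernel regression prediction: sum_i dual_coef[i] * K(x, support[i])."""
    return laplacian_kernel(x, support, gamma) @ dual_coef


# Hypothetical toy values (NOT the released weights): 2 support points, 3 features.
support = np.array([[0.3, 0.4, 0.2], [0.5, 0.1, 0.1]])
dual_coef = np.array([0.02, 0.01])
gamma = 0.1  # same default as KernelRegressionModel in the README

# Query equal to the first support point, so K[0, 0] is exp(0) = 1.
x = np.array([[0.3, 0.4, 0.2]])
K = laplacian_kernel(x, support, gamma)
pred = predict(x, support, dual_coef, gamma)
print(K, pred)
```

The L1 distance from the query to the second support point is 0.2 + 0.3 + 0.1 = 0.6, so its kernel weight is `exp(-0.06)`, and the prediction is the coefficient-weighted sum of the two kernel values.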
+
+ ## Usage
+
+ ```python
+ import torch
+ from huggingface_hub import hf_hub_download
+ from safetensors.torch import load_file
+
+
+ class KernelRegressionModel(torch.nn.Module):
+     def __init__(self, dual_coef, X_fit, gamma=0.1):
+         super().__init__()
+         self.gamma = gamma
+         # Buffers: fixed tensors that follow the module across devices but are not trained.
+         self.register_buffer("dual_coef", dual_coef)
+         self.register_buffer("X_fit", X_fit)
+
+     def forward(self, x):
+         # Pairwise L1 distances between queries and support points.
+         dist = torch.cdist(x, self.X_fit, p=1)
+         # Laplacian kernel values, then the coefficient-weighted sum.
+         K = torch.exp(-self.gamma * dist)
+         return K @ self.dual_coef
+
+
+ path = hf_hub_download("chewwt/dm_qwen4b_noise_emulator", "noise_model.safetensors")
+ tensors = load_file(path)
+ model = KernelRegressionModel(tensors["dual_coef"], tensors["X_fit"])
+ model.eval()
+
+ # x: (batch, 3) float64 tensor, features in [0, 1]
+ x = torch.tensor([[0.3, 0.4, 0.2]], dtype=torch.float64)
+ with torch.no_grad():
+     sigma = model(x)  # predicted std_math
+ ```
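
The README says the predicted `std_math` serves as a heteroscedastic noise model in Bayesian optimization, but it does not show how the optimizer consumes it. One common pattern, sketched below purely as an assumption, is to square the predicted standard deviation and add it to the diagonal of a Gaussian process covariance, so that each observation gets its own noise variance. The RBF surrogate kernel, the `floor` clipping value, and all data here are hypothetical; clipping is included because a kernel regression output is not constrained to be positive.

```python
import numpy as np


def rbf(a, b, lengthscale=0.3):
    """Squared-exponential kernel between rows of a and b (assumed surrogate kernel)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq / lengthscale**2)


def gp_posterior_mean(X_train, y_train, sigma_train, X_query, floor=1e-6):
    """Zero-mean GP posterior mean with per-observation noise variance sigma_i**2."""
    # Clip to a small positive floor before squaring: predictions may be <= 0.
    noise_var = np.maximum(sigma_train, floor) ** 2
    K = rbf(X_train, X_train) + np.diag(noise_var)
    alpha = np.linalg.solve(K, y_train)
    return rbf(X_query, X_train) @ alpha


# Hypothetical mixtures [if_prop1, math_prop1, math_prop2] and observed scores.
X_train = np.array([[0.1, 0.2, 0.3], [0.6, 0.1, 0.2], [0.3, 0.7, 0.0]])
y_train = np.array([0.5, 0.7, 0.6])

# Stand-ins for the emulator's output at the training mixtures.
sigma_small = np.full(3, 1e-4)  # nearly noise-free observations
sigma_large = np.full(3, 1.0)   # very noisy observations

mean_small = gp_posterior_mean(X_train, y_train, sigma_small, X_train)
mean_large = gp_posterior_mean(X_train, y_train, sigma_large, X_train)
print(mean_small, mean_large)
```

With small predicted noise the posterior mean nearly interpolates the observations, while large predicted noise shrinks it toward the zero prior mean, which is the effect a per-configuration noise model is meant to provide.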