K=10,T=0.8: d2 = jnp.square(jax.nn.relu(x1 - x2)) / dispersion
return jnp.exp(d2 - jnp.logaddexp(d, d2))  # logaddexp assumed; the original call name was lost

def _swap_prob_entropy_reg(x1, x2, dispersion=1.0, norm_p=1.0):
    d = 2 * jnp.power(jax.nn.relu(x2 - x1), norm_p)  # power assumed; the original call name was lost
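The fragment above appears to compute a pairwise swap probability of the softmax form exp(d2) / (exp(d) + exp(d2)). A minimal runnable sketch under that reading, with the helper name `swap_prob` and the `jnp.logaddexp` call being assumptions, not taken from the original:

```python
import jax
import jax.numpy as jnp


def swap_prob(x1, x2, dispersion=1.0):
    # Hypothetical reconstruction: squared hinge distances in both directions,
    # scaled by dispersion, combined via a two-way softmax (using logaddexp
    # for numerical stability) into the probability of swapping x1 and x2.
    d = jnp.square(jax.nn.relu(x2 - x1)) / dispersion
    d2 = jnp.square(jax.nn.relu(x1 - x2)) / dispersion
    return jnp.exp(d2 - jnp.logaddexp(d, d2))


p = swap_prob(jnp.array(1.0), jnp.array(2.0))
```

By construction `swap_prob(x1, x2) + swap_prob(x2, x1) == 1`, which is the sanity check one would expect of a differentiable swap operator.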
K=10,T=0.8: Setting this too small will require a large learning rate and will produce poor results. abort_early: If true, allows early aborts if gradient descent gets stuck. initial_const: The initial tradeoff-constant to use to tune the relative importance of distance and confidence. Should be set to a very small value (but positive). largest_const: The largest constant to use until we report failure. Should be set to a very large value.
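These parameters read like the docstring of a Carlini–Wagner-style attack, in which an outer loop grows the tradeoff constant from `initial_const` toward `largest_const` until the inner optimization succeeds. A minimal sketch of such a search; the names `attack_step` and `const_factor` are hypothetical and not taken from the original:

```python
def search_const(attack_step, initial_const=1e-3, largest_const=2e6, const_factor=2.0):
    """Geometric search over the tradeoff constant c.

    attack_step(c) is a hypothetical callable returning (success, result);
    we try increasingly large c until it succeeds or c exceeds largest_const.
    """
    c = initial_const
    while c < largest_const:
        ok, result = attack_step(c)
        if ok:
            return result
        c *= const_factor
    return None  # report failure once largest_const is exceeded


# Toy usage: an "attack" that only succeeds once c reaches 1.0.
found = search_const(lambda c: (c >= 1.0, c))
```

The geometric schedule keeps the number of inner optimizations logarithmic in `largest_const / initial_const`, which is why the docstring can afford a very small initial value and a very large ceiling.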