ZibinDong/ActionCodec-Base-RVQft

@@ -2,7 +2,6 @@
 library_name: transformers
 tags: []
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
@@ -13,25 +12,16 @@ tags: []
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
 <!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 ## Uses
@@ -41,37 +31,85 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
 ### Downstream Use [optional]
 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
 ### Out-of-Scope Use
 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
 ## Bias, Risks, and Limitations
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
 ### Recommendations
 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
 Use the code below to get started with the model.
-[More Information Needed]
 ## Training Details

 library_name: transformers
 tags: []
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
 ### Model Description
+ActionCodec model trained on 3 embodiments:
+- franka_libero_20hz_1s
+- widowx_bridge_5hz_3s
+- franka_droid_15hz_1s
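
The embodiment names appear to encode a control frequency and a chunk duration (for example, `20hz_1s`), which would be consistent with the 20-step example chunk under Uses. The card does not state this convention, so the sketch below is only an assumption about the naming; `steps_per_chunk` is a hypothetical helper and not part of the model's API.

```python
import re

# Embodiment names as listed in the model description.
EMBODIMENTS = [
    "franka_libero_20hz_1s",
    "widowx_bridge_5hz_3s",
    "franka_droid_15hz_1s",
]


def steps_per_chunk(name: str) -> int:
    """Return frequency * duration, assuming a '<...>_<N>hz_<M>s' suffix."""
    match = re.search(r"_(\d+)hz_(\d+)s$", name)
    if match is None:
        raise ValueError(f"unrecognized embodiment name: {name}")
    hz, seconds = int(match.group(1)), int(match.group(2))
    return hz * seconds


if __name__ == "__main__":
    for name in EMBODIMENTS:
        print(name, steps_per_chunk(name))
```

Under this reading, the LIBERO embodiment would use 20 steps per chunk and the other two would use 15 each.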
 ### Model Sources [optional]
 <!-- Provide the basic links for the model. -->
+TODO
 ## Uses
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+```python
+import numpy as np
+from transformers import AutoModel
+np.set_printoptions(suppress=True)
+if __name__ == "__main__":
+    tokenizer = AutoModel.from_pretrained("ZibinDong/ActionCodec-Base-RVQft", trust_remote_code=True)
+    q99 = np.array([0.9375, 0.91071427, 0.9375, 0.20357142, 0.26357144, 0.375, 1.0])
+    q01 = np.array([-0.87857145, -0.87589288, -0.9375, -0.15107143, -0.20678571, -0.27964285, 0.0])
+    # an example action chunk (20 steps x 7 dimensions) from physical-intelligence/libero
+    action = np.array(
+        [
+            [0.3268, 0.2089, -0.3295, 0.0000, -0.0868, -0.0611, 1.0000],
+            [0.3696, 0.1955, -0.2866, 0.0000, -0.0793, -0.0643, 1.0000],
+            [0.3857, 0.1929, -0.2759, 0.0000, -0.0782, -0.0654, 1.0000],
+            [0.3964, 0.2089, -0.2786, 0.0000, -0.0761, -0.0654, 1.0000],
+            [0.3321, 0.1741, -0.3268, 0.0000, -0.0793, -0.0686, 1.0000],
+            [0.2250, 0.0964, -0.4232, 0.0000, -0.0932, -0.0761, 1.0000],
+            [0.0723, 0.0000, -0.5625, 0.0000, -0.1339, -0.0879, 1.0000],
+            [0.0536, 0.0000, -0.5652, 0.0000, -0.1521, -0.0921, 1.0000],
+            [0.0750, 0.0000, -0.5464, 0.0000, -0.1511, -0.0964, 1.0000],
+            [0.0723, 0.0000, -0.5411, 0.0000, -0.1414, -0.0986, 1.0000],
+            [0.0402, 0.0000, -0.5196, 0.0000, -0.1350, -0.1007, 1.0000],
+            [0.0080, 0.0000, -0.4795, 0.0000, -0.1189, -0.1018, 1.0000],
+            [0.0000, 0.0000, -0.4527, 0.0000, -0.0986, -0.1018, 1.0000],
+            [0.0000, 0.0000, -0.4313, 0.0000, -0.0846, -0.1018, 1.0000],
+            [-0.0455, -0.0268, -0.3509, 0.0000, -0.0568, -0.1018, 1.0000],
+            [-0.0964, -0.0482, -0.3321, 0.0000, -0.0439, -0.1039, 1.0000],
+            [-0.1768, -0.0562, -0.3402, 0.0000, -0.0300, -0.1050, 1.0000],
+            [-0.2438, -0.0429, -0.3187, 0.0000, -0.0193, -0.0996, 1.0000],
+            [-0.3054, -0.0054, -0.2893, 0.0000, -0.0139, -0.0932, 1.0000],
+            [-0.3509, 0.0000, -0.2598, 0.0000, -0.0054, -0.0879, 1.0000],
+        ],
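+    )[None]  # add a batch dimension: (20, 7) -> (1, 20, 7)
+    # normalization: divide every dimension except the last by max(|q99|, |q01|) for that dimension
+    normalized_action = np.copy(action)
+    normalized_action[..., :-1] = normalized_action[..., :-1] / np.maximum(np.abs(q99), np.abs(q01))[..., :-1]
+    normalized_action[..., -1] = normalized_action[..., -1] * 2.0 - 1.0  # map the last dimension from [0, 1] to [-1, 1]
+    normalized_action = normalized_action.clip(-1.0, 1.0)  # clamp any values outside the quantile range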
+    # tokenization
+    tokens = tokenizer.encode(normalized_action)  # numpy (b, n, d) -> list of ints
+    print(tokens)
+    # decoding
+    decoded_action, padding_mask = tokenizer.decode(tokens)  # list of ints -> numpy (b, n, d)
+    # calculate reconstruction error
+    mse_error = np.mean((normalized_action - decoded_action) ** 2)
+    l1_error = np.mean(np.abs(normalized_action - decoded_action))
+    print(f"Reconstruction MSE error: {mse_error:.6f}")
+    print(f"Reconstruction L1 error: {l1_error:.6f}")
+```
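
The normalization in the snippet above can be inverted for any value that was not clipped, which is useful for mapping decoded actions back to the original scale. The following is a minimal sketch under that assumption, reusing the same `q99`/`q01` statistics. `normalize` and `denormalize` are hypothetical helpers written for illustration and are not part of the ActionCodec API.

```python
import numpy as np

# Same statistics as in the usage snippet above.
q99 = np.array([0.9375, 0.91071427, 0.9375, 0.20357142, 0.26357144, 0.375, 1.0])
q01 = np.array([-0.87857145, -0.87589288, -0.9375, -0.15107143, -0.20678571, -0.27964285, 0.0])


def normalize(action: np.ndarray) -> np.ndarray:
    """Mirror of the normalization step in the usage snippet."""
    scale = np.maximum(np.abs(q99), np.abs(q01))
    out = np.copy(action)
    out[..., :-1] = out[..., :-1] / scale[:-1]
    out[..., -1] = out[..., -1] * 2.0 - 1.0  # [0, 1] -> [-1, 1]
    return out.clip(-1.0, 1.0)


def denormalize(normalized: np.ndarray) -> np.ndarray:
    """Hypothetical inverse; exact only where normalize() did not clip."""
    scale = np.maximum(np.abs(q99), np.abs(q01))
    out = np.copy(normalized)
    out[..., :-1] = out[..., :-1] * scale[:-1]
    out[..., -1] = (out[..., -1] + 1.0) / 2.0  # [-1, 1] -> [0, 1]
    return out


if __name__ == "__main__":
    # First step of the example chunk, shaped (batch, steps, dims).
    action = np.array([[[0.3268, 0.2089, -0.3295, 0.0, -0.0868, -0.0611, 1.0]]])
    restored = denormalize(normalize(action))
    print(np.allclose(action, restored))
```

In practice `denormalize` would be applied to `decoded_action`, so the result also carries the codec's reconstruction error in addition to any clipping loss.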
 ### Downstream Use [optional]
 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+TODO
 ### Out-of-Scope Use
 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+TODO
 ## Bias, Risks, and Limitations
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+TODO
 ### Recommendations
 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+TODO
 ## How to Get Started with the Model
 Use the code below to get started with the model.
+TODO
 ## Training Details