SangminLee-NOTA committed
Commit 51ad96b · verified · 1 Parent(s): 90b2638

Update README.md

Files changed (1)
  1. README.md +8 -7
README.md CHANGED
@@ -10,7 +10,7 @@ pinned: false
 <div align="center">
 <img src="https://netspresso-docs-imgs.s3.ap-northeast-2.amazonaws.com/imgs/banner/huggingfacenota.png"
 alt="Nota AI Banner"
-style="width: 100%; height: 200px; object-fit: cover;" />
+style="width: 100%; height: auto; max-width: 100%;" />
 </div>
 
 <br>
@@ -33,13 +33,14 @@ From our automated **optimization platform** to bespoke **AI solutions**, we ens
 > ## **World Best LLM (WBL) Project**
 > Nota AI participates in the **'World Best LLM' (WBL)** project, a key initiative by the South Korean government (NIPA) to develop global-tier foundation models. As a core optimization partner, we focus on compressing massive LLMs for practical deployment.
 >
-> ## **🔥 New Release: [Qwen3-30B-A3B-NotaMoEQuant-Int4](https://huggingface.co/nota-ai/Qwen3-30B-A3B-NotaMoEQuant-Int4)**
-> **4-bit Quantization for Mixture-of-Experts (MoE)**
+> ## **🔥 New Release: [Solar-Open-100B-NotaMoEQuant-Int4](https://huggingface.co/nota-ai/Solar-Open-100B-NotaMoEQuant-Int4)**
+> **Quantized Model for Upstage's Solar-Open-100B**
 >
-> This model demonstrates our proprietary **NotaMoEQuant** technology applied to the Qwen3-30B architecture.
-> * **Optimization Tech:** NotaMoEQuant (Int4 Quantization for Active Parameters).
-> * **Key Benefit:** Significantly reduces memory bandwidth requirements while maintaining reasoning capabilities of the 30B MoE model.
-> * **Target:** Efficient inference on consumer-grade GPUs and edge servers.
+> This model is optimized using our proprietary **NotaMoEQuant**, a specialized methodology for Mixture-of-Experts (MoE) architectures.
+> * **Why NotaMoEQuant:** Unlike conventional methods (e.g., AutoRound) that overlook expert routing changes during quantization, our approach directly resolves the resulting representational distortion, delivering superior benchmark accuracy.
+> * **Hardware Efficiency:** Reduces the GPU requirement for maximum context generation from **4x A100 (80GB) to 2x A100 (80GB)**, saving up to 50% on inference costs.
+>
+> *Also available: [Solar-Open-100B-Nota-FP8](https://huggingface.co/nota-ai/Solar-Open-100B-Nota-FP8)*
 
 # 🚀 Our Core Business
 <table border="0" cellspacing="0" cellpadding="0" style="border: none; border-collapse: collapse; width: 100%;">
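
For reference, a minimal sketch of loading the Int4 checkpoint named in this diff. It assumes the repo ships a standard `transformers`-compatible quantization config; the commit itself does not confirm the intended serving stack (vLLM or another engine may be preferred), so treat this as illustrative only:

```python
# Hypothetical usage sketch: load the released Int4 model with transformers.
# Assumption (not confirmed by this commit): the checkpoint carries its own
# quantization config, so from_pretrained can restore it directly.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nota-ai/Solar-Open-100B-NotaMoEQuant-Int4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the dtype/quantization baked into the checkpoint
    device_map="auto",   # shard across available GPUs (the README claims 2x A100 80GB suffice)
)

prompt = "Explain Mixture-of-Experts quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```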