---
license: apache-2.0
tags:
- code
- code-refactoring
- bug-detection
- code-translation
- static-analysis
- transformer
- developer-tools
language:
- code
pipeline_tag: other
model_type: transformer
library_name: transformers
datasets:
- custom
trained_on:
- multi-language code repositories
- refactor pairs
- bugfix pairs
- conversion pairs
---
# Universal Code Refactor 32B
Universal Code Refactor 32B is a complete **AI-driven code engineering system** designed to automate large-scale refactoring, bug discovery, language-to-language conversion, and code optimization.
The project includes a full toolkit: **model**, **pipelines**, **refactor engine**, **bug detector**, **conversion engine**, **API**, **CLI**, **Gradio UI**, **datasets**, and **training scripts**.
# Features
## 1. Multi-Language Code Refactoring
Supports intelligent transformations for multiple languages:
- **Python**
- **Java**
- **JavaScript**
Includes:
- Automatic formatting (Black + isort)
- Unused import removal
- Inlining of simple functions
- Java loop modernization → for-each syntax
- JavaScript `var → let` transformation
- Structural code cleanup
- Rule-based + AST-based hybrid refactoring
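
As an illustration of the AST-based side of this list, unused import removal for Python can be sketched with the standard `ast` module. This is a minimal sketch and not the project's `refactor_engine.py`: the function name `remove_unused_imports` is hypothetical, only top-level imports are considered, and `ast.unparse` (Python 3.9+) discards comments and original formatting.

```python
import ast


def remove_unused_imports(source: str) -> str:
    """Drop top-level imports whose bound names are never referenced (sketch)."""
    tree = ast.parse(source)

    # Every bare name that appears anywhere in the module, e.g. `os` in `os.getcwd()`.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

    kept = []
    for stmt in tree.body:
        is_future = isinstance(stmt, ast.ImportFrom) and stmt.module == "__future__"
        if isinstance(stmt, (ast.Import, ast.ImportFrom)) and not is_future:
            # `import a.b` binds `a`; `import a as b` binds `b`; keep `*` imports.
            stmt.names = [
                alias for alias in stmt.names
                if alias.name == "*"
                or (alias.asname or alias.name.split(".")[0]) in used
            ]
            if not stmt.names:
                continue  # nothing left to import, so drop the whole statement
        kept.append(stmt)

    tree.body = kept
    return ast.unparse(tree)
```

Calling `remove_unused_imports(open("sample.py").read())` returns the rewritten source as a string; a production pass would also need to respect `__all__` and re-exports.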
## 2. Static Bug Detection
AST-based detection of:
- Possible None/null dereferences
- Unused variables
- Unsafe JavaScript `eval()` usage
- Missing null checks in Java
- Type-based reasoning (planned)
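
To make one of these checks concrete, the following is a minimal sketch of unused-variable detection for Python using the standard `ast` module. The function name `find_unused_variables` is hypothetical and this is not the code in `bug_detector.py`; it is a heuristic that ignores `global`/`nonlocal` declarations and does not separate the scopes of nested functions.

```python
import ast


def find_unused_variables(source: str) -> list[tuple[str, int]]:
    """Return (name, line) pairs for names assigned in a function but never read."""
    findings = set()
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        assigned = {}   # name -> earliest line where it is assigned
        loaded = set()  # names that are read (or deleted) inside the function
        for node in ast.walk(func):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Store):
                    first = assigned.get(node.id, node.lineno)
                    assigned[node.id] = min(first, node.lineno)
                else:
                    loaded.add(node.id)
        findings.update(
            (name, line) for name, line in assigned.items() if name not in loaded
        )
    # Sort by line number, then by name, for stable output.
    return sorted(findings, key=lambda item: (item[1], item[0]))
```

Function parameters are `ast.arg` nodes rather than `ast.Name` nodes, so unused parameters are deliberately not reported by this sketch.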
## 3. Multi-Language Code Conversion
Built-in conversions:
- **Python → Java**
- **Java → Python**
Supports:
- Function extraction
- `main()` generation
- Basic block translation
- Extendable conversion rules
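
Function extraction and `main()` generation for the Python → Java direction could look roughly like the sketch below. It is an assumption about the general approach, not the rules in `code_converter.py`: the function name `python_to_java_skeleton` and the default class name `Converted` are hypothetical, bodies are left as stubs, and every parameter and return value is typed `Object` because no type inference is attempted.

```python
import ast


def python_to_java_skeleton(source: str, class_name: str = "Converted") -> str:
    """Emit a Java class with one static method stub per top-level Python function."""
    lines = [f"public class {class_name} {{"]
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            # Positional parameters only; defaults, *args and **kwargs are ignored.
            params = ", ".join(f"Object {arg.arg}" for arg in node.args.args)
            lines.append(f"    static Object {node.name}({params}) {{")
            lines.append("        // TODO: translated body goes here")
            lines.append("        return null;")
            lines.append("    }")
            lines.append("")
    # Generated entry point so the class compiles and runs on its own.
    lines.append("    public static void main(String[] args) {")
    lines.append("    }")
    lines.append("}")
    return "\n".join(lines)
```

For `def add(a, b): return a + b`, this yields a `Converted` class containing `static Object add(Object a, Object b)` and an empty `main`.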
## 4. Patch & Diff Generation
The automated patch engine produces:
- Unified diffs
- Patch previews
- Patch cleanliness scores
- Complexity reduction metrics
Useful for PR automation and CI pipelines.
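
A unified diff of the kind listed above can be produced with Python's standard `difflib` module. The sketch below is illustrative and not the code in `patch_generator.py`; the helper names `make_unified_diff` and `changed_line_count` and the default path `example.py` are hypothetical, and the line count is only a crude stand-in for the cleanliness score, whose definition this README does not give.

```python
import difflib


def make_unified_diff(before: str, after: str, path: str = "example.py") -> str:
    """Return a unified diff between two versions of one file as a single string."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


def changed_line_count(patch: str) -> int:
    """Count added and removed lines in a single-file patch."""
    body = patch.splitlines()[2:]  # skip the '---' and '+++' header lines
    return sum(1 for line in body if line.startswith(("+", "-")))
```

An empty string means the two versions are identical, which makes the result easy to use as a guard in a CI step.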
## 5. Compact Transformer Code Model
The model includes:
- Token embedding
- Positional encoding
- Transformer encoder stack
- Code-token-aware tokenizer
- Modular upgrade path to LLaMA / CodeGen / StarCoder models
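
The first two components can be illustrated in plain Python. This README does not say which positional encoding `model.py` uses, so the sketch below assumes the fixed sinusoidal scheme from the original Transformer; the function names `sinusoidal_positions` and `embed` are hypothetical, and a real model would use tensors rather than nested lists.

```python
import math


def sinusoidal_positions(seq_len: int, d_model: int) -> list[list[float]]:
    """Return a seq_len x d_model table of fixed sinusoidal position encodings."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each sin/cos pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / (10000 ** ((i - i % 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def embed(token_ids: list[int], embedding: list[list[float]]) -> list[list[float]]:
    """Look up each token's vector and add the encoding for its position."""
    d_model = len(embedding[0])
    positions = sinusoidal_positions(len(token_ids), d_model)
    return [
        [e + p for e, p in zip(embedding[tok], positions[idx])]
        for idx, tok in enumerate(token_ids)
    ]
```

The output of `embed` is what an encoder stack would consume: one `d_model`-sized vector per input token, carrying both token identity and position.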
## 6. Deployment Ecosystem
Ready-to-run components are included:
### FastAPI REST Server
```
uvicorn inference.api_server:app --reload
```
### CLI Tool
```
python inference/cli.py --mode refactor --file example.py
```
### Gradio Web UI
```
python inference/gradio_app.py
```
### Docker Container
```
docker build -f deployment/Dockerfile -t universal-refactor .
docker run -p 8000:8000 universal-refactor
```
### Hugging Face Spaces App
Located in `/deployment/huggingface_spaces/`.
# Project Structure
```
Universal-Code-Refactor-32B/
│
├── README.md
├── requirements.txt
├── MODEL_CARD.md
│
├── src/universal_refactor/
│   ├── refactor_engine.py
│   ├── bug_detector.py
│   ├── code_converter.py
│   ├── patch_generator.py
│   ├── pipelines.py
│   ├── tokenizer.py
│   ├── model.py
│   ├── long_context_manager.py
│   ├── utils.py
│   └── embeddings/
│
├── inference/
│   ├── api_server.py
│   ├── cli.py
│   └── gradio_app.py
│
├── deployment/
│   ├── Dockerfile
│   └── huggingface_spaces/
│
├── training/
│   ├── pretrain.py
│   ├── finetune_refactor.py
│   ├── finetune_bugfix.py
│   ├── tokenizer_training.py
│   ├── long_context_training.py
│   └── distributed/
│
└── datasets/
    ├── code_repo_raw/
    ├── multilingual_code_clean/
    ├── refactor_pairs/
    ├── bugfix_pairs/
    ├── conversion_pairs/
    └── metadata.json
```
# Installation
## 1. Clone Repository
```
git clone https://github.com/YOUR_USERNAME/universal-code-refactor-32b
cd universal-code-refactor-32b
```
## 2. Install Dependencies
```
pip install -r requirements.txt
```
# Usage Examples
## Refactor Python Code
```
python inference/cli.py --mode refactor --file sample.py --lang python
```
## Convert Java → Python
```
python inference/cli.py --mode convert --file MyClass.java --src java --tgt python
```
## Run Web UI
```
python inference/gradio_app.py
```
# Evaluation Tools
The evaluation pipeline computes:
- Cyclomatic complexity reduction
- Patch cleanliness
- Code change metrics
- Structural improvement score
Run evaluation:
```
python evaluation/evaluate.py
```
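
For reference, cyclomatic complexity reduction can be approximated with the standard `ast` module by counting decision points before and after a refactor. This is a simplified sketch and not the evaluation script itself: the names `cyclomatic_complexity` and `complexity_reduction` are hypothetical, and the count covers whole modules rather than individual functions.

```python
import ast

# Node types that add one decision point each in this simplified count.
_BRANCH_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While,
    ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity as 1 + the number of decision points."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit branches.
            score += len(node.values) - 1
    return score


def complexity_reduction(before: str, after: str) -> int:
    """Positive values mean the refactored code has fewer decision points."""
    return cyclomatic_complexity(before) - cyclomatic_complexity(after)
```

A straight-line function scores 1, and each `if`, loop, `except` clause, conditional expression, or comprehension clause adds one.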