Vu Hoang Anh | AI Engineer Portfolio

~/career-objective

Research-oriented AI Engineer with hands-on engineering capabilities. My focus areas include:

Machine Learning — Deep understanding of algorithms and model architectures
Natural Language Processing — Text processing, embeddings, and transformers
Large Language Models — Fine-tuning, evaluation, and production deployment

Seeking an internship to contribute to real AI projects while deepening expertise in LLM architecture and reasoning capabilities.

objective.py

class AIEngineer:
    def __init__(self):
        self.focus = ["ML", "NLP", "LLMs"]
        self.mindset = "research + engineering"
    
    def goals(self):
        # Not "learning AI for fun"
        return {
            "contribute": "real AI projects",
            "deep_dive": "LLM architecture",
            "master": "reasoning capabilities"
        }

~/education

Posts and Telecommunications Institute of Technology (PTIT)

Engineer of Information Technology — High-Quality Program

Aug 2023 – Apr 2028 (Expected)

3.86 / 4.00

CGPA

Scholarships

Fall 2023–2024

Academic Excellence

Spring 2023–2024

Academic Excellence

Spring 2024–2025

Academic Excellence

Fall 2024–2025

Academic Excellence

~/experience

AI & Backend Developer

PTIT IEC — Part-time

Oct 2024 – Present

Participated in academic and startup lab projects involving applied AI and automation
Optimized and refactored legacy research code, focusing on runtime performance and maintainability
Integrated custom LLM modules into backend systems to enhance context-aware response capabilities

Not a typo-fixing intern: real backend + AI integration work

~/projects

Selected projects demonstrating research + production capabilities

Code Vulnerability Detection with LLMs

Apr 2025 – Sep 2025

Lead Researcher

Research + production pipeline for detecting and explaining security vulnerabilities in source code using LLMs.

0.80

GraphCodeBERT Accuracy

0.82

F1-Score

78.7%

Qwen2.5 Accuracy

64.3%

Explanation Clarity

GPT-4o-mini GraphCodeBERT Qwen2.5-Coder-14B LoRA SFT GGUF Docker

Technical Details

• Collected 30,000 code samples from DiverseVul (20k vulnerable, 10k non-vulnerable)
• Oversampled the non-vulnerable class to balance the dataset (20k vs. 20k)
• Generated structured explanations via GPT-4o-mini API for classification and explanation tasks
• Fine-tuned GraphCodeBERT and, separately, Qwen2.5-Coder-14B with LoRA SFT
• Quantized to GGUF using llama.cpp, containerized with Docker
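
The balancing step above can be illustrated with a minimal sketch of random oversampling. This is not the project's actual code: the function name `oversample_minority`, the fixed seed, and the toy data (which only mirrors the 2:1 class ratio at a much smaller scale) are assumptions made for illustration.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw with replacement to fill the gap; original samples are kept.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data with the same 2:1 ratio as the dataset described above.
samples = [f"vuln_{i}" for i in range(20)] + [f"safe_{i}" for i in range(10)]
labels = [1] * 20 + [0] * 10

balanced_samples, balanced_labels = oversample_minority(samples, labels)
print(Counter(balanced_labels))
```

Oversampling is best applied to the training split only, so that duplicated samples do not leak into validation or test data.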

Efficient RAG Pipeline

Jan 2026 – Present

AI Engineer

Production-oriented RAG system with custom model selection, hybrid retrieval, and comprehensive evaluation.

0.98

RAGAS Faithfulness

0.99

Answer Relevance

# RAG Architecture

Input → Dynamic Chunking → Hybrid Retrieval
    ├─ Milvus (Vector)
    ├─ Elasticsearch (Keyword)
    └─ RRF Fusion
→ GGUF Trio (Embed/Rerank/Gen) → Output

llama.cpp Milvus Elasticsearch RRF RAGAS GGUF

Technical Details

• Hand-picked GGUF model trio: Embedding, Reranking, Generation
• Full llama.cpp inference stack
• Dynamic hierarchical chunking strategy
• Hybrid retrieval: Milvus (vector) + Elasticsearch (keyword)
• Reciprocal Rank Fusion for result combination
• Evaluated with RAGAS on AWS RAG documentation (Gemini 3 Pro)
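
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists, one standing in for the vector results and one for the keyword results. The function name, the document IDs, and the constant `k = 60` (a commonly used default) are assumptions for illustration, not details taken from this pipeline.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k = 60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. from the vector store
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # e.g. from the keyword index

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only rank positions, it does not require the similarity scores of the two retrievers to be on a comparable scale.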

Tea Leaves Intelligent System

Feb 2025 – May 2025

Team Leader (team of 4)

IoT architecture for real-time tea harvest management with enterprise-grade backend security.

Real-time IoT data collection via MQTT
Spring Security + JWT authentication
Automatic weight recording system
Farmer ID tracking integration

# Tech Stack

backend:
  - Spring Boot
  - Spring Security
  - JWT
iot:
  - MQTT
  - Real-time sensors
data:
  - MySQL
  - Docker

Demonstrates system thinking beyond pure AI — enterprise backend + IoT integration

~/skills

Programming

Python C++ Java SQL

LLM & AI

Transformers PyTorch Unsloth llama.cpp Ollama LangChain

Frameworks

FastAPI Spring Boot

Databases

MySQL PostgreSQL Milvus