Yuna Sim

AI Engineer · LLM Agent · Computer Vision

Building end-to-end AI solutions — from LLM agent systems and RAG pipelines to computer vision for real-world deployment

L*66 a*14 b*28
L*50 a*5 b*20
L*18 a*0 b*0
L*68 a*5 b*36
L*93 a*0 b*10

Color palette inspired by the Korean drama color grading preference study (n=86, p<.05)

About

I'm an AI engineer who builds end-to-end intelligent systems — from RAG-based chatbots for public service navigation, to computational analysis pipelines for visual data, to computer vision systems for industrial quality inspection.

I hold an M.S. in AI Convergence and a dual B.S./B.A. in Electronics Engineering and Multimedia from Ewha Womans University. My focus is on turning research into deployable solutions — designing LLM pipelines, training and optimizing CV models, and integrating AI into real-world workflows.

Research Interests

LLM / RAG · Multimodal AI · Computer Vision · OCR / Document AI

4 Publications · 2 Patents Filed · Top 20 / 181 teams · 87.5% Defect Detection mAP

Projects
Welfare Compass v2 service main page showing category navigation and prioritized welfare policies

Welfare Compass v2 — Production Service

Team of 3 · Full-Stack Rebuild · Jan–Mar 2026

In Progress

Problem

The hackathon prototype had no persistent state and only a simple retrieval pipeline, so it couldn't scale to real users with diverse eligibility conditions.

Approach

  • Django + Next.js full-stack architecture with persistent user profiles and eligibility engine
  • LangGraph ReAct Agent replacing sequential RAG with autonomous reasoning + tool use
  • Hybrid search pipeline: BM25 + Dense (BGE-m3) + Cross-encoder reranking
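
The hybrid search step above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes the BM25 and dense rankings are merged with reciprocal rank fusion (the page does not say how the two are combined), and the cross-encoder is stood in for by any `score_fn(query, doc_id)` callable. All document IDs and function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of doc IDs; each doc scores sum(1 / (k + rank))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def rerank(query: str, candidates: Sequence[str],
           score_fn: Callable[[str, str], float], top_n: int = 3) -> List[str]:
    """Re-order fused candidates with a (placeholder) relevance scorer."""
    return sorted(candidates, key=lambda d: score_fn(query, d), reverse=True)[:top_n]


# Hypothetical rankings from the two retrievers (assumption, not real data).
bm25_ranking = ["p1", "p2", "p3"]
dense_ranking = ["p3", "p1", "p4"]
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

In this toy example `p1` and `p3` rise to the top because both retrievers return them, which is the behaviour a hybrid pipeline is meant to exploit before the more expensive reranking pass.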

Outcome

Architecture designed and core modules under active development. Production deployment planned for March 2026.

Tech Stack

Django · Next.js · LangGraph · PostgreSQL · BM25 + Dense · BGE-m3

Business Impact: Evolving a hackathon prototype into a production-grade service — persistent conversation flow, eligibility-aware recommendations, and scalable architecture for real-world deployment.

Welfare Compass v1 — Hackathon Prototype

Seoul AI Hackathon · Top 20 / 181 teams · 2025

Problem

300+ welfare policies scattered across documents — citizens struggle to determine eligibility, especially those with low digital literacy.

Approach

  • RAG pipeline using LangChain + FAISS for policy document retrieval
  • Natural language query processing to extract relevant policies and eligibility criteria
  • Streamlit-based conversational interface for accessible interaction
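
The retrieval core of such a RAG pipeline can be sketched as below. In the project a FAISS index performs the vector search; here a brute-force cosine similarity stands in for it so the example stays self-contained. The embeddings, policy names, and prompt wording are made up for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec: Sequence[float], index: Dict[str, Sequence[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k most similar (doc_id, score) pairs, best first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Assemble retrieved passages into a grounded prompt for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the policies below.\n{context}\nQuestion: {question}"
```

A caller would embed the user's question, pass the vector to `top_k`, and hand the matching policy texts to `build_prompt` before calling the language model.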

Outcome

Demonstrated effective retrieval for complex policy queries, advancing to hackathon finals.

Tech Stack

RAG · LangChain · FAISS · GPT API · Streamlit

Business Impact: Reduces citizen service center workload by automating policy matching — a scalable RAG solution applicable to any document-heavy consultation workflow.

CatchUp application showing structured note output from VLM/LLM pipeline

CatchUp v2 — Study Material Structuring Pipeline

Solo Project · VLM/LLM Pipeline · Mar 2026

In Progress

Problem

Dropping a document into GPT gives a summary — but when you have a PDF, a notebook, and screenshots covering related concepts, nothing connects the knowledge across them.

Approach

  • Multi-format parsing pipeline: PDF (DoclingLoader), Jupyter notebooks (nbformat), images (VLM 5-class classification)
  • 12 VLMs benchmarked across 8 axes, including reasoning vs. non-reasoning and generational gap analysis
  • LLM-powered note generation with versioned prompts (v1.0→v1.4) and concept extraction with cross-document backlinking
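
Cross-document backlinking, the last step above, can be illustrated with a small inverted index. This sketch assumes concept extraction has already produced a set of concept strings per source file (in the project that part is LLM-driven); the file names and the case-insensitive matching rule are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_concept_index(doc_concepts: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each normalized concept to the sorted list of documents mentioning it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc, concepts in doc_concepts.items():
        for concept in concepts:
            index[concept.strip().lower()].add(doc)
    return {concept: sorted(docs) for concept, docs in index.items()}


def backlinks(doc_concepts: Dict[str, Set[str]]) -> Dict[str, Dict[str, List[str]]]:
    """For each document, list the other documents that share each concept."""
    index = build_concept_index(doc_concepts)
    links: Dict[str, Dict[str, List[str]]] = {doc: {} for doc in doc_concepts}
    for concept, docs in index.items():
        if len(docs) < 2:
            continue  # a concept seen in only one document has nothing to link to
        for doc in docs:
            links[doc][concept] = [other for other in docs if other != doc]
    return links
```

With this structure, a note generated from a PDF can point to the notebook or screenshot that covers the same concept, which is the connection a single-document summary cannot provide.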

Outcome

Core parsing + VLM client operational with 10-model unified API, cost tracking, and structured JSON output. Experiment and RAG pipeline under active development.

Tech Stack

VLM (12 models) · LangChain · ChromaDB · Qwen2-VL · Streamlit

Business Impact: Transforms scattered study materials into a searchable, concept-linked knowledge base — demonstrating multi-modal document AI pipeline design with systematic model evaluation across 12 VLMs.

Color comparison between Korean remake and Spanish original versions of Money Heist

Drama Color Pattern Analysis

Published · Journal of the Korean Data Analysis Society · Feb 2025

1.5× preference (p<.05)

Problem

Color's impact on viewer emotional engagement was based on subjective judgment — no quantitative analysis existed for cross-cultural comparison.

Approach

  • ConvNeXtLarge for scene clustering across drama episodes
  • HSV/LAB color metrics comparing Korean vs Spanish versions of Money Heist
  • User survey (n=86) for preference validation
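
A LAB-based color comparison like the one above can be sketched as follows. The page does not state which conversion or color-difference formula the study used, so this assumes the standard sRGB (D65) to CIELAB conversion and the CIE76 difference (Euclidean distance in LAB). The function names and pixel values are illustrative only.

```python
import math
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]
LAB = Tuple[float, float, float]
D65_WHITE = (0.95047, 1.0, 1.08883)  # reference white (X, Y, Z)


def _srgb_to_linear(c: float) -> float:
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t: float) -> float:
    """CIELAB companding function."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def srgb_to_lab(rgb: RGB) -> LAB:
    """Convert an 8-bit sRGB color to CIELAB (D65 white point)."""
    r, g, b = (_srgb_to_linear(v / 255.0) for v in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_f(v / n) for v, n in zip((x, y, z), D65_WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def mean_lab(pixels: Iterable[RGB]) -> LAB:
    """Average LAB color of a collection of sRGB pixels (e.g. one frame)."""
    labs = [srgb_to_lab(p) for p in pixels]
    n = len(labs)
    return (sum(l[0] for l in labs) / n,
            sum(l[1] for l in labs) / n,
            sum(l[2] for l in labs) / n)


def delta_e76(lab1: LAB, lab2: LAB) -> float:
    """CIE76 color difference: Euclidean distance in LAB space."""
    return math.dist(lab1, lab2)
```

Comparing `mean_lab` of matched scenes from two versions with `delta_e76` gives one number per scene pair, which is one simple way to quantify a difference in grading.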

Outcome

Korean color grading was preferred 1.5× over the Spanish original across the full sample (n=86, p<.05). First-author publication.

Tech Stack

ConvNeXtLarge · CIELAB · HSV Analysis · User Study

Business Impact: Data-driven framework for visual content optimization — applicable to automated color grading pipelines, content recommendation, and A/B testing visual assets at scale.

Welding defect detection results showing crack, porosity, and spatter detection

Welding Defect Detection

Presented at IPIU 2024 · First Author

87.5% mAP score

Problem

No real defect images were available from manufacturing environments because of security and cost constraints, which left a severe shortage of training data.

Approach

  • CycleGAN for domain adaptation (X-ray radiographs → inspection images)
  • Preprocessing pipeline (contrast enhancement, wavelet filtering) to suppress artifacts
  • YOLOv5 with two-stage transfer learning (COCO → synthetic defects)

Outcome

87.5% mAP achieved on synthetic-to-real defect detection. First author conference paper.
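
For readers unfamiliar with the metric, the sketch below shows how average precision (AP) for one class can be computed from box overlaps; mAP is the mean of AP over classes. The page does not give the IoU threshold or interpolation scheme behind the 87.5% figure, so a 0.5 threshold and all-point interpolation are assumed here, and the example is simplified to a single image.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds: Sequence[Tuple[float, Box]], gts: Sequence[Box],
                      thr: float = 0.5) -> float:
    """AP for one class on one image; preds are (confidence, box) pairs."""
    if not gts:
        return 0.0
    matched = set()
    hits: List[int] = []
    # Visit predictions from most to least confident; each GT box matches once.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j not in matched and iou(box, gt) > best_iou:
                best_iou, best_j = iou(box, gt), j
        if best_iou >= thr:
            matched.add(best_j)
            hits.append(1)
        else:
            hits.append(0)
    precisions, recalls, tp = [], [], 0
    for i, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / i)
        recalls.append(tp / len(gts))
    # Make precision non-increasing from right to left (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

A full evaluation would pool predictions across all test images per class before ranking them, then average the per-class AP values.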

Tech Stack

CycleGAN · YOLOv5 · Domain Adaptation · Transfer Learning

Business Impact: Eliminates dependency on restricted real-world defect data — enabling automated quality inspection deployment in manufacturing environments where labeled data is unavailable.

Publications
01
"A Preference Analysis of Color: A Pilot Study Comparing the Drama Money Heist Original and Remake Versions"
Yuna Sim, Hoyoung Yoon
Journal of the Korean Data Analysis Society, Feb 2025
Journal · 1st Author
02
"Global Illumination Estimation Using Local Patch Selection"
Yuna Sim, Hyejin Oh, Jewon Kang
The Korean Institute of Broadcast and Media Engineers (KIBME), Nov 2024
Conference · 1st Author
03
"Deep Learning-Based Fast Auto White Balancing Algorithm Development for Multi-View Video"
Hyejin Oh, Yuna Sim, Jewon Kang
The Korean Institute of Broadcast and Media Engineers (KIBME), Jun 2024
Conference
04
"Welding defect detection using synthetic data and object recognition algorithms"
Yuna Sim, Jewon Kang
Image Processing and Image Understanding (IPIU), Feb 2024
Conference · 1st Author
Patents
P1
"Global Illumination Estimation Method Based on Local Illumination Estimation and Image Processing Apparatus"
Yuna Sim, Hyejin Oh, Jewon Kang
Korean Patent Application No. 10-2024-0164675, Filed 2024
P2
"Auto White Balancing Method for Multi-View Video and Image Processing Apparatus"
Yuna Sim, Hyejin Oh, Jewon Kang
Korean Patent Application No. 10-2024-0162001, Filed 2024
Contact

Let's connect

I'm looking for opportunities in AI engineering, LLM solution development, and AI consulting — open to roles where I can design and deploy intelligent systems that solve real business problems.