Yuna Sim

Welfare Compass v2 service main page showing category navigation and prioritized welfare policies

Welfare Compass v2 — Production Service

Team of 3 · Full-Stack Rebuild · Jan–Mar 2026

In Progress

Problem

The hackathon prototype had no persistent state and only a simplistic retrieval pipeline, so it could not scale to real users with diverse eligibility conditions.

Approach

Django + Next.js full-stack architecture with persistent user profiles and an eligibility engine
LangGraph ReAct Agent replacing sequential RAG with autonomous reasoning + tool use
Hybrid search pipeline: BM25 + Dense (BGE-m3) + Cross-encoder reranking
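
One common way to combine a BM25 ranking with a dense-embedding ranking is reciprocal rank fusion (RRF). The sketch below is illustrative only: the source does not say how the two result lists are merged, so RRF, the constant `k=60`, and the policy ids are all assumptions. A cross-encoder would then rescore the fused candidates as the final step.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first.

    Each document earns 1 / (k + rank) from every list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties on the id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of the two first-stage retrievers (assumed ids).
bm25_ranking = ["policy_housing", "policy_youth", "policy_childcare"]
dense_ranking = ["policy_youth", "policy_senior", "policy_housing"]

candidates = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(candidates)
```

Documents that both retrievers rank highly rise to the top, while a document found by only one retriever still stays in the candidate pool for the reranker.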

Outcome

Architecture designed and core modules under active development. Production deployment planned for March 2026.

Tech Stack

Django · Next.js · LangGraph · PostgreSQL · BM25 + Dense · BGE-m3

Business Impact: Evolving a hackathon prototype into a production-grade service — persistent conversation flow, eligibility-aware recommendations, and scalable architecture for real-world deployment.

View Full Case Study →

Welfare Compass v1 — Hackathon Prototype

Seoul AI Hackathon · Top 20 / 181 teams · 2025

Top 20 of 181 teams

Problem

300+ welfare policies scattered across documents — citizens struggle to determine eligibility, especially those with low digital literacy.

Approach

RAG pipeline using LangChain + FAISS for policy document retrieval
Natural language query processing to extract relevant policies and eligibility criteria
Streamlit-based conversational interface for accessible interaction
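
The retrieval step of a RAG pipeline like this comes down to nearest-neighbor search over document embeddings. The sketch below uses brute-force cosine similarity from the standard library as a stand-in for a FAISS index. The three-dimensional vectors and policy names are made up for illustration and are not taken from the project.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings (assumed); real ones would come from an embedding model.
index = {
    "youth_rent_support": [0.9, 0.1, 0.0],
    "senior_meal_service": [0.0, 0.2, 0.9],
    "childcare_subsidy": [0.1, 0.9, 0.1],
}
query = [0.8, 0.3, 0.0]  # hypothetical embedding of "rent help for young adults"

results = top_k(query, index)
print(results)
```

The retrieved policy passages would then be placed in the prompt so the language model can answer with the relevant eligibility criteria.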

Outcome

Demonstrated effective retrieval for complex policy queries, advancing to hackathon finals.

Tech Stack

RAG · LangChain · FAISS · GPT API · Streamlit

Business Impact: Reduces citizen service center workload by automating policy matching — a scalable RAG solution applicable to any document-heavy consultation workflow.

View Full Case Study → · Try Live Demo →

CatchUp application showing structured note output from VLM/LLM pipeline

CatchUp v2 — Study Material Structuring Pipeline

Solo Project · VLM/LLM Pipeline · Mar 2026

In Progress

Problem

Dropping a document into GPT gives a summary — but when you have a PDF, a notebook, and screenshots covering related concepts, nothing connects the knowledge across them.

Approach

Multi-format parsing pipeline: PDF (DoclingLoader), Jupyter notebooks (nbformat), images (VLM 5-class classification)
12 VLMs benchmarked across 8 axes, including reasoning vs. non-reasoning models and generational gap analysis
LLM-powered note generation with versioned prompts (v1.0→v1.4) and concept extraction with cross-document backlinking
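
Cross-document backlinking can be pictured as an inverted index from each extracted concept to the sources that mention it. The sketch below assumes concept extraction has already produced one list of concept names per file. The file names, the concepts, and the case-insensitive matching rule are illustrative assumptions, not the project's actual data model.

```python
from collections import defaultdict


def build_backlinks(doc_concepts):
    """Map each concept to the sorted list of documents that mention it.

    Only concepts shared by two or more documents are kept, since those
    are the ones that link material together.
    """
    concept_to_docs = defaultdict(set)
    for doc, concepts in doc_concepts.items():
        for concept in concepts:
            # Normalize case so "Gradient Descent" and "gradient descent" match.
            concept_to_docs[concept.lower()].add(doc)
    return {c: sorted(docs) for c, docs in concept_to_docs.items() if len(docs) > 1}


# Hypothetical extraction output, one entry per parsed source.
doc_concepts = {
    "lecture.pdf": ["Gradient Descent", "Learning Rate"],
    "lab.ipynb": ["gradient descent", "Batch Size"],
    "whiteboard.png": ["Learning Rate"],
}

backlinks = build_backlinks(doc_concepts)
print(backlinks)
```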

Outcome

Core parsing and the VLM client are operational, with a 10-model unified API, cost tracking, and structured JSON output. Experiments and the RAG pipeline are under active development.

Tech Stack

VLM (12 models) · LangChain · ChromaDB · Qwen2-VL · Streamlit

Business Impact: Transforms scattered study materials into a searchable, concept-linked knowledge base — demonstrating multi-modal document AI pipeline design with systematic model evaluation across 12 VLMs.

View Full Case Study →

Color comparison between Korean remake and Spanish original versions of Money Heist

Drama Color Pattern Analysis

Published · Journal of the Korean Data Analysis Society · Feb 2025

1.5× preference (p<.05)

Problem

Assessments of how color affects viewers' emotional engagement relied on subjective judgment, and no quantitative analysis existed for cross-cultural comparison.

Approach

ConvNeXtLarge for scene clustering across drama episodes
HSV/LAB color metrics comparing Korean vs Spanish versions of Money Heist
User survey (n=86) for preference validation
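
As a rough illustration of an HSV-based color metric, the sketch below averages saturation and value over a set of RGB pixels using Python's standard `colorsys` module. The pixel samples are made up, and the study's actual metrics are not specified here, so treat this as one possible measurement rather than the published method. Hue is left out because it is an angle and a plain arithmetic mean of hue values is not meaningful.

```python
import colorsys


def mean_saturation_value(pixels):
    """Average HSV saturation and value over a non-empty list of 8-bit RGB pixels."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    total_s = total_v = 0.0
    for r, g, b in pixels:
        # colorsys expects channel values in the range [0, 1].
        _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_s += s
        total_v += v
    n = len(pixels)
    return total_s / n, total_v / n


# Made-up pixel samples standing in for frames from two versions of a scene.
vivid_frame = [(255, 0, 0), (255, 128, 0)]
muted_frame = [(128, 128, 128), (100, 110, 120)]

vivid_s, vivid_v = mean_saturation_value(vivid_frame)
muted_s, muted_v = mean_saturation_value(muted_frame)
print(vivid_s, vivid_v, muted_s, muted_v)
```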

Outcome

Korean color grading preferred 1.5× overall across the full sample (p<.05). First author publication.

Tech Stack

ConvNeXtLarge · CIELAB · HSV Analysis · User Study

Business Impact: Data-driven framework for visual content optimization — applicable to automated color grading pipelines, content recommendation, and A/B testing visual assets at scale.

View Full Case Study →

Welding Defect Detection

Presented at IPIU 2024 · First Author

87.5% mAP score

Problem

No real defect images are available from manufacturing environments because of security and cost constraints, which leaves a severe shortage of training data.

Approach

CycleGAN for domain adaptation (X-ray radiographs → inspection images)
Preprocessing pipeline (contrast enhancement, wavelet filtering) to suppress artifacts
YOLOv5 with two-stage transfer learning (COCO → synthetic defects)

Outcome

87.5% mAP achieved on synthetic-to-real defect detection. First author conference paper.
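
For context on the metric: mAP is built on matching predicted boxes to ground-truth boxes by intersection over union (IoU). The sketch below shows only that building block. The box coordinates and the 0.5 match threshold are assumptions, since the text does not state which mAP variant was reported.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth defect boxes, in pixels.
predicted = (10, 10, 50, 50)
ground_truth = (30, 30, 70, 70)

overlap = iou(predicted, ground_truth)
is_match = overlap >= 0.5  # assumed threshold for counting a true positive
print(overlap, is_match)
```

A detection counts as a true positive only when its IoU with an unmatched ground-truth box clears the threshold. Precision and recall are computed from those matches and then averaged into mAP.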

Tech Stack

CycleGAN · YOLOv5 · Domain Adaptation · Transfer Learning

Business Impact: Eliminates dependency on restricted real-world defect data — enabling automated quality inspection deployment in manufacturing environments where labeled data is unavailable.

View Full Case Study →

Let's connect