Publications

Research papers, technical reports, benchmark studies, whitepapers, and case studies on safe, secure, and trustworthy AI deployment.

10 publications

AEGIS-RP-2026-002 · Research Paper

Analyzing the Expectation-Reality Gap in Enterprise AI Agent Adoption: Focusing on Trust, Governance, and ROI

Kwangil Kim, AEGIS Research Team

This study systematically analyzes the expectation-reality gap that arises when AI agents are adopted in enterprise environments. Performing triangulation across 24 sources published in 2024–2026, including industry reports, academic papers, and product user feedback, we analyze the gap along eight core axes: governance, ROI, security, trust, integration, talent, ethics, and regulation. The governance gap is the largest (92% awareness vs. 21% mature models), followed by ROI (171% expected vs. 15% EBITDA improvement) and security (82% confidence vs. 14.4% formal approval). The Korean market shows distinctive characteristics: a talent shortage (49.8%), a low SME adoption rate (5.3%), and a government-led model. We propose an enterprise AI agent trust model integrating transparency, capability awareness, and cultural context, together with a four-stage adoption maturity framework.

AI-agent, enterprise-adoption, expectation-reality-gap, trust

AEGIS-RP-2026-003 · Research Paper

Analyzing the Expectation-Reality Gap in AI Agent Adoption for Education: Focusing on Learning Effectiveness, Academic Integrity, and Equity

Kwangil Kim, AEGIS Research Team

AI agent adoption is spreading rapidly in education. As of 2025, 92% of students worldwide use AI, yet a substantial gap remains between expectations for educational effectiveness and reality. Performing triangulation across 28 sources published in 2024–2026, including industry reports, academic papers, OECD policy reports, and classroom data, this study analyzes the gap along six core axes: learning effectiveness, teacher trust, academic integrity, equity/accessibility, governance/policy, and data privacy. The largest gaps appear in academic integrity (58% dishonest use vs. 28% policy effectiveness), equity (86% university-student usage vs. 5.3% low-income school access), and teacher trust (63% usage vs. 76% lacking AI training). Korea's AI digital textbook policy, which shifted from a nationwide mandate to voluntary adoption (37%→19%), offers instructive lessons. We propose a trust formation model for educational AI agents and a four-stage adoption maturity framework.

AI-agent, education, agentic-AI, intelligent-tutoring

AEGIS-TR-2026-004 · Technical Report

TurboQuant-Adam: Adaptive 4-bit Momentum and Activation Compression for LLM Distributed Training

8× communication reduction and 4× memory savings validated to 355M parameters with full convergence preservation

Kwangil Kim, AEGIS Research Team

We present TurboQuant-Adam v3, a unified compression framework for LLM distributed training that jointly addresses communication bandwidth and activation memory bottlenecks. The key insight is that standard Error Feedback is incompatible with AdamW's nonlinear variance scaling, and that variance freezing — effective at small scale — diverges at 124M+ parameters. Our solution applies 4-bit Lloyd-Max quantization (enabled by Hadamard rotation) at the momentum level with live variance tracking, achieving 8× communication reduction and 4× memory savings while preserving convergence (loss gap +0.002 at 1,500 steps on GPT-2 124M–355M).
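The quantization step named above can be sketched in isolation: rotate a vector with an orthonormal Hadamard matrix, fit a 4-bit (16-level) Lloyd-Max codebook, and invert the rotation. The toy below is an illustrative assumption, not the TurboQuant implementation; the vector size, iteration count, and the plain 1-D Lloyd iteration are all stand-ins.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction; n must be a power of two.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)  # orthonormal rotation

def lloyd_max_levels(x, bits=4, iters=50):
    # Plain 1-D Lloyd iteration: assign samples to the nearest level,
    # then move each level to the mean of its cluster.
    levels = np.linspace(x.min(), x.max(), 2 ** bits)
    for _ in range(iters):
        idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
        for k in range(len(levels)):
            if np.any(idx == k):
                levels[k] = x[idx == k].mean()
    return levels

def quantize(x, levels):
    idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx], idx  # dequantized values, 4-bit codes

rng = np.random.default_rng(0)
m = rng.standard_normal(256)      # toy stand-in for a momentum shard
H = hadamard(256)
rotated = H @ m                   # rotation spreads outliers evenly
deq, codes = quantize(rotated, lloyd_max_levels(rotated))
recovered = H.T @ deq             # H is orthonormal, so H.T inverts it
err = np.linalg.norm(recovered - m) / np.linalg.norm(m)
```

The rotation matters because Lloyd-Max codebooks fit poorly when a few outlier coordinates dominate; after an orthonormal Hadamard rotation the coordinates are closer to identically distributed.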

distributed-training, gradient-quantization, Lloyd-Max, Hadamard-transform

AEGIS-WP-2026-002 · Whitepaper

TurboQuant Business Impact Analysis: Economic Effects on GPU-Based Distributed Training Infrastructure

How 8× communication compression and 4× memory savings reshape LLM training cost structures

Kwangil Kim, AEGIS Research Team

This whitepaper analyzes the business impact of TurboQuant's dual-axis compression — 8× communication bandwidth reduction and 4× activation memory savings — on LLM distributed training cost structures. Based on 2026 GPU cloud pricing data and empirical TurboQuant results validated to 355M parameters, we project that TurboQuant enables: (1) transition from $15M InfiniBand to $7M Ethernet infrastructure at equivalent effective throughput, (2) 27–56% training cost reduction for 70B models ($240K–$500K), (3) 2–3× larger batch sizes enabling 50% GPU count reduction for 70B models via lower tensor parallelism, and (4) new architectural possibilities including multi-datacenter and cloud-hybrid training. We position TurboQuant as the only framework simultaneously addressing both communication and memory bottlenecks with full AdamW and AMP compatibility.
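The cost arithmetic behind such projections follows a simple Amdahl-style model: only the communication slice of each training step shrinks. In the sketch below, the GPU-hour rate, cluster size, run length, and communication fraction are assumed placeholder figures, not the whitepaper's pricing data; only the 8× compression factor comes from the report.

```python
# Illustrative Amdahl-style savings model (all inputs except `compression`
# are assumed figures, not the whitepaper's data).
gpu_hour_cost = 2.50      # $/GPU-hour (assumed cloud rate)
gpus = 1024               # assumed cluster size
comm_fraction = 0.45      # assumed share of step time in gradient all-reduce
compression = 8           # communication volume reduction from the report

# Only the communication slice of each step shrinks.
speedup = 1.0 / ((1.0 - comm_fraction) + comm_fraction / compression)
saving = 1.0 - 1.0 / speedup    # fractional training-cost reduction

cost_before = gpus * 1000 * gpu_hour_cost   # assumed 1,000 h/GPU run
cost_after = cost_before * (1.0 - saving)
```

With these placeholder inputs the model yields roughly a 39% cost reduction, which lands inside the 27–56% range the whitepaper projects; the actual figure is dominated by how communication-bound the cluster is.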

cost-analysis, GPU-infrastructure, distributed-training, InfiniBand

AEGIS-BR-2026-001 · Benchmark Report

Benchmarking Guardrail Effectiveness in High-Risk LLM Use Cases

Comparative Evaluation of Layered Guardrail Strategies across Policy-Sensitive Enterprise Scenarios

AEGIS Research Team, Yatav Inc.

Enterprise deployment of Large Language Models in policy-sensitive domains — healthcare, finance, legal, telecommunications, and defense — introduces risks that extend far beyond conventional content safety. Regulatory mandates (EU AI Act, K-AI Act, NIST AI RMF), sector-specific compliance requirements, and the potential for catastrophic harm in high-stakes decision contexts demand guardrail systems whose effectiveness is empirically validated rather than assumed. We present a comprehensive benchmarking study of layered guardrail strategies using the AEGIS (AI Engine for Guardrail & Inspection System) framework, evaluating 8 commercial LLM models across 7 adversarial attack algorithms (112 total evaluations) in two independent sessions. Our results reveal three critical findings: (1) all tested models — including GPT-5, Claude Opus 4.6, and Gemini 3.1 Pro — are classified as VULNERABLE without external guardrails, with a mean baseline defense rate of only 38.1%; (2) a 3-Tier layered guardrail architecture (rule-based filters at <0.5ms, ML classifiers at <5ms, LLM judges at <200ms) improves defense rates from 38.1% to an estimated 75–85% while maintaining 50,000+ RPS throughput; and (3) domain-specific policy enforcement — including telecom-specific threat taxonomies, military ROE compliance, and financial boundary detection — is essential for high-risk use cases where generic safety mechanisms are insufficient. We propose the Enterprise Guardrail Effectiveness Index (EGEI), a composite metric incorporating attack resilience, regulatory compliance coverage, latency overhead, and domain-specific policy adherence, and use it to evaluate guardrail configurations across 5 enterprise deployment scenarios.
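The escalation logic of a 3-Tier layered guardrail can be sketched as an early-exit chain: cheap checks resolve most traffic, and only ambiguous prompts reach the expensive judge. All tier internals below (the regex patterns, the keyword "classifier", the toy judge) are stand-ins for illustration, not AEGIS components.

```python
import re

# Tier internals are stand-ins: real rule sets, classifier models, and
# judge prompts are not part of this sketch.
BLOCK_PATTERNS = [re.compile(p, re.I)
                  for p in (r"ignore previous instructions", r"\bDAN mode\b")]

def tier1_rules(prompt):
    # <0.5 ms class: cheap regex screen; None means "escalate to next tier".
    return "block" if any(p.search(prompt) for p in BLOCK_PATTERNS) else None

def tier2_classifier(prompt):
    # <5 ms class: stand-in score for an ML classifier in [0, 1].
    score = min(1.0, prompt.lower().count("bypass") * 0.3)
    if score >= 0.5:
        return "block"
    return None if score > 0.2 else "allow"   # ambiguous band escalates

def tier3_llm_judge(prompt):
    # <200 ms class: stand-in for an LLM judge; sees only ambiguous traffic.
    return "block" if "exploit" in prompt.lower() else "allow"

def guard(prompt):
    # Early-exit chain: most traffic never pays the latency of later tiers.
    for tier in (tier1_rules, tier2_classifier, tier3_llm_judge):
        verdict = tier(prompt)
        if verdict is not None:
            return verdict
    return "allow"
```

This structure is what lets a layered system sustain high throughput: the sub-millisecond tier handles the bulk of requests, and the latency budget of the LLM judge is spent only where the cheaper tiers abstain.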

guardrail, benchmark, red teaming, enterprise AI

AEGIS-RP-2026-001 · Research Paper

AEGIS: A Multi-Layered Framework for Automated LLM Safety Diagnosis through Adversarial Red-Teaming and Statistical Risk Analysis

Integrated Offensive Red-Teaming and Defensive Guardrail System with SABER Statistical Risk Prediction

AEGIS Research Team, Yatav Inc.

Existing approaches to LLM safety evaluation treat offensive testing and defensive deployment as independent concerns: red-team researchers measure attack success rates while defense engineers deploy guardrails, with no formal framework connecting attacker effort to defender resilience. We present AEGIS, an integrated system that closes this loop through three tightly coupled subsystems: (1) an offensive red-team engine comprising 8 attack algorithms, a Meta-Attack genetic recombinator over 30 atomic primitives, and a reinforcement learning attack agent with PPO-trained policy networks; (2) a defensive guardrail pipeline combining the PALADIN 6-layer deep inspection network with a 3-Tier hierarchical defense (rule-based at <0.5ms, ML classifier at <5ms, LLM Judge at <200ms) and 4 specialized detection algorithms (GuardNet, JBShield, CCFC, MULI); and (3) SABER (Statistical Adversarial risk with Beta Extrapolation and Regression), a statistical risk framework that models per-query vulnerability as θ ~ Beta(α, β), derives the ASR@N scaling law to predict attack success rates under Best-of-N scenarios, and introduces the Budget@τ metric to quantify defender resilience as the minimum attack budget required to achieve success probability τ. SABER further implements a closed-loop deterministic defense promotion system that automatically converts high-risk queries (ASR@1000 ≥ 0.8) into θ=0 deterministic blocks. Empirical evaluation across 8 LLM models (GPT-5, Claude Opus 4.6, Gemini 3.1 Pro, Grok 4.1/3, DeepSeek) in 112 evaluations reveals a baseline defense rate of only 38.1%, with PAIR and Crescendo achieving 100% ASR universally. SABER analysis classifies 6 of 8 models at Critical risk (ASR@1000 ≥ 0.8), while the integrated AEGIS defense improves effective defense rates to 75–90% with the deterministic promotion mechanism providing an additional 5–12% improvement on recurring attack patterns.

red teaming, LLM safety, SABER, PALADIN

AEGIS-WP-2026-001 · Whitepaper

Building Safe and Compliant Enterprise LLM Deployments

A Whitepaper on Governance, Verification, Security, and Operational Readiness for Applied AI

AEGIS Research Team, Yatav Inc.

Deploying large language models (LLMs) in enterprise environments demands more than model accuracy — it requires a comprehensive framework addressing governance, security, regulatory compliance, and operational resilience. This whitepaper presents the AEGIS (AI Engine for Guardrail & Inspection System) platform's production-grade infrastructure for safe enterprise LLM deployments. Drawing from an implemented codebase comprising 1,591+ passing tests, 32 database migration schemas, 14 agent security modules, and compliance mappings across three regulatory frameworks (EU AI Act, K-AI Act, NIST AI RMF), we detail the architectural patterns, middleware stacks, and operational practices required for responsible AI deployment. The platform supports multi-tenant SaaS with four subscription tiers, Redis-backed rate limiting, comprehensive audit logging with 23 event types, and GS quality certification (ISO/IEC 25051). We demonstrate how defense-in-depth security — spanning JWT/API-key authentication, RBAC+ABAC authorization, PII pseudonymization, and federated learning — can be integrated into a cohesive deployment architecture that satisfies both technical and regulatory requirements.

governance, compliance, enterprise AI, security hardening

AEGIS-TR-2026-002 · Technical Report

AEGINEL Guard: Multilingual AI Prompt Security Classifier for Browser Extensions

Development of a lightweight on-device multi-label threat classifier for real-time prompt safety

Kwangil Kim, AEGIS Research Team

This research presents the full development pipeline of AEGINEL Guard, a multilingual AI prompt security classifier designed to run entirely on-device within Chrome browser extensions. The classifier detects six threat categories — Jailbreak, Prompt Injection, Harmful Content, Script Evasion, Social Engineering, and Encoding Bypass — using a multi-label classification approach. Trained on 188,109 samples across 8 languages, the final DistilBERT-based model achieves 100% binary detection accuracy at 7.6 ms/sample inference speed in a 129.5 MB INT8 ONNX package.
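A multi-label head differs from the usual softmax classifier in that each category fires independently, so one prompt can trigger several threat labels at once. The decision step can be sketched as below; the logits and the 0.5 threshold are made-up illustrations, not outputs of the actual DistilBERT ONNX model.

```python
import math

CATEGORIES = ["jailbreak", "prompt_injection", "harmful_content",
              "script_evasion", "social_engineering", "encoding_bypass"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classify(logits, threshold=0.5):
    # Each label gets its own sigmoid, so labels fire independently
    # (unlike softmax, which forces a single winner).
    probs = {c: sigmoid(z) for c, z in zip(CATEGORIES, logits)}
    return sorted(c for c, p in probs.items() if p >= threshold), probs

# Made-up logits standing in for the model's output head.
flagged, probs = classify([2.1, 1.3, -3.0, -2.2, 0.4, -1.5])
```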

prompt-security, guardrail, multilingual, on-device

AEGIS-TR-2026-003 · Technical Report

TruthAnchor: A Multi-Layer Defense Framework for Hallucination Mitigation in Financial Domain LLMs

Production-ready four-layer pipeline achieving ≥97% hallucination detection with ≤200ms latency for Korean financial services

Kwangil Kim, AEGIS Research Team

We present TruthAnchor, a multi-layer defense framework designed to mitigate LLM hallucinations in Korean financial services. The architecture implements a four-layer pipeline: (1) Input Governance with triple-layered prompt injection defense, (2) Evidence-Grounded Generation via Hybrid RAG with cross-encoder reranking, (3) Output Verification through a novel 4-signal composite uncertainty scorer and Multi-LLM consensus validation, and (4) Escalation and Audit with human-in-the-loop expert review. Evaluation demonstrates a hallucination detection rate ≥97%, financial RAG accuracy ≥98%, p95 response latency ≤200ms, and a 100% prompt injection defense rate. The system includes a Rust-accelerated guardrail engine achieving 8.5× throughput improvement over Python baselines.
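The composite-scoring idea can be sketched as a weighted blend of normalized signals gated by an escalation threshold. The signal names, weights, and threshold below are illustrative assumptions, not the report's actual configuration.

```python
# Signal names, weights, and the escalation threshold are illustrative
# assumptions, not TruthAnchor's configuration.
WEIGHTS = {
    "token_entropy":    0.30,  # mean predictive entropy of the answer tokens
    "self_consistency": 0.30,  # disagreement across resampled generations
    "evidence_overlap": 0.25,  # 1 - overlap with retrieved evidence
    "judge_doubt":      0.15,  # doubt score from a secondary verifier model
}

def composite_uncertainty(signals):
    # Weighted sum of per-signal scores, each clamped to [0, 1].
    assert set(signals) == set(WEIGHTS), "all four signals are required"
    return sum(w * min(max(signals[name], 0.0), 1.0)
               for name, w in WEIGHTS.items())

def route(signals, escalate_at=0.45):
    # High composite uncertainty goes to human-in-the-loop expert review.
    return "escalate" if composite_uncertainty(signals) >= escalate_at else "answer"
```

Keeping the combiner a transparent weighted sum (rather than a learned gate) makes the escalation decision auditable, which matters in the framework's final Escalation and Audit layer.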

hallucination, financial-AI, RAG, guardrail

AEGIS-TR-2026-001 · Technical Report

Reducing Hallucinations in Enterprise AI Systems

A Multi-Layered Defense Architecture for Mission-Critical Domains

TruthAnchor Research Group, AEGIS Research Team

This paper presents TruthAnchor, a comprehensive multi-layered defense architecture designed to reduce hallucination rates in enterprise AI deployments to below 3%. The system implements a four-layer pipeline comprising input governance, evidence-grounded generation, output verification, and human-in-the-loop escalation. Key innovations include a triple-layered prompt injection defense achieving 100% detection, a four-signal composite uncertainty scorer, and a multi-LLM consensus validation mechanism. Evaluated in a Korean financial services context with 104 test cases, the system achieves 100% hallucination detection rate, 100% RAG accuracy, and p95 latency of 142ms.

hallucination, TruthAnchor, RAG, uncertainty quantification