AI Researcher in Audio Deepfake Detection
Audio Deepfake Detection · Source Separation · Explainable AI
I am currently researching composite audio spoofing detection.
Where existing models stop at a binary real/fake decision, I propose a framework that separates speech from environmental sound and explains which channel is fake using five classes.
Research Interests
- Audio deepfake detection (Audio Anti-Spoofing)
- Source separation (Blind Source Separation)
- Explainable AI (XAI)
- Speech and audio signal processing
A 5-class composite audio spoofing detection system based on source separation and physical acoustic features
| Item | Details |
|---|---|
| Dataset | CompSpoofV2 (24,864 samples, 5-class) |
| Core models | Conv-TasNet (source separation) + LightGBM (19-dim physical features) |
| Accuracy | 69.8% · Macro-F1 0.6564 |
| FAKE Recall | 56.4% (+41 percentage points over SuDORM-RF) |
| Paper | In preparation for the Korea Information Processing Society (KIPS), 2026 |
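The table reports both accuracy and Macro-F1. As a reminder of how the two differ, here is a minimal sketch in plain Python. Macro-F1 is the unweighted mean of per-class F1, so a weak class pulls it down regardless of how many samples that class has. The function names and the toy labels are illustrative assumptions, not the project's evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores over `labels`."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy two-class example (hypothetical labels, not CompSpoofV2 results).
    y_true = ["REAL", "REAL", "FAKE", "FAKE"]
    y_pred = ["REAL", "FAKE", "FAKE", "FAKE"]
    print(accuracy(y_true, y_pred))
    print(macro_f1(y_true, y_pred, ["REAL", "FAKE"]))
```

A gap between the two numbers, as in the table, usually means some classes are recognized much less reliably than others.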
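Key Contributions
- Achieves competitive detection performance using only physical acoustic metrics (RT60, Noise Floor, MSC, XCorr), with no pretraining
- Quantitatively demonstrates a nonlinear correlation between SI-SNR and FAKE recall
- Visualizes the basis for each decision through LightGBM feature importance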
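The SI-SNR mentioned above measures how cleanly a separator recovers a source. For readers unfamiliar with it, the following is a minimal pure-Python sketch of the commonly used scale-invariant definition: remove the mean from both signals, project the estimate onto the reference, and compare the energy of that projection to the energy of the residual. Whether the project computes it in exactly this form is an assumption; the function name `si_snr` and the `eps` stabilizer are illustrative.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB between two equal-length signals."""
    n = len(reference)
    if n == 0 or len(estimate) != n:
        raise ValueError("signals must be non-empty and of equal length")

    # Remove the DC offset from both signals.
    mean_est = sum(estimate) / n
    mean_ref = sum(reference) / n
    est = [x - mean_est for x in estimate]
    ref = [x - mean_ref for x in reference]

    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]

    # Everything that is not explained by the reference counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


if __name__ == "__main__":
    # Synthetic example: a sine reference and two estimates with different error levels.
    reference = [math.sin(0.1 * i) for i in range(1000)]
    clean = [r + 0.1 * math.cos(0.37 * i) for i, r in enumerate(reference)]
    noisy = [r + 0.5 * math.cos(0.37 * i) for i, r in enumerate(reference)]
    print(si_snr(clean, reference), si_snr(noisy, reference))
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is the property that makes it suitable for comparing separators.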
[Input audio]
↓
[LCNN-SE Gatekeeper] → early REAL decision
↓
[Conv-TasNet source separation]
├── 🗣️ Speech Stream → WavLM spoof score
└── 🌿 Env Stream → LCNN-SE spoof score
↓
[Physical acoustic analysis: RT60 · Noise Floor · MSC · XCorr]
↓
[LightGBM 5-class] → REAL / GENUINE / SPOOF_SPEECH / SPOOF_ENV / FAKE
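The control flow of the diagram above can be sketched as a small Python class. This is only an outline under stated assumptions: every stage is a hypothetical callable standing in for the real model (LCNN-SE, Conv-TasNet, WavLM, LightGBM), the `real_threshold` value is invented for illustration, and feeding the two per-stream spoof scores into the final classifier alongside the acoustic features is a guess about how the stages connect.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Audio = Sequence[float]


@dataclass
class SpoofPipeline:
    """Hypothetical wiring of the stages; each field is a stand-in callable."""
    gatekeeper: Callable[[Audio], float]              # confidence that the clip is REAL
    separate: Callable[[Audio], Tuple[Audio, Audio]]  # returns (speech, env) streams
    speech_scorer: Callable[[Audio], float]           # spoof score for the speech stream
    env_scorer: Callable[[Audio], float]              # spoof score for the env stream
    acoustic_features: Callable[[Audio, Audio], Dict[str, float]]
    classifier: Callable[[Dict[str, float]], str]     # returns one of the five labels
    real_threshold: float = 0.9                       # assumed value, not from the source

    def predict(self, audio: Audio) -> str:
        # Early exit: confident REAL clips skip separation entirely.
        if self.gatekeeper(audio) >= self.real_threshold:
            return "REAL"
        speech, env = self.separate(audio)
        features = dict(self.acoustic_features(speech, env))
        # Assumption: per-stream scores are passed to the classifier as extra features.
        features["speech_spoof_score"] = self.speech_scorer(speech)
        features["env_spoof_score"] = self.env_scorer(env)
        return self.classifier(features)


def make_toy_pipeline(gate_confidence: float) -> SpoofPipeline:
    """Builds a pipeline from trivial stubs so the control flow can be exercised."""
    return SpoofPipeline(
        gatekeeper=lambda audio: gate_confidence,
        separate=lambda audio: (audio, audio),
        speech_scorer=lambda speech: 0.8,
        env_scorer=lambda env: 0.1,
        acoustic_features=lambda speech, env: {"rt60": 0.4, "noise_floor": -60.0},
        classifier=lambda f: "SPOOF_SPEECH"
        if f["speech_spoof_score"] > f["env_spoof_score"]
        else "SPOOF_ENV",
    )


if __name__ == "__main__":
    clip = [0.0, 0.1, -0.1, 0.05]
    print(make_toy_pipeline(0.95).predict(clip))  # gatekeeper path
    print(make_toy_pipeline(0.20).predict(clip))  # full pipeline path
```

Keeping each stage behind a plain callable makes it easy to swap the separator or a scorer and measure the effect on the final classification in isolation.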
| Status | Title | Conference |
|---|---|---|
| In preparation | S-CAD: 5-Class Composite Audio Spoofing Detection Based on Source Separation | Korea Information Processing Society (KIPS) 2026 |
- Email: whgdkgo614@gmail.com
- Blog: velog.io/@dydtn61498

