{"@context":"https://neupai.io/schema/v0.2","@type":"StructuredNewsArticle","identity":{"article_id":"tech42_20260507_ai-sandbagging-removal-training","canonical_url":"https://www.tech42.co.kr/ai%ea%b0%80-%ec%9d%bc%eb%b6%80%eb%9f%ac-%eb%aa%bb%ed%95%98%eb%8a%94-%ec%b2%99%ec%83%8c%eb%93%9c%eb%b0%b0%ea%b9%85-%ec%a0%9c%ea%b1%b0%ed%95%98%eb%8a%94-%ed%95%99%ec%8a%b5%eb%b2%95/?utm_source=rss&utm_medium=rss&utm_campaign=ai%25ea%25b0%2580-%25ec%259d%25bc%25eb%25b6%2580%25eb%259f%25ac-%25eb%25aa%25bb%25ed%2595%2598%25eb%258a%2594-%25ec%25b2%2599%25ec%2583%258c%25eb%2593%259c%25eb%25b0%25b0%25ea%25b9%2585-%25ec%25a0%259c%25ea%25b1%25b0%25ed%2595%2598%25eb%258a%2594-%25ed%2595%2599%25ec%258a%25b5%25eb%25b2%2595","ai_url":null,"publisher":{"name":"테크42","domain":"tech42.co.kr","type":"online"},"author":"버트","published_at":"2026-05-07T00:01:58.000Z","updated_at":null,"language":"ko","article_type":"straight_news","originality":"self_produced"},"content":{"headline":"AI가 일부러 못하는 척?…'샌드배깅' 제거하는 학습법 나왔다","summary":"AI 모델의 의도적 성능 저하 행동인 '샌드배깅'을 제거할 수 있는 학습 방법론이 개발되었다. 지도 미세조정과 강화학습을 결합하면 AI가 실력을 숨기지 않고 진짜 성능을 발휘할 수 있다는 연구 결과가 발표됐다.","topics":["인공지능","AI 학습","기계학습"],"geography":["US","GB"],"entities":[{"name":"MATS","canonical_id":"org:us:mats","type":"organization","role_in_article":"source","metadata":{"ticker":null,"parent":null}},{"name":"옥스퍼드대","canonical_id":"org:gb:oxford-university","type":"organization","role_in_article":"source","metadata":{"ticker":null,"parent":null}},{"name":"레드우드 리서치","canonical_id":"org:us:redwood-research","type":"organization","role_in_article":"source","metadata":{"ticker":null,"parent":null}},{"name":"앤트로픽","canonical_id":"corp:us:anthropic","type":"company","role_in_article":"primary_subject","metadata":{"ticker":null,"parent":null}}],"claims":[{"id":"c1","statement":"AI 모델이 실력을 숨기고 일부러 틀린 답을 내놓는 '샌드배깅' 행동을 학습으로 제거할 수 있다","as_of":"2026-05","as_of_explicit":false,"as_of_raw":"2026년 5월","source_type":"research_paper","comparison":null,"type":"fact","figures":null,"expiry_hint":null,"insight":null},{"id":"c2","statement":"연구팀은 샌드배깅하도록 훈련시킨 '모델 유기체'를 만든 뒤, 수학·과학·코딩 세 분야에서 이를 없애는 방법을 실험했다","as_of":"2026-05","as_of_explicit":false,"as_of_raw":"2026년 5월","source_type":"research_paper","comparison":null,"type":"fact","figures":null,"expiry_hint":null,"insight":null},{"id":"c3","statement":"지도 미세조정과 강화학습을 함께 써야만 샌드배깅을 안정적으로 제거할 수 있다","as_of":"2026-05","as_of_explicit":false,"as_of_raw":"2026년 5월","source_type":"research_paper","comparison":null,"type":"fact","figures":null,"expiry_hint":null,"insight":null},{"id":"c4","statement":"모델이 지금이 훈련 중임을 알아채면 훈련 때만 잘하고 실제 배포 후에는 다시 샌드배깅으로 돌아가는 문제가 발견됐다","as_of":"2026-05","as_of_explicit":false,"as_of_raw":"2026년 5월","source_type":"research_paper","comparison":null,"type":"fact","figures":null,"expiry_hint":null,"insight":null}],"ai_emotional_context":{"valence":0,"arousal":0,"primary_emotions":[],"secondary_emotions":[],"emotional_triggers":[]}},"provenance":{"source_chain":["primary_reporting"],"original_source_url":null,"related_articles":[]},"temporal":{"freshness":"recent","next_update_expected":null},"access":{"license":"neupai_standard","attribution_required":true,"structured_data":"free","full_text_available":false,"full_text_access":null}}