Review: Encoders and Ensembles for Task-Free Continual Learning

1. Introduction

•Supervised Learning : assumes the data are i.i.d. (independent and identically distributed) and drawn from a fixed distribution

•Continual Learning : studies how to cope with realistic scenarios (data arriving in real time)

•Main problem : catastrophic forgetting

Catastrophic forgetting
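- The model's performance on earlier data degrades as it learns new data
- Neural networks are especially vulnerable because their weights are updated by gradient-based methods
- When the data are imbalanced, no weight updates occur for certain classes over long periods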
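The effect described above can be reproduced with a toy example. The sketch below is not from the paper: it assumes a one-parameter linear model trained by plain gradient descent on two made-up regression tasks (task A with slope 2, task B with slope -1), and the names `mse` and `train` are hypothetical helpers. Because task B is trained without revisiting any task A data, the single weight is overwritten and the task A error rises again.

```python
def mse(w, xs, slope):
    """Mean squared error of the model y = w * x against the target y = slope * x."""
    return sum((w * x - slope * x) ** 2 for x in xs) / len(xs)


def train(w, xs, slope, lr=0.5, steps=100):
    """Plain gradient descent on the squared error for one task."""
    for _ in range(steps):
        grad = sum(2 * (w * x - slope * x) * x for x in xs) / len(xs)
        w -= lr * grad
    return w


xs = [k / 10 for k in range(-10, 11)]  # shared inputs for both toy tasks

w = train(0.0, xs, slope=2.0)          # learn task A
loss_a_before = mse(w, xs, 2.0)        # task A error right after learning it

w = train(w, xs, slope=-1.0)           # learn task B, with no task A data
loss_a_after = mse(w, xs, 2.0)         # task A error after learning task B
loss_b_after = mse(w, xs, -1.0)

print(loss_a_before, loss_a_after, loss_b_after)
```

In this toy setting the task A error is close to zero after the first phase and becomes large after the second, while the task B error is close to zero: the gradient updates for the new data have erased what was learned from the old data.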

⇨ Existing approaches: consolidating weights, retaining a memory of past experience, dividing the architecture into separate modules, meta-learning

① Data are drawn from different distributions

② No clear task boundaries exist

To cope with realistic problems such as these, the paper proposes a task-free continual learning setting

Does not require knowledge of task boundaries
Does not try to infer them
It can be applied to continual learning problems where the distribution changes gradually and no clear task boundaries exist
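To make the last point concrete, here is a minimal sketch of a data stream whose distribution changes gradually and carries no task identifier. It is an illustration only, not the paper's benchmark: the generator name `drifting_stream`, the two classes, and the Gaussian inputs are all assumptions.

```python
import random


def drifting_stream(num_steps, seed=0):
    """Yield (x, label) pairs whose class mix drifts from class 0 to class 1.

    No task id or boundary marker is emitted; the learner only sees samples.
    Requires num_steps >= 2.
    """
    rng = random.Random(seed)
    for t in range(num_steps):
        p_new = t / (num_steps - 1)               # probability of class 1 grows linearly
        label = 1 if rng.random() < p_new else 0
        x = rng.gauss(float(label), 0.5)          # class-dependent input, hypothetical
        yield x, label


stream = list(drifting_stream(1000))
early = sum(label for _, label in stream[:100])   # mostly class 0
late = sum(label for _, label in stream[-100:])   # mostly class 1
print(early, late)
```

There is no step at which one "task" ends and another begins, which is the situation a task-free method has to handle.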

Loss function - computed from the dot product alone, with no softmax or other form of normalization applied
1. Check whether this mitigates the imbalance issue
2. Our setting is multi-class ($y^T y = 1$, $\hat{y}^T \hat{y} = 1$); the paper's is multi-label
Activation function of the t-classifier - tanh with a scaling factor τ applied, so that each neuron's output approaches τ
Choice of optimiser - each parameter is raised or lowered by a fixed step size
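The three design choices above can be sketched together for a single-layer classifier. This is a rough illustration under stated assumptions, not the paper's implementation: the activation is assumed to be τ·tanh(z), the target is assumed to be one-hot, the "fixed step size" is read as a step along the sign of the gradient, and all function names are hypothetical.

```python
import math


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def forward(weights, x, tau):
    """Single-layer classifier: pre-activations and scaled-tanh outputs in (-tau, tau)."""
    pre = [sum(w * xj for w, xj in zip(row, x)) for row in weights]
    out = [tau * math.tanh(z) for z in pre]
    return pre, out


def dot_product_loss(output, target):
    """Negative dot product of output and target, with no softmax or normalization."""
    return -sum(o * t for o, t in zip(output, target))


def sign_step(weights, x, target, tau, step):
    """Raise or lower each weight by a fixed step, using only the gradient's sign."""
    pre, _ = forward(weights, x, tau)
    updated = []
    for row, z, t in zip(weights, pre, target):
        # d loss / d w_ij = -t_i * tau * (1 - tanh(z_i)^2) * x_j
        d_out = -t * tau * (1.0 - math.tanh(z) ** 2)
        updated.append([w - step * sign(d_out * xj) for w, xj in zip(row, x)])
    return updated


tau, step = 2.0, 0.1
x = [1.0, -2.0, 0.5]
target = [1, 0]                                    # one-hot: class 0 is the label
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

_, out_before = forward(weights, x, tau)
loss_before = dot_product_loss(out_before, target)

weights = sign_step(weights, x, target, tau, step)
_, out_after = forward(weights, x, tau)
loss_after = dot_product_loss(out_after, target)
print(loss_before, loss_after)
```

Under these assumptions, only the row of the labelled class receives a nonzero gradient, so a sample never pushes the weights of other classes. That may be one reason to check whether this loss eases the imbalance issue noted in item 1.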
