์ž์—ฐ์–ด ์ฒ˜๋ฆฌ/Today I learned :

ํ—ˆ๊น…ํŽ˜์ด์Šค์˜ ํŠธ๋žœ์Šคํฌ๋จธ ๐Ÿค— Huggingface's Transformers

์ฃผ์˜ ๐Ÿฑ 2023. 1. 16. 17:18

์ด๋ฒˆ์—๋Š” Huggingface์—์„œ ์ œ๊ณตํ•˜๋Š” Transformers์— ๋Œ€ํ•˜์—ฌ ์•Œ์•„๋ณด๊ณ ์ž ํ•ฉ๋‹ˆ๋‹ค. 

https://huggingface.co/docs/transformers/index

  • ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ๊ด€๋ จ ์—ฌ๋Ÿฌ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๊ฐ€ ์žˆ์ง€๋งŒ Transformer๋ฅผ ํ™œ์šฉํ•œ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ task์—์„œ ๊ฐ€์žฅ ๋งŽ์ด ํ™œ์šฉ๋˜๊ณ  ์žˆ๋Š” ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋Š” transformers์ž…๋‹ˆ๋‹ค.
  • pytorch version์˜ BERT๋ฅผ ๊ฐ€์žฅ ๋จผ์ € ๊ตฌํ˜„ํ•˜๋ฉฐ ์ฃผ๋ชฉ๋ฐ›์•˜๋˜ huggingface๋Š” ํ˜„์žฌ transformer๊ธฐ๋ฐ˜์˜ ๋‹ค์–‘ํ•œ ๋ชจ๋ธ๋“ค์€ ๊ตฌํ˜„ ๋ฐ ๊ณต๊ฐœํ•˜๋ฉฐ ๋งŽ์€ ์ฃผ๋ชฉ์„ ๋ฐ›๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.(์•„๋ž˜ ์ฃผ์†Œ์—์„œ ๋‹ค์–‘ํ•œ ๋ชจ๋ธ๋“ค์„ ํ™•์ธ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค)
  • https://huggingface.co/models
  • ์ œ์‹œ๋œ ๋ชจ๋ธ ์ด์™ธ์—๋„ custom model์„ ์—…๋กœ๋“œํ•˜์—ฌ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

 

  • AutoConfig lets you easily load the configuration of any model from a string name tag.
  • Each Config holds the information needed for the model architecture and its task (architecture type, number of layers, hidden unit size, hyperparameters).
  • The name tags for the available models can be found at https://huggingface.co/models.
  • The example below uses https://huggingface.co/bert-base-uncased.
from transformers import AutoConfig

# Load the configuration for the bert-base-uncased checkpoint by its name tag
config = AutoConfig.from_pretrained('bert-base-uncased')
config

#Result
BertConfig {
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.25.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

Similarly, for GPT-2 (https://huggingface.co/gpt2):

gpt_config = AutoConfig.from_pretrained('gpt2')
gpt_config

#Result
GPT2Config {
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.25.1",
  "use_cache": true,
  "vocab_size": 50257
}
Individual attributes of a config can be accessed directly, and the whole config can be dumped to a plain dict:

print(config.vocab_size)

#Result
30522

config_dict = config.to_dict()
config_dict

#Result
{'return_dict': True,
 'output_hidden_states': False,
 'output_attentions': False,
 'torchscript': False,
 'torch_dtype': None,
 'use_bfloat16': False,
 'tf_legacy_loss': False,
 'pruned_heads': {},
 'tie_word_embeddings': True,
 'is_encoder_decoder': False,
 'is_decoder': False,
 'cross_attention_hidden_size': None,
 'add_cross_attention': False,
 'tie_encoder_decoder': False,
 'max_length': 20,
 'min_length': 0,
 'do_sample': False,
 'early_stopping': False,
 'num_beams': 1,
 'num_beam_groups': 1,
 'diversity_penalty': 0.0,
 'temperature': 1.0,
 'top_k': 50,
 'top_p': 1.0,
 'typical_p': 1.0,
 'repetition_penalty': 1.0,
 'length_penalty': 1.0,
 'no_repeat_ngram_size': 0,
 'encoder_no_repeat_ngram_size': 0,
 'bad_words_ids': None,
 'num_return_sequences': 1,
 'chunk_size_feed_forward': 0,
 'output_scores': False,
 'return_dict_in_generate': False,
 'forced_bos_token_id': None,
 'forced_eos_token_id': None,
 'remove_invalid_values': False,
 'exponential_decay_length_penalty': None,
 'suppress_tokens': None,
 'begin_suppress_tokens': None,
 'architectures': ['BertForMaskedLM'],
 'finetuning_task': None,
 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'},
 'label2id': {'LABEL_0': 0, 'LABEL_1': 1},
 'tokenizer_class': None,
 'prefix': None,
 'bos_token_id': None,
 'pad_token_id': 0,
 'eos_token_id': None,
 'sep_token_id': None,
 'decoder_start_token_id': None,
 'task_specific_params': None,
 'problem_type': None,
 '_name_or_path': 'bert-base-uncased',
 'transformers_version': '4.25.1',
 'gradient_checkpointing': False,
 'model_type': 'bert',
 'vocab_size': 30522,
 'hidden_size': 768,
 'num_hidden_layers': 12,
 'num_attention_heads': 12,
 'hidden_act': 'gelu',
 'intermediate_size': 3072,
 'hidden_dropout_prob': 0.1,
 'attention_probs_dropout_prob': 0.1,
 'max_position_embeddings': 512,
 'type_vocab_size': 2,
 'initializer_range': 0.02,
 'layer_norm_eps': 1e-12,
 'position_embedding_type': 'absolute',
 'use_cache': True,
 'classifier_dropout': None}
Model-specific config classes such as BertConfig can also be used directly. Loading a checkpoint of a different model type still returns a config, but emits a warning:

from transformers import BertConfig

bertconfig = BertConfig.from_pretrained('bert-base-uncased')

bert_in_gpt2_config = BertConfig.from_pretrained('gpt2')

#Warning: You are using a model of type gpt2 to instantiate a model of type bert. This is not supported for all configurations of models and can yield errors.
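
Configs are not only for inspection. As a minimal sketch (the 6-layer size is an arbitrary choice for illustration), you can override attributes while loading and build a smaller, randomly initialized BERT from the modified config:

from transformers import BertConfig, BertModel

# Keyword arguments override the corresponding attributes of the loaded config
small_config = BertConfig.from_pretrained('bert-base-uncased', num_hidden_layers=6)
small_bert = BertModel(small_config)  # weights are randomly initialized, not pretrained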


Model: https://github.com/huggingface/transformers/tree/master/src/transformers/models

  • Transformers์—์„œ๋Š” transformer๊ธฐ๋ฐ˜์˜ ๋ชจ๋ธ architecture๋ฅผ ๊ตฌํ˜„ํ•ด๋‘์—ˆ์Šต๋‹ˆ๋‹ค.
  • ์ตœ๊ทผ์—๋Š” https://arxiv.org/abs/2010.11929์™€ ๊ฐ™์ด Vision task์—์„œ ํ™œ์šฉํ•˜๋Š” transformer ๋ชจ๋ธ๋“ค์„ ์ถ”๊ฐ€ํ•˜๋ฉฐ ๊ทธ ํ™•์žฅ์„ฑ์„ ๋”ํ•ด๊ฐ€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๋ชจ๋ธ architecture ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ๊ด€๋ จ task์— ์ ์šฉ๊ฐ€๋Šฅํ•œ ํ˜•ํƒœ์˜ ๊ตฌํ˜„์ฒด๋“ค์ด ์žˆ์Šต๋‹ˆ๋‹ค.
  • BERT ๊ตฌํ˜„์ฒด์—์„œ ์ œ๊ณตํ•˜๊ณ  ์žˆ๋Š” class๋ฅผ ํ™•์ธํ•˜๊ณ  ํ•ด๋‹น ๊ตฌ์กฐ๋ฅผ ์ด์šฉํ•ด ํ•™์Šตํ•œ ๋ชจ๋ธ๋“ค์„ loadํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค
from transformers import BertForMaskedLM, BertForQuestionAnswering, BertForSequenceClassification, BertForTokenClassification, BertForMultipleChoice, BertModel
from transformers import AutoModel, AutoTokenizer, AutoConfig

# AutoModel resolves the checkpoint name to the right architecture (here, BertModel)
bertmodel = AutoModel.from_pretrained('bert-base-uncased')

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
inputs = tokenizer('hi, my name is joy')
inputs

#Result
{'input_ids': [101, 7632, 1010, 2026, 2171, 2003, 6569, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1]}
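
To actually run the model we need tensors rather than Python lists; a minimal sketch of a forward pass (the shape comment assumes the 8-token encoding above):

import torch

# return_tensors='pt' makes the tokenizer emit PyTorch tensors
inputs = tokenizer('hi, my name is joy', return_tensors='pt')
with torch.no_grad():
    outputs = bertmodel(**inputs)
outputs.last_hidden_state.shape
# torch.Size([1, 8, 768]) -> (batch, sequence_length, hidden_size)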
 
 
# The QA head is not in the checkpoint, so its weights are newly initialized;
# Transformers prints a warning that the model should be fine-tuned before use.
bert_qa = BertForQuestionAnswering.from_pretrained('bert-base-uncased')
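
Since the QA head starts from random weights, its predictions are meaningless until the model is fine-tuned (e.g. on SQuAD), but the interface can be sketched as follows; the question and context strings are made up for illustration:

# Question answering takes a (question, context) pair as a single input
question, context = 'What is my name?', 'hi, my name is joy'
qa_inputs = tokenizer(question, context, return_tensors='pt')
with torch.no_grad():
    qa_outputs = bert_qa(**qa_inputs)
# One start and one end score per token; argmax picks the predicted answer span
start = qa_outputs.start_logits.argmax()
end = qa_outputs.end_logits.argmax()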

 

๋ฐ˜์‘ํ˜•