CATEGORIZATION AND SENTIMENT ANALYSIS OF DOCUMENTS USING A MULTILINGUAL TRANSFORMER MODEL WITH MULTIPLE OUTPUTS
Keywords:
Text analysis systems, text classification, text sentiment analysis, multilingual transformer models, multi-output models
Abstract
The paper presents a model for text categorization and sentiment analysis across one hundred languages using transformer neural networks. It also shows how such a model can be optimized through model distillation.
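The abstract combines two techniques: a shared multilingual transformer encoder with multiple classification outputs, and knowledge distillation in the sense of Hinton et al. [6]. The following PyTorch sketch illustrates both; it is not the authors' implementation, and the xlm-roberta-base checkpoint [2], the label counts, and the loss weights are assumptions chosen only for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

class MultiHeadClassifier(nn.Module):
    """One shared multilingual encoder, two independent output heads."""
    def __init__(self, encoder_name="xlm-roberta-base",
                 n_categories=27, n_sentiments=3):  # label counts are placeholders
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Two linear heads on top of the same encoder representation:
        # one for the document category, one for the sentiment class.
        self.category_head = nn.Linear(hidden, n_categories)
        self.sentiment_head = nn.Linear(hidden, n_sentiments)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # representation of the leading <s> token
        return self.category_head(cls), self.sentiment_head(cls)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hinton-style distillation [6]: blend temperature-softened KL
    divergence against the teacher with hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = MultiHeadClassifier()
    batch = tok(["Primer dokumenta.", "An example document."],
                padding=True, return_tensors="pt")
    with torch.no_grad():
        cat_logits, sent_logits = model(batch["input_ids"],
                                        batch["attention_mask"])
    print(cat_logits.shape, sent_logits.shape)  # (2, 27) and (2, 3)

Training a smaller student encoder with distillation_loss against a larger teacher's logits on both heads is one way to obtain the speed-up the abstract refers to.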
References
[1] GCP taxonomy https://cloud.google.com/natural-language/docs/categories (accessed September 2022)
[2] Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., ... & Stoyanov, V. (2019). Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116.
[3] Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
[4] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
[5] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[6] Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
[7] GLUE Benchmark https://gluebenchmark.com/
[8] The Stanford Sentiment Treebank https://nlp.stanford.edu/sentiment/
[9] MediaWiki https://github.com/barrust/mediawiki
[10] Wikipedia https://www.wikipedia.org/
[11] Hugging Face https://huggingface.co/
[12] TensorFlow https://www.tensorflow.org/
[13] PyTorch https://pytorch.org/
[14] Common Crawl https://commoncrawl.org/
[15] Agarwal, A., Dahleh, M., Shah, D., Sleeper, D., Tsai, A., & Wong, M. (2019). Zorro: A Model Agnostic System to Price Consumer Data. arXiv preprint arXiv:1906.02420.
Published
2023-03-06
Section
Electrical and Computer Engineering