OBLIGATION DETECTION IN ENGLISH-LANGUAGE CONTRACTS

  • Marko Žužić
  • Aleksandar Kovačević, Faculty of Technical Sciences
Keywords: Obligation detection, Legal domain, Language models, NLP, Text classification, BERT

Abstract

This paper presents a system for detecting obligations in contracts written in English. The obligation classifier takes sentences from a contract as input and outputs, for each sentence, whether it is an obligation or not.
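
The sentence-in, label-out interface described above can be illustrated with a minimal sketch. This is not the system presented in the paper: it is a simple keyword baseline, and the cue list `OBLIGATION_CUES` and the function names `is_obligation` and `classify` are hypothetical, introduced only for illustration.

```python
import re

# Hypothetical cue list (an assumption, not taken from the paper):
# deontic phrases that often signal an obligation in contract language.
OBLIGATION_CUES = re.compile(
    r"\b(shall|must|agrees? to|undertakes? to|"
    r"is required to|are required to|"
    r"is obligated to|are obligated to)\b",
    re.IGNORECASE,
)


def is_obligation(sentence: str) -> bool:
    """Return True if the sentence contains at least one obligation cue."""
    return OBLIGATION_CUES.search(sentence) is not None


def classify(sentences):
    """Map each input sentence to a (sentence, is_obligation) pair."""
    return [(s, is_obligation(s)) for s in sentences]


if __name__ == "__main__":
    contract_sentences = [
        "The Supplier shall deliver the goods within 30 days.",
        "This Agreement is governed by the laws of England.",
    ]
    for sentence, label in classify(contract_sentences):
        print(label, "-", sentence)
```

A baseline like this cannot distinguish obligations from prohibitions or permissions that use the same cue words, which is one reason a trained text classifier is a more plausible choice for the task.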

Published
2022-04-08
Section
Electrical and Computer Engineering