LeMo: Enabling LEss Token Involvement for MOre Context Fine-tuning arxiv.org/abs/2501.09767 .CL .AI

Can Large Language Models Predict the Outcome of Judicial Decisions? arxiv.org/abs/2501.09768 .CL .AI

Large Language Models (LLMs) have shown exceptional capabilities in Natural Language Processing (NLP) across diverse domains. However, their application in specialized tasks such as Legal Judgment Prediction (LJP) for low-resource languages like Arabic remains underexplored. In this work, we address this gap by developing an Arabic LJP dataset, collected and preprocessed from Saudi commercial court judgments. We benchmark state-of-the-art open-source LLMs, including LLaMA-3.2-3B and LLaMA-3.1-8B, under varying configurations such as zero-shot, one-shot, and fine-tuning using LoRA. Additionally, we employ a comprehensive evaluation framework that integrates quantitative metrics (such as BLEU, ROUGE, and BERTScore) with LLM-based qualitative assessments (Coherence, Legal Language, Clarity, etc.). Our results demonstrate that fine-tuned smaller models achieve performance comparable to larger models in task-specific contexts while offering significant resource efficiency. Furthermore, we investigate the impact of fine-tuning the model on a diverse set of instructions, offering valuable insights into the development of a more human-centric and adaptable LLM. We have made the dataset, code, and models publicly available to provide a solid foundation for future research in Arabic legal NLP.
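
As a rough illustration of the fine-tuning setup described above, the sketch below applies LoRA adapters to a LLaMA-style causal language model with Hugging Face PEFT. The checkpoint name, hyperparameters, and the train_dataset variable are assumptions for illustration, not the paper's exact configuration.

```python
# Minimal LoRA fine-tuning sketch; hyperparameters are illustrative, not the paper's.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.2-3B"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # common LoRA targets in LLaMA blocks
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the low-rank adapters are trainable

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ljp-lora", per_device_train_batch_size=2,
                           num_train_epochs=3, learning_rate=2e-4),
    train_dataset=train_dataset,  # hypothetical tokenized Arabic LJP split
)
trainer.train()
```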

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong arxiv.org/abs/2501.09775 .CL .AI

Multi-Head Self-Attending Neural Tucker Factorization arxiv.org/abs/2501.09776 .LG

Quality-of-service (QoS) data exhibit dynamic temporal patterns that are crucial for accurately predicting missing values. These patterns arise from the evolving interactions between users and services, making it essential to capture the temporal dynamics inherent in such data for improved prediction performance. As the size and complexity of QoS datasets increase, existing models struggle to provide accurate predictions, highlighting the need for more flexible and dynamic methods to better capture the underlying patterns in large-scale QoS data. To address this issue, we introduce a neural-network-based tensor factorization approach tailored for learning spatiotemporal representations of high-dimensional and incomplete (HDI) tensors, namely the Multi-head Self-attending Neural Tucker Factorization (MSNTucF). The model is designed to capture the intricate nonlinear spatiotemporal feature-interaction patterns hidden in real-world data with a two-fold idea: it first employs a neural network structure to generalize the traditional Tucker factorization framework, and then leverages a multi-head self-attending module to enforce nonlinear latent interaction learning. In empirical studies on two dynamic QoS datasets from real applications, the proposed MSNTucF model demonstrates superior performance compared to state-of-the-art benchmark models in estimating missing observations, highlighting its ability to learn nonlinear spatiotemporal representations of HDI tensors.
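
To make the two-fold idea concrete, here is a toy PyTorch sketch of a neural Tucker-style model over a (user, service, time) QoS tensor: an MLP generalizes the multilinear Tucker product, and multi-head self-attention mixes the three mode embeddings. The dimensions and the exact placement of attention are assumptions, not the paper's architecture.

```python
# Toy sketch of a neural Tucker-style model for a (user, service, time) QoS tensor.
# Architecture details (dims, where attention sits) are assumptions.
import torch
import torch.nn as nn

class NeuralTucker(nn.Module):
    def __init__(self, n_users, n_services, n_times, d=32, heads=4):
        super().__init__()
        self.U = nn.Embedding(n_users, d)      # one latent factor matrix per mode
        self.S = nn.Embedding(n_services, d)
        self.T = nn.Embedding(n_times, d)
        # Multi-head self-attention across the three mode embeddings, standing in
        # for the paper's self-attending module.
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        # An MLP generalizes the multilinear Tucker product nonlinearly.
        self.mlp = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU(), nn.Linear(d, 1))

    def forward(self, u, s, t):
        x = torch.stack([self.U(u), self.S(s), self.T(t)], dim=1)  # (B, 3, d)
        x, _ = self.attn(x, x, x)               # nonlinear interaction across modes
        return self.mlp(x.flatten(1)).squeeze(-1)  # predicted QoS value

# Train on observed entries only, since the tensor is high-dimensional and incomplete.
model = NeuralTucker(n_users=100, n_services=200, n_times=64)
u = torch.randint(0, 100, (8,)); s = torch.randint(0, 200, (8,)); t = torch.randint(0, 64, (8,))
loss = nn.functional.mse_loss(model(u, s, t), torch.rand(8))
```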

Sentiment Analysis in Twitter Social Network Centered on Cryptocurrencies Using Machine Learning arxiv.org/abs/2501.09777 .CL

Cryptocurrency is a digital currency that uses blockchain technology with secure encryption. Because these currencies are decentralized, they can affect traditional monetary systems and each country's capital market, and thereby society at large. Understanding public opinion on this subject therefore matters, and social networks are a rich source of such opinions. Twitter is one of the main platforms where users discuss a wide range of topics, so community sentiment can be measured there quickly and at low cost. Twitter Sentiment Analysis (TSA) is the field that analyzes the sentiment expressed in tweets. Since most TSA research on cryptocurrencies has focused on English, the purpose of this paper is to investigate the opinions of Iranian users on Twitter about cryptocurrencies and to find the best model for classifying the sentiment of their tweets. With automatic tweet analysis, economic policymakers and officials can learn the general public's view of this issue and use that information to manage the phenomenon properly. To build the sentiment classification models, natural language processing techniques such as bag-of-words (BOW) and FastText were used for text vectorization; classical machine learning algorithms (KNN, SVM, and AdaBoost) and deep learning methods (LSTM and BERT) were used for classification. The BERT language model achieved the best accuracy, at 83.50%.
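
Of the methods listed, the classical branch is easy to sketch: bag-of-words vectorization feeding an SVM, here via scikit-learn. The two Persian tweets and their labels are invented placeholders; the paper's actual data, preprocessing, and the FastText/LSTM/BERT variants are not shown.

```python
# Minimal sketch of the classical pipeline: BOW vectorization + SVM classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical Persian tweets with made-up sentiment labels.
tweets = ["بیت کوین عالی است", "این ارز دیجیتال سقوط کرد"]
labels = ["positive", "negative"]

clf = make_pipeline(CountVectorizer(), LinearSVC())
clf.fit(tweets, labels)
print(clf.predict(["بیت کوین سقوط کرد"]))  # classify an unseen tweet
```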

VideoWorld: Exploring Knowledge Learning from Unlabeled Videos arxiv.org/abs/2501.09781 .CV

This work explores whether a deep generative model can learn complex knowledge solely from visual input, in contrast to the prevalent focus on text-based models like large language models (LLMs). We develop VideoWorld, an auto-regressive video generation model trained on unlabeled video data, and test its knowledge acquisition abilities in video-based Go and robotic control tasks. Our experiments reveal two key findings: (1) video-only training provides sufficient information for learning knowledge, including rules, reasoning and planning capabilities, and (2) the representation of visual change is crucial for knowledge acquisition. To improve both the efficiency and efficacy of this process, we introduce the Latent Dynamics Model (LDM) as a key component of VideoWorld. Remarkably, VideoWorld reaches a 5-dan professional level on the Video-GoBench with just a 300-million-parameter model, without relying on search algorithms or reward mechanisms typical in reinforcement learning. In robotic tasks, VideoWorld effectively learns diverse control operations and generalizes across environments, approaching the performance of oracle models in CALVIN and RLBench. This study opens new avenues for knowledge acquisition from visual data, with all code, data, and models open-sourced for further research.
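
The abstract does not spell out VideoWorld's architecture, but the general recipe of autoregressive next-token prediction over discretized video can be sketched as below; the tokenizer, model sizes, and training details are all assumptions, and the Latent Dynamics Model itself is not reproduced.

```python
# Generic sketch of autoregressive prediction over discrete video tokens.
# An assumption-laden illustration, not VideoWorld's actual design.
import torch
import torch.nn as nn

class TinyVideoAR(nn.Module):
    def __init__(self, vocab=1024, d=256, layers=4, heads=4, ctx=256):
        super().__init__()
        self.tok = nn.Embedding(vocab, d)
        self.pos = nn.Embedding(ctx, d)
        block = nn.TransformerEncoderLayer(d, heads, 4 * d, batch_first=True)
        self.core = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(d, vocab)

    def forward(self, tokens):  # tokens: (B, T) codes from a hypothetical video tokenizer
        T = tokens.size(1)
        mask = nn.Transformer.generate_square_subsequent_mask(T)  # causal mask
        x = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        return self.head(self.core(x, mask=mask))

model = TinyVideoAR()
codes = torch.randint(0, 1024, (2, 64))        # stand-in for tokenized frames
logits = model(codes[:, :-1])                  # predict each next visual token
loss = nn.functional.cross_entropy(logits.reshape(-1, 1024), codes[:, 1:].reshape(-1))
```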

Unveiling Behavioral Differences in Bilingual Information Operations: A Network-Based Approach arxiv.org/abs/2501.09027 .SI

Twitter has become a pivotal platform for conducting information operations (IOs), particularly during high-stakes political events. In this study, we analyze over a million tweets about the 2024 U.S. presidential election to explore an under-studied area: the behavioral differences of IO drivers from English- and Spanish-speaking communities. Using similarity graphs constructed from behavioral patterns, we identify IO drivers in both languages and evaluate the clustering quality of these graphs in an unsupervised setting. Our analysis demonstrates how different network dismantling strategies, such as node pruning and edge filtering, can impact clustering quality and the identification of coordinated IO drivers. We also reveal significant differences in the topics and political indicators between English and Spanish IO drivers. Additionally, we investigate bilingual users who post in both languages, systematically uncovering their distinct roles and behaviors compared to monolingual users. These findings underscore the importance of robust, culturally and linguistically adaptable IO detection methods to mitigate the risks of influence campaigns on social media. Our code and data are available on GitHub: https://github.com/bowenyi-pierre/humans-lab-hackathon-24.
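
As a toy version of the graph workflow described above: build a user-user graph from a behavioral similarity matrix, filter weak edges, prune low-degree nodes, and cluster what remains. The random matrix, the 0.9 threshold, and Louvain clustering are placeholders for the paper's actual similarity measures and dismantling strategies.

```python
# Sketch: similarity graph -> edge filtering -> node pruning -> clustering.
import networkx as nx
import numpy as np
from networkx.algorithms.community import louvain_communities

rng = np.random.default_rng(0)
sim = rng.random((50, 50))
sim = (sim + sim.T) / 2  # symmetric stand-in for a behavioral similarity matrix

G = nx.Graph()
for i in range(50):
    for j in range(i + 1, 50):
        if sim[i, j] > 0.9:  # edge filtering: keep only strong behavioral ties
            G.add_edge(i, j, weight=sim[i, j])

# Node pruning: drop weakly connected accounts before clustering.
G.remove_nodes_from([n for n, deg in dict(G.degree()).items() if deg < 2])
communities = louvain_communities(G, weight="weight", seed=0)
print(len(communities), "candidate coordinated clusters")
```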

Distributed Identity for Zero Trust and Segmented Access Control: A Novel Approach to Securing Network Infrastructure arxiv.org/abs/2501.09032 .CR .CY

"Distributed Identity" refers to the transition from centralized identity systems using Decentralized Identifiers (DID) and Verifiable Credentials (VC) for secure and privacy-preserving authentications. With distributed identity, control of identity data is returned to the user, making credential-based attacks impossible due to the lack of a single point of failure. This study assesses the security improvements achieved when distributed identity is employed with the ZTA principle, particularly concerning lateral movements within segmented networks. It also considers areas such as the implementation specifications of the framework, the advantages and disadvantages of the method to organizations, and the issues of compatibility and generalizability. Furthermore, the study highlights privacy and regulatory compliance, including the General Data Protection Regulation (GDPR) and California Consumer Data Privacy Act (CCPA), analyzing potential solutions to these problems. The study implies that adopting distributed identities can enhance overall security postures by an order of magnitude, providing contextual and least-privilege authorization and user privacy. The research recommends refining technical standards, expanding the use of distributed identity in practice, and discussing its applications for the contemporary digital security landscape.

SCOT: Self-Supervised Contrastive Pretraining For Zero-Shot Compositional Retrieval arxiv.org/abs/2501.08347 .CV .AI

Compositional image retrieval (CIR) is a multimodal learning task where a model combines a query image with a user-provided text modification to retrieve a target image. CIR finds applications in a variety of domains including product retrieval (e-commerce) and web search. Existing methods primarily focus on fully-supervised learning, wherein models are trained on datasets of labeled triplets such as FashionIQ and CIRR. This poses two significant challenges: (i) curating such triplet datasets is labor intensive; and (ii) models lack generalization to unseen objects and domains. In this work, we propose SCOT (Self-supervised COmpositional Training), a novel zero-shot compositional pretraining strategy that combines existing large image-text pair datasets with the generative capabilities of large language models to contrastively train an embedding composition network. Specifically, we show that the text embedding from a large-scale contrastively-pretrained vision-language model can be utilized as proxy target supervision during compositional pretraining, replacing the target image embedding. In zero-shot settings, this strategy surpasses SOTA zero-shot compositional retrieval methods as well as many fully-supervised methods on standard benchmarks such as FashionIQ and CIRR.
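
A minimal sketch of the proxy-supervision idea, assuming frozen embeddings from a contrastively pretrained vision-language model (e.g., CLIP): a small composer network fuses the query-image and modification-text embeddings, and an InfoNCE loss pulls the composed vector toward the target caption's text embedding. The composer architecture and temperature are assumptions.

```python
# Sketch of compositional pretraining with text embeddings as proxy targets.
# Embeddings are random stand-ins for frozen CLIP-style features.
import torch
import torch.nn as nn
import torch.nn.functional as F

d = 512
composer = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))

img_emb = F.normalize(torch.randn(32, d), dim=-1)  # frozen query-image embeddings
mod_emb = F.normalize(torch.randn(32, d), dim=-1)  # modification-text embeddings
tgt_emb = F.normalize(torch.randn(32, d), dim=-1)  # target-caption embeddings (proxy)

composed = F.normalize(composer(torch.cat([img_emb, mod_emb], dim=-1)), dim=-1)
logits = composed @ tgt_emb.t() / 0.07             # InfoNCE over the batch
loss = F.cross_entropy(logits, torch.arange(32))   # match composed_i to target_i
```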

Weight Averaging for Out-of-Distribution Generalization and Few-Shot Domain Adaptation arxiv.org/abs/2501.08361 .CV .LG

Empirical risk minimization (ERM) is not robust to changes in the distribution of data. When the distribution of test data is different from that of training data, the problem is known as out-of-distribution generalization. Recently, two techniques have been developed for addressing out-of-distribution generalization in computer vision: weight averaging (WA) and sharpness-aware minimization (SAM). WA involves training multiple models with different hyperparameters and then averaging the weights of these models, which can significantly improve out-of-distribution generalization performance. SAM optimizes a neural network to find minima in flat regions, which have been proven to perform well under distribution shifts. While these techniques have made great progress, there is still room for improvement and further exploration. In this thesis, we propose increasing the model diversity in WA explicitly by introducing gradient similarity as a loss regularizer to further improve out-of-distribution generalization performance. We also propose combining WA and SAM to solve the problem of few-shot domain adaptation. Our extensive experiments on digits datasets (MNIST, SVHN, USPS, MNIST-M) and other domain adaptation datasets (VLCS, PACS) show that combining WA and SAM leads to improved out-of-distribution generalization performance and significantly increases few-shot domain adaptation accuracy.
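
The core WA operation is simple enough to sketch: element-wise averaging of parameters from independently trained models that share an architecture. The tiny linear models below are stand-ins for trained runs; the thesis's gradient-similarity regularizer and the combination with SAM are not shown.

```python
# Minimal weight-averaging sketch: average parameters across trained models.
import copy
import torch
import torch.nn as nn

models = [nn.Linear(10, 2) for _ in range(3)]  # stand-ins for independently trained runs

avg = copy.deepcopy(models[0])
with torch.no_grad():
    for name, param in avg.named_parameters():
        # Average each parameter tensor element-wise across the runs.
        stacked = torch.stack([dict(m.named_parameters())[name] for m in models])
        param.copy_(stacked.mean(dim=0))
```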
