
FinSphere: A Conversational Stock Analysis Agent Equipped with Quantitative Tools based on Real-Time Database arxiv.org/abs/2501.12399 q-fin.CP .AI .CL .IR

The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution arxiv.org/abs/2501.12407 .DC .LG

While ML model training and inference are both GPU-intensive, CPU-based data processing is often the bottleneck. Distributed data processing systems based on the batch or stream processing models assume homogeneous resource requirements. They excel at CPU-based computation but either under-utilize heterogeneous resources or impose high overheads on failure and reconfiguration. We introduce the streaming batch model, a hybrid of the two models that enables efficient and fault-tolerant heterogeneous execution. The key idea is to execute one partition at a time to allow lineage-based recovery with dynamic resource allocation. This enables memory-efficient pipelining across heterogeneous resources, similar to stream processing, but also offers the elasticity and fault tolerance properties of batch processing. We present Ray Data, an implementation of the streaming batch model that improves throughput on heterogeneous batch inference pipelines by 3-8× compared to traditional batch and stream processing systems. When training Stable Diffusion, Ray Data matches the throughput of single-node ML data loaders while additionally leveraging distributed heterogeneous clusters to further improve training throughput by 31%.
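A minimal, self-contained Python sketch of the key idea described above: each stage runs one partition at a time, and a lost partition is recomputed from its lineage (its input plus the deterministic stage functions) instead of restarting the whole job. This is an illustration of the streaming batch model, not Ray Data's actual API; the partitioning scheme, stage functions, and injected failure are all hypothetical, and true overlap of partitions across executors is elided for brevity.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Partition:
    index: int
    rows: list

def make_partitions(rows: list, num_partitions: int) -> List[Partition]:
    size = max(1, len(rows) // num_partitions)
    return [Partition(i, rows[i * size:(i + 1) * size]) for i in range(num_partitions)]

def run_pipeline(partitions: List[Partition],
                 stages: List[Callable[[list], list]],
                 fail_once_at=None) -> List[list]:
    """Run the stages one partition at a time; on failure, recompute just that partition
    from its lineage (the original input rows plus the deterministic stage functions)."""
    recovered = set()
    outputs = []
    for part in partitions:       # one partition in flight; a real system overlaps these
        rows = part.rows
        for stage_id, stage in enumerate(stages):
            try:
                if fail_once_at == (part.index, stage_id) and part.index not in recovered:
                    recovered.add(part.index)
                    raise RuntimeError("simulated executor failure")
                rows = stage(rows)
            except RuntimeError:
                rows = part.rows                     # lineage-based recovery: replay only
                for s in stages[:stage_id + 1]:      # this partition's stages so far
                    rows = s(rows)
        outputs.append(rows)
    return outputs

# Hypothetical heterogeneous stages: a CPU-bound preprocessing step and a "GPU" inference step.
decode = lambda rows: [r * 2 for r in rows]
infer = lambda rows: [r + 1 for r in rows]

parts = make_partitions(list(range(16)), num_partitions=4)
print(run_pipeline(parts, [decode, infer], fail_once_at=(2, 1)))   # failure recovered in place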


ImageRef-VL: Enabling Contextual Image Referencing in Vision-Language Models arxiv.org/abs/2501.12418 .CV .AI

Consolidating TinyML Lifecycle with Large Language Models: Reality, Illusion, or Opportunity? arxiv.org/abs/2501.12420 .SE .AI .LG

Tackling Small Sample Survival Analysis via Transfer Learning: A Study of Colorectal Cancer Prognosis arxiv.org/abs/2501.12421 q-bio.QM .LG .AI

Survival prognosis is crucial for medical informatics. Practitioners often confront small clinical datasets, especially for cancer patients, which can be insufficient to induce useful patterns for survival prediction. This study deals with small sample survival analysis by leveraging transfer learning, a machine learning technique that can enhance the target analysis with related knowledge pre-learned from other data. We propose and develop various transfer learning methods designed for common survival models. For parametric models such as DeepSurv, Cox-CC (Cox-based neural networks), and DeepHit (an end-to-end deep learning model), we apply standard transfer learning techniques such as pretraining and fine-tuning. For non-parametric models such as Random Survival Forest, we propose a new transfer survival forest (TSF) model that transfers tree structures from source tasks and fine-tunes them with target data. We evaluated the transfer learning methods on colorectal cancer (CRC) prognosis. The source data are 27,379 SEER CRC stage I patients, and the target data are 728 CRC stage I patients from the West China Hospital. When enhanced by transfer learning, Cox-CC's time-dependent concordance index (C^td) was boosted from 0.7868 to 0.8111, DeepHit's from 0.8085 to 0.8135, DeepSurv's from 0.7722 to 0.8043, and RSF's from 0.7940 to 0.8297 (the highest performance). All models trained with as few as 50 samples showed even larger improvements. Conclusions: existing survival models used for cancer prognosis can be enhanced by properly designed transfer learning techniques. The source code used in this study is available at https://github.com/YonghaoZhao722/TSF.
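A minimal PyTorch sketch of the pretraining-and-fine-tuning recipe the abstract applies to DeepSurv-style models: pretrain a neural Cox risk network on the large source cohort, then freeze part of it and fine-tune on the small target cohort. The architecture, freezing scheme, learning rates, and epoch counts are illustrative assumptions, not the paper's settings.

import torch
import torch.nn as nn

def cox_partial_likelihood_loss(risk, time, event):
    """Negative Cox partial log-likelihood (Breslow-style, ties ignored)."""
    order = torch.argsort(time, descending=True)      # descending time so the cumulative
    risk, event = risk[order], event[order]           # logsumexp spans each risk set
    log_risk_set = torch.logcumsumexp(risk, dim=0)
    return -((risk - log_risk_set) * event).sum() / event.sum().clamp(min=1.0)

def make_risk_net(in_dim, hidden=32):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, 1))

def fit(net, x, time, event, lr, epochs):
    opt = torch.optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = cox_partial_likelihood_loss(net(x).squeeze(-1), time, event)
        loss.backward()
        opt.step()
    return net

def transfer(source, target, in_dim):
    net = fit(make_risk_net(in_dim), *source, lr=1e-3, epochs=200)  # pretrain on source cohort
    for p in net[0].parameters():                                   # freeze the first layer,
        p.requires_grad = False                                     # adapt the rest
    return fit(net, *target, lr=1e-4, epochs=100)                   # fine-tune on target cohort

# Tiny synthetic example (shapes only; real use would pass SEER-style source data and the
# small hospital target cohort as (features, time, event) tuples).
x_s, x_t = torch.randn(500, 10), torch.randn(50, 10)
src = (x_s, torch.rand(500), torch.randint(0, 2, (500,)).float())
tgt = (x_t, torch.rand(50), torch.randint(0, 2, (50,)).float())
model = transfer(src, tgt, in_dim=10)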


A Web-Based IDE for DevOps Learning in Software Engineering Higher Education arxiv.org/abs/2501.10363 .CY .SE

AI-Enhanced Decision-Making for Sustainable Supply Chains: Reducing Carbon Footprints in the USA arxiv.org/abs/2501.10364 .CY

Can LLMs Identify Gaps and Misconceptions in Students' Code Explanations? arxiv.org/abs/2501.10365 .CY .AI .SE

Participatory Assessment of Large Language Model Applications in an Academic Medical Center arxiv.org/abs/2501.10366 .CY .AI .LG

GTDE: Grouped Training with Decentralized Execution for Multi-agent Actor-Critic arxiv.org/abs/2501.10367 .MA .AI

The Potential of Answer Classes in Large-scale Written Computer-Science Exams -- Vol. 2 arxiv.org/abs/2501.10368 .CY .AI

Students' answers to tasks provide a valuable source of information in teaching, as they result from applying cognitive processes to the learning content addressed in the task. Due to steadily increasing course sizes, analyzing student answers is frequently the only means of obtaining evidence about student performance. However, resources are often limited, and when evaluating exams the focus is solely on identifying correct or incorrect answers. This overlooks the value of analyzing incorrect answers, which can help improve teaching strategies or identify misconceptions to be addressed in the next cohort. In teacher training for secondary education, assessment guidelines are mandatory for every exam, including anticipated errors and misconceptions. We applied this concept to a university exam with 462 students and 41 tasks. For each task, the instructors developed answer classes -- classes of expected responses -- to which student answers were mapped during the exam correction process. The experiment resulted in a shift in mindset among the tutors and instructors responsible for the course: after initially having strong reservations about whether the significant additional effort would yield an appropriate benefit, they subsequently found the procedure extremely valuable. The concept presented and the experience gained from the experiment were cast into a system for correcting paper-based exams on the basis of answer classes. This updated version of the paper provides an overview of the digital version of the approach and the new potential it opens up.
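A small hypothetical Python sketch of what mapping answers to answer classes can look like as a data structure: each task has predefined classes with partial credit, each student answer is assigned one class during correction, and the per-class tallies surface the misconceptions worth addressing in the next cohort. The classes, points, and student labels below are invented for illustration and are not taken from the paper's exam.

from collections import Counter
from dataclasses import dataclass

@dataclass
class AnswerClass:
    label: str        # e.g. "correct" or an anticipated misconception
    points: float     # partial credit awarded for answers in this class

# Illustrative answer classes for a single task.
classes = {
    "A": AnswerClass("correct solution", 4.0),
    "B": AnswerClass("anticipated misconception: off-by-one in loop bound", 2.0),
    "C": AnswerClass("other / unanticipated answer", 0.0),
}

# During correction, each student answer is assigned one class label by the tutor.
graded = [("s1", "A"), ("s2", "B"), ("s3", "B"), ("s4", "C"), ("s5", "A")]

tally = Counter(label for _, label in graded)
for label, count in tally.most_common():
    print(f"{classes[label].label}: {count} students, {classes[label].points} points each")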


Creative Loss: Ambiguity, Uncertainty and Indeterminacy arxiv.org/abs/2501.10369 .CY .AI .HC .LG

Harnessing Large Language Models for Mental Health: Opportunities, Challenges, and Ethical Considerations arxiv.org/abs/2501.10370 .CY .AI .LG

Boosting Tool Use of Large Language Models via Iterative Reinforced Fine-Tuning arxiv.org/abs/2501.09766 .CL .AI .LG

iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use

Augmenting large language models (LLMs) with external tools is a promising approach to enhance their capabilities, especially for complex tasks. Synthesizing tool-use data through real-world simulations is an effective way to achieve this. However, our investigation reveals that training gains decay significantly as the amount of synthetic data increases: the model struggles to benefit from additional synthetic data, which cannot equip it with advanced tool-use capabilities in complex scenarios. Moreover, we discovered that this limitation usually manifests as a fragment deficiency (i.e., parameter errors) in responses. To this end, we propose an iterative reinforced fine-tuning strategy designed to alleviate this limitation. This strategy involves: (1) enhancing the diversity of synthetic responses through path exploration with Monte Carlo Tree Search; and (2) iteratively pinpointing the model's deficiencies by constructing fine-grained preference pairs and then applying preference optimization algorithms for targeted improvement. The experiments show that our method achieves 13.11% better performance than the same-size base model. It achieves an improvement of 6.5% in complex scenarios compared to the baseline, and it also outperforms larger open-source and closed-source models.
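A short sketch of step (2): one preference-optimization update on a (preferred, dispreferred) response pair, using the standard DPO objective as a stand-in for whichever preference optimization algorithm the authors actually use. The MCTS-based response exploration and the deficiency localization are elided, and the log-probability values below are made-up placeholders rather than real model outputs.

import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO objective: push the policy to prefer the chosen response relative to a frozen reference."""
    chosen_ratio = policy_logp_w - ref_logp_w      # log pi(y_w|x) - log pi_ref(y_w|x)
    rejected_ratio = policy_logp_l - ref_logp_l    # log pi(y_l|x) - log pi_ref(y_l|x)
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy sequence-level log-probabilities for a batch of two preference pairs (assumed values).
policy_logp_w = torch.tensor([-12.3, -9.8], requires_grad=True)
policy_logp_l = torch.tensor([-11.9, -10.4], requires_grad=True)
ref_logp_w = torch.tensor([-12.5, -10.0])
ref_logp_l = torch.tensor([-11.5, -10.1])

loss = dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l)
loss.backward()   # in practice the gradients flow into the policy model's parameters
print(float(loss))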
