
FinSphere: A Conversational Stock Analysis Agent Equipped with Quantitative Tools based on Real-Time Database arxiv.org/abs/2501.12399 q-fin.CP .AI .CL .IR

The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution arxiv.org/abs/2501.12407 .DC .LG

While ML model training and inference are both GPU-intensive, CPU-based data processing is often the bottleneck. Distributed data processing systems based on the batch or stream processing models assume homogeneous resource requirements. They excel at CPU-based computation but either under-utilize heterogeneous resources or impose high overheads on failure and reconfiguration. We introduce the streaming batch model, a hybrid of the two models that enables efficient and fault-tolerant heterogeneous execution. The key idea is to execute one partition at a time to allow lineage-based recovery with dynamic resource allocation. This enables memory-efficient pipelining across heterogeneous resources, similar to stream processing, but also offers the elasticity and fault tolerance properties of batch processing. We present Ray Data, an implementation of the streaming batch model that improves throughput on heterogeneous batch inference pipelines by 3-8× compared to traditional batch and stream processing systems. When training Stable Diffusion, Ray Data matches the throughput of single-node ML data loaders while additionally leveraging distributed heterogeneous clusters to further improve training throughput by 31%.
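A minimal, self-contained Python sketch of the key idea described above: each stage runs one partition at a time, and a lost partition is recomputed from its lineage (its input plus the deterministic stage functions) instead of restarting the whole job. This is an illustration of the streaming batch model, not Ray Data's actual API; the partitioning scheme, stage functions, and injected failure are all hypothetical, and true overlap of partitions across executors is elided for brevity.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Partition:
    index: int
    rows: list

def make_partitions(rows: list, num_partitions: int) -> List[Partition]:
    size = max(1, len(rows) // num_partitions)
    return [Partition(i, rows[i * size:(i + 1) * size]) for i in range(num_partitions)]

def run_pipeline(partitions: List[Partition],
                 stages: List[Callable[[list], list]],
                 fail_once_at=None) -> List[list]:
    """Run the stages one partition at a time; on failure, recompute just that partition
    from its lineage (the original input rows plus the deterministic stage functions)."""
    recovered = set()
    outputs = []
    for part in partitions:       # one partition in flight; a real system overlaps these
        rows = part.rows
        for stage_id, stage in enumerate(stages):
            try:
                if fail_once_at == (part.index, stage_id) and part.index not in recovered:
                    recovered.add(part.index)
                    raise RuntimeError("simulated executor failure")
                rows = stage(rows)
            except RuntimeError:
                rows = part.rows                     # lineage-based recovery: replay only
                for s in stages[:stage_id + 1]:      # this partition's stages so far
                    rows = s(rows)
        outputs.append(rows)
    return outputs

# Hypothetical heterogeneous stages: a CPU-bound preprocessing step and a "GPU" inference step.
decode = lambda rows: [r * 2 for r in rows]
infer = lambda rows: [r + 1 for r in rows]

parts = make_partitions(list(range(16)), num_partitions=4)
print(run_pipeline(parts, [decode, infer], fail_once_at=(2, 1)))   # failure recovered in place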


ImageRef-VL: Enabling Contextual Image Referencing in Vision-Language Models arxiv.org/abs/2501.12418 .CV .AI

Consolidating TinyML Lifecycle with Large Language Models: Reality, Illusion, or Opportunity? arxiv.org/abs/2501.12420 .SE .AI .LG

Tackling Small Sample Survival Analysis via Transfer Learning: A Study of Colorectal Cancer Prognosis arxiv.org/abs/2501.12421 q-bio.QM .LG .AI

Survival prognosis is crucial for medical informatics. Practitioners often confront small clinical datasets, especially for cancer patients, which can be insufficient to induce useful patterns for survival prediction. This study deals with small sample survival analysis by leveraging transfer learning, a machine learning technique that can enhance the target analysis with related knowledge pre-learned from other data. We propose and develop various transfer learning methods designed for common survival models. For parametric models such as DeepSurv, Cox-CC (Cox-based neural networks), and DeepHit (an end-to-end deep learning model), we apply standard transfer learning techniques such as pretraining and fine-tuning. For non-parametric models such as Random Survival Forest, we propose a new transfer survival forest (TSF) model that transfers tree structures from source tasks and fine-tunes them with target data. We evaluated the transfer learning methods on colorectal cancer (CRC) prognosis. The source data are 27,379 SEER CRC stage I patients, and the target data are 728 CRC stage I patients from the West China Hospital. When enhanced by transfer learning, Cox-CC's time-dependent concordance index (C^td) was boosted from 0.7868 to 0.8111, DeepHit's from 0.8085 to 0.8135, DeepSurv's from 0.7722 to 0.8043, and RSF's from 0.7940 to 0.8297 (the highest performance). All models trained with as few as 50 samples showed even larger improvements. Conclusions: existing survival models used for cancer prognosis can be enhanced by properly designed transfer learning techniques. The source code used in this study is available at https://github.com/YonghaoZhao722/TSF.
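A minimal PyTorch sketch of the pretraining-and-fine-tuning recipe the abstract applies to DeepSurv-style models: pretrain a neural Cox risk network on the large source cohort, then freeze part of it and fine-tune on the small target cohort. The architecture, freezing scheme, learning rates, and epoch counts are illustrative assumptions, not the paper's settings.

import torch
import torch.nn as nn

def cox_partial_likelihood_loss(risk, time, event):
    """Negative Cox partial log-likelihood (Breslow-style, ties ignored)."""
    order = torch.argsort(time, descending=True)      # descending time so the cumulative
    risk, event = risk[order], event[order]           # logsumexp spans each risk set
    log_risk_set = torch.logcumsumexp(risk, dim=0)
    return -((risk - log_risk_set) * event).sum() / event.sum().clamp(min=1.0)

def make_risk_net(in_dim, hidden=32):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, 1))

def fit(net, x, time, event, lr, epochs):
    opt = torch.optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = cox_partial_likelihood_loss(net(x).squeeze(-1), time, event)
        loss.backward()
        opt.step()
    return net

def transfer(source, target, in_dim):
    net = fit(make_risk_net(in_dim), *source, lr=1e-3, epochs=200)  # pretrain on source cohort
    for p in net[0].parameters():                                   # freeze the first layer,
        p.requires_grad = False                                     # adapt the rest
    return fit(net, *target, lr=1e-4, epochs=100)                   # fine-tune on target cohort

# Tiny synthetic example (shapes only; real use would pass SEER-style source data and the
# small hospital target cohort as (features, time, event) tuples).
x_s, x_t = torch.randn(500, 10), torch.randn(50, 10)
src = (x_s, torch.rand(500), torch.randint(0, 2, (500,)).float())
tgt = (x_t, torch.rand(50), torch.randint(0, 2, (50,)).float())
model = transfer(src, tgt, in_dim=10)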


A Web-Based IDE for DevOps Learning in Software Engineering Higher Education arxiv.org/abs/2501.10363 .CY .SE

AI-Enhanced Decision-Making for Sustainable Supply Chains: Reducing Carbon Footprints in the USA arxiv.org/abs/2501.10364 .CY

Can LLMs Identify Gaps and Misconceptions in Students' Code Explanations? arxiv.org/abs/2501.10365 .CY .AI .SE

Participatory Assessment of Large Language Model Applications in an Academic Medical Center arxiv.org/abs/2501.10366 .CY .AI .LG

GTDE: Grouped Training with Decentralized Execution for Multi-agent Actor-Critic arxiv.org/abs/2501.10367 .MA .AI

The Potential of Answer Classes in Large-scale Written Computer-Science Exams -- Vol. 2 arxiv.org/abs/2501.10368 .CY .AI

Students' answers to tasks provide a valuable source of information in teaching, as they result from applying cognitive processes to the learning content addressed in the task. Due to steadily increasing course sizes, analyzing student answers is frequently the only means of obtaining evidence about student performance. However, resources are often limited, and when evaluating exams the focus is solely on identifying correct or incorrect answers. This overlooks the value of analyzing incorrect answers, which can help improve teaching strategies or identify misconceptions to be addressed in the next cohort. In teacher training for secondary education, assessment guidelines are mandatory for every exam, including anticipated errors and misconceptions. We applied this concept to a university exam with 462 students and 41 tasks. For each task, the instructors developed answer classes -- classes of expected responses -- to which student answers were mapped during the exam correction process. The experiment resulted in a shift in mindset among the tutors and instructors responsible for the course: after initially having strong reservations about whether the significant additional effort would yield an appropriate benefit, they subsequently found the procedure extremely valuable. The concept presented and the experience gained from the experiment were cast into a system for correcting paper-based exams on the basis of answer classes. This updated version of the paper provides an overview of the digital version of the approach and the new potential it opens up.
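A small hypothetical Python sketch of what mapping answers to answer classes can look like as a data structure: each task has predefined classes with partial credit, each student answer is assigned one class during correction, and the per-class tallies surface the misconceptions worth addressing in the next cohort. The classes, points, and student labels below are invented for illustration and are not taken from the paper's exam.

from collections import Counter
from dataclasses import dataclass

@dataclass
class AnswerClass:
    label: str        # e.g. "correct" or an anticipated misconception
    points: float     # partial credit awarded for answers in this class

# Illustrative answer classes for a single task.
classes = {
    "A": AnswerClass("correct solution", 4.0),
    "B": AnswerClass("anticipated misconception: off-by-one in loop bound", 2.0),
    "C": AnswerClass("other / unanticipated answer", 0.0),
}

# During correction, each student answer is assigned one class label by the tutor.
graded = [("s1", "A"), ("s2", "B"), ("s3", "B"), ("s4", "C"), ("s5", "A")]

tally = Counter(label for _, label in graded)
for label, count in tally.most_common():
    print(f"{classes[label].label}: {count} students, {classes[label].points} points each")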


Creative Loss: Ambiguity, Uncertainty and Indeterminacy arxiv.org/abs/2501.10369 .CY .AI .HC .LG

Harnessing Large Language Models for Mental Health: Opportunities, Challenges, and Ethical Considerations arxiv.org/abs/2501.10370 .CY .AI .LG

Boosting Tool Use of Large Language Models via Iterative Reinforced Fine-Tuning arxiv.org/abs/2501.09766 .CL .AI .LG

iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use

Augmenting large language models (LLMs) with external tools is a promising approach to enhance their capabilities, especially for complex tasks. Synthesizing tool-use data through real-world simulations is an effective way to achieve this. However, our investigation reveals that training gains decay significantly as the amount of synthetic data increases: the model struggles to benefit from additional synthetic data, which cannot equip it with advanced tool-use capabilities in complex scenarios. Moreover, we discovered that this limitation usually manifests as a fragment deficiency (i.e., parameter errors) in responses. To this end, we propose an iterative reinforced fine-tuning strategy designed to alleviate this limitation. This strategy involves: (1) enhancing the diversity of synthetic responses through path exploration with Monte Carlo Tree Search; and (2) iteratively pinpointing the model's deficiencies by constructing fine-grained preference pairs and then applying preference optimization algorithms for targeted improvement. The experiments show that our method achieves 13.11% better performance than the same-size base model. It achieves an improvement of 6.5% in complex scenarios compared to the baseline, and it also outperforms larger open-source and closed-source models.
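A short sketch of step (2): one preference-optimization update on a (preferred, dispreferred) response pair, using the standard DPO objective as a stand-in for whichever preference optimization algorithm the authors actually use. The MCTS-based response exploration and the deficiency localization are elided, and the log-probability values below are made-up placeholders rather than real model outputs.

import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO objective: push the policy to prefer the chosen response relative to a frozen reference."""
    chosen_ratio = policy_logp_w - ref_logp_w      # log pi(y_w|x) - log pi_ref(y_w|x)
    rejected_ratio = policy_logp_l - ref_logp_l    # log pi(y_l|x) - log pi_ref(y_l|x)
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy sequence-level log-probabilities for a batch of two preference pairs (assumed values).
policy_logp_w = torch.tensor([-12.3, -9.8], requires_grad=True)
policy_logp_l = torch.tensor([-11.9, -10.4], requires_grad=True)
ref_logp_w = torch.tensor([-12.5, -10.0])
ref_logp_l = torch.tensor([-11.5, -10.1])

loss = dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l)
loss.backward()   # in practice the gradients flow into the policy model's parameters
print(float(loss))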
