A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations arxiv.org/abs/2502.14881 cs.CR cs.CV

From 16-Bit to 1-Bit: Visual KV Cache Quantization for Memory-Efficient Multimodal Large Language Models arxiv.org/abs/2502.14882 cs.CV

Can LVLMs and Automatic Metrics Capture Underlying Preferences of Blind and Low-Vision Individuals for Navigational Aid? arxiv.org/abs/2502.14883 cs.CV cs.AI

SEM-CLIP: Precise Few-Shot Learning for Nanoscale Defect Detection in Scanning Electron Microscope Image arxiv.org/abs/2502.14884 cs.CV cs.LG

Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review arxiv.org/abs/2502.14886 cs.CV

Vision-Enhanced Time Series Forecasting via Latent Diffusion Models arxiv.org/abs/2502.14887 cs.CV cs.AI

DP-Adapter: Dual-Pathway Adapter for Boosting Fidelity and Text Consistency in Customizable Human Image Generation arxiv.org/abs/2502.13999 cs.GR

Human-Artificial Interaction in the Age of Agentic AI: A System-Theoretical Approach arxiv.org/abs/2502.14000 cs.MA cs.AI cs.HC

Rectified Lagrangian for Out-of-Distribution Detection in Modern Hopfield Networks arxiv.org/abs/2502.14003 cs.LG cs.AI

Inter3D: A Benchmark and Strong Baseline for Human-Interactive 3D Object Reconstruction arxiv.org/abs/2502.14004 cs.GR cs.LG

Smaller But Better: Unifying Layout Generation with Smaller Large Language Models arxiv.org/abs/2502.14005 cs.LG

Bi-Fact: A Bidirectional Factorization-based Evaluation of Intent Extraction from UI Trajectories arxiv.org/abs/2502.13149 cs.AI

Understanding Dynamic Diffusion Process of LLM-based Agents under Information Asymmetry arxiv.org/abs/2502.13160 cs.MA cs.AI

ShieldLearner: A New Paradigm for Jailbreak Attack Defense in LLMs arxiv.org/abs/2502.13162 cs.CR cs.AI cs.CL

Multi-Agent Actor-Critic Generative AI for Query Resolution and Analysis arxiv.org/abs/2502.13164 cs.MA cs.AI

HedgeAgents: A Balanced-aware Multi-agent Financial Trading System arxiv.org/abs/2502.13165 q-fin.TR cs.MA cs.AI

Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment arxiv.org/abs/2502.13170 cs.AI cs.LG

Web Phishing Net (WPN): A scalable machine learning approach for real-time phishing campaign detection arxiv.org/abs/2502.13171 cs.CR cs.AI cs.LG
