Real-time Control of Electric Autonomous Mobility-on-Demand Systems via Graph Reinforcement Learning. (arXiv:2311.05780v1 [eess.SY]) arxiv.org/abs/2311.05780

Are cascade dialogue state tracking models speaking out of turn in spoken dialogues? (arXiv:2311.04922v1 [cs.CL]) arxiv.org/abs/2311.04922

Is one brick enough to break the wall of spoken dialogue state tracking? (arXiv:2311.04923v1 [cs.CL]) arxiv.org/abs/2311.04923

A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving the child speech recognition. (arXiv:2311.04936v1 [cs.CL]) arxiv.org/abs/2311.04936

Interpretable Geoscience Artificial Intelligence (XGeoS-AI): Application to Demystify Image Recognition. (arXiv:2311.04940v1 [cs.CV]) arxiv.org/abs/2311.04940

CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric Medical Image Segmentation. (arXiv:2311.04942v1 [eess.IV]) arxiv.org/abs/2311.04942

Auto deep learning for bioacoustic signals. (arXiv:2311.04945v1 [cs.LG]) arxiv.org/abs/2311.04945

GPU-Accelerated WFST Beam Search Decoder for CTC-based Speech Recognition. (arXiv:2311.04996v1 [eess.AS]) arxiv.org/abs/2311.04996

Harmonic Retrieval Using Weighted Lifted-Structure Low-Rank Matrix Completion. (arXiv:2311.05003v1 [eess.SP]) arxiv.org/abs/2311.05003

Reinforcement Learning Generalization for Nonlinear Systems Through Dual-Scale Homogeneity Transformations. (arXiv:2311.05013v1 [eess.SY]) arxiv.org/abs/2311.05013

Joint Sensing and Semantic Communications with Multi-Task Deep Learning. (arXiv:2311.05017v1 [cs.NI]) arxiv.org/abs/2311.05017

Near field Exposure Assessment of Complex Anatomical Structures in 5G Bands. (arXiv:2311.03368v1 [physics.med-ph]) arxiv.org/abs/2311.03368

AI-based, automated chamber volumetry from gated, non-contrast CT. (arXiv:2311.03371v1 [physics.med-ph]) arxiv.org/abs/2311.03371

The Fundamental Limits of Light-Wave Sensing for Non-Contact Respiration Monitoring. (arXiv:2311.03377v1 [physics.med-ph]) arxiv.org/abs/2311.03377

Learning Disentangled Speech Representations. (arXiv:2311.03389v1 [eess.AS]) arxiv.org/abs/2311.03389

FPGA-QHAR: Throughput-Optimized for Quantized Human Action Recognition on The Edge. (arXiv:2311.03390v1 [cs.CV]) arxiv.org/abs/2311.03390

PowerFlowNet: Leveraging Message Passing GNNs for Improved Power Flow Approximation. (arXiv:2311.03415v1 [cs.LG]) arxiv.org/abs/2311.03415

Personalizing Keyword Spotting with Speaker Information. (arXiv:2311.03419v1 [eess.AS]) arxiv.org/abs/2311.03419

Efficient and Low-Footprint Object Classification using Spatial Contrast. (arXiv:2311.03422v1 [cs.CV]) arxiv.org/abs/2311.03422

Combinatorial Hodge Theory in Simplicial Signal Processing -- DAFx2023 Lecture Notes. (arXiv:2311.03469v1 [eess.SP]) arxiv.org/abs/2311.03469