Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis. (arXiv:2106.08352v1 [eess.AS]) arxiv.org/abs/2106.08352

A Multi-Layered Approach for Measuring the Simulation-to-Reality Gap of Radar Perception for Autonomous Driving. (arXiv:2106.08372v1 [cs.RO]) arxiv.org/abs/2106.08372

Plane and Sample: Maximizing Information about Autonomous Vehicle Performance using Submodular Optimization. (arXiv:2106.08389v1 [cs.RO]) arxiv.org/abs/2106.08389

On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control. (arXiv:2106.08414v1 [cs.LG]) arxiv.org/abs/2106.08414

A Framework for Discovering Optimal Solutions in Photonic Inverse Design. (arXiv:2106.08419v1 [physics.optics]) arxiv.org/abs/2106.08419

Design and analysis of deployable clustered tensegrity cable domes. (arXiv:2106.08424v1 [eess.SY]) arxiv.org/abs/2106.08424

Pathological voice adaptation with autoencoder-based voice conversion. (arXiv:2106.08427v1 [cs.SD]) arxiv.org/abs/2106.08427

Optimal control of a 2D diffusion-advection process with a team of mobile actuators under jointly optimal guidance. (arXiv:2106.08429v1 [math.OC]) arxiv.org/abs/2106.08429

Assessment of Subjective and Objective Quality of Live Streaming Sports Videos. (arXiv:2106.08431v1 [eess.IV]) arxiv.org/abs/2106.08431

Co-Design of Free-Space Metasurface Optical Neuromorphic Classifiers for High Performance. (arXiv:2106.08435v1 [physics.optics]) arxiv.org/abs/2106.08435

PhyMask: Robust Sensing of Brain Activity and Physiological Signals During Sleep with an All-textile Eye Mask. (arXiv:2106.07645v1 [eess.SP]) arxiv.org/abs/2106.07645

Using heterogeneity in semi-supervised transcription hypotheses to improve code-switched speech recognition. (arXiv:2106.07699v1 [cs.CL]) arxiv.org/abs/2106.07699

CathAI: Fully Automated Interpretation of Coronary Angiograms Using Neural Networks. (arXiv:2106.07708v1 [cs.LG]) arxiv.org/abs/2106.07708

Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts. (arXiv:2106.07716v1 [cs.CL]) arxiv.org/abs/2106.07716

Learning Audio-Visual Dereverberation. (arXiv:2106.07732v1 [cs.SD]) arxiv.org/abs/2106.07732

CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition. (arXiv:2106.07734v1 [cs.CL]) arxiv.org/abs/2106.07734

Unique sparse decomposition of low rank matrices. (arXiv:2106.07736v1 [math.OC]) arxiv.org/abs/2106.07736

Optical Wireless Satellite Networks versus Optical Fiber Terrestrial Networks: The Latency Perspective. (arXiv:2106.07737v1 [eess.SP]) arxiv.org/abs/2106.07737

Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition. (arXiv:2106.07759v1 [eess.AS]) arxiv.org/abs/2106.07759

Tracing Back Music Emotion Predictions to Sound Sources and Intuitive Perceptual Qualities. (arXiv:2106.07787v1 [cs.SD]) arxiv.org/abs/2106.07787
