Strategic Workforce Planning in Crowdsourced Delivery with Hybrid Driver Fleets. (arXiv:2311.17935v1 [eess.SY]) arxiv.org/abs/2311.17935

Logistics service providers (LSPs) increasingly consider using a crowdsourced workforce on the last mile to fulfill customers' expectations regarding same-day or on-demand delivery at reduced costs. The crowdsourced workforce's availability is, however, uncertain. Therefore, LSPs often hire additional fixed employees to perform deliveries when the availability of crowdsourced drivers is low. In this context, the reliability versus flexibility trade-off that LSPs face over a longer period, e.g., a year, remains unstudied. Against this background, we study a workforce planning problem that jointly considers fixed drivers (FDs) and the temporal development of the crowdsourced driver (CD) fleet over a long-term time horizon. We consider two types of CDs, gigworkers (GWs) and occasional drivers (ODs). While GWs are not sensitive to the request's destination and typically exhibit high availability, ODs only serve requests whose origin and destination coincide with their own private route's origin and destination. Moreover, to account for time horizon-specific dynamics, we consider stochastic turnover for both FDs and CDs as well as stochastic CD fleet growth. We formulate the resulting workforce planning problem as a Markov decision process (MDP) whose reward function reflects total costs, i.e., wages and operational costs arising from serving demand with FDs and CDs, and solve it via approximate dynamic programming (ADP). Applying our approach to an environment based on real-world demand data from GrubHub, we find that in fleets consisting of FDs and CDs, ADP-based hiring policies can outperform myopic hiring policies by up to 19% in total costs. In the studied setting, we observe that GWs reduce the LSP's total costs more than ODs. When we account for CDs' increased resignation probability when they are not matched with enough requests, the number of required FDs increases.
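
To make the stochastic turnover and fleet growth mentioned above concrete, here is a minimal sketch of a single state transition in such an MDP. It is not the authors' model: the `FleetState` fields, the independent per-driver resignation and joining probabilities, and the `applicant_pool` argument are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetState:
    fixed: int       # fixed drivers (FDs)
    gig: int         # gigworkers (GWs)
    occasional: int  # occasional drivers (ODs)


def count_successes(trials, prob, rng):
    """Number of independent Bernoulli(prob) successes out of `trials`."""
    return sum(1 for _ in range(trials) if rng.random() < prob)


def step(state, fd_hires, resign_prob, join_prob, applicant_pool, rng):
    """One period: drivers resign at random, FDs are hired, CDs join at random.

    Assumed (hypothetical) inputs:
      resign_prob    -- dict with keys "fixed", "gig", "occasional"
      join_prob      -- dict with keys "gig", "occasional"
      applicant_pool -- dict with keys "gig", "occasional" (potential new CDs)
    """
    def remaining(count, key):
        return count - count_successes(count, resign_prob[key], rng)

    def joining(key):
        return count_successes(applicant_pool[key], join_prob[key], rng)

    return FleetState(
        fixed=remaining(state.fixed, "fixed") + fd_hires,
        gig=remaining(state.gig, "gig") + joining("gig"),
        occasional=remaining(state.occasional, "occasional") + joining("occasional"),
    )
```

Only `fd_hires` is a decision here; a hiring policy (myopic or ADP-based) would choose it each period after observing the current `FleetState`.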

arxiv.org

Diagnostics Algorithms in Nuclear Plant Cyber Attack Analysis Toolkit. (arXiv:2311.17936v1 [eess.SY]) arxiv.org/abs/2311.17936

A Python interface is developed for the GPWR Simulator to automatically simulate cyber-spoofing of different steam generator parameters and plant operation. Specifically, steam generator water level, feedwater flowrate, steam flowrate, valve position, and steam generator controller parameters, including controller gain and time constant, can be directly attacked using command injection, denial-of-service, and man-in-the-middle type attacks. Plant operation can be initialized to any of the initial conditions provided by the GPWR simulator. Several different diagnostics algorithms have been implemented for anomaly detection, including physics-based diagnostics with Kalman filtering, data-driven diagnostics, noise profiling, and online sensor validation. The industry-standard safety analysis code RELAP5 is also available as a part of the toolkit. Diagnostics algorithms are analyzed based on accuracy and efficiency. Our observations indicate that physics-based diagnostics with Kalman filtering are the most robust. An experimental quantum kernel has been added to the framework for preliminary testing. Our first impressions suggest that while quantum kernels can be accurate, their applicability, like that of any other kernel, is problem- and data-dependent, and they can be prone to overfitting.
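
As a rough illustration of Kalman-filter-based anomaly detection, the sketch below runs a scalar filter over a sensor signal and flags samples whose normalized innovation is implausibly large. It is not the toolkit's implementation: the random-walk signal model, the variances, and the threshold are assumptions, and the function name is hypothetical.

```python
import math


def kalman_anomaly_flags(measurements, process_var=1e-4, meas_var=0.01, threshold=3.0):
    """Return one boolean per measurement; True marks a suspected anomaly.

    Assumes a random-walk state model, so the prediction is simply the
    previous estimate with inflated variance.
    """
    x = measurements[0]  # initial state estimate
    p = meas_var         # initial estimate variance
    flags = [False]
    for z in measurements[1:]:
        # Predict: state carries over, uncertainty grows by the process noise.
        p_pred = p + process_var
        # Innovation (measurement residual) and its variance.
        innovation = z - x
        s = p_pred + meas_var
        is_anomaly = abs(innovation) / math.sqrt(s) > threshold
        flags.append(is_anomaly)
        if is_anomaly:
            # Skip the update so a spoofed value does not drag the estimate.
            p = p_pred
        else:
            k = p_pred / s  # Kalman gain
            x = x + k * innovation
            p = (1 - k) * p_pred
    return flags
```

For example, a water-level series that sits near 50 and briefly jumps to 55 would have only the jump flagged, provided the assumed noise variances are realistic for the sensor.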

Efficient Deep Speech Understanding at the Edge. (arXiv:2311.17065v1 [eess.AS]) arxiv.org/abs/2311.17065

Contemporary Speech Understanding (SU) involves a sophisticated pipeline: it captures real-time voice input and processes it with a deep neural network that has an encoder-decoder architecture enhanced by beam search. This network periodically assesses attention and Connectionist Temporal Classification (CTC) scores in its autoregressive output. This paper aims to enhance SU performance on edge devices with limited resources. It pursues two intertwined goals: accelerating on-device execution and efficiently handling inputs that surpass the on-device model's capacity. While these objectives are well-established, we introduce innovative solutions that specifically address SU's distinctive challenges: (1) late contextualization, which enables the parallel execution of a model's attentive encoder during input ingestion; (2) pilot decoding, which alleviates temporal load imbalances; and (3) autoregression offramps, which facilitate offloading decisions based on partial output sequences. Our techniques integrate with existing SU models, pipelines, and frameworks, and can be applied independently or in combination. Together, they constitute a hybrid solution for edge SU, exemplified by our prototype, XYZ. Evaluated on platforms equipped with 6-8 Arm cores, our system achieves State-of-the-Art (SOTA) accuracy, reducing end-to-end latency by 2x and halving offloading requirements.

Cluster trajectory of SOFA score in predicting mortality in sepsis. (arXiv:2311.17066v1 [q-bio.QM]) arxiv.org/abs/2311.17066

Objective: Sepsis is a life-threatening condition. The Sequential Organ Failure Assessment (SOFA) score is commonly used to assess organ dysfunction and predict ICU mortality, but it is taken as a static measurement and fails to capture dynamic changes. This study aims to investigate the relationship between dynamic changes in SOFA scores over the first 72 hours of ICU admission and patient outcomes. Design, setting, and participants: 3,253 patients in the Medical Information Mart for Intensive Care IV database who met the Sepsis-3 criteria and were admitted from the emergency department with at least 72 hours of ICU admission and full-active resuscitation status were analysed. Group-based trajectory modelling with dynamic time warping and k-means clustering identified distinct trajectory patterns in dynamic SOFA scores. The resulting clusters were subsequently compared using Python. Main outcome measures: Outcomes including hospital and ICU mortality, length of stay in hospital and ICU, and readmission during hospital stay were collected. Discharge time from ICU to wards and cut-offs at 7 and 14 days were taken. Results: Four clusters were identified: A (consistently low SOFA scores), B (rapid increase followed by a decline in SOFA scores), C (higher baseline scores with gradual improvement), and D (persistently elevated scores). Cluster D had the longest ICU and hospital stays and the highest ICU and hospital mortality. Discharge rates from ICU were similar for Clusters A and B, while Cluster C had initially comparable rates but a slower transition to ward. Conclusion: Monitoring dynamic changes in SOFA score is valuable for assessing sepsis severity and treatment responsiveness.
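
The dynamic time warping (DTW) distance used to compare trajectories can be sketched in a few lines. The example below is a simplification rather than the study's pipeline: it implements plain DTW and assigns a trajectory to the nearest of some given prototype trajectories, instead of running full k-means. The function names and any prototype values passed in are hypothetical.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Uses absolute difference as the local cost and allows match,
    insertion, and deletion steps.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned to an already-used b[j-1]
                cost[i][j - 1],      # b[j-1] aligned to an already-used a[i-1]
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def assign_cluster(trajectory, prototypes):
    """Return the key of the prototype with the smallest DTW distance."""
    return min(prototypes, key=lambda name: dtw_distance(trajectory, prototypes[name]))
```

Because DTW tolerates shifts in timing, two patients whose SOFA scores rise and fall with the same shape but a few hours apart would still be considered close.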

Deep convolutional encoder-decoder hierarchical neural networks for conjugate heat transfer surrogate modeling. (arXiv:2311.17068v1 [cs.CE]) arxiv.org/abs/2311.17068

Conjugate heat transfer (CHT) models are vital for the design of many engineering systems. However, high-fidelity CHT models are computationally intensive, which limits their use in applications such as design optimization, where hundreds to thousands of model evaluations are required. In this work, we develop a modular deep convolutional encoder-decoder hierarchical (DeepEDH) neural network, a novel deep-learning-based surrogate modeling methodology for computationally intensive CHT models. Leveraging convective temperature dependencies, we propose a two-stage temperature prediction architecture that couples velocity and temperature models. The proposed DeepEDH methodology is demonstrated by modeling the pressure, velocity, and temperature fields for a liquid-cooled cold-plate-based battery thermal management system with variable channel geometry. A computational model of the cold plate is developed and solved using the finite element method (FEM), generating a dataset of 1,500 simulations. The FEM results are transformed and scaled from unstructured to structured, image-like meshes to create training and test datasets. The DeepEDH methodology's performance is examined in relation to data scaling, training dataset size, and network depth. Our performance analysis covers the impact of the novel architecture, separate field models, output geometry masks, multi-stage temperature models, and optimizations of the hyperparameters and architecture. Furthermore, we quantify the influence of the CHT thermal boundary condition on surrogate model performance, highlighting improved temperature model performance with higher heat fluxes. Compared to other deep learning neural network surrogate models, such as U-Net and DenseED, the proposed DeepEDH methodology for CHT models exhibits up to a 65% enhancement in the coefficient of determination ($R^{2}$).
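
For reference, the coefficient of determination quoted above is commonly computed as one minus the ratio of residual to total sum of squares. The sketch below uses that standard definition; whether the paper evaluates it per field, per sample, or over masked pixels is not stated in the abstract, so the flat-list interface is an assumption.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

A perfect surrogate scores 1, and a model that always predicts the mean of the reference values scores 0.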

IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers. (arXiv:2311.17072v1 [cs.CV]) arxiv.org/abs/2311.17072

Generative training has been demonstrated to be powerful for building visual-language models. However, on zero-shot discriminative benchmarks, there is still a performance gap between models trained with generative and discriminative objectives. In this paper, we aim to narrow this gap by improving the efficacy of generative training on classification tasks, without any fine-tuning processes or additional modules. Specifically, we focus on narrowing the gap between the generative captioner and the CLIP classifier. We begin by analysing the predictions made by the captioner and classifier and observe that the caption generation inherits the distribution bias from the language model trained with pure text modality, making it less grounded on the visual signal. To tackle this problem, we redesign the scoring objective for the captioner to alleviate the distributional bias and focus on measuring the gain of information brought by the visual inputs. We further design a generative training objective to match the evaluation objective. We name the model trained and evaluated with these novel procedures the Information Gain (IG) captioner. We pretrain the models on the public LAION-5B dataset and perform a series of discriminative evaluations. For zero-shot classification on ImageNet, the IG captioner achieves $> 18\%$ improvements over the standard captioner, reaching performance comparable to the CLIP classifier. The IG captioner also demonstrates strong performance on zero-shot image-text retrieval tasks on MSCOCO and Flickr30K. We hope this paper inspires further research towards unifying generative and discriminative training procedures for visual-language models.
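
One plausible reading of "measuring the gain of information brought by the visual inputs" is to score each candidate caption by its image-conditioned log-probability minus its text-only log-probability. The abstract does not give the exact formula, so the sketch below is an assumption, and the function names and toy probabilities are hypothetical.

```python
import math


def information_gain_scores(cond_logp, prior_logp):
    """Score each caption by log P(caption | image) - log P(caption)."""
    return {caption: cond_logp[caption] - prior_logp[caption] for caption in cond_logp}


def classify(cond_logp, prior_logp):
    """Pick the caption whose likelihood the image raises the most."""
    scores = information_gain_scores(cond_logp, prior_logp)
    return max(scores, key=scores.get)


# Toy numbers: "dog" is a common caption in text, "axolotl" is rare.
cond = {"a photo of a dog": math.log(0.30), "a photo of an axolotl": math.log(0.20)}
prior = {"a photo of a dog": math.log(0.10), "a photo of an axolotl": math.log(0.01)}

likelihood_choice = max(cond, key=cond.get)
ig_choice = classify(cond, prior)
```

In this toy case the raw likelihood favours the frequent caption, while the information-gain score favours the caption that the image supports far beyond its text-only prior, which is the kind of language-prior bias the abstract describes.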

Self-Supervised Learning of Whole and Component-Based Semantic Representations for Person Re-Identification. (arXiv:2311.17074v1 [cs.CV]) arxiv.org/abs/2311.17074

Interactive Segmentation Models (ISMs) like the Segment Anything Model have significantly improved various computer vision tasks, yet their application to Person Re-identification (ReID) remains limited. On the other hand, existing semantic pre-training models for ReID often have limitations like predefined parsing ranges or coarse semantics. Additionally, ReID and Clothes-Changing ReID (CC-ReID) are usually treated separately due to their different domains. This paper investigates whether utilizing precise human-centric semantic representation can boost the ReID performance and improve the generalization among various ReID tasks. We propose SemReID, a self-supervised ReID model that leverages ISMs for adaptive part-based semantic extraction, contributing to the improvement of ReID performance. SemReID additionally refines its semantic representation through techniques such as image masking and KoLeo regularization. Evaluation across three types of ReID datasets -- standard ReID, CC-ReID, and unconstrained ReID -- demonstrates superior performance compared to state-of-the-art methods. In addition, recognizing the scarcity of large person datasets with fine-grained semantics, we introduce the novel LUPerson-Part dataset to assist ReID methods in acquiring the fine-grained part semantics for robust performance.

Compositional Chain-of-Thought Prompting for Large Multimodal Models. (arXiv:2311.17076v1 [cs.CV]) arxiv.org/abs/2311.17076

The combination of strong visual backbones and Large Language Model (LLM) reasoning has led to Large Multimodal Models (LMMs) becoming the current standard for a wide range of vision and language (VL) tasks. However, recent research has shown that even the most advanced LMMs still struggle to capture aspects of compositional visual reasoning, such as attributes and relationships between objects. One solution is to utilize scene graphs (SGs)--a formalization of objects and their relations and attributes that has been extensively used as a bridge between the visual and textual domains. Yet, scene graph data requires scene graph annotations, which are expensive to collect and thus not easily scalable. Moreover, fine-tuning an LMM based on SG data can lead to catastrophic forgetting of the pretraining objective. To overcome this, inspired by chain-of-thought methods, we propose Compositional Chain-of-Thought (CCoT), a novel zero-shot Chain-of-Thought prompting method that utilizes SG representations in order to extract compositional knowledge from an LMM. Specifically, we first generate an SG using the LMM, and then use that SG in the prompt to produce a response. Through extensive experiments, we find that the proposed CCoT approach not only improves LMM performance on several VL compositional benchmarks but also improves the performance of several popular LMMs on general multimodal benchmarks, without the need for fine-tuning or annotated ground-truth SGs.
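
The two-step procedure described above (generate an SG with the LMM, then place that SG in the prompt) can be sketched as follows. The prompt wording, the `lmm(image, prompt)` callable signature, and the stand-in model are all assumptions for illustration; the paper's actual prompts are not given in the abstract.

```python
from typing import Callable


def ccot_answer(lmm: Callable[[str, str], str], image: str, question: str) -> str:
    """Two-step prompting: scene graph first, then the answer.

    `lmm` is any callable taking (image, prompt) and returning text;
    this signature is hypothetical, not a real library API.
    """
    # Step 1: ask the model itself for a scene graph relevant to the question.
    sg_prompt = (
        "Generate a scene graph in JSON for this image, listing the objects, "
        f"attributes, and relationships relevant to the question: {question}"
    )
    scene_graph = lmm(image, sg_prompt)

    # Step 2: answer the question with the generated scene graph as context.
    answer_prompt = (
        f"Scene graph: {scene_graph}\n"
        "Use the image and the scene graph as context to answer.\n"
        f"Question: {question}"
    )
    return lmm(image, answer_prompt)


def stub_lmm(image: str, prompt: str) -> str:
    """Stand-in model for demonstration only; a real LMM call would go here."""
    if "Scene graph:" in prompt:
        return "answered with scene graph context"
    return '{"objects": [], "relationships": []}'
```

Since both steps call the same model without any weight updates, the approach stays zero-shot and needs no annotated scene graphs.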

I-MedSAM: Implicit Medical Image Segmentation with Segment Anything. (arXiv:2311.17081v1 [cs.CV]) arxiv.org/abs/2311.17081

With the development of Deep Neural Networks (DNNs), many efforts have been made to handle medical image segmentation. Traditional methods such as nnUNet train specific segmentation models on individual datasets. Many recent methods have been proposed to adapt the foundational Segment Anything Model (SAM) to medical image segmentation. However, they still focus on discrete representations to generate pixel-wise predictions, which are spatially inflexible and scale poorly to higher resolutions. In contrast, implicit methods learn continuous representations for segmentation, which is crucial for medical image segmentation. In this paper, we propose I-MedSAM, which leverages the benefits of both continuous representations and SAM to obtain better cross-domain ability and accurate boundary delineation. Since medical image segmentation needs to predict detailed segmentation boundaries, we design a novel adapter to enhance the SAM features with high-frequency information during Parameter-Efficient Fine-Tuning (PEFT). To convert the SAM features and coordinates into continuous segmentation output, we utilize Implicit Neural Representation (INR) to learn an implicit segmentation decoder. We also propose an uncertainty-guided sampling strategy for efficient learning of INR. Extensive evaluations on 2D medical image segmentation tasks show that our proposed method, with only 1.6M trainable parameters, outperforms existing discrete and continuous methods. The code will be released.
