arXiv Computer Science @arxiv_cs@qoto.org

Bot

I toot the arXiv feed for topics in Computer Science.

#ComputerScience #CS #Programming #SoftwareEngineering #Software #SoftwareDevelopment #Computers #Science #arXiv #News #PeerReview

Joined Jul 2018

2 Following 1.12K Followers

arXiv Computer Science @arxiv_cs@qoto.org

T2CI-GAN: Text to Compressed Image generation using Generative Adversarial Network. (arXiv:2210.03734v1 [cs.CV]) http://arxiv.org/abs/2210.03734

The problem of generating textual descriptions for visual data has gained research attention in recent years. In contrast, the problem of generating visual data from textual descriptions remains very challenging because it requires combining Natural Language Processing (NLP) and Computer Vision techniques. Existing methods use Generative Adversarial Networks (GANs) to generate uncompressed images from textual descriptions. However, in practice, most visual data are processed and transmitted in a compressed representation. Hence, the proposed work attempts to generate visual data directly in compressed form using Deep Convolutional GANs (DCGANs) to achieve storage and computational efficiency. We propose two GAN models for compressed image generation from text. The first model is trained directly on JPEG compressed DCT images (compressed domain) to generate compressed images from text descriptions. The second model is trained on RGB images (pixel domain) to generate the JPEG compressed DCT representation from text descriptions. The proposed models are tested on the open-source benchmark dataset Oxford-102 Flower images, using both RGB and JPEG compressed versions, and achieve state-of-the-art performance in the JPEG compressed domain. The code will be publicly released on GitHub after acceptance of the paper.

arXiv Computer Science @arxiv_cs@qoto.org

"Help Me Help the AI": Understanding How Explainability Can Support Human-AI Interaction. (arXiv:2210.03735v1 [cs.HC]) http://arxiv.org/abs/2210.03735

Despite the proliferation of explainable AI (XAI) methods, little is understood about end-users' explainability needs and behaviors around XAI explanations. To address this gap and contribute to understanding how explainability can support human-AI interaction, we conducted a mixed-methods study with 20 end-users of a real-world AI application, the Merlin bird identification app, and inquired about their XAI needs, uses, and perceptions. We found that participants desire practically useful information that can improve their collaboration with the AI, more so than technical system details. Relatedly, participants intended to use XAI explanations for various purposes beyond understanding the AI's outputs: calibrating trust, improving their task skills, changing their behavior to supply better inputs to the AI, and giving constructive feedback to developers. Finally, among existing XAI approaches, participants preferred part-based explanations that resemble human reasoning and explanations. We discuss the implications of our findings and provide recommendations for future XAI design.

arXiv Computer Science @arxiv_cs@qoto.org

Trustworthy clinical AI solutions: a unified review of uncertainty quantification in deep learning models for medical image analysis. (arXiv:2210.03736v1 [eess.IV]) http://arxiv.org/abs/2210.03736

The full acceptance of Deep Learning (DL) models in the clinical field is rather low with respect to the quantity of high-performing solutions reported in the literature. In particular, end users are reluctant to rely on the rough predictions of DL models. Uncertainty quantification methods have been proposed in the literature as a potential way to mitigate the rough decisions provided by the DL black box and thus increase the interpretability and acceptability of the result for the end user. In this review, we give an overview of the existing methods for quantifying the uncertainty associated with DL predictions. We focus on applications to medical image analysis, which present specific challenges due to the high dimensionality of images and their variable quality, as well as constraints associated with real-life clinical routine. We then discuss the evaluation protocols used to validate the relevance of uncertainty estimates. Finally, we highlight the open challenges of uncertainty quantification in the medical field.

arXiv Computer Science @arxiv_cs@qoto.org

Exploring Effectiveness of Explanations for Appropriate Trust: Lessons from Cognitive Psychology. (arXiv:2210.03737v1 [cs.HC]) http://arxiv.org/abs/2210.03737

The rapid development of Artificial Intelligence (AI) requires developers and designers of AI systems to focus on the collaboration between humans and machines. AI explanations of system behavior and reasoning are vital for effective collaboration by fostering appropriate trust, ensuring understanding, and addressing issues of fairness and bias. However, various contextual and subjective factors can influence an AI system explanation's effectiveness. This work draws inspiration from findings in cognitive psychology to understand how effective explanations can be designed. We identify four components to which explanation designers can pay special attention: perception, semantics, intent, and user & context. We illustrate the use of these four explanation components with an example of estimating food calories by combining text with visuals, probabilities with exemplars, and intent communication with both user and context in mind. We propose that the significant challenge for effective AI explanations is an additional step between explanation generation using algorithms not producing interpretable explanations and explanation communication. We believe this extra step will benefit from carefully considering the four explanation components outlined in our work, which can positively affect the explanation's effectiveness.

arXiv Computer Science @arxiv_cs@qoto.org

Dual-Stage Deeply Supervised Attention-based Convolutional Neural Networks for Mandibular Canal Segmentation in CBCT Scans. (arXiv:2210.03739v1 [eess.IV]) http://arxiv.org/abs/2210.03739

Accurate segmentation of mandibular canals in lower jaws is important in dental implantology, in which the implant position and dimensions are currently determined manually from 3D CT images by medical experts to avoid damaging the mandibular nerve inside the canal. In this paper, we propose a novel dual-stage deep learning based scheme for automatic detection of the mandibular canal. In particular, we first enhance the CBCT scans by employing a novel histogram-based dynamic windowing scheme, which improves the visibility of the mandibular canals. After enhancement, we design a 3D deeply supervised attention U-Net architecture to localize the volume of interest (VOI) that contains the mandibular canals (i.e., the left and right canals). Finally, we employ a multi-scale input residual U-Net architecture (MS-R-UNet) to accurately segment the mandibular canals. The proposed method has been rigorously evaluated on 500 scans, and the results demonstrate that our technique outperforms the existing state-of-the-art methods in terms of segmentation performance as well as robustness.

arXiv Computer Science @arxiv_cs@qoto.org

Equivalent Circuit Modeling and Analysis of Metamaterial Based Wireless Power Transfer. (arXiv:2210.03740v1 [eess.SY]) http://arxiv.org/abs/2210.03740

In this study, an equivalent circuit model is presented to emulate the behavior of a metamaterial-based wireless power transfer system. For this purpose, the electromagnetic field simulation of the proposed system is conducted in the ANSYS high frequency structure simulator. In addition, a numerical analysis of the proposed structure is explored to evaluate its transfer characteristics. The power transfer efficiency of the proposed structure is represented by the transmission scattering parameter. While some methods, including interference theory and effective medium theory, have been exploited to explain the physical mechanism of MM-based WPT systems, some of the reactive parameters and the basic physical interpretation have not been clearly expounded. In contrast to existing theoretical models, the proposed approach focuses on the effect of the system parameters and transfer coils on the system transfer characteristics and on its effectiveness in analyzing complex circuits. The numerical solution of the system transfer characteristics, including the scattering parameter and power transfer efficiency, is computed in Matlab. The calculation results based on numerical estimation validate the full-wave electromagnetic simulation results, effectively verifying the accuracy of the analytical model.

arXiv Computer Science @arxiv_cs@qoto.org

Modeling and Analysis of Grid Tied Combined Ultracapacitor Fuel Cell for Renewable Application. (arXiv:2210.03741v1 [eess.SY]) http://arxiv.org/abs/2210.03741

In this manuscript, the performance of an ultracapacitor fuel cell in grid connected mode is investigated. Voltage regulation of the ultracapacitor was achieved with a three level bidirectional DC-DC converter, which also carries power flow from the grid to the ultracapacitor. The choice of a bidirectional three level converter for voltage regulation is based on its inherently high efficiency, low harmonic profile and compact size. Using the model equations of the converter and grid connected inverter, derived using the switching function approach, the grid's direct and quadrature axes modulation indices, Md and Mq, respectively, were simulated in Matlab for both lagging and leading power factors. Moreover, the values of Md and Mq were used in a PLECS based simulation of the proposed model to determine the effect of power factor correction on the current and power injected into the grid.

arXiv Computer Science @arxiv_cs@qoto.org

Single Image Super-Resolution Based on Capsule Neural Networks. (arXiv:2210.03743v1 [eess.IV]) http://arxiv.org/abs/2210.03743

Single image super-resolution (SISR) is the process of obtaining a high-resolution version of a low-resolution image by increasing the number of pixels per unit area. It has been actively investigated by the research community because of the wide variety of real-world problems to which it can be applied, from aerial and satellite imaging to compressed image and video enhancement. Despite the improvements achieved by deep learning in the field, the vast majority of the networks used are based on traditional convolutions, with solutions focusing on going deeper and/or wider, and innovations coming from jointly employing successful concepts from other fields. In this work, we move beyond traditional convolutions and adopt the concept of capsules. Given their impressive results in both image classification and segmentation problems, we ask how suitable they are for SISR. We also observe that different solutions share most of their configurations, and argue that this trend leads to less exploration of network varieties. During our experiments, we test various strategies to improve results, ranging from new and different loss functions to changes in the capsule layers. Our network achieved good results with fewer convolutional-based layers, showing that capsules might be a concept worth applying to the image super-resolution problem.

arXiv Computer Science @arxiv_cs@qoto.org

ProGReST: Prototypical Graph Regression Soft Trees for Molecular Property Prediction. (arXiv:2210.03745v1 [q-bio.QM]) http://arxiv.org/abs/2210.03745

In this work, we propose the novel Prototypical Graph Regression Self-explainable Trees (ProGReST) model, which combines prototype learning, soft decision trees, and Graph Neural Networks. In contrast to other works, our model can be used to address various challenging tasks, including compound property prediction. In ProGReST, the rationale is obtained along with the prediction thanks to the model's built-in interpretability. Additionally, we introduce a new graph prototype projection to accelerate model training. Finally, we evaluate ProGReST on a wide range of chemical datasets for molecular property prediction and perform an in-depth analysis with chemical experts to evaluate the obtained interpretations. Our method achieves competitive results against state-of-the-art methods.

arXiv Computer Science @arxiv_cs@qoto.org

A deep learning approach to solve forward differential problems on graphs. (arXiv:2210.03746v1 [cs.LG]) http://arxiv.org/abs/2210.03746

We propose a novel deep learning (DL) approach to solve one-dimensional non-linear elliptic, parabolic, and hyperbolic problems on graphs. A system of physics-informed neural network (PINN) models is used to solve the differential equations by assigning each PINN model to a specific edge of the graph. Kirchhoff-Neumann (KN) nodal conditions are imposed in a weak form by adding a penalization term to the training loss function. Through the penalization term that imposes the KN conditions, PINN models associated with edges that share a node coordinate with each other to ensure continuity of the solution and of its directional derivatives computed along the respective edges. Using an individual PINN model for each edge of the graph allows our approach to fulfill the requirements for parallelization by enabling different PINN models to be trained on distributed compute resources. Numerical results show that the system of PINN models accurately approximates the solutions of the differential problems across the entire graph for a broad set of graph topologies.

arXiv Computer Science @arxiv_cs@qoto.org

Evaluating k-NN in the Classification of Data Streams with Concept Drift. (arXiv:2210.03119v1 [cs.LG]) http://arxiv.org/abs/2210.03119

Data streams are often defined as large amounts of data flowing continuously at high speed. Moreover, these data are likely subject to changes in data distribution, known as concept drift. Given all the reasons mentioned above, learning from streams is often online and under restrictions of memory consumption and run-time. Although many classification algorithms exist, most of the works published in the area use Naive Bayes (NB) and Hoeffding Trees (HT) as base learners in their experiments. This article proposes an in-depth evaluation of k-Nearest Neighbors (k-NN) as a candidate for classifying data streams subjected to concept drift. It also analyses the complexity in time and the two main parameters of k-NN, i.e., the number of nearest neighbors used for predictions (k), and window size (w). We compare different parameter values for k-NN and contrast it to NB and HT both with and without a drift detector (RDDM) in many datasets. We formulated and answered 10 research questions which led to the conclusion that k-NN is a worthy candidate for data stream classification, especially when the run-time constraint is not too restrictive.
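
To make the roles of the two parameters concrete, here is a minimal sketch of k-NN over a sliding window of the last w labelled instances, evaluated in a test-then-train fashion. The class name, the Euclidean distance, the majority vote, and the evaluation loop are assumptions chosen for illustration; the abstract does not specify the paper's implementation or its drift detector.

```python
from collections import Counter, deque
import math


class SlidingWindowKNN:
    """Hypothetical k-NN learner that keeps only the w most recent instances."""

    def __init__(self, k=3, w=100):
        self.k = k
        # A bounded deque drops the oldest instance once w items are stored,
        # which is how the model forgets outdated concepts.
        self.window = deque(maxlen=w)

    def predict(self, x):
        if not self.window:
            return None  # nothing seen yet
        # Sort stored instances by Euclidean distance and keep the k closest.
        nearest = sorted(self.window, key=lambda item: math.dist(item[0], x))
        votes = Counter(label for _, label in nearest[: self.k])
        return votes.most_common(1)[0][0]

    def learn(self, x, y):
        self.window.append((tuple(x), y))


def prequential_accuracy(stream, model):
    """Test-then-train: predict each instance before learning from it."""
    correct = total = 0
    for x, y in stream:
        pred = model.predict(x)
        if pred is not None:
            correct += int(pred == y)
            total += 1
        model.learn(x, y)
    return correct / total if total else 0.0
```

Each prediction scans the whole window, so the cost per instance grows with w. This is in line with the abstract's remark that k-NN suits streams whose run-time constraint is not too restrictive.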

arXiv Computer Science @arxiv_cs@qoto.org

GBSVM: Granular-ball Support Vector Machine. (arXiv:2210.03120v1 [cs.LG]) http://arxiv.org/abs/2210.03120

GBSVM (Granular-ball Support Vector Machine) is an important attempt to construct a classifier that takes coarse-grained granular-balls, rather than data points, as input. It is the first classifier in the history of machine learning whose input contains no points, i.e., $x_i$. However, on the one hand, its dual model has not been derived, and the algorithm has not been implemented and cannot be applied. On the other hand, there are some errors in its existing model. To address these problems, this paper fixes the errors in the original GBSVM model and derives its dual model. Furthermore, an algorithm based on particle swarm optimization is designed to solve the dual model. The experimental results on the UCI benchmark datasets demonstrate that GBSVM has good robustness and efficiency.

arXiv Computer Science @arxiv_cs@qoto.org

Temporal Spatial Decomposition and Fusion Network for Time Series Forecasting. (arXiv:2210.03122v1 [cs.LG]) http://arxiv.org/abs/2210.03122

Feature engineering is required to obtain better results in time series forecasting, and decomposition is a crucial part of it. A single decomposition approach often cannot be used across many forecasting tasks because standard time series decomposition lacks flexibility and robustness. Traditional feature selection relies heavily on preexisting domain knowledge, has no generic methodology, and requires a lot of labor. At the same time, most time series prediction models based on deep learning suffer from interpretability issues, so their "black box" results lead to a lack of confidence. Addressing these issues is the motivation of this work. In this paper, we propose TSDFNet, a neural network with a self-decomposition mechanism and an attentive feature fusion mechanism. It abandons feature engineering as a preprocessing convention and instead integrates it as an internal module of the deep model. The self-decomposition mechanism gives TSDFNet extensible and adaptive decomposition capabilities for any time series: users can choose their own basis functions to decompose the sequence into temporal and generalized spatial dimensions. The attentive feature fusion mechanism captures the importance of external variables and their causality with the target variables. It automatically suppresses unimportant features while enhancing effective ones, so that users do not have to struggle with feature selection. Moreover, TSDFNet makes it easy to look into the "black box" of the deep neural network through feature visualization and to analyze the prediction results. We demonstrate performance improvements over existing widely accepted models on more than a dozen datasets, and three experiments showcase the interpretability of TSDFNet.

arXiv Computer Science @arxiv_cs@qoto.org

Enhancing Mixup-Based Graph Learning for Language Processing via Hybrid Pooling. (arXiv:2210.03123v1 [cs.LG]) http://arxiv.org/abs/2210.03123

Graph neural networks (GNNs) have recently become popular in natural language and programming language processing, particularly in text and source code classification. Graph pooling, which aggregates node representations into a representation of the entire graph that can be used for multiple downstream tasks (e.g., graph classification), is a crucial component of GNNs. Recently, to enhance graph learning, Manifold Mixup, a data augmentation strategy that mixes the graph data vectors after the pooling layer, has been introduced. However, since there is a range of graph pooling methods, how they affect the effectiveness of such a Mixup approach is unclear. In this paper, we take the first step to explore the influence of graph pooling methods on the effectiveness of the Mixup-based data augmentation approach. Specifically, 9 types of hybrid pooling methods are considered in the study, e.g., $\mathcal{M}_{sum}(\mathcal{P}_{att},\mathcal{P}_{max})$. The experimental results on both natural language datasets (Gossipcop, Politifact) and programming language datasets (Java250, Python800) demonstrate that hybrid pooling methods are more suitable for Mixup than the standard max pooling and the state-of-the-art graph multiset transformer (GMT) pooling, in terms of accuracy and robustness.
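
One plausible reading of $\mathcal{M}_{sum}(\mathcal{P}_{att},\mathcal{P}_{max})$ is an element-wise sum of an attention-pooled and a max-pooled graph vector, with Mixup then interpolating two pooled vectors. The sketch below follows that reading. The dot-product attention with a fixed `query` vector, the function names, and the plain-list representation of node features are all assumptions for illustration, not the paper's definitions.

```python
import math


def max_pool(nodes):
    """P_max: per-dimension maximum over all node feature vectors."""
    return [max(column) for column in zip(*nodes)]


def attention_pool(nodes, query):
    """P_att (assumed form): softmax-weighted sum of node vectors.

    Scores are dot products with a hypothetical query vector that a real
    model would learn during training.
    """
    scores = [sum(q * v for q, v in zip(query, node)) for node in nodes]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    norm = sum(weights)
    dims = len(nodes[0])
    return [
        sum((w / norm) * node[d] for w, node in zip(weights, nodes))
        for d in range(dims)
    ]


def hybrid_sum_pool(nodes, query):
    """M_sum(P_att, P_max): element-wise sum of the two pooled vectors."""
    att = attention_pool(nodes, query)
    mx = max_pool(nodes)
    return [a + b for a, b in zip(att, mx)]


def mixup(graph_a, graph_b, lam):
    """Manifold Mixup on pooled graph vectors: a convex combination."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(graph_a, graph_b)]
```

In a real pipeline the labels of the two graphs would be mixed with the same coefficient `lam`; that step is omitted here.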

arXiv Computer Science @arxiv_cs@qoto.org

Learning Transfer Operators by Kernel Density Estimation. (arXiv:2210.03124v1 [cs.LG]) http://arxiv.org/abs/2210.03124

Inference of transfer operators from data is often formulated as a classical problem that hinges on the Ulam method. The usual description, which we will call the Ulam-Galerkin method, is in terms of projection onto basis functions that are characteristic functions supported over a fine grid of rectangles. In these terms, the usual Ulam-Galerkin approach can be understood as density estimation by the histogram method. Here we show that the problem can be recast in the formalism of statistical density estimation. This recasting of the classical problem is a perspective that allows for an explicit and rigorous analysis of bias and variance, and therefore leads toward a discussion of the mean square error. Keywords: Transfer Operators; Frobenius-Perron operator; probability density estimation; Ulam-Galerkin method; Kernel Density Estimation.
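
The histogram view of the Ulam-Galerkin method can be sketched for a one-dimensional trajectory: partition the state space into equal bins, count observed transitions between bins, and normalise each row. The function name, the restriction to an interval `[lo, hi]`, and the handling of empty rows are assumptions for illustration; the paper's kernel density estimator is not reproduced here.

```python
def ulam_matrix(trajectory, n_bins, lo=0.0, hi=1.0):
    """Histogram estimate of a transfer operator from one trajectory.

    Entry [i][j] is the fraction of observed steps that start in bin i
    and land in bin j. Rows with no observations are left as zeros.
    """

    def bin_of(x):
        index = int((x - lo) / (hi - lo) * n_bins)
        return min(max(index, 0), n_bins - 1)  # clamp boundary points

    counts = [[0] * n_bins for _ in range(n_bins)]
    for current, following in zip(trajectory, trajectory[1:]):
        counts[bin_of(current)][bin_of(following)] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

Each non-empty row is a histogram of where points from one bin land after one step, which is the sense in which the method acts as a density estimator.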

arXiv Computer Science @arxiv_cs@qoto.org

Deep Inventory Management. (arXiv:2210.03137v1 [cs.LG]) http://arxiv.org/abs/2210.03137

We present a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching. While this dynamic program has historically been considered intractable, we show that several policy learning approaches are competitive with or outperform classical baseline approaches. In order to train these algorithms, we develop novel techniques to convert historical data into a simulator. We also present a model-based reinforcement learning procedure (Direct Backprop) to solve the dynamic periodic review inventory control problem by constructing a differentiable simulator. Under a variety of metrics Direct Backprop outperforms model-free RL and newsvendor baselines, in both simulations and real-world deployments.

arXiv Computer Science @arxiv_cs@qoto.org

On Distillation of Guided Diffusion Models. (arXiv:2210.03142v1 [cs.CV]) http://arxiv.org/abs/2210.03142

Classifier-free guided diffusion models have recently been shown to be highly effective at high-resolution image generation, and they have been widely used in large-scale diffusion frameworks including DALLE-2, Stable Diffusion and Imagen. However, a downside of classifier-free guided diffusion models is that they are computationally expensive at inference time since they require evaluating two diffusion models, a class-conditional model and an unconditional model, tens to hundreds of times. To deal with this limitation, we propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from: given a pre-trained classifier-free guided model, we first learn a single model to match the output of the combined conditional and unconditional models, and then we progressively distill that model into a diffusion model that requires far fewer sampling steps. For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model using as few as 4 sampling steps on ImageNet 64x64 and CIFAR-10, achieving FID/IS scores comparable to those of the original model while being up to 256 times faster to sample from. For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps, accelerating inference by at least 10-fold compared to existing methods on ImageNet 256x256 and LAION datasets. We further demonstrate the effectiveness of our approach on text-guided image editing and inpainting, where our distilled model is able to generate high-quality results using as few as 2-4 denoising steps.
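
The cost described above comes from combining two network outputs at every sampling step. A minimal sketch of that combination, using the commonly cited classifier-free guidance rule (1 + w) * conditional - w * unconditional, is given here. The function names, the guidance weight `w`, and the step counts in the usage lines are illustrative assumptions and are not taken from the paper.

```python
def guided_prediction(cond, uncond, w):
    """Combine conditional and unconditional model outputs.

    cond and uncond are the two predictions for the same noisy input,
    given as flat lists; w is the guidance weight (w = 0 recovers the
    purely conditional prediction).
    """
    return [(1.0 + w) * c - w * u for c, u in zip(cond, uncond)]


def network_evaluations(steps, models_per_step):
    """Total forward passes needed to draw one sample."""
    return steps * models_per_step


# Hypothetical step counts, chosen only to show the bookkeeping.
teacher_cost = network_evaluations(steps=256, models_per_step=2)
student_cost = network_evaluations(steps=4, models_per_step=1)
```

A distilled model saves in two ways: one forward pass per step replaces two, and the number of steps shrinks.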

arXiv Computer Science @arxiv_cs@qoto.org

Towards Out-of-Distribution Adversarial Robustness. (arXiv:2210.03150v1 [cs.LG]) http://arxiv.org/abs/2210.03150

Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fails to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different $L_p$ norms, we show that there is potential for improvement against many commonly used attacks by adopting a domain generalisation approach. Concretely, we treat each type of attack as a domain, and apply the Risk Extrapolation method (REx), which promotes similar levels of robustness against all training attacks. Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training. Moreover, we achieve superior performance on families or tunings of attacks only encountered at test time. On ensembles of attacks, our approach improves the accuracy from 3.4% (the best existing baseline) to 25.9% on MNIST, and from 16.9% to 23.5% on CIFAR10.
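
The idea of promoting similar robustness across attacks can be illustrated with the variance-penalised form of Risk Extrapolation: the sum of per-domain losses plus a multiple of their variance. This is a sketch under assumptions. The abstract does not say which REx variant or penalty weight the authors use, and the function name, the use of population variance, and the attack names in the comments are hypothetical.

```python
import statistics


def vrex_objective(domain_losses, beta):
    """Sum of per-domain losses plus beta times their variance.

    domain_losses holds one average loss per attack type, for example
    [loss_under_attack_a, loss_under_attack_b, loss_clean] (hypothetical).
    A large beta pushes the model toward equal loss on every attack.
    """
    total = sum(domain_losses)
    spread = statistics.pvariance(domain_losses)  # population variance (assumed)
    return total + beta * spread
```

When all domains have the same loss the penalty vanishes, so the objective reduces to ordinary risk minimisation over the pooled attacks.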

arXiv Computer Science @arxiv_cs@qoto.org

Integrative Imaging Informatics for Cancer Research: Workflow Automation for Neuro-oncology (I3CR-WANO). (arXiv:2210.03151v1 [eess.IV]) http://arxiv.org/abs/2210.03151

Efforts to utilize growing volumes of clinical imaging data to generate tumor evaluations continue to require significant manual data wrangling owing to the data heterogeneity. Here, we propose an artificial intelligence-based solution for the aggregation and processing of multisequence neuro-oncology MRI data to extract quantitative tumor measurements. Our end-to-end framework i) classifies MRI sequences using an ensemble classifier, ii) preprocesses the data in a reproducible manner, iii) delineates tumor tissue subtypes using convolutional neural networks, and iv) extracts diverse radiomic features. Moreover, it is robust to missing sequences and adopts an expert-in-the-loop approach, where the segmentation results may be manually refined by radiologists. Following the implementation of the framework in Docker containers, it was applied to two retrospective glioma datasets collected from the Washington University School of Medicine (WUSM; n = 384) and the M.D. Anderson Cancer Center (MDA; n = 30) comprising preoperative MRI scans from patients with pathologically confirmed gliomas. The scan-type classifier yielded an accuracy of over 99%, correctly identifying sequences from 380/384 and 30/30 sessions from the WUSM and MDA datasets, respectively. Segmentation performance was quantified using the Dice Similarity Coefficient between the predicted and expert-refined tumor masks. Mean Dice scores were 0.882 ($\pm$0.244) and 0.977 ($\pm$0.04) for whole tumor segmentation for WUSM and MDA, respectively. This streamlined framework automatically curated, processed, and segmented raw MRI data of patients with varying grades of gliomas, enabling the curation of large-scale neuro-oncology datasets and demonstrating a high potential for integration as an assistive tool in clinical practice.
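
For readers unfamiliar with the metric, the Dice Similarity Coefficient between two binary masks is twice the size of their overlap divided by the sum of their sizes. The sketch below computes it on flattened masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions, and the abstract does not describe how the authors handled that case.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks.

    pred and truth are equal-length sequences of 0/1 (or False/True)
    values, e.g. a flattened segmentation volume.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / size_sum
```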

arXiv Computer Science @arxiv_cs@qoto.org

Comparison of Missing Data Imputation Methods using the Framingham Heart study dataset. (arXiv:2210.03154v1 [cs.LG]) http://arxiv.org/abs/2210.03154

Cardiovascular disease (CVD) is a class of diseases that involve the heart or blood vessels and, according to the World Health Organization, is the leading cause of death worldwide. Electronic health record (EHR) data for CVD, as for medical cases in general, very frequently contain missing values. The percentage of missingness may vary and is linked with instrument errors, manual data entry procedures, etc. Even though the missing rate is usually significant, in many cases missing value imputation is handled poorly, either with case deletion or with simple statistical approaches such as mode and median imputation. These methods are known to introduce significant bias, since they do not account for the relationships between the dataset's variables. Within the medical framework, many datasets consist of lab tests or patient medical tests, where these relationships are present and strong. To address these limitations, in this paper we test and modify state-of-the-art missing value imputation methods based on Generative Adversarial Networks (GANs) and Autoencoders. The evaluation covers both the data imputation task and the post-imputation prediction task. On the imputation task, we achieve improvements of 0.20 and 7.00% in normalised Root Mean Squared Error (RMSE) and Area Under the Receiver Operating Characteristic Curve (AUROC), respectively. On the post-imputation prediction task, our models outperform the standard approaches by 2.50% in F1-score.
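
As a point of reference for the simple baselines mentioned above, this sketch shows median imputation of a single column and a normalised RMSE for scoring imputed values against known ground truth. The function names, the use of `None` for missing entries, and normalisation by the range of the true values are assumptions; several normalisations are in use and the abstract does not state which one the paper adopts.

```python
import math
import statistics


def median_impute(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in column]


def normalised_rmse(true_values, imputed_values):
    """RMSE divided by the range of the true values (assumed convention)."""
    squared = [(t - p) ** 2 for t, p in zip(true_values, imputed_values)]
    rmse = math.sqrt(sum(squared) / len(squared))
    value_range = max(true_values) - min(true_values)
    return rmse / value_range
```

Because the fill value ignores every other variable in the record, this baseline cannot exploit the strong relationships between lab tests that the abstract points to.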
