
iMedBot: A Web-based Intelligent Agent for Healthcare Related Prediction and Deep Learning. (arXiv:2210.05671v1 [cs.LG]) arxiv.org/abs/2210.05671

Background: Breast cancer is a multifactorial disease whose incidence is affected by both genetic and environmental factors. Breast cancer metastasis is one of the main causes of breast-cancer-related deaths, as reported by the American Cancer Society (ACS). Method: The iMedBot is a web application that we developed using the Python Flask web framework and deployed on Amazon Web Services. It contains a frontend and a backend. The backend is supported by a Python program we developed using the Keras and scikit-learn packages, which can learn deep feedforward neural network (DFNN) models. Results: The iMedBot provides two main services: 1. it can predict 5-, 10-, or 15-year breast cancer metastasis based on a set of clinical information provided by a user, using a set of pretrained DFNN models; and 2. it can train DFNN models for a user on a user-provided dataset. The trained model is evaluated using the AUC, and both the AUC value and the ROC curve are provided. Conclusion: The iMedBot web application provides a user-friendly interface for user-agent interaction in conducting personalized prediction and model training. It is an initial attempt to convert results of deep learning research into an online tool that may stimulate further research interest in this direction. Keywords: Deep learning, Breast cancer, Web application, Model training.
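
As a rough sketch of the kind of backend the abstract describes, the following illustrates training a Keras DFNN on a user-provided dataset and evaluating it with scikit-learn's AUC. The file name, column names, and architecture are placeholder assumptions, not the authors' actual configuration.

```python
# Minimal sketch of a DFNN-training backend of the kind described above,
# using Keras and scikit-learn. File name, columns, and architecture are
# illustrative assumptions, not the authors' actual configuration.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from tensorflow import keras

df = pd.read_csv("user_dataset.csv")          # hypothetical user-provided file
X = df.drop(columns=["metastasis"]).values    # hypothetical label column
y = df["metastasis"].values

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Deep feedforward neural network (DFNN): stacked Dense layers.
model = keras.Sequential([
    keras.layers.Input(shape=(X.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X_tr, y_tr, epochs=50, batch_size=32, verbose=0)

# Evaluate with AUC, as the service reports back to the user.
auc = roc_auc_score(y_te, model.predict(X_te).ravel())
print(f"AUC = {auc:.3f}")
```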

Performance Deterioration of Deep Learning Models after Clinical Deployment: A Case Study with Auto-segmentation for Definitive Prostate Cancer Radiotherapy. (arXiv:2210.05673v1 [eess.IV]) arxiv.org/abs/2210.05673

In the past decade, deep learning (DL)-based artificial intelligence (AI) has witnessed unprecedented success and has generated much excitement in medicine. However, many successful models have not been implemented in the clinic, predominantly due to concerns regarding their lack of interpretability and generalizability in both spatial and temporal domains. In this work, we used a DL-based auto-segmentation model for intact prostate patients to observe temporal performance changes and then correlated them with possible explanatory variables. We retrospectively simulated the clinical implementation of our DL model to investigate temporal performance trends. Our cohort included 912 patients with prostate cancer treated with definitive radiotherapy from January 2006 to August 2021 at the University of Texas Southwestern Medical Center (UTSW). We trained a U-Net-based DL auto-segmentation model on the data collected before 2012 and tested it on data collected from 2012 to 2021 to simulate the clinical deployment of the trained model starting in 2012. We visualized the trends using a simple moving-average curve and used ANOVA and t-tests to investigate the impact of various clinical factors. The prostate and rectum contour quality decreased rapidly after 2016-2017. Stereotactic body radiotherapy (SBRT) and hydrogel spacer use were significantly associated with prostate contour quality (p=5.6e-12 and 0.002, respectively). SBRT and physicians' styles were significantly associated with rectum contour quality (p=0.0005 and 0.02, respectively). Only the presence of contrast within the bladder significantly affected the bladder contour quality (p=1.6e-7). We showed that DL model performance decreased over time in concordance with changes in clinical practice patterns and in clinical personnel.
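
The monitoring recipe described here (a moving-average trend plus t-tests/ANOVA on candidate clinical factors) is straightforward to reproduce in outline; a minimal sketch with hypothetical column names:

```python
# Sketch of the trend-monitoring analysis described above: a simple moving
# average of contour quality over time, plus a t-test and ANOVA on clinical
# factors. Column names and window size are illustrative assumptions.
import pandas as pd
from scipy import stats

df = pd.read_csv("contour_scores.csv", parse_dates=["treatment_date"])  # hypothetical
df = df.sort_values("treatment_date")

# Simple moving average of a per-patient contour-quality score (e.g., Dice).
df["sma"] = df["prostate_dice"].rolling(window=50).mean()

# t-test: is SBRT use associated with a shift in prostate contour quality?
sbrt = df.loc[df["sbrt"] == 1, "prostate_dice"]
non_sbrt = df.loc[df["sbrt"] == 0, "prostate_dice"]
t, p = stats.ttest_ind(sbrt, non_sbrt, equal_var=False)
print(f"SBRT t-test: p = {p:.2g}")

# One-way ANOVA: does rectum contour quality vary by physician?
groups = [g["rectum_dice"].values for _, g in df.groupby("physician")]
f, p = stats.f_oneway(*groups)
print(f"Physician ANOVA: p = {p:.2g}")
```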

Transformers generalize differently from information stored in context vs in weights. (arXiv:2210.05675v1 [cs.CL]) arxiv.org/abs/2210.05675

Transformer models can use two fundamentally different kinds of information: information stored in weights during training, and information provided "in-context" at inference time. In this work, we show that transformers exhibit different inductive biases in how they represent and generalize from the information in these two sources. In particular, we characterize whether they generalize via parsimonious rules (rule-based generalization) or via direct comparison with observed examples (exemplar-based generalization). This has important practical consequences, as it informs whether to encode information in weights or in context, depending on how we want models to use that information. In transformers trained on controlled stimuli, we find that generalization from weights is more rule-based, whereas generalization from context is largely exemplar-based. In contrast, we find that in transformers pretrained on natural language, in-context learning is significantly rule-based, with larger models showing more rule-basedness. We hypothesise that rule-based generalization from in-context information might be an emergent consequence of large-scale training on language, which has sparse rule-like structure. Using controlled stimuli, we verify that transformers pretrained on data containing sparse rule-like structure exhibit more rule-based generalization.
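
To make the rule-vs-exemplar distinction concrete, here is a toy diagnostic in the spirit of the paper's controlled stimuli (the probe construction and the model_predict stub are hypothetical): choose a probe where a parsimonious rule and the nearest training exemplar disagree, then see which prediction the model matches.

```python
# Toy illustration of the rule-vs-exemplar diagnostic (not the paper's code).
# Training items are binary feature vectors; labels follow a one-feature rule
# (label = feature 0). The probe is built so that the rule and the nearest
# training exemplar predict *different* labels.
import numpy as np

train_x = np.array([[1, 1, 1, 1],
                    [0, 0, 0, 0]])
train_y = train_x[:, 0]            # parsimonious rule: label = feature 0

probe = np.array([1, 0, 0, 0])     # rule says 1; nearest exemplar says 0

rule_pred = probe[0]
nearest = np.argmin(np.abs(train_x - probe).sum(axis=1))  # Hamming 1-NN
exemplar_pred = train_y[nearest]
assert rule_pred != exemplar_pred  # the probe is informative

def model_predict(x):              # hypothetical stub: query the transformer
    return 1                       # (from weights or from context) on the probe

pred = model_predict(probe)
print("rule-based" if pred == rule_pred else "exemplar-based")
```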

Towards Consistency and Complementarity: A Multiview Graph Information Bottleneck Approach. (arXiv:2210.05676v1 [cs.LG]) arxiv.org/abs/2210.05676

Empirical studies of Graph Neural Networks (GNNs) broadly take the original node features and adjacency relationships as single-view input, ignoring the rich information contained in multiple graph views. To circumvent this issue, multiview graph analysis frameworks have been developed to fuse graph information across views. How to model and integrate shared (i.e., consistency) and view-specific (i.e., complementarity) information is a key issue in multiview graph analysis. In this paper, we propose a novel Multiview Variational Graph Information Bottleneck (MVGIB) principle to maximize the agreement for common representations and the disagreement for view-specific representations. Under this principle, we formulate the common and view-specific information bottleneck objectives across views using constraints from mutual information. However, these objectives are hard to optimize directly since mutual information is computationally intractable. To tackle this challenge, we derive variational lower and upper bounds of the mutual information terms, and then optimize the variational bounds instead to find approximate solutions for the information objectives. Extensive experiments on graph benchmark datasets demonstrate the superior effectiveness of the proposed method.
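
The paper derives its own variational bounds; as a generic illustration of replacing an intractable mutual-information term with a tractable bound, here is one widely used lower bound (InfoNCE) applied to encourage agreement between two views' common representations. A hedged sketch, not the MVGIB objective itself.

```python
# Sketch: InfoNCE, a standard variational lower bound on mutual information,
# used here to encourage agreement between common representations z1, z2 of
# the same nodes under two graph views. Not the paper's exact bound.
import torch
import torch.nn.functional as F

def infonce_agreement(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.2):
    """z1, z2: (N, d) common representations of the same N nodes."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.T / tau                   # (N, N) similarity matrix
    targets = torch.arange(z1.size(0))         # positives on the diagonal
    # Minimizing this cross-entropy maximizes a lower bound on I(z1; z2).
    return F.cross_entropy(logits, targets)

z1, z2 = torch.randn(128, 64), torch.randn(128, 64)
print(float(infonce_agreement(z1, z2)))
```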

Application of Deep Learning on Single-Cell RNA-sequencing Data Analysis: A Review. (arXiv:2210.05677v1 [q-bio.GN]) arxiv.org/abs/2210.05677

Single-cell RNA-sequencing (scRNA-seq) has become a routinely used technique to quantify the gene expression profiles of thousands of single cells simultaneously. Analysis of scRNA-seq data plays an important role in the study of cell states and phenotypes, has helped elucidate biological processes such as those occurring during the development of complex organisms, and has improved our understanding of disease states such as cancer, diabetes, and COVID, among others. Deep learning, a recent advance in artificial intelligence that has been used to address many problems involving large datasets, has also emerged as a promising tool for scRNA-seq data analysis, as it has the capacity to extract informative, compact features from noisy, heterogeneous, and high-dimensional scRNA-seq data to improve downstream analysis. The present review aims to survey recently developed deep learning techniques in scRNA-seq data analysis, identify key steps within the scRNA-seq data analysis pipeline that have been advanced by deep learning, and explain the benefits of deep learning over more conventional analysis tools. Finally, we summarize the challenges that current deep learning approaches face with scRNA-seq data and discuss potential directions for improving deep learning algorithms for scRNA-seq data analysis.
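
For context, a conventional (non-deep-learning) scRNA-seq pipeline of the kind such reviews compare deep learning methods against might look like the following Scanpy sketch; deep learning methods typically replace steps such as the PCA embedding with learned latent spaces (e.g., autoencoders). The input file is a placeholder.

```python
# Sketch of a conventional scRNA-seq pipeline (Scanpy) that deep learning
# methods aim to improve on, e.g. by replacing PCA with an autoencoder
# latent space. The input file is a placeholder.
import scanpy as sc

adata = sc.read_h5ad("cells.h5ad")                # cells x genes matrix

sc.pp.filter_cells(adata, min_genes=200)          # basic quality control
sc.pp.filter_genes(adata, min_cells=3)
sc.pp.normalize_total(adata, target_sum=1e4)      # library-size normalization
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, n_top_genes=2000)
adata = adata[:, adata.var.highly_variable]

sc.pp.pca(adata, n_comps=50)      # linear embedding; DL models swap this step
sc.pp.neighbors(adata)            # k-NN graph on the embedding
sc.tl.umap(adata)                 # 2-D visualization
sc.tl.leiden(adata)               # clustering into putative cell states
```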

Neural Importance Sampling for Rapid and Reliable Gravitational-Wave Inference. (arXiv:2210.05686v1 [gr-qc]) arxiv.org/abs/2210.05686

We combine amortized neural posterior estimation with importance sampling for fast and accurate gravitational-wave inference. We first generate a rapid proposal for the Bayesian posterior using neural networks, and then attach importance weights based on the underlying likelihood and prior. This provides (1) a corrected posterior free from network inaccuracies, (2) a performance diagnostic (the sample efficiency) for assessing the proposal and identifying failure cases, and (3) an unbiased estimate of the Bayesian evidence. By establishing this independent verification and correction mechanism we address some of the most frequent criticisms against deep learning for scientific inference. We carry out a large study analyzing 42 binary black hole mergers observed by LIGO and Virgo with the SEOBNRv4PHM and IMRPhenomXPHM waveform models. This shows a median sample efficiency of about 10% (two orders of magnitude better than standard samplers) as well as a ten-fold reduction in the statistical uncertainty in the log evidence. Given these advantages, we expect a significant impact on gravitational-wave inference, and for this approach to serve as a paradigm for harnessing deep learning methods in scientific applications.
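
The importance-sampling bookkeeping described above is standard and compact in code: given per-sample log proposal, log prior, and log likelihood values, one obtains the weights, the sample-efficiency diagnostic, and the evidence estimate. A minimal sketch with placeholder input arrays:

```python
# Sketch of the importance-sampling step described above: reweight neural
# posterior samples by likelihood x prior / proposal, then compute the
# sample-efficiency diagnostic and the evidence estimate. Inputs are
# placeholders for per-sample log densities.
import numpy as np
from scipy.special import logsumexp

log_likelihood = np.load("log_likelihood.npy")  # log p(d | theta_i), hypothetical
log_prior = np.load("log_prior.npy")            # log pi(theta_i)
log_proposal = np.load("log_proposal.npy")      # log q(theta_i | d) from the network
n = len(log_proposal)

log_w = log_likelihood + log_prior - log_proposal   # unnormalized log weights

# Sample efficiency: effective sample size (sum w)^2 / sum w^2, divided by n.
log_sum_w = logsumexp(log_w)
ess = np.exp(2 * log_sum_w - logsumexp(2 * log_w))
print(f"sample efficiency = {ess / n:.1%}")

# Unbiased evidence estimate: Z ~ (1/n) * sum_i w_i.
log_evidence = log_sum_w - np.log(n)
print(f"log evidence = {log_evidence:.3f}")
```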

A Coupled Hybridizable Discontinuous Galerkin and Boundary Integral Method for Analyzing Electromagnetic Scattering. (arXiv:2210.04892v1 [cs.CE]) arxiv.org/abs/2210.04892

A coupled hybridizable discontinuous Galerkin (HDG) and boundary integral (BI) method is proposed to efficiently analyze electromagnetic scattering from inhomogeneous/composite objects. The coupling between the HDG and the BI equations is realized using the numerical flux operating on the equivalent current and the global unknown of the HDG. This approach yields sparse coupling matrices upon discretization. Inclusion of the BI equation ensures that the only error in enforcing the radiation conditions is the discretization error. However, the discretization of this equation yields a dense matrix, which prohibits the use of a direct matrix solver on the overall coupled system, as is often done with traditional HDG schemes. To overcome this bottleneck, a "hybrid" method is developed. This method uses an iterative scheme to solve the overall coupled system, but within the matrix-vector multiplication subroutine of the iterations, the inverse of the HDG matrix is efficiently accounted for using a sparse direct matrix solver. The same subroutine also uses the multilevel fast multipole algorithm to accelerate the multiplication of the guess vector with the dense BI matrix. The numerical results demonstrate the accuracy, efficiency, and applicability of the proposed HDG-BI solver.
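
Structurally, the hybrid solver amounts to an iterative solve of a Schur complement in which the sparse HDG block is handled by a prefactored direct solver inside each matrix-vector product. A schematic SciPy sketch follows, with random stand-in blocks and a plain dense product in place of the MLFMA-accelerated BI operator:

```python
# Schematic sketch of the "hybrid" HDG-BI solve: iterate on the coupled
# block system [A B; C Z] [u; j] = [f; g], applying the sparse HDG block's
# inverse with a direct factorization inside each matrix-vector product.
# The blocks are random stand-ins; in the actual solver the dense BI
# product Z @ x would be applied via MLFMA.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu, gmres, LinearOperator

rng = np.random.default_rng(0)
n_hdg, n_bi = 500, 100

A = sp.random(n_hdg, n_hdg, density=0.01, random_state=0) + 10 * sp.eye(n_hdg)
B = sp.random(n_hdg, n_bi, density=0.05, random_state=1)
C = sp.random(n_bi, n_hdg, density=0.05, random_state=2)
Z = rng.standard_normal((n_bi, n_bi)) + 20 * np.eye(n_bi)   # dense BI block
f = rng.standard_normal(n_hdg)
g = rng.standard_normal(n_bi)

lu = splu(A.tocsc())          # factor the sparse HDG block once

def schur_matvec(x):
    # (Z - C A^{-1} B) x: sparse direct solve inside the iteration's matvec.
    return Z @ x - C @ lu.solve(B @ x)

S = LinearOperator((n_bi, n_bi), matvec=schur_matvec)
rhs = g - C @ lu.solve(f)
j, info = gmres(S, rhs)       # iterative solve for the BI unknowns
u = lu.solve(f - B @ j)       # recover the HDG unknowns
print("converged" if info == 0 else f"gmres info = {info}")
```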

Equivariant Shape-Conditioned Generation of 3D Molecules for Ligand-Based Drug Design. (arXiv:2210.04893v1 [physics.chem-ph]) arxiv.org/abs/2210.04893

Shape-based virtual screening is widely employed in ligand-based drug design to search chemical libraries for molecules with similar 3D shapes yet novel 2D chemical structures compared to known ligands. 3D deep generative models have the potential to automate this exploration of shape-conditioned 3D chemical space; however, no existing models can reliably generate valid drug-like molecules in conformations that adopt a specific shape such as a known binding pose. We introduce a new multimodal 3D generative model that enables shape-conditioned 3D molecular design by equivariantly encoding molecular shape and variationally encoding chemical identity. We ensure local geometric and chemical validity of generated molecules by using autoregressive fragment-based generation with heuristic bonding geometries, allowing the model to prioritize the scoring of rotatable bonds to best align the growing conformational structure to the target shape. We evaluate our 3D generative model in tasks relevant to drug design including shape-conditioned generation of chemically diverse molecular structures and shape-constrained molecular property optimization, demonstrating its utility over virtual screening of enumerated libraries.
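
The shape similarity underlying such screening is commonly scored with Gaussian-overlap volumes; a toy first-order version (a simplified ROCS-style shape Tanimoto, not the paper's learned encoder) for two aligned sets of atomic coordinates:

```python
# Toy first-order Gaussian-overlap shape similarity (a ROCS-style shape
# Tanimoto) between two aligned 3-D point clouds of atom centers. A
# simplified illustration of shape scoring, not the paper's model; a single
# Gaussian width alpha is assumed for all atoms.
import numpy as np

def gaussian_overlap(a, b, alpha=0.81, p=2.7):
    """First-order overlap volume between atom sets a (N,3) and b (M,3)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    prefac = p * p * (np.pi / (2 * alpha)) ** 1.5
    return (prefac * np.exp(-0.5 * alpha * d2)).sum()

def shape_tanimoto(a, b):
    vab = gaussian_overlap(a, b)
    return vab / (gaussian_overlap(a, a) + gaussian_overlap(b, b) - vab)

mol_a = np.random.default_rng(0).normal(size=(20, 3))  # placeholder coords
mol_b = mol_a + 0.1                                    # slightly shifted copy
print(f"shape Tanimoto = {shape_tanimoto(mol_a, mol_b):.3f}")
```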

A Computationally Efficient, Robust Methodology for Evaluating Chemical Timescales with Detailed Chemical Kinetics. (arXiv:2210.04894v1 [cs.CE]) arxiv.org/abs/2210.04894

Turbulent reacting flows occur in a variety of engineering applications, such as chemical reactors and power-generating equipment (gas turbines and internal combustion engines). Turbulent reacting flows are characterized by two main timescales, namely flow timescales and chemical (or reaction) timescales. Understanding the relative timescales of flow and reaction kinetics plays an important role not only in the choice of models required for the accurate simulation of these devices but also in their design/optimization studies. There are several definitions of chemical timescales, which can largely be classified as algebraic or eigenvalue-based methods. The computational complexity (and hence cost) depends on the method of evaluating the chemical timescales and the size of the chemical reaction mechanism. The computational cost and robustness of the methodology for evaluating the reaction timescales are important considerations in large-scale multi-dimensional simulations using detailed chemical mechanisms. In this work, we present a computationally efficient and robust methodology to evaluate chemical timescales based on the algebraic method. A comparison of this novel methodology with other traditional methods is presented for a range of fuel-air mixtures, pressures, and temperatures. Additionally, chemical timescales are also presented for fuel-air mixtures at conditions of relevance to power-generating equipment. The proposed method showed the same temporal characteristics as the eigenvalue-based methods with no additional computational cost for all the cases studied. The proposed method thus has the potential for use in multidimensional turbulent reacting flow simulations that require the computation of the Damkohler number.
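
The two families of definitions are easy to contrast on a toy linear mechanism: eigenvalue-based timescales are reciprocals of the Jacobian eigenvalue magnitudes, while one common algebraic definition divides each species concentration by its net production rate; a flow-to-chemistry ratio then gives a Damkohler number. A sketch with made-up rate constants:

```python
# Toy contrast of eigenvalue-based vs algebraic chemical timescales on a
# linear two-species mechanism dc/dt = J c (made-up rate constants), plus
# the resulting Damkohler number for an assumed flow timescale.
import numpy as np

J = np.array([[-1000.0,  0.0],     # fast-consumed species
              [    1.0, -1.0]])    # slowly relaxing species
c = np.array([1e-3, 1e-2])         # concentrations (arbitrary units)

# Eigenvalue-based: one timescale per mode, tau = 1 / |Re(lambda)|.
tau_eig = 1.0 / np.abs(np.linalg.eigvals(J).real)

# Algebraic: per-species tau_i = c_i / |dc_i/dt| (one common definition).
rates = J @ c
tau_alg = c / np.abs(rates)

# Damkohler number: flow timescale over a representative chemical timescale.
tau_flow = 1e-3                    # assumed flow timescale [s]
Da = tau_flow / tau_eig.min()
print(tau_eig, tau_alg, f"Da = {Da:.1f}")
```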

Batch Exchanges with Constant Function Market Makers: Axioms, Equilibria, and Computation. (arXiv:2210.04929v1 [cs.GT]) arxiv.org/abs/2210.04929

Batch trading systems and constant function market makers (CFMMs) are two distinct market design innovations that have recently come to prominence as ways to address some of the shortcomings of decentralized trading systems. However, different deployments have chosen substantially different methods for integrating the two innovations. We show here, from a minimal set of axioms describing the beneficial properties of each innovation, that there is in fact only one unique method for integrating CFMMs into batch trading schemes that preserves all the beneficial properties of both. Deployment of a batch trading scheme trading many assets simultaneously requires a reliable algorithm for approximating equilibria in Arrow-Debreu exchange markets. We study this problem when batches contain limit orders and CFMMs. Specifically, we find that CFMM design affects the asymptotic complexity of the problem, give an easily checkable criterion to validate that a user-submitted CFMM is computationally tractable in a batch, and give a convex program that computes equilibria on batches of limit orders and CFMMs. Equivalently, this convex program computes equilibria of Arrow-Debreu exchange markets when every agent's demand response satisfies weak gross substitutability and every agent has utility for only two types of assets. This convex program has rational solutions when run on many (but not all) natural classes of widely-deployed CFMMs.
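
The key fact behind such convex programs is that a CFMM's feasible trades form a convex set: post-trade reserves must keep the trading function at or above its pre-trade value. A minimal CVXPY sketch with a toy constant-product pool (all numbers are placeholders):

```python
# Sketch: the feasible-trade set of a constant-product CFMM is convex, which
# is what lets batch equilibria be computed by a convex program. Toy pool
# with placeholder reserves; maximizes the output of one trade via CVXPY.
import cvxpy as cp
import numpy as np

R = np.array([100.0, 100.0])       # pool reserves of assets X and Y
dx = cp.Variable(nonneg=True)      # X paid into the pool
dy = cp.Variable(nonneg=True)      # Y received from the pool

new_reserves = cp.hstack([R[0] + dx, R[1] - dy])
constraints = [
    cp.geo_mean(new_reserves) >= np.sqrt(R[0] * R[1]),  # invariant preserved
    dx <= 10,                                           # trader's budget
]
prob = cp.Problem(cp.Maximize(dy), constraints)
prob.solve()
print(f"pay {dx.value:.2f} X, receive {dy.value:.2f} Y")
```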
