Explainable bilevel optimization: an application to the Helsinki deblur challenge. (arXiv:2210.10050v1 [eess.IV]) arxiv.org/abs/2210.10050

Explainable bilevel optimization: an application to the Helsinki deblur challenge

In this paper we present a bilevel optimization scheme for the solution of a general image deblurring problem, in which a parametric variational-like approach is encapsulated within a machine learning scheme to provide a high-quality reconstructed image with automatically learned parameters. The ingredients of the variational lower level and the machine learning upper level are chosen specifically for the Helsinki Deblur Challenge 2021, in which sequences of letters must be recovered from out-of-focus photographs with increasing levels of blur. The proposed reconstruction procedure consists of a fixed number of FISTA iterations applied to the minimization of an edge-preserving and binarization-enforcing regularized least-squares functional. The parameters defining the variational model and the optimization steps, which, unlike in most deep learning approaches, all have a precise and interpretable meaning, are learned via either a similarity index or a support vector machine strategy. Numerical experiments on the test images provided by the challenge organizers show significant gains over a standard variational approach and performance comparable to that of some of the competing deep learning based algorithms, which require optimizing millions of parameters.
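
The lower-level solver is thus a fixed number of FISTA iterations on a regularized least-squares objective. As a rough illustration only (not the authors' code), here is a minimal FISTA loop in NumPy for 0.5*||Ax - b||^2 + lam*||x||_1, with a soft-thresholding prox standing in for the paper's edge-preserving, binarization-enforcing regularizer; `A`, `b`, `lam` and the iteration count are placeholders.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1; a stand-in for the paper's
    # edge-preserving, binarization-enforcing regularizer.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(A, b, lam, n_iter=50):
    """Minimize 0.5*||A x - b||^2 + lam*||x||_1 with a fixed number of FISTA steps."""
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)           # gradient of the data-fidelity term
        x_new = soft_threshold(y - grad / L, lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t**2)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)   # momentum/extrapolation step
        x, t = x_new, t_new
    return x

# Toy usage with a random blur-like operator and a sparse ground truth.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 200))
x_true = (rng.random(200) > 0.9).astype(float)
b = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = fista(A, b, lam=0.1)
```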

Combination of Raman spectroscopy and chemometrics: A review of recent studies published in the Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy Journal. (arXiv:2210.10051v1 [q-bio.QM]) arxiv.org/abs/2210.10051

Combination of Raman spectroscopy and chemometrics: A review of recent studies published in the Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy Journal

Raman spectroscopy is a promising technique for noninvasive analysis of samples in many fields of application, owing to its ability to probe samples at the molecular level via their spectral fingerprints. Chemometrics methods are now widely used to better understand the recorded spectral fingerprints of samples and the differences in their chemical composition. This review considers manuscripts published in the Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy Journal that report applications of Raman spectroscopy combined with chemometrics to study samples and their changes caused by different factors. In the 57 reviewed manuscripts, we analyzed the chemometrics algorithms applied, the statistical modeling parameters, the use of cross-validation, the sample sizes, and the performance of the proposed classification and regression models. We summarize the best strategies for building classification models and highlight common drawbacks in the application of chemometrics techniques. According to our estimates, about 70% of the papers are likely to contain unsupported or invalid data due to insufficient description of the methods used or shortcomings of the proposed classification models. These shortcomings include: (1) an experimental sample size too small to achieve significant and reliable classification/regression results; (2) lack of cross-validation (or a test set) for verifying classifier/regression performance; (3) incorrect division of the spectral data into training and test/validation sets; and (4) improper selection of the number of principal components (PCs) used to reduce the dimensionality of the spectral data.
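
One of the listed drawbacks, leakage from fitting PCA or the classifier outside of cross-validation, can be avoided by placing both steps in a single pipeline so the PC projection is refit on every training fold. A minimal scikit-learn sketch (illustrative only; the synthetic spectra, PC count and classifier are placeholders, not taken from any reviewed study):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score, StratifiedKFold

# Placeholder "spectra": 60 samples x 500 wavenumbers, two classes.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 500))
y = np.repeat([0, 1], 30)

# PCA and the classifier live inside one pipeline, so both are refit
# on every training fold -- no information leaks from the held-out fold.
model = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="linear"))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)
print(f"cross-validated accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```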

Detecting and analyzing missing citations to published scientific entities. (arXiv:2210.10073v1 [cs.DL]) arxiv.org/abs/2210.10073

Detecting and analyzing missing citations to published scientific entities

Proper citation is of great importance in academic writing because it enables knowledge accumulation and maintains academic integrity. However, citing properly is not an easy task. For published scientific entities, the ever-growing volume of academic publications and over-familiarity with terms easily lead to missing citations. To deal with this situation, we design a method, Citation Recommendation for Published Scientific Entity (CRPSE), based on the co-occurrences between published scientific entities and in-text citations in the same sentences written by previous researchers. Experimental results show the effectiveness of our method in recommending the source papers for published scientific entities. We further conduct a statistical analysis of missing citations among papers published in prestigious computer science conferences in 2020. In the 12,278 papers collected, 475 published scientific entities from computer science and mathematics are found to have missing citations. Many entities mentioned without citations turn out to be well-accepted research results. On a median basis, the papers proposing these published scientific entities with missing citations were published 8 years earlier, which can be considered the time frame for a published scientific entity to develop into a well-accepted concept. For published scientific entities, we call for accurate and complete citation of their source papers, as required by academic standards.
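
The co-occurrence idea can be sketched very simply: count how often an entity mention and an in-text citation appear in the same sentence across a corpus, then recommend the most frequently co-occurring sources. A toy illustration (not the actual CRPSE pipeline; the sentence records, entity names and citation keys are made up):

```python
from collections import Counter, defaultdict

# Each record is (sentence_text, entities_mentioned, papers_cited) -- in practice these
# would come from sentence splitting, entity recognition, and citation parsing.
sentences = [
    ("We train word2vec embeddings [12] on the corpus.", ["word2vec"], ["Mikolov2013"]),
    ("word2vec [12] and GloVe [7] vectors are compared.", ["word2vec", "GloVe"], ["Mikolov2013", "Pennington2014"]),
    ("GloVe vectors are used as features.", ["GloVe"], []),
]

cooccurrence = defaultdict(Counter)
for _, entities, citations in sentences:
    for entity in entities:
        cooccurrence[entity].update(citations)   # entity co-occurs with these cited papers

def recommend_source(entity, k=1):
    """Return the k papers most often cited alongside the entity."""
    return [paper for paper, _ in cooccurrence[entity].most_common(k)]

print(recommend_source("word2vec"))   # ['Mikolov2013']
```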

Deterministic vs. Non Deterministic Finite Automata in Automata Processing. (arXiv:2210.10077v1 [cs.PF]) arxiv.org/abs/2210.10077

Deterministic vs. Non Deterministic Finite Automata in Automata Processing

Linear-time pattern matching engines have seen promising results using Finite Automata (FA) as their computation model. Among the different FA variants, deterministic (DFA) and non-deterministic (NFA) automata are the most commonly used computation models for FA-based pattern matching engines. Moreover, the NFA is the prevalent model in pattern matching engines on spatial architectures. The reasons are: i) DFA size, measured in number of states (#states), can be exponential compared to that of an equivalent NFA; ii) a DFA cannot exploit the massive parallelism available on spatial architectures. This paper performs an empirical study of the #states of minimized DFAs and optimized NFAs across a diverse set of real-world benchmarks and shows that, if distinct DFAs are generated for distinct patterns, the #states of the minimized DFAs are typically equal to those of their equivalent optimized NFAs. However, the NFA is more robust in maintaining low #states on some benchmarks. Thus, the choice of NFA vs. DFA for a spatial architecture is less important than the need to generate distinct DFAs for each pattern and to support parallel processing of these distinct DFAs. Finally, this paper presents a throughput study of von Neumann architecture-based (CPU) vs. spatial architecture-based (FPGA) automata processing engines. The study shows that, depending on the workload, neither the CPU-based nor the FPGA-based automata processing engine is a clear winner. As the number of patterns matched per workload increases, the throughput of the CPU-based engine decreases. On the other hand, the FPGA-based engine lacks a memory spilling option; hence, it fails to accommodate an automaton that does not fit into the FPGA's logic fabric. In the best case, the CPU has a 4.5x speedup over the FPGA, while for some benchmarks the FPGA has a 32,530x speedup over the CPU.
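
The #states comparison rests on the textbook NFA-to-DFA subset construction; counting the reachable subsets gives the DFA size for a single pattern. A compact sketch (illustrative only; real engines also minimize the DFA and handle full byte alphabets):

```python
from collections import deque

def nfa_to_dfa_states(nfa, start, alphabet):
    """Subset construction: return the number of reachable DFA states.
    `nfa` maps (state, symbol) -> set of next states (no epsilon moves, for brevity)."""
    start_set = frozenset([start])
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            nxt = frozenset(s for q in current for s in nfa.get((q, symbol), ()))
            if nxt and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)

# NFA for the pattern ".*ab" over {a, b}: stays in state 0, or guesses the final "ab".
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
print(nfa_to_dfa_states(nfa, start=0, alphabet="ab"))   # 3 reachable DFA states
```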

Why do people judge humans differently from machines? The role of agency and experience. (arXiv:2210.10081v1 [cs.CY]) arxiv.org/abs/2210.10081

Why do people judge humans differently from machines? The role of agency and experience

People are known to judge artificial intelligence using a utilitarian moral philosophy and humans using a moral philosophy emphasizing perceived intentions. But why do people judge humans and machines differently? Psychology suggests that people may hold different mind perception models for humans and machines and will therefore treat human-like robots more similarly to the way they treat humans. Here we present a randomized experiment in which we manipulated people's perception of machines to explore whether people judge more human-like machines more similarly to the way they judge humans. We find that people's judgments of machines become more similar to those of humans when they perceive machines as having more agency (e.g. the ability to plan and act), but not more experience (e.g. the ability to feel). Our findings indicate that people's use of different moral philosophies to judge humans and machines can be explained by a progression of mind perception models in which the perception of agency plays a prominent role. These findings add to the body of evidence suggesting that people's judgment of machines becomes more similar to that of humans, motivating further work on differences in the judgment of human and machine actions.

Auditing YouTube's Recommendation Algorithm for Misinformation Filter Bubbles. (arXiv:2210.10085v1 [cs.IR]) arxiv.org/abs/2210.10085

Auditing YouTube's Recommendation Algorithm for Misinformation Filter Bubbles

In this paper, we present the results of an auditing study performed on YouTube, aimed at investigating how quickly a user can get into a misinformation filter bubble, but also what it takes to "burst the bubble", i.e., revert the bubble enclosure. We employ a sock puppet audit methodology, in which pre-programmed agents (acting as YouTube users) delve into misinformation filter bubbles by watching content that promotes misinformation. They then try to burst the bubbles and reach more balanced recommendations by watching content that debunks misinformation. We record search results, home page results, and recommendations for the watched videos. Overall, we recorded 17,405 unique videos, of which we manually annotated 2,914 for the presence of misinformation. The labeled data was used to train a machine learning model that classifies videos into three classes (promoting, debunking, neutral) with an accuracy of 0.82. We use the trained model to classify the remaining videos, which would not be feasible to annotate manually. Using both the manually and automatically annotated data, we observe the misinformation bubble dynamics for a range of audited topics. Our key finding is that even though filter bubbles do not appear in some situations, when they do, it is possible to burst them by watching misinformation debunking content (although this manifests differently from topic to topic). We also observe a sudden decrease in the misinformation filter bubble effect when misinformation debunking videos are watched after misinformation promoting videos, suggesting a strong contextuality of recommendations. Finally, when comparing our results with a previous similar study, we do not observe significant improvements in the overall quantity of recommended misinformation content.
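
The paper does not specify the classifier, so purely as an illustration of the annotate-then-classify step, here is a minimal three-class text classifier over video titles using TF-IDF features and logistic regression; the titles, labels and model choice below are made up.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical manually annotated video titles (the real study labels 2,914 videos).
titles = [
    "The moon landing was staged, here is the proof",
    "Debunking the fake moon landing claims",
    "Apollo 11 documentary, full mission footage",
    "Vaccines cause illness, doctors will not tell you",
    "Fact check: what the vaccine studies actually show",
    "How mRNA vaccines work, explained simply",
]
labels = ["promoting", "debunking", "neutral", "promoting", "debunking", "neutral"]

# TF-IDF over unigrams/bigrams feeding a linear classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(titles, labels)

# Classify the remaining (unannotated) videos.
print(clf.predict(["Secret footage proves the landing was fake"]))
```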

How to Boost Face Recognition with StyleGAN? (arXiv:2210.10090v1 [cs.CV]) arxiv.org/abs/2210.10090

How to Boost Face Recognition with StyleGAN?

State-of-the-art face recognition systems require huge amounts of labeled training data. Given the priority of privacy in face recognition applications, the data is limited to celebrity web crawls, which have issues such as skewed distributions of ethnicities and limited numbers of identities. On the other hand, the self-supervised revolution in industry motivates research on adapting related techniques to face recognition. One of the most popular practical tricks is to augment the dataset with samples drawn from high-resolution, high-fidelity generative models (e.g. StyleGAN-like) while preserving identity. We show that a simple approach based on fine-tuning an encoder for StyleGAN improves upon state-of-the-art face recognition and performs better than training on synthetic face identities. We also collect large-scale unlabeled datasets with controllable ethnic constitution -- AfricanFaceSet-5M (5 million images of different people) and AsianFaceSet-3M (3 million images of different people) -- and we show that pretraining on each of them improves recognition of the respective ethnicities (as well as others), while combining all unlabeled datasets yields the largest performance increase. Our self-supervised strategy is most useful with limited amounts of labeled training data, which can be beneficial for more tailored face recognition tasks and when facing privacy concerns. Evaluation is performed on the standard RFW dataset and a new large-scale RB-WebFace benchmark.
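
The encoder-for-StyleGAN idea can be sketched with toy stand-ins: freeze a pretrained generator G and fine-tune an encoder E so that E(G(w)) recovers the latent w of the generated face, a proxy for preserving identity. The modules below are tiny placeholders, not StyleGAN or the authors' encoder or losses.

```python
import torch
import torch.nn as nn

latent_dim, image_dim = 64, 256

# Placeholder "StyleGAN-like" generator: frozen, maps latents to flattened images.
G = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(), nn.Linear(512, image_dim))
for p in G.parameters():
    p.requires_grad_(False)

# Encoder being fine-tuned to invert the generator (latent recovery as an identity proxy).
E = nn.Sequential(nn.Linear(image_dim, 512), nn.ReLU(), nn.Linear(512, latent_dim))
opt = torch.optim.Adam(E.parameters(), lr=1e-3)

for step in range(200):
    w = torch.randn(32, latent_dim)          # sample latents
    with torch.no_grad():
        x = G(w)                             # synthesize "faces" with the frozen generator
    loss = nn.functional.mse_loss(E(x), w)   # encoder should map each image back to its latent
    opt.zero_grad()
    loss.backward()
    opt.step()
```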

ProtoFold Neighborhood Inspector. (arXiv:2210.09308v1 [q-bio.QM]) arxiv.org/abs/2210.09308

ProtoFold Neighborhood Inspector

Post-translational modifications (PTMs) affecting a protein's residues (amino acids) can disturb its function, leading to illness. Whether or not a PTM is pathogenic depends on its type and on the status of neighboring residues. In this paper, we present the ProtoFold Neighborhood Inspector (PFNI), a visualization system for analyzing residue neighborhoods. The main contribution is a visualization idiom, the Residue Constellation (RC), for identifying and comparing three-dimensional neighborhoods based on per-residue features and spatial characteristics. The RC leverages two-dimensional representations of the protein's three-dimensional structure to overcome problems such as occlusion, easing the analysis of neighborhoods that often have complicated spatial arrangements. Using the PFNI, we explored structural PTM data for proteins, which allowed us to identify patterns in the distribution and quantity of per-neighborhood PTMs that might be related to their pathogenic status. In the following, we define the tasks that guided the development of the PFNI and describe the data sources we derived and used. We then introduce the PFNI and illustrate its usage through an example analysis workflow. We conclude by reflecting on preliminary findings obtained while using the tool on the provided data and on future directions for the development of the PFNI.
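
The spatial-neighborhood notion underlying the Residue Constellation can be illustrated directly from residue coordinates: a residue's neighborhood is every residue whose (e.g. C-alpha) atom lies within some cutoff distance. A NumPy sketch with made-up coordinates and an 8 angstrom cutoff (both placeholders):

```python
import numpy as np

# Hypothetical C-alpha coordinates (angstroms) for a 6-residue chain.
coords = np.array([
    [0.0, 0.0, 0.0],
    [3.8, 0.0, 0.0],
    [7.6, 0.0, 0.0],
    [7.6, 3.8, 0.0],
    [3.8, 3.8, 0.0],
    [20.0, 20.0, 20.0],   # a distant residue, outside any neighborhood
])

def spatial_neighborhood(coords, residue_index, cutoff=8.0):
    """Indices of residues within `cutoff` angstroms of the given residue (itself excluded)."""
    distances = np.linalg.norm(coords - coords[residue_index], axis=1)
    return np.where((distances <= cutoff) & (distances > 0))[0]

print(spatial_neighborhood(coords, residue_index=0))   # [1 2 4]
```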

On Optimal Subarchitectures for Quantum Circuit Mapping. (arXiv:2210.09321v1 [quant-ph]) arxiv.org/abs/2210.09321

On Optimal Subarchitectures for Quantum Circuit Mapping

Compiling a high-level quantum circuit down to a low-level description that can be executed on state-of-the-art quantum computers is a crucial part of the software stack for quantum computing. One step in compiling a quantum circuit to some device is quantum circuit mapping, where the circuit is transformed such that it complies with the architecture's limited qubit connectivity. Because the search space in quantum circuit mapping grows exponentially in the number of qubits, it is desirable to consider as few of the device's physical qubits as possible in the process. Previous work conjectured that it suffices to consider only subarchitectures of a quantum computer composed of as many qubits as used in the circuit. In this work, we refute this conjecture and establish criteria for judging whether considering larger parts of the architecture might yield better solutions to the mapping problem. Through rigorous analysis, we show that determining subarchitectures that are of minimal size, i.e., from which no physical qubit can be removed without losing the optimal mapping solution for some quantum circuit, is a very hard problem. Based on a relaxation of the criteria for optimality, we introduce a relaxed consideration that still maintains optimality for practically relevant quantum circuits. This ultimately results in two methods for computing near-optimal sets of subarchitectures, providing the basis for efficient quantum circuit mapping solutions. We demonstrate the benefits of this novel method for state-of-the-art quantum computers by IBM, Google and Rigetti.
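
A basic building block when reasoning about subarchitectures is checking whether a circuit's qubit-interaction graph fits into a candidate coupling graph without any SWAPs, i.e. whether a subgraph monomorphism exists. A small networkx sketch (illustrative only; the paper's methods go well beyond this check):

```python
import networkx as nx
from networkx.algorithms import isomorphism

# Candidate subarchitecture: a 5-qubit line coupling map 0-1-2-3-4.
coupling = nx.path_graph(5)

# Interaction graph of a circuit with two-qubit gates on (a,b), (b,c), (c,d).
interaction = nx.Graph([("a", "b"), ("b", "c"), ("c", "d")])

matcher = isomorphism.GraphMatcher(coupling, interaction)
if matcher.subgraph_is_monomorphic():
    # Any monomorphism gives a SWAP-free initial layout on this subarchitecture.
    layout = {circ_q: phys_q for phys_q, circ_q in matcher.mapping.items()}
    print("fits without SWAPs, e.g. layout:", layout)
else:
    print("needs SWAPs or a larger subarchitecture")
```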

TorchDIVA: An Extensible Computational Model of Speech Production built on an Open-Source Machine Learning Library. (arXiv:2210.09334v1 [eess.AS]) arxiv.org/abs/2210.09334

TorchDIVA: An Extensible Computational Model of Speech Production built on an Open-Source Machine Learning Library

The DIVA model is a computational model of speech motor control that combines a simulation of the brain regions responsible for speech production with a model of the human vocal tract. The model is currently implemented in Matlab Simulink; however, this is less than ideal, as most development in speech technology research is done in Python. It also means that the wealth of machine learning tools freely available in the Python ecosystem cannot be easily integrated with DIVA. We present TorchDIVA, a full rebuild of DIVA in Python using PyTorch tensors. The DIVA source code was translated directly from Matlab to Python, and built-in Simulink signal blocks were implemented from scratch. After implementation, the accuracy of each module was evaluated via systematic block-by-block validation. The TorchDIVA model is shown to produce outputs that closely match those of the original DIVA model, with a negligible difference between the two. We additionally present an example of the extensibility of TorchDIVA as a research platform. Speech quality enhancement in TorchDIVA is achieved through integration with an existing PyTorch generative vocoder called DiffWave. A modified DiffWave mel-spectrum upsampler was trained on human speech waveforms and conditioned on the TorchDIVA speech production. The results indicate improved speech quality metrics in the DiffWave-enhanced output compared to the baseline. This enhancement would have been difficult or impossible to accomplish in the original Matlab implementation. This proof of concept demonstrates the value TorchDIVA will bring to the research community. Researchers can download the new implementation at: https://github.com/skinahan/DIVA_PyTorch
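
The block-by-block rebuild can be illustrated with a toy example of the style: a Simulink-like discrete-time integrator written as a PyTorch module, so it operates on tensors and composes with autograd and other PyTorch components. This is only a sketch of the approach, not code from TorchDIVA.

```python
import torch
import torch.nn as nn

class DiscreteIntegrator(nn.Module):
    """Toy Simulink-style block: running integral y[k] = sum_{i<=k} dt * u[i], on tensors."""

    def __init__(self, dt: float = 0.005):
        super().__init__()
        self.dt = dt

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u has shape (time_steps, channels); cumulative sum realizes the running integral.
        return torch.cumsum(u, dim=0) * self.dt

block = DiscreteIntegrator(dt=0.005)
u = torch.sin(torch.linspace(0, 3.14, 200)).unsqueeze(1)   # a test input signal
y = block(u)                                               # differentiable, tensor in/out
print(y.shape)   # torch.Size([200, 1])
```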

Robust Imitation of a Few Demonstrations with a Backwards Model. (arXiv:2210.09337v1 [cs.LG]) arxiv.org/abs/2210.09337

Robust Imitation of a Few Demonstrations with a Backwards Model

Behavior cloning of expert demonstrations can speed up the learning of optimal policies and is more sample-efficient than reinforcement learning. However, the policy cannot extrapolate well to unseen states outside of the demonstration data, creating covariate shift (the agent drifting away from the demonstrations) and compounding errors. In this work, we tackle this issue by extending the region of attraction around the demonstrations so that the agent can learn how to get back onto the demonstrated trajectories if it veers off course. We train a generative backwards dynamics model and generate short imagined trajectories from states in the demonstrations. By imitating both the demonstrations and these model rollouts, the agent learns the demonstrated paths and how to get back onto them. With optimal or near-optimal demonstrations, the learned policy will be both optimal and robust to deviations, with a wider region of attraction. On continuous control domains, we evaluate robustness when starting from initial states unseen in the demonstration data. While both our method and other imitation learning baselines can successfully solve the tasks for initial states in the training distribution, our method exhibits considerably more robustness to different initial states.
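
The data-augmentation loop can be sketched as: roll a backwards dynamics model out for a few steps from demonstration states, reverse the result into short forward trajectories that funnel back onto the demonstrations, and add those (state, action) pairs to the imitation dataset. A toy NumPy version with a hand-written backward model standing in for the learned generative one (states, actions and the model are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

def backward_model(state):
    # Stand-in for a learned generative backwards dynamics model:
    # samples a plausible predecessor state and the action that led from it to `state`.
    action = rng.normal(scale=0.1, size=state.shape)
    prev_state = state - action + rng.normal(scale=0.01, size=state.shape)
    return prev_state, action

def imagined_trajectory(demo_state, horizon=5):
    """Roll backwards from a demonstration state, then reverse into a forward trajectory
    whose transitions all lead back onto the demonstration."""
    states, actions = [demo_state], []
    for _ in range(horizon):
        prev_state, action = backward_model(states[-1])
        states.append(prev_state)
        actions.append(action)
    states.reverse()
    actions.reverse()
    return list(zip(states[:-1], actions))   # (state, action) pairs for behavior cloning

demo_states = [rng.normal(size=2) for _ in range(10)]   # states taken from expert demonstrations
augmented_dataset = [pair for s in demo_states for pair in imagined_trajectory(s)]
print(len(augmented_dataset))   # 10 demo states x 5 imagined steps = 50 extra pairs
```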
