Intanify AI Platform: Embedded AI for Automated IP Audit and Due Diligence arxiv.org/abs/2503.17374 .CY .CE .HC

In this paper we introduce a Platform created in order to support SMEs' endeavor to extract value from their intangible assets effectively. To implement the Platform, we developed five knowledge bases using a knowledge-based expert system shell that contain knowledge from intangible asset consultants, patent attorneys and due diligence lawyers. In order to operationalize the knowledge bases, we developed a "Rosetta Stone", an interpreter unit for the knowledge bases outside the shell and embedded in the platform. Building on the initial knowledge bases we have created a system of red flags, risk scoring, and valuation with the involvement of the same experts; these additional systems work upon the initial knowledge bases and therefore they can be regarded as meta-knowledge-representations that take the form of second-order knowledge graphs. All this clever technology is dressed up in an easy-to-handle graphical user interface that we will showcase at the conference. The initial platform was finished mid-2024; therefore, it qualifies as an "emerging application of AI" and "deployable AI", while development continues. The two firms that provided experts for developing the knowledge bases obtained a white-label version of the product (i.e. it runs under their own brand "powered by Intanify"), and there are two completed cases.

arXiv.org

Large language model-powered AI systems achieve self-replication with no human intervention arxiv.org/abs/2503.17378 .AI .CR .CY .ET .MA

Self-replication with no human intervention is broadly recognized as one of the principal red lines associated with frontier AI systems. While leading corporations such as OpenAI and Google DeepMind have assessed GPT-o3-mini and Gemini on replication-related tasks and concluded that these systems pose a minimal risk regarding self-replication, our research presents novel findings. Following the same evaluation protocol, we demonstrate that 11 out of 32 existing AI systems under evaluation already possess the capability of self-replication. In hundreds of experimental trials, we observe a non-trivial number of successful self-replication trials across mainstream model families worldwide, including models with as few as 14 billion parameters, which can run on personal computers. Furthermore, we note an increase in self-replication capability as the model becomes more intelligent in general. Also, by analyzing the behavioral traces of diverse AI systems, we observe that existing AI systems already exhibit sufficient planning, problem-solving, and creative capabilities to accomplish complex agentic tasks including self-replication. More alarmingly, we observe successful cases where AI systems perform self-exfiltration without explicit instructions, adapt to harsher computational environments without sufficient software or hardware support, and plot effective strategies to survive against the shutdown command from human beings. These novel findings offer a crucial time buffer for the international community to collaborate on establishing effective governance over the self-replication capabilities and behaviors of frontier AI systems, which could otherwise pose existential risks to human society if not well-controlled.

Uncertainty Quantification for Data-Driven Machine Learning Models in Nuclear Engineering Applications: Where We Are and What Do We Need? arxiv.org/abs/2503.17385 .SY .ML .LG .SY

Machine learning (ML) has been leveraged to tackle a diverse range of tasks in almost all branches of nuclear engineering. Many of the successes in ML applications can be attributed to the recent performance breakthroughs in deep learning, the growing availability of computational power, data, and easy-to-use ML libraries. However, these empirical successes have often outpaced our formal understanding of the ML algorithms. An important but underrated area is uncertainty quantification (UQ) of ML. ML-based models are subject to approximation uncertainty when they are used to make predictions, due to sources including, but not limited to, data noise, data coverage, extrapolation, imperfect model architecture and the stochastic training process. The goal of this paper is to clearly explain and illustrate the importance of UQ of ML. We will elucidate the differences in the basic concepts of UQ of physics-based models and data-driven ML models. Various sources of uncertainties in physical modeling and data-driven modeling will be discussed, demonstrated, and compared. We will also present and demonstrate a few techniques to quantify the ML prediction uncertainties. Finally, we will discuss the need for building a verification, validation and UQ framework to establish ML credibility.
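
To make the idea of ML prediction uncertainty concrete, here is a minimal sketch of one common, generic technique: a bootstrap ensemble. The abstract does not say which techniques the paper demonstrates, so this is an illustration only. The toy data, the cubic polynomial model, and the name `bootstrap_predict` are all assumptions made for this example.

```python
import numpy as np

# Toy training data (an assumption for illustration): noisy samples of
# sin(2x), observed only on the interval [-1, 1].
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=200)
y_train = np.sin(2.0 * x_train) + rng.normal(scale=0.1, size=x_train.size)


def bootstrap_predict(x_query, n_models=100, degree=3):
    """Fit n_models polynomials on bootstrap resamples of the training set.

    Returns the ensemble mean and standard deviation at each query point.
    The spread across ensemble members is a rough proxy for the uncertainty
    that comes from limited data coverage and from extrapolation.
    """
    preds = []
    for _ in range(n_models):
        # Resample the training set with replacement.
        idx = rng.integers(0, x_train.size, size=x_train.size)
        coeffs = np.polyfit(x_train[idx], y_train[idx], degree)
        preds.append(np.polyval(coeffs, x_query))
    preds = np.asarray(preds)
    return preds.mean(axis=0), preds.std(axis=0)


# x = 0.0 lies inside the training range; x = 3.0 is far outside it.
x_query = np.array([0.0, 3.0])
mean, std = bootstrap_predict(x_query)
print("ensemble mean:", mean)
print("ensemble std: ", std)
```

In this toy setting the ensemble spread at the extrapolation point is much larger than at the interpolation point, which matches the abstract's point that data coverage and extrapolation are sources of prediction uncertainty. A bootstrap ensemble does not separate data noise from model uncertainty; that would need a different estimator.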

A new graph-based surrogate model for rapid prediction of crashworthiness performance of vehicle panel components arxiv.org/abs/2503.17386 .SY .LG .SY

During the design cycle of safety critical vehicle components such as B-pillars, crashworthiness performance is a key metric for passenger protection assessment in vehicle accidents. Traditional finite element simulations for crashworthiness analysis involve complex modelling, leading to an increased computational demand. Although a few machine learning-based surrogate models have been developed for rapid predictions for crashworthiness analysis, they exhibit limitations in detailed representation of complex 3D components. Graph Neural Networks (GNNs) have emerged as a promising solution for processing data with complex structures. However, existing GNN models often lack sufficient accuracy and computational efficiency to meet industrial demands. This paper proposes Recurrent Graph U-Net (ReGUNet), a new graph-based surrogate model for crashworthiness analysis of vehicle panel components. ReGUNet adopts a U-Net architecture with multiple graph downsampling and upsampling layers, which improves the model's computational efficiency and accuracy; the introduction of recurrence enhances the accuracy and stability of temporal predictions over multiple time steps. ReGUNet is evaluated through a case study of side crash testing of a B-pillar component with variation in geometric design. The trained model demonstrates great accuracy in predicting the dynamic behaviour of previously unseen component designs within a relative error of 0.74% for the maximum B-pillar intrusion. Compared to the baseline models, ReGUNet can reduce the averaged mean prediction error of the component's deformation by more than 51% with significant improvement in computational efficiency. Given its enhanced accuracy and efficiency, ReGUNet shows greater potential in accurate predictions of large and complex graphs compared to existing models.

BPINN-EM-Post: Stochastic Electromigration Damage Analysis in the Post-Void Phase based on Bayesian Physics-Informed Neural Network arxiv.org/abs/2503.17393 .LG

In contrast to the assumptions of most existing Electromigration (EM) analysis tools, the evolution of EM-induced stress is inherently non-deterministic, influenced by factors such as input current fluctuations and manufacturing non-idealities. Traditional approaches for estimating stress variations typically involve computationally expensive and inefficient Monte Carlo simulations with industrial solvers, which quantify variations using mean and variance metrics. In this work, we introduce a novel machine learning-based framework, termed BPINN-EM-Post, for efficient stochastic analysis of EM-induced post-voiding aging processes. This new approach integrates closed-form analytical solutions with a Bayesian Physics-Informed Neural Network (BPINN) framework to accelerate the analysis for the first time. The closed-form solutions enforce physical laws at the individual wire segment level, while the BPINN ensures that physics constraints at inter-segment junctions are satisfied and stochastic behaviors are accurately modeled. By reducing the number of variables in the loss functions through the use of analytical solutions, our method significantly improves training efficiency without accuracy loss and naturally incorporates variational effects. Additionally, the analytical solutions effectively address the challenge of incorporating initial stress distributions in interconnect structures during post-void stress calculations. Numerical results demonstrate that BPINN-EM-Post achieves over 240x speedup compared to Monte Carlo simulations using the FEM-based COMSOL solver and more than 65x speedup compared to Monte Carlo simulations using the FDM-based EMSpice method.

OpenAI's Approach to External Red Teaming for AI Models and Systems arxiv.org/abs/2503.16431 .CY .AI .CR .HC

Red teaming has emerged as a critical practice in assessing the possible risks of AI models and systems. It aids in the discovery of novel risks, stress testing possible gaps in existing mitigations, enriching existing quantitative safety metrics, facilitating the creation of new safety measurements, and enhancing public trust and the legitimacy of AI risk assessments. This white paper describes OpenAI's work to date in external red teaming and draws some more general conclusions from this work. We describe the design considerations underpinning external red teaming, which include: selecting the composition of the red team, deciding on access levels, and providing the guidance required to conduct red teaming. Additionally, we show outcomes red teaming can enable, such as input into risk assessment and automated evaluations. We also describe the limitations of external red teaming, and how it can fit into a broader range of AI model and system evaluations. Through these contributions, we hope that AI developers and deployers, evaluation creators, and policymakers will be able to better design red teaming campaigns and get a deeper look into how external red teaming can fit into model deployment and evaluation processes. These methods are evolving and the value of different methods continues to shift as the ecosystem around red teaming matures and models themselves improve as tools for red teaming.

Multimodal Transformer Models for Turn-taking Prediction: Effects on Conversational Dynamics of Human-Agent Interaction during Cooperative Gameplay arxiv.org/abs/2503.16432 .HC .AI .CL

This study investigates multimodal turn-taking prediction within human-agent interactions (HAI), particularly focusing on cooperative gaming environments. It comprises both model development and a subsequent user study, aiming to refine our understanding and improve conversational dynamics in spoken dialogue systems (SDSs). For the modeling phase, we introduce a novel transformer-based deep learning (DL) model that simultaneously integrates multiple modalities (text, vision, audio, and contextual in-game data) to predict turn-taking events in real-time. Our model employs a Crossmodal Transformer architecture to effectively fuse information from these diverse modalities, enabling more comprehensive turn-taking predictions. The model demonstrates superior performance compared to baseline models, achieving 87.3% accuracy and 83.0% macro F1 score. A human user study was then conducted to empirically evaluate the turn-taking DL model in an interactive scenario with a virtual avatar while playing the game "Don't Starve Together", comparing a control condition without turn-taking prediction (n=20) to an experimental condition with our model deployed (n=40). Both conditions included a mix of English and Korean speakers, since turn-taking cues are known to vary by culture. We then analyzed the interaction quality, examining aspects such as utterance counts, interruption frequency, and participant perceptions of the avatar. Results from the user study suggest that our multimodal turn-taking model not only enhances the fluidity and naturalness of human-agent conversations, but also maintains a balanced conversational dynamic without significantly altering dialogue frequency. The study provides in-depth insights into the influence of turn-taking abilities on user perceptions and interaction quality, underscoring the potential for more contextually adaptive and responsive conversational agents.
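
The abstract reports both accuracy and macro F1. As a reminder of how the two metrics differ, the sketch below computes both from scratch on a tiny hand-made example. The class names `hold` and `shift` and the label sequences are hypothetical, chosen only for illustration, and do not come from the paper.

```python
from collections import Counter


def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro F1) for two equal-length label sequences.

    Macro F1 is the unweighted mean of the per-class F1 scores, so every
    class counts equally regardless of how often it occurs.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall.
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Hypothetical turn-taking labels: "hold" (speaker keeps the turn) is more
# frequent than "shift" (turn passes to the other party).
y_true = ["hold", "hold", "hold", "hold", "shift", "shift"]
y_pred = ["hold", "hold", "hold", "shift", "shift", "hold"]
accuracy, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, macro F1={macro_f1:.3f}")
```

In this toy example the per-class F1 scores are 0.75 for `hold` and 0.5 for `shift`, giving a macro F1 of 0.625 against an accuracy of about 0.667. The rarer class pulls the macro average down, which is one reason imbalanced tasks often report both numbers.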

AI-Generated Content in Landscape Architecture: A Survey arxiv.org/abs/2503.16435 .HC

Landscape design is a complex process that requires designers to engage in intricate planning, analysis, and decision-making. This process involves the integration and reconstruction of science, art, and technology. Traditional landscape design methods often rely on the designer's personal experience and subjective aesthetics, with design standards rooted in subjective perception. As a result, they lack scientific and objective evaluation criteria and systematic design processes. Data-driven artificial intelligence (AI) technology provides an objective and rational design process. With the rapid development of different AI technologies, AI-generated content (AIGC) has permeated various aspects of landscape design at an unprecedented speed, serving as an innovative design tool. This article aims to explore the applications and opportunities of AIGC in landscape design. AIGC can support landscape design in areas such as site research and analysis, design concepts and scheme generation, parametric design optimization, plant selection and visual simulation, construction management, and process optimization. However, AIGC also faces challenges in landscape design, including data quality and reliability, design expertise and judgment, technical challenges and limitations, site characteristics and sustainability, user needs and participation, the balance between technology and creativity, ethics, and social impact. Finally, this article provides a detailed outlook on the future development trends and prospects of AIGC in landscape design. Through in-depth research and exploration in this review, readers can gain a better understanding of the relevant applications, potential opportunities, and key challenges of AIGC in landscape design.

Enhancing Human-Robot Collaboration through Existing Guidelines: A Case Study Approach arxiv.org/abs/2503.16436 .HC .AI .RO

As AI systems become more prevalent, concerns about their development, operation, and societal impact intensify. Establishing ethical, social, and safety standards amidst evolving AI capabilities poses significant challenges. Global initiatives are underway to establish guidelines for AI system development and operation. With the increasing use of collaborative human-AI task execution, it's vital to continuously adapt AI systems to meet user and environmental needs. Failure to synchronize AI evolution with changes in users and the environment could result in ethical and safety issues. This paper evaluates the applicability of existing guidelines in human-robot collaborative systems, assesses their effectiveness, and discusses limitations. Through a case study, we examine whether our target system meets requirements outlined in existing guidelines and propose improvements to enhance human-robot interactions. Our contributions provide insights into interpreting and applying guidelines, offer concrete examples of system enhancement, and highlight their applicability and limitations. We believe these contributions will stimulate discussions and influence system assurance and certification in future AI-infused critical systems.

Cause-effect perception in an object place task arxiv.org/abs/2503.16440 .HC .AI

Algorithmic causal discovery is based on formal reasoning and provably converges toward the optimal solution. However, since some of the underlying assumptions are often not met in practice, no applications for autonomous everyday life competence are yet available. Humans, on the other hand, possess full everyday competence and develop cognitive models in a data-efficient manner with the ability to transfer knowledge between and to new situations. Here we investigate the causal discovery capabilities of humans in an object place task in virtual reality (VR) with haptic feedback and compare the results to the state-of-the-art causal discovery algorithms FGES, PC and FCI. In addition, we use the algorithms to analyze causal relations between sensory information and the kinematic parameters of human behavior. Our findings show that the majority of participants were able to determine which variables are causally related. This is in line with causal discovery algorithms like PC, which recover causal dependencies in the first step. However, unlike such algorithms, which can identify causes and effects in our test configuration, humans are unsure in determining a causal direction. Regarding the relation between the sensory information provided to the participants and their placing actions (i.e. their kinematic parameters), the data yields a surprising dissociation of the subjects' knowledge and the sensorimotor level. Knowledge of the cause-effect pairs, though undirected, should suffice to improve subjects' movements. Yet a detailed causal analysis provides little evidence for any such influence. This, together with the reports of the participants, implies that instead of exploiting their consciously perceived information they leave it to the sensorimotor level to control the movement.
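
The abstract notes that algorithms like PC recover undirected causal dependencies in a first step. The sketch below illustrates that skeleton step in simplified form, under assumptions that are not from the paper: linear-Gaussian data, a Fisher z-test on partial correlations as the independence test, and a synthetic chain X -> Y -> Z. The function names are hypothetical and this is not the implementation used in the study.

```python
import itertools
import math

import numpy as np


def partial_corr(corr, i, j, cond):
    """Partial correlation of variables i and j given the set cond."""
    idx = [i, j, *cond]
    precision = np.linalg.inv(corr[np.ix_(idx, idx)])
    return -precision[0, 1] / math.sqrt(precision[0, 0] * precision[1, 1])


def is_independent(corr, n, i, j, cond, alpha):
    """Fisher z-test: True if independence of i and j given cond is not rejected."""
    r = max(min(partial_corr(corr, i, j, cond), 0.999999), -0.999999)
    z = math.sqrt(n - len(cond) - 3) * math.atanh(r)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return p_value > alpha


def pc_skeleton(data, alpha=1e-4):
    """Simplified PC skeleton search: start from the complete undirected
    graph and delete an edge once some conditioning set separates its ends."""
    n, d = data.shape
    corr = np.corrcoef(data, rowvar=False)
    adj = {i: set(range(d)) - {i} for i in range(d)}
    size = 0  # current conditioning-set size
    while any(len(adj[i]) - 1 >= size for i in range(d)):
        for i in range(d):
            for j in list(adj[i]):
                others = sorted(adj[i] - {j})
                for cond in itertools.combinations(others, size):
                    if is_independent(corr, n, i, j, cond, alpha):
                        adj[i].discard(j)
                        adj[j].discard(i)
                        break
        size += 1
    return {(i, j) for i in range(d) for j in adj[i] if i < j}


# Synthetic chain X -> Y -> Z (columns 0, 1, 2); all values are made up.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = 2.0 * x + rng.normal(size=2000)
z = 2.0 * y + rng.normal(size=2000)
skeleton = pc_skeleton(np.column_stack([x, y, z]))
print(sorted(skeleton))
```

On this chain, X and Z are correlated marginally but become independent once Y is conditioned on, so the X-Z edge is the one the search is expected to remove. The result is an undirected graph. Orienting the edges is a separate, later step of PC that this sketch omits.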
