From f770c3617d2f415ea1e676ae39a93d0f2dbf48b2 Mon Sep 17 00:00:00 2001 From: Automated Date: Mon, 6 May 2024 08:20:06 +0000 Subject: [PATCH] Latest data: Mon May 6 08:20:06 UTC 2024 --- data/downloads/2024-05-06-08h.jsonl | 342 ++++++++++++++++++++++++++++ 1 file changed, 342 insertions(+) create mode 100644 data/downloads/2024-05-06-08h.jsonl diff --git a/data/downloads/2024-05-06-08h.jsonl b/data/downloads/2024-05-06-08h.jsonl new file mode 100644 index 00000000..2c048df1 --- /dev/null +++ b/data/downloads/2024-05-06-08h.jsonl @@ -0,0 +1,342 @@ +{"created":"2024-05-03 17:59:55","title":"Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models","abstract":"We introduce Vibe-Eval: a new open benchmark and framework for evaluating multimodal chat models. Vibe-Eval consists of 269 visual understanding prompts, including 100 of hard difficulty, complete with gold-standard responses authored by experts. Vibe-Eval is open-ended and challenging with dual objectives: (i) vibe checking multimodal chat models for day-to-day tasks and (ii) rigorously testing and probing the capabilities of present frontier models. Notably, our hard set contains >50% questions that all frontier models answer incorrectly. We explore the nuances of designing, evaluating, and ranking models on ultra challenging prompts. We also discuss trade-offs between human and automatic evaluation, and show that automatic model evaluation using Reka Core roughly correlates to human judgment. We offer free API access for the purpose of lightweight evaluation and plan to conduct formal human evaluations for public models that perform well on the Vibe-Eval's automatic scores. 
We release the evaluation code and data, see https://github.com/reka-ai/reka-vibe-eval","sentences":["We introduce Vibe-Eval: a new open benchmark and framework for evaluating multimodal chat models.","Vibe-Eval consists of 269 visual understanding prompts, including 100 of hard difficulty, complete with gold-standard responses authored by experts.","Vibe-Eval is open-ended and challenging with dual objectives: (i) vibe checking multimodal chat models for day-to-day tasks and (ii) rigorously testing and probing the capabilities of present frontier models.","Notably, our hard set contains >50% questions that all frontier models answer incorrectly.","We explore the nuances of designing, evaluating, and ranking models on ultra challenging prompts.","We also discuss trade-offs between human and automatic evaluation, and show that automatic model evaluation using Reka Core roughly correlates to human judgment.","We offer free API access for the purpose of lightweight evaluation and plan to conduct formal human evaluations for public models that perform well on the Vibe-Eval's automatic scores.","We release the evaluation code and data, see https://github.com/reka-ai/reka-vibe-eval"],"url":"http://arxiv.org/abs/2405.02287v1","category":"cs.CL"} +{"created":"2024-05-03 17:59:27","title":"Accurate standard siren cosmology with joint gravitational-wave and $\u03b3$-ray burst observations","abstract":"Joint gravitational-wave and $\\gamma$-ray bursts (GRB) observations are among the best prospects for standard siren cosmology. However, the strong selection effect for the coincident GRB detection, which is possible only for sources with small inclination angles, induces a systematic uncertainty that is currently not accounted for. We show that this severe source of bias can be removed by inferring the a-priori unknown electromagnetic detection probability directly from multimessenger data. 
This leads at the same time to an unbiased measurement of the Hubble constant, to constrain the properties of GRB emission, and to accurately measure the viewing angle of each source. Our inference scheme is applicable to real data already in the small-statistics regime, a scenario that might become reality in the near future. Additionally, we introduce a novel likelihood approximant for GW events which treats the dependence on distance and inclination as exact.","sentences":["Joint gravitational-wave and $\\gamma$-ray bursts (GRB) observations are among the best prospects for standard siren cosmology.","However, the strong selection effect for the coincident GRB detection, which is possible only for sources with small inclination angles, induces a systematic uncertainty that is currently not accounted for.","We show that this severe source of bias can be removed by inferring the a-priori unknown electromagnetic detection probability directly from multimessenger data.","This leads at the same time to an unbiased measurement of the Hubble constant, to constrain the properties of GRB emission, and to accurately measure the viewing angle of each source.","Our inference scheme is applicable to real data already in the small-statistics regime, a scenario that might become reality in the near future.","Additionally, we introduce a novel likelihood approximant for GW events which treats the dependence on distance and inclination as exact."],"url":"http://arxiv.org/abs/2405.02286v1","category":"astro-ph.HE"} +{"created":"2024-05-03 17:56:39","title":"Modulational instability in a quasi-one-dimensional Bose-Einstein condensates","abstract":"In this work, we investigate the modulational instability of plane wave solutions within a modified Gross-Pitaevskii equation framework. The equation features cubic and quartic nonlinearity. It models the behaviour of quasi-one-dimensional Bose-Einstein condensates in symmetric Bose-Bose mixtures of ultra-dilute cold atoms. 
Our study demonstrates the pivotal role of the competition between mean-field attractions and quantum fluctuation-induced repulsions. This competition significantly affects the emergence and evolution of modulational instability. By employing linear stability analysis, we identify the essential conditions that lead to modulational instability. We find that the stability of plane wave solutions significantly depends on the interaction among system parameters. Further development of the instability leads to the fragmentation of the BEC into a chain of quantum droplets. We calculated the quantity of quantum droplets generated during the nonlinear phase of the instability. Our analytical results are corroborated by numerical simulations of the modified quasi-1D Gross-Pitaevskii equation. These simulations vividly depict the formation, interaction, and coalescence of droplets during the nonlinear phase of modulational instability. The investigation shows that linear stability analysis of the modified Gross-Pitaevskii equation, considering quantum fluctuations, precisely forecasts modulational instability phenomena across different domains of parameter spaces.","sentences":["In this work, we investigate the modulational instability of plane wave solutions within a modified Gross-Pitaevskii equation framework.","The equation features cubic and quartic nonlinearity.","It models the behaviour of quasi-one-dimensional Bose-Einstein condensates in symmetric Bose-Bose mixtures of ultra-dilute cold atoms.","Our study demonstrates the pivotal role of the competition between mean-field attractions and quantum fluctuation-induced repulsions.","This competition significantly affects the emergence and evolution of modulational instability.","By employing linear stability analysis, we identify the essential conditions that lead to modulational instability.","We find that the stability of plane wave solutions significantly depends on the interaction among system parameters.","Further 
development of the instability leads to the fragmentation of the BEC into a chain of quantum droplets.","We calculated the quantity of quantum droplets generated during the nonlinear phase of the instability.","Our analytical results are corroborated by numerical simulations of the modified quasi-1D Gross-Pitaevskii equation.","These simulations vividly depict the formation, interaction, and coalescence of droplets during the nonlinear phase of modulational instability.","The investigation shows that linear stability analysis of the modified Gross-Pitaevskii equation, considering quantum fluctuations, precisely forecasts modulational instability phenomena across different domains of parameter spaces."],"url":"http://arxiv.org/abs/2405.02282v1","category":"cond-mat.quant-gas"} +{"created":"2024-05-03 17:56:18","title":"Reviving Horndeski after GW170817 by Kaluza-Klein compactifications","abstract":"The application of Horndeski theory/ Galileons for late time cosmology is heavily constrained by the strict coincidence in the speed of propagation of gravitational and electromagnetic waves. These constraints presuppose that the minimally coupled photon is not modified, not even at the scales where General Relativity (GR) may need modification. We find that the 4D Galileon obtained from a Kaluza-Klein compactification of its higher dimensional version is a natural unified modification of GR and electromagnetism with automatically \"luminal\" gravitational waves. This property follows without any fine tuning of Galileon potentials for a larger class of theories than previously thought. In particular, the $G_4$ potential is not constrained by the speed test and $G_5$ may also be present. In other words, some Galileon models that have been ruled out since the event GW170817 are, in fact, not necessarily constrained if they arise in 4D from compactifications of their higher dimensional Galileon counterparts. 
Besides their compelling luminality, the resulting vector Galileons are naturally $U(1)$ gauge invariant.","sentences":["The application of Horndeski theory/ Galileons for late time cosmology is heavily constrained by the strict coincidence in the speed of propagation of gravitational and electromagnetic waves.","These constraints presuppose that the minimally coupled photon is not modified, not even at the scales where General Relativity (GR) may need modification.","We find that the 4D Galileon obtained from a Kaluza-Klein compactification of its higher dimensional version is a natural unified modification of GR and electromagnetism with automatically \"luminal\" gravitational waves.","This property follows without any fine tuning of Galileon potentials for a larger class of theories than previously thought.","In particular, the $G_4$ potential is not constrained by the speed test and $G_5$ may also be present.","In other words, some Galileon models that have been ruled out since the event GW170817 are, in fact, not necessarily constrained if they arise in 4D from compactifications of their higher dimensional Galileon counterparts.","Besides their compelling luminality, the resulting vector Galileons are naturally $U(1)$ gauge invariant."],"url":"http://arxiv.org/abs/2405.02281v1","category":"hep-th"} +{"created":"2024-05-03 17:55:34","title":"DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos","abstract":"Existing VLMs can track in-the-wild 2D video objects while current generative models provide powerful visual priors for synthesizing novel views for the highly under-constrained 2D-to-3D object lifting. Building upon this exciting progress, we present DreamScene4D, the first approach that can generate three-dimensional dynamic scenes of multiple objects from monocular in-the-wild videos with large object motion across occlusions and novel viewpoints. 
Our key insight is to design a \"decompose-then-recompose\" scheme to factorize both the whole video scene and each object's 3D motion. We first decompose the video scene by using open-vocabulary mask trackers and an adapted image diffusion model to segment, track, and amodally complete the objects and background in the video. Each object track is mapped to a set of 3D Gaussians that deform and move in space and time. We also factorize the observed motion into multiple components to handle fast motion. The camera motion can be inferred by re-rendering the background to match the video frames. For the object motion, we first model the object-centric deformation of the objects by leveraging rendering losses and multi-view generative priors in an object-centric frame, then optimize object-centric to world-frame transformations by comparing the rendered outputs against the perceived pixel and optical flow. Finally, we recompose the background and objects and optimize for relative object scales using monocular depth prediction guidance. We show extensive results on the challenging DAVIS, Kubric, and self-captured videos, detail some limitations, and provide future directions. 
Besides 4D scene generation, our results show that DreamScene4D enables accurate 2D point motion tracking by projecting the inferred 3D trajectories to 2D, while never explicitly trained to do so.","sentences":["Existing VLMs can track in-the-wild 2D video objects while current generative models provide powerful visual priors for synthesizing novel views for the highly under-constrained 2D-to-3D object lifting.","Building upon this exciting progress, we present DreamScene4D, the first approach that can generate three-dimensional dynamic scenes of multiple objects from monocular in-the-wild videos with large object motion across occlusions and novel viewpoints.","Our key insight is to design a \"decompose-then-recompose\" scheme to factorize both the whole video scene and each object's 3D motion.","We first decompose the video scene by using open-vocabulary mask trackers and an adapted image diffusion model to segment, track, and amodally complete the objects and background in the video.","Each object track is mapped to a set of 3D Gaussians that deform and move in space and time.","We also factorize the observed motion into multiple components to handle fast motion.","The camera motion can be inferred by re-rendering the background to match the video frames.","For the object motion, we first model the object-centric deformation of the objects by leveraging rendering losses and multi-view generative priors in an object-centric frame, then optimize object-centric to world-frame transformations by comparing the rendered outputs against the perceived pixel and optical flow.","Finally, we recompose the background and objects and optimize for relative object scales using monocular depth prediction guidance.","We show extensive results on the challenging DAVIS, Kubric, and self-captured videos, detail some limitations, and provide future directions.","Besides 4D scene generation, our results show that DreamScene4D enables accurate 2D point motion tracking by projecting 
the inferred 3D trajectories to 2D, while never explicitly trained to do so."],"url":"http://arxiv.org/abs/2405.02280v1","category":"cs.CV"} +{"created":"2024-05-03 17:53:15","title":"An error-mitigated photonic quantum circuit Born machine","abstract":"Generative machine learning models aim to learn the underlying distribution of the data in order to generate new samples. Quantum circuit Born machines (QCBMs) are a popular choice of quantum generative models, which are particularly well suited to near-term devices since they can be implemented on shallow circuits. Within the framework of photonic quantum computing, we design and simulate a QCBM that can be implemented with linear optics. We show that a newly developed error mitigation technique called recycling mitigation greatly improves the training of QCBMs in realistic scenarios with photon loss.","sentences":["Generative machine learning models aim to learn the underlying distribution of the data in order to generate new samples.","Quantum circuit Born machines (QCBMs) are a popular choice of quantum generative models, which are particularly well suited to near-term devices since they can be implemented on shallow circuits.","Within the framework of photonic quantum computing, we design and simulate a QCBM that can be implemented with linear optics.","We show that a newly developed error mitigation technique called recycling mitigation greatly improves the training of QCBMs in realistic scenarios with photon loss."],"url":"http://arxiv.org/abs/2405.02277v1","category":"quant-ph"} +{"created":"2024-05-03 17:49:57","title":"Gotzmann's persistence theorem for smooth projective toric varieties","abstract":"Gotzmann's persistence theorem enables us to confirm the Hilbert polynomial of a subscheme of projective space by checking the Hilbert function in just two points, regardless of the dimension of the ambient space. 
We generalise this result to products of projective spaces, and then extend our result to any smooth projective toric variety. In the case of products of projective spaces, the number of points depends solely on the Picard rank of the ambient space, rather than on the dimension. For a more general smooth projective toric variety this number depends on the number of Hilbert basis elements of the nef cone.","sentences":["Gotzmann's persistence theorem enables us to confirm the Hilbert polynomial of a subscheme of projective space by checking the Hilbert function in just two points, regardless of the dimension of the ambient space.","We generalise this result to products of projective spaces, and then extend our result to any smooth projective toric variety.","In the case of products of projective spaces, the number of points depends solely on the Picard rank of the ambient space, rather than on the dimension.","For a more general smooth projective toric variety this number depends on the number of Hilbert basis elements of the nef cone."],"url":"http://arxiv.org/abs/2405.02275v1","category":"math.AG"} +{"created":"2024-05-03 17:45:31","title":"Mapping Cone and Morse Theory","abstract":"On a smooth manifold, we associate to any closed differential form a mapping cone algebra. The cohomology of this mapping cone algebra can vary with the de Rham cohomology class of the closed form. We present a novel Morse theoretical description for the mapping cone cohomology. Specifically, we introduce a Morse complex for the mapping cone algebra which is generated by pairs of critical points with the differential defined by gradient flows and an integration of the closed form over spaces of gradient flow lines. We prove that the cohomology of our cone Morse complex is isomorphic to the mapping cone cohomology and hence independent of both the Riemannian metric and the Morse function used to define the complex. 
We also obtain sharp inequalities that bound the dimension of the mapping cone cohomology in terms of the number of Morse critical points and the properties of the specified closed form. Our results are widely applicable, especially for any manifold equipped with a geometric structure described by a closed differential form. We also obtain a bound on the difference between the number of Morse critical points and the Betti numbers.","sentences":["On a smooth manifold, we associate to any closed differential form a mapping cone algebra.","The cohomology of this mapping cone algebra can vary with the de Rham cohomology class of the closed form.","We present a novel Morse theoretical description for the mapping cone cohomology.","Specifically, we introduce a Morse complex for the mapping cone algebra which is generated by pairs of critical points with the differential defined by gradient flows and an integration of the closed form over spaces of gradient flow lines.","We prove that the cohomology of our cone Morse complex is isomorphic to the mapping cone cohomology and hence independent of both the Riemannian metric and the Morse function used to define the complex.","We also obtain sharp inequalities that bound the dimension of the mapping cone cohomology in terms of the number of Morse critical points and the properties of the specified closed form.","Our results are widely applicable, especially for any manifold equipped with a geometric structure described by a closed differential form.","We also obtain a bound on the difference between the number of Morse critical points and the Betti numbers."],"url":"http://arxiv.org/abs/2405.02272v1","category":"math.DG"} +{"created":"2024-05-03 17:37:26","title":"On the structures of subset sets in higher dimension","abstract":"A given subset $A$ of natural numbers is said to be complete if every element of $\\N$ is the sum of distinct terms taken from $A$. 
This topic is strongly connected to the knapsack problem which is known to be NP complete. The main goal of the paper is to study the structure of subset sums in a higher dimension. We show 'dense' sets and generalized arithmetic progrssions in subset sums of certain sets.","sentences":["A given subset $A$ of natural numbers is said to be complete if every element of $\\N$ is the sum of distinct terms taken from $A$.","This topic is strongly connected to the knapsack problem which is known to be NP complete.","The main goal of the paper is to study the structure of subset sums in a higher dimension.","We show 'dense' sets and generalized arithmetic progrssions in subset sums of certain sets."],"url":"http://arxiv.org/abs/2405.02269v1","category":"math.CO"} +{"created":"2024-05-03 17:34:57","title":"Structural Pruning of Pre-trained Language Models via Neural Architecture Search","abstract":"Pre-trained language models (PLM), for example BERT or RoBERTa, mark the state-of-the-art for natural language understanding task when fine-tuned on labeled data. However, their large size poses challenges in deploying them for inference in real-world applications, due to significant GPU memory requirements and high inference latency. This paper explores neural architecture search (NAS) for structural pruning to find sub-parts of the fine-tuned network that optimally trade-off efficiency, for example in terms of model size or latency, and generalization performance. We also show how we can utilize more recently developed two-stage weight-sharing NAS approaches in this setting to accelerate the search process. 
Unlike traditional pruning methods with fixed thresholds, we propose to adopt a multi-objective approach that identifies the Pareto optimal set of sub-networks, allowing for a more flexible and automated compression process.","sentences":["Pre-trained language models (PLM), for example BERT or RoBERTa, mark the state-of-the-art for natural language understanding task when fine-tuned on labeled data.","However, their large size poses challenges in deploying them for inference in real-world applications, due to significant GPU memory requirements and high inference latency.","This paper explores neural architecture search (NAS) for structural pruning to find sub-parts of the fine-tuned network that optimally trade-off efficiency, for example in terms of model size or latency, and generalization performance.","We also show how we can utilize more recently developed two-stage weight-sharing NAS approaches in this setting to accelerate the search process.","Unlike traditional pruning methods with fixed thresholds, we propose to adopt a multi-objective approach that identifies the Pareto optimal set of sub-networks, allowing for a more flexible and automated compression process."],"url":"http://arxiv.org/abs/2405.02267v1","category":"cs.LG"} +{"created":"2024-05-03 17:34:02","title":"On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning?","abstract":"The development of large vision-language models, notably CLIP, has catalyzed research into effective adaptation techniques, with a particular focus on soft prompt tuning. Conjointly, test-time augmentation, which utilizes multiple augmented views of a single image to enhance zero-shot generalization, is emerging as a significant area of interest. This has predominantly directed research efforts toward test-time prompt tuning. 
In contrast, we introduce a robust MeanShift for Test-time Augmentation (MTA), which surpasses prompt-based methods without requiring this intensive training procedure. This positions MTA as an ideal solution for both standalone and API-based applications. Additionally, our method does not rely on ad hoc rules (e.g., confidence threshold) used in some previous test-time augmentation techniques to filter the augmented views. Instead, MTA incorporates a quality assessment variable for each view directly into its optimization process, termed as the inlierness score. This score is jointly optimized with a density mode seeking process, leading to an efficient training- and hyperparameter-free approach. We extensively benchmark our method on 15 datasets and demonstrate MTA's superiority and computational efficiency. Deployed easily as plug-and-play module on top of zero-shot models and state-of-the-art few-shot methods, MTA shows systematic and consistent improvements.","sentences":["The development of large vision-language models, notably CLIP, has catalyzed research into effective adaptation techniques, with a particular focus on soft prompt tuning.","Conjointly, test-time augmentation, which utilizes multiple augmented views of a single image to enhance zero-shot generalization, is emerging as a significant area of interest.","This has predominantly directed research efforts toward test-time prompt tuning.","In contrast, we introduce a robust MeanShift for Test-time Augmentation (MTA), which surpasses prompt-based methods without requiring this intensive training procedure.","This positions MTA as an ideal solution for both standalone and API-based applications.","Additionally, our method does not rely on ad hoc rules (e.g., confidence threshold) used in some previous test-time augmentation techniques to filter the augmented views.","Instead, MTA incorporates a quality assessment variable for each view directly into its optimization process, termed as the inlierness 
score.","This score is jointly optimized with a density mode seeking process, leading to an efficient training- and hyperparameter-free approach.","We extensively benchmark our method on 15 datasets and demonstrate MTA's superiority and computational efficiency.","Deployed easily as plug-and-play module on top of zero-shot models and state-of-the-art few-shot methods, MTA shows systematic and consistent improvements."],"url":"http://arxiv.org/abs/2405.02266v1","category":"cs.CV"} +{"created":"2024-05-03 17:32:55","title":"Characterizing randomness in parameterized quantum circuits through expressibility and average entanglement","abstract":"While scalable error correction schemes and fault tolerant quantum computing seem not to be universally accessible in the near sight, the efforts of many researchers have been directed to the exploration of the contemporary available quantum hardware. Due to these limitations, the depth and dimension of the possible quantum circuits are restricted. This motivates the study of circuits with parameterized operations that can be classically optimized in hybrid methods as variational quantum algorithms (VQAs), enabling the reduction of circuit depth and size. The characteristics of these Parameterized Quantum Circuits (PQCs) are still not fully understood outside the scope of their principal application, motivating the study of their intrinsic properties. In this work, we analyse the generation of random states in PQCs under restrictions on the qubits connectivities, justified by different quantum computer architectures. We apply the expressibility quantifier and the average entanglement as diagnostics for the characteristics of the generated states and classify the circuits depending on the topology of the quantum computer where they can be implemented. 
As a function of the number of layers and qubits, circuits following a Ring topology will have the highest entanglement and expressibility values, followed by Linear/All-to-all almost together and the Star topology. In addition to the characterization of the differences between the entanglement and expressibility of these circuits, we also place a connection between how steep is the increase on the uniformity of the distribution of the generated states and the generation of entanglement. Circuits generating average and standard deviation for entanglement closer to values obtained with the truly uniformly random ensemble of unitaries present a steeper evolution when compared to others.","sentences":["While scalable error correction schemes and fault tolerant quantum computing seem not to be universally accessible in the near sight, the efforts of many researchers have been directed to the exploration of the contemporary available quantum hardware.","Due to these limitations, the depth and dimension of the possible quantum circuits are restricted.","This motivates the study of circuits with parameterized operations that can be classically optimized in hybrid methods as variational quantum algorithms (VQAs), enabling the reduction of circuit depth and size.","The characteristics of these Parameterized Quantum Circuits (PQCs) are still not fully understood outside the scope of their principal application, motivating the study of their intrinsic properties.","In this work, we analyse the generation of random states in PQCs under restrictions on the qubits connectivities, justified by different quantum computer architectures.","We apply the expressibility quantifier and the average entanglement as diagnostics for the characteristics of the generated states and classify the circuits depending on the topology of the quantum computer where they can be implemented.","As a function of the number of layers and qubits, circuits following a Ring topology will have the highest 
entanglement and expressibility values, followed by Linear/All-to-all almost together and the Star topology.","In addition to the characterization of the differences between the entanglement and expressibility of these circuits, we also place a connection between how steep is the increase on the uniformity of the distribution of the generated states and the generation of entanglement.","Circuits generating average and standard deviation for entanglement closer to values obtained with the truly uniformly random ensemble of unitaries present a steeper evolution when compared to others."],"url":"http://arxiv.org/abs/2405.02265v1","category":"quant-ph"} +{"created":"2024-05-03 17:20:56","title":"Spatio-temporal spectral transfers in fluid dynamics","abstract":"Motivated by previous work on kinetic energy cascades in the ocean and atmosphere, we develop a spatio-temporal spectral transfer tool that can be used to study scales of variability in generalized dynamical systems. In particular, we use generalized time-frequency methods from signal analysis to broaden the applicability of frequency transfers from theoretical to practical applications such as the study of ocean or atmosphere data or simulation output. We also show that triad interactions in wavenumber used to study kinetic energy and enstrophy cascades can be generalized to study triad interactions in frequency or wavenumber-frequency. We study the effects of sweeping on the locality of frequency transfers and frequency triad interactions to better understand the locality of spatio-temporal frequency transfers. As an illustrative example, we use the spatio-temporal spectral transfer tool to study the results of a simulation of two-dimensional homogeneous isotropic turbulence. This simulated fluid is forced at a well-defined wavenumber and frequency with dissipation occurring at both large and small scales, making this one of the first studies of \"modulated turbulence\" in two dimensions. 
Our results show that the spatio-temporal transfers we develop in this paper are robust to potential practical problems such as low sampling rates or nonstationarity in time series of interest. We anticipate that this method will be a useful tool in studying scales of spatio-temporal variability in a wide range of fluids applications as higher resolution data and simulations become more widely available.","sentences":["Motivated by previous work on kinetic energy cascades in the ocean and atmosphere, we develop a spatio-temporal spectral transfer tool that can be used to study scales of variability in generalized dynamical systems.","In particular, we use generalized time-frequency methods from signal analysis to broaden the applicability of frequency transfers from theoretical to practical applications such as the study of ocean or atmosphere data or simulation output.","We also show that triad interactions in wavenumber used to study kinetic energy and enstrophy cascades can be generalized to study triad interactions in frequency or wavenumber-frequency.","We study the effects of sweeping on the locality of frequency transfers and frequency triad interactions to better understand the locality of spatio-temporal frequency transfers.","As an illustrative example, we use the spatio-temporal spectral transfer tool to study the results of a simulation of two-dimensional homogeneous isotropic turbulence.","This simulated fluid is forced at a well-defined wavenumber and frequency with dissipation occurring at both large and small scales, making this one of the first studies of \"modulated turbulence\" in two dimensions.","Our results show that the spatio-temporal transfers we develop in this paper are robust to potential practical problems such as low sampling rates or nonstationarity in time series of interest.","We anticipate that this method will be a useful tool in studying scales of spatio-temporal variability in a wide range of fluids applications as higher 
resolution data and simulations become more widely available."],"url":"http://arxiv.org/abs/2405.02259v1","category":"physics.flu-dyn"} +{"created":"2024-05-03 17:19:05","title":"Cryogenic optical beam steering for superconducting device calibration","abstract":"We have developed a calibration system based on a micro-electromechanical systems (MEMS) mirror that is capable of delivering an optical beam over a wavelength range of 180 -- 2000 nm (0.62 -- 6.89 eV) in a sub-Kelvin environment. This portable, integrated system can steer the beam over a $\\sim$3 cm $\\times$ 3 cm area on the surface of any sensor with a precision of $\\sim$100 $\\mu$m, enabling characterization of device response as a function of position. This fills a critical need in the landscape of calibration tools for sub-Kelvin devices, including those used for dark matter detection and quantum computing. These communities have a shared goal of understanding the impact of ionizing radiation on device performance, which can be pursued with our system. 
This paper describes the design of the first-generation calibration system and the results from successfully testing its performance at room temperature and 20 mK.","sentences":["We have developed a calibration system based on a micro-electromechanical systems (MEMS) mirror that is capable of delivering an optical beam over a wavelength range of 180 -- 2000 nm (0.62 -- 6.89 eV) in a sub-Kelvin environment.","This portable, integrated system can steer the beam over a $\\sim$3 cm $\\times$ 3 cm area on the surface of any sensor with a precision of $\\sim$100 $\\mu$m, enabling characterization of device response as a function of position.","This fills a critical need in the landscape of calibration tools for sub-Kelvin devices, including those used for dark matter detection and quantum computing.","These communities have a shared goal of understanding the impact of ionizing radiation on device performance, which can be pursued with our system.","This paper describes the design of the first-generation calibration system and the results from successfully testing its performance at room temperature and 20 mK."],"url":"http://arxiv.org/abs/2405.02258v1","category":"quant-ph"} +{"created":"2024-05-03 17:07:45","title":"Geometric Fabrics: a Safe Guiding Medium for Policy Learning","abstract":"Robotics policies are always subjected to complex, second order dynamics that entangle their actions with resulting states. In reinforcement learning (RL) contexts, policies have the burden of deciphering these complicated interactions over massive amounts of experience and complex reward functions to learn how to accomplish tasks. Moreover, policies typically issue actions directly to controllers like Operational Space Control (OSC) or joint PD control, which induces straightline motion towards these action targets in task or joint space. 
However, straightline motion in these spaces for the most part do not capture the rich, nonlinear behavior our robots need to exhibit, shifting the burden of discovering these behaviors more completely to the agent. Unlike these simpler controllers, geometric fabrics capture a much richer and desirable set of behaviors via artificial, second order dynamics grounded in nonlinear geometry. These artificial dynamics shift the uncontrolled dynamics of a robot via an appropriate control law to form behavioral dynamics. Behavioral dynamics unlock a new action space and safe, guiding behavior over which RL policies are trained. Behavioral dynamics enable bang-bang-like RL policy actions that are still safe for real robots, simplify reward engineering, and help sequence real-world, high-performance policies. We describe the framework more generally and create a specific instantiation for the problem of dexterous, in-hand reorientation of a cube by a highly actuated robot hand.","sentences":["Robotics policies are always subjected to complex, second order dynamics that entangle their actions with resulting states.","In reinforcement learning (RL) contexts, policies have the burden of deciphering these complicated interactions over massive amounts of experience and complex reward functions to learn how to accomplish tasks.","Moreover, policies typically issue actions directly to controllers like Operational Space Control (OSC) or joint PD control, which induces straightline motion towards these action targets in task or joint space.","However, straightline motion in these spaces for the most part do not capture the rich, nonlinear behavior our robots need to exhibit, shifting the burden of discovering these behaviors more completely to the agent.","Unlike these simpler controllers, geometric fabrics capture a much richer and desirable set of behaviors via artificial, second order dynamics grounded in nonlinear geometry.","These artificial dynamics shift the uncontrolled 
dynamics of a robot via an appropriate control law to form behavioral dynamics.","Behavioral dynamics unlock a new action space and safe, guiding behavior over which RL policies are trained.","Behavioral dynamics enable bang-bang-like RL policy actions that are still safe for real robots, simplify reward engineering, and help sequence real-world, high-performance policies.","We describe the framework more generally and create a specific instantiation for the problem of dexterous, in-hand reorientation of a cube by a highly actuated robot hand."],"url":"http://arxiv.org/abs/2405.02250v1","category":"cs.RO"} +{"created":"2024-05-03 17:00:57","title":"Deep Learning of ab initio Hessians for Transition State Optimization","abstract":"Identifying transition states -- saddle points on the potential energy surface connecting reactant and product minima -- is central to predicting kinetic barriers and understanding chemical reaction mechanisms. In this work, we train an equivariant neural network potential, NewtonNet, on an ab initio dataset of thousands of organic reactions from which we derive the analytical Hessians from the fully differentiable machine learning (ML) model. By reducing the computational cost by several orders of magnitude relative to the Density Functional Theory (DFT) ab initio source, we can afford to use the learned Hessians at every step for the saddle point optimizations. We have implemented our ML Hessian algorithm in Sella, an open source software package designed to optimize atomic systems to find saddle point structures, in order to compare transition state optimization against quasi-Newton Hessian updates using DFT or the ML model. We show that the full ML Hessian robustly finds the transition states of 240 unseen organic reactions, even when the quality of the initial guess structures are degraded, while reducing the number of optimization steps to convergence by 2--3$\\times$ compared to the quasi-Newton DFT and ML methods. 
All data generation, NewtonNet model, and ML transition state finding methods are available in an automated workflow.","sentences":["Identifying transition states -- saddle points on the potential energy surface connecting reactant and product minima -- is central to predicting kinetic barriers and understanding chemical reaction mechanisms.","In this work, we train an equivariant neural network potential, NewtonNet, on an ab initio dataset of thousands of organic reactions from which we derive the analytical Hessians from the fully differentiable machine learning (ML) model.","By reducing the computational cost by several orders of magnitude relative to the Density Functional Theory (DFT) ab initio source, we can afford to use the learned Hessians at every step for the saddle point optimizations.","We have implemented our ML Hessian algorithm in Sella, an open source software package designed to optimize atomic systems to find saddle point structures, in order to compare transition state optimization against quasi-Newton Hessian updates using DFT or the ML model.","We show that the full ML Hessian robustly finds the transition states of 240 unseen organic reactions, even when the quality of the initial guess structures are degraded, while reducing the number of optimization steps to convergence by 2--3$\\times$ compared to the quasi-Newton DFT and ML methods.","All data generation, NewtonNet model, and ML transition state finding methods are available in an automated workflow."],"url":"http://arxiv.org/abs/2405.02247v1","category":"physics.chem-ph"} +{"created":"2024-05-03 17:00:00","title":"What matters when building vision-language models?","abstract":"The growing interest in vision-language models (VLMs) has been driven by improvements in large language models and vision transformers. Despite the abundance of literature on this subject, we observe that critical decisions regarding the design of VLMs are often not justified. 
We argue that these unsupported decisions impede progress in the field by making it difficult to identify which choices improve model performance. To address this issue, we conduct extensive experiments around pre-trained models, architecture choice, data, and training methods. Our consolidation of findings includes the development of Idefics2, an efficient foundational VLM of 8 billion parameters. Idefics2 achieves state-of-the-art performance within its size category across various multimodal benchmarks, and is often on par with models four times its size. We release the model (base, instructed, and chat) along with the datasets created for its training.","sentences":["The growing interest in vision-language models (VLMs) has been driven by improvements in large language models and vision transformers.","Despite the abundance of literature on this subject, we observe that critical decisions regarding the design of VLMs are often not justified.","We argue that these unsupported decisions impede progress in the field by making it difficult to identify which choices improve model performance.","To address this issue, we conduct extensive experiments around pre-trained models, architecture choice, data, and training methods.","Our consolidation of findings includes the development of Idefics2, an efficient foundational VLM of 8 billion parameters.","Idefics2 achieves state-of-the-art performance within its size category across various multimodal benchmarks, and is often on par with models four times its size.","We release the model (base, instructed, and chat) along with the datasets created for its training."],"url":"http://arxiv.org/abs/2405.02246v1","category":"cs.CV"} +{"created":"2024-05-03 16:56:46","title":"Mean field games with common noise via Malliavin calculus","abstract":"In this work, we present an alternative proof to the existence of equilibria for mean field games with common noise. 
By adapting a compactness criterion for Malliavin differentiable random variables to random processes, we obtain strong equilibria, in which the conditional mean and optimal control are adapted and defined on the original probability space. The proof is simplified by the assumption that players interact through a conditional mean process instead of conditional probability measures as in the general case.","sentences":["In this work, we present an alternative proof to the existence of equilibria for mean field games with common noise.","By adapting a compactness criterion for Malliavin differentiable random variables to random processes, we obtain strong equilibria, in which the conditional mean and optimal control are adapted and defined on the original probability space.","The proof is simplified by the assumption that players interact through a conditional mean process instead of conditional probability measures as in the general case."],"url":"http://arxiv.org/abs/2405.02244v1","category":"math.PR"} +{"created":"2024-05-03 16:53:24","title":"Towards Improving Learning from Demonstration Algorithms via MCMC Methods","abstract":"Behavioral cloning, or more broadly, learning from demonstrations (LfD) is a priomising direction for robot policy learning in complex scenarios. Albeit being straightforward to implement and data-efficient, behavioral cloning has its own drawbacks, limiting its efficacy in real robot setups. In this work, we take one step towards improving learning from demonstration algorithms by leveraging implicit energy-based policy models. 
Results suggest that in selected complex robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used neural network-based explicit models, especially in the cases of approximating potentially discontinuous and multimodal functions.","sentences":["Behavioral cloning, or more broadly, learning from demonstrations (LfD) is a priomising direction for robot policy learning in complex scenarios.","Albeit being straightforward to implement and data-efficient, behavioral cloning has its own drawbacks, limiting its efficacy in real robot setups.","In this work, we take one step towards improving learning from demonstration algorithms by leveraging implicit energy-based policy models.","Results suggest that in selected complex robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used neural network-based explicit models, especially in the cases of approximating potentially discontinuous and multimodal functions."],"url":"http://arxiv.org/abs/2405.02243v1","category":"cs.RO"} +{"created":"2024-05-03 16:52:01","title":"WeightedPose: Generalizable Cross-Pose Estimation via Weighted SVD","abstract":"We present a novel method for robotic manipulation tasks in human environments that require reasoning about the 3D geometric relationship between a pair of objects. Traditional end-to-end trained policies, which map from pixel observations to low-level robot actions, struggle to reason about complex pose relationships and have difficulty generalizing to unseen object configurations. To address these challenges, we propose a method that learns to reason about the 3D geometric relationship between objects, focusing on the relationship between key parts on one object with respect to key parts on another object. 
Our standalone model utilizes Weighted SVD to reason about both pose relationships between articulated parts and between free-floating objects. This approach allows the robot to understand the relationship between the oven door and the oven body, as well as the relationship between the lasagna plate and the oven, for example. By considering the 3D geometric relationship between objects, our method enables robots to perform complex manipulation tasks that reason about object-centric representations. We open source the code and demonstrate the results here","sentences":["We present a novel method for robotic manipulation tasks in human environments that require reasoning about the 3D geometric relationship between a pair of objects.","Traditional end-to-end trained policies, which map from pixel observations to low-level robot actions, struggle to reason about complex pose relationships and have difficulty generalizing to unseen object configurations.","To address these challenges, we propose a method that learns to reason about the 3D geometric relationship between objects, focusing on the relationship between key parts on one object with respect to key parts on another object.","Our standalone model utilizes Weighted SVD to reason about both pose relationships between articulated parts and between free-floating objects.","This approach allows the robot to understand the relationship between the oven door and the oven body, as well as the relationship between the lasagna plate and the oven, for example.","By considering the 3D geometric relationship between objects, our method enables robots to perform complex manipulation tasks that reason about object-centric representations.","We open source the code and demonstrate the results here"],"url":"http://arxiv.org/abs/2405.02241v1","category":"cs.RO"} +{"created":"2024-05-03 16:50:02","title":"Secure and Efficient General Matrix Multiplication On Cloud Using Homomorphic Encryption","abstract":"Despite the cloud 
enormous technical and financial advantages, security and privacy have always been the primary concern for adopting cloud computing facility, especially for government agencies and commercial sectors with high-security requirements. Homomorphic Encryption (HE) has recently emerged as an effective tool in assuring privacy and security for sensitive applications by allowing computing on encrypted data. One major obstacle to employing HE-based computation, however, is its excessive computational cost, which is multiple magnitudes higher than its counterpart based on the plaintext. In this paper, we study the problem of how to reduce the HE-based computational cost for general Matrix Multiplication (MM), i.e., a fundamental building block for numerous practical applications, by taking advantage of the Single Instruction Multiple Data (SIMD) operation supported by HE schemes. Specifically, we develop a novel element-wise algorithm for general matrix multiplication, based on which we propose two HE-based General Matrix Multiplication (HEGMM) algorithms to reduce the HE computation cost. 
Our experimental results show that our algorithms can significantly outperform the state-of-the-art approaches of HE-based matrix multiplication.","sentences":["Despite the cloud enormous technical and financial advantages, security and privacy have always been the primary concern for adopting cloud computing facility, especially for government agencies and commercial sectors with high-security requirements.","Homomorphic Encryption (HE) has recently emerged as an effective tool in assuring privacy and security for sensitive applications by allowing computing on encrypted data.","One major obstacle to employing HE-based computation, however, is its excessive computational cost, which is multiple magnitudes higher than its counterpart based on the plaintext.","In this paper, we study the problem of how to reduce the HE-based computational cost for general Matrix Multiplication (MM), i.e., a fundamental building block for numerous practical applications, by taking advantage of the Single Instruction Multiple Data (SIMD) operation supported by HE schemes.","Specifically, we develop a novel element-wise algorithm for general matrix multiplication, based on which we propose two HE-based General Matrix Multiplication (HEGMM) algorithms to reduce the HE computation cost.","Our experimental results show that our algorithms can significantly outperform the state-of-the-art approaches of HE-based matrix multiplication."],"url":"http://arxiv.org/abs/2405.02238v1","category":"cs.CR"} +{"created":"2024-05-03 16:43:09","title":"Interaction-controlled transport in a two-dimensional massless-massive Dirac system: Transition from degenerate to nondegenerate regimes","abstract":"The resistivity of two-dimensional (2D) metals generally exhibits insensitivity to electron-electron scattering. However, it's worth noting that Galilean invariance may not hold true in systems characterized by a spectrum containing multiple electronic branches or in scenarios involving electron-hole plasma. 
In the context of our study, we focus on 2D electrons confined within a triple quantum well (TQW) based on HgTe. This system displays a coexistence of energy bands featuring both linear and parabolic-like spectra at low energy and, therefore, lacks the Galilean invariance. This research employs a combined theoretical and experimental approach to investigate the transport properties of this two-component system across various regimes. By manipulating carrier density and temperature, we tune our system from a fully degenerate regime, where resistance follows a temperature-dependent behavior proportional to $T^2$, to a regime where both types of electrons adhere to Boltzmann statistics. In the non-degenerate regime, electron interactions lead to resistance that is weakly dependent on temperature. Notably, our experimental observations closely align with the theoretical predictions derived in this study. This work establishes the HgTe-based TQW as a promising platform for exploring different interaction dominant scenarios for the massless-massive Dirac system.9 pages, 8 figures","sentences":["The resistivity of two-dimensional (2D) metals generally exhibits insensitivity to electron-electron scattering.","However, it's worth noting that Galilean invariance may not hold true in systems characterized by a spectrum containing multiple electronic branches or in scenarios involving electron-hole plasma.","In the context of our study, we focus on 2D electrons confined within a triple quantum well (TQW) based on HgTe.","This system displays a coexistence of energy bands featuring both linear and parabolic-like spectra at low energy and, therefore, lacks the Galilean invariance.","This research employs a combined theoretical and experimental approach to investigate the transport properties of this two-component system across various regimes.","By manipulating carrier density and temperature, we tune our system from a fully degenerate regime, where resistance follows a 
temperature-dependent behavior proportional to $T^2$, to a regime where both types of electrons adhere to Boltzmann statistics.","In the non-degenerate regime, electron interactions lead to resistance that is weakly dependent on temperature.","Notably, our experimental observations closely align with the theoretical predictions derived in this study.","This work establishes the HgTe-based TQW as a promising platform for exploring different interaction dominant scenarios for the massless-massive Dirac system.9 pages, 8 figures"],"url":"http://arxiv.org/abs/2405.02233v1","category":"cond-mat.mes-hall"} +{"created":"2024-05-03 16:39:34","title":"Estimating microlensing parameters from observables and stellar isochrones with pyLIMASS","abstract":"We present pyLIMASS, a novel algorithm for estimating the physical properties of the lensing system in microlensing events. The main idea of pyLIMASS is to combine all available information regarding the microlensing event, defined as observables, and to estimate the parameter distributions of the system, such as the lens mass and distance. The algorithm is based on isochrones for the stars model and combine the observables using a Gaussian Mixtures approach. After describing the mathematical formalism and its implementation, we discuss the algorithm's performance on simulated and published events. Generally, the pyLIMASS estimations are in good agreement (i.e., within 1-$\\sigma$) with the results of the selected published events, making it an effective tool to estimate the lens properties and their distribution. The applicability of the method was tested by using a catalog of realistically simulated events that could be observed by the future Galactic Bulge Time Domain Survey of the Nancy Grace Roman Space Telescope. 
By solely using constraints from the Roman lightcurves and images, pyLIMASS estimates the masses of the lens of the Roman catalog with a median precision of 20% with almost no bias.","sentences":["We present pyLIMASS, a novel algorithm for estimating the physical properties of the lensing system in microlensing events.","The main idea of pyLIMASS is to combine all available information regarding the microlensing event, defined as observables, and to estimate the parameter distributions of the system, such as the lens mass and distance.","The algorithm is based on isochrones for the stars model and combine the observables using a Gaussian Mixtures approach.","After describing the mathematical formalism and its implementation, we discuss the algorithm's performance on simulated and published events.","Generally, the pyLIMASS estimations are in good agreement (i.e., within 1-$\\sigma$) with the results of the selected published events, making it an effective tool to estimate the lens properties and their distribution.","The applicability of the method was tested by using a catalog of realistically simulated events that could be observed by the future Galactic Bulge Time Domain Survey of the Nancy Grace Roman Space Telescope.","By solely using constraints from the Roman lightcurves and images, pyLIMASS estimates the masses of the lens of the Roman catalog with a median precision of 20% with almost no bias."],"url":"http://arxiv.org/abs/2405.02230v1","category":"astro-ph.IM"} +{"created":"2024-05-03 16:38:51","title":"REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs","abstract":"Automatic citation generation for sentences in a document or report is paramount for intelligence analysts, cybersecurity, news agencies, and education personnel. 
In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, LLMs are asked to provide author names of the given research article, and (b) Indirect Queries, LLMs are asked to provide the title of a mentioned article when given a sentence from a different article. To demonstrate where LLM stands in this task, we introduce a large dataset called REASONS comprising abstracts of the 12 most popular domains of scientific research on arXiv. From around 20K research articles, we make the following deductions on public and proprietary LLMs: (a) State-of-the-art, often called anthropomorphic GPT-4 and GPT-3.5, suffers from high pass percentage (PP) to minimize the hallucination rate (HR). When tested with Perplexity.ai (7B), they unexpectedly made more errors; (b) Augmenting relevant metadata lowered the PP and gave the lowest HR; (c) Advance retrieval-augmented generation (RAG) using Mistral demonstrates consistent and robust citation support on indirect queries and matched performance to GPT-3.5 and GPT-4. The HR across all domains and models decreased by an average of 41.93% and the PP was reduced to 0% in most cases. In terms of generation quality, the average F1 Score and BLEU were 68.09% and 57.51%, respectively; (d) Testing with adversarial samples showed that LLMs, including the Advance RAG Mistral, struggle to understand context, but the extent of this issue was small in Mistral and GPT-4-Preview. 
Our study con tributes valuable insights into the reliability of RAG for automated citation generation tasks.","sentences":["Automatic citation generation for sentences in a document or report is paramount for intelligence analysts, cybersecurity, news agencies, and education personnel.","In this research, we investigate whether large language models (LLMs) are capable of generating references based on two forms of sentence queries: (a) Direct Queries, LLMs are asked to provide author names of the given research article, and (b) Indirect Queries, LLMs are asked to provide the title of a mentioned article when given a sentence from a different article.","To demonstrate where LLM stands in this task, we introduce a large dataset called REASONS comprising abstracts of the 12 most popular domains of scientific research on arXiv.","From around 20K research articles, we make the following deductions on public and proprietary LLMs: (a) State-of-the-art, often called anthropomorphic GPT-4 and GPT-3.5, suffers from high pass percentage (PP) to minimize the hallucination rate (HR).","When tested with Perplexity.ai (7B), they unexpectedly made more errors; (b) Augmenting relevant metadata lowered the PP and gave the lowest HR; (c) Advance retrieval-augmented generation (RAG) using Mistral demonstrates consistent and robust citation support on indirect queries and matched performance to GPT-3.5 and GPT-4.","The HR across all domains and models decreased by an average of 41.93% and the PP was reduced to 0% in most cases.","In terms of generation quality, the average F1 Score and BLEU were 68.09% and 57.51%, respectively; (d) Testing with adversarial samples showed that LLMs, including the Advance RAG Mistral, struggle to understand context, but the extent of this issue was small in Mistral and GPT-4-Preview.","Our study con tributes valuable insights into the reliability of RAG for automated citation generation 
tasks."],"url":"http://arxiv.org/abs/2405.02228v1","category":"cs.CL"} +{"created":"2024-05-03 16:35:10","title":"Comparing the decoherence effects due to black holes versus ordinary matter","abstract":"Recently a certain thought experiment was discussed which involves the decoherence of a quantum system due to a black hole. Here we show how this phenomenon is consistent with standard ideas about quantum black holes. In other words, modeling the black hole as a quantum system at finite temperature one obtains the same answer. We demonstrate this by analyzing the problem in terms of an effective theory that can apply both for the black hole case and for an ordinary matter system, showing that the same qualitative effect is present for ordinary matter at finite temperature.","sentences":["Recently a certain thought experiment was discussed which involves the decoherence of a quantum system due to a black hole.","Here we show how this phenomenon is consistent with standard ideas about quantum black holes.","In other words, modeling the black hole as a quantum system at finite temperature one obtains the same answer.","We demonstrate this by analyzing the problem in terms of an effective theory that can apply both for the black hole case and for an ordinary matter system, showing that the same qualitative effect is present for ordinary matter at finite temperature."],"url":"http://arxiv.org/abs/2405.02227v1","category":"hep-th"} +{"created":"2024-05-03 16:32:09","title":"Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks","abstract":"This paper introduces a framework for post-processing machine learning models so that their predictions satisfy multi-group fairness guarantees. 
Based on the celebrated notion of multicalibration, we introduce $(\\mathbf{s},\\mathcal{G}, \\alpha)-$GMC (Generalized Multi-Dimensional Multicalibration) for multi-dimensional mappings $\\mathbf{s}$, constraint set $\\mathcal{G}$, and a pre-specified threshold level $\\alpha$. We propose associated algorithms to achieve this notion in general settings. This framework is then applied to diverse scenarios encompassing different fairness concerns, including false negative rate control in image segmentation, prediction set conditional uncertainty quantification in hierarchical classification, and de-biased text generation in language models. We conduct numerical studies on several datasets and tasks.","sentences":["This paper introduces a framework for post-processing machine learning models so that their predictions satisfy multi-group fairness guarantees.","Based on the celebrated notion of multicalibration, we introduce $(\\mathbf{s},\\mathcal{G}, \\alpha)-$GMC (Generalized Multi-Dimensional Multicalibration) for multi-dimensional mappings $\\mathbf{s}$, constraint set $\\mathcal{G}$, and a pre-specified threshold level $\\alpha$. We propose associated algorithms to achieve this notion in general settings.","This framework is then applied to diverse scenarios encompassing different fairness concerns, including false negative rate control in image segmentation, prediction set conditional uncertainty quantification in hierarchical classification, and de-biased text generation in language models.","We conduct numerical studies on several datasets and tasks."],"url":"http://arxiv.org/abs/2405.02225v1","category":"stat.ML"} +{"created":"2024-05-03 16:31:11","title":"A close binary lens revealed by the microlensing event Gaia20bof","abstract":"During the last 25 years, hundreds of binary stars and planets have been discovered towards the Galactic Bulge by microlensing surveys. 
Thanks to a new generation of large-sky surveys, it is now possible to regularly detect microlensing events across the entire sky. The OMEGA Key Projet at the Las Cumbres Observatory carries out automated follow-up observations of microlensing events alerted by these surveys with the aim of identifying and characterizing exoplanets as well as stellar remnants. In this study, we present the analysis of the binary lens event Gaia20bof. By automatically requesting additional observations, the OMEGA Key Project obtained dense time coverage of an anomaly near the peak of the event, allowing characterization of the lensing system. The observed anomaly in the lightcurve is due to a binary lens. However, several models can explain the observations. Spectroscopic observations indicate that the source is located at $\\le2.0$ kpc, in agreement with the parallax measurements from Gaia. While the models are currently degenerate, future observations, especially the Gaia astrometric time series as well as high-resolution imaging, will provide extra constraints to distinguish between them.","sentences":["During the last 25 years, hundreds of binary stars and planets have been discovered towards the Galactic Bulge by microlensing surveys.","Thanks to a new generation of large-sky surveys, it is now possible to regularly detect microlensing events across the entire sky.","The OMEGA Key Projet at the Las Cumbres Observatory carries out automated follow-up observations of microlensing events alerted by these surveys with the aim of identifying and characterizing exoplanets as well as stellar remnants.","In this study, we present the analysis of the binary lens event Gaia20bof.","By automatically requesting additional observations, the OMEGA Key Project obtained dense time coverage of an anomaly near the peak of the event, allowing characterization of the lensing system.","The observed anomaly in the lightcurve is due to a binary lens.","However, several models can explain the 
observations.","Spectroscopic observations indicate that the source is located at $\\le2.0$ kpc, in agreement with the parallax measurements from Gaia.","While the models are currently degenerate, future observations, especially the Gaia astrometric time series as well as high-resolution imaging, will provide extra constraints to distinguish between them."],"url":"http://arxiv.org/abs/2405.02223v1","category":"astro-ph.EP"} +{"created":"2024-05-03 16:20:14","title":"Natural disorder distributions from measurement","abstract":"We consider scenarios where the dynamics of a quantum system are partially determined by prior local measurements of some interacting environmental degrees of freedom. The resulting effective system dynamics are described by a disordered Hamiltonian, with spacetime-varying parameter values drawn from distributions that are generically neither flat nor Gaussian. This class of scenarios is a natural extension of those where a fully non-dynamical environmental degree of freedom determines a universal coupling constant for the system. Using a family of quasi-exactly solvable anharmonic oscillators, we consider environmental ground states of nonlinearly coupled degrees of freedom, unrestricted by a weak coupling expansion, which include strongly quantum non-Gaussian states. We derive the properties of distributions for both quadrature and photon number measurements. 
Measurement-induced disorder of this kind is likely realizable in laboratory quantum systems and, given a notion of naturally occurring measurement, suggests a new class of scenarios for the dynamics of quantum systems in particle physics and cosmology.","sentences":["We consider scenarios where the dynamics of a quantum system are partially determined by prior local measurements of some interacting environmental degrees of freedom.","The resulting effective system dynamics are described by a disordered Hamiltonian, with spacetime-varying parameter values drawn from distributions that are generically neither flat nor Gaussian.","This class of scenarios is a natural extension of those where a fully non-dynamical environmental degree of freedom determines a universal coupling constant for the system.","Using a family of quasi-exactly solvable anharmonic oscillators, we consider environmental ground states of nonlinearly coupled degrees of freedom, unrestricted by a weak coupling expansion, which include strongly quantum non-Gaussian states.","We derive the properties of distributions for both quadrature and photon number measurements.","Measurement-induced disorder of this kind is likely realizable in laboratory quantum systems and, given a notion of naturally occurring measurement, suggests a new class of scenarios for the dynamics of quantum systems in particle physics and cosmology."],"url":"http://arxiv.org/abs/2405.02214v1","category":"quant-ph"} +{"created":"2024-05-03 16:19:24","title":"Automatic Programming: Large Language Models and Beyond","abstract":"Automatic programming has seen increasing popularity due to the emergence of tools like GitHub Copilot which rely on Large Language Models (LLMs). At the same time, automatically generated code faces challenges during deployment due to concerns around quality and trust. 
In this article, we study automated coding in a general sense and study the concerns around code quality, security and related issues of programmer responsibility. These are key issues for organizations while deciding on the usage of automatically generated code. We discuss how advances in software engineering such as program repair and analysis can enable automatic programming. We conclude with a forward looking view, focusing on the programming environment of the near future, where programmers may need to switch to different roles to fully utilize the power of automatic programming. Automated repair of automatically generated programs from LLMs, can help produce higher assurance code from LLMs, along with evidence of assurance","sentences":["Automatic programming has seen increasing popularity due to the emergence of tools like GitHub Copilot which rely on Large Language Models (LLMs).","At the same time, automatically generated code faces challenges during deployment due to concerns around quality and trust.","In this article, we study automated coding in a general sense and study the concerns around code quality, security and related issues of programmer responsibility.","These are key issues for organizations while deciding on the usage of automatically generated code.","We discuss how advances in software engineering such as program repair and analysis can enable automatic programming.","We conclude with a forward looking view, focusing on the programming environment of the near future, where programmers may need to switch to different roles to fully utilize the power of automatic programming.","Automated repair of automatically generated programs from LLMs, can help produce higher assurance code from LLMs, along with evidence of assurance"],"url":"http://arxiv.org/abs/2405.02213v1","category":"cs.SE"} +{"created":"2024-05-03 16:08:05","title":"Water Structure and Electric Fields at the Interface of Oil Droplets","abstract":"Mesoscale water-hydrophobic 
interfaces are of fundamental importance in multiple disciplines, but their molecular properties have remained elusive for decades due to experimental complications and alternate theoretical explanations. Surface-specific spectroscopies, such as vibrational sum-frequency techniques, suffer from either sample preparation issues or the need for complex spectral corrections. Here, we report on a robust \"in solution\" interface-selective Raman spectroscopy approach using multivariate curve resolution to probe hexadecane in water emulsions. Computationally, we use the recently developed monomer field model for Raman spectroscopy to help interpret the interfacial spectra. Unlike with vibrational sum frequency techniques, our interfacial spectra are readily comparable to the spectra of bulk water, yielding new insights. The combination of experiment and theory show that the interface leads to reduced tetrahedral order and weaker hydrogen bonding, giving rise to a substantial water population with dangling OH at the interface. Additionally, the stretching mode of these free OH experiences a ~80 cm-1 red-shift due to a strong electric field which we attribute to the negative zeta potential that is general to oil droplets. These findings are either opposite to, or absent in, the molecular hydrophobic interface formed by small solutes. 
Together, water structural disorder and enhanced electrostatics are an emergent feature at the mesoscale interface of oil-water emulsions, with an estimated interfacial electric field of ~35-70 MV/cm that is important for chemical reactivity.","sentences":["Mesoscale water-hydrophobic interfaces are of fundamental importance in multiple disciplines, but their molecular properties have remained elusive for decades due to experimental complications and alternate theoretical explanations.","Surface-specific spectroscopies, such as vibrational sum-frequency techniques, suffer from either sample preparation issues or the need for complex spectral corrections.","Here, we report on a robust \"in solution\" interface-selective Raman spectroscopy approach using multivariate curve resolution to probe hexadecane in water emulsions.","Computationally, we use the recently developed monomer field model for Raman spectroscopy to help interpret the interfacial spectra.","Unlike with vibrational sum frequency techniques, our interfacial spectra are readily comparable to the spectra of bulk water, yielding new insights.","The combination of experiment and theory show that the interface leads to reduced tetrahedral order and weaker hydrogen bonding, giving rise to a substantial water population with dangling OH at the interface.","Additionally, the stretching mode of these free OH experiences a ~80 cm-1 red-shift due to a strong electric field which we attribute to the negative zeta potential that is general to oil droplets.","These findings are either opposite to, or absent in, the molecular hydrophobic interface formed by small solutes.","Together, water structural disorder and enhanced electrostatics are an emergent feature at the mesoscale interface of oil-water emulsions, with an estimated interfacial electric field of ~35-70 MV/cm that is important for chemical reactivity."],"url":"http://arxiv.org/abs/2405.02207v1","category":"physics.chem-ph"} +{"created":"2024-05-03 
15:53:52","title":"Possible Causes of False General Relativity Violations in Gravitational Wave Observations","abstract":"General relativity (GR) has proven to be a highly successful theory of gravity since its inception. The theory has thrivingly passed numerous experimental tests, predominantly in weak gravity, low relative speeds, and linear regimes, but also in the strong-field and very low-speed regimes with binary pulsars. Observable gravitational waves (GWs) originate from regions of spacetime where gravity is extremely strong, making them a unique tool for testing GR, in previously inaccessible regions of large curvature, relativistic speeds, and strong gravity. Since their first detection, GWs have been extensively used to test GR, but no deviations have been found so far. Given GR's tremendous success in explaining current astronomical observations and laboratory experiments, accepting any deviation from it requires a very high level of statistical confidence and consistency of the deviation across GW sources. In this paper, we compile a comprehensive list of potential causes that can lead to a false identification of a GR violation in standard tests of GR on data from current and future ground-based GW detectors. These causes include detector noise, signal overlaps, gaps in the data, detector calibration, source model inaccuracy, missing physics in the source and in the underlying environment model, source misidentification, and mismodeling of the astrophysical population. We also provide a rough estimate of when each of these causes will become important for tests of GR for different detector sensitivities. 
We argue that each of these causes should be thoroughly investigated, quantified, and ruled out before claiming a GR violation in GW observations.","sentences":["General relativity (GR) has proven to be a highly successful theory of gravity since its inception.","The theory has thrivingly passed numerous experimental tests, predominantly in weak gravity, low relative speeds, and linear regimes, but also in the strong-field and very low-speed regimes with binary pulsars.","Observable gravitational waves (GWs) originate from regions of spacetime where gravity is extremely strong, making them a unique tool for testing GR, in previously inaccessible regions of large curvature, relativistic speeds, and strong gravity.","Since their first detection, GWs have been extensively used to test GR, but no deviations have been found so far.","Given GR's tremendous success in explaining current astronomical observations and laboratory experiments, accepting any deviation from it requires a very high level of statistical confidence and consistency of the deviation across GW sources.","In this paper, we compile a comprehensive list of potential causes that can lead to a false identification of a GR violation in standard tests of GR on data from current and future ground-based GW detectors.","These causes include detector noise, signal overlaps, gaps in the data, detector calibration, source model inaccuracy, missing physics in the source and in the underlying environment model, source misidentification, and mismodeling of the astrophysical population.","We also provide a rough estimate of when each of these causes will become important for tests of GR for different detector sensitivities.","We argue that each of these causes should be thoroughly investigated, quantified, and ruled out before claiming a GR violation in GW observations."],"url":"http://arxiv.org/abs/2405.02197v1","category":"gr-qc"} +{"created":"2024-05-03 15:52:44","title":"GTA: a new General Tensor Accelerator with 
Better Area Efficiency and Data Reuse","abstract":"Recently, tensor algebra have witnessed significant applications across various domains. Each operator in tensor algebra features different computational workload and precision. However, current general accelerators, such as VPU, GPGPU, and CGRA, support tensor operators with low energy and area efficiency. This paper conducts an in-depth exploration of general accelerator for tensor processing. First, we find the similarity between matrix multiplication and precision multiplication, and create a method classifying tensor operators. Then, we implement two discoveries and introduce the systolic architecture into general-purpose accelerator. Therefore, we propose a new General Tensor Accelerator (GTA), which has a better area efficiency and data reuse. Furthermore, we create a large hardware scheduling space consisting of dataflow, precision and array resize. Our evaluation results demonstrate that GTA is able to achieves 7.76X, 5.35X, 8.76X memory efficiency and 6.45X, 3.39X, 25.83X speedup over of VPU, GPGPU and CGRA.","sentences":["Recently, tensor algebra have witnessed significant applications across various domains.","Each operator in tensor algebra features different computational workload and precision.","However, current general accelerators, such as VPU, GPGPU, and CGRA, support tensor operators with low energy and area efficiency.","This paper conducts an in-depth exploration of general accelerator for tensor processing. 
","First, we find the similarity between matrix multiplication and precision multiplication, and create a method classifying tensor operators.","Then, we implement two discoveries and introduce the systolic architecture into general-purpose accelerator.","Therefore, we propose a new General Tensor Accelerator (GTA), which has a better area efficiency and data reuse.","Furthermore, we create a large hardware scheduling space consisting of dataflow, precision and array resize.","Our evaluation results demonstrate that GTA is able to achieves 7.76X, 5.35X, 8.76X memory efficiency and 6.45X, 3.39X, 25.83X speedup over of VPU, GPGPU and CGRA."],"url":"http://arxiv.org/abs/2405.02196v1","category":"cs.AR"} +{"created":"2024-05-03 15:50:33","title":"Coherent XUV super continuum emission from atomic bound states","abstract":"Coherent supercontinuum radiation in the extreme-ultraviolet (XUV) range is indispensable for synthesizing attosecond light pulses and for exploring transient atomic structures. Here, we report the striking observations of coherent XUV supercontinuum (XSC) extended from below to far above the ionization threshold, which exhibits completely different temporal and spatial properties comparing to the conventional rescattering induced high harmonic generation (HHG). We demonstrate that the strong-field created coherence among bound orbitals strongly distort the atomic transition energies during the pulse, leading to coherent emission spanning tens of electron-volts, in contrast to the line emission via free-induction decay occurring after the pulse. The supposed non-radiating bound dark states contribute as well by emitting dressed energy through dark-to-bright emission mechanism. All the processes modulated at sub-cycle time scale jointly form this new-type coherent XSC. 
This work achieves the strong-field attosecond control of the exotic atomic radiation dynamics and provides the means of simultaneous generation of separated attosecond sources, i.e., XSC and HHG, with potential advancing attosecond interferometry.","sentences":["Coherent supercontinuum radiation in the extreme-ultraviolet (XUV) range is indispensable for synthesizing attosecond light pulses and for exploring transient atomic structures.","Here, we report the striking observations of coherent XUV supercontinuum (XSC) extended from below to far above the ionization threshold, which exhibits completely different temporal and spatial properties comparing to the conventional rescattering induced high harmonic generation (HHG).","We demonstrate that the strong-field created coherence among bound orbitals strongly distort the atomic transition energies during the pulse, leading to coherent emission spanning tens of electron-volts, in contrast to the line emission via free-induction decay occurring after the pulse.","The supposed non-radiating bound dark states contribute as well by emitting dressed energy through dark-to-bright emission mechanism.","All the processes modulated at sub-cycle time scale jointly form this new-type coherent XSC.","This work achieves the strong-field attosecond control of the exotic atomic radiation dynamics and provides the means of simultaneous generation of separated attosecond sources, i.e., XSC and HHG, with potential advancing attosecond interferometry."],"url":"http://arxiv.org/abs/2405.02194v1","category":"physics.atom-ph"} +{"created":"2024-05-03 15:50:26","title":"Relic gravitons and non-stationary processes","abstract":"Stationary processes do not accurately describe the diffuse backgrounds of relic gravitons whose correlations are homogeneous in space (i.e. only dependent upon the distance between the two spatial locations) but not in time. 
The symmetries of the autocorrelations ultimately reflect the quantum mechanical origin of the diffuse backgrounds and lead to non-stationary observables at late time. In particular, large oscillations are believed to arise in the spectral energy density that is customarily (but approximately) related to the tensor power spectrum. When the full expression of the spectral energy density is employed the amplitudes of oscillation are instead suppressed in the large-scale limit and the non-stationary features of the late-time signal practically disappear. For similar reasons the relations between the spectral energy density and the spectral amplitude are ambiguous in the presence of non-stationary features. While it is debatable if the non-stationary features are (or will be) directly detectable, we argue that the spectral amplitude following from the Wiener-Khintchine theorem is generally inappropriate for a consistent description of the relic signal. Nevertheless the strong oscillatory behaviour of the late-time observables is naturally smeared out provided the spectral energy density is selected as pivotal variable.","sentences":["Stationary processes do not accurately describe the diffuse backgrounds of relic gravitons whose correlations are homogeneous in space (i.e. 
only dependent upon the distance between the two spatial locations) but not in time.","The symmetries of the autocorrelations ultimately reflect the quantum mechanical origin of the diffuse backgrounds and lead to non-stationary observables at late time.","In particular, large oscillations are believed to arise in the spectral energy density that is customarily (but approximately) related to the tensor power spectrum.","When the full expression of the spectral energy density is employed the amplitudes of oscillation are instead suppressed in the large-scale limit and the non-stationary features of the late-time signal practically disappear.","For similar reasons the relations between the spectral energy density and the spectral amplitude are ambiguous in the presence of non-stationary features.","While it is debatable if the non-stationary features are (or will be) directly detectable, we argue that the spectral amplitude following from the Wiener-Khintchine theorem is generally inappropriate for a consistent description of the relic signal.","Nevertheless the strong oscillatory behaviour of the late-time observables is naturally smeared out provided the spectral energy density is selected as pivotal variable."],"url":"http://arxiv.org/abs/2405.02193v1","category":"gr-qc"} +{"created":"2024-05-03 15:45:38","title":"The effective field theory of multi-field inflationary fluctuations","abstract":"We build an effective field theory of multi-field inflationary fluctuations based on the adiabatic perturbation and on any number of matter fluctuations in the non-adiabatic sector, without imposing extra symmetries on the latter. Focusing on terms with at most two derivatives in fields' fluctuations, we argue that taking the decoupling limit -- in which gravitational interactions are neglected -- is justified in a quasi de Sitter spacetime with slow-varying Hubble scale. 
With these working hypotheses, we find simple forms of multi-field mixings (quadratic order) and interactions (cubic order). We explain how to break degeneracies amongst various terms, and we compare the predictions of the effective field theory to those of non-linear sigma models of inflation and more general multi-field Lagrangian in the traditional model approach. We stress that several multi-field cubic interactions are dictated by non-linearly realised spacetime symmetries and are therefore given in terms of parameters already present in the quadratic action. We propose various directions to systematically explore the phenomenology generic to multi-field inflation and beyond the lamppost of known models.","sentences":["We build an effective field theory of multi-field inflationary fluctuations based on the adiabatic perturbation and on any number of matter fluctuations in the non-adiabatic sector, without imposing extra symmetries on the latter.","Focusing on terms with at most two derivatives in fields' fluctuations, we argue that taking the decoupling limit -- in which gravitational interactions are neglected -- is justified in a quasi de Sitter spacetime with slow-varying Hubble scale.","With these working hypotheses, we find simple forms of multi-field mixings (quadratic order) and interactions (cubic order).","We explain how to break degeneracies amongst various terms, and we compare the predictions of the effective field theory to those of non-linear sigma models of inflation and more general multi-field Lagrangian in the traditional model approach.","We stress that several multi-field cubic interactions are dictated by non-linearly realised spacetime symmetries and are therefore given in terms of parameters already present in the quadratic action.","We propose various directions to systematically explore the phenomenology generic to multi-field inflation and beyond the lamppost of known 
models."],"url":"http://arxiv.org/abs/2405.02190v1","category":"astro-ph.CO"} +{"created":"2024-05-03 15:44:31","title":"Optimistic Regret Bounds for Online Learning in Adversarial Markov Decision Processes","abstract":"The Adversarial Markov Decision Process (AMDP) is a learning framework that deals with unknown and varying tasks in decision-making applications like robotics and recommendation systems. A major limitation of the AMDP formalism, however, is pessimistic regret analysis results in the sense that although the cost function can change from one episode to the next, the evolution in many settings is not adversarial. To address this, we introduce and study a new variant of AMDP, which aims to minimize regret while utilizing a set of cost predictors. For this setting, we develop a new policy search method that achieves a sublinear optimistic regret with high probability, that is a regret bound which gracefully degrades with the estimation power of the cost predictors. Establishing such optimistic regret bounds is nontrivial given that (i) as we demonstrate, the existing importance-weighted cost estimators cannot establish optimistic bounds, and (ii) the feedback model of AMDP is different (and more realistic) than the existing optimistic online learning works. Our result, in particular, hinges upon developing a novel optimistically biased cost estimator that leverages cost predictors and enables a high-probability regret analysis without imposing restrictive assumptions. 
We further discuss practical extensions of the proposed scheme and demonstrate its efficacy numerically.","sentences":["The Adversarial Markov Decision Process (AMDP) is a learning framework that deals with unknown and varying tasks in decision-making applications like robotics and recommendation systems.","A major limitation of the AMDP formalism, however, is pessimistic regret analysis results in the sense that although the cost function can change from one episode to the next, the evolution in many settings is not adversarial.","To address this, we introduce and study a new variant of AMDP, which aims to minimize regret while utilizing a set of cost predictors.","For this setting, we develop a new policy search method that achieves a sublinear optimistic regret with high probability, that is a regret bound which gracefully degrades with the estimation power of the cost predictors.","Establishing such optimistic regret bounds is nontrivial given that (i) as we demonstrate, the existing importance-weighted cost estimators cannot establish optimistic bounds, and (ii) the feedback model of AMDP is different (and more realistic) than the existing optimistic online learning works.","Our result, in particular, hinges upon developing a novel optimistically biased cost estimator that leverages cost predictors and enables a high-probability regret analysis without imposing restrictive assumptions.","We further discuss practical extensions of the proposed scheme and demonstrate its efficacy numerically."],"url":"http://arxiv.org/abs/2405.02188v1","category":"stat.ML"} +{"created":"2024-05-03 15:38:01","title":"Free extensivity via distributivity","abstract":"We consider the canonical pseudodistributive law between various free limit completion pseudomonads and the free coproduct completion pseudomonad. When the class of limits includes pullbacks, we show that this consideration leads to notions of extensive categories. 
More precisely, we show that extensive categories with pullbacks and infinitary lextensive categories are the pseudoalgebras for the pseudomonads resulting from the pseudodistributive laws. Moreover, we introduce the notion of doubly-infinitary lextensive category, and we establish that the freely generated such categories are cartesian closed. From this result, we further deduce that, in freely generated infinitary lextensive categories, the objects with a finite number of connected components are exponentiable. We conclude our work with remarks on descent theoretical aspects of this work, along with results concerning non-canonical isomorphisms.","sentences":["We consider the canonical pseudodistributive law between various free limit completion pseudomonads and the free coproduct completion pseudomonad.","When the class of limits includes pullbacks, we show that this consideration leads to notions of extensive categories.","More precisely, we show that extensive categories with pullbacks and infinitary lextensive categories are the pseudoalgebras for the pseudomonads resulting from the pseudodistributive laws.","Moreover, we introduce the notion of doubly-infinitary lextensive category, and we establish that the freely generated such categories are cartesian closed.","From this result, we further deduce that, in freely generated infinitary lextensive categories, the objects with a finite number of connected components are exponentiable.","We conclude our work with remarks on descent theoretical aspects of this work, along with results concerning non-canonical isomorphisms."],"url":"http://arxiv.org/abs/2405.02185v1","category":"math.CT"} +{"created":"2024-05-03 15:30:58","title":"Hybridizable discontinuous Galerkin methods for solving the two-fluid plasma model","abstract":"The two-fluid plasma model has a wide range of timescales which must all be numerically resolved regardless of the timescale on which plasma dynamics occurs. 
The answer to solving numerically stiff systems is generally to utilize unconditionally stable implicit time advance methods. Hybridizable discontinuous Galerkin (HDG) methods have emerged as a powerful tool for solving stiff partial differential equations. The HDG framework combines the advantages of the discontinuous Galerkin (DG) method, such as high-order accuracy and flexibility in handling mixed hyperbolic/parabolic PDEs with the advantage of classical continuous finite element methods for constructing small numerically stable global systems which can be solved implicitly. In this research we quantify the numerical stability conditions for the two-fluid equations and demonstrate how HDG can be used to avoid the strict stability requirements while maintaining high order accurate results.","sentences":["The two-fluid plasma model has a wide range of timescales which must all be numerically resolved regardless of the timescale on which plasma dynamics occurs.","The answer to solving numerically stiff systems is generally to utilize unconditionally stable implicit time advance methods.","Hybridizable discontinuous Galerkin (HDG) methods have emerged as a powerful tool for solving stiff partial differential equations.","The HDG framework combines the advantages of the discontinuous Galerkin (DG) method, such as high-order accuracy and flexibility in handling mixed hyperbolic/parabolic PDEs with the advantage of classical continuous finite element methods for constructing small numerically stable global systems which can be solved implicitly.","In this research we quantify the numerical stability conditions for the two-fluid equations and demonstrate how HDG can be used to avoid the strict stability requirements while maintaining high order accurate results."],"url":"http://arxiv.org/abs/2405.02182v1","category":"math.NA"} +{"created":"2024-05-03 15:27:51","title":"A Flow-Based Model for Conditional and Probabilistic Electricity Consumption Profile Generation and 
Prediction","abstract":"Residential Load Profile (RLP) generation and prediction are critical for the operation and planning of distribution networks, particularly as diverse low-carbon technologies are increasingly integrated. This paper introduces a novel flow-based generative model, termed Full Convolutional Profile Flow (FCPFlow), which is uniquely designed for both conditional and unconditional RLP generation, and for probabilistic load forecasting. By introducing two new layers--the invertible linear layer and the invertible normalization layer--the proposed FCPFlow architecture shows three main advantages compared to traditional statistical and contemporary deep generative models: 1) it is well-suited for RLP generation under continuous conditions, such as varying weather and annual electricity consumption, 2) it shows superior scalability in different datasets compared to traditional statistical, and 3) it also demonstrates better modeling capabilities in capturing the complex correlation of RLPs compared with deep generative models.","sentences":["Residential Load Profile (RLP) generation and prediction are critical for the operation and planning of distribution networks, particularly as diverse low-carbon technologies are increasingly integrated.","This paper introduces a novel flow-based generative model, termed Full Convolutional Profile Flow (FCPFlow), which is uniquely designed for both conditional and unconditional RLP generation, and for probabilistic load forecasting.","By introducing two new layers--the invertible linear layer and the invertible normalization layer--the proposed FCPFlow architecture shows three main advantages compared to traditional statistical and contemporary deep generative models: 1) it is well-suited for RLP generation under continuous conditions, such as varying weather and annual electricity consumption, 2) it shows superior scalability in different datasets compared to traditional statistical, and 3) it also demonstrates 
better modeling capabilities in capturing the complex correlation of RLPs compared with deep generative models."],"url":"http://arxiv.org/abs/2405.02180v1","category":"cs.LG"} +{"created":"2024-05-03 15:27:11","title":"Training-Free Deepfake Voice Recognition by Leveraging Large-Scale Pre-Trained Models","abstract":"Generalization is a main issue for current audio deepfake detectors, which struggle to provide reliable results on out-of-distribution data. Given the speed at which more and more accurate synthesis methods are developed, it is very important to design techniques that work well also on data they were not trained for. In this paper we study the potential of large-scale pre-trained models for audio deepfake detection, with special focus on generalization ability. To this end, the detection problem is reformulated in a speaker verification framework and fake audios are exposed by the mismatch between the voice sample under test and the voice of the claimed identity. With this paradigm, no fake speech sample is necessary in training, cutting off any link with the generation method at the root, and ensuring full generalization ability. Features are extracted by general-purpose large pre-trained models, with no need for training or fine-tuning on specific fake detection or speaker verification datasets. At detection time only a limited set of voice fragments of the identity under test is required.
Experiments on several datasets widespread in the community show that detectors based on pre-trained models achieve excellent performance and show strong generalization ability, rivaling supervised methods on in-distribution data and largely overcoming them on out-of-distribution data.","sentences":["Generalization is a main issue for current audio deepfake detectors, which struggle to provide reliable results on out-of-distribution data.","Given the speed at which more and more accurate synthesis methods are developed, it is very important to design techniques that work well also on data they were not trained for.","In this paper we study the potential of large-scale pre-trained models for audio deepfake detection, with special focus on generalization ability.","To this end, the detection problem is reformulated in a speaker verification framework and fake audios are exposed by the mismatch between the voice sample under test and the voice of the claimed identity.","With this paradigm, no fake speech sample is necessary in training, cutting off any link with the generation method at the root, and ensuring full generalization ability.","Features are extracted by general-purpose large pre-trained models, with no need for training or fine-tuning on specific fake detection or speaker verification datasets.","At detection time only a limited set of voice fragments of the identity under test is required.","Experiments on several datasets widespread in the community show that detectors based on pre-trained models achieve excellent performance and show strong generalization ability, rivaling supervised methods on in-distribution data and largely overcoming them on out-of-distribution data."],"url":"http://arxiv.org/abs/2405.02179v1","category":"cs.SD"} +{"created":"2024-05-03 15:26:27","title":"Assessing and Verifying Task Utility in LLM-Powered Applications","abstract":"The rapid development of Large Language Models (LLMs) has led to a surge in applications that 
facilitate collaboration among multiple agents, assisting humans in their daily tasks. However, a significant gap remains in assessing to what extent LLM-powered applications genuinely enhance user experience and task execution efficiency. This highlights the need to verify utility of LLM-powered applications, particularly by ensuring alignment between the application's functionality and end-user needs. We introduce AgentEval, a novel framework designed to simplify the utility verification process by automatically proposing a set of criteria tailored to the unique purpose of any given application. This allows for a comprehensive assessment, quantifying the utility of an application against the suggested criteria. We present a comprehensive analysis of the effectiveness and robustness of AgentEval for two open source datasets including Math Problem solving and ALFWorld House-hold related tasks. For reproducibility purposes, we make the data, code and all the logs publicly available at https://bit.ly/3w3yKcS .","sentences":["The rapid development of Large Language Models (LLMs) has led to a surge in applications that facilitate collaboration among multiple agents, assisting humans in their daily tasks.","However, a significant gap remains in assessing to what extent LLM-powered applications genuinely enhance user experience and task execution efficiency.","This highlights the need to verify utility of LLM-powered applications, particularly by ensuring alignment between the application's functionality and end-user needs.","We introduce AgentEval, a novel framework designed to simplify the utility verification process by automatically proposing a set of criteria tailored to the unique purpose of any given application.","This allows for a comprehensive assessment, quantifying the utility of an application against the suggested criteria.","We present a comprehensive analysis of the effectiveness and robustness of AgentEval for two open source datasets including Math 
Problem solving and ALFWorld House-hold related tasks.","For reproducibility purposes, we make the data, code and all the logs publicly available at https://bit.ly/3w3yKcS ."],"url":"http://arxiv.org/abs/2405.02178v1","category":"cs.CL"} +{"created":"2024-05-03 15:25:48","title":"Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset","abstract":"Hoaxes are a recognised form of disinformation created deliberately, with potential serious implications in the credibility of reference knowledge resources such as Wikipedia. What makes detecting Wikipedia hoaxes hard is that they often are written according to the official style guidelines. In this work, we first provide a systematic analysis of the similarities and discrepancies between legitimate and hoax Wikipedia articles, and introduce Hoaxpedia, a collection of 311 Hoax articles (from existing literature as well as official Wikipedia lists) alongside semantically similar real articles. We report results of binary classification experiments in the task of predicting whether a Wikipedia article is real or hoax, and analyze several settings as well as a range of language models. 
Our results suggest that detecting deceitful content in Wikipedia based on content alone, despite not having been explored much in the past, is a promising direction.","sentences":["Hoaxes are a recognised form of disinformation created deliberately, with potential serious implications in the credibility of reference knowledge resources such as Wikipedia.","What makes detecting Wikipedia hoaxes hard is that they often are written according to the official style guidelines.","In this work, we first provide a systematic analysis of the similarities and discrepancies between legitimate and hoax Wikipedia articles, and introduce Hoaxpedia, a collection of 311 Hoax articles (from existing literature as well as official Wikipedia lists) alongside semantically similar real articles.","We report results of binary classification experiments in the task of predicting whether a Wikipedia article is real or hoax, and analyze several settings as well as a range of language models.","Our results suggest that detecting deceitful content in Wikipedia based on content alone, despite not having been explored much in the past, is a promising direction."],"url":"http://arxiv.org/abs/2405.02175v1","category":"cs.CL"} +{"created":"2024-05-03 15:22:46","title":"Task Synthesis for Elementary Visual Programming in XLogoOnline Environment","abstract":"In recent years, the XLogoOnline programming platform has gained popularity among novice learners. It integrates the Logo programming language with visual programming, providing a visual interface for learning computing concepts. However, XLogoOnline offers only a limited set of tasks, which are inadequate for learners to master the computing concepts that require sufficient practice. To address this, we introduce XLogoSyn, a novel technique for synthesizing high-quality tasks for varying difficulty levels. 
Given a reference task, XLogoSyn can generate practice tasks at varying difficulty levels that cater to the varied needs and abilities of different learners. XLogoSyn achieves this by combining symbolic execution and constraint satisfaction techniques. Our expert study demonstrates the effectiveness of XLogoSyn. We have also deployed synthesized practice tasks into XLogoOnline, highlighting the educational benefits of these synthesized practice tasks.","sentences":["In recent years, the XLogoOnline programming platform has gained popularity among novice learners.","It integrates the Logo programming language with visual programming, providing a visual interface for learning computing concepts.","However, XLogoOnline offers only a limited set of tasks, which are inadequate for learners to master the computing concepts that require sufficient practice.","To address this, we introduce XLogoSyn, a novel technique for synthesizing high-quality tasks for varying difficulty levels.","Given a reference task, XLogoSyn can generate practice tasks at varying difficulty levels that cater to the varied needs and abilities of different learners.","XLogoSyn achieves this by combining symbolic execution and constraint satisfaction techniques.","Our expert study demonstrates the effectiveness of XLogoSyn.","We have also deployed synthesized practice tasks into XLogoOnline, highlighting the educational benefits of these synthesized practice tasks."],"url":"http://arxiv.org/abs/2405.02173v1","category":"cs.HC"} +{"created":"2024-05-03 15:20:30","title":"Self-Supervised Learning for Real-World Super-Resolution from Dual and Multiple Zoomed Observations","abstract":"In this paper, we consider two challenging issues in reference-based super-resolution (RefSR) for smartphone, (i) how to choose a proper reference image, and (ii) how to learn RefSR in a self-supervised manner. 
Particularly, we propose a novel self-supervised learning approach for real-world RefSR from observations at dual and multiple camera zooms. Firstly, considering the popularity of multiple cameras in modern smartphones, the more zoomed (telephoto) image can be naturally leveraged as the reference to guide the super-resolution (SR) of the lesser zoomed (ultra-wide) image, which gives us a chance to learn a deep network that performs SR from the dual zoomed observations (DZSR). Secondly, for self-supervised learning of DZSR, we take the telephoto image instead of an additional high-resolution image as the supervision information, and select a center patch from it as the reference to super-resolve the corresponding ultra-wide image patch. To mitigate the effect of the misalignment between ultra-wide low-resolution (LR) patch and telephoto ground-truth (GT) image during training, we first adopt patch-based optical flow alignment and then design an auxiliary-LR to guide the deforming of the warped LR features. To generate visually pleasing results, we present local overlapped sliced Wasserstein loss to better represent the perceptual difference between GT and output in the feature space. During testing, DZSR can be directly deployed to super-solve the whole ultra-wide image with the reference of the telephoto image. In addition, we further take multiple zoomed observations to explore self-supervised RefSR, and present a progressive fusion scheme for the effective utilization of reference images. Experiments show that our methods achieve better quantitative and qualitative performance against state-of-the-arts. 
Codes are available at https://github.com/cszhilu1998/SelfDZSR_PlusPlus.","sentences":["In this paper, we consider two challenging issues in reference-based super-resolution (RefSR) for smartphone, (i) how to choose a proper reference image, and (ii) how to learn RefSR in a self-supervised manner.","Particularly, we propose a novel self-supervised learning approach for real-world RefSR from observations at dual and multiple camera zooms.","Firstly, considering the popularity of multiple cameras in modern smartphones, the more zoomed (telephoto) image can be naturally leveraged as the reference to guide the super-resolution (SR) of the lesser zoomed (ultra-wide) image, which gives us a chance to learn a deep network that performs SR from the dual zoomed observations (DZSR).","Secondly, for self-supervised learning of DZSR, we take the telephoto image instead of an additional high-resolution image as the supervision information, and select a center patch from it as the reference to super-resolve the corresponding ultra-wide image patch.","To mitigate the effect of the misalignment between ultra-wide low-resolution (LR) patch and telephoto ground-truth (GT) image during training, we first adopt patch-based optical flow alignment and then design an auxiliary-LR to guide the deforming of the warped LR features.","To generate visually pleasing results, we present local overlapped sliced Wasserstein loss to better represent the perceptual difference between GT and output in the feature space.","During testing, DZSR can be directly deployed to super-solve the whole ultra-wide image with the reference of the telephoto image.","In addition, we further take multiple zoomed observations to explore self-supervised RefSR, and present a progressive fusion scheme for the effective utilization of reference images.","Experiments show that our methods achieve better quantitative and qualitative performance against state-of-the-arts.","Codes are available at 
https://github.com/cszhilu1998/SelfDZSR_PlusPlus."],"url":"http://arxiv.org/abs/2405.02171v1","category":"cs.CV"} +{"created":"2024-05-03 15:14:34","title":"Tracking and forecasting oscillatory data streams using Koopman autoencoders and Kalman filtering","abstract":"Data-driven modelling techniques provide a method for deriving models of dynamical systems directly from complicated data streams. However, tracking and forecasting such data streams poses a significant challenge to most methods, as they assume the underlying process and model does not change over time. In this paper, we apply one such data-driven method, the Koopman autoencoder (KAE), to high-dimensional oscillatory data to generate a low-dimensional latent space and model, where the system's dynamics appear linear. This allows one to accurately track and forecast systems where the underlying model may change over time. States and the model in the reduced order latent space can then be efficiently updated as new data becomes available, using data assimilation techniques such as the ensemble Kalman filter (EnKF), in a technique we call the KAE EnKF. We demonstrate that this approach is able to effectively track and forecast time-varying, nonlinear dynamical systems in synthetic examples. We then apply the KAE EnKF to a video of a physical pendulum, and achieve a significant improvement over current state-of-the-art methods. 
By generating effective latent space reconstructions, we find that we are able to construct accurate short-term forecasts and efficient adaptations to externally forced changes to the pendulum's frequency.","sentences":["Data-driven modelling techniques provide a method for deriving models of dynamical systems directly from complicated data streams.","However, tracking and forecasting such data streams poses a significant challenge to most methods, as they assume the underlying process and model does not change over time.","In this paper, we apply one such data-driven method, the Koopman autoencoder (KAE), to high-dimensional oscillatory data to generate a low-dimensional latent space and model, where the system's dynamics appear linear.","This allows one to accurately track and forecast systems where the underlying model may change over time.","States and the model in the reduced order latent space can then be efficiently updated as new data becomes available, using data assimilation techniques such as the ensemble Kalman filter (EnKF), in a technique we call the KAE EnKF.","We demonstrate that this approach is able to effectively track and forecast time-varying, nonlinear dynamical systems in synthetic examples.","We then apply the KAE EnKF to a video of a physical pendulum, and achieve a significant improvement over current state-of-the-art methods.","By generating effective latent space reconstructions, we find that we are able to construct accurate short-term forecasts and efficient adaptations to externally forced changes to the pendulum's frequency."],"url":"http://arxiv.org/abs/2405.02166v1","category":"math.DS"} +{"created":"2024-05-03 15:14:19","title":"EEG2TEXT: Open Vocabulary EEG-to-Text Decoding with EEG Pre-Training and Multi-View Transformer","abstract":"Deciphering the intricacies of the human brain has captivated curiosity for centuries. 
Recent strides in Brain-Computer Interface (BCI) technology, particularly using motor imagery, have restored motor functions such as reaching, grasping, and walking in paralyzed individuals. However, unraveling natural language from brain signals remains a formidable challenge. Electroencephalography (EEG) is a non-invasive technique used to record electrical activity in the brain by placing electrodes on the scalp. Previous studies of EEG-to-text decoding have achieved high accuracy on small closed vocabularies, but still fall short of high accuracy when dealing with large open vocabularies. We propose a novel method, EEG2TEXT, to improve the accuracy of open vocabulary EEG-to-text decoding. Specifically, EEG2TEXT leverages EEG pre-training to enhance the learning of semantics from EEG signals and proposes a multi-view transformer to model the EEG signal processing by different spatial regions of the brain. Experiments show that EEG2TEXT has superior performance, outperforming the state-of-the-art baseline methods by a large margin of up to 5% in absolute BLEU and ROUGE scores. 
EEG2TEXT shows great potential for a high-performance open-vocabulary brain-to-text system to facilitate communication.","sentences":["Deciphering the intricacies of the human brain has captivated curiosity for centuries.","Recent strides in Brain-Computer Interface (BCI) technology, particularly using motor imagery, have restored motor functions such as reaching, grasping, and walking in paralyzed individuals.","However, unraveling natural language from brain signals remains a formidable challenge.","Electroencephalography (EEG) is a non-invasive technique used to record electrical activity in the brain by placing electrodes on the scalp.","Previous studies of EEG-to-text decoding have achieved high accuracy on small closed vocabularies, but still fall short of high accuracy when dealing with large open vocabularies.","We propose a novel method, EEG2TEXT, to improve the accuracy of open vocabulary EEG-to-text decoding.","Specifically, EEG2TEXT leverages EEG pre-training to enhance the learning of semantics from EEG signals and proposes a multi-view transformer to model the EEG signal processing by different spatial regions of the brain.","Experiments show that EEG2TEXT has superior performance, outperforming the state-of-the-art baseline methods by a large margin of up to 5% in absolute BLEU and ROUGE scores.","EEG2TEXT shows great potential for a high-performance open-vocabulary brain-to-text system to facilitate communication."],"url":"http://arxiv.org/abs/2405.02165v1","category":"cs.CL"} +{"created":"2024-05-03 15:08:39","title":"Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models","abstract":"In the field of robotics and computer vision, efficient and accurate semantic mapping remains a significant challenge due to the growing demand for intelligent machines that can comprehend and interact with complex environments. 
Conventional panoptic mapping methods, however, are limited by predefined semantic classes, thus making them ineffective for handling novel or unforeseen objects. In response to this limitation, we introduce the Unified Promptable Panoptic Mapping (UPPM) method. UPPM utilizes recent advances in foundation models to enable real-time, on-demand label generation using natural language prompts. By incorporating a dynamic labeling strategy into traditional panoptic mapping techniques, UPPM provides significant improvements in adaptability and versatility while maintaining high performance levels in map reconstruction. We demonstrate our approach on real-world and simulated datasets. Results show that UPPM can accurately reconstruct scenes and segment objects while generating rich semantic labels through natural language interactions. A series of ablation experiments validated the advantages of foundation model-based labeling over fixed label sets.","sentences":["In the field of robotics and computer vision, efficient and accurate semantic mapping remains a significant challenge due to the growing demand for intelligent machines that can comprehend and interact with complex environments.","Conventional panoptic mapping methods, however, are limited by predefined semantic classes, thus making them ineffective for handling novel or unforeseen objects.","In response to this limitation, we introduce the Unified Promptable Panoptic Mapping (UPPM) method.","UPPM utilizes recent advances in foundation models to enable real-time, on-demand label generation using natural language prompts.","By incorporating a dynamic labeling strategy into traditional panoptic mapping techniques, UPPM provides significant improvements in adaptability and versatility while maintaining high performance levels in map reconstruction.","We demonstrate our approach on real-world and simulated datasets.","Results show that UPPM can accurately reconstruct scenes and segment objects while generating rich 
semantic labels through natural language interactions.","A series of ablation experiments validated the advantages of foundation model-based labeling over fixed label sets."],"url":"http://arxiv.org/abs/2405.02162v1","category":"cs.CV"} +{"created":"2024-05-03 15:08:25","title":"Simulating the economic impact of rationality through reinforcement learning and agent-based modelling","abstract":"Agent-based models (ABMs) are simulation models used in economics to overcome some of the limitations of traditional frameworks based on general equilibrium assumptions. However, agents within an ABM follow predetermined, not fully rational, behavioural rules which can be cumbersome to design and difficult to justify. Here we leverage multi-agent reinforcement learning (RL) to expand the capabilities of ABMs with the introduction of fully rational agents that learn their policy by interacting with the environment and maximising a reward function. Specifically, we propose a 'Rational macro ABM' (R-MABM) framework by extending a paradigmatic macro ABM from the economic literature. We show that gradually substituting ABM firms in the model with RL agents, trained to maximise profits, allows for a thorough study of the impact of rationality on the economy. We find that RL agents spontaneously learn three distinct strategies for maximising profits, with the optimal strategy depending on the level of market competition and rationality. We also find that RL agents with independent policies, and without the ability to communicate with each other, spontaneously learn to segregate into different strategic groups, thus increasing market power and overall profits. Finally, we find that a higher degree of rationality in the economy always improves the macroeconomic environment as measured by total output, depending on the specific rational policy, this can come at the cost of higher instability. 
Our R-MABM framework is general, it allows for stable multi-agent learning, and represents a principled and robust direction to extend existing economic simulators.","sentences":["Agent-based models (ABMs) are simulation models used in economics to overcome some of the limitations of traditional frameworks based on general equilibrium assumptions.","However, agents within an ABM follow predetermined, not fully rational, behavioural rules which can be cumbersome to design and difficult to justify.","Here we leverage multi-agent reinforcement learning (RL) to expand the capabilities of ABMs with the introduction of fully rational agents that learn their policy by interacting with the environment and maximising a reward function.","Specifically, we propose a 'Rational macro ABM' (R-MABM) framework by extending a paradigmatic macro ABM from the economic literature.","We show that gradually substituting ABM firms in the model with RL agents, trained to maximise profits, allows for a thorough study of the impact of rationality on the economy.","We find that RL agents spontaneously learn three distinct strategies for maximising profits, with the optimal strategy depending on the level of market competition and rationality.","We also find that RL agents with independent policies, and without the ability to communicate with each other, spontaneously learn to segregate into different strategic groups, thus increasing market power and overall profits.","Finally, we find that a higher degree of rationality in the economy always improves the macroeconomic environment as measured by total output, depending on the specific rational policy, this can come at the cost of higher instability.","Our R-MABM framework is general, it allows for stable multi-agent learning, and represents a principled and robust direction to extend existing economic simulators."],"url":"http://arxiv.org/abs/2405.02161v1","category":"cs.LG"} +{"created":"2024-05-03 15:06:20","title":"Energy-filtered quantum 
states and the emergence of non-local correlations","abstract":"Energy-filtered quantum states are promising candidates for efficiently simulating thermal states. We explore a protocol designed to transition a product state into an eigenstate located in the middle of the spectrum; this is achieved by gradually reducing its energy variance, which allows us to comprehensively understand the crossover phenomenon and the subsequent convergence towards thermal behavior. We introduce and discuss three energy-filtering regimes (short, medium and long), and we interpret them as stages of thermalization. We show that the properties of the filtered states are locally indistinguishable from those of time-averaged density matrices, routinely employed in the theory of thermalization. On the other hand, unexpected non-local quantum correlations are generated in the medium regimes and are witnessed by the R\\'enyi entanglement entropies of subsystems, which we compute via replica methods. Specifically, two-point correlation functions break cluster decomposition and the entanglement entropy of large regions scales as the logarithm of the volume during the medium filter time.","sentences":["Energy-filtered quantum states are promising candidates for efficiently simulating thermal states.","We explore a protocol designed to transition a product state into an eigenstate located in the middle of the spectrum; this is achieved by gradually reducing its energy variance, which allows us to comprehensively understand the crossover phenomenon and the subsequent convergence towards thermal behavior.","We introduce and discuss three energy-filtering regimes (short, medium and long), and we interpret them as stages of thermalization.","We show that the properties of the filtered states are locally indistinguishable from those of time-averaged density matrices, routinely employed in the theory of thermalization.","On the other hand, unexpected non-local quantum correlations are generated in 
the medium regimes and are witnessed by the R\\'enyi entanglement entropies of subsystems, which we compute via replica methods.","Specifically, two-point correlation functions break cluster decomposition and the entanglement entropy of large regions scales as the logarithm of the volume during the medium filter time."],"url":"http://arxiv.org/abs/2405.02158v1","category":"quant-ph"} +{"created":"2024-05-03 15:02:55","title":"How to Diversify any Personalized Recommender? A User-centric Pre-processing approach","abstract":"In this paper, we introduce a novel approach to improve the diversity of Top-N recommendations while maintaining recommendation performance. Our approach employs a user-centric pre-processing strategy aimed at exposing users to a wide array of content categories and topics. We personalize this strategy by selectively adding and removing a percentage of interactions from user profiles. This personalization ensures we remain closely aligned with user preferences while gradually introducing distribution shifts. Our pre-processing technique offers flexibility and can seamlessly integrate into any recommender architecture. To evaluate our approach, we run extensive experiments on two publicly available data sets for news and book recommendations. We test various standard and neural network-based recommender system algorithms. Our results show that our approach generates diverse recommendations, ensuring users are exposed to a wider range of items. Furthermore, leveraging pre-processed data for training leads to recommender systems achieving performance levels comparable to, and in some cases, better than those trained on original, unmodified data. 
Additionally, our approach promotes provider fairness by facilitating exposure to minority or niche categories.","sentences":["In this paper, we introduce a novel approach to improve the diversity of Top-N recommendations while maintaining recommendation performance.","Our approach employs a user-centric pre-processing strategy aimed at exposing users to a wide array of content categories and topics.","We personalize this strategy by selectively adding and removing a percentage of interactions from user profiles.","This personalization ensures we remain closely aligned with user preferences while gradually introducing distribution shifts.","Our pre-processing technique offers flexibility and can seamlessly integrate into any recommender architecture.","To evaluate our approach, we run extensive experiments on two publicly available data sets for news and book recommendations.","We test various standard and neural network-based recommender system algorithms.","Our results show that our approach generates diverse recommendations, ensuring users are exposed to a wider range of items.","Furthermore, leveraging pre-processed data for training leads to recommender systems achieving performance levels comparable to, and in some cases, better than those trained on original, unmodified data.","Additionally, our approach promotes provider fairness by facilitating exposure to minority or niche categories."],"url":"http://arxiv.org/abs/2405.02156v1","category":"cs.IR"} +{"created":"2024-05-03 15:02:41","title":"Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification","abstract":"This paper introduces a novel framework for zero-shot learning (ZSL), i.e., to recognize new categories that are unseen during training, by using a multi-model and multi-alignment integration method. 
Specifically, we propose three strategies to enhance the model's performance to handle ZSL: 1) Utilizing the extensive knowledge of ChatGPT and the powerful image generation capabilities of DALL-E to create reference images that can precisely describe unseen categories and classification boundaries, thereby alleviating the information bottleneck issue; 2) Integrating the results of text-image alignment and image-image alignment from CLIP, along with the image-image alignment results from DINO, to achieve more accurate predictions; 3) Introducing an adaptive weighting mechanism based on confidence levels to aggregate the outcomes from different prediction methods. Experimental results on multiple datasets, including CIFAR-10, CIFAR-100, and TinyImageNet, demonstrate that our model can significantly improve classification accuracy compared to single-model approaches, achieving AUROC scores above 96% across all test datasets, and notably surpassing 99% on the CIFAR-10 dataset.","sentences":["This paper introduces a novel framework for zero-shot learning (ZSL), i.e., to recognize new categories that are unseen during training, by using a multi-model and multi-alignment integration method.","Specifically, we propose three strategies to enhance the model's performance to handle ZSL: 1) Utilizing the extensive knowledge of ChatGPT and the powerful image generation capabilities of DALL-E to create reference images that can precisely describe unseen categories and classification boundaries, thereby alleviating the information bottleneck issue; 2) Integrating the results of text-image alignment and image-image alignment from CLIP, along with the image-image alignment results from DINO, to achieve more accurate predictions; 3) Introducing an adaptive weighting mechanism based on confidence levels to aggregate the outcomes from different prediction methods.","Experimental results on multiple datasets, including CIFAR-10, CIFAR-100, and TinyImageNet, demonstrate that our model 
can significantly improve classification accuracy compared to single-model approaches, achieving AUROC scores above 96% across all test datasets, and notably surpassing 99% on the CIFAR-10 dataset."],"url":"http://arxiv.org/abs/2405.02155v1","category":"cs.CV"} +{"created":"2024-05-03 15:02:21","title":"Neural Context Flows for Learning Generalizable Dynamical Systems","abstract":"Neural Ordinary Differential Equations typically struggle to generalize to new dynamical behaviors created by parameter changes in the underlying system, even when the dynamics are close to previously seen behaviors. The issue gets worse when the changing parameters are unobserved, i.e., their value or influence is not directly measurable when collecting data. We introduce Neural Context Flow (NCF), a framework that encodes said unobserved parameters in a latent context vector as input to a vector field. NCFs leverage differentiability of the vector field with respect to the parameters, along with first-order Taylor expansion to allow any context vector to influence trajectories from other parameters. We validate our method and compare it to established Multi-Task and Meta-Learning alternatives, showing competitive performance in mean squared error for in-domain and out-of-distribution evaluation on the Lotka-Volterra, Glycolytic Oscillator, and Gray-Scott problems. This study holds practical implications for foundational models in science and related areas that benefit from conditional neural ODEs. 
Our code is openly available at https://github.com/ddrous/ncflow.","sentences":["Neural Ordinary Differential Equations typically struggle to generalize to new dynamical behaviors created by parameter changes in the underlying system, even when the dynamics are close to previously seen behaviors.","The issue gets worse when the changing parameters are unobserved, i.e., their value or influence is not directly measurable when collecting data.","We introduce Neural Context Flow (NCF), a framework that encodes said unobserved parameters in a latent context vector as input to a vector field.","NCFs leverage differentiability of the vector field with respect to the parameters, along with first-order Taylor expansion to allow any context vector to influence trajectories from other parameters.","We validate our method and compare it to established Multi-Task and Meta-Learning alternatives, showing competitive performance in mean squared error for in-domain and out-of-distribution evaluation on the Lotka-Volterra, Glycolytic Oscillator, and Gray-Scott problems.","This study holds practical implications for foundational models in science and related areas that benefit from conditional neural ODEs.","Our code is openly available at https://github.com/ddrous/ncflow."],"url":"http://arxiv.org/abs/2405.02154v1","category":"cs.LG"} +{"created":"2024-05-03 15:00:36","title":"Reconstructing the mid-infrared spectra of galaxies using ultraviolet to submillimeter photometry and Deep Generative Networks","abstract":"The mid-infrared spectra of galaxies are rich in features such as the Polycyclic Aromatic Hydrocarbon (PAH) and silicate dust features which give valuable information about the physics of galaxies and their evolution. For example they can provide information about the relative contribution of star formation and accretion from a supermassive black hole to the power output of galaxies. 
However, the mid-infrared spectra are currently available for a very small fraction of galaxies that have been detected in deep multi-wavelength surveys of the sky. In this paper we explore whether Deep Generative Network methods can be used to reconstruct mid-infrared spectra in the 5-35{\\mu}m range using the limited multi-wavelength photometry in ~20 bands from the ultraviolet to the submillimeter which is typically available in extragalactic surveys. For this purpose we use simulated spectra computed with a combination of radiative transfer models for starbursts, active galactic nucleus (AGN) tori and host galaxies. We find that our method using Deep Generative Networks, namely Generative Adversarial Networks and Generative Latent Optimization models, can efficiently produce high quality reconstructions of mid-infrared spectra in ~70% of the cases.","sentences":["The mid-infrared spectra of galaxies are rich in features such as the Polycyclic Aromatic Hydrocarbon (PAH) and silicate dust features which give valuable information about the physics of galaxies and their evolution.","For example they can provide information about the relative contribution of star formation and accretion from a supermassive black hole to the power output of galaxies.","However, the mid-infrared spectra are currently available for a very small fraction of galaxies that have been detected in deep multi-wavelength surveys of the sky.","In this paper we explore whether Deep Generative Network methods can be used to reconstruct mid-infrared spectra in the 5-35{\\mu}m range using the limited multi-wavelength photometry in ~20 bands from the ultraviolet to the submillimeter which is typically available in extragalactic surveys.","For this purpose we use simulated spectra computed with a combination of radiative transfer models for starbursts, active galactic nucleus (AGN) tori and host galaxies.","We find that our method using Deep Generative Networks, namely Generative Adversarial Networks 
and Generative Latent Optimization models, can efficiently produce high quality reconstructions of mid-infrared spectra in ~70% of the cases."],"url":"http://arxiv.org/abs/2405.02153v1","category":"astro-ph.GA"} +{"created":"2024-05-03 14:59:27","title":"On the Three-dimensional Nernst-Planck-Boussinesq System","abstract":"In this paper, we analyze a three-dimensional Nernst-Planck-Boussinesq (NPB) system that describes ionic electrodiffusion in an incompressible viscous fluid. This new model incorporates variational temperature and is forced by buoyancy force stemming from temperature and salinity fluctuations, enhancing its generality and realism. The electromigration term in the NPB system displays a complex nonlinear structure influenced by the reciprocal of the temperature that distinguishes its mathematical aspects from other electrodiffusion models studied in the literature. We address the global existence of weak solutions to the NPB system on the three-dimensional torus for large initial data. 
In addition, we study the long-time dynamics of these weak solutions and the associated relative entropies and establish their exponential decay in time to steady states.","sentences":["In this paper, we analyze a three-dimensional Nernst-Planck-Boussinesq (NPB) system that describes ionic electrodiffusion in an incompressible viscous fluid.","This new model incorporates variational temperature and is forced by buoyancy force stemming from temperature and salinity fluctuations, enhancing its generality and realism.","The electromigration term in the NPB system displays a complex nonlinear structure influenced by the reciprocal of the temperature that distinguishes its mathematical aspects from other electrodiffusion models studied in the literature.","We address the global existence of weak solutions to the NPB system on the three-dimensional torus for large initial data.","In addition, we study the long-time dynamics of these weak solutions and the associated relative entropies and establish their exponential decay in time to steady states."],"url":"http://arxiv.org/abs/2405.02152v1","category":"math.AP"} +{"created":"2024-05-03 14:58:46","title":"GMP-ATL: Gender-augmented Multi-scale Pseudo-label Enhanced Adaptive Transfer Learning for Speech Emotion Recognition via HuBERT","abstract":"The continuous evolution of pre-trained speech models has greatly advanced Speech Emotion Recognition (SER). However, there is still potential for enhancement in the performance of these methods. In this paper, we present GMP-ATL (Gender-augmented Multi-scale Pseudo-label Adaptive Transfer Learning), a novel HuBERT-based adaptive transfer learning framework for SER. Specifically, GMP-ATL initially employs the pre-trained HuBERT, implementing multi-task learning and multi-scale k-means clustering to acquire frame-level gender-augmented multi-scale pseudo-labels. 
Then, to fully leverage both obtained frame-level and utterance-level emotion labels, we incorporate model retraining and fine-tuning methods to further optimize GMP-ATL. Experiments on IEMOCAP show that our GMP-ATL achieves superior recognition performance, with a WAR of 80.0\\% and a UAR of 82.0\\%, surpassing state-of-the-art unimodal SER methods, while also yielding comparable results with multimodal SER approaches.","sentences":["The continuous evolution of pre-trained speech models has greatly advanced Speech Emotion Recognition (SER).","However, there is still potential for enhancement in the performance of these methods.","In this paper, we present GMP-ATL (Gender-augmented Multi-scale Pseudo-label Adaptive Transfer Learning), a novel HuBERT-based adaptive transfer learning framework for SER.","Specifically, GMP-ATL initially employs the pre-trained HuBERT, implementing multi-task learning and multi-scale k-means clustering to acquire frame-level gender-augmented multi-scale pseudo-labels.","Then, to fully leverage both obtained frame-level and utterance-level emotion labels, we incorporate model retraining and fine-tuning methods to further optimize GMP-ATL.","Experiments on IEMOCAP show that our GMP-ATL achieves superior recognition performance, with a WAR of 80.0\\% and a UAR of 82.0\\%, surpassing state-of-the-art unimodal SER methods, while also yielding comparable results with multimodal SER approaches."],"url":"http://arxiv.org/abs/2405.02151v1","category":"cs.SD"} +{"created":"2024-05-03 14:56:43","title":"The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates","abstract":"Journals and conferences worry that peer reviews assisted by artificial intelligence (AI), in particular, large language models (LLMs), may negatively influence the validity and fairness of the peer-review system, a cornerstone of modern science. 
In this work, we address this concern with a quasi-experimental study of the prevalence and impact of AI-assisted peer reviews in the context of the 2024 International Conference on Learning Representations (ICLR), a large and prestigious machine-learning conference. Our contributions are threefold. Firstly, we obtain a lower bound for the prevalence of AI-assisted reviews at ICLR 2024 using the GPTZero LLM detector, estimating that at least $15.8\\%$ of reviews were written with AI assistance. Secondly, we estimate the impact of AI-assisted reviews on submission scores. Considering pairs of reviews with different scores assigned to the same paper, we find that in $53.4\\%$ of pairs the AI-assisted review scores higher than the human review ($p = 0.002$; relative difference in probability of scoring higher: $+14.4\\%$ in favor of AI-assisted reviews). Thirdly, we assess the impact of receiving an AI-assisted peer review on submission acceptance. In a matched study, submissions near the acceptance threshold that received an AI-assisted peer review were $4.9$ percentage points ($p = 0.024$) more likely to be accepted than submissions that did not. 
Overall, we show that AI-assisted reviews are consequential to the peer-review process and offer a discussion on future implications of current trends","sentences":["Journals and conferences worry that peer reviews assisted by artificial intelligence (AI), in particular, large language models (LLMs), may negatively influence the validity and fairness of the peer-review system, a cornerstone of modern science.","In this work, we address this concern with a quasi-experimental study of the prevalence and impact of AI-assisted peer reviews in the context of the 2024 International Conference on Learning Representations (ICLR), a large and prestigious machine-learning conference.","Our contributions are threefold.","Firstly, we obtain a lower bound for the prevalence of AI-assisted reviews at ICLR 2024 using the GPTZero LLM detector, estimating that at least $15.8\\%$ of reviews were written with AI assistance.","Secondly, we estimate the impact of AI-assisted reviews on submission scores.","Considering pairs of reviews with different scores assigned to the same paper, we find that in $53.4\\%$ of pairs the AI-assisted review scores higher than the human review ($p = 0.002$; relative difference in probability of scoring higher: $+14.4\\%$ in favor of AI-assisted reviews).","Thirdly, we assess the impact of receiving an AI-assisted peer review on submission acceptance.","In a matched study, submissions near the acceptance threshold that received an AI-assisted peer review were $4.9$ percentage points ($p = 0.024$) more likely to be accepted than submissions that did not.","Overall, we show that AI-assisted reviews are consequential to the peer-review process and offer a discussion on future implications of current trends"],"url":"http://arxiv.org/abs/2405.02150v1","category":"cs.CY"} +{"created":"2024-05-03 14:53:46","title":"Towards a Formal Creativity Theory: Preliminary results in Novelty and Transformativeness","abstract":"Formalizing creativity-related concepts has 
been a long-term goal of Computational Creativity. To the same end, we explore Formal Learning Theory in the context of creativity. We provide an introduction to the main concepts of this framework and a re-interpretation of terms commonly found in creativity discussions, proposing formal definitions for novelty and transformational creativity. This formalisation marks the beginning of a research branch we call Formal Creativity Theory, exploring how learning can be included as preparation for exploratory behaviour and how learning is a key part of transformational creative behaviour. By employing these definitions, we argue that, while novelty is neither necessary nor sufficient for transformational creativity in general, when using an inspiring set, rather than a sequence of experiences, an agent actually requires novelty for transformational creativity to occur.","sentences":["Formalizing creativity-related concepts has been a long-term goal of Computational Creativity.","To the same end, we explore Formal Learning Theory in the context of creativity.","We provide an introduction to the main concepts of this framework and a re-interpretation of terms commonly found in creativity discussions, proposing formal definitions for novelty and transformational creativity.","This formalisation marks the beginning of a research branch we call Formal Creativity Theory, exploring how learning can be included as preparation for exploratory behaviour and how learning is a key part of transformational creative behaviour.","By employing these definitions, we argue that, while novelty is neither necessary nor sufficient for transformational creativity in general, when using an inspiring set, rather than a sequence of experiences, an agent actually requires novelty for transformational creativity to occur."],"url":"http://arxiv.org/abs/2405.02148v1","category":"cs.AI"} +{"created":"2024-05-03 14:52:31","title":"A Spiking Neural Network Decoder for Implantable Brain Machine 
Interfaces and its Sparsity-aware Deployment on RISC-V Microcontrollers","abstract":"Implantable Brain-machine interfaces (BMIs) are promising for motor rehabilitation and mobility augmentation, and they demand accurate and energy-efficient algorithms. In this paper, we propose a novel spiking neural network (SNN) decoder for regression tasks for implantable BMIs. The SNN is trained with enhanced spatio-temporal backpropagation to fully leverage its capability to handle temporal problems. The proposed SNN decoder outperforms the state-of-the-art Kalman filter and artificial neural network (ANN) decoders in offline finger velocity decoding tasks. The decoder is deployed on a RISC-V-based hardware platform and optimized to exploit sparsity. The proposed implementation has an average power consumption of 0.50 mW in a duty-cycled mode. When conducting continuous inference without duty-cycling, it achieves an energy efficiency of 1.88 uJ per inference, which is 5.5X less than the baseline ANN. 
Additionally, the average decoding latency is 0.12 ms for each inference, which is 5.7X faster than the ANN implementation.","sentences":["Implantable Brain-machine interfaces (BMIs) are promising for motor rehabilitation and mobility augmentation, and they demand accurate and energy-efficient algorithms.","In this paper, we propose a novel spiking neural network (SNN) decoder for regression tasks for implantable BMIs.","The SNN is trained with enhanced spatio-temporal backpropagation to fully leverage its capability to handle temporal problems.","The proposed SNN decoder outperforms the state-of-the-art Kalman filter and artificial neural network (ANN) decoders in offline finger velocity decoding tasks.","The decoder is deployed on a RISC-V-based hardware platform and optimized to exploit sparsity.","The proposed implementation has an average power consumption of 0.50 mW in a duty-cycled mode.","When conducting continuous inference without duty-cycling, it achieves an energy efficiency of 1.88 uJ per inference, which is 5.5X less than the baseline ANN.","Additionally, the average decoding latency is 0.12 ms for each inference, which is 5.7X faster than the ANN implementation."],"url":"http://arxiv.org/abs/2405.02146v1","category":"eess.SP"} +{"created":"2024-05-03 14:51:50","title":"Characterized Diffusion and Spatial-Temporal Interaction Network for Trajectory Prediction in Autonomous Driving","abstract":"Trajectory prediction is a cornerstone in autonomous driving (AD), playing a critical role in enabling vehicles to navigate safely and efficiently in dynamic environments. To address this task, this paper presents a novel trajectory prediction model tailored for accuracy in the face of heterogeneous and uncertain traffic scenarios. At the heart of this model lies the Characterized Diffusion Module, an innovative module designed to simulate traffic scenarios with inherent uncertainty. 
This module enriches the predictive process by infusing it with detailed semantic information, thereby enhancing trajectory prediction accuracy. Complementing this, our Spatio-Temporal (ST) Interaction Module captures the nuanced effects of traffic scenarios on vehicle dynamics across both spatial and temporal dimensions with remarkable effectiveness. Demonstrated through exhaustive evaluations, our model sets a new standard in trajectory prediction, achieving state-of-the-art (SOTA) results on the Next Generation Simulation (NGSIM), Highway Drone (HighD), and Macao Connected Autonomous Driving (MoCAD) datasets across both short and extended temporal spans. This performance underscores the model's unparalleled adaptability and efficacy in navigating complex traffic scenarios, including highways, urban streets, and intersections.","sentences":["Trajectory prediction is a cornerstone in autonomous driving (AD), playing a critical role in enabling vehicles to navigate safely and efficiently in dynamic environments.","To address this task, this paper presents a novel trajectory prediction model tailored for accuracy in the face of heterogeneous and uncertain traffic scenarios.","At the heart of this model lies the Characterized Diffusion Module, an innovative module designed to simulate traffic scenarios with inherent uncertainty.","This module enriches the predictive process by infusing it with detailed semantic information, thereby enhancing trajectory prediction accuracy.","Complementing this, our Spatio-Temporal (ST) Interaction Module captures the nuanced effects of traffic scenarios on vehicle dynamics across both spatial and temporal dimensions with remarkable effectiveness.","Demonstrated through exhaustive evaluations, our model sets a new standard in trajectory prediction, achieving state-of-the-art (SOTA) results on the Next Generation Simulation (NGSIM), Highway Drone (HighD), and Macao Connected Autonomous Driving (MoCAD) datasets across both short and 
extended temporal spans.","This performance underscores the model's unparalleled adaptability and efficacy in navigating complex traffic scenarios, including highways, urban streets, and intersections."],"url":"http://arxiv.org/abs/2405.02145v1","category":"cs.RO"} +{"created":"2024-05-03 14:44:05","title":"Local cohomology with support in Schubert varieties","abstract":"This paper is concerned with local cohomology sheaves on generalized flag varieties supported in closed Schubert varieties, which carry natural structures as (mixed Hodge) D-modules. We employ Kazhdan--Lusztig theory and Saito's theory of mixed Hodge modules to describe a general strategy to calculate the simple composition factors, Hodge filtration, and weight filtration on these modules. Our main tool is the Grothendieck--Cousin complex, introduced by Kempf, which allows us to relate the local cohomology modules in question to parabolic Verma modules over the corresponding Lie algebra. We show that this complex underlies a complex of mixed Hodge modules, and is thus endowed with Hodge and weight filtrations. As a consequence, strictness implies that computing cohomology commutes with taking associated graded with respect to both of these filtrations. We execute this strategy to calculate the composition factors and weight filtration for Schubert varieties in the Grassmannian, in particular showing that the weight filtration is controlled by the augmented Dyck patterns of Raicu--Weyman. 
As an application, upon restriction to the opposite big cell, we recover the simple composition factors and weight filtration on local cohomology with support in generic determinantal varieties.","sentences":["This paper is concerned with local cohomology sheaves on generalized flag varieties supported in closed Schubert varieties, which carry natural structures as (mixed Hodge) D-modules.","We employ Kazhdan--Lusztig theory and Saito's theory of mixed Hodge modules to describe a general strategy to calculate the simple composition factors, Hodge filtration, and weight filtration on these modules.","Our main tool is the Grothendieck--Cousin complex, introduced by Kempf, which allows us to relate the local cohomology modules in question to parabolic Verma modules over the corresponding Lie algebra.","We show that this complex underlies a complex of mixed Hodge modules, and is thus endowed with Hodge and weight filtrations.","As a consequence, strictness implies that computing cohomology commutes with taking associated graded with respect to both of these filtrations.","We execute this strategy to calculate the composition factors and weight filtration for Schubert varieties in the Grassmannian, in particular showing that the weight filtration is controlled by the augmented Dyck patterns of Raicu--Weyman.","As an application, upon restriction to the opposite big cell, we recover the simple composition factors and weight filtration on local cohomology with support in generic determinantal varieties."],"url":"http://arxiv.org/abs/2405.02142v1","category":"math.AG"} +{"created":"2024-05-03 14:43:07","title":"An Information Theoretic Perspective on Conformal Prediction","abstract":"Conformal Prediction (CP) is a distribution-free uncertainty estimation framework that constructs prediction sets guaranteed to contain the true answer with a user-specified probability. 
Intuitively, the size of the prediction set encodes a general notion of uncertainty, with larger sets associated with higher degrees of uncertainty. In this work, we leverage information theory to connect conformal prediction to other notions of uncertainty. More precisely, we prove three different ways to upper bound the intrinsic uncertainty, as described by the conditional entropy of the target variable given the inputs, by combining CP with information theoretical inequalities. Moreover, we demonstrate two direct and useful applications of such connection between conformal prediction and information theory: (i) more principled and effective conformal training objectives that generalize previous approaches and enable end-to-end training of machine learning models from scratch, and (ii) a natural mechanism to incorporate side information into conformal prediction. We empirically validate both applications in centralized and federated learning settings, showing our theoretical results translate to lower inefficiency (average prediction set size) for popular CP methods.","sentences":["Conformal Prediction (CP) is a distribution-free uncertainty estimation framework that constructs prediction sets guaranteed to contain the true answer with a user-specified probability.","Intuitively, the size of the prediction set encodes a general notion of uncertainty, with larger sets associated with higher degrees of uncertainty.","In this work, we leverage information theory to connect conformal prediction to other notions of uncertainty.","More precisely, we prove three different ways to upper bound the intrinsic uncertainty, as described by the conditional entropy of the target variable given the inputs, by combining CP with information theoretical inequalities.","Moreover, we demonstrate two direct and useful applications of such connection between conformal prediction and information theory: (i) more principled and effective conformal training objectives that generalize 
previous approaches and enable end-to-end training of machine learning models from scratch, and (ii) a natural mechanism to incorporate side information into conformal prediction.","We empirically validate both applications in centralized and federated learning settings, showing our theoretical results translate to lower inefficiency (average prediction set size) for popular CP methods."],"url":"http://arxiv.org/abs/2405.02140v1","category":"cs.LG"} +{"created":"2024-05-03 14:40:54","title":"XtalOpt Version 13: Multi-Objective Evolutionary Search for Novel Functional Materials","abstract":"Version 13 of XtalOpt, an evolutionary algorithm for crystal structure prediction, is now available for download from the CPC program library or the XtalOpt website, https://xtalopt.github.io. In the new version of the XtalOpt code, a general platform for multi-objective global optimization is implemented. This functionality is designed to facilitate the search for (meta)stable phases of functional materials through minimization of the enthalpy of a crystalline system coupled with the simultaneous optimization of any desired properties that are specified by the user. The code is also able to perform a constrained search by filtering the parent pool of structures based on a user-specified feature, while optimizing multiple objectives. 
Here, we present the implementation and various technical details, and we provide a brief overview of additional improvements that have been introduced in the new version of XtalOpt.","sentences":["Version 13 of XtalOpt, an evolutionary algorithm for crystal structure prediction, is now available for download from the CPC program library or the XtalOpt website, https://xtalopt.github.io.","In the new version of the XtalOpt code, a general platform for multi-objective global optimization is implemented.","This functionality is designed to facilitate the search for (meta)stable phases of functional materials through minimization of the enthalpy of a crystalline system coupled with the simultaneous optimization of any desired properties that are specified by the user.","The code is also able to perform a constrained search by filtering the parent pool of structures based on a user-specified feature, while optimizing multiple objectives.","Here, we present the implementation and various technical details, and we provide a brief overview of additional improvements that have been introduced in the new version of XtalOpt."],"url":"http://arxiv.org/abs/2405.02138v1","category":"physics.comp-ph"} +{"created":"2024-05-03 14:38:59","title":"Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection","abstract":"Researchers and practitioners operating on a limited budget face the cost-performance trade-off dilemma. The challenging decision often centers on whether to use a large LLM with better performance or a smaller one with reduced costs. This has motivated recent research in the optimisation of LLM calls. Either a cascading strategy is used, where a smaller LLM or both are called sequentially, or a routing strategy is used, where only one model is ever called. Both scenarios are dependent on a decision criterion which is typically implemented by an extra neural model. 
In this work, we propose a simpler solution; we use only the uncertainty of the generations of the small LLM as the decision criterion. We compare our approach with both cascading and routing strategies using three different pairs of pre-trained small and large LLMs, on nine different tasks and against approaches that require an additional neural model. Our experiments reveal this simple solution optimally balances cost and performance, outperforming existing methods on 25 out of 27 experimental setups.","sentences":["Researchers and practitioners operating on a limited budget face the cost-performance trade-off dilemma.","The challenging decision often centers on whether to use a large LLM with better performance or a smaller one with reduced costs.","This has motivated recent research in the optimisation of LLM calls.","Either a cascading strategy is used, where a smaller LLM or both are called sequentially, or a routing strategy is used, where only one model is ever called.","Both scenarios are dependent on a decision criterion which is typically implemented by an extra neural model.","In this work, we propose a simpler solution; we use only the uncertainty of the generations of the small LLM as the decision criterion.","We compare our approach with both cascading and routing strategies using three different pairs of pre-trained small and large LLMs, on nine different tasks and against approaches that require an additional neural model.","Our experiments reveal this simple solution optimally balances cost and performance, outperforming existing methods on 25 out of 27 experimental setups."],"url":"http://arxiv.org/abs/2405.02134v1","category":"cs.CL"} +{"created":"2024-05-03 14:37:17","title":"Learning from Evolution: Improving Collective Decision-Making Mechanisms using Insights from Evolutionary Robotics","abstract":"Collective decision-making enables multi-robot systems to act autonomously in real-world environments. 
Existing collective decision-making mechanisms suffer from the so-called speed versus accuracy trade-off or rely on high complexity, e.g., by including global communication. Recent work has shown that more efficient collective decision-making mechanisms based on artificial neural networks can be generated using methods from evolutionary computation. A major drawback of these decision-making neural networks is their limited interpretability. Analyzing evolved decision-making mechanisms can help us improve the efficiency of hand-coded decision-making mechanisms while maintaining a higher interpretability. In this paper, we analyze evolved collective decision-making mechanisms in detail and hand-code two new decision-making mechanisms based on the insights gained. In benchmark experiments, we show that the newly implemented collective decision-making mechanisms are more efficient than the state-of-the-art collective decision-making mechanisms voter model and majority rule.","sentences":["Collective decision-making enables multi-robot systems to act autonomously in real-world environments.","Existing collective decision-making mechanisms suffer from the so-called speed versus accuracy trade-off or rely on high complexity, e.g., by including global communication.","Recent work has shown that more efficient collective decision-making mechanisms based on artificial neural networks can be generated using methods from evolutionary computation.","A major drawback of these decision-making neural networks is their limited interpretability.","Analyzing evolved decision-making mechanisms can help us improve the efficiency of hand-coded decision-making mechanisms while maintaining a higher interpretability.","In this paper, we analyze evolved collective decision-making mechanisms in detail and hand-code two new decision-making mechanisms based on the insights gained.","In benchmark experiments, we show that the newly implemented collective decision-making mechanisms are more 
efficient than the state-of-the-art collective decision-making mechanisms voter model and majority rule."],"url":"http://arxiv.org/abs/2405.02133v1","category":"cs.MA"} +{"created":"2024-05-03 14:35:02","title":"Physics-informed generative neural networks for RF propagation prediction with application to indoor body perception","abstract":"Electromagnetic (EM) body models designed to predict Radio-Frequency (RF) propagation are time-consuming methods which prevent their adoption in strict real-time computational imaging problems, such as human body localization and sensing. Physics-informed Generative Neural Network (GNN) models have been recently proposed to reproduce EM effects, namely to simulate or reconstruct missing data or samples by incorporating relevant EM principles and constraints. The paper discusses a Variational Auto-Encoder (VAE) model which is trained to reproduce the effects of human motions on the EM field and incorporate EM body diffraction principles. Proposed physics-informed generative neural network models are verified against both classical diffraction-based EM tools and full-wave EM body simulations.","sentences":["Electromagnetic (EM) body models designed to predict Radio-Frequency (RF) propagation are time-consuming methods which prevent their adoption in strict real-time computational imaging problems, such as human body localization and sensing.","Physics-informed Generative Neural Network (GNN) models have been recently proposed to reproduce EM effects, namely to simulate or reconstruct missing data or samples by incorporating relevant EM principles and constraints.","The paper discusses a Variational Auto-Encoder (VAE) model which is trained to reproduce the effects of human motions on the EM field and incorporate EM body diffraction principles.","Proposed physics-informed generative neural network models are verified against both classical diffraction-based EM tools and full-wave EM body 
simulations."],"url":"http://arxiv.org/abs/2405.02131v1","category":"eess.SP"} +{"created":"2024-05-03 14:33:48","title":"Double extension of flat pseudo-Riemannian $F$-Lie algebras","abstract":"We define the concept of a flat pseudo-Riemannian $F$-Lie algebra and construct its corresponding double extension. This algebraic structure can be interpreted as the infinitesimal analogue of a Frobenius Lie group devoid of Euler vector fields. We show that the double extension provides a framework for generating all weakly flat Lorentzian non-abelian bi-nilpotent $F$-Lie algebras possessing one dimensional light-cone subspaces. A similar result can be established for nilpotent Lie algebras equipped with flat scalar products of signature $(2,n-2)$ where $n\\geq 4$. Furthermore, we use this technique to construct Poisson algebras exhibiting compatibility with flat scalar products.","sentences":["We define the concept of a flat pseudo-Riemannian $F$-Lie algebra and construct its corresponding double extension.","This algebraic structure can be interpreted as the infinitesimal analogue of a Frobenius Lie group devoid of Euler vector fields.","We show that the double extension provides a framework for generating all weakly flat Lorentzian non-abelian bi-nilpotent $F$-Lie algebras possessing one dimensional light-cone subspaces.","A similar result can be established for nilpotent Lie algebras equipped with flat scalar products of signature $(2,n-2)$ where $n\\geq 4$.","Furthermore, we use this technique to construct Poisson algebras exhibiting compatibility with flat scalar products."],"url":"http://arxiv.org/abs/2405.02130v1","category":"math.DG"} +{"created":"2024-05-03 14:29:56","title":"City size distributions are driven by each generation's stay-vs-leave decision","abstract":"Throughout history most young adults have chosen to live where their parents did while a smaller number moved away. 
This is sufficient, by proof and simulation, to account for the well-known power law distributions of city sizes. The model needs only two parameters, $r$ = the probability that a child stays, and the maximum number of cities (which models the observed saturation at high city rank). The power law exponent follows directly as $\\alpha = 1 + 1/r$, with Zipf's Law simply the limiting case as $r \\rightarrow 1$. Observed exponents $(\\alpha = 2.2 \\pm 0.4, n = 158)$ are consistent with stay-or-leave data from large genealogic studies. This model is self-initializing and could have applied from the time of the earliest stable settlements. The driving narrative behind city-size distributions is fundamentally about family ties, familiarity, and risk-avoidance, rather than economic optimization.","sentences":["Throughout history most young adults have chosen to live where their parents did while a smaller number moved away.","This is sufficient, by proof and simulation, to account for the well-known power law distributions of city sizes.","The model needs only two parameters, $r$ = the probability that a child stays, and the maximum number of cities (which models the observed saturation at high city rank).","The power law exponent follows directly as $\\alpha = 1 + 1/r$, with Zipf's Law simply the limiting case as $r \\rightarrow 1$. 
Observed exponents $(\\alpha = 2.2 \\pm 0.4, n = 158)$ are consistent with stay-or-leave data from large genealogic studies.","This model is self-initializing and could have applied from the time of the earliest stable settlements.","The driving narrative behind city-size distributions is fundamentally about family ties, familiarity, and risk-avoidance, rather than economic optimization."],"url":"http://arxiv.org/abs/2405.02129v1","category":"physics.soc-ph"} +{"created":"2024-05-03 14:29:54","title":"Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo","abstract":"The rapid advancement in artificial intelligence and natural language processing has led to the development of large-scale datasets aimed at benchmarking the performance of machine learning models. Herein, we introduce 'RetChemQA,' a comprehensive benchmark dataset designed to evaluate the capabilities of such models in the domain of reticular chemistry. This dataset includes both single-hop and multi-hop question-answer pairs, encompassing approximately 45,000 Q&As for each type. The questions have been extracted from an extensive corpus of literature containing about 2,530 research papers from publishers including NAS, ACS, RSC, Elsevier, and Nature Publishing Group, among others. The dataset has been generated using OpenAI's GPT-4 Turbo, a cutting-edge model known for its exceptional language understanding and generation capabilities. In addition to the Q&A dataset, we also release a dataset of synthesis conditions extracted from the corpus of literature used in this study. The aim of RetChemQA is to provide a robust platform for the development and evaluation of advanced machine learning algorithms, particularly for the reticular chemistry community. The dataset is structured to reflect the complexities and nuances of real-world scientific discourse, thereby enabling nuanced performance assessments across a variety of tasks. 
The dataset is available at the following link: https://github.com/nakulrampal/RetChemQA","sentences":["The rapid advancement in artificial intelligence and natural language processing has led to the development of large-scale datasets aimed at benchmarking the performance of machine learning models.","Herein, we introduce 'RetChemQA,' a comprehensive benchmark dataset designed to evaluate the capabilities of such models in the domain of reticular chemistry.","This dataset includes both single-hop and multi-hop question-answer pairs, encompassing approximately 45,000 Q&As for each type.","The questions have been extracted from an extensive corpus of literature containing about 2,530 research papers from publishers including NAS, ACS, RSC, Elsevier, and Nature Publishing Group, among others.","The dataset has been generated using OpenAI's GPT-4 Turbo, a cutting-edge model known for its exceptional language understanding and generation capabilities.","In addition to the Q&A dataset, we also release a dataset of synthesis conditions extracted from the corpus of literature used in this study.","The aim of RetChemQA is to provide a robust platform for the development and evaluation of advanced machine learning algorithms, particularly for the reticular chemistry community.","The dataset is structured to reflect the complexities and nuances of real-world scientific discourse, thereby enabling nuanced performance assessments across a variety of tasks.","The dataset is available at the following link: https://github.com/nakulrampal/RetChemQA"],"url":"http://arxiv.org/abs/2405.02128v1","category":"cs.CL"} +{"created":"2024-05-03 14:27:50","title":"Fully Relativistic Derivation of the Thermal Sunyaev-Zel'dovich Effect","abstract":"We present the first fully and inherently relativistic derivation of the thermal Sunyaev-Zel'dovich effect. 
This work uses the formalism historically used to compute radiation spectra emerging from inverse Thomson/Compton sources of x-ray radiation. Comparing our results to the traditional approach based on relativistically-corrected classical Kompaneets equation, we find small, but systematic differences. Most notable are the modest (< 10 %) differences in the crossover frequency where the spectral distortion due to the Sunyaev-Zel'dovich effect vanishes, and the energy increase of the distribution at high electron cloud temperatures.","sentences":["We present the first fully and inherently relativistic derivation of the thermal Sunyaev-Zel'dovich effect.","This work uses the formalism historically used to compute radiation spectra emerging from inverse Thomson/Compton sources of x-ray radiation.","Comparing our results to the traditional approach based on relativistically-corrected classical Kompaneets equation, we find small, but systematic differences.","Most notable are the modest (< 10 %) differences in the crossover frequency where the spectral distortion due to the Sunyaev-Zel'dovich effect vanishes, and the energy increase of the distribution at high electron cloud temperatures."],"url":"http://arxiv.org/abs/2405.02127v1","category":"astro-ph.HE"} +{"created":"2024-05-03 14:25:21","title":"TIPAA-SSL: Text Independent Phone-to-Audio Alignment based on Self-Supervised Learning and Knowledge Transfer","abstract":"In this paper, we present a novel approach for text independent phone-to-audio alignment based on phoneme recognition, representation learning and knowledge transfer. Our method leverages a self-supervised model (wav2vec2) fine-tuned for phoneme recognition using a Connectionist Temporal Classification (CTC) loss, a dimension reduction model and a frame-level phoneme classifier trained thanks to forced-alignment labels (using Montreal Forced Aligner) to produce multi-lingual phonetic representations, thus requiring minimal additional training. 
We evaluate our model using synthetic native data from the TIMIT dataset and the SCRIBE dataset for American and British English, respectively. Our proposed model outperforms the state-of-the-art (charsiu) in statistical metrics and has applications in language learning and speech processing systems. We leave experiments on other languages for future work but the design of the system makes it easily adaptable to other languages.","sentences":["In this paper, we present a novel approach for text independent phone-to-audio alignment based on phoneme recognition, representation learning and knowledge transfer.","Our method leverages a self-supervised model (wav2vec2) fine-tuned for phoneme recognition using a Connectionist Temporal Classification (CTC) loss, a dimension reduction model and a frame-level phoneme classifier trained thanks to forced-alignment labels (using Montreal Forced Aligner) to produce multi-lingual phonetic representations, thus requiring minimal additional training.","We evaluate our model using synthetic native data from the TIMIT dataset and the SCRIBE dataset for American and British English, respectively.","Our proposed model outperforms the state-of-the-art (charsiu) in statistical metrics and has applications in language learning and speech processing systems.","We leave experiments on other languages for future work but the design of the system makes it easily adaptable to other languages."],"url":"http://arxiv.org/abs/2405.02124v1","category":"eess.AS"} +{"created":"2024-05-03 14:17:52","title":"Multi-grid reaction-diffusion master equation: applications to morphogen gradient modelling","abstract":"The multi-grid reaction-diffusion master equation (mgRDME) provides a generalization of stochastic compartment-based reaction-diffusion modelling described by the standard reaction-diffusion master equation (RDME). 
By enabling different resolutions on lattices for biochemical species with different diffusion constants, the mgRDME approach improves both accuracy and efficiency of compartment-based reaction-diffusion simulations. The mgRDME framework is examined through its application to morphogen gradient formation in stochastic reaction-diffusion scenarios, using both an analytically tractable first-order reaction network and a model with a second-order reaction. The results obtained by the mgRDME modelling are compared with the standard RDME model and with the (more detailed) particle-based Brownian dynamics simulations. The dependence of error and numerical cost on the compartment sizes is defined and investigated through a multi-objective optimization problem.","sentences":["The multi-grid reaction-diffusion master equation (mgRDME) provides a generalization of stochastic compartment-based reaction-diffusion modelling described by the standard reaction-diffusion master equation (RDME).","By enabling different resolutions on lattices for biochemical species with different diffusion constants, the mgRDME approach improves both accuracy and efficiency of compartment-based reaction-diffusion simulations.","The mgRDME framework is examined through its application to morphogen gradient formation in stochastic reaction-diffusion scenarios, using both an analytically tractable first-order reaction network and a model with a second-order reaction.","The results obtained by the mgRDME modelling are compared with the standard RDME model and with the (more detailed) particle-based Brownian dynamics simulations.","The dependence of error and numerical cost on the compartment sizes is defined and investigated through a multi-objective optimization problem."],"url":"http://arxiv.org/abs/2405.02117v1","category":"q-bio.QM"} +{"created":"2024-05-03 14:14:27","title":"Probablistic Restoration with Adaptive Noise Sampling for 3D Human Pose Estimation","abstract":"The accuracy and robustness 
of 3D human pose estimation (HPE) are limited by 2D pose detection errors and 2D to 3D ill-posed challenges, which have drawn great attention to Multi-Hypothesis HPE research. Most existing MH-HPE methods are based on generative models, which are computationally expensive and difficult to train. In this study, we propose a Probabilistic Restoration 3D Human Pose Estimation framework (PRPose) that can be integrated with any lightweight single-hypothesis model. Specifically, PRPose employs a weakly supervised approach to fit the hidden probability distribution of the 2D-to-3D lifting process in the Single-Hypothesis HPE model and then reverse-map the distribution to the 2D pose input through an adaptive noise sampling strategy to generate reasonable multi-hypothesis samples effectively. Extensive experiments on 3D HPE benchmarks (Human3.6M and MPI-INF-3DHP) highlight the effectiveness and efficiency of PRPose. Code is available at: https://github.com/xzhouzeng/PRPose.","sentences":["The accuracy and robustness of 3D human pose estimation (HPE) are limited by 2D pose detection errors and 2D to 3D ill-posed challenges, which have drawn great attention to Multi-Hypothesis HPE research.","Most existing MH-HPE methods are based on generative models, which are computationally expensive and difficult to train.","In this study, we propose a Probabilistic Restoration 3D Human Pose Estimation framework (PRPose) that can be integrated with any lightweight single-hypothesis model.","Specifically, PRPose employs a weakly supervised approach to fit the hidden probability distribution of the 2D-to-3D lifting process in the Single-Hypothesis HPE model and then reverse-map the distribution to the 2D pose input through an adaptive noise sampling strategy to generate reasonable multi-hypothesis samples effectively.","Extensive experiments on 3D HPE benchmarks (Human3.6M and MPI-INF-3DHP) highlight the effectiveness and efficiency of PRPose.","Code is available at: 
https://github.com/xzhouzeng/PRPose."],"url":"http://arxiv.org/abs/2405.02114v1","category":"cs.CV"} +{"created":"2024-05-03 14:14:18","title":"A Workflow for GLAM Metadata Crosswalk","abstract":"The acquisition of physical artifacts not only involves transferring existing information into the digital ecosystem but also generates information as a process itself, underscoring the importance of meticulous management of FAIR data and metadata. In addition, the diversity of objects within the cultural heritage domain is reflected in a multitude of descriptive models. The digitization process expands the opportunities for exchange and joint utilization, granted that the descriptive schemas are made interoperable in advance. To achieve this goal, we propose a replicable workflow for metadata schema crosswalks that facilitates the preservation and accessibility of cultural heritage in the digital ecosystem. This work presents a methodology for metadata generation and management in the case study of the digital twin of the temporary exhibition \"The Other Renaissance - Ulisse Aldrovandi and the Wonders of the World\". The workflow delineates a systematic, step-by-step transformation of tabular data into RDF format, to enhance Linked Open Data. The methodology adopts the RDF Mapping Language (RML) technology for converting data to RDF with a human contribution involvement. 
This last aspect entails an interaction between digital humanists and domain experts through surveys leading to the abstraction and reformulation of domain-specific knowledge, to be exploited in the process of formalizing and converting information.","sentences":["The acquisition of physical artifacts not only involves transferring existing information into the digital ecosystem but also generates information as a process itself, underscoring the importance of meticulous management of FAIR data and metadata.","In addition, the diversity of objects within the cultural heritage domain is reflected in a multitude of descriptive models.","The digitization process expands the opportunities for exchange and joint utilization, granted that the descriptive schemas are made interoperable in advance.","To achieve this goal, we propose a replicable workflow for metadata schema crosswalks that facilitates the preservation and accessibility of cultural heritage in the digital ecosystem.","This work presents a methodology for metadata generation and management in the case study of the digital twin of the temporary exhibition \"The Other Renaissance - Ulisse Aldrovandi and the Wonders of the World\".","The workflow delineates a systematic, step-by-step transformation of tabular data into RDF format, to enhance Linked Open Data.","The methodology adopts the RDF Mapping Language (RML) technology for converting data to RDF with a human contribution involvement.","This last aspect entails an interaction between digital humanists and domain experts through surveys leading to the abstraction and reformulation of domain-specific knowledge, to be exploited in the process of formalizing and converting information."],"url":"http://arxiv.org/abs/2405.02113v1","category":"cs.DL"} +{"created":"2024-05-03 14:13:00","title":"On a generalization of R. Chapman's \"evil determinant\"","abstract":"Let $p$ be an odd prime and $x$ be an indeterminate. Recently, Z.-W. 
Sun proposed the following conjecture: $$\\det\\left[x+\\left(\\frac{j-i}{p}\\right)\\right]_{0\\le i,j\\le \\frac{p-1}{2}}=\\begin{cases} (\\frac{2}{p})pb_px-a_p & \\mbox{if}\\ p\\equiv 1\\pmod4, 1 & \\mbox{if}\\ p\\equiv 3\\pmod4, \\end{cases}$$ where $a_p$ and $b_p$ are rational numbers related to the fundamental unit and class number of the real quadratic field $\\mathbb{Q}(\\sqrt{p})$. In this paper, we confirm the above conjecture of Sun based on Vsemirnov's decomposition of Chapman's \"evil determinant\".","sentences":["Let $p$ be an odd prime and $x$ be an indeterminate.","Recently, Z.-W. Sun proposed the following conjecture: $$\\det\\left[x+\\left(\\frac{j-i}{p}\\right)\\right]_{0\\le i,j\\le \\frac{p-1}{2}}=\\begin{cases} (\\frac{2}{p})pb_px-a_p & \\mbox{if}\\ p\\equiv 1\\pmod4, 1 & \\mbox{if}\\ p\\equiv 3\\pmod4, \\end{cases}$$ where $a_p$ and $b_p$ are rational numbers related to the fundamental unit and class number of the real quadratic field $\\mathbb{Q}(\\sqrt{p})$. In this paper, we confirm the above conjecture of Sun based on Vsemirnov's decomposition of Chapman's \"evil determinant\"."],"url":"http://arxiv.org/abs/2405.02112v1","category":"math.NT"} +{"created":"2024-05-03 14:10:29","title":"Three-Dimensional Amyloid-Beta PET Synthesis from Structural MRI with Conditional Generative Adversarial Networks","abstract":"Motivation: Alzheimer's Disease hallmarks include amyloid-beta deposits and brain atrophy, detectable via PET and MRI scans, respectively. PET is expensive, invasive and exposes patients to ionizing radiation. MRI is cheaper, non-invasive, and free from ionizing radiation but limited to measuring brain atrophy. Goal: To develop an 3D image translation model that synthesizes amyloid-beta PET images from T1-weighted MRI, exploiting the known relationship between amyloid-beta and brain atrophy. Approach: The model was trained on 616 PET/MRI pairs and validated with 264 pairs. 
Results: The model synthesized amyloid-beta PET images from T1-weighted MRI with a high degree of similarity, showing high SSIM and PSNR metrics (SSIM>0.95&PSNR=28). Impact: Our model proves the feasibility of synthesizing amyloid-beta PET images from structural MRI ones, significantly enhancing accessibility for large-cohort studies and early dementia detection, while also reducing cost, invasiveness, and radiation exposure.","sentences":["Motivation: Alzheimer's Disease hallmarks include amyloid-beta deposits and brain atrophy, detectable via PET and MRI scans, respectively.","PET is expensive, invasive and exposes patients to ionizing radiation.","MRI is cheaper, non-invasive, and free from ionizing radiation but limited to measuring brain atrophy. ","Goal: To develop an 3D image translation model that synthesizes amyloid-beta PET images from T1-weighted MRI, exploiting the known relationship between amyloid-beta and brain atrophy. ","Approach:","The model was trained on 616 PET/MRI pairs and validated with 264 pairs. ","Results:","The model synthesized amyloid-beta PET images from T1-weighted MRI with a high degree of similarity, showing high SSIM and PSNR metrics (SSIM>0.95&PSNR=28). ","Impact: Our model proves the feasibility of synthesizing amyloid-beta PET images from structural MRI ones, significantly enhancing accessibility for large-cohort studies and early dementia detection, while also reducing cost, invasiveness, and radiation exposure."],"url":"http://arxiv.org/abs/2405.02109v1","category":"eess.IV"} +{"created":"2024-05-03 14:08:39","title":"Exploring Weak measurements within the Einstein-Dirac Cosmological framework","abstract":"Our study applies the Two-State Formalism alongside weak measurements within a spatially homogeneous and isotropic cosmological framework, wherein Dirac spinors are intricately coupled to classical gravity.
To elucidate this, we provide detailed formulations for computing the weak values of the energy-momentum tensors, the Z component of spin, and the characterization of pure states. Weak measurements appear to be a generalization and extension of the computation already made by Finster and Hainzl, in A spatially homogeneous and isotropic Einstein-Dirac cosmology. Our analysis reveals that the acceleration of the Universe expansion can be understood as an outcome of postselection, underscoring the effectiveness of weak measurement as a discerning approach for gauging cosmic acceleration.","sentences":["Our study applies the Two-State Formalism alongside weak measurements within a spatially homogeneous and isotropic cosmological framework, wherein Dirac spinors are intricately coupled to classical gravity.","To elucidate this, we provide detailed formulations for computing the weak values of the energy-momentum tensors, the Z component of spin, and the characterization of pure states.","Weak measurements appear to be a generalization and extension of the computation already made by Finster and Hainzl, in A spatially homogeneous and isotropic Einstein-Dirac cosmology.","Our analysis reveals that the acceleration of the Universe expansion can be understood as an outcome of postselection, underscoring the effectiveness of weak measurement as a discerning approach for gauging cosmic acceleration."],"url":"http://arxiv.org/abs/2405.02108v1","category":"gr-qc"} +{"created":"2024-05-03 14:03:04","title":"Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph","abstract":"Structured science summaries or research contributions using properties or dimensions beyond traditional keywords enhances science findability.
Current methods, such as those used by the Open Research Knowledge Graph (ORKG), involve manually curating properties to describe research papers' contributions in a structured manner, but this is labor-intensive and inconsistent between the domain expert human curators. We propose using Large Language Models (LLMs) to automatically suggest these properties. However, it's essential to assess the readiness of LLMs like GPT-3.5, Llama 2, and Mistral for this task before application. Our study performs a comprehensive comparative analysis between ORKG's manually curated properties and those generated by the aforementioned state-of-the-art LLMs. We evaluate LLM performance through four unique perspectives: semantic alignment and deviation with ORKG properties, fine-grained properties mapping accuracy, SciNCL embeddings-based cosine similarity, and expert surveys comparing manual annotations with LLM outputs. These evaluations occur within a multidisciplinary science setting. Overall, LLMs show potential as recommendation systems for structuring science, but further finetuning is recommended to improve their alignment with scientific tasks and mimicry of human expertise.","sentences":["Structured science summaries or research contributions using properties or dimensions beyond traditional keywords enhances science findability.","Current methods, such as those used by the Open Research Knowledge Graph (ORKG), involve manually curating properties to describe research papers' contributions in a structured manner, but this is labor-intensive and inconsistent between the domain expert human curators.","We propose using Large Language Models (LLMs) to automatically suggest these properties.","However, it's essential to assess the readiness of LLMs like GPT-3.5, Llama 2, and Mistral for this task before application.","Our study performs a comprehensive comparative analysis between ORKG's manually curated properties and those generated by the aforementioned state-of-the-art 
LLMs.","We evaluate LLM performance through four unique perspectives: semantic alignment and deviation with ORKG properties, fine-grained properties mapping accuracy, SciNCL embeddings-based cosine similarity, and expert surveys comparing manual annotations with LLM outputs.","These evaluations occur within a multidisciplinary science setting.","Overall, LLMs show potential as recommendation systems for structuring science, but further finetuning is recommended to improve their alignment with scientific tasks and mimicry of human expertise."],"url":"http://arxiv.org/abs/2405.02105v1","category":"cs.AI"} +{"created":"2024-05-03 14:00:54","title":"Searching for a new light gauge boson with axial couplings in muon beam dump experiments","abstract":"We present a formalism for new $U(1)$ interactions involving weak hypercharge, baryon, and lepton numbers, and a possible axial symmetry generator $F_A$ in the presence of a second Brout-Englert-Higgs doublet. The resulting $U$ boson, after mixing with the $Z$, interpolates between a generalised dark photon, a dark $Z$, and an axially coupled gauge boson. We especially focus on the axial couplings originating from $F_A$ or from mixing with the $Z$, determined by the scalar sector via parameters like $\\tan\\beta$ and the v.e.v. of an extra dark singlet. We explore the distinctive features of axially coupled interactions, especially in the ultrarelativistic limit, where the $U$ boson behaves much as an axion-like particle, with enhanced interactions to quarks and leptons. This enhancement is particularly relevant for future muon beam dump experiments, since the muon mass considerably increases the effective coupling, proportional to $2m_\\mu/m_U$, compared to analogous experiments with electrons. We also analyse the shape of the expected beam dump exclusion or discovery regions, influenced by $U$ boson interactions and the experiment geometry. 
Different situations are considered, limited in particular by cases for which the $U$ decays before reaching the detector, or has too small couplings to produce detectable events. We also compare to vectorially coupled bosons and axion-like pseudoscalars, highlighting the importance of understanding the parameter space for future experiment design and optimisation.","sentences":["We present a formalism for new $U(1)$ interactions involving weak hypercharge, baryon, and lepton numbers, and a possible axial symmetry generator $F_A$ in the presence of a second Brout-Englert-Higgs doublet.","The resulting $U$ boson, after mixing with the $Z$, interpolates between a generalised dark photon, a dark $Z$, and an axially coupled gauge boson.","We especially focus on the axial couplings originating from $F_A$ or from mixing with the $Z$, determined by the scalar sector via parameters like $\\tan\\beta$ and the v.e.v. of an extra dark singlet.","We explore the distinctive features of axially coupled interactions, especially in the ultrarelativistic limit, where the $U$ boson behaves much as an axion-like particle, with enhanced interactions to quarks and leptons.","This enhancement is particularly relevant for future muon beam dump experiments, since the muon mass considerably increases the effective coupling, proportional to $2m_\\mu/m_U$, compared to analogous experiments with electrons.","We also analyse the shape of the expected beam dump exclusion or discovery regions, influenced by $U$ boson interactions and the experiment geometry.","Different situations are considered, limited in particular by cases for which the $U$ decays before reaching the detector, or has too small couplings to produce detectable events.","We also compare to vectorially coupled bosons and axion-like pseudoscalars, highlighting the importance of understanding the parameter space for future experiment design and optimisation."],"url":"http://arxiv.org/abs/2405.02104v1","category":"hep-ph"} 
+{"created":"2024-05-03 13:57:42","title":"Anomalous transport in the quantum East-West kinetically constrained model","abstract":"We study a chaotic particle-conserving kinetically constrained model, with a single parameter which allows us to break reflection symmetry. Through extensive numerical simulations we find that the domain wall state shows a variety of dynamical behaviors from localization all the way to ballistic transport, depending on the value of the reflection breaking parameter. Surprisingly, such anomalous behavior is not mirrored in infinite-temperature dynamics, which appear to scale diffusively, in line with expectations for generic interacting models. However, studying the particle density gradient, we show that the lack of reflection symmetry affects infinite-temperature dynamics, resulting in an asymmetric dynamical structure factor. This is in disagreement with normal diffusion and suggests that the model may also exhibit anomalous dynamics at infinite temperature in the thermodynamic limit. 
Finally, we observe low-entangled eigenstates in the spectrum of the model, a telltale sign of quantum many body scars.","sentences":["We study a chaotic particle-conserving kinetically constrained model, with a single parameter which allows us to break reflection symmetry.","Through extensive numerical simulations we find that the domain wall state shows a variety of dynamical behaviors from localization all the way to ballistic transport, depending on the value of the reflection breaking parameter.","Surprisingly, such anomalous behavior is not mirrored in infinite-temperature dynamics, which appear to scale diffusively, in line with expectations for generic interacting models.","However, studying the particle density gradient, we show that the lack of reflection symmetry affects infinite-temperature dynamics, resulting in an asymmetric dynamical structure factor.","This is in disagreement with normal diffusion and suggests that the model may also exhibit anomalous dynamics at infinite temperature in the thermodynamic limit.","Finally, we observe low-entangled eigenstates in the spectrum of the model, a telltale sign of quantum many body scars."],"url":"http://arxiv.org/abs/2405.02102v1","category":"quant-ph"} +{"created":"2024-05-03 13:51:20","title":"Chordal matroids arising from generalized parallel connections II","abstract":"In 1961, Dirac showed that chordal graphs are exactly the graphs that can be constructed from complete graphs by a sequence of clique-sums. In an earlier paper, by analogy with Dirac's result, we introduced the class of $GF(q)$-chordal matroids as those matroids that can be constructed from projective geometries over $GF(q)$ by a sequence of generalized parallel connections across projective geometries over $GF(q)$. Our main result showed that when $q=2$, such matroids have no induced minor in $\\{M(C_4),M(K_4)\\}$. 
In this paper, we show that the class of $GF(2)$-chordal matroids coincides with the class of binary matroids that have none of $M(K_4)$, $M^*(K_{3,3})$, or $M(C_n)$ for $n\\geq 4$ as a flat. We also show that $GF(q)$-chordal matroids can be characterized by an analogous result to Rose's 1970 characterization of chordal graphs as those that have a perfect elimination ordering of vertices.","sentences":["In 1961, Dirac showed that chordal graphs are exactly the graphs that can be constructed from complete graphs by a sequence of clique-sums.","In an earlier paper, by analogy with Dirac's result, we introduced the class of $GF(q)$-chordal matroids as those matroids that can be constructed from projective geometries over $GF(q)$ by a sequence of generalized parallel connections across projective geometries over $GF(q)$. Our main result showed that when $q=2$, such matroids have no induced minor in $\\{M(C_4),M(K_4)\\}$. In this paper, we show that the class of $GF(2)$-chordal matroids coincides with the class of binary matroids that have none of $M(K_4)$, $M^*(K_{3,3})$, or $M(C_n)$ for $n\\geq 4$ as a flat.","We also show that $GF(q)$-chordal matroids can be characterized by an analogous result to Rose's 1970 characterization of chordal graphs as those that have a perfect elimination ordering of vertices."],"url":"http://arxiv.org/abs/2405.02099v1","category":"math.CO"} +{"created":"2024-05-03 13:42:49","title":"Advanced Detection of Source Code Clones via an Ensemble of Unsupervised Similarity Measures","abstract":"The capability of accurately determining code similarity is crucial in many tasks related to software development. For example, it might be essential to identify code duplicates for performing software maintenance. This research introduces a novel ensemble learning approach for code similarity assessment, combining the strengths of multiple unsupervised similarity measures. 
The key idea is that the strengths of a diverse set of similarity measures can complement each other and mitigate individual weaknesses, leading to improved performance. Preliminary results show that while Transformers-based CodeBERT and its variant GraphCodeBERT are undoubtedly the best option in the presence of abundant training data, in the case of specific small datasets (up to 500 samples), our ensemble achieves similar results, without prejudice to the interpretability of the resulting solution, and with a much lower associated carbon footprint due to training. The source code of this novel approach can be downloaded from https://github.com/jorge-martinez-gil/ensemble-codesim.","sentences":["The capability of accurately determining code similarity is crucial in many tasks related to software development.","For example, it might be essential to identify code duplicates for performing software maintenance.","This research introduces a novel ensemble learning approach for code similarity assessment, combining the strengths of multiple unsupervised similarity measures.","The key idea is that the strengths of a diverse set of similarity measures can complement each other and mitigate individual weaknesses, leading to improved performance.","Preliminary results show that while Transformers-based CodeBERT and its variant GraphCodeBERT are undoubtedly the best option in the presence of abundant training data, in the case of specific small datasets (up to 500 samples), our ensemble achieves similar results, without prejudice to the interpretability of the resulting solution, and with a much lower associated carbon footprint due to training.","The source code of this novel approach can be downloaded from https://github.com/jorge-martinez-gil/ensemble-codesim."],"url":"http://arxiv.org/abs/2405.02095v1","category":"cs.SE"} +{"created":"2024-05-03 13:42:49","title":"An informal account of recent results on initial-boundary value problems for systems of conservation 
laws","abstract":"This note aims at providing a rather informal and hopefully accessible overview of the fairly long and technical work [4]. In that paper, the authors established new global-in-time existence results for admissible solutions of nonlinear systems of conservation laws defined in domains with boundaries. The main novelty in [4] is that the solution is constructed by taking into account the underlying viscous mechanism, which is relevant because, in the case of initial-boundary value problems, different viscous approximations yield in general different limits. This note will frame the analysis of [4] in the relevant context, compare the main result with the previous existing literature, and touch upon the most innnovative technical points of the proof.","sentences":["This note aims at providing a rather informal and hopefully accessible overview of the fairly long and technical work [4].","In that paper, the authors established new global-in-time existence results for admissible solutions of nonlinear systems of conservation laws defined in domains with boundaries.","The main novelty in [4] is that the solution is constructed by taking into account the underlying viscous mechanism, which is relevant because, in the case of initial-boundary value problems, different viscous approximations yield in general different limits.","This note will frame the analysis of [4] in the relevant context, compare the main result with the previous existing literature, and touch upon the most innnovative technical points of the proof."],"url":"http://arxiv.org/abs/2405.02096v1","category":"math.AP"} +{"created":"2024-05-03 13:35:06","title":"Geometric realizations of the $s$-weak order and its lattice quotients","abstract":"For an $n$-tuple $s$ of non-negative integers, the $s$-weak order is a lattice structure on $s$-trees, generalizing the weak order on permutations. 
We first describe the join irreducible elements, the canonical join representations, and the forcing order of the $s$-weak order in terms of combinatorial objects, generalizing the arcs, the non-crossing arc diagrams, and the subarc order for the weak order. We then extend the theory of shards and shard polytopes to construct geometric realizations of the $s$-weak order and all its lattice quotients as polyhedral complexes, generalizing the quotient fans and quotientopes of the weak order.","sentences":["For an $n$-tuple $s$ of non-negative integers, the $s$-weak order is a lattice structure on $s$-trees, generalizing the weak order on permutations.","We first describe the join irreducible elements, the canonical join representations, and the forcing order of the $s$-weak order in terms of combinatorial objects, generalizing the arcs, the non-crossing arc diagrams, and the subarc order for the weak order.","We then extend the theory of shards and shard polytopes to construct geometric realizations of the $s$-weak order and all its lattice quotients as polyhedral complexes, generalizing the quotient fans and quotientopes of the weak order."],"url":"http://arxiv.org/abs/2405.02092v1","category":"math.CO"} +{"created":"2024-05-03 13:21:49","title":"Multi-level projection with exponential parallel speedup; Application to sparse auto-encoders neural networks","abstract":"The $\\ell_{1,\\infty}$ norm is an efficient structured projection but the complexity of the best algorithm is unfortunately $\\mathcal{O}\\big(n m \\log(n m)\\big)$ for a matrix in $\\mathbb{R}^{n\\times m}$. In this paper, we propose a new bi-level projection method for which we show that the time complexity for the $\\ell_{1,\\infty}$ norm is only $\\mathcal{O}\\big(n m \\big)$ for a matrix in $\\mathbb{R}^{n\\times m}$, and $\\mathcal{O}\\big(n + m \\big)$ with full parallel power. 
We generalize our method to tensors and we propose a new multi-level projection, having an induced decomposition that yields a linear parallel speedup up to an exponential speedup factor, resulting in a time complexity lower-bounded by the sum of the dimensions. Experiments show that our bi-level $\\ell_{1,\\infty}$ projection is $2.5$ times faster than the actual fastest algorithm provided by \\textit{Chu et. al.} while providing same accuracy and better sparsity in neural networks applications.","sentences":["The $\\ell_{1,\\infty}$ norm is an efficient structured projection but the complexity of the best algorithm is unfortunately $\\mathcal{O}\\big(n m \\log(n m)\\big)$ for a matrix in $\\mathbb{R}^{n\\times m}$. In this paper, we propose a new bi-level projection method for which we show that the time complexity for the $\\ell_{1,\\infty}$ norm is only $\\mathcal{O}\\big(n m \\big)$ for a matrix in $\\mathbb{R}^{n\\times m}$, and $\\mathcal{O}\\big(n + m \\big)$ with full parallel power.","We generalize our method to tensors and we propose a new multi-level projection, having an induced decomposition that yields a linear parallel speedup up to an exponential speedup factor, resulting in a time complexity lower-bounded by the sum of the dimensions.","Experiments show that our bi-level $\\ell_{1,\\infty}$ projection is $2.5$ times faster than the actual fastest algorithm provided by \\textit{Chu et.","al.} while providing same accuracy and better sparsity in neural networks applications."],"url":"http://arxiv.org/abs/2405.02086v1","category":"cs.LG"} +{"created":"2024-05-03 13:20:37","title":"A semantic loss for ontology classification","abstract":"Deep learning models are often unaware of the inherent constraints of the task they are applied to. However, many downstream tasks require logical consistency. For ontology classification tasks, such constraints include subsumption and disjointness relations between classes. 
In order to increase the consistency of deep learning models, we propose a semantic loss that combines label-based loss with terms penalising subsumption- or disjointness-violations. Our evaluation on the ChEBI ontology shows that the semantic loss is able to decrease the number of consistency violations by several orders of magnitude without decreasing the classification performance. In addition, we use the semantic loss for unsupervised learning. We show that this can further improve consistency on data from a distribution outside the scope of the supervised training.","sentences":["Deep learning models are often unaware of the inherent constraints of the task they are applied to.","However, many downstream tasks require logical consistency.","For ontology classification tasks, such constraints include subsumption and disjointness relations between classes. ","In order to increase the consistency of deep learning models, we propose a semantic loss that combines label-based loss with terms penalising subsumption- or disjointness-violations.","Our evaluation on the ChEBI ontology shows that the semantic loss is able to decrease the number of consistency violations by several orders of magnitude without decreasing the classification performance.","In addition, we use the semantic loss for unsupervised learning.","We show that this can further improve consistency on data from a distribution outside the scope of the supervised training."],"url":"http://arxiv.org/abs/2405.02083v1","category":"cs.AI"} +{"created":"2024-05-03 13:19:33","title":"A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning","abstract":"In the past decades, most work in the area of data analysis and machine learning was focused on optimizing predictive models and getting better results than what was possible with existing models. 
To what extent the metrics with which such improvements were measured were accurately capturing the intended goal, whether the numerical differences in the resulting values were significant, or whether uncertainty played a role in this study and if it should have been taken into account, was of secondary importance. Whereas probability theory, be it frequentist or Bayesian, used to be the gold standard in science before the advent of the supercomputer, it was quickly replaced in favor of black box models and sheer computing power because of their ability to handle large data sets. This evolution sadly happened at the expense of interpretability and trustworthiness. However, while people are still trying to improve the predictive power of their models, the community is starting to realize that for many applications it is not so much the exact prediction that is of importance, but rather the variability or uncertainty. The work in this dissertation tries to further the quest for a world where everyone is aware of uncertainty, of how important it is and how to embrace it instead of fearing it. A specific, though general, framework that allows anyone to obtain accurate uncertainty estimates is singled out and analysed. Certain aspects and applications of the framework -- dubbed `conformal prediction' -- are studied in detail. Whereas many approaches to uncertainty quantification make strong assumptions about the data, conformal prediction is, at the time of writing, the only framework that deserves the title `distribution-free'. 
No parametric assumptions have to be made and the nonparametric results also hold without having to resort to the law of large numbers in the asymptotic regime.","sentences":["In the past decades, most work in the area of data analysis and machine learning was focused on optimizing predictive models and getting better results than what was possible with existing models.","To what extent the metrics with which such improvements were measured were accurately capturing the intended goal, whether the numerical differences in the resulting values were significant, or whether uncertainty played a role in this study and if it should have been taken into account, was of secondary importance.","Whereas probability theory, be it frequentist or Bayesian, used to be the gold standard in science before the advent of the supercomputer, it was quickly replaced in favor of black box models and sheer computing power because of their ability to handle large data sets.","This evolution sadly happened at the expense of interpretability and trustworthiness.","However, while people are still trying to improve the predictive power of their models, the community is starting to realize that for many applications it is not so much the exact prediction that is of importance, but rather the variability or uncertainty. 
","The work in this dissertation tries to further the quest for a world where everyone is aware of uncertainty, of how important it is and how to embrace it instead of fearing it.","A specific, though general, framework that allows anyone to obtain accurate uncertainty estimates is singled out and analysed.","Certain aspects and applications of the framework -- dubbed `conformal prediction' -- are studied in detail.","Whereas many approaches to uncertainty quantification make strong assumptions about the data, conformal prediction is, at the time of writing, the only framework that deserves the title `distribution-free'.","No parametric assumptions have to be made and the nonparametric results also hold without having to resort to the law of large numbers in the asymptotic regime."],"url":"http://arxiv.org/abs/2405.02082v1","category":"stat.ML"} +{"created":"2024-05-03 13:15:29","title":"A Mutual Information Perspective on Federated Contrastive Learning","abstract":"We investigate contrastive learning in the federated setting through the lens of SimCLR and multi-view mutual information maximization. In doing so, we uncover a connection between contrastive representation learning and user verification; by adding a user verification loss to each client's local SimCLR loss we recover a lower bound to the global multi-view mutual information. To accommodate for the case of when some labelled data are available at the clients, we extend our SimCLR variant to the federated semi-supervised setting. We see that a supervised SimCLR objective can be obtained with two changes: a) the contrastive loss is computed between datapoints that share the same label and b) we require an additional auxiliary head that predicts the correct labels from either of the two views. 
Along with the proposed SimCLR extensions, we also study how different sources of non-i.i.d.-ness can impact the performance of federated unsupervised learning through global mutual information maximization; we find that a global objective is beneficial for some sources of non-i.i.d.-ness but can be detrimental for others. We empirically evaluate our proposed extensions in various tasks to validate our claims and furthermore demonstrate that our proposed modifications generalize to other pretraining methods.","sentences":["We investigate contrastive learning in the federated setting through the lens of SimCLR and multi-view mutual information maximization.","In doing so, we uncover a connection between contrastive representation learning and user verification; by adding a user verification loss to each client's local SimCLR loss we recover a lower bound to the global multi-view mutual information.","To accommodate for the case of when some labelled data are available at the clients, we extend our SimCLR variant to the federated semi-supervised setting.","We see that a supervised SimCLR objective can be obtained with two changes: a) the contrastive loss is computed between datapoints that share the same label and b) we require an additional auxiliary head that predicts the correct labels from either of the two views.","Along with the proposed SimCLR extensions, we also study how different sources of non-i.i.d.-ness can impact the performance of federated unsupervised learning through global mutual information maximization; we find that a global objective is beneficial for some sources of non-i.i.d.-ness but can be detrimental for others.","We empirically evaluate our proposed extensions in various tasks to validate our claims and furthermore demonstrate that our proposed modifications generalize to other pretraining methods."],"url":"http://arxiv.org/abs/2405.02081v1","category":"cs.LG"} +{"created":"2024-05-03 13:12:28","title":"Argumentative Large Language Models 
for Explainable and Contestable Decision-Making","abstract":"The diversity of knowledge encoded in large language models (LLMs) and their ability to apply this knowledge zero-shot in a range of settings makes them a promising candidate for use in decision-making. However, they are currently limited by their inability to reliably provide outputs which are explainable and contestable. In this paper, we attempt to reconcile these strengths and weaknesses by introducing a method for supplementing LLMs with argumentative reasoning. Concretely, we introduce argumentative LLMs, a method utilising LLMs to construct argumentation frameworks, which then serve as the basis for formal reasoning in decision-making. The interpretable nature of these argumentation frameworks and formal reasoning means that any decision made by the supplemented LLM may be naturally explained to, and contested by, humans. We demonstrate the effectiveness of argumentative LLMs experimentally in the decision-making task of claim verification. 
We obtain results that are competitive with, and in some cases surpass, comparable state-of-the-art techniques.","sentences":["The diversity of knowledge encoded in large language models (LLMs) and their ability to apply this knowledge zero-shot in a range of settings makes them a promising candidate for use in decision-making.","However, they are currently limited by their inability to reliably provide outputs which are explainable and contestable.","In this paper, we attempt to reconcile these strengths and weaknesses by introducing a method for supplementing LLMs with argumentative reasoning.","Concretely, we introduce argumentative LLMs, a method utilising LLMs to construct argumentation frameworks, which then serve as the basis for formal reasoning in decision-making.","The interpretable nature of these argumentation frameworks and formal reasoning means that any decision made by the supplemented LLM may be naturally explained to, and contested by, humans.","We demonstrate the effectiveness of argumentative LLMs experimentally in the decision-making task of claim verification.","We obtain results that are competitive with, and in some cases surpass, comparable state-of-the-art techniques."],"url":"http://arxiv.org/abs/2405.02079v1","category":"cs.CL"} +{"created":"2024-05-03 13:05:47","title":"Stability of Axion-Saxion wormholes","abstract":"We reconsider the perturbative stability of Euclidean axion wormholes. The quadratic action that governs linear perturbations is derived directly in Euclidean gravity. We demonstrate explicitly that a stability analysis in which one treats the axion as a normal two-form gauge field is equivalent to one performed in the Hodge-dual formulation, where one considers the axion as a scalar with a wrong-sign kinetic term. 
Both analyses indicate that axion wormholes are perturbatively stable, even in the presence of a massless dilaton, or saxion, field that couples to the axion.","sentences":["We reconsider the perturbative stability of Euclidean axion wormholes.","The quadratic action that governs linear perturbations is derived directly in Euclidean gravity.","We demonstrate explicitly that a stability analysis in which one treats the axion as a normal two-form gauge field is equivalent to one performed in the Hodge-dual formulation, where one considers the axion as a scalar with a wrong-sign kinetic term.","Both analyses indicate that axion wormholes are perturbatively stable, even in the presence of a massless dilaton, or saxion, field that couples to the axion."],"url":"http://arxiv.org/abs/2405.02072v1","category":"hep-th"} +{"created":"2024-05-03 13:05:19","title":"Spacelike initial data for black hole stability","abstract":"We construct initial data suitable for the Kerr stability conjecture, that is, solutions to the constraint equations on a spacelike hypersurface with boundary entering the black hole horizon that are arbitrarily decaying perturbations of a Kerr initial data set. This results from a more general perturbative construction on any asymptotically flat initial data set with the topology of $\\mathbb{R}^3\\setminus\\{r<1\\}$ enjoying some analyticity near and at the boundary. 
In particular, we design a suitable mixed boundary condition for the elliptic operator of the conformal method in order to exclude the Killing initial data sets (KIDS).","sentences":["We construct initial data suitable for the Kerr stability conjecture, that is, solutions to the constraint equations on a spacelike hypersurface with boundary entering the black hole horizon that are arbitrarily decaying perturbations of a Kerr initial data set.","This results from a more general perturbative construction on any asymptotically flat initial data set with the topology of $\\mathbb{R}^3\\setminus\\{r<1\\}$ enjoying some analyticity near and at the boundary.","In particular, we design a suitable mixed boundary condition for the elliptic operator of the conformal method in order to exclude the Killing initial data sets (KIDS)."],"url":"http://arxiv.org/abs/2405.02071v1","category":"math.AP"} +{"created":"2024-05-03 13:00:22","title":"Advancing Pre-trained Teacher: Towards Robust Feature Discrepancy for Anomaly Detection","abstract":"With the wide application of knowledge distillation between an ImageNet pre-trained teacher model and a learnable student model, industrial anomaly detection has witnessed a significant achievement in the past few years. The success of knowledge distillation mainly relies on how to keep the feature discrepancy between the teacher and student model, in which it assumes that: (1) the teacher model can jointly represent two different distributions for the normal and abnormal patterns, while (2) the student model can only reconstruct the normal distribution. However, it still remains a challenging issue to maintain these ideal assumptions in practice. In this paper, we propose a simple yet effective two-stage industrial anomaly detection framework, termed as AAND, which sequentially performs Anomaly Amplification and Normality Distillation to obtain robust feature discrepancy. 
In the first anomaly amplification stage, we propose a novel Residual Anomaly Amplification (RAA) module to advance the pre-trained teacher encoder. With the exposure of synthetic anomalies, it amplifies anomalies via residual generation while maintaining the integrity of pre-trained model. It mainly comprises a Matching-guided Residual Gate and an Attribute-scaling Residual Generator, which can determine the residuals' proportion and characteristic, respectively. In the second normality distillation stage, we further employ a reverse distillation paradigm to train a student decoder, in which a novel Hard Knowledge Distillation (HKD) loss is built to better facilitate the reconstruction of normal patterns. Comprehensive experiments on the MvTecAD, VisA, and MvTec3D-RGB datasets show that our method achieves state-of-the-art performance.","sentences":["With the wide application of knowledge distillation between an ImageNet pre-trained teacher model and a learnable student model, industrial anomaly detection has witnessed a significant achievement in the past few years.","The success of knowledge distillation mainly relies on how to keep the feature discrepancy between the teacher and student model, in which it assumes that: (1) the teacher model can jointly represent two different distributions for the normal and abnormal patterns, while (2) the student model can only reconstruct the normal distribution.","However, it still remains a challenging issue to maintain these ideal assumptions in practice.","In this paper, we propose a simple yet effective two-stage industrial anomaly detection framework, termed as AAND, which sequentially performs Anomaly Amplification and Normality Distillation to obtain robust feature discrepancy.","In the first anomaly amplification stage, we propose a novel Residual Anomaly Amplification (RAA) module to advance the pre-trained teacher encoder.","With the exposure of synthetic anomalies, it amplifies anomalies via residual generation 
while maintaining the integrity of pre-trained model.","It mainly comprises a Matching-guided Residual Gate and an Attribute-scaling Residual Generator, which can determine the residuals' proportion and characteristic, respectively.","In the second normality distillation stage, we further employ a reverse distillation paradigm to train a student decoder, in which a novel Hard Knowledge Distillation (HKD) loss is built to better facilitate the reconstruction of normal patterns.","Comprehensive experiments on the MvTecAD, VisA, and MvTec3D-RGB datasets show that our method achieves state-of-the-art performance."],"url":"http://arxiv.org/abs/2405.02068v1","category":"cs.CV"} +{"created":"2024-05-03 12:52:30","title":"Unstable algebraic K-theory: homological stability and other observations","abstract":"We investigate stability properties of the reductive Borel-Serre categories; these were introduced as a model for unstable algebraic K-theory in previous work. We see that they exhibit better homological stability properties than the general linear groups. We also show that they provide an explicit model for Yuan's partial algebraic K-theory.","sentences":["We investigate stability properties of the reductive Borel-Serre categories; these were introduced as a model for unstable algebraic K-theory in previous work.","We see that they exhibit better homological stability properties than the general linear groups.","We also show that they provide an explicit model for Yuan's partial algebraic K-theory."],"url":"http://arxiv.org/abs/2405.02065v1","category":"math.KT"} +{"created":"2024-05-03 12:51:15","title":"Elliptic fourth-order operators with Wentzell boundary conditions on Lipschitz domains","abstract":"For bounded domains $\\Omega$ with Lipschitz boundary $\\Gamma$, we investigate boundary value problems for elliptic operators with variable coefficients of fourth order subject to Wentzell (or dynamic) boundary conditions. 
Using form methods, we begin by showing general results for an even wider class of operators defined via two (intertwined) quadratic forms by defining very abstract concepts of weak traces. Even in this general setting, we prove generation of an analytic semigroup on the product space $L^2(\\Omega) \\times L^2(\\Gamma)$. Using recent results concerning weak co-normal traces, we apply our abstract theory to the elliptic fourth-order case and are able to fully characterize the domain in terms of Sobolev regularity, also obtaining H\\\"older-regularity of solutions. Finally, we also discuss asymptotic behavior and (eventual) positivity.","sentences":["For bounded domains $\\Omega$ with Lipschitz boundary $\\Gamma$, we investigate boundary value problems for elliptic operators with variable coefficients of fourth order subject to Wentzell (or dynamic) boundary conditions.","Using form methods, we begin by showing general results for an even wider class of operators defined via two (intertwined) quadratic forms by defining very abstract concepts of weak traces.","Even in this general setting, we prove generation of an analytic semigroup on the product space $L^2(\\Omega)","\\times L^2(\\Gamma)$. Using recent results concerning weak co-normal traces, we apply our abstract theory to the elliptic fourth-order case and are able to fully characterize the domain in terms of Sobolev regularity, also obtaining H\\\"older-regularity of solutions.","Finally, we also discuss asymptotic behavior and (eventual) positivity."],"url":"http://arxiv.org/abs/2405.02064v1","category":"math.AP"} +{"created":"2024-05-03 12:42:43","title":"Towards general deep-learning-based tree instance segmentation models","abstract":"The segmentation of individual trees from forest point clouds is a crucial task for downstream analyses such as carbon sequestration estimation. Recently, deep-learning-based methods have been proposed which show the potential of learning to segment trees. 
Since these methods are trained in a supervised way, the question arises how general models can be obtained that are applicable across a wide range of settings. So far, training has been mainly conducted with data from one specific laser scanning type and for specific types of forests. In this work, we train one segmentation model under various conditions, using seven diverse datasets found in literature, to gain insights into the generalization capabilities under domain-shift. Our results suggest that a generalization from coniferous dominated sparse point clouds to deciduous dominated high-resolution point clouds is possible. Conversely, qualitative evidence suggests that generalization from high-resolution to low-resolution point clouds is challenging. This emphasizes the need for forest point clouds with diverse data characteristics for model development. To enrich the available data basis, labeled trees from two previous works were propagated to the complete forest point cloud and are made publicly available at https://doi.org/10.25625/QUTUWU.","sentences":["The segmentation of individual trees from forest point clouds is a crucial task for downstream analyses such as carbon sequestration estimation.","Recently, deep-learning-based methods have been proposed which show the potential of learning to segment trees.","Since these methods are trained in a supervised way, the question arises how general models can be obtained that are applicable across a wide range of settings.","So far, training has been mainly conducted with data from one specific laser scanning type and for specific types of forests.","In this work, we train one segmentation model under various conditions, using seven diverse datasets found in literature, to gain insights into the generalization capabilities under domain-shift.","Our results suggest that a generalization from coniferous dominated sparse point clouds to deciduous dominated high-resolution point clouds is possible.","Conversely, 
qualitative evidence suggests that generalization from high-resolution to low-resolution point clouds is challenging.","This emphasizes the need for forest point clouds with diverse data characteristics for model development.","To enrich the available data basis, labeled trees from two previous works were propagated to the complete forest point cloud and are made publicly available at https://doi.org/10.25625/QUTUWU."],"url":"http://arxiv.org/abs/2405.02061v1","category":"cs.CV"} +{"created":"2024-05-03 12:41:09","title":"On accumulated spectrograms for Gabor frames","abstract":"Analogs of classical results on accumulated spectrograms, the sum of spectrograms of eigenfunctions of localization operators, are established for Gabor multipliers on tight frames. We show that the lattice $\\ell^1$ distance between the accumulated spectrogram and the indicator function of the Gabor multiplier mask is bounded by the length of the perimeter of the mask and that this bound is sharp in general. The methods developed for the proofs are also used to show that the Weyl-Heisenberg ensemble restricted to a lattice is hyperuniform.","sentences":["Analogs of classical results on accumulated spectrograms, the sum of spectrograms of eigenfunctions of localization operators, are established for Gabor multipliers on tight frames.","We show that the lattice $\\ell^1$ distance between the accumulated spectrogram and the indicator function of the Gabor multiplier mask is bounded by the length of the perimeter of the mask and that this bound is sharp in general.","The methods developed for the proofs are also used to show that the Weyl-Heisenberg ensemble restricted to a lattice is hyperuniform."],"url":"http://arxiv.org/abs/2405.02059v1","category":"math.FA"} +{"created":"2024-05-03 12:35:54","title":"Solving Sequential Manipulation Puzzles by Finding Easier Subproblems","abstract":"We consider a set of challenging sequential manipulation puzzles, where an agent has to interact with 
multiple movable objects and navigate narrow passages. Such settings are notoriously difficult for Task-and-Motion Planners, as they require interdependent regrasps and solving hard motion planning problems. In this paper, we propose to search over sequences of easier pick-and-place subproblems, which can lead to the solution of the manipulation puzzle. Our method combines a heuristic-driven forward search of subproblems with an optimization-based Task-and-Motion Planning solver. To guide the search, we introduce heuristics to generate and prioritize useful subgoals. We evaluate our approach on various manually designed and automatically generated scenes, demonstrating the benefits of auxiliary subproblems in sequential manipulation planning.","sentences":["We consider a set of challenging sequential manipulation puzzles, where an agent has to interact with multiple movable objects and navigate narrow passages.","Such settings are notoriously difficult for Task-and-Motion Planners, as they require interdependent regrasps and solving hard motion planning problems.","In this paper, we propose to search over sequences of easier pick-and-place subproblems, which can lead to the solution of the manipulation puzzle.","Our method combines a heuristic-driven forward search of subproblems with an optimization-based Task-and-Motion Planning solver.","To guide the search, we introduce heuristics to generate and prioritize useful subgoals.","We evaluate our approach on various manually designed and automatically generated scenes, demonstrating the benefits of auxiliary subproblems in sequential manipulation planning."],"url":"http://arxiv.org/abs/2405.02053v1","category":"cs.RO"} +{"created":"2024-05-03 12:31:42","title":"Cohesive urban bicycle infrastructure design through optimal transport routing in multilayer networks","abstract":"Bicycle infrastructure networks must meet the needs of cyclists to position cycling as a viable transportation choice in cities. 
In particular, protected infrastructure should be planned cohesively for the whole city and spacious enough to accommodate all cyclists safely and prevent cyclist congestion -- a common problem in cycling cities like Copenhagen. Here, we devise an adaptive method for optimal bicycle network design and for evaluating congestion criticalities on bicycle paths. The method goes beyond static network measures, using computationally efficient adaptation rules inspired by Optimal Transport on the dynamically updating multilayer network of roads and protected bicycle lanes. Street capacities and cyclist flows reciprocally control each other to optimally accommodate cyclists on streets with one control parameter that dictates the preference of bicycle infrastructure over roads. Applying our method to Copenhagen confirms that the city's bicycle network is generally well-developed. However, we are able to identify the network's bottlenecks, and we find, at a finer scale, disparities in network accessibility and criticalities between different neighborhoods. 
Our model and results are generalizable beyond this particular case study to serve as a scalable and versatile tool for aiding urban planners in designing cycling-friendly cities.","sentences":["Bicycle infrastructure networks must meet the needs of cyclists to position cycling as a viable transportation choice in cities.","In particular, protected infrastructure should be planned cohesively for the whole city and spacious enough to accommodate all cyclists safely and prevent cyclist congestion -- a common problem in cycling cities like Copenhagen.","Here, we devise an adaptive method for optimal bicycle network design and for evaluating congestion criticalities on bicycle paths.","The method goes beyond static network measures, using computationally efficient adaptation rules inspired by Optimal Transport on the dynamically updating multilayer network of roads and protected bicycle lanes.","Street capacities and cyclist flows reciprocally control each other to optimally accommodate cyclists on streets with one control parameter that dictates the preference of bicycle infrastructure over roads.","Applying our method to Copenhagen confirms that the city's bicycle network is generally well-developed.","However, we are able to identify the network's bottlenecks, and we find, at a finer scale, disparities in network accessibility and criticalities between different neighborhoods.","Our model and results are generalizable beyond this particular case study to serve as a scalable and versatile tool for aiding urban planners in designing cycling-friendly cities."],"url":"http://arxiv.org/abs/2405.02052v1","category":"physics.soc-ph"} +{"created":"2024-05-03 12:30:33","title":"Cosmology in Entangled Relativity","abstract":"General Relativity, in the absence of a cosmological constant, is an inevitable limit of Entangled Relativity, particularly when the universe is dominated by dust and/or electromagnetic radiation. 
In this communication, I emphasize that this arises from a specific type of decoupling termed \\textit{intrinsic decoupling}. I then discuss what this implies for Dark Energy candidates within this framework. Furthermore, I introduce a novel and tantalizing hypothesis that the Lagrangian of Entangled Relativity represents merely the unperturbed term of an infinite series in a perturbative scheme. The terms of this series are dictated by the only dimensionful universal parameter of the theory, and notably, this series retains the \\textit{intrinsic decoupling} of the original theory, non-perturbatively.","sentences":["General Relativity, in the absence of a cosmological constant, is an inevitable limit of Entangled Relativity, particularly when the universe is dominated by dust and/or electromagnetic radiation.","In this communication, I emphasize that this arises from a specific type of decoupling termed \\textit{intrinsic decoupling}.","I then discuss what this implies for Dark Energy candidates within this framework.","Furthermore, I introduce a novel and tantalizing hypothesis that the Lagrangian of Entangled Relativity represents merely the unperturbed term of an infinite series in a perturbative scheme.","The terms of this series are dictated by the only dimensionful universal parameter of the theory, and notably, this series retains the \\textit{intrinsic decoupling} of the original theory, non-perturbatively."],"url":"http://arxiv.org/abs/2405.02051v1","category":"gr-qc"} +{"created":"2024-05-03 12:30:27","title":"Hypertree shrinking avoiding low degree vertices","abstract":"The shrinking operation converts a hypergraph into a graph by choosing, from each hyperedge, two endvertices of a corresponding graph edge. A hypertree is a hypergraph which can be shrunk to a tree on the same vertex set. Klimo\\v{s}ov\\'{a} and Thomass\\'{e} [J. Combin. Theory Ser. 
B 156 (2022), 250--293] proved (as a tool to obtain their main result on edge-decompositions of graphs into paths of equal length) that any rank $3$ hypertree $T$ can be shrunk to a tree where the degree of each vertex is at least $1/100$ times its degree in $T$. We prove a stronger and a more general bound, replacing the constant $1/100$ with $1/2k$ when the rank is $k$. In place of entropy compression (used by Klimo\\v{s}ov\\'{a} and Thomass\\'{e}), we use a hypergraph orientation lemma combined with a characterisation of edge-coloured graphs admitting rainbow spanning trees.","sentences":["The shrinking operation converts a hypergraph into a graph by choosing, from each hyperedge, two endvertices of a corresponding graph edge.","A hypertree is a hypergraph which can be shrunk to a tree on the same vertex set.","Klimo\\v{s}ov\\'{a} and Thomass\\'{e} [J. Combin.","Theory Ser.","B 156 (2022), 250--293] proved (as a tool to obtain their main result on edge-decompositions of graphs into paths of equal length) that any rank $3$ hypertree $T$ can be shrunk to a tree where the degree of each vertex is at least $1/100$ times its degree in $T$. We prove a stronger and a more general bound, replacing the constant $1/100$ with $1/2k$ when the rank is $k$. In place of entropy compression (used by Klimo\\v{s}ov\\'{a} and Thomass\\'{e}), we use a hypergraph orientation lemma combined with a characterisation of edge-coloured graphs admitting rainbow spanning trees."],"url":"http://arxiv.org/abs/2405.02049v1","category":"math.CO"} +{"created":"2024-05-03 12:30:01","title":"Comparative Analysis of Retrieval Systems in the Real World","abstract":"This research paper presents a comprehensive analysis of integrating advanced language models with search and retrieval systems in the fields of information retrieval and natural language processing. The objective is to evaluate and compare various state-of-the-art methods based on their performance in terms of accuracy and efficiency. 
The analysis explores different combinations of technologies, including Azure Cognitive Search Retriever with GPT-4, Pinecone's Canopy framework, Langchain with Pinecone and different language models (OpenAI, Cohere), LlamaIndex with Weaviate Vector Store's hybrid search, Google's RAG implementation on Cloud VertexAI-Search, Amazon SageMaker's RAG, and a novel approach called KG-FID Retrieval. The motivation for this analysis arises from the increasing demand for robust and responsive question-answering systems in various domains. The RobustQA metric is used to evaluate the performance of these systems under diverse paraphrasing of questions. The report aims to provide insights into the strengths and weaknesses of each method, facilitating informed decisions in the deployment and development of AI-driven search and retrieval systems.","sentences":["This research paper presents a comprehensive analysis of integrating advanced language models with search and retrieval systems in the fields of information retrieval and natural language processing.","The objective is to evaluate and compare various state-of-the-art methods based on their performance in terms of accuracy and efficiency.","The analysis explores different combinations of technologies, including Azure Cognitive Search Retriever with GPT-4, Pinecone's Canopy framework, Langchain with Pinecone and different language models (OpenAI, Cohere), LlamaIndex with Weaviate Vector Store's hybrid search, Google's RAG implementation on Cloud VertexAI-Search, Amazon SageMaker's RAG, and a novel approach called KG-FID Retrieval.","The motivation for this analysis arises from the increasing demand for robust and responsive question-answering systems in various domains.","The RobustQA metric is used to evaluate the performance of these systems under diverse paraphrasing of questions.","The report aims to provide insights into the strengths and weaknesses of each method, facilitating informed decisions in the deployment and 
development of AI-driven search and retrieval systems."],"url":"http://arxiv.org/abs/2405.02048v1","category":"cs.IR"} +{"created":"2024-05-03 12:29:07","title":"Small Logic-based Multipliers with Incomplete Sub-Multipliers for FPGAs","abstract":"There is a recent trend in artificial intelligence (AI) inference towards lower precision data formats down to 8 bits and less. As multiplication is the most complex operation in typical inference tasks, there is a large demand for efficient small multipliers. The large DSP blocks have limitations implementing many small multipliers efficiently. Hence, this work proposes a solution for better logic-based multipliers that is especially beneficial for small multipliers. Our work is based on the multiplier tiling method in which a multiplier is designed out of several sub-multiplier tiles. The key observation we made is that these sub-multipliers do not necessarily have to perform a complete (rectangular) NxK multiplication and more efficient sub-multipliers are possible that are incomplete (non-rectangular). This proposal first seeks to identify efficient incomplete irregular sub-multipliers and then demonstrates improvements over state-of-the-art designs. 
It is shown that optimal solutions can be found using integer linear programming (ILP), which are evaluated in FPGA synthesis experiments.","sentences":["There is a recent trend in artificial intelligence (AI) inference towards lower precision data formats down to 8 bits and less.","As multiplication is the most complex operation in typical inference tasks, there is a large demand for efficient small multipliers.","The large DSP blocks have limitations implementing many small multipliers efficiently.","Hence, this work proposes a solution for better logic-based multipliers that is especially beneficial for small multipliers.","Our work is based on the multiplier tiling method in which a multiplier is designed out of several sub-multiplier tiles.","The key observation we made is that these sub-multipliers do not necessarily have to perform a complete (rectangular) NxK multiplication and more efficient sub-multipliers are possible that are incomplete (non-rectangular).","This proposal first seeks to identify efficient incomplete irregular sub-multipliers and then demonstrates improvements over state-of-the-art designs.","It is shown that optimal solutions can be found using integer linear programming (ILP), which are evaluated in FPGA synthesis experiments."],"url":"http://arxiv.org/abs/2405.02047v1","category":"cs.AR"} +{"created":"2024-05-03 12:21:43","title":"Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach","abstract":"Robust Reinforcement Learning (RRL) is a promising Reinforcement Learning (RL) paradigm aimed at training robust to uncertainty or disturbances models, making them more efficient for real-world applications. Following this paradigm, uncertainty or disturbances are interpreted as actions of a second adversarial agent, and thus, the problem is reduced to seeking the agents' policies robust to any opponent's actions. 
This paper is the first to propose considering the RRL problems within the positional differential game theory, which helps us to obtain theoretically justified intuition to develop a centralized Q-learning approach. Namely, we prove that under Isaacs's condition (sufficiently general for real-world dynamical systems), the same Q-function can be utilized as an approximate solution of both minimax and maximin Bellman equations. Based on these results, we present the Isaacs Deep Q-Network algorithms and demonstrate their superiority compared to other baseline RRL and Multi-Agent RL algorithms in various environments.","sentences":["Robust Reinforcement Learning (RRL) is a promising Reinforcement Learning (RL) paradigm aimed at training robust to uncertainty or disturbances models, making them more efficient for real-world applications.","Following this paradigm, uncertainty or disturbances are interpreted as actions of a second adversarial agent, and thus, the problem is reduced to seeking the agents' policies robust to any opponent's actions.","This paper is the first to propose considering the RRL problems within the positional differential game theory, which helps us to obtain theoretically justified intuition to develop a centralized Q-learning approach.","Namely, we prove that under Isaacs's condition (sufficiently general for real-world dynamical systems), the same Q-function can be utilized as an approximate solution of both minimax and maximin Bellman equations.","Based on these results, we present the Isaacs Deep Q-Network algorithms and demonstrate their superiority compared to other baseline RRL and Multi-Agent RL algorithms in various environments."],"url":"http://arxiv.org/abs/2405.02044v1","category":"cs.LG"} +{"created":"2024-05-03 12:19:38","title":"Large Multimodal Model based Standardisation of Pathology Reports with Confidence and their Prognostic Significance","abstract":"Pathology reports are rich in clinical and pathological details but are 
often presented in free-text format. The unstructured nature of these reports presents a significant challenge limiting the accessibility of their content. In this work, we present a practical approach based on the use of large multimodal models (LMMs) for automatically extracting information from scanned images of pathology reports with the goal of generating a standardised report specifying the value of different fields along with estimated confidence about the accuracy of the extracted fields. The proposed approach overcomes limitations of existing methods which do not assign confidence scores to extracted fields limiting their practical use. The proposed framework uses two stages of prompting a Large Multimodal Model (LMM) for information extraction and validation. The framework generalises to textual reports from multiple medical centres as well as scanned images of legacy pathology reports. We show that the estimated confidence is an effective indicator of the accuracy of the extracted information that can be used to select only accurately extracted fields. We also show the prognostic significance of structured and unstructured data from pathology reports and show that the automatically extracted field values significant prognostic value for patient stratification. 
The framework is available for evaluation via the URL: https://labieb.dcs.warwick.ac.uk/.","sentences":["Pathology reports are rich in clinical and pathological details but are often presented in free-text format.","The unstructured nature of these reports presents a significant challenge limiting the accessibility of their content.","In this work, we present a practical approach based on the use of large multimodal models (LMMs) for automatically extracting information from scanned images of pathology reports with the goal of generating a standardised report specifying the value of different fields along with estimated confidence about the accuracy of the extracted fields.","The proposed approach overcomes limitations of existing methods which do not assign confidence scores to extracted fields limiting their practical use.","The proposed framework uses two stages of prompting a Large Multimodal Model (LMM) for information extraction and validation.","The framework generalises to textual reports from multiple medical centres as well as scanned images of legacy pathology reports.","We show that the estimated confidence is an effective indicator of the accuracy of the extracted information that can be used to select only accurately extracted fields.","We also show the prognostic significance of structured and unstructured data from pathology reports and show that the automatically extracted field values significant prognostic value for patient stratification.","The framework is available for evaluation via the URL: https://labieb.dcs.warwick.ac.uk/."],"url":"http://arxiv.org/abs/2405.02040v1","category":"cs.CL"} +{"created":"2024-05-03 12:18:16","title":"On the Submodule Structure of Hook Specht Modules in Characteristic 2","abstract":"The submodule structure of general Specht modules in prime characteristic is a difficult open problem. 
Kleshchev and Sheth [Journal of Algebra, 221(2), pp.705-722] gave a combinatorial description of the submodule structure of Specht modules labelled by $2$-part partitions in prime characteristic. Using this result, as well as filtrations of Specht modules labelled by hook partitions via $2$-part Specht modules in characteristic $2$, one can study the submodule structure of hook Specht modules. In particular, we classify which of these are uniserial.","sentences":["The submodule structure of general Specht modules in prime characteristic is a difficult open problem.","Kleshchev and Sheth","[Journal of Algebra, 221(2), pp.705-722] gave a combinatorial description of the submodule structure of Specht modules labelled by $2$-part partitions in prime characteristic.","Using this result, as well as filtrations of Specht modules labelled by hook partitions via $2$-part Specht modules in characteristic $2$, one can study the submodule structure of hook Specht modules.","In particular, we classify which of these are uniserial."],"url":"http://arxiv.org/abs/2405.02039v1","category":"math.RT"} +{"created":"2024-05-03 12:13:34","title":"Effect of Helium Ion Implantation on 3C-SiC Nanomechanical String Resonators","abstract":"Hybrid quantum devices enable novel functionalities by combining the benefits of different subsystems. Particularly, point defects in nanomechanical resonators made of diamond or silicon carbide (SiC) have been proposed for precise magnetic field sensing and as versatile quantum transducers. However, the realization of a hybrid system may involve tradeoffs in the performance of the constituent subsystems. In a spin-mechanical system, the mechanical properties of the resonator may suffer from the presence of engineered defects in the crystal lattice. This may severely restrict the performance of the resulting device and needs to be carefully explored. 
Here, we focus on the impact of defects on high Q nanomechanical string resonators made of pre-stressed 3C-SiC grown on Si(111). We use helium ion implantation to create point defects and study their accumulated effect on the mechanical performance. Using Euler-Bernoulli beam theory, we present a method to determine Young's modulus and the pre-stress of the strings. We find that Young's modulus is not modified by implantation. Under implantation doses relevant for single defect or defect ensemble generation, both tensile stress and damping rate also remain unaltered. For higher implantation dose, both exhibit a characteristic change.","sentences":["Hybrid quantum devices enable novel functionalities by combining the benefits of different subsystems.","Particularly, point defects in nanomechanical resonators made of diamond or silicon carbide (SiC) have been proposed for precise magnetic field sensing and as versatile quantum transducers.","However, the realization of a hybrid system may involve tradeoffs in the performance of the constituent subsystems.","In a spin-mechanical system, the mechanical properties of the resonator may suffer from the presence of engineered defects in the crystal lattice.","This may severely restrict the performance of the resulting device and needs to be carefully explored.","Here, we focus on the impact of defects on high Q nanomechanical string resonators made of pre-stressed 3C-SiC grown on Si(111).","We use helium ion implantation to create point defects and study their accumulated effect on the mechanical performance.","Using Euler-Bernoulli beam theory, we present a method to determine Young's modulus and the pre-stress of the strings.","We find that Young's modulus is not modified by implantation.","Under implantation doses relevant for single defect or defect ensemble generation, both tensile stress and damping rate also remain unaltered.","For higher implantation dose, both exhibit a characteristic 
change."],"url":"http://arxiv.org/abs/2405.02035v1","category":"cond-mat.mes-hall"} +{"created":"2024-05-03 12:10:57","title":"Multi-Agent Coverage Control on Surfaces Using Conformal Mapping","abstract":"Real-time environmental monitoring using a multi-agent system (MAS) has long been a focal point of cooperative control. It is still a challenging task to provide cost-effective services for potential emergencies in surface environments. This paper explores the transformation of a general surface into a two-dimensional (2D) disk through the construction of a conformal mapping. Multiple agents are strategically deployed within the mapped convex disk, followed by mapping back to the original surface environment. This approach circumvents the complexities associated with handling the difficulties and intricacies of path planning. Technical analysis encompasses the design of distributed control laws and the method to eliminate distortions introduced by the mapping. Moreover, the developed coverage algorithm is applied to a scenario of monitoring surface deformation. 
Finally, the effectiveness of the proposed algorithm is validated through numerical simulations.","sentences":["Real-time environmental monitoring using a multi-agent system (MAS) has long been a focal point of cooperative control.","It is still a challenging task to provide cost-effective services for potential emergencies in surface environments.","This paper explores the transformation of a general surface into a two-dimensional (2D) disk through the construction of a conformal mapping.","Multiple agents are strategically deployed within the mapped convex disk, followed by mapping back to the original surface environment.","This approach circumvents the complexities associated with handling the difficulties and intricacies of path planning.","Technical analysis encompasses the design of distributed control laws and the method to eliminate distortions introduced by the mapping.","Moreover, the developed coverage algorithm is applied to a scenario of monitoring surface deformation.","Finally, the effectiveness of the proposed algorithm is validated through numerical simulations."],"url":"http://arxiv.org/abs/2405.02034v1","category":"math.OC"} +{"created":"2024-05-03 12:08:25","title":"Features, paradoxes and amendments of perturbative non-Hermitian quantum mechanics","abstract":"Quantum mechanics of unitary systems is considered in quasi-Hermitian representation. In this framework the concept of perturbation is found counterintuitive, for three reasons. The first one is that in this formalism we are allowed to change the physical Hilbert-space norm. Thus, in a preselected Hamiltonian $H(\\lambda)=H_0+\\lambda\\,H_1$ the size (and, hence, influence) of the perturbation cannot always be kept under a reliable control. Often, an enhanced sensitivity to perturbations is observed, for this reason, in open quantum systems. 
Second, even when we consider just a closed quantum system in which the influence of $H_1\\neq H_1^\\dagger$ is guaranteed to be small, the correct probabilistic interpretation of the system remains ambiguous, mainly due to the non-uniqueness of the physical Hilbert-space inner-product metric~$\\Theta$. Third, even if we decide to ignore the ambiguity and if we pick up just any one of the eligible metrics (which reduces the scope of the theory of course), such a choice would still vary with $\\lambda$. In our paper it is shown that all of these three obstacles can be circumvented via just a mild amendment of the Rayleigh-Schr\\\"{o}dinger perturbation-expansion approach. The flexibility of $\\Theta=\\Theta(\\lambda)$ is shown to remain tractable while opening several new model-building horizons including the study of generic random perturbations and/or of multiple specific non-Hermitian toy models. In parallel, several paradoxes and open questions are shown to survive.","sentences":["Quantum mechanics of unitary systems is considered in quasi-Hermitian representation.","In this framework the concept of perturbation is found counterintuitive, for three reasons.","The first one is that in this formalism we are allowed to change the physical Hilbert-space norm.","Thus, in a preselected Hamiltonian $H(\\lambda)=H_0+\\lambda\\,H_1$ the size (and, hence, influence) of the perturbation cannot always be kept under a reliable control.","Often, an enhanced sensitivity to perturbations is observed, for this reason, in open quantum systems.","Second, even when we consider just a closed quantum system in which the influence of $H_1\\neq H_1^\\dagger$ is guaranteed to be small, the correct probabilistic interpretation of the system remains ambiguous, mainly due to the non-uniqueness of the physical Hilbert-space inner-product metric~$\\Theta$.","Third, even if we decide to ignore the ambiguity and if we pick up just any one of the eligible metrics (which reduces the scope of the 
theory of course), such a choice would still vary with $\\lambda$. In our paper it is shown that all of these three obstacles can be circumvented via just a mild amendment of the Rayleigh-Schr\\\"{o}dinger perturbation-expansion approach.","The flexibility of $\\Theta=\\Theta(\\lambda)$ is shown to remain tractable while opening several new model-building horizons including the study of generic random perturbations and/or of multiple specific non-Hermitian toy models.","In parallel, several paradoxes and open questions are shown to survive."],"url":"http://arxiv.org/abs/2405.02032v1","category":"quant-ph"} +{"created":"2024-05-03 11:56:13","title":"Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT","abstract":"The ability to transmit and receive complex information via language is unique to humans and is the basis of traditions, culture and versatile social interactions. Through the disruptive introduction of transformer based large language models (LLMs) humans are not the only entity to \"understand\" and produce language any more. In the present study, we have performed the first steps to use LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. Thus, we have used ChatGPT to generate seven different stylistic variations of ten different narratives (Aesop's fables). We used these stories as input for the open source LLM BERT and have analyzed the activation patterns of the hidden units of BERT using multi-dimensional scaling and cluster analysis. We found that the activation vectors of the hidden units cluster according to stylistic variations in earlier layers of BERT (1) than narrative content (4-5). Despite the fact that BERT consists of 12 identical building blocks that are stacked and trained on large text corpora, the different layers perform different tasks. 
This is a very useful model of the human brain, where self-similar structures, i.e. different areas of the cerebral cortex, can have different functions and are therefore well suited to processing language in a very efficient way. The proposed approach has the potential to open the black box of LLMs on the one hand, and might be a further step to unravel the neural processes underlying human language processing and cognition in general.","sentences":["The ability to transmit and receive complex information via language is unique to humans and is the basis of traditions, culture and versatile social interactions.","Through the disruptive introduction of transformer based large language models (LLMs) humans are not the only entity to \"understand\" and produce language any more.","In the present study, we have performed the first steps to use LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing.","Thus, we have used ChatGPT to generate seven different stylistic variations of ten different narratives (Aesop's fables).","We used these stories as input for the open source LLM BERT and have analyzed the activation patterns of the hidden units of BERT using multi-dimensional scaling and cluster analysis.","We found that the activation vectors of the hidden units cluster according to stylistic variations in earlier layers of BERT (1) than narrative content (4-5).","Despite the fact that BERT consists of 12 identical building blocks that are stacked and trained on large text corpora, the different layers perform different tasks.","This is a very useful model of the human brain, where self-similar structures, i.e. 
different areas of the cerebral cortex, can have different functions and are therefore well suited to processing language in a very efficient way.","The proposed approach has the potential to open the black box of LLMs on the one hand, and might be a further step to unravel the neural processes underlying human language processing and cognition in general."],"url":"http://arxiv.org/abs/2405.02024v1","category":"cs.CL"} +{"created":"2024-05-03 11:41:15","title":"Experimental jet control with Bayesian optimization and persistent data topology","abstract":"This study experimentally optimizes the mixing of a turbulent jet at $Re=10000$ with the surrounding air by targeted shear layer actuation. The forcing is composed of superposed harmonic signals of different azimuthal wavenumber $m$ generated by eight loudspeakers circumferentially distributed around the nozzle lip. Amplitudes and frequencies of the individual harmonic contributions serve as optimization parameters and the time-averaged centerline velocity downstream of the potential core is used as a metric for mixing optimization. The actuation is optimized through Bayesian optimization. Three search spaces are explored - axisymmetric forcing, $m=0$, superposed axisymmetric and helical forcing, $m \\in \\{0,1\\}$, and axisymmetric actuation combined with two counter-rotating helical modes, $m \\in \\{-1,0,1\\}$. High-speed PIV is employed to analyze the jet response to the optimized forcing. The optimization processes are analyzed by persistent data topology. In the search space of axisymmetric excitation, the routine identifies an actuation at the natural frequency of the flow to be most efficient, with the centerline velocity being decreased by $15\\%$. The optimal solutions in both the two-mode and three-mode search space converge to a similar forcing with one axial and one helical mode combined at a frequency ratio of around $2.3$. 
Spectral analysis of the PIV images reveals that for the identified optimal forcing frequencies, a non-linear interaction between forced and natural structures in the jet flow is triggered, leading to a reduction in centerline velocity of around $35\\%$. The topology of the most complex search space from the discrete data reveals four basins of attractions, classified into three forcing patterns including axisymmetric, axisym.-helical, and axisym.-flapping. Two deep basins are related to the optimal axisym.-helical pattern, and the others are shallower.","sentences":["This study experimentally optimizes the mixing of a turbulent jet at $Re=10000$ with the surrounding air by targeted shear layer actuation.","The forcing is composed of superposed harmonic signals of different azimuthal wavenumber $m$ generated by eight loudspeakers circumferentially distributed around the nozzle lip.","Amplitudes and frequencies of the individual harmonic contributions serve as optimization parameters and the time-averaged centerline velocity downstream of the potential core is used as a metric for mixing optimization.","The actuation is optimized through Bayesian optimization.","Three search spaces are explored - axisymmetric forcing, $m=0$, superposed axisymmetric and helical forcing, $m \\in \\{0,1\\}$, and axisymmetric actuation combined with two counter-rotating helical modes, $m \\in \\{-1,0,1\\}$. High-speed PIV is employed to analyze the jet response to the optimized forcing.","The optimization processes are analyzed by persistent data topology.","In the search space of axisymmetric excitation, the routine identifies an actuation at the natural frequency of the flow to be most efficient, with the centerline velocity being decreased by $15\\%$. The optimal solutions in both the two-mode and three-mode search space converge to a similar forcing with one axial and one helical mode combined at a frequency ratio of around $2.3$. 
Spectral analysis of the PIV images reveals that for the identified optimal forcing frequencies, a non-linear interaction between forced and natural structures in the jet flow is triggered, leading to a reduction in centerline velocity of around $35\\%$. The topology of the most complex search space from the discrete data reveals four basins of attractions, classified into three forcing patterns including axisymmetric, axisym.-helical, and axisym.-flapping.","Two deep basins are related to the optimal axisym.-helical pattern, and the others are shallower."],"url":"http://arxiv.org/abs/2405.02020v1","category":"physics.flu-dyn"} +{"created":"2024-05-03 11:33:52","title":"Time-of-arrival distributions for continuous quantum systems","abstract":"Using standard results from statistics, we show that for any continuous quantum system (Gaussian or otherwise) and any observable $A$ (position or otherwise), the distribution $ \\pi _{a}\\left(t\\right) $ of a time measurement at a fixed state $a$ can be inferred from the distribution $ \\rho _{t}\\left( a\\right) $ of a state measurement at a fixed time $t$ via the transformation $ \\pi _{a}\\left( t\\right) = \\left\\vert \\frac{\\partial }{\\partial t} \\int_{-\\infty }^{a}\\rho _{t}\\left( u\\right) du \\right\\vert $. This finding suggests that the answer to the long-lasting time-of-arrival problem is in fact readily available in the standard formalism, secretly hidden within the Born rule, and therefore does not require the introduction of an ad-hoc time operator or a commitment to a specific (e.g., Bohmian) ontology. The generality and versatility of the result are illustrated by applications to the time-of-arrival at a given location for a free particle in a superposed state and to the time required to reach a given velocity for a free-falling quantum particle. 
Our approach also offers a potentially promising new avenue toward the design of an experimental protocol for the yet-to-be-performed observation of the phenomenon of quantum backflow.","sentences":["Using standard results from statistics, we show that for any continuous quantum system (Gaussian or otherwise) and any observable $A$ (position or otherwise), the distribution $ \\pi _{a}\\left(t\\right) $ of a time measurement at a fixed state $a$ can be inferred from the distribution $ \\rho _{t}\\left( a\\right) $ of a state measurement at a fixed time $t$ via the transformation $ \\pi _{a}\\left( t\\right) = \\left\\vert \\frac{\\partial }{\\partial t} \\int_{-\\infty }^{a}\\rho _{t}\\left( u\\right) du \\right\\vert $.","This finding suggests that the answer to the long-lasting time-of-arrival problem is in fact readily available in the standard formalism, secretly hidden within the Born rule, and therefore does not require the introduction of an ad-hoc time operator or a commitment to a specific (e.g., Bohmian) ontology.","The generality and versatility of the result are illustrated by applications to the time-of-arrival at a given location for a free particle in a superposed state and to the time required to reach a given velocity for a free-falling quantum particle.","Our approach also offers a potentially promising new avenue toward the design of an experimental protocol for the yet-to-be-performed observation of the phenomenon of quantum backflow."],"url":"http://arxiv.org/abs/2405.02018v1","category":"quant-ph"} +{"created":"2024-05-03 11:28:21","title":"Adversarial Botometer: Adversarial Analysis for Social Bot Detection","abstract":"Social bots play a significant role in many online social networks (OSN) as they imitate human behavior. This fact raises difficult questions about their capabilities and potential risks. 
Given the recent advances in Generative AI (GenAI), social bots are capable of producing highly realistic and complex content that mimics human creativity. As the malicious social bots emerge to deceive people with their unrealistic content, identifying them and distinguishing the content they produce has become an actual challenge for numerous social platforms. Several approaches to this problem have already been proposed in the literature, but the proposed solutions have not been widely evaluated. To address this issue, we evaluate the behavior of a text-based bot detector in a competitive environment where some scenarios are proposed: \\textit{First}, the tug-of-war between a bot and a bot detector is examined. It is interesting to analyze which party is more likely to prevail and which circumstances influence these expectations. In this regard, we model the problem as a synthetic adversarial game in which a conversational bot and a bot detector are engaged in strategic online interactions. \\textit{Second}, the bot detection model is evaluated under attack examples generated by a social bot; to this end, we poison the dataset with attack examples and evaluate the model performance under this condition. \\textit{Finally}, to investigate the impact of the dataset, a cross-domain analysis is performed. 
Through our comprehensive evaluation of different categories of social bots using two benchmark datasets, we were able to demonstrate some achivement that could be utilized in future works.","sentences":["Social bots play a significant role in many online social networks (OSN) as they imitate human behavior.","This fact raises difficult questions about their capabilities and potential risks.","Given the recent advances in Generative AI (GenAI), social bots are capable of producing highly realistic and complex content that mimics human creativity.","As the malicious social bots emerge to deceive people with their unrealistic content, identifying them and distinguishing the content they produce has become an actual challenge for numerous social platforms.","Several approaches to this problem have already been proposed in the literature, but the proposed solutions have not been widely evaluated.","To address this issue, we evaluate the behavior of a text-based bot detector in a competitive environment where some scenarios are proposed: \\textit{First}, the tug-of-war between a bot and a bot detector is examined.","It is interesting to analyze which party is more likely to prevail and which circumstances influence these expectations.","In this regard, we model the problem as a synthetic adversarial game in which a conversational bot and a bot detector are engaged in strategic online interactions.","\\textit{Second}, the bot detection model is evaluated under attack examples generated by a social bot; to this end, we poison the dataset with attack examples and evaluate the model performance under this condition.","\\textit{Finally}, to investigate the impact of the dataset, a cross-domain analysis is performed.","Through our comprehensive evaluation of different categories of social bots using two benchmark datasets, we were able to demonstrate some achivement that could be utilized in future works."],"url":"http://arxiv.org/abs/2405.02016v1","category":"cs.SI"} 
+{"created":"2024-05-03 11:27:31","title":"Evaluating Production Planning and Control Systems in Different Environments: A Comparative Simulation Study","abstract":"Selecting the appropriate production planning and control systems (PPCS) presents a significant challenge for many companies, as their performance, i.e. overall costs, depends on the production system environment. Key environmental characteristics include the system's structure, i.e. flow shop, hybrid shop, or job shop, and the planned shop load. Besides selecting a suitable PPCS, its parameterization significantly influences the performance. This publication investigates the performance and the optimal parametrization of Material Requirement Planning (MRP), Reorder Point System (RPS) and Constant Work In Progress (ConWIP) at different stochastic multi-item multi-stage production system environments by conduction a comprehensive full factorial simulation study. The results indicate that MRP and ConWIP generally outperform RPS in all observed environments. Moreover, when comparing MRP with ConWIP, the performance clearly varies depending on the specific production system environment.","sentences":["Selecting the appropriate production planning and control systems (PPCS) presents a significant challenge for many companies, as their performance, i.e. overall costs, depends on the production system environment.","Key environmental characteristics include the system's structure, i.e. 
flow shop, hybrid shop, or job shop, and the planned shop load.","Besides selecting a suitable PPCS, its parameterization significantly influences the performance.","This publication investigates the performance and the optimal parametrization of Material Requirement Planning (MRP), Reorder Point System (RPS) and Constant Work In Progress (ConWIP) at different stochastic multi-item multi-stage production system environments by conduction a comprehensive full factorial simulation study.","The results indicate that MRP and ConWIP generally outperform RPS in all observed environments.","Moreover, when comparing MRP with ConWIP, the performance clearly varies depending on the specific production system environment."],"url":"http://arxiv.org/abs/2405.02015v1","category":"econ.GN"} +{"created":"2024-05-03 11:16:27","title":"DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model","abstract":"Constructing high-definition (HD) maps is a crucial requirement for enabling autonomous driving. In recent years, several map segmentation algorithms have been developed to address this need, leveraging advancements in Bird's-Eye View (BEV) perception. However, existing models still encounter challenges in producing realistic and consistent semantic map layouts. One prominent issue is the limited utilization of structured priors inherent in map segmentation masks. In light of this, we propose DiffMap, a novel approach specifically designed to model the structured priors of map segmentation masks using latent diffusion model. By incorporating this technique, the performance of existing semantic segmentation methods can be significantly enhanced and certain structural errors present in the segmentation outputs can be effectively rectified. Notably, the proposed module can be seamlessly integrated into any map segmentation model, thereby augmenting its capability to accurately delineate semantic information. 
Furthermore, through extensive visualization analysis, our model demonstrates superior proficiency in generating results that more accurately reflect real-world map layouts, further validating its efficacy in improving the quality of the generated maps.","sentences":["Constructing high-definition (HD) maps is a crucial requirement for enabling autonomous driving.","In recent years, several map segmentation algorithms have been developed to address this need, leveraging advancements in Bird's-Eye View (BEV) perception.","However, existing models still encounter challenges in producing realistic and consistent semantic map layouts.","One prominent issue is the limited utilization of structured priors inherent in map segmentation masks.","In light of this, we propose DiffMap, a novel approach specifically designed to model the structured priors of map segmentation masks using latent diffusion model.","By incorporating this technique, the performance of existing semantic segmentation methods can be significantly enhanced and certain structural errors present in the segmentation outputs can be effectively rectified.","Notably, the proposed module can be seamlessly integrated into any map segmentation model, thereby augmenting its capability to accurately delineate semantic information.","Furthermore, through extensive visualization analysis, our model demonstrates superior proficiency in generating results that more accurately reflect real-world map layouts, further validating its efficacy in improving the quality of the generated maps."],"url":"http://arxiv.org/abs/2405.02008v1","category":"cs.CV"} +{"created":"2024-05-03 11:08:09","title":"On the original composition of the gas forming first-generation stars in clusters: insights from HST and JWST","abstract":"Globular cluster (GC) stars composed of pristine material (first-generation, 1G, stars) are not chemically homogeneous, as they exhibit extended sequences in the \"Chromosome Map\" (ChM). 
Recent studies characterized 1G stars within the center of 55 Galactic GCs, revealing metallicity variations. Despite this progress, several unanswered questions persist, particularly concerning the link between the 1G metallicity spread and factors such as the radial distance from the cluster center or the host GC parameters. Additionally, it remains unclear whether the extended 1G sequence phenomenon is exclusive to old Galactic GCs with multiple populations. This work addresses these open issues, examining 1G stars in different environments. First, we combine Hubble Space Telescope (HST) and James Webb Space Telescope photometry of the GC 47 Tucanae to study 1G stars at increasing distances from the cluster center. We find that metal-rich 1G stars are more centrally concentrated than metal-poor ones, suggesting a metallicity radial gradient. Additionally, the two groups of 1G stars share similar kinematics. Since our analysis focuses on giant stars in the cluster center and M dwarfs in external fields, we discuss the possibility that the metallicity distribution depends on stellar mass. Subsequently, we analyze HST multi-band photometry of two simple-population clusters, NGC 6791 and NGC 1783, revealing elongated sequences in the ChM associated with metallicity variations. Finally, we investigate the 1G color distribution in 51 GCs, finding no connections with the host cluster parameters. 
These results shed light on the complex nature of 1G stars, providing insights into the GC formation environment.","sentences":["Globular cluster (GC) stars composed of pristine material (first-generation, 1G, stars) are not chemically homogeneous, as they exhibit extended sequences in the \"Chromosome Map\" (ChM).","Recent studies characterized 1G stars within the center of 55 Galactic GCs, revealing metallicity variations.","Despite this progress, several unanswered questions persist, particularly concerning the link between the 1G metallicity spread and factors such as the radial distance from the cluster center or the host GC parameters.","Additionally, it remains unclear whether the extended 1G sequence phenomenon is exclusive to old Galactic GCs with multiple populations.","This work addresses these open issues, examining 1G stars in different environments.","First, we combine Hubble Space Telescope (HST) and James Webb Space Telescope photometry of the GC 47 Tucanae to study 1G stars at increasing distances from the cluster center.","We find that metal-rich 1G stars are more centrally concentrated than metal-poor ones, suggesting a metallicity radial gradient.","Additionally, the two groups of 1G stars share similar kinematics.","Since our analysis focuses on giant stars in the cluster center and M dwarfs in external fields, we discuss the possibility that the metallicity distribution depends on stellar mass.","Subsequently, we analyze HST multi-band photometry of two simple-population clusters, NGC 6791 and NGC 1783, revealing elongated sequences in the ChM associated with metallicity variations.","Finally, we investigate the 1G color distribution in 51 GCs, finding no connections with the host cluster parameters.","These results shed light on the complex nature of 1G stars, providing insights into the GC formation environment."],"url":"http://arxiv.org/abs/2405.02006v1","category":"astro-ph.GA"} +{"created":"2024-05-03 10:57:14","title":"Semi-Automatic 
Infrared Calibration for Augmented Reality Systems in Surgery","abstract":"Augmented reality (AR) has the potential to improve the immersion and efficiency of computer-assisted orthopaedic surgery (CAOS) by allowing surgeons to maintain focus on the operating site rather than external displays in the operating theatre. Successful deployment of AR to CAOS requires a calibration that can accurately calculate the spatial relationship between real and holographic objects. Several studies attempt this calibration through manual alignment or with additional fiducial markers in the surgical scene. We propose a calibration system that offers a direct method for the calibration of AR head-mounted displays (HMDs) with CAOS systems, by using infrared-reflective marker-arrays widely used in CAOS. In our fast, user-agnostic setup, a HoloLens 2 detected the pose of marker arrays using infrared response and time-of-flight depth obtained through sensors onboard the HMD. Registration with a commercially available CAOS system was achieved when an IR marker-array was visible to both devices. Study tests found relative-tracking mean errors of 2.03 mm and 1.12{\\deg} when calculating the relative pose between two static marker-arrays at short ranges. 
When using the calibration result to provide in-situ holographic guidance for a simulated wire-insertion task, a pre-clinical test reported mean errors of 2.07 mm and 1.54{\\deg} when compared to a pre-planned trajectory.","sentences":["Augmented reality (AR) has the potential to improve the immersion and efficiency of computer-assisted orthopaedic surgery (CAOS) by allowing surgeons to maintain focus on the operating site rather than external displays in the operating theatre.","Successful deployment of AR to CAOS requires a calibration that can accurately calculate the spatial relationship between real and holographic objects.","Several studies attempt this calibration through manual alignment or with additional fiducial markers in the surgical scene.","We propose a calibration system that offers a direct method for the calibration of AR head-mounted displays (HMDs) with CAOS systems, by using infrared-reflective marker-arrays widely used in CAOS.","In our fast, user-agnostic setup, a HoloLens 2 detected the pose of marker arrays using infrared response and time-of-flight depth obtained through sensors onboard the HMD.","Registration with a commercially available CAOS system was achieved when an IR marker-array was visible to both devices.","Study tests found relative-tracking mean errors of 2.03 mm and 1.12{\\deg} when calculating the relative pose between two static marker-arrays at short ranges.","When using the calibration result to provide in-situ holographic guidance for a simulated wire-insertion task, a pre-clinical test reported mean errors of 2.07 mm and 1.54{\\deg} when compared to a pre-planned trajectory."],"url":"http://arxiv.org/abs/2405.01999v1","category":"cs.RO"} +{"created":"2024-05-03 10:54:58","title":"Circular geodesic orbits in the equatorial plane of a charged rotating disc of dust","abstract":"Equatorial circular geodesic orbits of neutral test particles in the exterior spacetime of the charged rotating disc of dust are analyzed in 
dependence of its specific charge and a relativity parameter. The charged rotating disc of dust is an axisymmetric, stationary solution of the Einstein-Maxwell equations in terms of a post-Newtonian expansion. In particular, photon, marginally bound and marginally stable orbits are discussed. It turns out that general formulae in closed form for angular velocity, specific energy and specific angular momentum of the test particles can be derived, which hold for any (exterior) asymptotically flat, axisymmetric, stationary and reflection symmetric (electro-)vacuum spacetime. Furthermore, circular motion in the exterior spacetime of the charged rotating disc of dust is qualitatively very similar to that around a Kerr-Newman black hole for sufficiently large radii, but differs strongly in the respective closer vicinity.","sentences":["Equatorial circular geodesic orbits of neutral test particles in the exterior spacetime of the charged rotating disc of dust are analyzed in dependence of its specific charge and a relativity parameter.","The charged rotating disc of dust is an axisymmetric, stationary solution of the Einstein-Maxwell equations in terms of a post-Newtonian expansion.","In particular, photon, marginally bound and marginally stable orbits are discussed.","It turns out that general formulae in closed form for angular velocity, specific energy and specific angular momentum of the test particles can be derived, which hold for any (exterior) asymptotically flat, axisymmetric, stationary and reflection symmetric (electro-)vacuum spacetime.","Furthermore, circular motion in the exterior spacetime of the charged rotating disc of dust is qualitatively very similar to that around a Kerr-Newman black hole for sufficiently large radii, but differs strongly in the respective closer vicinity."],"url":"http://arxiv.org/abs/2405.01998v1","category":"gr-qc"} +{"created":"2024-05-03 10:54:14","title":"Exploring Combinatorial Problem Solving with Large Language Models: A Case 
Study on the Travelling Salesman Problem Using GPT-3.5 Turbo","abstract":"Large Language Models (LLMs) are deep learning models designed to generate text based on textual input. Although researchers have been developing these models for more complex tasks such as code generation and general reasoning, few efforts have explored how LLMs can be applied to combinatorial problems. In this research, we investigate the potential of LLMs to solve the Travelling Salesman Problem (TSP). Utilizing GPT-3.5 Turbo, we conducted experiments employing various approaches, including zero-shot in-context learning, few-shot in-context learning, and chain-of-thoughts (CoT). Consequently, we fine-tuned GPT-3.5 Turbo to solve a specific problem size and tested it using a set of various instance sizes. The fine-tuned models demonstrated promising performance on problems identical in size to the training instances and generalized well to larger problems. Furthermore, to improve the performance of the fine-tuned model without incurring additional training costs, we adopted a self-ensemble approach to improve the quality of the solutions.","sentences":["Large Language Models (LLMs) are deep learning models designed to generate text based on textual input.","Although researchers have been developing these models for more complex tasks such as code generation and general reasoning, few efforts have explored how LLMs can be applied to combinatorial problems.","In this research, we investigate the potential of LLMs to solve the Travelling Salesman Problem (TSP).","Utilizing GPT-3.5 Turbo, we conducted experiments employing various approaches, including zero-shot in-context learning, few-shot in-context learning, and chain-of-thoughts (CoT).","Consequently, we fine-tuned GPT-3.5 Turbo to solve a specific problem size and tested it using a set of various instance sizes.","The fine-tuned models demonstrated promising performance on problems identical in size to the training instances and 
generalized well to larger problems.","Furthermore, to improve the performance of the fine-tuned model without incurring additional training costs, we adopted a self-ensemble approach to improve the quality of the solutions."],"url":"http://arxiv.org/abs/2405.01997v1","category":"cs.CL"} +{"created":"2024-05-03 10:47:28","title":"Gaussian orbital perturbation theory in Schwarzschild space-time in terms of elliptic functions","abstract":"General relativistic Gauss equations for osculating elements for bound orbits under the influence of a perturbing force in an underlying Schwarzschild space-time have been derived in terms of Weierstrass elliptic functions. Thereby, the perturbation forces are restricted to act within the orbital plane only. These equations are analytically solved in linear approximation for several different perturbations such as cosmological constant perturbations, quantum corrections to the Schwarzschild metric, and hybrid Schwarzschild/post-Newtonian $2.5$ order self-forces for binary systems in an effective one-body framework.","sentences":["General relativistic Gauss equations for osculating elements for bound orbits under the influence of a perturbing force in an underlying Schwarzschild space-time have been derived in terms of Weierstrass elliptic functions.","Thereby, the perturbation forces are restricted to act within the orbital plane only.","These equations are analytically solved in linear approximation for several different perturbations such as cosmological constant perturbations, quantum corrections to the Schwarzschild metric, and hybrid Schwarzschild/post-Newtonian $2.5$ order self-forces for binary systems in an effective one-body framework."],"url":"http://arxiv.org/abs/2405.01991v1","category":"gr-qc"} +{"created":"2024-05-03 10:43:48","title":"Parameter estimation in ODEs: assessing the potential of local and global solvers","abstract":"We consider the problem of parameter estimation in dynamic systems described by ordinary 
differential equations. A review of the existing literature emphasizes the need for deterministic global optimization methods due to the nonconvex nature of these problems. Recent works have focused on expanding the capabilities of specialized deterministic global optimization algorithms to handle more complex problems. Despite advancements, current deterministic methods are limited to problems with a maximum of around five state and five decision variables, prompting ongoing efforts to enhance their applicability to practical problems. Our study seeks to assess the effectiveness of state-of-the-art general-purpose global and local solvers in handling realistic-sized problems efficiently, and evaluating their capabilities to cope with the nonconvex nature of the underlying estimation problems.","sentences":["We consider the problem of parameter estimation in dynamic systems described by ordinary differential equations.","A review of the existing literature emphasizes the need for deterministic global optimization methods due to the nonconvex nature of these problems.","Recent works have focused on expanding the capabilities of specialized deterministic global optimization algorithms to handle more complex problems.","Despite advancements, current deterministic methods are limited to problems with a maximum of around five state and five decision variables, prompting ongoing efforts to enhance their applicability to practical problems. ","Our study seeks to assess the effectiveness of state-of-the-art general-purpose global and local solvers in handling realistic-sized problems efficiently, and evaluating their capabilities to cope with the nonconvex nature of the underlying estimation problems."],"url":"http://arxiv.org/abs/2405.01989v1","category":"math.OC"} +{"created":"2024-05-03 10:42:17","title":"Joint sentiment analysis of lyrics and audio in music","abstract":"Sentiment or mood can express themselves on various levels in music. 
In automatic analysis, the actual audio data is usually analyzed, but the lyrics can also play a crucial role in the perception of moods. We first evaluate various models for sentiment analysis based on lyrics and audio separately. The corresponding approaches already show satisfactory results, but they also exhibit weaknesses, the causes of which we examine in more detail. Furthermore, different approaches to combining the audio and lyrics results are proposed and evaluated. Considering both modalities generally leads to improved performance. We investigate misclassifications and (also intentional) contradictions between audio and lyrics sentiment more closely, and identify possible causes. Finally, we address fundamental problems in this research area, such as high subjectivity, lack of data, and inconsistency in emotion taxonomies.","sentences":["Sentiment or mood can express themselves on various levels in music.","In automatic analysis, the actual audio data is usually analyzed, but the lyrics can also play a crucial role in the perception of moods.","We first evaluate various models for sentiment analysis based on lyrics and audio separately.","The corresponding approaches already show satisfactory results, but they also exhibit weaknesses, the causes of which we examine in more detail.","Furthermore, different approaches to combining the audio and lyrics results are proposed and evaluated.","Considering both modalities generally leads to improved performance.","We investigate misclassifications and (also intentional) contradictions between audio and lyrics sentiment more closely, and identify possible causes.","Finally, we address fundamental problems in this research area, such as high subjectivity, lack of data, and inconsistency in emotion taxonomies."],"url":"http://arxiv.org/abs/2405.01988v1","category":"cs.SD"} +{"created":"2024-05-03 10:37:34","title":"A Penalty-Based Guardrail Algorithm for Non-Decreasing Optimization with Inequality 
Constraints","abstract":"Traditional mathematical programming solvers require long computational times to solve constrained minimization problems of complex and large-scale physical systems. Therefore, these problems are often transformed into unconstrained ones, and solved with computationally efficient optimization approaches based on first-order information, such as the gradient descent method. However, for unconstrained problems, balancing the minimization of the objective function with the reduction of constraint violations is challenging. We consider the class of time-dependent minimization problems with increasing (possibly) nonlinear and non-convex objective function and non-decreasing (possibly) nonlinear and non-convex inequality constraints. To efficiently solve them, we propose a penalty-based guardrail algorithm (PGA). This algorithm adapts a standard penalty-based method by dynamically updating the right-hand side of the constraints with a guardrail variable which adds a margin to prevent violations. We evaluate PGA on two novel application domains: a simplified model of a district heating system and an optimization model derived from learned deep neural networks. 
Our method significantly outperforms mathematical programming solvers and the standard penalty-based method, and achieves better performance and faster convergence than a state-of-the-art algorithm (IPDD) within a specified time limit.","sentences":["Traditional mathematical programming solvers require long computational times to solve constrained minimization problems of complex and large-scale physical systems.","Therefore, these problems are often transformed into unconstrained ones, and solved with computationally efficient optimization approaches based on first-order information, such as the gradient descent method.","However, for unconstrained problems, balancing the minimization of the objective function with the reduction of constraint violations is challenging.","We consider the class of time-dependent minimization problems with increasing (possibly) nonlinear and non-convex objective function and non-decreasing (possibly) nonlinear and non-convex inequality constraints.","To efficiently solve them, we propose a penalty-based guardrail algorithm (PGA).","This algorithm adapts a standard penalty-based method by dynamically updating the right-hand side of the constraints with a guardrail variable which adds a margin to prevent violations.","We evaluate PGA on two novel application domains: a simplified model of a district heating system and an optimization model derived from learned deep neural networks.","Our method significantly outperforms mathematical programming solvers and the standard penalty-based method, and achieves better performance and faster convergence than a state-of-the-art algorithm (IPDD) within a specified time limit."],"url":"http://arxiv.org/abs/2405.01984v1","category":"math.OC"} +{"created":"2024-05-03 10:24:33","title":"Model-based reinforcement learning for protein backbone design","abstract":"Designing protein nanomaterials of predefined shape and characteristics has the potential to dramatically impact the medical industry. 
Machine learning (ML) has proven successful in protein design, reducing the need for expensive wet lab experiment rounds. However, challenges persist in efficiently exploring the protein fitness landscapes to identify optimal protein designs. In response, we propose the use of AlphaZero to generate protein backbones, meeting shape and structural scoring requirements. We extend an existing Monte Carlo tree search (MCTS) framework by incorporating a novel threshold-based reward and secondary objectives to improve design precision. This innovation considerably outperforms existing approaches, leading to protein backbones that better respect structural scores. The application of AlphaZero is novel in the context of protein backbone design and demonstrates promising performance. AlphaZero consistently surpasses baseline MCTS by more than 100% in top-down protein design tasks. Additionally, our application of AlphaZero with secondary objectives uncovers further promising outcomes, indicating the potential of model-based reinforcement learning (RL) in navigating the intricate and nuanced aspects of protein design","sentences":["Designing protein nanomaterials of predefined shape and characteristics has the potential to dramatically impact the medical industry.","Machine learning (ML) has proven successful in protein design, reducing the need for expensive wet lab experiment rounds.","However, challenges persist in efficiently exploring the protein fitness landscapes to identify optimal protein designs.","In response, we propose the use of AlphaZero to generate protein backbones, meeting shape and structural scoring requirements.","We extend an existing Monte Carlo tree search (MCTS) framework by incorporating a novel threshold-based reward and secondary objectives to improve design precision.","This innovation considerably outperforms existing approaches, leading to protein backbones that better respect structural scores.","The application of AlphaZero is novel in the 
context of protein backbone design and demonstrates promising performance.","AlphaZero consistently surpasses baseline MCTS by more than 100% in top-down protein design tasks.","Additionally, our application of AlphaZero with secondary objectives uncovers further promising outcomes, indicating the potential of model-based reinforcement learning (RL) in navigating the intricate and nuanced aspects of protein design"],"url":"http://arxiv.org/abs/2405.01983v1","category":"cs.AI"} +{"created":"2024-05-03 10:14:41","title":"Graph Neural Network based Active and Passive Beamforming for Distributed STAR-RIS-Assisted Multi-User MISO Systems","abstract":"This paper investigates a joint active and passive beamforming design for distributed simultaneous transmitting and reflecting (STAR) reconfigurable intelligent surface (RIS) assisted multi-user (MU)- mutiple input single output (MISO) systems, where the energy splitting (ES) mode is considered for the STAR-RIS. We aim to design the active beamforming vectors at the base station (BS) and the passive beamforming at the STAR-RIS to maximize the user sum rate under transmitting power constraints. The formulated problem is non-convex and nontrivial to obtain the global optimum due to the coupling between active beamforming vectors and STAR-RIS phase shifts. To efficiently solve the problem, we propose a novel graph neural network (GNN)-based framework. Specifically, we first model the interactions among users and network entities are using a heterogeneous graph representation. A heterogeneous graph neural network (HGNN) implementation is then introduced to directly optimizes beamforming vectors and STAR-RIS coefficients with the system objective. Numerical results show that the proposed approach yields efficient performance compared to the previous benchmarks. 
Furthermore, the proposed GNN is scalable with various system configurations.","sentences":["This paper investigates a joint active and passive beamforming design for distributed simultaneous transmitting and reflecting (STAR) reconfigurable intelligent surface (RIS) assisted multi-user (MU)- mutiple input single output (MISO) systems, where the energy splitting (ES) mode is considered for the STAR-RIS.","We aim to design the active beamforming vectors at the base station (BS) and the passive beamforming at the STAR-RIS to maximize the user sum rate under transmitting power constraints.","The formulated problem is non-convex and nontrivial to obtain the global optimum due to the coupling between active beamforming vectors and STAR-RIS phase shifts.","To efficiently solve the problem, we propose a novel graph neural network (GNN)-based framework.","Specifically, we first model the interactions among users and network entities are using a heterogeneous graph representation.","A heterogeneous graph neural network (HGNN) implementation is then introduced to directly optimizes beamforming vectors and STAR-RIS coefficients with the system objective.","Numerical results show that the proposed approach yields efficient performance compared to the previous benchmarks.","Furthermore, the proposed GNN is scalable with various system configurations."],"url":"http://arxiv.org/abs/2405.01979v1","category":"cs.IT"} +{"created":"2024-05-03 10:05:31","title":"Quantifying Distribution Shifts and Uncertainties for Enhanced Model Robustness in Machine Learning Applications","abstract":"Distribution shifts, where statistical properties differ between training and test datasets, present a significant challenge in real-world machine learning applications where they directly impact model generalization and robustness. In this study, we explore model adaptation and generalization by utilizing synthetic data to systematically address distributional disparities. 
Our investigation aims to identify the prerequisites for successful model adaptation across diverse data distributions, while quantifying the associated uncertainties. Specifically, we generate synthetic data using the Van der Waals equation for gases and employ quantitative measures such as Kullback-Leibler divergence, Jensen-Shannon distance, and Mahalanobis distance to assess data similarity. These metrics en able us to evaluate both model accuracy and quantify the associated uncertainty in predictions arising from data distribution shifts. Our findings suggest that utilizing statistical measures, such as the Mahalanobis distance, to determine whether model predictions fall within the low-error \"interpolation regime\" or the high-error \"extrapolation regime\" provides a complementary method for assessing distribution shift and model uncertainty. These insights hold significant value for enhancing model robustness and generalization, essential for the successful deployment of machine learning applications in real-world scenarios.","sentences":["Distribution shifts, where statistical properties differ between training and test datasets, present a significant challenge in real-world machine learning applications where they directly impact model generalization and robustness.","In this study, we explore model adaptation and generalization by utilizing synthetic data to systematically address distributional disparities.","Our investigation aims to identify the prerequisites for successful model adaptation across diverse data distributions, while quantifying the associated uncertainties.","Specifically, we generate synthetic data using the Van der Waals equation for gases and employ quantitative measures such as Kullback-Leibler divergence, Jensen-Shannon distance, and Mahalanobis distance to assess data similarity.","These metrics en able us to evaluate both model accuracy and quantify the associated uncertainty in predictions arising from data distribution 
shifts.","Our findings suggest that utilizing statistical measures, such as the Mahalanobis distance, to determine whether model predictions fall within the low-error \"interpolation regime\" or the high-error \"extrapolation regime\" provides a complementary method for assessing distribution shift and model uncertainty.","These insights hold significant value for enhancing model robustness and generalization, essential for the successful deployment of machine learning applications in real-world scenarios."],"url":"http://arxiv.org/abs/2405.01978v1","category":"cs.LG"} +{"created":"2024-05-03 09:57:44","title":"Multitask Extension of Geometrically Aligned Transfer Encoder","abstract":"Molecular datasets often suffer from a lack of data. It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved. Here, we leverage mutual information across different tasks in molecular data to address this issue. We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transfer Encoder (GATE), to a multi-task setup. 
Thus, we connect multiple molecular tasks by aligning the curved coordinates onto locally flat coordinates, ensuring the flow of information from source tasks to support performance on target data.","sentences":["Molecular datasets often suffer from a lack of data.","It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved.","Here, we leverage mutual information across different tasks in molecular data to address this issue.","We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transfer Encoder (GATE), to a multi-task setup.","Thus, we connect multiple molecular tasks by aligning the curved coordinates onto locally flat coordinates, ensuring the flow of information from source tasks to support performance on target data."],"url":"http://arxiv.org/abs/2405.01974v1","category":"cs.LG"} +{"created":"2024-05-03 09:54:10","title":"A quantitative and typological study of Early Slavic participle clauses and their competition","abstract":"This thesis is a corpus-based, quantitative, and typological analysis of the functions of Early Slavic participle constructions and their finite competitors ($jegda$-'when'-clauses). The first part leverages detailed linguistic annotation on Early Slavic corpora at the morphosyntactic, dependency, information-structural, and lexical levels to obtain indirect evidence for different potential functions of participle clauses and their main finite competitor and understand the roles of compositionality and default discourse reasoning as explanations for the distribution of participle constructions and $jegda$-clauses in the corpus. The second part uses massively parallel data to analyze typological variation in how languages express the semantic space of English $when$, whose scope encompasses that of Early Slavic participle constructions and $jegda$-clauses. 
Probabilistic semantic maps are generated and statistical methods (including Kriging, Gaussian Mixture Modelling, precision and recall analysis) are used to induce cross-linguistically salient dimensions from the parallel corpus and to study conceptual variation within the semantic space of the hypothetical concept WHEN.","sentences":["This thesis is a corpus-based, quantitative, and typological analysis of the functions of Early Slavic participle constructions and their finite competitors ($jegda$-'when'-clauses).","The first part leverages detailed linguistic annotation on Early Slavic corpora at the morphosyntactic, dependency, information-structural, and lexical levels to obtain indirect evidence for different potential functions of participle clauses and their main finite competitor and understand the roles of compositionality and default discourse reasoning as explanations for the distribution of participle constructions and $jegda$-clauses in the corpus.","The second part uses massively parallel data to analyze typological variation in how languages express the semantic space of English $when$, whose scope encompasses that of Early Slavic participle constructions and $jegda$-clauses.","Probabilistic semantic maps are generated and statistical methods (including Kriging, Gaussian Mixture Modelling, precision and recall analysis) are used to induce cross-linguistically salient dimensions from the parallel corpus and to study conceptual variation within the semantic space of the hypothetical concept WHEN."],"url":"http://arxiv.org/abs/2405.01972v1","category":"cs.CL"} +{"created":"2024-05-03 09:45:09","title":"Real-time multichannel deep speech enhancement in hearing aids: Comparing monaural and binaural processing in complex acoustic scenarios","abstract":"Deep learning has the potential to enhance speech signals and increase their intelligibility for users of hearing aids. 
Deep models suited for real-world application should feature a low computational complexity and low processing delay of only a few milliseconds. In this paper, we explore deep speech enhancement that matches these requirements and contrast monaural and binaural processing algorithms in two complex acoustic scenes. Both algorithms are evaluated with objective metrics and in experiments with hearing-impaired listeners performing a speech-in-noise test. Results are compared to two traditional enhancement strategies, i.e., adaptive differential microphone processing and binaural beamforming. While in diffuse noise, all algorithms perform similarly, the binaural deep learning approach performs best in the presence of spatial interferers. Through a post-analysis, this can be attributed to improvements at low SNRs and to precise spatial filtering.","sentences":["Deep learning has the potential to enhance speech signals and increase their intelligibility for users of hearing aids.","Deep models suited for real-world application should feature a low computational complexity and low processing delay of only a few milliseconds.","In this paper, we explore deep speech enhancement that matches these requirements and contrast monaural and binaural processing algorithms in two complex acoustic scenes.","Both algorithms are evaluated with objective metrics and in experiments with hearing-impaired listeners performing a speech-in-noise test.","Results are compared to two traditional enhancement strategies, i.e., adaptive differential microphone processing and binaural beamforming.","While in diffuse noise, all algorithms perform similarly, the binaural deep learning approach performs best in the presence of spatial interferers.","Through a post-analysis, this can be attributed to improvements at low SNRs and to precise spatial filtering."],"url":"http://arxiv.org/abs/2405.01967v1","category":"eess.AS"} +{"created":"2024-05-03 09:41:39","title":"Understanding LLMs Requires More Than 
Statistical Generalization","abstract":"The last decade has seen blossoming research in deep learning theory attempting to answer, \"Why does deep learning generalize?\" A powerful shift in perspective precipitated this progress: the study of overparametrized models in the interpolation regime. In this paper, we argue that another perspective shift is due, since some of the desirable qualities of LLMs are not a consequence of good statistical generalization and require a separate theoretical explanation. Our core argument relies on the observation that AR probabilistic models are inherently non-identifiable: models zero or near-zero KL divergence apart -- thus, equivalent test loss -- can exhibit markedly different behaviors. We support our position with mathematical examples and empirical observations, illustrating why non-identifiability has practical relevance through three case studies: (1) the non-identifiability of zero-shot rule extrapolation; (2) the approximate non-identifiability of in-context learning; and (3) the non-identifiability of fine-tunability. 
We review promising research directions focusing on LLM-relevant generalization measures, transferability, and inductive biases.","sentences":["The last decade has seen blossoming research in deep learning theory attempting to answer, \"Why does deep learning generalize?\"","A powerful shift in perspective precipitated this progress: the study of overparametrized models in the interpolation regime.","In this paper, we argue that another perspective shift is due, since some of the desirable qualities of LLMs are not a consequence of good statistical generalization and require a separate theoretical explanation.","Our core argument relies on the observation that AR probabilistic models are inherently non-identifiable: models zero or near-zero KL divergence apart -- thus, equivalent test loss -- can exhibit markedly different behaviors.","We support our position with mathematical examples and empirical observations, illustrating why non-identifiability has practical relevance through three case studies: (1) the non-identifiability of zero-shot rule extrapolation; (2) the approximate non-identifiability of in-context learning; and (3) the non-identifiability of fine-tunability.","We review promising research directions focusing on LLM-relevant generalization measures, transferability, and inductive biases."],"url":"http://arxiv.org/abs/2405.01964v1","category":"stat.ML"} +{"created":"2024-05-03 09:40:47","title":"From Attack to Defense: Insights into Deep Learning Security Measures in Black-Box Settings","abstract":"Deep Learning (DL) is rapidly maturing to the point that it can be used in safety- and security-crucial applications. However, adversarial samples, which are undetectable to the human eye, pose a serious threat that can cause the model to misbehave and compromise the performance of such applications. Addressing the robustness of DL models has become crucial to understanding and defending against adversarial attacks. 
In this study, we perform comprehensive experiments to examine the effect of adversarial attacks and defenses on various model architectures across well-known datasets. Our research focuses on black-box attacks such as SimBA, HopSkipJump, MGAAttack, and boundary attacks, as well as preprocessor-based defensive mechanisms, including bits squeezing, median smoothing, and JPEG filter. Experimenting with various models, our results demonstrate that the level of noise needed for the attack increases as the number of layers increases. Moreover, the attack success rate decreases as the number of layers increases. This indicates that model complexity and robustness have a significant relationship. Investigating the diversity and robustness relationship, our experiments with diverse models show that having a large number of parameters does not imply higher robustness. Our experiments extend to show the effects of the training dataset on model robustness. Using various datasets such as ImageNet-1000, CIFAR-100, and CIFAR-10 are used to evaluate the black-box attacks. Considering the multiple dimensions of our analysis, e.g., model complexity and training dataset, we examined the behavior of black-box attacks when models apply defenses. Our results show that applying defense strategies can significantly reduce attack effectiveness. 
This research provides in-depth analysis and insight into the robustness of DL models against various attacks, and defenses.","sentences":["Deep Learning (DL) is rapidly maturing to the point that it can be used in safety- and security-crucial applications.","However, adversarial samples, which are undetectable to the human eye, pose a serious threat that can cause the model to misbehave and compromise the performance of such applications.","Addressing the robustness of DL models has become crucial to understanding and defending against adversarial attacks.","In this study, we perform comprehensive experiments to examine the effect of adversarial attacks and defenses on various model architectures across well-known datasets.","Our research focuses on black-box attacks such as SimBA, HopSkipJump, MGAAttack, and boundary attacks, as well as preprocessor-based defensive mechanisms, including bits squeezing, median smoothing, and JPEG filter.","Experimenting with various models, our results demonstrate that the level of noise needed for the attack increases as the number of layers increases.","Moreover, the attack success rate decreases as the number of layers increases.","This indicates that model complexity and robustness have a significant relationship.","Investigating the diversity and robustness relationship, our experiments with diverse models show that having a large number of parameters does not imply higher robustness.","Our experiments extend to show the effects of the training dataset on model robustness.","Using various datasets such as ImageNet-1000, CIFAR-100, and CIFAR-10 are used to evaluate the black-box attacks.","Considering the multiple dimensions of our analysis, e.g., model complexity and training dataset, we examined the behavior of black-box attacks when models apply defenses.","Our results show that applying defense strategies can significantly reduce attack effectiveness.","This research provides in-depth analysis and insight into the 
robustness of DL models against various attacks, and defenses."],"url":"http://arxiv.org/abs/2405.01963v1","category":"cs.CR"} +{"created":"2024-05-03 09:39:54","title":"Optical skyrmions from metafibers","abstract":"Optical skyrmions are an emerging class of structured light with sophisticated particle-like topologies with great potential for revolutionizing modern informatics. However, the current generation of optical skyrmions involves complex or bulky systems, hindering their development of practical applications. Here, exploiting the emergent \"lab-on-fiber\" technology, we demonstrate the design of a metafiber-integrated photonic skyrmion generator. We not only successfully generated high-quality optical skyrmions from metafibers, but also experimentally verified their remarkable properties, such as regulability and topological stability with deep-subwavelength features beyond the diffraction limits. Our flexible and fiber-integrated optical skyrmions platform paves the avenue for future applications of topologically-enhanced remote super-resolution microscopy and super-robust information transfer.","sentences":["Optical skyrmions are an emerging class of structured light with sophisticated particle-like topologies with great potential for revolutionizing modern informatics.","However, the current generation of optical skyrmions involves complex or bulky systems, hindering their development of practical applications.","Here, exploiting the emergent \"lab-on-fiber\" technology, we demonstrate the design of a metafiber-integrated photonic skyrmion generator.","We not only successfully generated high-quality optical skyrmions from metafibers, but also experimentally verified their remarkable properties, such as regulability and topological stability with deep-subwavelength features beyond the diffraction limits.","Our flexible and fiber-integrated optical skyrmions platform paves the avenue for future applications of topologically-enhanced remote 
super-resolution microscopy and super-robust information transfer."],"url":"http://arxiv.org/abs/2405.01962v1","category":"physics.optics"} +{"created":"2024-05-03 09:33:18","title":"An analysis and solution of ill-conditioning in physics-informed neural networks","abstract":"Physics-informed neural networks (PINNs) have recently emerged as a novel and popular approach for solving forward and inverse problems involving partial differential equations (PDEs). However, achieving stable training and obtaining correct results remain a challenge in many cases, often attributed to the ill-conditioning of PINNs. Nonetheless, further analysis is still lacking, severely limiting the progress and applications of PINNs in complex engineering problems. Drawing inspiration from the ill-conditioning analysis in traditional numerical methods, we establish a connection between the ill-conditioning of PINNs and the ill-conditioning of the Jacobian matrix of the PDE system. Specifically, for any given PDE system, we construct its controlled system. This controlled system allows for adjustment of the condition number of the Jacobian matrix while retaining the same solution as the original system. Our numerical findings suggest that the ill-conditioning observed in PINNs predominantly stems from that of the Jacobian matrix. As the condition number of the Jacobian matrix decreases, the controlled systems exhibit faster convergence rates and higher accuracy. Building upon this understanding and the natural extension of controlled systems, we present a general approach to mitigate the ill-conditioning of PINNs, leading to successful simulations of the three-dimensional flow around the M6 wing at a Reynolds number of 5,000. To the best of our knowledge, this is the first time that PINNs have been successful in simulating such complex systems, offering a promising new technique for addressing industrial complexity problems. 
Our findings also offer valuable insights guiding the future development of PINNs.","sentences":["Physics-informed neural networks (PINNs) have recently emerged as a novel and popular approach for solving forward and inverse problems involving partial differential equations (PDEs).","However, achieving stable training and obtaining correct results remain a challenge in many cases, often attributed to the ill-conditioning of PINNs.","Nonetheless, further analysis is still lacking, severely limiting the progress and applications of PINNs in complex engineering problems.","Drawing inspiration from the ill-conditioning analysis in traditional numerical methods, we establish a connection between the ill-conditioning of PINNs and the ill-conditioning of the Jacobian matrix of the PDE system.","Specifically, for any given PDE system, we construct its controlled system.","This controlled system allows for adjustment of the condition number of the Jacobian matrix while retaining the same solution as the original system.","Our numerical findings suggest that the ill-conditioning observed in PINNs predominantly stems from that of the Jacobian matrix.","As the condition number of the Jacobian matrix decreases, the controlled systems exhibit faster convergence rates and higher accuracy.","Building upon this understanding and the natural extension of controlled systems, we present a general approach to mitigate the ill-conditioning of PINNs, leading to successful simulations of the three-dimensional flow around the M6 wing at a Reynolds number of 5,000.","To the best of our knowledge, this is the first time that PINNs have been successful in simulating such complex systems, offering a promising new technique for addressing industrial complexity problems.","Our findings also offer valuable insights guiding the future development of PINNs."],"url":"http://arxiv.org/abs/2405.01957v1","category":"physics.flu-dyn"} +{"created":"2024-05-03 09:28:06","title":"Mahler equations for 
Zeckendorf numeration","abstract":"We define generalised equations of Z-Mahler type, based on the Zeckendorf numeration system. We show that if a sequence over a commutative ring is Z-regular, then it is the sequence of coefficients of a series which is a solution of a Z-Mahler equation. Conversely, if the Z-Mahler equation is isolating, then its solutions define Z-regular sequences. This is a generalisation of results of Becker and Dumas. We provide an example to show that there exist non-isolating Z-Mahler equations whose solutions do not define Z-regular sequences. Our proof yields a new construction of weighted automata that generate classical q-regular sequences.","sentences":["We define generalised equations of Z-Mahler type, based on the Zeckendorf numeration system.","We show that if a sequence over a commutative ring is Z-regular, then it is the sequence of coefficients of a series which is a solution of a Z-Mahler equation.","Conversely, if the Z-Mahler equation is isolating, then its solutions define Z-regular sequences.","This is a generalisation of results of Becker and Dumas.","We provide an example to show that there exist non-isolating Z-Mahler equations whose solutions do not define Z-regular sequences.","Our proof yields a new construction of weighted automata that generate classical q-regular sequences."],"url":"http://arxiv.org/abs/2405.01953v1","category":"math.NT"} +{"created":"2024-05-03 09:27:31","title":"Three Quantization Regimes for ReLU Networks","abstract":"We establish the fundamental limits in the approximation of Lipschitz functions by deep ReLU neural networks with finite-precision weights. Specifically, three regimes, namely under-, over-, and proper quantization, in terms of minimax approximation error behavior as a function of network weight precision, are identified. This is accomplished by deriving nonasymptotic tight lower and upper bounds on the minimax approximation error. 
Notably, in the proper-quantization regime, neural networks exhibit memory-optimality in the approximation of Lipschitz functions. Deep networks have an inherent advantage over shallow networks in achieving memory-optimality. We also develop the notion of depth-precision tradeoff, showing that networks with high-precision weights can be converted into functionally equivalent deeper networks with low-precision weights, while preserving memory-optimality. This idea is reminiscent of sigma-delta analog-to-digital conversion, where oversampling rate is traded for resolution in the quantization of signal samples. We improve upon the best-known ReLU network approximation results for Lipschitz functions and describe a refinement of the bit extraction technique which could be of independent general interest.","sentences":["We establish the fundamental limits in the approximation of Lipschitz functions by deep ReLU neural networks with finite-precision weights.","Specifically, three regimes, namely under-, over-, and proper quantization, in terms of minimax approximation error behavior as a function of network weight precision, are identified.","This is accomplished by deriving nonasymptotic tight lower and upper bounds on the minimax approximation error.","Notably, in the proper-quantization regime, neural networks exhibit memory-optimality in the approximation of Lipschitz functions.","Deep networks have an inherent advantage over shallow networks in achieving memory-optimality.","We also develop the notion of depth-precision tradeoff, showing that networks with high-precision weights can be converted into functionally equivalent deeper networks with low-precision weights, while preserving memory-optimality.","This idea is reminiscent of sigma-delta analog-to-digital conversion, where oversampling rate is traded for resolution in the quantization of signal samples.","We improve upon the best-known ReLU network approximation results for Lipschitz functions and describe a 
refinement of the bit extraction technique which could be of independent general interest."],"url":"http://arxiv.org/abs/2405.01952v1","category":"stat.ML"} +{"created":"2024-05-03 09:24:29","title":"Gaussian Lagrangian Galaxy Bias","abstract":"Understanding $\\textit{galaxy bias}$ -- that is the statistical relation between matter and galaxies -- is of key importance for extracting cosmological information from galaxy surveys. While the bias function $f$ is usually approximated through a parametric expansion, we show here, that it can also be measured directly from simulations in a non-parameteric way. Our measurements show that the Lagrangian bias function is very close to a Gaussian for halo selections of any mass. Therefore, we newly introduce a Gaussian bias model with several intriguing properties: (1) It predicts only strictly positive probabilities $f > 0$ (unlike expansion models), (2) It has a simple analytic renormalized form and (3) It behaves gracefully in many scenarios where the classical expansion converges poorly. We show that the Gaussian bias model describes the galaxy environment distribution $p(\\delta | \\mathrm{g})$, the scale dependent bias function $f$ and the renormalized bias function $F$ of haloes and galaxies generally equally well or significantly better than a second order expansion with the same number of parameters. 
We suggest that a Gaussian bias approach may enhance the range of validity of bias schemes where the canonical expansion converges poorly and further, that it may make new applications possible, since it guarantees the positivity of predicted galaxy densities.","sentences":["Understanding $\\textit{galaxy bias}$ -- that is the statistical relation between matter and galaxies -- is of key importance for extracting cosmological information from galaxy surveys.","While the bias function $f$ is usually approximated through a parametric expansion, we show here, that it can also be measured directly from simulations in a non-parameteric way.","Our measurements show that the Lagrangian bias function is very close to a Gaussian for halo selections of any mass.","Therefore, we newly introduce a Gaussian bias model with several intriguing properties: (1) It predicts only strictly positive probabilities $f > 0$ (unlike expansion models), (2) It has a simple analytic renormalized form and (3) It behaves gracefully in many scenarios where the classical expansion converges poorly.","We show that the Gaussian bias model describes the galaxy environment distribution $p(\\delta | \\mathrm{g})$, the scale dependent bias function $f$ and the renormalized bias function $F$ of haloes and galaxies generally equally well or significantly better than a second order expansion with the same number of parameters.","We suggest that a Gaussian bias approach may enhance the range of validity of bias schemes where the canonical expansion converges poorly and further, that it may make new applications possible, since it guarantees the positivity of predicted galaxy densities."],"url":"http://arxiv.org/abs/2405.01951v1","category":"astro-ph.CO"} +{"created":"2024-05-03 09:22:20","title":"Bubble wall velocity and gravitational wave in the minimal left-right symmetric model","abstract":"The bubble wall velocity in the first order phase transition plays an important role in determining both the 
amplitude and the pivot frequency of stochastic gravitational wave background. In the framework of the minimal left-right symmetric model, we study the wall velocity when the first order phase transition can occur. The wall velocity can be determined by matching the distribution functions in the free particle approximation and the local thermal equilibrium approximation. It is found that the wall velocity can be determined in the range $ 0.2 < v_w < 0.5 $ for the parameter space with the first order phase transition. It is also found that for the case when the wall velocity is close to the speed of sound, the peak amplitude of gravitational wave spectrum can be larger than that in the runaway case. Moreover, It is also found that there exists an approximate power law between the wall velocity and pressure difference between broken and symmetry phases, and the power index is equal to 0.41 or so.","sentences":["The bubble wall velocity in the first order phase transition plays an important role in determining both the amplitude and the pivot frequency of stochastic gravitational wave background.","In the framework of the minimal left-right symmetric model, we study the wall velocity when the first order phase transition can occur.","The wall velocity can be determined by matching the distribution functions in the free particle approximation and the local thermal equilibrium approximation.","It is found that the wall velocity can be determined in the range $ 0.2 < v_w < 0.5 $ for the parameter space with the first order phase transition.","It is also found that for the case when the wall velocity is close to the speed of sound, the peak amplitude of gravitational wave spectrum can be larger than that in the runaway case.","Moreover, It is also found that there exists an approximate power law between the wall velocity and pressure difference between broken and symmetry phases, and the power index is equal to 0.41 or 
so."],"url":"http://arxiv.org/abs/2405.01949v1","category":"gr-qc"} +{"created":"2024-05-03 09:21:13","title":"Common Randomness Generation from Sources with Infinite Polish Alphabet","abstract":"We investigate the problem of common randomness (CR) generation in the basic two-party communication setting in which a sender and a receiver aim to agree on a common random variable with high probability. The terminals observe independent and identically distributed (i.i.d.) samples of sources with an arbitrary distribution defined on a Polish alphabet and are allowed to communicate as little as possible over a noisy, memoryless channel. We establish single-letter upper and lower bounds on the CR capacity for the specified model. The derived bounds hold with equality except for at most countably many points where discontinuity issues might arise.","sentences":["We investigate the problem of common randomness (CR) generation in the basic two-party communication setting in which a sender and a receiver aim to agree on a common random variable with high probability.","The terminals observe independent and identically distributed (i.i.d.)","samples of sources with an arbitrary distribution defined on a Polish alphabet and are allowed to communicate as little as possible over a noisy, memoryless channel.","We establish single-letter upper and lower bounds on the CR capacity for the specified model.","The derived bounds hold with equality except for at most countably many points where discontinuity issues might arise."],"url":"http://arxiv.org/abs/2405.01948v1","category":"cs.IT"} +{"created":"2024-05-03 09:13:13","title":"Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large Language Models","abstract":"The rapid advancement in Large Language Models (LLMs) has markedly enhanced the capabilities of language understanding and generation. 
However, the substantial model size poses hardware challenges, affecting both memory size for serving and inference latency for token generation. To address those challenges, we propose Dependency-aware Semi-structured Sparsity (DaSS), a novel method for the recent prevalent SwiGLU-based LLMs pruning. Our approach incorporates structural dependency into the weight magnitude-based unstructured pruning. We introduce an MLP-specific pruning metric that evaluates the importance of each weight by jointly considering its magnitude and its corresponding MLP intermediate activation norms. DaSS facilitates a balance between the adaptability offered by unstructured pruning and the structural consistency inherent in dependency-based structured pruning. Empirical evaluations on Mistral and LLaMA2 model families demonstrate that DaSS not only outperforms both SparseGPT and Wanda in achieving hardware-friendly N:M sparsity patterns but also maintains the computational efficiency of Wanda.","sentences":["The rapid advancement in Large Language Models (LLMs) has markedly enhanced the capabilities of language understanding and generation.","However, the substantial model size poses hardware challenges, affecting both memory size for serving and inference latency for token generation.","To address those challenges, we propose Dependency-aware Semi-structured Sparsity (DaSS), a novel method for the recent prevalent SwiGLU-based LLMs pruning.","Our approach incorporates structural dependency into the weight magnitude-based unstructured pruning.","We introduce an MLP-specific pruning metric that evaluates the importance of each weight by jointly considering its magnitude and its corresponding MLP intermediate activation norms.","DaSS facilitates a balance between the adaptability offered by unstructured pruning and the structural consistency inherent in dependency-based structured pruning.","Empirical evaluations on Mistral and LLaMA2 model families demonstrate that DaSS not only 
outperforms both SparseGPT and Wanda in achieving hardware-friendly N:M sparsity patterns but also maintains the computational efficiency of Wanda."],"url":"http://arxiv.org/abs/2405.01943v1","category":"cs.CL"} +{"created":"2024-05-03 09:04:50","title":"The bifurcation measure is exponentially mixing","abstract":"We prove general mixing theorems for sequences of meromorphic maps on compact K\\\"ahler manifolds. We deduce that the bifurcation measure is exponentially mixing for a family of rational maps of $\\mathbb{P}^q(\\mathbb{C})$ endowed with suitably many marked points.","sentences":["We prove general mixing theorems for sequences of meromorphic maps on compact K\\\"ahler manifolds.","We deduce that the bifurcation measure is exponentially mixing for a family of rational maps of $\\mathbb{P}^q(\\mathbb{C})$ endowed with suitably many marked points."],"url":"http://arxiv.org/abs/2405.01939v1","category":"math.DS"} +{"created":"2024-05-03 08:58:38","title":"Impact of Architectural Modifications on Deep Learning Adversarial Robustness","abstract":"Rapid advancements of deep learning are accelerating adoption in a wide variety of applications, including safety-critical applications such as self-driving vehicles, drones, robots, and surveillance systems. These advancements include applying variations of sophisticated techniques that improve the performance of models. However, such models are not immune to adversarial manipulations, which can cause the system to misbehave and remain unnoticed by experts. The frequency of modifications to existing deep learning models necessitates thorough analysis to determine the impact on models' robustness. In this work, we present an experimental evaluation of the effects of model modifications on deep learning model robustness using adversarial attacks. Our methodology involves examining the robustness of variations of models against various adversarial attacks. 
By conducting our experiments, we aim to shed light on the critical issue of maintaining the reliability and safety of deep learning models in safety- and security-critical applications. Our results indicate the pressing demand for an in-depth assessment of the effects of model changes on the robustness of models.","sentences":["Rapid advancements of deep learning are accelerating adoption in a wide variety of applications, including safety-critical applications such as self-driving vehicles, drones, robots, and surveillance systems.","These advancements include applying variations of sophisticated techniques that improve the performance of models.","However, such models are not immune to adversarial manipulations, which can cause the system to misbehave and remain unnoticed by experts.","The frequency of modifications to existing deep learning models necessitates thorough analysis to determine the impact on models' robustness.","In this work, we present an experimental evaluation of the effects of model modifications on deep learning model robustness using adversarial attacks.","Our methodology involves examining the robustness of variations of models against various adversarial attacks.","By conducting our experiments, we aim to shed light on the critical issue of maintaining the reliability and safety of deep learning models in safety- and security-critical applications.","Our results indicate the pressing demand for an in-depth assessment of the effects of model changes on the robustness of models."],"url":"http://arxiv.org/abs/2405.01934v1","category":"cs.CV"} +{"created":"2024-05-03 08:50:27","title":"Gravitational field due a moving Schwarzschild object","abstract":"In this paper we analyze the spacetime geometry due to a Schwarzschild object having accelerated motion. In the beginning we investigated the gravitational field due to a uniformly moving Schwarzschild object and obtained the spacetime line element for such an object. 
After analyzing the necessary boundary conditions the obtained line element is found to be consistent. Next we extended our work to a uniformly accelerated Schwarzschild object. In that case we obtained the spacetime line element for both when the acceleration is along the $X-$direction and when the acceleration is in an arbitrary direction on the $XY$ plane. The boundary conditions of those line elements have been examined. Such work will have immense applications in astrophysics when we calculate the geodesic equations of test particles in the gravitational field of objects having uniform accelerated motion.","sentences":["In this paper we analyze the spacetime geometry due to a Schwarzschild object having accelerated motion.","In the beginning we investigated the gravitational field due to a uniformly moving Schwarzschild object and obtained the spacetime line element for such an object.","After analyzing the necessary boundary conditions the obtained line element is found to be consistent.","Next we extended our work to a uniformly accelerated Schwarzschild object.","In that case we obtained the spacetime line element for both when the acceleration is along the $X-$direction and when the acceleration is in an arbitrary direction on the $XY$ plane.","The boundary conditions of those line elements have been examined.","Such work will have immense applications in astrophysics when we calculate the geodesic equations of test particles in the gravitational field of objects having uniform accelerated motion."],"url":"http://arxiv.org/abs/2405.01932v1","category":"gr-qc"} +{"created":"2024-05-03 08:49:49","title":"RF Chain-Free mmWave Transmission: Modeling and Experimental Verification","abstract":"The utilization of millimeter wave frequency bands is expected to become prevalent in the following communication systems. 
However, generating and transmitting communication signals over these frequencies is not as straightforward as in sub-6 GHz frequencies due to complex transceiver structures. As an alternative to conventional transmitter architectures, this paper investigates the implementation of time-modulated arrays to effectively modulate and transmit high-quality communication signals at millimeter wave frequencies. By exploiting the array structures and analog beamformers, which are the fundamental components of millimeter wave transmitters, secure and low-cost transmission can be achieved. Though, harmonics of theoretically infinite bandwidth arise as a fundamental problem in this approach. Thus, this paper presents a frequency analysis tool for the time-modulated arrays with hardware impairments and shows how controlling the sampling period can reduce the harmonics. Furthermore, the derived results are experimentally verified at 25 GHz with two important remarks. First, the phase error of received signals can be reduced by 32% using the proposed architecture. 
Second, the harmonics can be significantly suppressed by the correct choice of sampling period for the given hardware.","sentences":["The utilization of millimeter wave frequency bands is expected to become prevalent in the following communication systems.","However, generating and transmitting communication signals over these frequencies is not as straightforward as in sub-6 GHz frequencies due to complex transceiver structures.","As an alternative to conventional transmitter architectures, this paper investigates the implementation of time-modulated arrays to effectively modulate and transmit high-quality communication signals at millimeter wave frequencies.","By exploiting the array structures and analog beamformers, which are the fundamental components of millimeter wave transmitters, secure and low-cost transmission can be achieved.","Though, harmonics of theoretically infinite bandwidth arise as a fundamental problem in this approach.","Thus, this paper presents a frequency analysis tool for the time-modulated arrays with hardware impairments and shows how controlling the sampling period can reduce the harmonics.","Furthermore, the derived results are experimentally verified at 25 GHz with two important remarks.","First, the phase error of received signals can be reduced by 32% using the proposed architecture.","Second, the harmonics can be significantly suppressed by the correct choice of sampling period for the given hardware."],"url":"http://arxiv.org/abs/2405.01931v1","category":"eess.SP"} +{"created":"2024-05-03 08:49:22","title":"OARelatedWork: A Large-Scale Dataset of Related Work Sections with Full-texts from Open Access Sources","abstract":"This paper introduces OARelatedWork, the first large-scale multi-document summarization dataset for related work generation containing whole related work sections and full-texts of cited papers. The dataset includes 94 450 papers and 5 824 689 unique referenced papers. 
It was designed for the task of automatically generating related work to shift the field toward generating entire related work sections from all available content instead of generating parts of related work sections from abstracts only, which is the current mainstream in this field for abstractive approaches. We show that the estimated upper bound for extractive summarization increases by 217% in the ROUGE-2 score, when using full content instead of abstracts. Furthermore, we show the benefits of full content data on naive, oracle, traditional, and transformer-based baselines. Long outputs, such as related work sections, pose challenges for automatic evaluation metrics like BERTScore due to their limited input length. We tackle this issue by proposing and evaluating a meta-metric using BERTScore. Despite operating on smaller blocks, we show this meta-metric correlates with human judgment, comparably to the original BERTScore.","sentences":["This paper introduces OARelatedWork, the first large-scale multi-document summarization dataset for related work generation containing whole related work sections and full-texts of cited papers.","The dataset includes 94 450 papers and 5 824 689 unique referenced papers.","It was designed for the task of automatically generating related work to shift the field toward generating entire related work sections from all available content instead of generating parts of related work sections from abstracts only, which is the current mainstream in this field for abstractive approaches.","We show that the estimated upper bound for extractive summarization increases by 217% in the ROUGE-2 score, when using full content instead of abstracts.","Furthermore, we show the benefits of full content data on naive, oracle, traditional, and transformer-based baselines.","Long outputs, such as related work sections, pose challenges for automatic evaluation metrics like BERTScore due to their limited input length.","We tackle this issue by proposing 
and evaluating a meta-metric using BERTScore.","Despite operating on smaller blocks, we show this meta-metric correlates with human judgment, comparably to the original BERTScore."],"url":"http://arxiv.org/abs/2405.01930v1","category":"cs.CL"} +{"created":"2024-05-03 08:44:25","title":"Enhancing NLoS RIS-Aided Localization with Optimization and Machine Learning","abstract":"This paper introduces two machine learning optimization algorithms to significantly enhance position estimation in Reconfigurable Intelligent Surface (RIS) aided localization for mobile user equipment in Non-Line-of-Sight conditions. Leveraging the strengths of these algorithms, we present two methods capable of achieving extremely high accuracy, reaching sub-centimeter or even sub-millimeter levels at 3.5 GHz. The simulation results highlight the potential of these approaches, showing significant improvements in indoor mobile localization. The demonstrated precision and reliability of the proposed methods offer new opportunities for practical applications in real-world scenarios, particularly in Non-Line-of-Sight indoor localization. 
By evaluating four optimization techniques, we determine that a combination of a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) results in localization errors under 30 cm in 90 % of the cases, and under 5 mm for close to 85 % of cases when considering a simulated room of 10 m by 10 m where two of the walls are equipped with RIS tiles.","sentences":["This paper introduces two machine learning optimization algorithms to significantly enhance position estimation in Reconfigurable Intelligent Surface (RIS) aided localization for mobile user equipment in Non-Line-of-Sight conditions.","Leveraging the strengths of these algorithms, we present two methods capable of achieving extremely high accuracy, reaching sub-centimeter or even sub-millimeter levels at 3.5 GHz.","The simulation results highlight the potential of these approaches, showing significant improvements in indoor mobile localization.","The demonstrated precision and reliability of the proposed methods offer new opportunities for practical applications in real-world scenarios, particularly in Non-Line-of-Sight indoor localization.","By evaluating four optimization techniques, we determine that a combination of a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) results in localization errors under 30 cm in 90 % of the cases, and under 5 mm for close to 85 % of cases when considering a simulated room of 10 m by 10 m where two of the walls are equipped with RIS tiles."],"url":"http://arxiv.org/abs/2405.01928v1","category":"eess.SP"} +{"created":"2024-05-03 08:43:06","title":"Auto-Encoding Morph-Tokens for Multimodal LLM","abstract":"For multimodal LLMs, the synergy of visual comprehension (textual output) and generation (visual output) presents an ongoing challenge. This is due to a conflicting objective: for comprehension, an MLLM needs to abstract the visuals; for generation, it needs to preserve the visuals as much as possible. Thus, the objective is a dilemma for visual-tokens. 
To resolve the conflict, we propose encoding images into morph-tokens to serve a dual purpose: for comprehension, they act as visual prompts instructing MLLM to generate texts; for generation, they take on a different, non-conflicting role as complete visual-tokens for image reconstruction, where the missing visual cues are recovered by the MLLM. Extensive experiments show that morph-tokens can achieve a new SOTA for multimodal comprehension and generation simultaneously. Our project is available at https://github.com/DCDmllm/MorphTokens.","sentences":["For multimodal LLMs, the synergy of visual comprehension (textual output) and generation (visual output) presents an ongoing challenge.","This is due to a conflicting objective: for comprehension, an MLLM needs to abstract the visuals; for generation, it needs to preserve the visuals as much as possible.","Thus, the objective is a dilemma for visual-tokens.","To resolve the conflict, we propose encoding images into morph-tokens to serve a dual purpose: for comprehension, they act as visual prompts instructing MLLM to generate texts; for generation, they take on a different, non-conflicting role as complete visual-tokens for image reconstruction, where the missing visual cues are recovered by the MLLM.","Extensive experiments show that morph-tokens can achieve a new SOTA for multimodal comprehension and generation simultaneously.","Our project is available at https://github.com/DCDmllm/MorphTokens."],"url":"http://arxiv.org/abs/2405.01926v1","category":"cs.CV"} +{"created":"2024-05-03 08:34:13","title":"Semi-Parametric Retrieval via Binary Token Index","abstract":"The landscape of information retrieval has broadened from search services to a critical component in various advanced applications, where indexing efficiency, cost-effectiveness, and freshness are increasingly important yet remain less explored. To address these demands, we introduce Semi-parametric Vocabulary Disentangled Retrieval (SVDR). 
SVDR is a novel semi-parametric retrieval framework that supports two types of indexes: an embedding-based index for high effectiveness, akin to existing neural retrieval methods; and a binary token index that allows for quick and cost-effective setup, resembling traditional term-based retrieval. In our evaluation on three open-domain question answering benchmarks with the entire Wikipedia as the retrieval corpus, SVDR consistently demonstrates superiority. It achieves a 3% higher top-1 retrieval accuracy compared to the dense retriever DPR when using an embedding-based index and an 9% higher top-1 accuracy compared to BM25 when using a binary token index. Specifically, the adoption of a binary token index reduces index preparation time from 30 GPU hours to just 2 CPU hours and storage size from 31 GB to 2 GB, achieving a 90% reduction compared to an embedding-based index.","sentences":["The landscape of information retrieval has broadened from search services to a critical component in various advanced applications, where indexing efficiency, cost-effectiveness, and freshness are increasingly important yet remain less explored.","To address these demands, we introduce Semi-parametric Vocabulary Disentangled Retrieval (SVDR).","SVDR is a novel semi-parametric retrieval framework that supports two types of indexes: an embedding-based index for high effectiveness, akin to existing neural retrieval methods; and a binary token index that allows for quick and cost-effective setup, resembling traditional term-based retrieval.","In our evaluation on three open-domain question answering benchmarks with the entire Wikipedia as the retrieval corpus, SVDR consistently demonstrates superiority.","It achieves a 3% higher top-1 retrieval accuracy compared to the dense retriever DPR when using an embedding-based index and an 9% higher top-1 accuracy compared to BM25 when using a binary token index.","Specifically, the adoption of a binary token index reduces index preparation 
time from 30 GPU hours to just 2 CPU hours and storage size from 31 GB to 2 GB, achieving a 90% reduction compared to an embedding-based index."],"url":"http://arxiv.org/abs/2405.01924v1","category":"cs.CL"} +{"created":"2024-05-03 17:27:55","title":"An Optical Gamma-Ray Burst Catalogue with Measured Redshift PART I: Data Release of 535 Gamma-Ray Bursts and Colour Evolution","abstract":"We present the largest optical photometry compilation of Gamma-Ray Bursts (GRBs) with redshifts ($z$). We include 64813 observations of 535 events (including upper limits) from 28 February 1997 up to 18 August 2023. We also present a user-friendly web tool \\textit{grbLC} which allows users the visualization of photometry, coordinates, redshift, host galaxy extinction, and spectral indices for each event in our database. Furthermore, we have added a Gamma Ray Coordinate Network (GCN) scraper that can be used to collect data by gathering magnitudes from the GCNs. The web tool also includes a package for uniformly investigating colour evolution. We compute the optical spectral indices for 138 GRBs for which we have at least 4 filters at the same epoch in our sample and craft a procedure to distinguish between GRBs with and without colour evolution. By providing a uniform format and repository for the optical catalogue, this web-based archive is the first step towards unifying several community efforts to gather the photometric information for all GRBs with known redshifts. This catalogue will enable population studies by providing light curves (LCs) with better coverage since we have gathered data from different ground-based locations. Consequently, these LCs can be used to train future LC reconstructions for an extended inference of the redshift. 
The data gathering also allows us to fill some of the orbital gaps from Swift in crucial points of the LCs, e.g., at the end of the plateau emission or where a jet break is identified.","sentences":["We present the largest optical photometry compilation of Gamma-Ray Bursts (GRBs) with redshifts ($z$).","We include 64813 observations of 535 events (including upper limits) from 28 February 1997 up to 18 August 2023.","We also present a user-friendly web tool \\textit{grbLC} which allows users the visualization of photometry, coordinates, redshift, host galaxy extinction, and spectral indices for each event in our database.","Furthermore, we have added a Gamma Ray Coordinate Network (GCN) scraper that can be used to collect data by gathering magnitudes from the GCNs.","The web tool also includes a package for uniformly investigating colour evolution.","We compute the optical spectral indices for 138 GRBs for which we have at least 4 filters at the same epoch in our sample and craft a procedure to distinguish between GRBs with and without colour evolution.","By providing a uniform format and repository for the optical catalogue, this web-based archive is the first step towards unifying several community efforts to gather the photometric information for all GRBs with known redshifts.","This catalogue will enable population studies by providing light curves (LCs) with better coverage since we have gathered data from different ground-based locations.","Consequently, these LCs can be used to train future LC reconstructions for an extended inference of the redshift.","The data gathering also allows us to fill some of the orbital gaps from Swift in crucial points of the LCs, e.g., at the end of the plateau emission or where a jet break is identified."],"url":"http://arxiv.org/abs/2405.02263v1","category":"astro-ph.HE"} +{"created":"2024-05-03 17:26:44","title":"Emergent Magnetic Field and Nonzero Gyrovector of the Toroidal Magnetic Hopfion","abstract":"Magnetic hopfions are 
localized magnetic solitons with a nonzero 3D topological charge (Hopf index). Herein, an analytical calculation of the magnetic hopfion gyrovector is presented and it is shown that it does not vanish even in an infinite sample. The calculation method is based on the concept of the emergent magnetic field. The particular case of the simplest nontrivial toroidal hopfion with the Hopf index $\\|Q_H\\|=1$ in the cylindrical magnetic dot is considered and dependencies of the gyrovector components on the dot sizes are calculated. Nonzero hopfion gyrovector is important in any description of the hopfion dynamics within the collective coordinate Thieles approach.","sentences":["Magnetic hopfions are localized magnetic solitons with a nonzero 3D topological charge (Hopf index).","Herein, an analytical calculation of the magnetic hopfion gyrovector is presented and it is shown that it does not vanish even in an infinite sample.","The calculation method is based on the concept of the emergent magnetic field.","The particular case of the simplest nontrivial toroidal hopfion with the Hopf index $\\|Q_H\\|=1$ in the cylindrical magnetic dot is considered and dependencies of the gyrovector components on the dot sizes are calculated.","Nonzero hopfion gyrovector is important in any description of the hopfion dynamics within the collective coordinate Thieles approach."],"url":"http://arxiv.org/abs/2405.02262v1","category":"cond-mat.mes-hall"} +{"created":"2024-05-03 17:13:46","title":"Response of strongly coupled fermions on classical and quantum computers","abstract":"Studying the response of quantum systems is essential for gaining deeper insights into the fundamental nature of matter and its behavior in diverse physical contexts. 
Computation of nuclear response is critical for many applications, but its spectroscopically accurate description in medium-heavy nuclei in wide energy ranges remains particularly challenging because of the complex nature of nuclear quantum states in the high-level-density regime. Herein, we push the limits of configuration complexity in the classical computation of the nuclear response and present an algorithm with a quantum benefit for treating complex configurations. The classical computational method of approaching spectroscopic accuracy is implemented for medium-heavy nuclei and pioneered for the dipole response of 120Sn, while the quantum algorithm reaching the exact solution is realized for the Lipkin Hamiltonian to unravel the emergence of collectivity at strong coupling.","sentences":["Studying the response of quantum systems is essential for gaining deeper insights into the fundamental nature of matter and its behavior in diverse physical contexts.","Computation of nuclear response is critical for many applications, but its spectroscopically accurate description in medium-heavy nuclei in wide energy ranges remains particularly challenging because of the complex nature of nuclear quantum states in the high-level-density regime.","Herein, we push the limits of configuration complexity in the classical computation of the nuclear response and present an algorithm with a quantum benefit for treating complex configurations.","The classical computational method of approaching spectroscopic accuracy is implemented for medium-heavy nuclei and pioneered for the dipole response of 120Sn, while the quantum algorithm reaching the exact solution is realized for the Lipkin Hamiltonian to unravel the emergence of collectivity at strong coupling."],"url":"http://arxiv.org/abs/2405.02255v1","category":"nucl-th"} +{"created":"2024-05-03 17:09:12","title":"Fermionization and collective excitations of 1D polariton lattices","abstract":"We theoretically demonstrate that the 
hallmarks of correlation and fermionization in a one-dimensional exciton-polaritons gas can be observed with state-of-the-art technology. Our system consists of a chain of excitonic quantum dots coupled to a photonic waveguide, with a low filling of polaritons. We analytically identify the Tonks-Girardeau, Tavis-Cummings and mean-field limits and relate them to different regimes of the excitonic anharmonicity and photonic bandwidth. Using matrix-product states, we numerically calculate the ground-state energies, correlation functions and dynamic structure factor of the system. In particular, the latter has a finite weight in the Lieb-Liniger hole branch, and the density-density correlator displays Friedel-like oscillations for realistic parameters, which reveal the onset of fermionization close to the Tonks-Girardeau regime. Our work encourages future experiments aimed at observing, for the first time and in spite of the moderate excitonic anharmonicity, strongly correlated exciton-polariton physics.","sentences":["We theoretically demonstrate that the hallmarks of correlation and fermionization in a one-dimensional exciton-polaritons gas can be observed with state-of-the-art technology.","Our system consists of a chain of excitonic quantum dots coupled to a photonic waveguide, with a low filling of polaritons.","We analytically identify the Tonks-Girardeau, Tavis-Cummings and mean-field limits and relate them to different regimes of the excitonic anharmonicity and photonic bandwidth.","Using matrix-product states, we numerically calculate the ground-state energies, correlation functions and dynamic structure factor of the system.","In particular, the latter has a finite weight in the Lieb-Liniger hole branch, and the density-density correlator displays Friedel-like oscillations for realistic parameters, which reveal the onset of fermionization close to the Tonks-Girardeau regime.","Our work encourages future experiments aimed at observing, for the first time and 
in spite of the moderate excitonic anharmonicity, strongly correlated exciton-polariton physics."],"url":"http://arxiv.org/abs/2405.02251v1","category":"quant-ph"} +{"created":"2024-05-03 16:39:20","title":"On the Utility of External Agent Intention Predictor for Human-AI Coordination","abstract":"Reaching a consensus on the team plans is vital to human-AI coordination. Although previous studies provide approaches through communications in various ways, it could still be hard to coordinate when the AI has no explainable plan to communicate. To cover this gap, we suggest incorporating external models to assist humans in understanding the intentions of AI agents. In this paper, we propose a two-stage paradigm that first trains a Theory of Mind (ToM) model from collected offline trajectories of the target agent, and utilizes the model in the process of human-AI collaboration by real-timely displaying the future action predictions of the target agent. Such a paradigm leaves the AI agent as a black box and thus is available for improving any agents. To test our paradigm, we further implement a transformer-based predictor as the ToM model and develop an extended online human-AI collaboration platform for experiments. The comprehensive experimental results verify that human-AI teams can achieve better performance with the help of our model. A user assessment attached to the experiment further demonstrates that our paradigm can significantly enhance the situational awareness of humans. 
Our study presents the potential to augment the ability of humans via external assistance in human-AI collaboration, which may further inspire future research.","sentences":["Reaching a consensus on the team plans is vital to human-AI coordination.","Although previous studies provide approaches through communications in various ways, it could still be hard to coordinate when the AI has no explainable plan to communicate.","To cover this gap, we suggest incorporating external models to assist humans in understanding the intentions of AI agents.","In this paper, we propose a two-stage paradigm that first trains a Theory of Mind (ToM) model from collected offline trajectories of the target agent, and utilizes the model in the process of human-AI collaboration by real-timely displaying the future action predictions of the target agent.","Such a paradigm leaves the AI agent as a black box and thus is available for improving any agents.","To test our paradigm, we further implement a transformer-based predictor as the ToM model and develop an extended online human-AI collaboration platform for experiments.","The comprehensive experimental results verify that human-AI teams can achieve better performance with the help of our model.","A user assessment attached to the experiment further demonstrates that our paradigm can significantly enhance the situational awareness of humans.","Our study presents the potential to augment the ability of humans via external assistance in human-AI collaboration, which may further inspire future research."],"url":"http://arxiv.org/abs/2405.02229v1","category":"cs.HC"} +{"created":"2024-05-03 15:26:04","title":"On a question of Kwakkel and Markovic","abstract":"A question of F. Kwakkel and V. Markovic on existence of C^1-diffeomorphisms of closed surfaces that permute a dense collection of domains with bounded geometry is answered in the negative. 
In fact, it is proved that for closed surfaces of genus at least one such diffeomorphisms do not exist regardless of whether they have positive or zero topological entropy.","sentences":["A question of F. Kwakkel and V. Markovic on existence of C^1-diffeomorphisms of closed surfaces that permute a dense collection of domains with bounded geometry is answered in the negative.","In fact, it is proved that for closed surfaces of genus at least one such diffeomorphisms do not exist regardless of whether they have positive or zero topological entropy."],"url":"http://arxiv.org/abs/2405.02176v1","category":"math.DS"} +{"created":"2024-05-03 15:07:31","title":"Mobility-induced kinetic effects in multicomponent mixtures","abstract":"We give an overview exploring the role of kinetics in multicomponent mixtures. Compared to the most commonly studied binary (single species plus solvent) case, multicomponent fluids show a rich interplay between kinetics and thermodynamics due to the possibility of fractionation, interdiffusion of mixture components and collective motion. This leads to a competition between multiple timescales that change depending on the underlying kinetics. At high densities, crowding effects are relevant and non-equilibrium structures can become long-lived. We present the main approaches for the study of kinetic effects in multicomponents mixtures, including the role of crowding, and explore their consequences for equilibrium and non-equilibrium scenarios. 
We conclude by identifying the main challenges in the field.","sentences":["We give an overview exploring the role of kinetics in multicomponent mixtures.","Compared to the most commonly studied binary (single species plus solvent) case, multicomponent fluids show a rich interplay between kinetics and thermodynamics due to the possibility of fractionation, interdiffusion of mixture components and collective motion.","This leads to a competition between multiple timescales that change depending on the underlying kinetics.","At high densities, crowding effects are relevant and non-equilibrium structures can become long-lived.","We present the main approaches for the study of kinetic effects in multicomponents mixtures, including the role of crowding, and explore their consequences for equilibrium and non-equilibrium scenarios.","We conclude by identifying the main challenges in the field."],"url":"http://arxiv.org/abs/2405.02159v1","category":"cond-mat.soft"} +{"created":"2024-05-03 14:54:24","title":"Piezoresistivity as an Order Parameter for Ferroaxial Transitions","abstract":"Recent progress in the understanding of the collective behavior of electrons and ions have revealed new types of ferroic orders beyond ferroelectricity and ferromagnetism, such as the ferroaxial state. The latter retains only rotational symmetry around a single axis and reflection symmetry with respect to a single mirror plane, both of which are set by an emergent electric toroidal dipole moment. Due to this unusual symmetry-breaking pattern, it has been challenging to directly measure the ferroaxial order parameter, despite the increasing attention this state has drawn. Here, we show that off-diagonal components of the piezoresistivity tensor (i.e., the linear change in resistivity under strain) transform the same way as the ferroaxial moments, providing a direct probe of such order parameters. 
We identify two new proper ferroaxial materials through a materials database search, and use first-principles calculations to evaluate the piezoconductivity of the double-perovskite CaSnF$_6$, revealing its connection to ferroaxial order and to octahedral rotation modes.","sentences":["Recent progress in the understanding of the collective behavior of electrons and ions have revealed new types of ferroic orders beyond ferroelectricity and ferromagnetism, such as the ferroaxial state.","The latter retains only rotational symmetry around a single axis and reflection symmetry with respect to a single mirror plane, both of which are set by an emergent electric toroidal dipole moment.","Due to this unusual symmetry-breaking pattern, it has been challenging to directly measure the ferroaxial order parameter, despite the increasing attention this state has drawn.","Here, we show that off-diagonal components of the piezoresistivity tensor (i.e., the linear change in resistivity under strain) transform the same way as the ferroaxial moments, providing a direct probe of such order parameters.","We identify two new proper ferroaxial materials through a materials database search, and use first-principles calculations to evaluate the piezoconductivity of the double-perovskite CaSnF$_6$, revealing its connection to ferroaxial order and to octahedral rotation modes."],"url":"http://arxiv.org/abs/2405.02149v1","category":"cond-mat.mtrl-sci"} +{"created":"2024-05-03 14:44:04","title":"Multi-Objective Recommendation via Multivariate Policy Learning","abstract":"Real-world recommender systems often need to balance multiple objectives when deciding which recommendations to present to users. These include behavioural signals (e.g. clicks, shares, dwell time), as well as broader objectives (e.g. diversity, fairness). Scalarisation methods are commonly used to handle this balancing task, where a weighted average of per-objective reward signals determines the final score used for ranking. 
Naturally, how these weights are computed exactly, is key to success for any online platform. We frame this as a decision-making task, where the scalarisation weights are actions taken to maximise an overall North Star reward (e.g. long-term user retention or growth). We extend existing policy learning methods to the continuous multivariate action domain, proposing to maximise a pessimistic lower bound on the North Star reward that the learnt policy will yield. Typical lower bounds based on normal approximations suffer from insufficient coverage, and we propose an efficient and effective policy-dependent correction for this. We provide guidance to design stochastic data collection policies, as well as highly sensitive reward signals. Empirical observations from simulations, offline and online experiments highlight the efficacy of our deployed approach.","sentences":["Real-world recommender systems often need to balance multiple objectives when deciding which recommendations to present to users.","These include behavioural signals (e.g. clicks, shares, dwell time), as well as broader objectives (e.g. diversity, fairness).","Scalarisation methods are commonly used to handle this balancing task, where a weighted average of per-objective reward signals determines the final score used for ranking.","Naturally, how these weights are computed exactly, is key to success for any online platform.","We frame this as a decision-making task, where the scalarisation weights are actions taken to maximise an overall North Star reward (e.g. 
long-term user retention or growth).","We extend existing policy learning methods to the continuous multivariate action domain, proposing to maximise a pessimistic lower bound on the North Star reward that the learnt policy will yield.","Typical lower bounds based on normal approximations suffer from insufficient coverage, and we propose an efficient and effective policy-dependent correction for this.","We provide guidance to design stochastic data collection policies, as well as highly sensitive reward signals.","Empirical observations from simulations, offline and online experiments highlight the efficacy of our deployed approach."],"url":"http://arxiv.org/abs/2405.02141v1","category":"cs.IR"} +{"created":"2024-05-03 14:08:34","title":"Equal Requests are Asymptotically Hardest for Data Recovery","abstract":"In a distributed storage system serving hot data, the data recovery performance becomes important, captured e.g. by the service rate. We give partial evidence for it being hardest to serve a sequence of equal user requests (as in PIR coding regime) both for concrete and random user requests and server contents. We prove that a constant request sequence is locally hardest to serve: If enough copies of each vector are stored in servers, then if a request sequence with all requests equal can be served then we can still serve it if a few requests are changed. For random iid server contents, with number of data symbols constant (for simplicity) and the number of servers growing, we show that the maximum number of user requests we can serve divided by the number of servers we need approaches a limit almost surely. For uniform server contents, we show this limit is 1/2, both for sequences of copies of a fixed request and of any requests, so it is at least as hard to serve equal requests as any requests. 
For iid requests independent from the uniform server contents the limit is at least 1/2 and equal to 1/2 if requests are all equal to a fixed request almost surely, confirming the same. As a building block, we deduce from a 1952 result of Marshall Hall, Jr. on abelian groups, that any collection of half as many requests as coded symbols in the doubled binary simplex code can be served by this code. This implies the fractional version of the Functional Batch Code Conjecture that allows half-servers.","sentences":["In a distributed storage system serving hot data, the data recovery performance becomes important, captured e.g. by the service rate.","We give partial evidence for it being hardest to serve a sequence of equal user requests (as in PIR coding regime) both for concrete and random user requests and server contents. ","We prove that a constant request sequence is locally hardest to serve: If enough copies of each vector are stored in servers, then if a request sequence with all requests equal can be served then we can still serve it if a few requests are changed. ","For random iid server contents, with number of data symbols constant (for simplicity) and the number of servers growing, we show that the maximum number of user requests we can serve divided by the number of servers we need approaches a limit almost surely.","For uniform server contents, we show this limit is 1/2, both for sequences of copies of a fixed request and of any requests, so it is at least as hard to serve equal requests as any requests.","For iid requests independent from the uniform server contents the limit is at least 1/2 and equal to 1/2 if requests are all equal to a fixed request almost surely, confirming the same. ","As a building block, we deduce from a 1952 result of Marshall Hall, Jr. 
on abelian groups, that any collection of half as many requests as coded symbols in the doubled binary simplex code can be served by this code.","This implies the fractional version of the Functional Batch Code Conjecture that allows half-servers."],"url":"http://arxiv.org/abs/2405.02107v1","category":"cs.IT"} +{"created":"2024-05-03 11:01:30","title":"On finding optimal collective variables for complex systems by minimizing the deviation between effective and full dynamics","abstract":"This paper is concerned with collective variables, or reaction coordinates, that map a discrete-in-time Markov process $X_n$ in $\\mathbb{R}^d$ to a (much) smaller dimension $k \\ll d$. We define the effective dynamics under a given collective variable map $\\xi$ as the best Markovian representation of $X_n$ under $\\xi$. The novelty of the paper is that it gives strict criteria for selecting optimal collective variables via the properties of the effective dynamics. In particular, we show that the transition density of the effective dynamics of the optimal collective variable solves a relative entropy minimization problem from certain family of densities to the transition density of $X_n$. We also show that many transfer operator-based data-driven numerical approaches essentially learn quantities of the effective dynamics. Furthermore, we obtain various error estimates for the effective dynamics in approximating dominant timescales / eigenvalues and transition rates of the original process $X_n$ and how optimal collective variables minimize these errors. Our results contribute to the development of theoretical tools for the understanding of complex dynamical systems, e.g. molecular kinetics, on large timescales. 
These results shed light on the relations among existing data-driven numerical approaches for identifying good collective variables, and they also motivate the development of new methods.","sentences":["This paper is concerned with collective variables, or reaction coordinates, that map a discrete-in-time Markov process $X_n$ in $\\mathbb{R}^d$ to a (much) smaller dimension $k \\ll d$. We define the effective dynamics under a given collective variable map $\\xi$ as the best Markovian representation of $X_n$ under $\\xi$. The novelty of the paper is that it gives strict criteria for selecting optimal collective variables via the properties of the effective dynamics.","In particular, we show that the transition density of the effective dynamics of the optimal collective variable solves a relative entropy minimization problem from certain family of densities to the transition density of $X_n$. We also show that many transfer operator-based data-driven numerical approaches essentially learn quantities of the effective dynamics.","Furthermore, we obtain various error estimates for the effective dynamics in approximating dominant timescales / eigenvalues and transition rates of the original process $X_n$ and how optimal collective variables minimize these errors.","Our results contribute to the development of theoretical tools for the understanding of complex dynamical systems, e.g. molecular kinetics, on large timescales.","These results shed light on the relations among existing data-driven numerical approaches for identifying good collective variables, and they also motivate the development of new methods."],"url":"http://arxiv.org/abs/2405.02001v1","category":"math.OC"} +{"created":"2024-05-03 09:35:53","title":"Proliferation-driven mechanical feedback regulates cell dynamics in growing tissues","abstract":"Local stresses in a tissue, a collective property, regulate cell division and apoptosis. In turn, cell growth and division induce active stresses in the tissue. 
As a consequence, there is a feedback between cell growth and local stresses. However, how the cell dynamics depend on local stress-dependent cell division and the feedback strength is not fully understood. Here, we probe the consequences of stress-mediated growth and cell division on cell dynamics using agent-based simulations of a two-dimensional growing tissue. We discover a rich dynamical behavior of individual cells, ranging from jamming (mean square displacement, $\\Delta (t) \\sim t^{\\alpha}$ with $\\alpha$ less than unity), to hyperdiffusion ($\\alpha > 2$) depending on cell division rate and the strength of the mechanical feedback. Strikingly, $\\Delta (t)$ is determined by the tissue growth law, which quantifies cell proliferation (number of cells $N(t)$ as a function of time). The growth law ($N(t) \\sim t^{\\lambda}$ at long times) is regulated by the critical pressure that controls the strength of the mechanical feedback and the ratio between cell division-apoptosis rates. We show that $\\lambda \\sim \\alpha$, which implies that higher growth rate leads to a greater degree of cell migration. The variations in cell motility are linked to the emergence of highly persistent forces extending over several cell cycle times. 
Our predictions are testable using cell-tracking imaging techniques.","sentences":["Local stresses in a tissue, a collective property, regulate cell division and apoptosis.","In turn, cell growth and division induce active stresses in the tissue.","As a consequence, there is a feedback between cell growth and local stresses.","However, how the cell dynamics depend on local stress-dependent cell division and the feedback strength is not fully understood.","Here, we probe the consequences of stress-mediated growth and cell division on cell dynamics using agent-based simulations of a two-dimensional growing tissue.","We discover a rich dynamical behavior of individual cells, ranging from jamming (mean square displacement, $\\Delta (t) \\sim t^{\\alpha}$ with $\\alpha$ less than unity), to hyperdiffusion ($\\alpha > 2$) depending on cell division rate and the strength of the mechanical feedback.","Strikingly, $\\Delta (t)$ is determined by the tissue growth law, which quantifies cell proliferation (number of cells $N(t)$ as a function of time).","The growth law ($N(t) \\sim t^{\\lambda}$ at long times) is regulated by the critical pressure that controls the strength of the mechanical feedback and the ratio between cell division-apoptosis rates.","We show that $\\lambda \\sim \\alpha$, which implies that higher growth rate leads to a greater degree of cell migration.","The variations in cell motility are linked to the emergence of highly persistent forces extending over several cell cycle times.","Our predictions are testable using cell-tracking imaging techniques."],"url":"http://arxiv.org/abs/2405.01960v1","category":"cond-mat.soft"} +{"created":"2024-05-03 09:19:13","title":"Interaction-Enhanced Superradiance of a Ryderg-Atom Array","abstract":"We study the superradiant phase transition of an array of Rydberg atoms in a dissipative microwave cavity. 
Under the interplay of the cavity field and the long-range Rydberg interaction, the steady state of the system exhibits an interaction-enhanced superradiance, with vanishing critical atom-cavity coupling rates at a discrete set of interaction strengths. We find that, while the phenomenon can be analytically understood in the case of constant all-to-all interaction, the enhanced superradiance persists under the spatially dependent dipolar interaction, but shifted in the critical interaction strengths. The diverging susceptibility at these critical points is captured by emergent quantum Rabi models, each of which comprises a pair of collective atomic states with different numbers of atomic excitations. These collective states become degenerate at the critical interaction strengths, resulting in a superradiant phase for an arbitrarily small atom-cavity coupling.","sentences":["We study the superradiant phase transition of an array of Rydberg atoms in a dissipative microwave cavity.","Under the interplay of the cavity field and the long-range Rydberg interaction, the steady state of the system exhibits an interaction-enhanced superradiance, with vanishing critical atom-cavity coupling rates at a discrete set of interaction strengths.","We find that, while the phenomenon can be analytically understood in the case of constant all-to-all interaction, the enhanced superradiance persists under the spatially dependent dipolar interaction, but shifted in the critical interaction strengths.","The diverging susceptibility at these critical points is captured by emergent quantum Rabi models, each of which comprises a pair of collective atomic states with different numbers of atomic excitations.","These collective states become degenerate at the critical interaction strengths, resulting in a superradiant phase for an arbitrarily small atom-cavity coupling."],"url":"http://arxiv.org/abs/2405.01945v1","category":"quant-ph"} +{"created":"2024-05-03 17:58:02","title":"Thermodynamic 
constraints on polar active matter hydrodynamics","abstract":"In this letter we use the thermodynamics of a passive fluid to constrain transport in the corresponding active fluid which is subsequently described by the Toner-Tu equations. Acknowledging that the system fundamentally breaks boost symmetry we compel what were previously entirely phenomenological parameters in the Toner-Tu model to satisfy precise relationships among themselves. Consequently, we determine exact scalings for the transport coefficients under dynamical renormalisation group flow that reproduce the results of recent simulations.","sentences":["In this letter we use the thermodynamics of a passive fluid to constrain transport in the corresponding active fluid which is subsequently described by the Toner-Tu equations.","Acknowledging that the system fundamentally breaks boost symmetry we compel what were previously entirely phenomenological parameters in the Toner-Tu model to satisfy precise relationships among themselves.","Consequently, we determine exact scalings for the transport coefficients under dynamical renormalisation group flow that reproduce the results of recent simulations."],"url":"http://arxiv.org/abs/2405.02283v1","category":"cond-mat.stat-mech"} +{"created":"2024-05-03 17:50:30","title":"Local insulator-to-superconductor transition in amorphous InO$_x$ films modulated by e-beam irradiation","abstract":"We present a novel method enabling precise post-fabrication modulation of the electrical resistance in micrometer-scale regions of amorphous indium oxide (a-InO$_x$) films. By subjecting initially insulating films to an electron beam at room temperature, we demonstrate that the exposed region of the films becomes superconducting. The resultant superconducting transition temperature ($T_c$) is adjustable up to 2.8 K by changing the electron dose and accelerating voltage. 
This technique offers a compelling alternative to traditional a-InO$_x$ annealing methods for both fundamental investigations and practical applications. Moreover, it empowers independent adjustment of electrical properties across initially identical a-InO$_x$ samples on the same substrate, facilitating the creation of superconducting microstructures with precise $T_c$ control at the micrometer scale. The observed resistance modifications likely stem from photoreduction induced by X-ray and/or UV radiation emitted during electron beam interactions with the film and substrate.","sentences":["We present a novel method enabling precise post-fabrication modulation of the electrical resistance in micrometer-scale regions of amorphous indium oxide (a-InO$_x$) films.","By subjecting initially insulating films to an electron beam at room temperature, we demonstrate that the exposed region of the films becomes superconducting.","The resultant superconducting transition temperature ($T_c$) is adjustable up to 2.8 K by changing the electron dose and accelerating voltage.","This technique offers a compelling alternative to traditional a-InO$_x$ annealing methods for both fundamental investigations and practical applications.","Moreover, it empowers independent adjustment of electrical properties across initially identical a-InO$_x$ samples on the same substrate, facilitating the creation of superconducting microstructures with precise $T_c$ control at the micrometer scale.","The observed resistance modifications likely stem from photoreduction induced by X-ray and/or UV radiation emitted during electron beam interactions with the film and substrate."],"url":"http://arxiv.org/abs/2405.02276v1","category":"cond-mat.supr-con"} +{"created":"2024-05-03 17:45:35","title":"Transversely Projective Structures on Smooth Foliations on Surfaces","abstract":"Brunella's classification implies that every smooth foliation on a compact complex surface admits a singular transversely projective 
structure. However, Biswas and Sorin's recent work shows that certain foliations on compact complex surfaces, despite possessing a singular transversely projective structure, do not admit a regular transversely projective structure. Here, we describe the smooth foliations on compact complex surfaces which fail to possess a regular transversely projective structure.","sentences":["Brunella's classification implies that every smooth foliation on a compact complex surface admits a singular transversely projective structure.","However, Biswas and Sorin's recent work shows that certain foliations on compact complex surfaces, despite possessing a singular transversely projective structure, do not admit a regular transversely projective structure.","Here, we describe the smooth foliations on compact complex surfaces which fail to possess a regular transversely projective structure."],"url":"http://arxiv.org/abs/2405.02273v1","category":"math.CV"} +{"created":"2024-05-03 17:38:05","title":"On its way to the neutron star-white dwarf binary graveyard, IGR J16194-2810, a first ascent M giant X-ray binary","abstract":"A single-lined spectroscopic orbit for the M giant in the X-ray binary IGR J16194-2810 is determined from a time-series of optical spectra. The spectroscopic orbital period of 192.5 days is twice that of the photometric period, confirming that the M giant in the system is an ellipsoidal variable. The giant is identified as a first ascent giant approaching the red giant tip. The primary is a neutron star (NS) with its M giant companion filling its Roche lobe verifying the system classification as a Low-Mass X-ray binary (LMXB). Stellar C, N, O and Fe abundances are derived for the M giant with the C, N, O values typical for a field giant with [Fe/H] = -0.14. The system does not have a large kick velocity. Models for the evolution of the system into a binary NS-white dwarf (WD) are presented. The X-ray properties are discussed in the context of this model. 
This binary is a rare example of a luminous, long orbital period, LMXB early in the transient ellipsoidal phase.","sentences":["A single-lined spectroscopic orbit for the M giant in the X-ray binary IGR J16194-2810 is determined from a time-series of optical spectra.","The spectroscopic orbital period of 192.5 days is twice that of the photometric period, confirming that the M giant in the system is an ellipsoidal variable.","The giant is identified as a first ascent giant approaching the red giant tip.","The primary is a neutron star (NS) with its M giant companion filling its Roche lobe verifying the system classification as a Low-Mass X-ray binary (LMXB).","Stellar C, N, O and Fe abundances are derived for the M giant with the C, N, O values typical for a field giant with [Fe/H] = -0.14.","The system does not have a large kick velocity.","Models for the evolution of the system into a binary NS-white dwarf (WD) are presented.","The X-ray properties are discussed in the context of this model.","This binary is a rare example of a luminous, long orbital period, LMXB early in the transient ellipsoidal phase."],"url":"http://arxiv.org/abs/2405.02270v1","category":"astro-ph.SR"} +{"created":"2024-05-03 17:14:20","title":"Efficient computation of topological integral transforms","abstract":"Topological integral transforms have found many applications in shape analysis, from prediction of clinical outcomes in brain cancer to analysis of barley seeds. Using Euler characteristic as a measure, these objects record rich geometric information on weighted polytopal complexes. While some implementations exist, they only enable discretized representations of the transforms, and they do not handle weighted complexes (such as for instance images). Moreover, recent hybrid transforms lack an implementation. 
In this paper, we introduce Eucalc, a novel implementation of three topological integral transforms -- the Euler characteristic transform, the Radon transform, and hybrid transforms -- for weighted cubical complexes. Leveraging piecewise linear Morse theory and Euler calculus, the algorithms significantly reduce computational complexity by focusing on critical points. Our software provides exact representations of transforms, handles both binary and grayscale images, and supports multi-core processing. It is publicly available as a C++ library with a Python wrapper. We present mathematical foundations, implementation details, and experimental evaluations, demonstrating Eucalc's efficiency.","sentences":["Topological integral transforms have found many applications in shape analysis, from prediction of clinical outcomes in brain cancer to analysis of barley seeds.","Using Euler characteristic as a measure, these objects record rich geometric information on weighted polytopal complexes.","While some implementations exist, they only enable discretized representations of the transforms, and they do not handle weighted complexes (such as for instance images).","Moreover, recent hybrid transforms lack an implementation. 
","In this paper, we introduce Eucalc, a novel implementation of three topological integral transforms -- the Euler characteristic transform, the Radon transform, and hybrid transforms -- for weighted cubical complexes.","Leveraging piecewise linear Morse theory and Euler calculus, the algorithms significantly reduce computational complexity by focusing on critical points.","Our software provides exact representations of transforms, handles both binary and grayscale images, and supports multi-core processing.","It is publicly available as a C++ library with a Python wrapper.","We present mathematical foundations, implementation details, and experimental evaluations, demonstrating Eucalc's efficiency."],"url":"http://arxiv.org/abs/2405.02256v1","category":"cs.CG"} +{"created":"2024-05-03 17:10:42","title":"Moment matching based reduced closed-loop design to achieve asymptotic performance","abstract":"In this paper, the moment matching techniques are adopted to obtain reduced-order closed-loop systems with reduced-order controllers that maintain the closed-loop stability and guarantee desired asymptotic performance, after revealing the relationship between the Internal Model Principle used in control design and the time-domain moment matching problem. 
As a result, the design of a low order controller can be done starting from considering the achieving of asymptotic performance as a moment matching problem, resulting in a reduced order closed-loop system.","sentences":["In this paper, the moment matching techniques are adopted to obtain reduced-order closed-loop systems with reduced-order controllers that maintain the closed-loop stability and guarantee desired asymptotic performance, after revealing the relationship between the Internal Model Principle used in control design and the time-domain moment matching problem.","As a result, the design of a low order controller can be done starting from considering the achieving of asymptotic performance as a moment matching problem, resulting in a reduced order closed-loop system."],"url":"http://arxiv.org/abs/2405.02253v1","category":"math.OC"} +{"created":"2024-05-03 17:02:14","title":"Fractonic criticality in Rydberg atom arrays","abstract":"Fractonic matter can undergo unconventional phase transitions driven by the condensation of particles that move along subdimensional manifolds. We propose that this type of quantum critical point can be realized in a bilayer of crossed Rydberg chains. This system exhibits a transition between a disordered phase and a charge-density-wave phase with subextensive ground state degeneracy. The transition is described by a stack of critical Ising conformal field theories that become decoupled in the low-energy limit due to emergent subsystem symmetries. 
We discuss the unusual scaling properties and derive anisotropic correlators that provide signatures of subdimensional criticality in a realistic setup.","sentences":["Fractonic matter can undergo unconventional phase transitions driven by the condensation of particles that move along subdimensional manifolds.","We propose that this type of quantum critical point can be realized in a bilayer of crossed Rydberg chains.","This system exhibits a transition between a disordered phase and a charge-density-wave phase with subextensive ground state degeneracy.","The transition is described by a stack of critical Ising conformal field theories that become decoupled in the low-energy limit due to emergent subsystem symmetries.","We discuss the unusual scaling properties and derive anisotropic correlators that provide signatures of subdimensional criticality in a realistic setup."],"url":"http://arxiv.org/abs/2405.02248v1","category":"cond-mat.str-el"} +{"created":"2024-05-03 16:53:09","title":"The JWST EXCELS survey: Too much, too young, too fast? Ultra-massive quiescent galaxies at 3 < z < 5","abstract":"We report ultra-deep, medium-resolution spectroscopic observations for 4 quiescent galaxies with log$_{10}(M_*/\\mathrm{M_\\odot})>11$ at $3 < z < 5$. These data were obtained with JWST NIRSpec as part of the Early eXtragalactic Continuum and Emission Line Science (EXCELS) survey, which we introduce in this work. The first pair of galaxies are newly selected from PRIMER UDS imaging, both at $z=4.62$ and separated by $860$ pkpc on the sky, within a larger structure for which we confirm several other members. These galaxies formed at $z\\simeq8-10$, and, despite their similar stellar masses, ages, and their proximity, they exhibit very different stellar metallicities, hinting at different formation pathways. These systems could plausibly merge by the present day to produce a local massive elliptical galaxy. 
The other 2 ultra-massive quiescent galaxies are previously known at $z=3.99$ and $3.19$, with the latter (ZF-UDS-7329) having been the subject of debate as potentially too old and too massive to be accommodated by the $\\Lambda$-CDM halo-mass function. Both exhibit high stellar metallicities, and for ZF-UDS-7329 we are able to measure the $\\alpha-$enhancement, obtaining [Mg/Fe] = $0.42^{+0.19}_{-0.17}$. We finally evaluate whether these 4 galaxies are consistent with the $\\Lambda$-CDM halo-mass function using an extreme value statistics approach. We find that the $z=4.62$ objects and the $z=3.19$ object are unlikely within our area under the assumption of standard stellar fractions ($f_*\\simeq0.1-0.2$). However, these objects roughly align with the most massive galaxies expected under the assumption of 100 per cent conversion of baryons to stars ($f_*$=1). Our results suggest extreme galaxy formation physics during the first billion years, but no conflict with $\\Lambda$-CDM cosmology.","sentences":["We report ultra-deep, medium-resolution spectroscopic observations for 4 quiescent galaxies with log$_{10}(M_*/\\mathrm{M_\\odot})>11$ at $3 < z < 5$.","These data were obtained with JWST NIRSpec as part of the Early eXtragalactic Continuum and Emission Line Science (EXCELS) survey, which we introduce in this work.","The first pair of galaxies are newly selected from PRIMER UDS imaging, both at $z=4.62$ and separated by $860$ pkpc on the sky, within a larger structure for which we confirm several other members.","These galaxies formed at $z\\simeq8-10$, and, despite their similar stellar masses, ages, and their proximity, they exhibit very different stellar metallicities, hinting at different formation pathways.","These systems could plausibly merge by the present day to produce a local massive elliptical galaxy.","The other 2 ultra-massive quiescent galaxies are previously known at $z=3.99$ and $3.19$, with the latter (ZF-UDS-7329) having been the subject of 
debate as potentially too old and too massive to be accommodated by the $\\Lambda$-CDM halo-mass function.","Both exhibit high stellar metallicities, and for ZF-UDS-7329 we are able to measure the $\\alpha-$enhancement, obtaining [Mg/Fe] = $0.42^{+0.19}_{-0.17}$. We finally evaluate whether these 4 galaxies are consistent with the $\\Lambda$-CDM halo-mass function using an extreme value statistics approach.","We find that the $z=4.62$ objects and the $z=3.19$ object are unlikely within our area under the assumption of standard stellar fractions ($f_*\\simeq0.1-0.2$).","However, these objects roughly align with the most massive galaxies expected under the assumption of 100 per cent conversion of baryons to stars ($f_*$=1).","Our results suggest extreme galaxy formation physics during the first billion years, but no conflict with $\\Lambda$-CDM cosmology."],"url":"http://arxiv.org/abs/2405.02242v1","category":"astro-ph.GA"} +{"created":"2024-05-03 16:50:06","title":"Supersymmetric Quantum Mechanics on a noncommutative plane through the lens of deformation quantization","abstract":"A gauge invariant mathematical formalism based on deformation quantization is outlined to model an $\\mathcal{N}=2$ supersymmetric system of a spin $1/2$ charged particle placed in a noncommutative plane under the influence of a vertical uniform magnetic field. The noncommutative involutive algebra $(C^{\\infty}(\\mathbb{R}^{2})[[\\vartheta]],*^r)$ of formal power series in $\\vartheta$ with coefficients in the commutative ring $C^{\\infty}(\\mathbb{R}^{2})$ was employed to construct the relevant observables, viz., SUSY Hamiltonian $H$, supercharge operator $Q$ and its adjoint $Q^{\\dag}$ all belonging to the $2\\times 2$ matrix algebra $\\mathcal{M}_{2}(C^{\\infty}(\\mathbb{R}^{2})[[\\vartheta]],*^r)$ with the help of a family of gauge-equivalent star products $*^{r}$. 
The energy eigenvalues of the SUSY Hamiltonian all turned out to be independent of not only the gauge parameter $r$ but also the noncommutativity parameter $\\vartheta$. The nontrivial Fermionic ground state was subsequently computed associated with the zero energy which indicates that supersymmetry remains unbroken in all orders of $\\vartheta$. The Witten index for the noncommutative SUSY Landau problem turns out to be $-1$ corroborating the fact that there is no broken supersymmetry for the model we are considering.","sentences":["A gauge invariant mathematical formalism based on deformation quantization is outlined to model an $\\mathcal{N}=2$ supersymmetric system of a spin $1/2$ charged particle placed in a noncommutative plane under the influence of a vertical uniform magnetic field.","The noncommutative involutive algebra $(C^{\\infty}(\\mathbb{R}^{2})[[\\vartheta]],*^r)$ of formal power series in $\\vartheta$ with coefficients in the commutative ring $C^{\\infty}(\\mathbb{R}^{2})$ was employed to construct the relevant observables, viz., SUSY Hamiltonian $H$, supercharge operator $Q$ and its adjoint $Q^{\\dag}$ all belonging to the $2\\times 2$ matrix algebra $\\mathcal{M}_{2}(C^{\\infty}(\\mathbb{R}^{2})[[\\vartheta]],*^r)$ with the help of a family of gauge-equivalent star products $*^{r}$. The energy eigenvalues of the SUSY Hamiltonian all turned out to be independent of not only the gauge parameter $r$","but also the noncommutativity parameter $\\vartheta$. The nontrivial Fermionic ground state was subsequently computed associated with the zero energy which indicates that supersymmetry remains unbroken in all orders of $\\vartheta$. 
The Witten index for the noncommutative SUSY Landau problem turns out to be $-1$ corroborating the fact that there is no broken supersymmetry for the model we are considering."],"url":"http://arxiv.org/abs/2405.02239v1","category":"math-ph"} +{"created":"2024-05-03 16:45:15","title":"Learning Optimal Deterministic Policies with Stochastic Policy Gradients","abstract":"Policy gradient (PG) methods are successful approaches to deal with continuous reinforcement learning (RL) problems. They learn stochastic parametric (hyper)policies by either exploring in the space of actions or in the space of parameters. Stochastic controllers, however, are often undesirable from a practical perspective because of their lack of robustness, safety, and traceability. In common practice, stochastic (hyper)policies are learned only to deploy their deterministic version. In this paper, we make a step towards the theoretical understanding of this practice. After introducing a novel framework for modeling this scenario, we study the global convergence to the best deterministic policy, under (weak) gradient domination assumptions. Then, we illustrate how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy. 
Finally, we quantitatively compare action-based and parameter-based exploration, giving a formal guise to intuitive results.","sentences":["Policy gradient (PG) methods are successful approaches to deal with continuous reinforcement learning (RL) problems.","They learn stochastic parametric (hyper)policies by either exploring in the space of actions or in the space of parameters.","Stochastic controllers, however, are often undesirable from a practical perspective because of their lack of robustness, safety, and traceability.","In common practice, stochastic (hyper)policies are learned only to deploy their deterministic version.","In this paper, we make a step towards the theoretical understanding of this practice.","After introducing a novel framework for modeling this scenario, we study the global convergence to the best deterministic policy, under (weak) gradient domination assumptions.","Then, we illustrate how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.","Finally, we quantitatively compare action-based and parameter-based exploration, giving a formal guise to intuitive results."],"url":"http://arxiv.org/abs/2405.02235v1","category":"cs.LG"} +{"created":"2024-05-03 16:41:40","title":"From Proof Complexity to Circuit Complexity via Interactive Protocols","abstract":"Folklore in complexity theory suspects that circuit lower bounds against $\\mathbf{NC}^1$ or $\\mathbf{P}/\\operatorname{poly}$, currently out of reach, are a necessary step towards proving strong proof complexity lower bounds for systems like Frege or Extended Frege. Establishing such a connection formally, however, is already daunting, as it would imply the breakthrough separation $\\mathbf{NEXP} \\not\\subseteq \\mathbf{P}/\\operatorname{poly}$, as recently observed by Pich and Santhanam (2023). 
We show such a connection conditionally for the Implicit Extended Frege proof system ($\\mathsf{iEF}$) introduced by Kraj\\'i\\v{c}ek (The Journal of Symbolic Logic, 2004), capable of formalizing most of contemporary complexity theory. In particular, we show that if $\\mathsf{iEF}$ proves efficiently the standard derandomization assumption that a concrete Boolean function is hard on average for subexponential-size circuits, then any superpolynomial lower bound on the length of $\\mathsf{iEF}$ proofs implies $\\#\\mathbf{P} \\not\\subseteq \\mathbf{FP}/\\operatorname{poly}$ (which would in turn imply, for example, $\\mathbf{PSPACE} \\not\\subseteq \\mathbf{P}/\\operatorname{poly}$). Our proof exploits the formalization inside $\\mathsf{iEF}$ of the soundness of the sum-check protocol of Lund, Fortnow, Karloff, and Nisan (Journal of the ACM, 1992). This has consequences for the self-provability of circuit upper bounds in $\\mathsf{iEF}$. Interestingly, further improving our result seems to require progress in constructing interactive proof systems with more efficient provers.","sentences":["Folklore in complexity theory suspects that circuit lower bounds against $\\mathbf{NC}^1$ or $\\mathbf{P}/\\operatorname{poly}$, currently out of reach, are a necessary step towards proving strong proof complexity lower bounds for systems like Frege or Extended Frege.","Establishing such a connection formally, however, is already daunting, as it would imply the breakthrough separation $\\mathbf{NEXP} \\not\\subseteq \\mathbf{P}/\\operatorname{poly}$, as recently observed by Pich and Santhanam (2023). 
","We show such a connection conditionally for the Implicit Extended Frege proof system ($\\mathsf{iEF}$) introduced by Kraj\\'i\\v{c}ek (The Journal of Symbolic Logic, 2004), capable of formalizing most of contemporary complexity theory.","In particular, we show that if $\\mathsf{iEF}$ proves efficiently the standard derandomization assumption that a concrete Boolean function is hard on average for subexponential-size circuits, then any superpolynomial lower bound on the length of $\\mathsf{iEF}$ proofs implies $\\#\\mathbf{P} \\not\\subseteq \\mathbf{FP}/\\operatorname{poly}$ (which would in turn imply, for example, $\\mathbf{PSPACE} \\not\\subseteq \\mathbf{P}/\\operatorname{poly}$).","Our proof exploits the formalization inside $\\mathsf{iEF}$ of the soundness of the sum-check protocol of Lund, Fortnow, Karloff, and Nisan (Journal of the ACM, 1992).","This has consequences for the self-provability of circuit upper bounds in $\\mathsf{iEF}$. Interestingly, further improving our result seems to require progress in constructing interactive proof systems with more efficient provers."],"url":"http://arxiv.org/abs/2405.02232v1","category":"cs.CC"} +{"created":"2024-05-03 16:25:27","title":"FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems","abstract":"This paper presents a framework for evaluating fairness in recommender systems powered by Large Language Models (RecLLMs), addressing the need for a unified approach that spans various fairness dimensions including sensitivity to user attributes, intrinsic fairness, and discussions of fairness based on underlying benefits. In addition, our framework introduces counterfactual evaluations and integrates diverse user group considerations to enhance the discourse on fairness evaluation for RecLLMs. 
Our key contributions include the development of a robust framework for fairness evaluation in LLM-based recommendations and a structured method to create \\textit{informative user profiles} from demographic data, historical user preferences, and recent interactions. We argue that the latter is essential for enhancing personalization in such systems, especially in temporal-driven scenarios. We demonstrate the utility of our framework through practical applications on two datasets, LastFM-1K and ML-1M. We conduct experiments on a subsample of 80 users from each dataset, testing and assessing the effectiveness of various prompt construction scenarios and in-context learning, comprising more than 50 scenarios. This results in more than 4000 recommendations (80 * 50 = 4000). Our study reveals that while there are no significant unfairness issues in scenarios involving sensitive attributes, some concerns remain. However, in terms of intrinsic fairness, which does not involve direct sensitivity, unfairness across demographic groups remains significant. The code and data used for this paper are available at: \\url{https://shorturl.at/awBFM}.","sentences":["This paper presents a framework for evaluating fairness in recommender systems powered by Large Language Models (RecLLMs), addressing the need for a unified approach that spans various fairness dimensions including sensitivity to user attributes, intrinsic fairness, and discussions of fairness based on underlying benefits.","In addition, our framework introduces counterfactual evaluations and integrates diverse user group considerations to enhance the discourse on fairness evaluation for RecLLMs. 
","Our key contributions include the development of a robust framework for fairness evaluation in LLM-based recommendations and a structured method to create \\textit{informative user profiles} from demographic data, historical user preferences, and recent interactions.","We argue that the latter is essential for enhancing personalization in such systems, especially in temporal-driven scenarios.","We demonstrate the utility of our framework through practical applications on two datasets, LastFM-1K and ML-1M. We conduct experiments on a subsample of 80 users from each dataset, testing and assessing the effectiveness of various prompt construction scenarios and in-context learning, comprising more than 50 scenarios.","This results in more than 4000 recommendations (80 * 50 = 4000).","Our study reveals that while there are no significant unfairness issues in scenarios involving sensitive attributes, some concerns remain.","However, in terms of intrinsic fairness, which does not involve direct sensitivity, unfairness across demographic groups remains significant.","The code and data used for this paper are available at: \\url{https://shorturl.at/awBFM}."],"url":"http://arxiv.org/abs/2405.02219v1","category":"cs.IR"} +{"created":"2024-05-03 16:22:04","title":"Integrating Molecular Dynamics Simulations and Experimental Data for Azeotrope Predictions in Binary Mixtures","abstract":"An azeotrope is a constant boiling point mixture, and its behavior is important for fluid separation processes. Predicting azeotropes from atomistic simulations is difficult, due to the complexities and convergence problems in Monte Carlo and free-energy perturbation techniques. Here, we present a methodology for predicting the azeotropes of binary mixtures, which computes the compositional dependence of chemical potentials from molecular dynamics simulations using the S0 method, and employs experimental boiling point and vaporization enthalpy data. 
Using this methodology, we reproduce the azeotropes or the lack of in five case studies, including ethanol/water, ethanol/isooctane, methanol/water, hydrazine/water, and acetone/chloroform mixtures. We find that it is crucial to use the experimental boiling point and vaporization enthalpy for reliable azeotrope predictions, as empirical force fields are not accurate enough for these quantities. Finally, we use regular solution models to rationalize the azeotropes, and reveal that they tend to form when the mixture components have similar boiling points and strong interactions.","sentences":["An azeotrope is a constant boiling point mixture, and its behavior is important for fluid separation processes.","Predicting azeotropes from atomistic simulations is difficult, due to the complexities and convergence problems in Monte Carlo and free-energy perturbation techniques.","Here, we present a methodology for predicting the azeotropes of binary mixtures, which computes the compositional dependence of chemical potentials from molecular dynamics simulations using the S0 method, and employs experimental boiling point and vaporization enthalpy data.","Using this methodology, we reproduce the azeotropes or the lack of in five case studies, including ethanol/water, ethanol/isooctane, methanol/water, hydrazine/water, and acetone/chloroform mixtures.","We find that it is crucial to use the experimental boiling point and vaporization enthalpy for reliable azeotrope predictions, as empirical force fields are not accurate enough for these quantities.","Finally, we use regular solution models to rationalize the azeotropes, and reveal that they tend to form when the mixture components have similar boiling points and strong interactions."],"url":"http://arxiv.org/abs/2405.02216v1","category":"physics.chem-ph"} +{"created":"2024-05-03 16:14:01","title":"Piezoelectric microresonators for sensitive spin detection","abstract":"Piezoelectric microresonators are indispensable in wireless 
communications, and underpin radio frequency filtering in mobile phones. These devices are usually analyzed in the quasi-(electro)static regime with the magnetic field effectively ignored. On the other hand, at GHz frequencies and especially in piezoelectric devices exploiting strong dimensional confinement of acoustic fields, the surface magnetic fields ($B_{1}$) can be significant. This $B_1$ field, which oscillates at GHz frequencies, but is confined to ${\\mu}$m-scale wavelengths provides a natural route to efficiently interface with nanoscale spin systems. We show through scaling arguments that $B_1{\\propto}f^2$ for tightly focused acoustic fields at a given operation frequency $f$. We demonstrate the existence of these surface magnetic fields in a proof-of-principle experiment by showing excess power absorption at the focus of a surface acoustic wave (SAW), when a polished Yttrium-Iron-Garnet (YIG) sphere is positioned in the evanescent field, and the magnon resonance is tuned across the SAW transmission. 
Finally, we outline the prospects for sensitive spin detection using small mode volume piezoelectric microresonators, including the feasibility of electrical detection of single spins at cryogenic temperatures.","sentences":["Piezoelectric microresonators are indispensable in wireless communications, and underpin radio frequency filtering in mobile phones.","These devices are usually analyzed in the quasi-(electro)static regime with the magnetic field effectively ignored.","On the other hand, at GHz frequencies and especially in piezoelectric devices exploiting strong dimensional confinement of acoustic fields, the surface magnetic fields ($B_{1}$) can be significant.","This $B_1$ field, which oscillates at GHz frequencies, but is confined to ${\\mu}$m-scale wavelengths provides a natural route to efficiently interface with nanoscale spin systems.","We show through scaling arguments that $B_1{\\propto}f^2$ for tightly focused acoustic fields at a given operation frequency $f$. We demonstrate the existence of these surface magnetic fields in a proof-of-principle experiment by showing excess power absorption at the focus of a surface acoustic wave (SAW), when a polished Yttrium-Iron-Garnet (YIG) sphere is positioned in the evanescent field, and the magnon resonance is tuned across the SAW transmission.","Finally, we outline the prospects for sensitive spin detection using small mode volume piezoelectric microresonators, including the feasibility of electrical detection of single spins at cryogenic temperatures."],"url":"http://arxiv.org/abs/2405.02212v1","category":"physics.app-ph"} +{"created":"2024-05-03 16:12:02","title":"Performance Analysis of an Optimization Algorithm for Metamaterial Design on the Integrated High-Performance Computing and Quantum Systems","abstract":"Optimizing metamaterials with complex geometries is a big challenge. 
Although an active learning algorithm, combining machine learning (ML), quantum computing, and optical simulation, has emerged as an efficient optimization tool, it still faces difficulties in optimizing complex structures that have potentially high performance. In this work, we comprehensively analyze the performance of an optimization algorithm for metamaterial design on the integrated HPC and quantum systems. We demonstrate significant time advantages through message-passing interface (MPI) parallelization on the high-performance computing (HPC) system showing approximately 54% faster ML tasks and 67 times faster optical simulation against serial workloads. Furthermore, we analyze the performance of a quantum algorithm designed for optimization, which runs with various quantum simulators on a local computer or HPC-quantum system. Results showcase ~24 times speedup when executing the optimization algorithm on the HPC-quantum hybrid system. This study paves a way to optimize complex metamaterials using the integrated HPC-quantum system.","sentences":["Optimizing metamaterials with complex geometries is a big challenge.","Although an active learning algorithm, combining machine learning (ML), quantum computing, and optical simulation, has emerged as an efficient optimization tool, it still faces difficulties in optimizing complex structures that have potentially high performance.","In this work, we comprehensively analyze the performance of an optimization algorithm for metamaterial design on the integrated HPC and quantum systems.","We demonstrate significant time advantages through message-passing interface (MPI) parallelization on the high-performance computing (HPC) system showing approximately 54% faster ML tasks and 67 times faster optical simulation against serial workloads.","Furthermore, we analyze the performance of a quantum algorithm designed for optimization, which runs with various quantum simulators on a local computer or HPC-quantum 
system.","Results showcase ~24 times speedup when executing the optimization algorithm on the HPC-quantum hybrid system.","This study paves a way to optimize complex metamaterials using the integrated HPC-quantum system."],"url":"http://arxiv.org/abs/2405.02211v1","category":"quant-ph"} +{"created":"2024-05-03 16:10:30","title":"Zilber's Trichotomy in Hausdorff Geometric Structures","abstract":"We give a new axiomatic treatment of the Zilber trichotomy, and use it to complete the proof of the trichotomy for relics of algebraically closed fields, i.e., reducts of the ACF-induced structure on ACF-definable sets. More precisely, we introduce a class of geometric structures equipped with a Hausdorff topology, called \\textit{Hausdorff geometric structures}. Natural examples include the complex field; algebraically closed valued fields; o-minimal expansions of real closed fields; and characteristic zero Henselian fields (in particular $p$-adically closed fields). We then study the Zilber trichotomy for relics of Hausdorff geometric structures, showing that under additional assumptions, every non-locally modular strongly minimal relic on a real sort interprets a one-dimensional group. Combined with recent results, this allows us to prove the trichotomy for strongly minimal relics on the real sorts of algebraically closed valued fields. 
Finally, we make progress on the imaginary sorts, reducing the trichotomy for \\textit{all} ACVF relics (in all sorts) to a conjectural technical condition that we prove in characteristic $(0,0)$.","sentences":["We give a new axiomatic treatment of the Zilber trichotomy, and use it to complete the proof of the trichotomy for relics of algebraically closed fields, i.e., reducts of the ACF-induced structure on ACF-definable sets.","More precisely, we introduce a class of geometric structures equipped with a Hausdorff topology, called \\textit{Hausdorff geometric structures}.","Natural examples include the complex field; algebraically closed valued fields; o-minimal expansions of real closed fields; and characteristic zero Henselian fields (in particular $p$-adically closed fields).","We then study the Zilber trichotomy for relics of Hausdorff geometric structures, showing that under additional assumptions, every non-locally modular strongly minimal relic on a real sort interprets a one-dimensional group.","Combined with recent results, this allows us to prove the trichotomy for strongly minimal relics on the real sorts of algebraically closed valued fields. ","Finally, we make progress on the imaginary sorts, reducing the trichotomy for \\textit{all} ACVF relics (in all sorts) to a conjectural technical condition that we prove in characteristic $(0,0)$."],"url":"http://arxiv.org/abs/2405.02209v1","category":"math.LO"} +{"created":"2024-05-03 16:03:24","title":"Pseudo-monodromy and the Mandelbrot set","abstract":"We investigate the discontinuity of codings for the Julia set of a quadratic map. To each parameter ray, we associate a natural coding for Julia sets on the ray. Given a hyperbolic component $H$ of the Mandelbrot set, we consider the codings along the two parameter rays landing on the root point of $H$. Our main result describes the discontinuity of these two coding in terms of the kneading sequences of the hyperbolic components which are conspicuous to $H$. 
This result can be interpreted as a solution to the degenerated case of Lipa's conjecture on the monodromy problem of the horseshoe locus for the complex H\\'enon family.","sentences":["We investigate the discontinuity of codings for the Julia set of a quadratic map.","To each parameter ray, we associate a natural coding for Julia sets on the ray.","Given a hyperbolic component $H$ of the Mandelbrot set, we consider the codings along the two parameter rays landing on the root point of $H$. Our main result describes the discontinuity of these two coding in terms of the kneading sequences of the hyperbolic components which are conspicuous to $H$. This result can be interpreted as a solution to the degenerated case of Lipa's conjecture on the monodromy problem of the horseshoe locus for the complex H\\'enon family."],"url":"http://arxiv.org/abs/2405.02204v1","category":"math.DS"} +{"created":"2024-05-03 15:54:20","title":"The Cambridge RoboMaster: An Agile Multi-Robot Research Platform","abstract":"Compact robotic platforms with powerful compute and actuation capabilities are key enablers for practical, real-world deployments of multi-agent research. This article introduces a tightly integrated hardware, control, and simulation software stack on a fleet of holonomic ground robot platforms designed with this motivation. Our robots, a fleet of customised DJI Robomaster S1 vehicles, offer a balance between small robots that do not possess sufficient compute or actuation capabilities and larger robots that are unsuitable for indoor multi-robot tests. They run a modular ROS2-based optimal estimation and control stack for full onboard autonomy, contain ad-hoc peer-to-peer communication infrastructure, and can zero-shot run multi-agent reinforcement learning (MARL) policies trained in our vectorized multi-agent simulation framework. 
We present an in-depth review of other platforms currently available, showcase new experimental validation of our system's capabilities, and introduce case studies that highlight the versatility and reliabilty of our system as a testbed for a wide range of research demonstrations. Our system as well as supplementary material is available online: https://proroklab.github.io/cambridge-robomaster","sentences":["Compact robotic platforms with powerful compute and actuation capabilities are key enablers for practical, real-world deployments of multi-agent research.","This article introduces a tightly integrated hardware, control, and simulation software stack on a fleet of holonomic ground robot platforms designed with this motivation.","Our robots, a fleet of customised DJI Robomaster S1 vehicles, offer a balance between small robots that do not possess sufficient compute or actuation capabilities and larger robots that are unsuitable for indoor multi-robot tests.","They run a modular ROS2-based optimal estimation and control stack for full onboard autonomy, contain ad-hoc peer-to-peer communication infrastructure, and can zero-shot run multi-agent reinforcement learning (MARL) policies trained in our vectorized multi-agent simulation framework.","We present an in-depth review of other platforms currently available, showcase new experimental validation of our system's capabilities, and introduce case studies that highlight the versatility and reliabilty of our system as a testbed for a wide range of research demonstrations.","Our system as well as supplementary material is available online: https://proroklab.github.io/cambridge-robomaster"],"url":"http://arxiv.org/abs/2405.02198v1","category":"cs.RO"} +{"created":"2024-05-03 15:51:02","title":"Impact of emoji exclusion on the performance of Arabic sarcasm detection models","abstract":"The complex challenge of detecting sarcasm in Arabic speech on social media is increased by the language diversity and the nature of 
sarcastic expressions. There is a significant gap in the capability of existing models to effectively interpret sarcasm in Arabic, which mandates the necessity for more sophisticated and precise detection methods. In this paper, we investigate the impact of a fundamental preprocessing component on sarcasm speech detection. While emojis play a crucial role in mitigating the absence effect of body language and facial expressions in modern communication, their impact on automated text analysis, particularly in sarcasm detection, remains underexplored. We investigate the impact of emoji exclusion from datasets on the performance of sarcasm detection models in social media content for Arabic as a vocabulary-super rich language. This investigation includes the adaptation and enhancement of AraBERT pre-training models, specifically by excluding emojis, to improve sarcasm detection capabilities. We use AraBERT pre-training to refine the specified models, demonstrating that the removal of emojis can significantly boost the accuracy of sarcasm detection. This approach facilitates a more refined interpretation of language, eliminating the potential confusion introduced by non-textual elements. The evaluated AraBERT models, through the focused strategy of emoji removal, adeptly navigate the complexities of Arabic sarcasm. 
This study establishes new benchmarks in Arabic natural language processing and presents valuable insights for social media platforms.","sentences":["The complex challenge of detecting sarcasm in Arabic speech on social media is increased by the language diversity and the nature of sarcastic expressions.","There is a significant gap in the capability of existing models to effectively interpret sarcasm in Arabic, which mandates the necessity for more sophisticated and precise detection methods.","In this paper, we investigate the impact of a fundamental preprocessing component on sarcasm speech detection.","While emojis play a crucial role in mitigating the absence effect of body language and facial expressions in modern communication, their impact on automated text analysis, particularly in sarcasm detection, remains underexplored.","We investigate the impact of emoji exclusion from datasets on the performance of sarcasm detection models in social media content for Arabic as a vocabulary-super rich language.","This investigation includes the adaptation and enhancement of AraBERT pre-training models, specifically by excluding emojis, to improve sarcasm detection capabilities.","We use AraBERT pre-training to refine the specified models, demonstrating that the removal of emojis can significantly boost the accuracy of sarcasm detection.","This approach facilitates a more refined interpretation of language, eliminating the potential confusion introduced by non-textual elements.","The evaluated AraBERT models, through the focused strategy of emoji removal, adeptly navigate the complexities of Arabic sarcasm.","This study establishes new benchmarks in Arabic natural language processing and presents valuable insights for social media platforms."],"url":"http://arxiv.org/abs/2405.02195v1","category":"cs.CL"} +{"created":"2024-05-03 15:42:44","title":"X-SLAM: Scalable Dense SLAM for Task-aware Optimization using CSFD","abstract":"We present X-SLAM, a real-time dense 
differentiable SLAM system that leverages the complex-step finite difference (CSFD) method for efficient calculation of numerical derivatives, bypassing the need for a large-scale computational graph. The key to our approach is treating the SLAM process as a differentiable function, enabling the calculation of the derivatives of important SLAM parameters through Taylor series expansion within the complex domain. Our system allows for the real-time calculation of not just the gradient, but also higher-order differentiation. This facilitates the use of high-order optimizers to achieve better accuracy and faster convergence. Building on X-SLAM, we implemented end-to-end optimization frameworks for two important tasks: camera relocalization in wide outdoor scenes and active robotic scanning in complex indoor environments. Comprehensive evaluations on public benchmarks and intricate real scenes underscore the improvements in the accuracy of camera relocalization and the efficiency of robotic navigation achieved through our task-aware optimization. 
The code and data are available at https://gapszju.github.io/X-SLAM.","sentences":["We present X-SLAM, a real-time dense differentiable SLAM system that leverages the complex-step finite difference (CSFD) method for efficient calculation of numerical derivatives, bypassing the need for a large-scale computational graph.","The key to our approach is treating the SLAM process as a differentiable function, enabling the calculation of the derivatives of important SLAM parameters through Taylor series expansion within the complex domain.","Our system allows for the real-time calculation of not just the gradient, but also higher-order differentiation.","This facilitates the use of high-order optimizers to achieve better accuracy and faster convergence.","Building on X-SLAM, we implemented end-to-end optimization frameworks for two important tasks: camera relocalization in wide outdoor scenes and active robotic scanning in complex indoor environments.","Comprehensive evaluations on public benchmarks and intricate real scenes underscore the improvements in the accuracy of camera relocalization and the efficiency of robotic navigation achieved through our task-aware optimization.","The code and data are available at https://gapszju.github.io/X-SLAM."],"url":"http://arxiv.org/abs/2405.02187v1","category":"cs.RO"} +{"created":"2024-05-03 15:36:50","title":"Hybrid Lyapunov-based feedback stabilization of bipedal locomotion based on reference spreading","abstract":"We propose a hybrid formulation of the linear inverted pendulum model for bipedal locomotion, where the foot switches are triggered based on the center of mass position, removing the need for pre-defined footstep timings. Using a concept similar to reference spreading, we define nontrivial tracking error coordinates induced by our hybrid model. 
These coordinates enjoy desirable linear flow dynamics and rather elegant jump dynamics perturbed by a suitable extended class ${\\mathcal K}_\\infty$ function of the position error. We stabilize this hybrid error dynamics using a saturated feedback controller, selecting its gains by solving a convex optimization problem. We prove local asymptotic stability of the tracking error and provide a certified estimate of the basin of attraction, comparing it with a numerical estimate obtained from the integration of the closed-loop dynamics. Simulations on a full-body model of a real robot show the practical applicability of the proposed framework and its advantages with respect to a standard model predictive control formulation.","sentences":["We propose a hybrid formulation of the linear inverted pendulum model for bipedal locomotion, where the foot switches are triggered based on the center of mass position, removing the need for pre-defined footstep timings.","Using a concept similar to reference spreading, we define nontrivial tracking error coordinates induced by our hybrid model.","These coordinates enjoy desirable linear flow dynamics and rather elegant jump dynamics perturbed by a suitable extended class ${\\mathcal K}_\\infty$ function of the position error.","We stabilize this hybrid error dynamics using a saturated feedback controller, selecting its gains by solving a convex optimization problem.","We prove local asymptotic stability of the tracking error and provide a certified estimate of the basin of attraction, comparing it with a numerical estimate obtained from the integration of the closed-loop dynamics.","Simulations on a full-body model of a real robot show the practical applicability of the proposed framework and its advantages with respect to a standard model predictive control formulation."],"url":"http://arxiv.org/abs/2405.02184v1","category":"eess.SY"} +{"created":"2024-05-03 15:26:22","title":"Panoptic-SLAM: Visual SLAM in Dynamic Environments 
using Panoptic Segmentation","abstract":"The majority of visual SLAM systems are not robust in dynamic scenarios. The ones that deal with dynamic objects in the scenes usually rely on deep-learning-based methods to detect and filter these objects. However, these methods cannot deal with unknown moving objects. This work presents Panoptic-SLAM, an open-source visual SLAM system robust to dynamic environments, even in the presence of unknown objects. It uses panoptic segmentation to filter dynamic objects from the scene during the state estimation process. Panoptic-SLAM is based on ORB-SLAM3, a state-of-the-art SLAM system for static environments. The implementation was tested using real-world datasets and compared with several state-of-the-art systems from the literature, including DynaSLAM, DS-SLAM, SaD-SLAM, PVO and FusingPanoptic. For example, Panoptic-SLAM is on average four times more accurate than PVO, the most recent panoptic-based approach for visual SLAM. Also, experiments were performed using a quadruped robot with an RGB-D camera to test the applicability of our method in real-world scenarios. 
The tests were validated by a ground-truth created with a motion capture system.","sentences":["The majority of visual SLAM systems are not robust in dynamic scenarios.","The ones that deal with dynamic objects in the scenes usually rely on deep-learning-based methods to detect and filter these objects.","However, these methods cannot deal with unknown moving objects.","This work presents Panoptic-SLAM, an open-source visual SLAM system robust to dynamic environments, even in the presence of unknown objects.","It uses panoptic segmentation to filter dynamic objects from the scene during the state estimation process.","Panoptic-SLAM is based on ORB-SLAM3, a state-of-the-art SLAM system for static environments.","The implementation was tested using real-world datasets and compared with several state-of-the-art systems from the literature, including DynaSLAM, DS-SLAM, SaD-SLAM, PVO and FusingPanoptic.","For example, Panoptic-SLAM is on average four times more accurate than PVO, the most recent panoptic-based approach for visual SLAM.","Also, experiments were performed using a quadruped robot with an RGB-D camera to test the applicability of our method in real-world scenarios.","The tests were validated by a ground-truth created with a motion capture system."],"url":"http://arxiv.org/abs/2405.02177v1","category":"cs.RO"} +{"created":"2024-05-03 15:18:30","title":"Transimpedance Amplifier with Automatic Gain Control Based on Memristors for Optical Signal Acquisition","abstract":"Transimpedance amplifiers (TIA) play a crucial role in various electronic systems, especially in optical signal acquisition. However, their performance is often hampered by saturation issues due to high input currents, leading to prolonged recovery times. This paper addresses this challenge by introducing a novel approach utilizing a memristive automatic gain control (AGC) to adjust the TIA's gain and enhance its dynamic range. 
We replace the typical feedback resistor of a TIA with a valence-change mechanism (VCM) memristor. This substitution enables the TIA to adapt to a broader range of input signals, leveraging the substantial OFF/ON resistance ratio of the memristor. This paper also presents the reading and resetting sub-circuits essential for monitoring and controling the memristor's state. The proposed circuit is evaluated through SPICE simulations. Furthermore, we extend our evaluation to practical testing using a printed circuit board (PCB) integrating the TIA and memristor. We show a remarkable 40 dB increase in the dynamic range of our TIA memristor circuit compared to traditional resistor-based TIAs.","sentences":["Transimpedance amplifiers (TIA) play a crucial role in various electronic systems, especially in optical signal acquisition.","However, their performance is often hampered by saturation issues due to high input currents, leading to prolonged recovery times.","This paper addresses this challenge by introducing a novel approach utilizing a memristive automatic gain control (AGC) to adjust the TIA's gain and enhance its dynamic range.","We replace the typical feedback resistor of a TIA with a valence-change mechanism (VCM) memristor.","This substitution enables the TIA to adapt to a broader range of input signals, leveraging the substantial OFF/ON resistance ratio of the memristor.","This paper also presents the reading and resetting sub-circuits essential for monitoring and controling the memristor's state.","The proposed circuit is evaluated through SPICE simulations.","Furthermore, we extend our evaluation to practical testing using a printed circuit board (PCB) integrating the TIA and memristor.","We show a remarkable 40 dB increase in the dynamic range of our TIA memristor circuit compared to traditional resistor-based TIAs."],"url":"http://arxiv.org/abs/2405.02169v1","category":"cs.AR"} +{"created":"2024-05-03 15:15:09","title":"Symmetry-enforced metal-insulator 
transition and topological adiabatic charge pump in sliding bilayers of threefold symmetric materials","abstract":"Sliding bilayers are systems that exploit the possibility of relatively translating two monolayers along a specific direction in real space, such that different stackings could be implemented in the process. This simple approach allows for manipulating the electronic properties of layered materials similarly as in twisted multilayers. In this work, the sliding of bilayers, composed of one type of monolayer with spatial symmetry described by space group P$\\bar{3}1m$ is studied. Using a minimal tight-binding model along with symmetry analysis, we propose two effects that arise in a specific sliding direction. First, the sliding-induced control of the band gap magnitude, which produces a metal-insulator transition, is demonstrated. In addition, the potential to achieve a topological adiabatic charge pump for cyclic sliding is discussed. For each effect, we also present material implementations using first-principles calculations. Bilayer GaS is selected for the metal-insulator transition and bilayer transition metal dichalcogenide ZrS$_2$ is found to display the topological pump effect. 
Both realizations show good agreement with the predictions of the model.","sentences":["Sliding bilayers are systems that exploit the possibility of relatively translating two monolayers along a specific direction in real space, such that different stackings could be implemented in the process.","This simple approach allows for manipulating the electronic properties of layered materials similarly as in twisted multilayers.","In this work, the sliding of bilayers, composed of one type of monolayer with spatial symmetry described by space group P$\\bar{3}1m$ is studied.","Using a minimal tight-binding model along with symmetry analysis, we propose two effects that arise in a specific sliding direction.","First, the sliding-induced control of the band gap magnitude, which produces a metal-insulator transition, is demonstrated.","In addition, the potential to achieve a topological adiabatic charge pump for cyclic sliding is discussed.","For each effect, we also present material implementations using first-principles calculations.","Bilayer GaS is selected for the metal-insulator transition and bilayer transition metal dichalcogenide ZrS$_2$ is found to display the topological pump effect.","Both realizations show good agreement with the predictions of the model."],"url":"http://arxiv.org/abs/2405.02167v1","category":"cond-mat.mes-hall"} +{"created":"2024-05-03 15:07:49","title":"On the origin of circular rolls in rotor-stator flow","abstract":"Rotor-stator flows are known to exhibit instabilities in the form of circular and spiral rolls. While the spirals are known to emanate from a supercritical Hopf bifurcation, the origin of the circular rolls is still unclear. In the present work we suggest a quantitative scenario for the circular rolls as a response of the system to external forcing. We consider two types of axisymmetric forcing: bulk forcing (based on the resolvent analysis) and boundary forcing using direct numerical simulation. 
Using the singular value decomposition of the resolvent operator the optimal response is shown to take the form of circular rolls. The linear gain curve shows strong amplification at non-zero frequencies following a pseudo-resonance mechanism. The optimal energy gain is found to grow rapidly with the Reynolds number (based on the rotation rate and interdisc spacing $H$) in connection with huge levels of non-normality. The results for both types of forcing are compared with former experimental works and previous numerical studies. Our findings suggest that the circular rolls observed experimentally are the combined effect of the high forcing gain and the roll-like form of the leading response of the linearised operator. For high enough Reynolds number it is possible to delineate between linear and nonlinear response. For sufficiently strong forcing amplitudes, the nonlinear response is consistent with the self-sustained states found recently for the unforced problem. The onset of such non-trivial dynamics is shown to correspond in state space to a deterministic leaky attractor, as in other subcritical wall-bounded shear flows.","sentences":["Rotor-stator flows are known to exhibit instabilities in the form of circular and spiral rolls.","While the spirals are known to emanate from a supercritical Hopf bifurcation, the origin of the circular rolls is still unclear.","In the present work we suggest a quantitative scenario for the circular rolls as a response of the system to external forcing.","We consider two types of axisymmetric forcing: bulk forcing (based on the resolvent analysis) and boundary forcing using direct numerical simulation.","Using the singular value decomposition of the resolvent operator the optimal response is shown to take the form of circular rolls.","The linear gain curve shows strong amplification at non-zero frequencies following a pseudo-resonance mechanism.","The optimal energy gain is found to grow rapidly with the Reynolds number (based 
on the rotation rate and interdisc spacing $H$) in connection with huge levels of non-normality.","The results for both types of forcing are compared with former experimental works and previous numerical studies.","Our findings suggest that the circular rolls observed experimentally are the combined effect of the high forcing gain and the roll-like form of the leading response of the linearised operator.","For high enough Reynolds number it is possible to delineate between linear and nonlinear response.","For sufficiently strong forcing amplitudes, the nonlinear response is consistent with the self-sustained states found recently for the unforced problem.","The onset of such non-trivial dynamics is shown to correspond in state space to a deterministic leaky attractor, as in other subcritical wall-bounded shear flows."],"url":"http://arxiv.org/abs/2405.02160v1","category":"physics.flu-dyn"} +{"created":"2024-05-03 15:05:30","title":"Dynamics of dilute nuclear matter with light clusters and in-medium effects","abstract":"We investigate the dynamics of dilute systems composed of nucleons and light clusters within a linear response approach, taking into account the in-medium Mott effects on cluster appearance, through a density-dependent momentum cut-off. 
We find that spinodal instabilities and associated growth rates are severely affected by the presence of light clusters and, in particular, by the treatment of in-medium effects, foreshadowing intriguing consequences for fragment formation in heavy-ion collisions and in the broader astrophysical context.","sentences":["We investigate the dynamics of dilute systems composed of nucleons and light clusters within a linear response approach, taking into account the in-medium Mott effects on cluster appearance, through a density-dependent momentum cut-off.","We find that spinodal instabilities and associated growth rates are severely affected by the presence of light clusters and, in particular, by the treatment of in-medium effects, foreshadowing intriguing consequences for fragment formation in heavy-ion collisions and in the broader astrophysical context."],"url":"http://arxiv.org/abs/2405.02157v1","category":"nucl-th"} +{"created":"2024-05-03 14:48:20","title":"MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain","abstract":"Medical texts are notoriously challenging to read. Properly measuring their readability is the first step towards making them more accessible. In this paper, we present a systematic study on fine-grained readability measurements in the medical domain at both sentence-level and span-level. We introduce a new dataset MedReadMe, which consists of manually annotated readability ratings and fine-grained complex span annotation for 4,520 sentences, featuring two novel \"Google-Easy\" and \"Google-Hard\" categories. It supports our quantitative analysis, which covers 650 linguistic features and automatic complex word and jargon identification. 
Enabled by our high-quality annotation, we benchmark and improve several state-of-the-art sentence-level readability metrics for the medical domain specifically, which include unsupervised, supervised, and prompting-based methods using recently developed large language models (LLMs). Informed by our fine-grained complex span annotation, we find that adding a single feature, capturing the number of jargon spans, into existing readability formulas can significantly improve their correlation with human judgments. We will publicly release the dataset and code.","sentences":["Medical texts are notoriously challenging to read.","Properly measuring their readability is the first step towards making them more accessible.","In this paper, we present a systematic study on fine-grained readability measurements in the medical domain at both sentence-level and span-level.","We introduce a new dataset MedReadMe, which consists of manually annotated readability ratings and fine-grained complex span annotation for 4,520 sentences, featuring two novel \"Google-Easy\" and \"Google-Hard\" categories.","It supports our quantitative analysis, which covers 650 linguistic features and automatic complex word and jargon identification.","Enabled by our high-quality annotation, we benchmark and improve several state-of-the-art sentence-level readability metrics for the medical domain specifically, which include unsupervised, supervised, and prompting-based methods using recently developed large language models (LLMs).","Informed by our fine-grained complex span annotation, we find that adding a single feature, capturing the number of jargon spans, into existing readability formulas can significantly improve their correlation with human judgments.","We will publicly release the dataset and code."],"url":"http://arxiv.org/abs/2405.02144v1","category":"cs.CL"} +{"created":"2024-05-03 14:40:20","title":"Global regularity and infinite Prandtl number limit of temperature patches for the 2D 
Boussinesq system","abstract":"We prove global regularity and study the infinite Prandtl number limit of temperature patches for the 2D non-diffusive Boussinesq system with dissipation in the full subcritical regime. The temperature satisfies a transport equation and the temperature initial data are given in the form of non-constant patches. Our first main result is a persistence of regularity of the patches globally in time. More precisely, we prove that if the boundary of the initial temperature patch lies in $C^{k+\\gamma}$ with $k\\geq 1$ and $\\gamma\\in(0,1)$ then this initial regularity is preserved for all time. Importantly, our proof is robust enough to show uniform dependence on the Prandtl number in some cases. This result solves a question in Khor and Xu \\cite{KX22} concerning the global control of the curvature of the patch boundary. Besides, by studying the limit when the Prandtl number goes to infinity, we find that the patch solutions to the 2D Boussinesq-Navier-Stokes system in the torus converge to the unique patch solutions of the (fractional) Stokes-transport equation and that the $C^{k+\\gamma}$ regularity of the patch boundary is globally preserved. 
This allows us to extend the $C^{k+\\gamma}$ persistence result of Grayer II \\cite{Gray23} from the range $k\\in \\{0,1,2\\}$ to the full range $k\\geq 1$.","sentences":["We prove global regularity and study the infinite Prandtl number limit of temperature patches for the 2D non-diffusive Boussinesq system with dissipation in the full subcritical regime.","The temperature satisfies a transport equation and the temperature initial data are given in the form of non-constant patches.","Our first main result is a persistence of regularity of the patches globally in time.","More precisely, we prove that if the boundary of the initial temperature patch lies in $C^{k+\\gamma}$ with $k\\geq 1$ and $\\gamma\\in(0,1)$ then this initial regularity is preserved for all time.","Importantly, our proof is robust enough to show uniform dependence on the Prandtl number in some cases.","This result solves a question in Khor and Xu \\cite{KX22} concerning the global control of the curvature of the patch boundary.","Besides, by studying the limit when the Prandtl number goes to infinity, we find that the patch solutions to the 2D Boussinesq-Navier-Stokes system in the torus converge to the unique patch solutions of the (fractional) Stokes-transport equation and that the $C^{k+\\gamma}$ regularity of the patch boundary is globally preserved.","This allows us to extend the $C^{k+\\gamma}$ persistence result of Grayer II \\cite{Gray23} from the range $k\\in \\{0,1,2\\}$ to the full range $k\\geq 1$."],"url":"http://arxiv.org/abs/2405.02137v1","category":"math.AP"} +{"created":"2024-05-03 14:35:58","title":"Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets","abstract":"Large Language Models have demonstrated unparalleled effectiveness in various NLP tasks, and integrating LLMs with automatic speech recognition is becoming a mainstream paradigm. 
Building upon this momentum, our research delves into an indepth examination of this paradigm on a large opensource Chinese dataset. Specifically, our research aims to evaluate the impact of various configurations of speech encoders, LLMs, and projector modules in the context of the speech foundation encoderLLM ASR paradigm. Furthermore, we introduce a threestage training approach, expressly developed to enhance the model's ability to align auditory and textual information. The implementation of this approach, alongside the strategic integration of ASR components, enabled us to achieve the SOTA performance on the AISHELL1, TestNet, and TestMeeting test sets. Our analysis presents an empirical foundation for future research in LLMbased ASR systems and offers insights into optimizing performance using Chinese datasets. We will publicly release all scripts used for data preparation, training, inference, and scoring, as well as pretrained models and training logs to promote reproducible research.","sentences":["Large Language Models have demonstrated unparalleled effectiveness in various NLP tasks, and integrating LLMs with automatic speech recognition is becoming a mainstream paradigm.","Building upon this momentum, our research delves into an indepth examination of this paradigm on a large opensource Chinese dataset.","Specifically, our research aims to evaluate the impact of various configurations of speech encoders, LLMs, and projector modules in the context of the speech foundation encoderLLM ASR paradigm.","Furthermore, we introduce a threestage training approach, expressly developed to enhance the model's ability to align auditory and textual information.","The implementation of this approach, alongside the strategic integration of ASR components, enabled us to achieve the SOTA performance on the AISHELL1, TestNet, and TestMeeting test sets.","Our analysis presents an empirical foundation for future research in LLMbased ASR systems and offers insights into 
optimizing performance using Chinese datasets.","We will publicly release all scripts used for data preparation, training, inference, and scoring, as well as pretrained models and training logs to promote reproducible research."],"url":"http://arxiv.org/abs/2405.02132v1","category":"cs.SD"} +{"created":"2024-05-03 14:24:59","title":"2 x 2 hyperbolic systems of conservation laws in classes of functions of bounded p-variation","abstract":"In this paper, we consider $2 \\times 2$ hyperbolic systems of conservation laws in one space dimension with characteristic fields satisfying a condition that encompasses genuine nonlinearity and linear degeneracy as well as intermediate cases, namely, with standard notations, $r_i\\cdot \\nabla \\lambda_i \\geq 0$. We prove the existence of entropy solutions in the fractional $BV$ spaces ${W}_p(\\mathbb{R})$ of functions of bounded $p$-variation, $p \\in [1,\\frac{3}{2}]$, for small initial data.","sentences":["In this paper, we consider $2 \\times 2$ hyperbolic systems of conservation laws in one space dimension with characteristic fields satisfying a condition that encompasses genuine nonlinearity and linear degeneracy as well as intermediate cases, namely, with standard notations, $r_i\\cdot \\nabla \\lambda_i \\geq 0$. ","We prove the existence of entropy solutions in the fractional $BV$ spaces ${W}_p(\\mathbb{R})$ of functions of bounded $p$-variation, $p \\in [1,\\frac{3}{2}]$, for small initial data."],"url":"http://arxiv.org/abs/2405.02123v1","category":"math.AP"} +{"created":"2024-05-03 14:24:27","title":"Accurate Pose Prediction on Signed Distance Fields for Mobile Ground Robots in Rough Terrain","abstract":"Autonomous locomotion for mobile ground robots in unstructured environments such as waypoint navigation or flipper control requires a sufficiently accurate prediction of the robot-terrain interaction. 
Heuristics like occupancy grids or traversability maps are widely used but limit actions available to robots with active flippers as joint positions are not taken into account. We present a novel iterative geometric method to predict the 3D pose of mobile ground robots with active flippers on uneven ground with high accuracy and online planning capabilities. This is achieved by utilizing the ability of signed distance fields to represent surfaces with sub-voxel accuracy. The effectiveness of the presented approach is demonstrated on two different tracked robots in simulation and on a real platform. Compared to a tracking system as ground truth, our method predicts the robot position and orientation with an average accuracy of 3.11 cm and 3.91{\\deg}, outperforming a recent heightmap-based approach. The implementation is made available as an open-source ROS package.","sentences":["Autonomous locomotion for mobile ground robots in unstructured environments such as waypoint navigation or flipper control requires a sufficiently accurate prediction of the robot-terrain interaction.","Heuristics like occupancy grids or traversability maps are widely used but limit actions available to robots with active flippers as joint positions are not taken into account.","We present a novel iterative geometric method to predict the 3D pose of mobile ground robots with active flippers on uneven ground with high accuracy and online planning capabilities.","This is achieved by utilizing the ability of signed distance fields to represent surfaces with sub-voxel accuracy.","The effectiveness of the presented approach is demonstrated on two different tracked robots in simulation and on a real platform.","Compared to a tracking system as ground truth, our method predicts the robot position and orientation with an average accuracy of 3.11 cm and 3.91{\\deg}, outperforming a recent heightmap-based approach.","The implementation is made available as an open-source ROS 
package."],"url":"http://arxiv.org/abs/2405.02121v1","category":"cs.RO"} +{"created":"2024-05-03 14:04:51","title":"Got Root? A Linux Priv-Esc Benchmark","abstract":"Linux systems are integral to the infrastructure of modern computing environments, necessitating robust security measures to prevent unauthorized access. Privilege escalation attacks represent a significant threat, typically allowing attackers to elevate their privileges from an initial low-privilege account to the all-powerful root account. A benchmark set of vulnerable systems is of high importance to evaluate the effectiveness of privilege-escalation techniques performed by both humans and automated tooling. Analyzing their behavior allows defenders to better fortify their entrusted Linux systems and thus protect their infrastructure from potentially devastating attacks. To address this gap, we developed a comprehensive benchmark for Linux privilege escalation. It provides a standardized platform to evaluate and compare the performance of human and synthetic actors, e.g., hacking scripts or automated tooling.","sentences":["Linux systems are integral to the infrastructure of modern computing environments, necessitating robust security measures to prevent unauthorized access.","Privilege escalation attacks represent a significant threat, typically allowing attackers to elevate their privileges from an initial low-privilege account to the all-powerful root account. ","A benchmark set of vulnerable systems is of high importance to evaluate the effectiveness of privilege-escalation techniques performed by both humans and automated tooling.","Analyzing their behavior allows defenders to better fortify their entrusted Linux systems and thus protect their infrastructure from potentially devastating attacks. 
","To address this gap, we developed a comprehensive benchmark for Linux privilege escalation.","It provides a standardized platform to evaluate and compare the performance of human and synthetic actors, e.g., hacking scripts or automated tooling."],"url":"http://arxiv.org/abs/2405.02106v1","category":"cs.CR"} +{"created":"2024-05-03 13:58:07","title":"Spectral density of complex eigenvalues and associated mean eigenvector self-overlaps at the edge of elliptic Ginibre ensembles","abstract":"We consider the density of complex eigenvalues, $\\rho(z)$, and the associated mean eigenvector self-overlaps, $\\mathcal{O}(z)$, at the spectral edge of $N \\times N$ real and complex elliptic Ginibre matrices, as $N \\to \\infty$. Two different regimes of ellipticity are studied: strong non-Hermiticity, keeping the ellipticity parameter $\\tau$ fixed and weak non-Hermiticity with $\\tau \\rightarrow 1 $ as $N \\rightarrow \\infty$. At strong non-Hermiticity, we find that both $\\rho(z)$ and $\\mathcal{O}(z)$ have the same leading order behaviour across the elliptic Ginibre ensembles, establishing the expected universality. In the limit of weak non-Hermiticity, we find different results for $\\rho(z)$ and $\\mathcal{O}(z)$ across the two ensembles. This paper is the final of three papers that we have presented addressing the mean self-overlap of eigenvectors in these ensembles.","sentences":["We consider the density of complex eigenvalues, $\\rho(z)$, and the associated mean eigenvector self-overlaps, $\\mathcal{O}(z)$, at the spectral edge of $N \\times N$ real and complex elliptic Ginibre matrices, as $N \\to \\infty$. Two different regimes of ellipticity are studied: strong non-Hermiticity, keeping the ellipticity parameter $\\tau$ fixed and weak non-Hermiticity with $\\tau \\rightarrow 1 $ as $N \\rightarrow \\infty$. 
At strong non-Hermiticity, we find that both $\\rho(z)$ and $\\mathcal{O}(z)$ have the same leading order behaviour across the elliptic Ginibre ensembles, establishing the expected universality.","In the limit of weak non-Hermiticity, we find different results for $\\rho(z)$ and $\\mathcal{O}(z)$ across the two ensembles.","This paper is the final of three papers that we have presented addressing the mean self-overlap of eigenvectors in these ensembles."],"url":"http://arxiv.org/abs/2405.02103v1","category":"math-ph"} +{"created":"2024-05-03 13:54:59","title":"Discrete Aware Matrix Completion via Convexized $\\ell_0$-Norm Approximation","abstract":"We consider a novel algorithm, for the completion of partially observed low-rank matrices in a structured setting where each entry can be chosen from a finite discrete alphabet set, such as in common recommender systems. The proposed low-rank matrix completion (MC) method is an improved variation of state-of-the-art (SotA) discrete aware matrix completion method which we previously proposed, in which discreteness is enforced by an $\\ell_0$-norm regularizer, not by replaced with the $\\ell_1$-norm, but instead approximated by a continuous and differentiable function normalized via fractional programming (FP) under a proximal gradient (PG) framework. 
Simulation results demonstrate the superior performance of the new method compared to the SotA techniques as well as the earlier $\\ell_1$-norm-based discrete-aware matrix completion approach.","sentences":["We consider a novel algorithm, for the completion of partially observed low-rank matrices in a structured setting where each entry can be chosen from a finite discrete alphabet set, such as in common recommender systems.","The proposed low-rank matrix completion (MC) method is an improved variation of state-of-the-art (SotA) discrete aware matrix completion method which we previously proposed, in which discreteness is enforced by an $\\ell_0$-norm regularizer, not by replaced with the $\\ell_1$-norm, but instead approximated by a continuous and differentiable function normalized via fractional programming (FP) under a proximal gradient (PG) framework.","Simulation results demonstrate the superior performance of the new method compared to the SotA techniques as well as the earlier $\\ell_1$-norm-based discrete-aware matrix completion approach."],"url":"http://arxiv.org/abs/2405.02101v1","category":"eess.SP"} +{"created":"2024-05-03 13:51:28","title":"Data-Driven Stable Neural Feedback Loop Design","abstract":"This paper proposes a data-driven approach to design a feedforward Neural Network (NN) controller with a stability guarantee for systems with unknown dynamics. We first introduce data-driven representations of stability conditions for Neural Feedback Loops (NFLs) with linear plants. These conditions are then formulated into a semidefinite program (SDP). Subsequently, this SDP constraint is integrated into the NN training process resulting in a stable NN controller. We propose an iterative algorithm to solve this problem efficiently. 
Finally, we illustrate the effectiveness of the proposed method and its superiority compared to model-based methods via numerical examples.","sentences":["This paper proposes a data-driven approach to design a feedforward Neural Network (NN) controller with a stability guarantee for systems with unknown dynamics.","We first introduce data-driven representations of stability conditions for Neural Feedback Loops (NFLs) with linear plants.","These conditions are then formulated into a semidefinite program (SDP).","Subsequently, this SDP constraint is integrated into the NN training process resulting in a stable NN controller.","We propose an iterative algorithm to solve this problem efficiently.","Finally, we illustrate the effectiveness of the proposed method and its superiority compared to model-based methods via numerical examples."],"url":"http://arxiv.org/abs/2405.02100v1","category":"math.OC"} +{"created":"2024-05-03 13:45:27","title":"Transformer Models for Quantum Gate Set Tomography","abstract":"Quantum computation represents a promising frontier in the domain of high-performance computing, blending quantum information theory with practical applications to overcome the limitations of classical computation. This study investigates the challenges of manufacturing high-fidelity and scalable quantum processors. Quantum gate set tomography (QGST) is a critical method for characterizing quantum processors and understanding their operational capabilities and limitations. This paper introduces ML4QGST as a novel approach to QGST by integrating machine learning techniques, specifically utilizing a transformer neural network model. Adapting the transformer model for QGST addresses the computational complexity of modeling quantum systems. Advanced training strategies, including data grouping and curriculum learning, are employed to enhance model performance, demonstrating significant congruence with ground-truth values. 
We benchmark this training pipeline on the constructed learning model, to successfully perform QGST for $3$ gates on a $1$ qubit system with over-rotation error and depolarizing noise estimation with comparable accuracy to pyGSTi. This research marks a pioneering step in applying deep neural networks to the complex problem of quantum gate set tomography, showcasing the potential of machine learning to tackle nonlinear tomography challenges in quantum computing.","sentences":["Quantum computation represents a promising frontier in the domain of high-performance computing, blending quantum information theory with practical applications to overcome the limitations of classical computation.","This study investigates the challenges of manufacturing high-fidelity and scalable quantum processors.","Quantum gate set tomography (QGST) is a critical method for characterizing quantum processors and understanding their operational capabilities and limitations.","This paper introduces ML4QGST as a novel approach to QGST by integrating machine learning techniques, specifically utilizing a transformer neural network model.","Adapting the transformer model for QGST addresses the computational complexity of modeling quantum systems.","Advanced training strategies, including data grouping and curriculum learning, are employed to enhance model performance, demonstrating significant congruence with ground-truth values.","We benchmark this training pipeline on the constructed learning model, to successfully perform QGST for $3$ gates on a $1$ qubit system with over-rotation error and depolarizing noise estimation with comparable accuracy to pyGSTi.","This research marks a pioneering step in applying deep neural networks to the complex problem of quantum gate set tomography, showcasing the potential of machine learning to tackle nonlinear tomography challenges in quantum computing."],"url":"http://arxiv.org/abs/2405.02097v1","category":"quant-ph"} +{"created":"2024-05-03 
13:27:23","title":"Pair coalescence times of ancestral lineages of two-dimensional logistic branching random walks","abstract":"Consider two ancestral lineages sampled from a system of two-dimensional branching random walks with logistic regulation in the stationary regime. We study the asymptotics of their coalescence time for large initial separation and find that it agrees with well known results for a suitably scaled two-dimensional stepping stone model and also with Mal\\'ecot's continuous-space approximation for the probability of identity by descent as a function of sampling distance. This can be viewed as a justification for the replacement of locally fluctuating population sizes by fixed effective sizes. Our main tool is a joint regeneration construction for the spatial embeddings of the two ancestral lineages.","sentences":["Consider two ancestral lineages sampled from a system of two-dimensional branching random walks with logistic regulation in the stationary regime.","We study the asymptotics of their coalescence time for large initial separation and find that it agrees with well known results for a suitably scaled two-dimensional stepping stone model and also with Mal\\'ecot's continuous-space approximation for the probability of identity by descent as a function of sampling distance.","This can be viewed as a justification for the replacement of locally fluctuating population sizes by fixed effective sizes.","Our main tool is a joint regeneration construction for the spatial embeddings of the two ancestral lineages."],"url":"http://arxiv.org/abs/2405.02090v1","category":"math.PR"} +{"created":"2024-05-03 13:21:46","title":"AFDM Chirp-Permutation-Index Modulation with Quantum-Accelerated Codebook Design","abstract":"We describe a novel index modulation (IM) scheme exploiting a unique feature of the recently proposed affine frequency division multiplexing (AFDM) in doubly-dispersive (DD) channels. 
Dubbed AFDM chirp-permutation-index modulation (CPIM), the proposed method encodes additional information via the permutation of the discrete affine Fourier Transform (DAFT) chirp sequence, without any sacrifice of the various beneficial properties of the AFDM waveform in DD channels. The effectiveness of the proposed method is validated via simulation results leveraging a novel reduced-complexity minimum mean-squared-error (MMSE)-based maximum-likelihood (ML) detector, highlighting the gains over the classical AFDM. As part of the work two interesting problems related to optimizing AFDM-CPIM are identified: the optimal codebook design problem, over a discrete solution space of dimension $\\binom{N!}{K}$, where $N$ is the number of subcarriers and $K$ is the number of codewords; and the ML detection problem whose solution space is of dimension $KM^N$, where $M$ is the constellation size. In order to alleviate the computational complexity of these problems and enable large-scale variations of AFDM-CPIM, the two problems are reformulated as a higher-order binary optimization problem and mapped to the well-known quantum Grover adaptive search (GAS) algorithm for their solution.","sentences":["We describe a novel index modulation (IM) scheme exploiting a unique feature of the recently proposed affine frequency division multiplexing (AFDM) in doubly-dispersive (DD) channels.","Dubbed AFDM chirp-permutation-index modulation (CPIM), the proposed method encodes additional information via the permutation of the discrete affine Fourier Transform (DAFT) chirp sequence, without any sacrifice of the various beneficial properties of the AFDM waveform in DD channels.","The effectiveness of the proposed method is validated via simulation results leveraging a novel reduced-complexity minimum mean-squared-error (MMSE)-based maximum-likelihood (ML) detector, highlighting the gains over the classical AFDM.","As part of the work two interesting problems related to optimizing AFDM-CPIM 
are identified: the optimal codebook design problem, over a discrete solution space of dimension $\\binom{N!}{K}$, where $N$ is the number of subcarriers and $K$ is the number of codewords; and the ML detection problem whose solution space is of dimension $KM^N$, where $M$ is the constellation size.","In order to alleviate the computational complexity of these problems and enable large-scale variations of AFDM-CPIM, the two problems are reformulated as a higher-order binary optimization problem and mapped to the well-known quantum","Grover adaptive search (GAS) algorithm for their solution."],"url":"http://arxiv.org/abs/2405.02085v1","category":"eess.SP"} +{"created":"2024-05-03 13:12:53","title":"Coding for Synthesis Defects","abstract":"Motivated by DNA based data storage system, we investigate the errors that occur when synthesizing DNA strands in parallel, where each strand is appended one nucleotide at a time by the machine according to a template supersequence. If there is a cycle such that the machine fails, then the strands meant to be appended at this cycle will not be appended, and we refer to this as a synthesis defect. In this paper, we present two families of codes correcting synthesis defects, which are t-known-synthesis-defect correcting codes and t-synthesis-defect correcting codes. For the first one, it is assumed that the defective cycles are known, and each of the codeword is a quaternary sequence. We provide constructions for this family of codes for t = 1, 2, with redundancy log 4 and log n+18 log 3, respectively. For the second one, the codeword is a set of M ordered sequences, and we give constructions for t = 1, 2 to show a strategy for constructing this family of codes. 
Finally, we derive a lower bound on the redundancy for single-known-synthesis-defect correcting codes, which assures that our construction is almost optimal.","sentences":["Motivated by DNA based data storage system, we investigate the errors that occur when synthesizing DNA strands in parallel, where each strand is appended one nucleotide at a time by the machine according to a template supersequence.","If there is a cycle such that the machine fails, then the strands meant to be appended at this cycle will not be appended, and we refer to this as a synthesis defect.","In this paper, we present two families of codes correcting synthesis defects, which are t-known-synthesis-defect correcting codes and t-synthesis-defect correcting codes.","For the first one, it is assumed that the defective cycles are known, and each of the codeword is a quaternary sequence.","We provide constructions for this family of codes for t = 1, 2, with redundancy log 4 and log n+18 log 3, respectively.","For the second one, the codeword is a set of M ordered sequences, and we give constructions for t = 1, 2 to show a strategy for constructing this family of codes.","Finally, we derive a lower bound on the redundancy for single-known-synthesis-defect correcting codes, which assures that our construction is almost optimal."],"url":"http://arxiv.org/abs/2405.02080v1","category":"cs.IT"} +{"created":"2024-05-03 13:10:07","title":"SCIMAP: A Python Toolkit for Integrated Spatial Analysis of Multiplexed Imaging Data","abstract":"Multiplexed imaging data are revolutionizing our understanding of the composition and organization of tissues and tumors. A critical aspect of such tissue profiling is quantifying the spatial relationship relationships among cells at different scales from the interaction of neighboring cells to recurrent communities of cells of multiple types. 
This often involves statistical analysis of 10^7 or more cells in which up to 100 biomolecules (commonly proteins) have been measured. While software tools currently cater to the analysis of spatial transcriptomics data, there remains a need for toolkits explicitly tailored to the complexities of multiplexed imaging data including the need to seamlessly integrate image visualization with data analysis and exploration. We introduce SCIMAP, a Python package specifically crafted to address these challenges. With SCIMAP, users can efficiently preprocess, analyze, and visualize large datasets, facilitating the exploration of spatial relationships and their statistical significance. SCIMAP's modular design enables the integration of new algorithms, enhancing its capabilities for spatial analysis.","sentences":["Multiplexed imaging data are revolutionizing our understanding of the composition and organization of tissues and tumors.","A critical aspect of such tissue profiling is quantifying the spatial relationship relationships among cells at different scales from the interaction of neighboring cells to recurrent communities of cells of multiple types.","This often involves statistical analysis of 10^7 or more cells in which up to 100 biomolecules (commonly proteins) have been measured.","While software tools currently cater to the analysis of spatial transcriptomics data, there remains a need for toolkits explicitly tailored to the complexities of multiplexed imaging data including the need to seamlessly integrate image visualization with data analysis and exploration.","We introduce SCIMAP, a Python package specifically crafted to address these challenges.","With SCIMAP, users can efficiently preprocess, analyze, and visualize large datasets, facilitating the exploration of spatial relationships and their statistical significance.","SCIMAP's modular design enables the integration of new algorithms, enhancing its capabilities for spatial 
analysis."],"url":"http://arxiv.org/abs/2405.02076v1","category":"q-bio.QM"} +{"created":"2024-05-03 13:06:21","title":"Iterative Reconstruction Methods for Cosmological X-Ray Tomography","abstract":"We consider the imaging of cosmic strings by using Cosmic Microwave Background (CMB) data. Mathematically, we study the inversion of an X-ray transform in Lorentzian geometry, called the light ray transform. The inverse problem is highly ill-posed, with additional complexities of being large-scale and dynamic, with unknown parameters that represent multidimensional objects. This presents significant computational challenges for the numerical reconstruction of images that have high spatial and temporal resolution. In this paper, we begin with a microlocal stability analysis for inverting the light ray transform using the Landweber iteration. Next, we discretize the spatiotemporal object and light ray transform and consider iterative computational methods for solving the resulting inverse problem. 
We provide a numerical investigation and comparison of some advanced iterative methods for regularization including Tikhonov and sparsity-promoting regularizers for various example scalar functions with conormal type singularities.","sentences":["We consider the imaging of cosmic strings by using Cosmic Microwave Background (CMB) data.","Mathematically, we study the inversion of an X-ray transform in Lorentzian geometry, called the light ray transform.","The inverse problem is highly ill-posed, with additional complexities of being large-scale and dynamic, with unknown parameters that represent multidimensional objects.","This presents significant computational challenges for the numerical reconstruction of images that have high spatial and temporal resolution.","In this paper, we begin with a microlocal stability analysis for inverting the light ray transform using the Landweber iteration.","Next, we discretize the spatiotemporal object and light ray transform and consider iterative computational methods for solving the resulting inverse problem.","We provide a numerical investigation and comparison of some advanced iterative methods for regularization including Tikhonov and sparsity-promoting regularizers for various example scalar functions with conormal type singularities."],"url":"http://arxiv.org/abs/2405.02073v1","category":"math.NA"} +{"created":"2024-05-03 13:00:36","title":"Strategies for Intrusion Monitoring in Cloud Services","abstract":"Effective activity and event monitoring is an essential aspect of digital forensic readiness. Techniques for capturing log and other event data are familiar from conventional networked hosts and transfer directly to the Cloud context. In both contexts, a major concern is the risk that monitoring systems may be targeted and impaired by intruders seeking to conceal their illicit presence and activities. 
We outline an approach to intrusion monitoring that aims (i)~to ensure the credibility of log data and (ii)~provide a means of data sharing that supports log reconstruction in the event that one or more logging systems is maliciously impaired.","sentences":["Effective activity and event monitoring is an essential aspect of digital forensic readiness.","Techniques for capturing log and other event data are familiar from conventional networked hosts and transfer directly to the Cloud context.","In both contexts, a major concern is the risk that monitoring systems may be targeted and impaired by intruders seeking to conceal their illicit presence and activities.","We outline an approach to intrusion monitoring that aims (i)~to ensure the credibility of log data and (ii)~provide a means of data sharing that supports log reconstruction in the event that one or more logging systems is maliciously impaired."],"url":"http://arxiv.org/abs/2405.02070v1","category":"cs.CR"} +{"created":"2024-05-03 13:00:32","title":"Quantum Circuit Learning on NISQ Hardware","abstract":"Current quantum computers are small and error-prone systems for which the term noisy intermediate-scale quantum (NISQ) has become established. Since large scale, fault-tolerant quantum computers are not expected to be available in the near future, the task of finding NISQ suitable algorithms has received a lot of attention in recent years. The most prominent candidates in this context are variational quantum algorithms. Due to their hybrid quantum-classical architecture they require fewer qubits and quantum gates so that they can cope with the limitations of NISQ computers. An important class of variational quantum algorithms is the quantum circuit learning (QCL) framework. Consisting of a data encoding and a trainable, parametrized layer, these schemes implement a quantum model function that can be fitted to the problem at hand. 
For instance, in combination with the parameter shift rule to compute derivatives, they can be used to solve differential equations. QCL and related algorithms have been widely studied in the literature. However, numerical experiments are usually limited to simulators and results from real quantum computers are scarce. In this paper we close this gap by executing QCL circuits on a superconducting IBM quantum processor in conjunction with an analysis of the hardware errors. We show that exemplary QCL circuits with up to three qubits are executable on the IBM quantum computer. For this purpose, multiple functions are learned and an exemplary differential equation is solved on the quantum computer. Moreover, we present how the QCL framework can be used to learn different quantum model functions in parallel, which can be applied to solve coupled differential equations in an efficient way.","sentences":["Current quantum computers are small and error-prone systems for which the term noisy intermediate-scale quantum (NISQ) has become established.","Since large scale, fault-tolerant quantum computers are not expected to be available in the near future, the task of finding NISQ suitable algorithms has received a lot of attention in recent years.","The most prominent candidates in this context are variational quantum algorithms.","Due to their hybrid quantum-classical architecture they require fewer qubits and quantum gates so that they can cope with the limitations of NISQ computers.","An important class of variational quantum algorithms is the quantum circuit learning (QCL) framework.","Consisting of a data encoding and a trainable, parametrized layer, these schemes implement a quantum model function that can be fitted to the problem at hand.","For instance, in combination with the parameter shift rule to compute derivatives, they can be used to solve differential equations.","QCL and related algorithms have been widely studied in the literature.","However, numerical 
experiments are usually limited to simulators and results from real quantum computers are scarce.","In this paper we close this gap by executing QCL circuits on a superconducting IBM quantum processor in conjunction with an analysis of the hardware errors.","We show that exemplary QCL circuits with up to three qubits are executable on the IBM quantum computer.","For this purpose, multiple functions are learned and an exemplary differential equation is solved on the quantum computer.","Moreover, we present how the QCL framework can be used to learn different quantum model functions in parallel, which can be applied to solve coupled differential equations in an efficient way."],"url":"http://arxiv.org/abs/2405.02069v1","category":"quant-ph"} +{"created":"2024-05-03 12:44:52","title":"Dyna-Style Learning with A Macroscopic Model for Vehicle Platooning in Mixed-Autonomy Traffic","abstract":"Platooning of connected and autonomous vehicles (CAVs) plays a vital role in modernizing highways, ushering in enhanced efficiency and safety. This paper explores the significance of platooning in smart highways, employing a coupled partial differential equation (PDE) and ordinary differential equation (ODE) model to elucidate the complex interaction between bulk traffic flow and CAV platoons. Our study focuses on developing a Dyna-style planning and learning framework tailored for platoon control, with a specific goal of reducing fuel consumption. By harnessing the coupled PDE-ODE model, we improve data efficiency in Dyna-style learning through virtual experiences. 
Simulation results validate the effectiveness of our macroscopic model in modeling platoons within mixed-autonomy settings, demonstrating a notable $10.11\\%$ reduction in vehicular fuel consumption compared to conventional approaches.","sentences":["Platooning of connected and autonomous vehicles (CAVs) plays a vital role in modernizing highways, ushering in enhanced efficiency and safety.","This paper explores the significance of platooning in smart highways, employing a coupled partial differential equation (PDE) and ordinary differential equation (ODE) model to elucidate the complex interaction between bulk traffic flow and CAV platoons.","Our study focuses on developing a Dyna-style planning and learning framework tailored for platoon control, with a specific goal of reducing fuel consumption.","By harnessing the coupled PDE-ODE model, we improve data efficiency in Dyna-style learning through virtual experiences.","Simulation results validate the effectiveness of our macroscopic model in modeling platoons within mixed-autonomy settings, demonstrating a notable $10.11\\%$ reduction in vehicular fuel consumption compared to conventional approaches."],"url":"http://arxiv.org/abs/2405.02062v1","category":"cs.LG"} +{"created":"2024-05-03 12:37:42","title":"Probing fragile topology with a screw dislocation","abstract":"Fragile topology, akin to twisted bilayer graphene and the exotic phases therein, is a notable topological class with intriguing properties. However, due to its unique nature and the lack of bulk-edge correspondence, the experimental signature of fragile topology has been under debated since its birth. Here, we demonstrate experimentally that fragile topological phases with filling anomaly can be probed via screw dislocations, despite that they do not support gapless edge states. 
Using a designer hexagonal phononic crystal with a fragile topological band gap, we find that 1D gapless bound modes can emerge at a screw dislocation due to the bulk fragile topology. We then establish a connection between our system and the twisted boundary condition via the gauge invariance principle and illustrate that such an emergent phenomenon is an intrinsic property of fragile topological phases with filling anomaly. We observe experimentally the 1D topological bound states using the pump-probe measurements of their dispersion and wavefunctions, which unveils a novel bulk-defect correspondence of fragile topology and a powerful tool for probing fragile topological phases and materials.","sentences":["Fragile topology, akin to twisted bilayer graphene and the exotic phases therein, is a notable topological class with intriguing properties.","However, due to its unique nature and the lack of bulk-edge correspondence, the experimental signature of fragile topology has been under debated since its birth.","Here, we demonstrate experimentally that fragile topological phases with filling anomaly can be probed via screw dislocations, despite that they do not support gapless edge states.","Using a designer hexagonal phononic crystal with a fragile topological band gap, we find that 1D gapless bound modes can emerge at a screw dislocation due to the bulk fragile topology.","We then establish a connection between our system and the twisted boundary condition via the gauge invariance principle and illustrate that such an emergent phenomenon is an intrinsic property of fragile topological phases with filling anomaly.","We observe experimentally the 1D topological bound states using the pump-probe measurements of their dispersion and wavefunctions, which unveils a novel bulk-defect correspondence of fragile topology and a powerful tool for probing fragile topological phases and materials."],"url":"http://arxiv.org/abs/2405.02057v1","category":"cond-mat.mes-hall"} 
+{"created":"2024-05-03 12:36:19","title":"An analogue of the Milnor conjecture for the de Rham-Witt complex in characteristic 2","abstract":"We describe the modulo $2$ de Rham-Witt complex of a field of characteristic $2$, in terms of the powers of the augmentation ideal of the $\\mathbb{Z}/2$-geometric fixed points of real topological restriction homology TRR. This is analogous to the conjecture of Milnor, proved by Kato for fields of characteristic $2$, which describes the modulo $2$ Milnor K-theory in terms of the powers of the augmentation ideal of the Witt group of symmetric forms. Our proof provides a somewhat explicit description of these objects, as well as a calculation of the homotopy groups of the geometric fixed points of TRR and of real topological cyclic homology, for all fields.","sentences":["We describe the modulo $2$ de Rham-Witt complex of a field of characteristic $2$, in terms of the powers of the augmentation ideal of the $\\mathbb{Z}/2$-geometric fixed points of real topological restriction homology TRR.","This is analogous to the conjecture of Milnor, proved by Kato for fields of characteristic $2$, which describes the modulo $2$ Milnor K-theory in terms of the powers of the augmentation ideal of the Witt group of symmetric forms.","Our proof provides a somewhat explicit description of these objects, as well as a calculation of the homotopy groups of the geometric fixed points of TRR and of real topological cyclic homology, for all fields."],"url":"http://arxiv.org/abs/2405.02054v1","category":"math.AT"} +{"created":"2024-05-03 12:30:27","title":"Ah, that's the great puzzle: On the Quest of a Holistic Understanding of the Harms of Recommender Systems on Children","abstract":"Children come across various media items online, many of which are selected by recommender systems (RS) primarily designed for adults. 
The specific nature of the content selected by RS to display on online platforms used by children - although not necessarily targeting them as a user base - remains largely unknown. This raises questions about whether such content is appropriate given children's vulnerable stages of development and the potential risks to their well-being. In this position paper, we reflect on the relationship between RS and children, emphasizing the possible adverse effects of the content this user group might be exposed to online. As a step towards fostering safer interactions for children in online environments, we advocate for researchers, practitioners, and policymakers to undertake a more comprehensive examination of the impact of RS on children - one focused on harms. This would result in a more holistic understanding that could inform the design and deployment of strategies that would better suit children's needs and preferences while actively mitigating the potential harm posed by RS; acknowledging that identifying and addressing these harms is complex and multifaceted.","sentences":["Children come across various media items online, many of which are selected by recommender systems (RS) primarily designed for adults.","The specific nature of the content selected by RS to display on online platforms used by children - although not necessarily targeting them as a user base - remains largely unknown.","This raises questions about whether such content is appropriate given children's vulnerable stages of development and the potential risks to their well-being. 
","In this position paper, we reflect on the relationship between RS and children, emphasizing the possible adverse effects of the content this user group might be exposed to online.","As a step towards fostering safer interactions for children in online environments, we advocate for researchers, practitioners, and policymakers to undertake a more comprehensive examination of the impact of RS on children - one focused on harms.","This would result in a more holistic understanding that could inform the design and deployment of strategies that would better suit children's needs and preferences while actively mitigating the potential harm posed by RS; acknowledging that identifying and addressing these harms is complex and multifaceted."],"url":"http://arxiv.org/abs/2405.02050v1","category":"cs.IR"} +{"created":"2024-05-03 12:22:35","title":"Are We in The Zone? Exploring The Features and Method of Detecting Simultaneous Flow Experiences Based on EEG Signals","abstract":"When executing interdependent personal tasks for the team's purpose, simultaneous individual flow(simultaneous flow) is the antecedent condition of achieving shared team flow. Detecting simultaneous flow helps better understanding the status of team members, which is thus important for optimizing multi-user interaction systems. However, there is currently a lack exploration on objective features and methods for detecting simultaneous flow. Based on brain mechanism of flow in teamwork and previous studies on electroencephalogram (EEG)-based individual flow detection, this study aims to explore the significant EEG features related to simultaneous flow, as well as effective detection methods based on EEG signals. First, a two-player simultaneous flow task is designed, based on which we construct the first multi-EEG signals dataset of simultaneous flow. 
Then, we explore the potential EEG signal features that may be related to individual and simultaneous flow and validate their effectiveness in simultaneous flow detection with various machine learning models. The results show that 1) the inter-brain synchrony features are relevant to simultaneous flow due to enhancing the models' performance in detecting different types of simultaneous flow; 2) the features from the frontal lobe area seem to be given priority attention when detecting simultaneous flows; 3) Random Forests performed best in binary classification while Neural Network and Deep Neural Network3 performed best in ternary classification.","sentences":["When executing interdependent personal tasks for the team's purpose, simultaneous individual flow(simultaneous flow) is the antecedent condition of achieving shared team flow.","Detecting simultaneous flow helps better understanding the status of team members, which is thus important for optimizing multi-user interaction systems.","However, there is currently a lack exploration on objective features and methods for detecting simultaneous flow.","Based on brain mechanism of flow in teamwork and previous studies on electroencephalogram (EEG)-based individual flow detection, this study aims to explore the significant EEG features related to simultaneous flow, as well as effective detection methods based on EEG signals.","First, a two-player simultaneous flow task is designed, based on which we construct the first multi-EEG signals dataset of simultaneous flow.","Then, we explore the potential EEG signal features that may be related to individual and simultaneous flow and validate their effectiveness in simultaneous flow detection with various machine learning models.","The results show that 1) the inter-brain synchrony features are relevant to simultaneous flow due to enhancing the models' performance in detecting different types of simultaneous flow; 2) the features from the frontal lobe area seem to be given 
priority attention when detecting simultaneous flows; 3) Random Forests performed best in binary classification while Neural Network and Deep Neural Network3 performed best in ternary classification."],"url":"http://arxiv.org/abs/2405.02045v1","category":"cs.HC"} +{"created":"2024-05-03 12:21:38","title":"On human-centred security: A new systems model based on modes and mode transitions","abstract":"We propose an abstract conceptual framework for analysing complex security systems using a new notion of modes and mode transitions. A mode is an independent component of a system with its own objectives, monitoring data, algorithms, and scope and limits. The behaviour of a mode, including its transitions to other modes, is determined by interpretations of the mode's monitoring data in the light of its objectives and capabilities -- these interpretations we call beliefs. We formalise the conceptual framework mathematically and, by quantifying and visualising beliefs in higher-dimensional geometric spaces, we argue our models may help both design, analyse and explain systems. 
The mathematical models are based on simplicial complexes.","sentences":["We propose an abstract conceptual framework for analysing complex security systems using a new notion of modes and mode transitions.","A mode is an independent component of a system with its own objectives, monitoring data, algorithms, and scope and limits.","The behaviour of a mode, including its transitions to other modes, is determined by interpretations of the mode's monitoring data in the light of its objectives and capabilities -- these interpretations we call beliefs.","We formalise the conceptual framework mathematically and, by quantifying and visualising beliefs in higher-dimensional geometric spaces, we argue our models may help both design, analyse and explain systems.","The mathematical models are based on simplicial complexes."],"url":"http://arxiv.org/abs/2405.02043v1","category":"cs.CR"} +{"created":"2024-05-03 12:20:08","title":"Stabilizing Backpropagation Through Time to Learn Complex Physics","abstract":"Of all the vector fields surrounding the minima of recurrent learning setups, the gradient field with its exploding and vanishing updates appears a poor choice for optimization, offering little beyond efficient computability. We seek to improve this suboptimal practice in the context of physics simulations, where backpropagating feedback through many unrolled time steps is considered crucial to acquiring temporally coherent behavior. The alternative vector field we propose follows from two principles: physics simulators, unlike neural networks, have a balanced gradient flow, and certain modifications to the backpropagation pass leave the positions of the original minima unchanged. As any modification of backpropagation decouples forward and backward pass, the rotation-free character of the gradient field is lost. Therefore, we discuss the negative implications of using such a rotational vector field for optimization and how to counteract them. 
Our final procedure is easily implementable via a sequence of gradient stopping and component-wise comparison operations, which do not negatively affect scalability. Our experiments on three control problems show that especially as we increase the complexity of each task, the unbalanced updates from the gradient can no longer provide the precise control signals necessary while our method still solves the tasks. Our code can be found at https://github.com/tum-pbs/StableBPTT.","sentences":["Of all the vector fields surrounding the minima of recurrent learning setups, the gradient field with its exploding and vanishing updates appears a poor choice for optimization, offering little beyond efficient computability.","We seek to improve this suboptimal practice in the context of physics simulations, where backpropagating feedback through many unrolled time steps is considered crucial to acquiring temporally coherent behavior.","The alternative vector field we propose follows from two principles: physics simulators, unlike neural networks, have a balanced gradient flow, and certain modifications to the backpropagation pass leave the positions of the original minima unchanged.","As any modification of backpropagation decouples forward and backward pass, the rotation-free character of the gradient field is lost.","Therefore, we discuss the negative implications of using such a rotational vector field for optimization and how to counteract them.","Our final procedure is easily implementable via a sequence of gradient stopping and component-wise comparison operations, which do not negatively affect scalability.","Our experiments on three control problems show that especially as we increase the complexity of each task, the unbalanced updates from the gradient can no longer provide the precise control signals necessary while our method still solves the tasks.","Our code can be found at 
https://github.com/tum-pbs/StableBPTT."],"url":"http://arxiv.org/abs/2405.02041v1","category":"cs.LG"} +{"created":"2024-05-03 12:17:11","title":"Dimensionality reduction of neuronal degeneracy reveals two interfering physiological mechanisms","abstract":"Neuronal systems maintain stable functions despite large variability in their physiological components. Ion channel expression, in particular, is highly variable in neurons exhibiting similar electrophysiological phenotypes, which poses questions regarding how specific ion channel subsets reliably shape neuron intrinsic properties. Here, we use detailed conductance-based modeling to explore the origin of stable neuronal function from variable channel composition. Using dimensionality reduction, we uncover two principal dimensions in the channel conductance space that capture most of the variance of the observed variability. Those two dimensions correspond to two physiologically relevant sources of variability that can be explained by feedback mechanisms underlying regulation of neuronal activity, providing quantitative insights into how channel composition links to neuronal electrophysiological activity. 
These insights allowed us to understand and design a model-independent, reliable neuromodulation rule for variable neuronal populations.","sentences":["Neuronal systems maintain stable functions despite large variability in their physiological components.","Ion channel expression, in particular, is highly variable in neurons exhibiting similar electrophysiological phenotypes, which poses questions regarding how specific ion channel subsets reliably shape neuron intrinsic properties.","Here, we use detailed conductance-based modeling to explore the origin of stable neuronal function from variable channel composition.","Using dimensionality reduction, we uncover two principal dimensions in the channel conductance space that capture most of the variance of the observed variability.","Those two dimensions correspond to two physiologically relevant sources of variability that can be explained by feedback mechanisms underlying regulation of neuronal activity, providing quantitative insights into how channel composition links to neuronal electrophysiological activity.","These insights allowed us to understand and design a model-independent, reliable neuromodulation rule for variable neuronal populations."],"url":"http://arxiv.org/abs/2405.02038v1","category":"q-bio.NC"} +{"created":"2024-05-03 12:14:14","title":"Spontaneous Conducting Boundary Channels in 1T-TaS$_{2}$","abstract":"Materials that transition between metal and insulator, the two opposing states that distinguish all solids, are fascinating because they underlie many mysteries in the physics of the solid state. In 1T-TaS$_{2}$, the metal-insulator transition is linked to a series of metastable states of a chiral charge density wave whose basic nature is still an open question. In this work, we show that pulses of current through these materials create current-carrying boundary channels that distinguish the metallic and insulating states. 
We demonstrate electrical control of these channels' properties, suggesting their formation could be due to the complex interplay of the formation of domain walls and the viscous flow of electrons. Our findings show that physical boundaries play a key role in the properties of the metastable states of the metal-insulator transition, highlighting new possibilities for in-situ electrical design and active manipulation of electrical components.","sentences":["Materials that transition between metal and insulator, the two opposing states that distinguish all solids, are fascinating because they underlie many mysteries in the physics of the solid state.","In 1T-TaS$_{2}$, the metal-insulator transition is linked to a series of metastable states of a chiral charge density wave whose basic nature is still an open question.","In this work, we show that pulses of current through these materials create current-carrying boundary channels that distinguish the metallic and insulating states.","We demonstrate electrical control of these channels' properties, suggesting their formation could be due to the complex interplay of the formation of domain walls and the viscous flow of electrons.","Our findings show that physical boundaries play a key role in the properties of the metastable states of the metal-insulator transition, highlighting new possibilities for in-situ electrical design and active manipulation of electrical components."],"url":"http://arxiv.org/abs/2405.02036v1","category":"cond-mat.str-el"} +{"created":"2024-05-03 12:06:32","title":"Vibrational Entanglement through the Lens of Quantum Information Measures","abstract":"We introduce a quantum information analysis of vibrational wave functions to understand complex vibrational spectra of molecules with strong anharmonic couplings and vibrational resonances. 
For this purpose, we define one- and two-modal entropies to guide the identification of strongly coupled vibrational modes and to characterize correlations within modal basis sets. We evaluate these descriptors for multi-configurational vibrational wave functions which we calculate with the n-mode vibrational density matrix renormalization group algorithm. Based on the quantum information measures, we present a vibrational entanglement analysis of the vibrational ground and excited states of CO2, which display strong anharmonic effects due to the symmetry-induced and accidental (near-) degeneracies. We investigate the entanglement signature of the Fermi resonance and discuss the maximally entangled state arising from the two degenerate bending modes.","sentences":["We introduce a quantum information analysis of vibrational wave functions to understand complex vibrational spectra of molecules with strong anharmonic couplings and vibrational resonances.","For this purpose, we define one-","and two-modal entropies to guide the identification of strongly coupled vibrational modes and to characterize correlations within modal basis sets.","We evaluate these descriptors for multi-configurational vibrational wave functions which we calculate with the n-mode vibrational density matrix renormalization group algorithm.","Based on the quantum information measures, we present a vibrational entanglement analysis of the vibrational ground and excited states of CO2, which display strong anharmonic effects due to the symmetry-induced and accidental (near-) degeneracies.","We investigate the entanglement signature of the Fermi resonance and discuss the maximally entangled state arising from the two degenerate bending modes."],"url":"http://arxiv.org/abs/2405.02031v1","category":"physics.chem-ph"} +{"created":"2024-05-03 12:02:40","title":"Obstacle Avoidance of Autonomous Vehicles: An LPVMPC with Scheduling Trust Region","abstract":"Reference tracking and obstacle avoidance rank 
among the foremost challenging aspects of autonomous driving. This paper proposes control designs for solving reference tracking problems in autonomous driving tasks while considering static obstacles. We suggest a model predictive control (MPC) strategy that evades the computational burden of nonlinear nonconvex optimization methods after embedding the nonlinear model equivalently to a linear parameter-varying (LPV) formulation using the so-called scheduling parameter. This allows optimal and fast solutions of the underlying convex optimization scheme as a quadratic program (QP) at the expense of losing some performance due to the uncertainty of the future scheduling trajectory over the MPC horizon. Also, to ensure that the modeling error due to the application of the scheduling parameter predictions does not become significant, we propose the concept of scheduling trust region by enforcing further soft constraints on the states and inputs. A consequence of using the new constraints in the MPC is that we construct a region in which the scheduling parameter updates in two consecutive time instants are trusted for computing the system matrices, and therefore, the feasibility of the MPC optimization problem is retained. 
We test the method in different scenarios and compare the results to standard LPVMPC as well as nonlinear MPC (NMPC) schemes.","sentences":["Reference tracking and obstacle avoidance rank among the foremost challenging aspects of autonomous driving.","This paper proposes control designs for solving reference tracking problems in autonomous driving tasks while considering static obstacles.","We suggest a model predictive control (MPC) strategy that evades the computational burden of nonlinear nonconvex optimization methods after embedding the nonlinear model equivalently to a linear parameter-varying (LPV) formulation using the so-called scheduling parameter.","This allows optimal and fast solutions of the underlying convex optimization scheme as a quadratic program (QP) at the expense of losing some performance due to the uncertainty of the future scheduling trajectory over the MPC horizon.","Also, to ensure that the modeling error due to the application of the scheduling parameter predictions does not become significant, we propose the concept of scheduling trust region by enforcing further soft constraints on the states and inputs.","A consequence of using the new constraints in the MPC is that we construct a region in which the scheduling parameter updates in two consecutive time instants are trusted for computing the system matrices, and therefore, the feasibility of the MPC optimization problem is retained.","We test the method in different scenarios and compare the results to standard LPVMPC as well as nonlinear MPC (NMPC) schemes."],"url":"http://arxiv.org/abs/2405.02030v1","category":"eess.SY"} +{"created":"2024-05-03 12:02:31","title":"MemorAI: Energy-Efficient Last-Level Cache Memory Optimization for Virtualized RANs","abstract":"The virtualization of Radio Access Networks (vRAN) is well on its way to become a reality, driven by its advantages such as flexibility and cost-effectiveness. 
However, virtualization comes at a high price - virtual Base Stations (vBSs) sharing the same computing platform incur a significant computing overhead due to in extremis consumption of shared cache memory resources. Consequently, vRAN suffers from increased energy consumption, which fuels the already high operational costs in 5G networks. This paper investigates cache memory allocation mechanisms' effectiveness in reducing total energy consumption. Using an experimental vRAN platform, we profile the energy consumption and CPU utilization of vBS as a function of the network state (e.g., traffic demand, modulation scheme). Then, we address the high dimensionality of the problem by decomposing it per vBS, which is possible thanks to the Last-Level Cache (LLC) isolation implemented in our system. Based on this, we train a vBS digital twin, which allows us to train offline a classifier, avoiding the performance degradation of the system during training. Our results show that our approach performs very closely to an offline optimal oracle, outperforming standard approaches used in today's deployments.","sentences":["The virtualization of Radio Access Networks (vRAN) is well on its way to become a reality, driven by its advantages such as flexibility and cost-effectiveness.","However, virtualization comes at a high price - virtual Base Stations (vBSs) sharing the same computing platform incur a significant computing overhead due to in extremis consumption of shared cache memory resources.","Consequently, vRAN suffers from increased energy consumption, which fuels the already high operational costs in 5G networks.","This paper investigates cache memory allocation mechanisms' effectiveness in reducing total energy consumption.","Using an experimental vRAN platform, we profile the energy consumption and CPU utilization of vBS as a function of the network state (e.g., traffic demand, modulation scheme).","Then, we address the high dimensionality of the problem by decomposing 
it per vBS, which is possible thanks to the Last-Level Cache (LLC) isolation implemented in our system.","Based on this, we train a vBS digital twin, which allows us to train offline a classifier, avoiding the performance degradation of the system during training.","Our results show that our approach performs very closely to an offline optimal oracle, outperforming standard approaches used in today's deployments."],"url":"http://arxiv.org/abs/2405.02029v1","category":"cs.NI"} +{"created":"2024-05-03 12:01:50","title":"Many-body Localization Transition of Ising Spin-1 Chains","abstract":"In this paper, we theoretically investigate the many-body localization properties of one-dimensional Ising spin-1 chains by using the methods of exact matrix diagonalization. We compare it with the MBL properties of the Ising spin-1/2 chains. The results indicate that the one-dimensional Ising spin-1 chains can also undergo MBL phase transition. There are various forms of disorder, and we compare the effects of different forms of quasi-disorder and random disorder on many-body localization in this paper. First, we calculate the exctied-state fidelity to study the MBL phase transtion. By changing the form of the quasi-disorder, we study the MBL transition of the system with different forms of quasi-disorder and compare them with those of the random disordered system. The results show that both random disorder and quasi-disorder can cause the MBL phase transition in the one-dimensional Ising spin-1 chains. In order to study the effect of spin interactions, we compare Ising spin-1 chains and spin-1/2 chains with the next-nearest-neighbour(N-N) two-body interactions and the next-next-nearest-neighbour (N-N-N)interactions. The results show that the critical point increases with the addition of the interaction. 
Then we study the dynamical properties of the model by the dynamical behavior of diagonal entropy (DE), local magnetization and the time evolution of fidelity to further prove the occurrence of MBL phase transition in the disordered Ising spin-1 chains with the (N-N) coupling term and distinguish the ergodic phase (thermal phase) and the many-body localized phase. Lastly, we delve into the impact of periodic driving on one-dimensional Ising spin-1 chains. And we compare it with the results obtained from the Ising spin-1/2 chains. It shows that periodic driving can cause Ising spin-1 chains and Ising spin-1/2 chains to occur the MBL transition.","sentences":["In this paper, we theoretically investigate the many-body localization properties of one-dimensional Ising spin-1 chains by using the methods of exact matrix diagonalization.","We compare it with the MBL properties of the Ising spin-1/2 chains.","The results indicate that the one-dimensional Ising spin-1 chains can also undergo MBL phase transition.","There are various forms of disorder, and we compare the effects of different forms of quasi-disorder and random disorder on many-body localization in this paper.","First, we calculate the excited-state fidelity to study the MBL phase transition.","By changing the form of the quasi-disorder, we study the MBL transition of the system with different forms of quasi-disorder and compare them with those of the random disordered system.","The results show that both random disorder and quasi-disorder can cause the MBL phase transition in the one-dimensional Ising spin-1 chains.","In order to study the effect of spin interactions, we compare Ising spin-1 chains and spin-1/2 chains with the next-nearest-neighbour(N-N) two-body interactions and the next-next-nearest-neighbour (N-N-N)interactions.","The results show that the critical point increases with the addition of the interaction.","Then we study the dynamical properties of the model by the dynamical behavior of diagonal 
entropy (DE), local magnetization and the time evolution of fidelity to further prove the occurrence of MBL phase transition in the disordered Ising spin-1 chains with the (N-N) coupling term and distinguish the ergodic phase (thermal phase) and the many-body localized phase.","Lastly, we delve into the impact of periodic driving on one-dimensional Ising spin-1 chains.","And we compare it with the results obtained from the Ising spin-1/2 chains.","It shows that periodic driving can cause Ising spin-1 chains and Ising spin-1/2 chains to occur the MBL transition."],"url":"http://arxiv.org/abs/2405.02028v1","category":"cond-mat.dis-nn"} +{"created":"2024-05-03 11:58:43","title":"Exponential quantum advantages in learning quantum observables from classical data","abstract":"Quantum computers are believed to bring computational advantages in simulating quantum many body systems. However, recent works have shown that classical machine learning algorithms are able to predict numerous properties of quantum systems with classical data. Despite various examples of learning tasks with provable quantum advantages being proposed, they all involve cryptographic functions and do not represent any physical scenarios encountered in laboratory settings. In this paper we prove quantum advantages for the physically relevant task of learning quantum observables from classical (measured out) data. We consider two types of observables: first we prove a learning advantage for linear combinations of Pauli strings, then we extend the result for the broader case of unitarily parametrized observables. For each type of observable we delineate the boundaries that separate physically relevant tasks which classical computers can solve using data from quantum measurements, from those where a quantum computer is still necessary for data analysis. 
Our results shed light on the utility of quantum computers for machine learning problems in the domain of quantum many body physics, thereby suggesting new directions where quantum learning improvements may emerge.","sentences":["Quantum computers are believed to bring computational advantages in simulating quantum many body systems.","However, recent works have shown that classical machine learning algorithms are able to predict numerous properties of quantum systems with classical data.","Despite various examples of learning tasks with provable quantum advantages being proposed, they all involve cryptographic functions and do not represent any physical scenarios encountered in laboratory settings.","In this paper we prove quantum advantages for the physically relevant task of learning quantum observables from classical (measured out) data.","We consider two types of observables: first we prove a learning advantage for linear combinations of Pauli strings, then we extend the result for the broader case of unitarily parametrized observables.","For each type of observable we delineate the boundaries that separate physically relevant tasks which classical computers can solve using data from quantum measurements, from those where a quantum computer is still necessary for data analysis.","Our results shed light on the utility of quantum computers for machine learning problems in the domain of quantum many body physics, thereby suggesting new directions where quantum learning improvements may emerge."],"url":"http://arxiv.org/abs/2405.02027v1","category":"quant-ph"} +{"created":"2024-05-03 11:58:03","title":"Diversity of What? On the Different Conceptualizations of Diversity in Recommender Systems","abstract":"Diversity is a commonly known principle in the design of recommender systems, but also ambiguous in its conceptualization. 
Through semi-structured interviews we explore how practitioners at three different public service media organizations in the Netherlands conceptualize diversity within the scope of their recommender systems. We provide an overview of the goals that they have with diversity in their systems, which aspects are relevant, and how recommendations should be diversified. We show that even within this limited domain, conceptualization of diversity greatly varies, and argue that it is unlikely that a standardized conceptualization will be achieved. Instead, we should focus on effective communication of what diversity in this particular system means, thus allowing for operationalizations of diversity that are capable of expressing the nuances and requirements of that particular domain.","sentences":["Diversity is a commonly known principle in the design of recommender systems, but also ambiguous in its conceptualization.","Through semi-structured interviews we explore how practitioners at three different public service media organizations in the Netherlands conceptualize diversity within the scope of their recommender systems.","We provide an overview of the goals that they have with diversity in their systems, which aspects are relevant, and how recommendations should be diversified.","We show that even within this limited domain, conceptualization of diversity greatly varies, and argue that it is unlikely that a standardized conceptualization will be achieved.","Instead, we should focus on effective communication of what diversity in this particular system means, thus allowing for operationalizations of diversity that are capable of expressing the nuances and requirements of that particular domain."],"url":"http://arxiv.org/abs/2405.02026v1","category":"cs.IR"} +{"created":"2024-05-03 11:55:45","title":"IFNet: Deep Imaging and Focusing for Handheld SAR with Millimeter-wave Signals","abstract":"Recent advancements have showcased the potential of handheld millimeter-wave 
(mmWave) imaging, which applies synthetic aperture radar (SAR) principles in portable settings. However, existing studies addressing handheld motion errors either rely on costly tracking devices or employ simplified imaging models, leading to impractical deployment or limited performance. In this paper, we present IFNet, a novel deep unfolding network that combines the strengths of signal processing models and deep neural networks to achieve robust imaging and focusing for handheld mmWave systems. We first formulate the handheld imaging model by integrating multiple priors about mmWave images and handheld phase errors. Furthermore, we transform the optimization processes into an iterative network structure for improved and efficient imaging performance. Extensive experiments demonstrate that IFNet effectively compensates for handheld phase errors and recovers high-fidelity images from severely distorted signals. In comparison with existing methods, IFNet can achieve at least 11.89 dB improvement in average peak signal-to-noise ratio (PSNR) and 64.91% improvement in average structural similarity index measure (SSIM) on a real-world dataset.","sentences":["Recent advancements have showcased the potential of handheld millimeter-wave (mmWave) imaging, which applies synthetic aperture radar (SAR) principles in portable settings.","However, existing studies addressing handheld motion errors either rely on costly tracking devices or employ simplified imaging models, leading to impractical deployment or limited performance.","In this paper, we present IFNet, a novel deep unfolding network that combines the strengths of signal processing models and deep neural networks to achieve robust imaging and focusing for handheld mmWave systems.","We first formulate the handheld imaging model by integrating multiple priors about mmWave images and handheld phase errors.","Furthermore, we transform the optimization processes into an iterative network structure for improved and 
efficient imaging performance.","Extensive experiments demonstrate that IFNet effectively compensates for handheld phase errors and recovers high-fidelity images from severely distorted signals.","In comparison with existing methods, IFNet can achieve at least 11.89 dB improvement in average peak signal-to-noise ratio (PSNR) and 64.91% improvement in average structural similarity index measure (SSIM) on a real-world dataset."],"url":"http://arxiv.org/abs/2405.02023v1","category":"cs.CV"} +{"created":"2024-05-03 11:23:18","title":"A proximitized quantum dot in germanium","abstract":"Planar germanium quantum wells have recently been shown to host a hard-gapped superconductor-semiconductor interface. Additionally, quantum dot spin qubits in germanium are well-suited for quantum information processing, with isotopic purification to a nuclear spin-free material expected to yield long coherence times. Therefore, as one of the few group IV materials with the potential to host superconductor-semiconductor hybrid devices, proximitized quantum dots in germanium are a crucial ingredient towards topological superconductivity and novel qubit modalities. Here we demonstrate a quantum dot (QD) in a Ge/SiGe heterostructure proximitized by a platinum germanosilicide (PtGeSi) superconducting lead (SC), forming a SC-QD-SC junction. We show tunability of the QD-SC coupling strength, as well as gate control of the ratio of charging energy and the induced gap. We further exploit this tunability by exhibiting control of the ground state of the system between even and odd parity. Furthermore, we characterize the critical magnetic field strengths, finding a robust critical out-of-plane field of 0.91(5) T. Finally we explore sub-gap spin splitting in the device, observing rich physics in the resulting spectra, that we model using a zero-bandwidth model in the Yu-Shiba-Rusinov limit. 
The demonstration of controllable proximitization at the nanoscale of a germanium quantum dot opens up the physics of novel spin and superconducting qubits, and Josephson junction arrays in a group IV material.","sentences":["Planar germanium quantum wells have recently been shown to host a hard-gapped superconductor-semiconductor interface.","Additionally, quantum dot spin qubits in germanium are well-suited for quantum information processing, with isotopic purification to a nuclear spin-free material expected to yield long coherence times.","Therefore, as one of the few group IV materials with the potential to host superconductor-semiconductor hybrid devices, proximitized quantum dots in germanium are a crucial ingredient towards topological superconductivity and novel qubit modalities.","Here we demonstrate a quantum dot (QD) in a Ge/SiGe heterostructure proximitized by a platinum germanosilicide (PtGeSi) superconducting lead (SC), forming a SC-QD-SC junction.","We show tunability of the QD-SC coupling strength, as well as gate control of the ratio of charging energy and the induced gap.","We further exploit this tunability by exhibiting control of the ground state of the system between even and odd parity.","Furthermore, we characterize the critical magnetic field strengths, finding a robust critical out-of-plane field of 0.91(5) T.","
Finally we explore sub-gap spin splitting in the device, observing rich physics in the resulting spectra, that we model using a zero-bandwidth model in the Yu-Shiba-Rusinov limit.","The demonstration of controllable proximitization at the nanoscale of a germanium quantum dot opens up the physics of novel spin and superconducting qubits, and Josephson junction arrays in a group IV material."],"url":"http://arxiv.org/abs/2405.02013v1","category":"cond-mat.mes-hall"} +{"created":"2024-05-03 11:22:08","title":"Backtesting Expected Shortfall: Accounting for both duration and severity with bivariate orthogonal polynomials","abstract":"We propose an original two-part, duration-severity approach for backtesting Expected Shortfall (ES). While Probability Integral Transform (PIT) based ES backtests have gained popularity, they have yet to allow for separate testing of the frequency and severity of Value-at-Risk (VaR) violations. This is a crucial aspect, as ES measures the average loss in the event of such violations. To overcome this limitation, we introduce a backtesting framework that relies on the sequence of inter-violation durations and the sequence of severities in case of violations. By leveraging the theory of (bivariate) orthogonal polynomials, we derive orthogonal moment conditions satisfied by these two sequences. Our approach includes a straightforward, model-free Wald test, which encompasses various unconditional and conditional coverage backtests for both VaR and ES. This test aids in identifying any mis-specified components of the internal model used by banks to forecast ES. Moreover, it can be extended to analyze other systemic risk measures such as Marginal Expected Shortfall. Simulation experiments indicate that our test exhibits good finite sample properties for realistic sample sizes. 
Through application to two stock indices, we demonstrate how our methodology provides insights into the reasons for rejections in testing ES validity.","sentences":["We propose an original two-part, duration-severity approach for backtesting Expected Shortfall (ES).","While Probability Integral Transform (PIT) based ES backtests have gained popularity, they have yet to allow for separate testing of the frequency and severity of Value-at-Risk (VaR) violations.","This is a crucial aspect, as ES measures the average loss in the event of such violations.","To overcome this limitation, we introduce a backtesting framework that relies on the sequence of inter-violation durations and the sequence of severities in case of violations.","By leveraging the theory of (bivariate) orthogonal polynomials, we derive orthogonal moment conditions satisfied by these two sequences.","Our approach includes a straightforward, model-free Wald test, which encompasses various unconditional and conditional coverage backtests for both VaR and ES.","This test aids in identifying any mis-specified components of the internal model used by banks to forecast ES.","Moreover, it can be extended to analyze other systemic risk measures such as Marginal Expected Shortfall.","Simulation experiments indicate that our test exhibits good finite sample properties for realistic sample sizes.","Through application to two stock indices, we demonstrate how our methodology provides insights into the reasons for rejections in testing ES validity."],"url":"http://arxiv.org/abs/2405.02012v1","category":"q-fin.RM"} +{"created":"2024-05-03 11:19:35","title":"Autonomous Active Mapping in Steep Alpine Environments with Fixed-wing Aerial Vehicles","abstract":"Monitoring large scale environments is a crucial task for managing remote alpine environments, especially for hazardous events such as avalanches. One key information for avalanche risk forecast is imagery of released avalanches. 
As these happen in remote and potentially dangerous locations this data is difficult to obtain. Fixed-wing vehicles, due to their long range and travel speeds are a promising platform to gather aerial imagery to map avalanche activities. However, operating such vehicles in mountainous terrain remains a challenge due to the complex topography, regulations, and uncertain environment. In this work, we present a system that is capable of safely navigating and mapping an avalanche using a fixed-wing aerial system and discuss the challenges arising when executing such a mission. We show in our field experiments that we can effectively navigate in steep terrain environments while maximizing the map quality. We expect our work to enable more autonomous operations of fixed-wing vehicles in alpine environments to maximize the quality of the data gathered.","sentences":["Monitoring large scale environments is a crucial task for managing remote alpine environments, especially for hazardous events such as avalanches.","One key information for avalanche risk forecast is imagery of released avalanches.","As these happen in remote and potentially dangerous locations this data is difficult to obtain.","Fixed-wing vehicles, due to their long range and travel speeds are a promising platform to gather aerial imagery to map avalanche activities.","However, operating such vehicles in mountainous terrain remains a challenge due to the complex topography, regulations, and uncertain environment.","In this work, we present a system that is capable of safely navigating and mapping an avalanche using a fixed-wing aerial system and discuss the challenges arising when executing such a mission.","We show in our field experiments that we can effectively navigate in steep terrain environments while maximizing the map quality.","We expect our work to enable more autonomous operations of fixed-wing vehicles in alpine environments to maximize the quality of the data 
gathered."],"url":"http://arxiv.org/abs/2405.02011v1","category":"cs.RO"} +{"created":"2024-05-03 11:03:09","title":"Smoothly vanishing density in the contact process by an interplay of disorder and long-distance dispersal","abstract":"Realistic modeling of ecological population dynamics requires spatially explicit descriptions that can take into account spatial heterogeneity as well as long-distance dispersal. Here, we present Monte Carlo simulations and numerical renormalization group results for the paradigmatic model, the contact process, in the combined presence of these factors in both one and two-dimensional systems. Our results confirm our analytic arguments stating that the density vanishes smoothly at the extinction threshold, in a way characteristic of infinite-order transitions. This extremely smooth vanishing of the global density entails an enhanced exposure of the population to extinction events. At the same time, a reverse order parameter, the local persistence displays a discontinuity characteristic of mixed-order transitions, as it approaches a non-universal critical value algebraically with an exponent $\\beta_p'<1$.","sentences":["Realistic modeling of ecological population dynamics requires spatially explicit descriptions that can take into account spatial heterogeneity as well as long-distance dispersal.","Here, we present Monte Carlo simulations and numerical renormalization group results for the paradigmatic model, the contact process, in the combined presence of these factors in both one and two-dimensional systems.","Our results confirm our analytic arguments stating that the density vanishes smoothly at the extinction threshold, in a way characteristic of infinite-order transitions.","This extremely smooth vanishing of the global density entails an enhanced exposure of the population to extinction events.","At the same time, a reverse order parameter, the local persistence displays a discontinuity characteristic of mixed-order 
transitions, as it approaches a non-universal critical value algebraically with an exponent $\\beta_p'<1$."],"url":"http://arxiv.org/abs/2405.02003v1","category":"cond-mat.stat-mech"} +{"created":"2024-05-03 10:50:30","title":"Mathematics of statistical sequential decision-making: concentration, risk-awareness and modelling in stochastic bandits, with applications to bariatric surgery","abstract":"This thesis aims to study some of the mathematical challenges that arise in the analysis of statistical sequential decision-making algorithms for postoperative patients follow-up. Stochastic bandits (multiarmed, contextual) model the learning of a sequence of actions (policy) by an agent in an uncertain environment in order to maximise observed rewards. To learn optimal policies, bandit algorithms have to balance the exploitation of current knowledge and the exploration of uncertain actions. Such algorithms have largely been studied and deployed in industrial applications with large datasets, low-risk decisions and clear modelling assumptions, such as clickthrough rate maximisation in online advertising. By contrast, digital health recommendations call for a whole new paradigm of small samples, risk-averse agents and complex, nonparametric modelling. To this end, we developed new safe, anytime-valid concentration bounds, (Bregman, empirical Chernoff), introduced a new framework for risk-aware contextual bandits (with elicitable risk measures) and analysed a novel class of nonparametric bandit algorithms under weak assumptions (Dirichlet sampling). In addition to the theoretical guarantees, these results are supported by in-depth empirical evidence. 
Finally, as a first step towards personalised postoperative follow-up recommendations, we developed with medical doctors and surgeons an interpretable machine learning model to predict the long-term weight trajectories of patients after bariatric surgery.","sentences":["This thesis aims to study some of the mathematical challenges that arise in the analysis of statistical sequential decision-making algorithms for postoperative patients follow-up.","Stochastic bandits (multiarmed, contextual) model the learning of a sequence of actions (policy) by an agent in an uncertain environment in order to maximise observed rewards.","To learn optimal policies, bandit algorithms have to balance the exploitation of current knowledge and the exploration of uncertain actions.","Such algorithms have largely been studied and deployed in industrial applications with large datasets, low-risk decisions and clear modelling assumptions, such as clickthrough rate maximisation in online advertising.","By contrast, digital health recommendations call for a whole new paradigm of small samples, risk-averse agents and complex, nonparametric modelling.","To this end, we developed new safe, anytime-valid concentration bounds, (Bregman, empirical Chernoff), introduced a new framework for risk-aware contextual bandits (with elicitable risk measures) and analysed a novel class of nonparametric bandit algorithms under weak assumptions (Dirichlet sampling).","In addition to the theoretical guarantees, these results are supported by in-depth empirical evidence.","Finally, as a first step towards personalised postoperative follow-up recommendations, we developed with medical doctors and surgeons an interpretable machine learning model to predict the long-term weight trajectories of patients after bariatric surgery."],"url":"http://arxiv.org/abs/2405.01994v1","category":"stat.ML"} +{"created":"2024-05-03 10:49:27","title":"Evolution of Planetary Chaotic Zones in Planetesimal 
Disks","abstract":"Extensive numerical experiments on the long-term dynamics of planetesimal disks with planets in systems of single stars have been carried out. The planetary chaotic zone clearing timescales $T_\\mathrm{cl}$ as a function of mass parameter $\\mu$ (planet-star mass ratio) have been determined numerically with a high accuracy separately for the outer and inner parts of the chaotic zone. Diffusional components $\\propto \\mu^{-6/7}$ and $\\propto \\mu^{-2}$ have been revealed in the dependence $T_\\mathrm{cl}(\\mu)$. The results obtained are discussed and interpreted in light of existing analytical theories based on the mean motion resonance overlap criterion and in comparison with previous numerical approaches to the problem.","sentences":["Extensive numerical experiments on the long-term dynamics of planetesimal disks with planets in systems of single stars have been carried out.","The planetary chaotic zone clearing timescales $T_\\mathrm{cl}$ as a function of mass parameter $\\mu$ (planet-star mass ratio) have been determined numerically with a high accuracy separately for the outer and inner parts of the chaotic zone.","Diffusional components $\\propto \\mu^{-6/7}$ and $\\propto \\mu^{-2}$ have been revealed in the dependence $T_\\mathrm{cl}(\\mu)$.","The results obtained are discussed and interpreted in light of existing analytical theories based on the mean motion resonance overlap criterion and in comparison with previous numerical approaches to the problem."],"url":"http://arxiv.org/abs/2405.01993v1","category":"astro-ph.EP"} +{"created":"2024-05-03 10:40:26","title":"Noise classification in small quantum networks by Machine Learning","abstract":"We investigate machine learning-based noise classification aimed at the recognition of the Markovian character of a dynamics and the identification of correlations of classical noise, as well as their interplay in small quantum networks. 
We operate control based on Coherent Tunneling by Adiabatic Passage (CTAP) or Stimulated Raman Adiabatic Passage (STIRAP) in a three-level system using different pulse configurations as inputs to train a feedforward neural network. Our results show that supervised learning can classify distinct types of classical diagonal noise affecting the system. Three non-Markovian (quasistatic correlated, anti-correlated, and uncorrelated) and Markovian noise mechanisms are classified with $99\\%$ accuracy. Instead, correlations of Markovian noises cannot be classified with our method. The approach is robust against statistical measurement errors keeping its effectiveness even for physical measurements where a limited number of samples is available.","sentences":["We investigate machine learning-based noise classification aimed at the recognition of the Markovian character of a dynamics and the identification of correlations of classical noise, as well as their interplay in small quantum networks.","We operate control based on Coherent Tunneling by Adiabatic Passage (CTAP) or Stimulated Raman Adiabatic Passage (STIRAP) in a three-level system using different pulse configurations as inputs to train a feedforward neural network.","Our results show that supervised learning can classify distinct types of classical diagonal noise affecting the system.","Three non-Markovian (quasistatic correlated, anti-correlated, and uncorrelated) and Markovian noise mechanisms are classified with $99\\%$ accuracy.","Instead, correlations of Markovian noises cannot be classified with our method.","The approach is robust against statistical measurement errors keeping its effectiveness even for physical measurements where a limited number of samples is available."],"url":"http://arxiv.org/abs/2405.01987v1","category":"quant-ph"} +{"created":"2024-05-03 10:38:39","title":"Systematic study of capture thresholds with time dependent Hartree-Fock theory","abstract":"With the time dependent Hartree-Fock 
(TDHF) theory, capture thresholds $E_{\\rm cap}$ for 144 fusion systems with nearly spherical nuclei are systematically studied for the first time. We find that for the reactions between doubly-magic nuclei, the calculated $E_{\\rm cap}$ are very close to the extracted barrier heights from measured fusion excitation functions. For the fusion reactions with nearly spherical nuclei, an excitation energy of about 1 MeV at the capture position needs to be considered to better reproduce the data due to the lower excitation threshold. The rms deviation with respect to the barrier heights is only 1.43 MeV from the TDHF calculations, which is smaller than the results from three empirical nuclear potentials. Together with Siwek-Wilczy\\'{n}ski formula in which the three parameters are determined by the TDHF calculations, the measured fusion cross sections at energies around the barriers can be well reproduced for seven fusion reactions $^{40}$Ca+$^{48}$Ca, $^{16}$O+$^{208}$Pb, $^{40}$Ca+$^{90,96}$Zr, $^{28}$Si+$^{96}$Zr and $^{132}$Sn+$^{40,48}$Ca.","sentences":["With the time dependent Hartree-Fock (TDHF) theory, capture thresholds $E_{\\rm cap}$ for 144 fusion systems with nearly spherical nuclei are systematically studied for the first time.","We find that for the reactions between doubly-magic nuclei, the calculated $E_{\\rm cap}$ are very close to the extracted barrier heights from measured fusion excitation functions.","For the fusion reactions with nearly spherical nuclei, an excitation energy of about 1 MeV at the capture position needs to be considered to better reproduce the data due to the lower excitation threshold.","The rms deviation with respect to the barrier heights is only 1.43 MeV from the TDHF calculations, which is smaller than the results from three empirical nuclear potentials.","Together with Siwek-Wilczy\\'{n}ski formula in which the three parameters are determined by the TDHF calculations, the measured fusion cross sections at energies around the 
barriers can be well reproduced for seven fusion reactions $^{40}$Ca+$^{48}$Ca, $^{16}$O+$^{208}$Pb, $^{40}$Ca+$^{90,96}$Zr, $^{28}$Si+$^{96}$Zr and $^{132}$Sn+$^{40,48}$Ca."],"url":"http://arxiv.org/abs/2405.01985v1","category":"nucl-th"} +{"created":"2024-05-03 10:18:35","title":"Universal Performance Gap of Neural Quantum States Applied to the Hofstadter-Bose-Hubbard Model","abstract":"Neural Quantum States (NQS) have demonstrated significant potential in approximating ground states of many-body quantum systems, though their performance can be inconsistent across different models. This study investigates the performance of NQS in approximating the ground state of the Hofstadter-Bose-Hubbard (HBH) model, a boson system on a two-dimensional square lattice with a perpendicular magnetic field. Our results indicate that increasing magnetic flux leads to a substantial increase in energy error, up to three orders of magnitude. Importantly, this decline in NQS performance is consistent across different optimization methods, neural network architectures, and physical model parameters, suggesting a fundamental challenge intrinsic to the model. Despite investigating potential causes such as wave function phase structure, quantum entanglement, fractional quantum Hall effect, and the variational loss landscape, the precise reasons for this degradation remain elusive. The HBH model thus proves to be an effective testing ground for exploring the capabilities and limitations of NQS. 
Our study highlights the need for advanced theoretical frameworks to better understand the expressive power of NQS which would allow a systematic development of methods that could potentially overcome these challenges.","sentences":["Neural Quantum States (NQS) have demonstrated significant potential in approximating ground states of many-body quantum systems, though their performance can be inconsistent across different models.","This study investigates the performance of NQS in approximating the ground state of the Hofstadter-Bose-Hubbard (HBH) model, a boson system on a two-dimensional square lattice with a perpendicular magnetic field.","Our results indicate that increasing magnetic flux leads to a substantial increase in energy error, up to three orders of magnitude.","Importantly, this decline in NQS performance is consistent across different optimization methods, neural network architectures, and physical model parameters, suggesting a fundamental challenge intrinsic to the model.","Despite investigating potential causes such as wave function phase structure, quantum entanglement, fractional quantum Hall effect, and the variational loss landscape, the precise reasons for this degradation remain elusive.","The HBH model thus proves to be an effective testing ground for exploring the capabilities and limitations of NQS.","Our study highlights the need for advanced theoretical frameworks to better understand the expressive power of NQS which would allow a systematic development of methods that could potentially overcome these challenges."],"url":"http://arxiv.org/abs/2405.01981v1","category":"quant-ph"} +{"created":"2024-05-03 10:00:45","title":"Conformal Prediction for Natural Language Processing: A Survey","abstract":"The rapid proliferation of large language models and natural language processing (NLP) applications creates a crucial need for uncertainty quantification to mitigate risks such as hallucinations and to enhance decision-making reliability in 
critical applications. Conformal prediction is emerging as a theoretically sound and practically useful framework, combining flexibility with strong statistical guarantees. Its model-agnostic and distribution-free nature makes it particularly promising to address the current shortcomings of NLP systems that stem from the absence of uncertainty quantification. This paper provides a comprehensive survey of conformal prediction techniques, their guarantees, and existing applications in NLP, pointing to directions for future research and open challenges.","sentences":["The rapid proliferation of large language models and natural language processing (NLP) applications creates a crucial need for uncertainty quantification to mitigate risks such as hallucinations and to enhance decision-making reliability in critical applications.","Conformal prediction is emerging as a theoretically sound and practically useful framework, combining flexibility with strong statistical guarantees.","Its model-agnostic and distribution-free nature makes it particularly promising to address the current shortcomings of NLP systems that stem from the absence of uncertainty quantification.","This paper provides a comprehensive survey of conformal prediction techniques, their guarantees, and existing applications in NLP, pointing to directions for future research and open challenges."],"url":"http://arxiv.org/abs/2405.01976v1","category":"cs.CL"} +{"created":"2024-05-03 09:53:28","title":"A Sonar-based AUV Positioning System for Underwater Environments with Low Infrastructure Density","abstract":"The increasing demand for underwater vehicles highlights the necessity for robust localization solutions in inspection missions. In this work, we present a novel real-time sonar-based underwater global positioning algorithm for AUVs (Autonomous Underwater Vehicles) designed for environments with a sparse distribution of human-made assets. 
Our approach exploits two synergistic data interpretation frontends applied to the same stream of sonar data acquired by a multibeam Forward-Looking Sonar (FSD). These observations are fused within a Particle Filter (PF) either to weigh more particles that belong to high-likelihood regions or to solve symmetric ambiguities. Preliminary experiments carried out on a simulated environment resembling a real underwater plant provided promising results. This work represents a starting point towards future developments of the method and consequent exhaustive evaluations also in real-world scenarios.","sentences":["The increasing demand for underwater vehicles highlights the necessity for robust localization solutions in inspection missions.","In this work, we present a novel real-time sonar-based underwater global positioning algorithm for AUVs (Autonomous Underwater Vehicles) designed for environments with a sparse distribution of human-made assets.","Our approach exploits two synergistic data interpretation frontends applied to the same stream of sonar data acquired by a multibeam Forward-Looking Sonar (FSD).","These observations are fused within a Particle Filter (PF) either to weigh more particles that belong to high-likelihood regions or to solve symmetric ambiguities.","Preliminary experiments carried out on a simulated environment resembling a real underwater plant provided promising results.","This work represents a starting point towards future developments of the method and consequent exhaustive evaluations also in real-world scenarios."],"url":"http://arxiv.org/abs/2405.01971v1","category":"cs.RO"} +{"created":"2024-05-03 09:53:12","title":"Direct detectability of tidally heated exomoons by photometric orbital modulation","abstract":"(Aims) We investigate whether volcanic exomoons can be detected in thermal wavelength light curves due to their phase variability along their orbit. 
The method we use is based on the photometric signal variability that volcanic features or hotspots would cause in infrared (IR) wavelengths, when they are inhomogeneously distributed on the surface of a tidally heated exomoon (THEM). (Methods) We simulated satellites of various sizes around an isolated planet and modeled the system's variability in two IR wavelengths, taking into account photon shot noise. The moon's periodic signal as it orbits the planet introduces a peak in the frequency space of the system's time-variable flux. We investigated the THEM and system properties that would make a moon stand out in the frequency space of its host's variable flux. (Results) The moon's signal can produce a prominent feature in its host's flux periodogram at shorter IR wavelengths for hotspots with temperatures similar to the ones seen on the Jovian moon, Io, while the same moon would not be identifiable in longer IR wavelengths. By comparing observations at two different wavelengths, we are able to disentangle an exomoon's signal from the planet's one in the frequency domain for system distances up to $\\sim$10 pc for Mars-sized exomoons and even further for Earth-sized ones for transiting and non-transiting orbital inclinations. (Conclusions) This method enlarges the parameter space of detectable exomoons around isolated planetary mass objects and directly imaged exoplanets, as it is sensitive to Io-Earth sized exomoons with hot volcanic features for a wide range of non-transiting orbital inclinations. 
Exomoon transits and the detection of outgassed volcanic molecules can subsequently confirm a putative detection.","sentences":["(Aims)","We investigate whether volcanic exomoons can be detected in thermal wavelength light curves due to their phase variability along their orbit.","The method we use is based on the photometric signal variability that volcanic features or hotspots would cause in infrared (IR) wavelengths, when they are inhomogeneously distributed on the surface of a tidally heated exomoon (THEM).","(Methods)","We simulated satellites of various sizes around an isolated planet and modeled the system's variability in two IR wavelengths, taking into account photon shot noise.","The moon's periodic signal as it orbits the planet introduces a peak in the frequency space of the system's time-variable flux.","We investigated the THEM and system properties that would make a moon stand out in the frequency space of its host's variable flux.","(Results)","The moon's signal can produce a prominent feature in its host's flux periodogram at shorter IR wavelengths for hotspots with temperatures similar to the ones seen on the Jovian moon, Io, while the same moon would not be identifiable in longer IR wavelengths.","By comparing observations at two different wavelengths, we are able to disentangle an exomoon's signal from the planet's one in the frequency domain for system distances up to $\\sim$10 pc for Mars-sized exomoons and even further for Earth-sized ones for transiting and non-transiting orbital inclinations.","(Conclusions)","This method enlarges the parameter space of detectable exomoons around isolated planetary mass objects and directly imaged exoplanets, as it is sensitive to Io-Earth sized exomoons with hot volcanic features for a wide range of non-transiting orbital inclinations.","Exomoon transits and the detection of outgassed volcanic molecules can subsequently confirm a putative 
detection."],"url":"http://arxiv.org/abs/2405.01970v1","category":"astro-ph.EP"} +{"created":"2024-05-03 09:45:51","title":"Convex optimization on CAT(0) cubical complexes","abstract":"We consider geodesically convex optimization problems involving distances to a finite set of points $A$ in a CAT(0) cubical complex. Examples include the minimum enclosing ball problem, the weighted mean and median problems, and the feasibility and projection problems for intersecting balls with centers in $A$. We propose a decomposition approach relying on standard Euclidean cutting plane algorithms. The cutting planes are readily derivable from efficient algorithms for computing geodesics in the complex.","sentences":["We consider geodesically convex optimization problems involving distances to a finite set of points $A$ in a CAT(0) cubical complex.","Examples include the minimum enclosing ball problem, the weighted mean and median problems, and the feasibility and projection problems for intersecting balls with centers in $A$.","We propose a decomposition approach relying on standard Euclidean cutting plane algorithms.","The cutting planes are readily derivable from efficient algorithms for computing geodesics in the complex."],"url":"http://arxiv.org/abs/2405.01968v1","category":"math.OC"} +{"created":"2024-05-03 09:33:43","title":"The atomizing pulsed jet","abstract":"Direct Numerical Simulations of the injection of a pulsed round liquid jet in a stagnant gas are performed. The Reynolds and Weber numbers and the density ratio are sufficiently large for reaching a complex high-speed atomization regime. The Weber number based on grid size is small, an indication that the simulations are very well resolved. Computations are performed using octree adaptive mesh refinement using the Basilisk free-code platform, down to a specified minimum grid size $\\Delta$. 
Qualitative analysis of the flow and its topology reveal a complex structure of ligaments, sheets, droplets and bubbles that evolve and interact through impacts, ligament breakup, sheet rupture and engulfment of air bubbles in the liquid. A rich gallery of images of entangled structures is produced. Most processes occurring in this type of atomization are reproduced in detail, except at the instant of thin sheet perforation or breakup. We analyze drop statistics, showing that as the grid resolution is increased, the small-scale part of the distribution does not converge, and contains a large number of droplets close in order of magnitude to the minimum grid size with a significant peak at $d = 3\\Delta$ . This non-convergence arises from the {\\em numerical sheet breakup} effect, in which the interface becomes rough just before it breaks. The rough appearance of the interface is associated to a high-wavenumber oscillation of the curvature. To recover convergence, we apply the controlled \"manifold death\" numerical procedure, in which thin sheets are detected, and then pierced by fiat before they reach a set critical thickness $h_c$ that is always larger than $6 \\Delta$. This allows convergence of the droplet frequency above a certain critical diameter $d_c$ above and close to $h_c$. A unimodal distribution is observed in the converged range.","sentences":["Direct Numerical Simulations of the injection of a pulsed round liquid jet in a stagnant gas are performed.","The Reynolds and Weber numbers and the density ratio are sufficiently large for reaching a complex high-speed atomization regime.","The Weber number based on grid size is small, an indication that the simulations are very well resolved.","Computations are performed using octree adaptive mesh refinement using the Basilisk free-code platform, down to a specified minimum grid size $\\Delta$. 
Qualitative analysis of the flow and its topology reveal a complex structure of ligaments, sheets, droplets and bubbles that evolve and interact through impacts, ligament breakup, sheet rupture and engulfment of air bubbles in the liquid.","A rich gallery of images of entangled structures is produced.","Most processes occurring in this type of atomization are reproduced in detail, except at the instant of thin sheet perforation or breakup.","We analyze drop statistics, showing that as the grid resolution is increased, the small-scale part of the distribution does not converge, and contains a large number of droplets close in order of magnitude to the minimum grid size with a significant peak at $d = 3\\Delta$ .","This non-convergence arises from the {\\em numerical sheet breakup} effect, in which the interface becomes rough just before it breaks.","The rough appearance of the interface is associated to a high-wavenumber oscillation of the curvature.","To recover convergence, we apply the controlled \"manifold death\" numerical procedure, in which thin sheets are detected, and then pierced by fiat before they reach a set critical thickness $h_c$ that is always larger than $6 \\Delta$. This allows convergence of the droplet frequency above a certain critical diameter $d_c$ above and close to $h_c$. A unimodal distribution is observed in the converged range."],"url":"http://arxiv.org/abs/2405.01959v1","category":"physics.flu-dyn"} +{"created":"2024-05-03 09:23:51","title":"Probabilistic Lagrangian bias estimators and the cumulant bias expansion","abstract":"The spatial distribution of galaxies is a highly complex phenomenon currently impossible to predict deterministically. However, by using a statistical $\\textit{bias}$ relation, it becomes possible to robustly model the average abundance of galaxies as a function of the underlying matter density field. 
Understanding the properties and parametric description of the bias relation is key to extract cosmological information from future galaxy surveys. Here, we contribute to this topic primarily in two ways: (1) We develop a new set of probabilistic estimators for bias parameters using the moments of the Lagrangian galaxy environment distribution. These estimators include spatial corrections at different orders to measure bias parameters independently of the damping scale. We report robust measurements of a variety of bias parameters for haloes, including the tidal bias and its dependence with spin at a fixed mass. (2) We propose an alternative formulation of the bias expansion in terms of \"cumulant bias parameters\" that describe the response of the logarithmic galaxy density to large-scale perturbations. We find that cumulant biases of haloes are consistent with zero at orders $n > 2$. This suggests that: (i) previously reported bias relations at order $n > 2$ are an artefact of the entangled basis of the canonical bias expansion; (ii) the convergence of the bias expansion may be improved by phrasing it in terms of cumulants; (iii) the bias function is very well approximated by a Gaussian -- an avenue which we explore in a companion paper.","sentences":["The spatial distribution of galaxies is a highly complex phenomenon currently impossible to predict deterministically.","However, by using a statistical $\\textit{bias}$ relation, it becomes possible to robustly model the average abundance of galaxies as a function of the underlying matter density field.","Understanding the properties and parametric description of the bias relation is key to extract cosmological information from future galaxy surveys.","Here, we contribute to this topic primarily in two ways: (1) We develop a new set of probabilistic estimators for bias parameters using the moments of the Lagrangian galaxy environment distribution.","These estimators include spatial corrections at different orders 
to measure bias parameters independently of the damping scale.","We report robust measurements of a variety of bias parameters for haloes, including the tidal bias and its dependence with spin at a fixed mass.","(2) We propose an alternative formulation of the bias expansion in terms of \"cumulant bias parameters\" that describe the response of the logarithmic galaxy density to large-scale perturbations.","We find that cumulant biases of haloes are consistent with zero at orders $n > 2$.","This suggests that: (i) previously reported bias relations at order $n > 2$ are an artefact of the entangled basis of the canonical bias expansion; (ii) the convergence of the bias expansion may be improved by phrasing it in terms of cumulants; (iii) the bias function is very well approximated by a Gaussian -- an avenue which we explore in a companion paper."],"url":"http://arxiv.org/abs/2405.01950v1","category":"astro-ph.CO"} +{"created":"2024-05-03 09:20:50","title":"Complex pattern formation governed by a Cahn-Hilliard-Swift-Hohenberg system: Analysis and numerical simulations","abstract":"This paper investigates a Cahn-Hilliard-Swift-Hohenberg system, focusing on a three-species chemical mixture subject to physical constraints on volume fractions. The resulting system leads to complex patterns involving a separation into phases as typical of the Cahn-Hilliard equation and small scale stripes and dots as seen in the Swift-Hohenberg equation. We introduce singular potentials of logarithmic type to enhance the model's accuracy in adhering to essential physical constraints. The paper establishes the existence and uniqueness of weak solutions within this extended framework. The insights gained contribute to a deeper understanding of phase separation in complex systems, with potential applications in materials science and related fields. We introduce a stable finite element approximation based on an obstacle formulation. 
Subsequent numerical simulations demonstrate that the model allows for complex structures as seen in pigment patterns of animals and in porous polymeric materials.","sentences":["This paper investigates a Cahn-Hilliard-Swift-Hohenberg system, focusing on a three-species chemical mixture subject to physical constraints on volume fractions.","The resulting system leads to complex patterns involving a separation into phases as typical of the Cahn-Hilliard equation and small scale stripes and dots as seen in the Swift-Hohenberg equation.","We introduce singular potentials of logarithmic type to enhance the model's accuracy in adhering to essential physical constraints.","The paper establishes the existence and uniqueness of weak solutions within this extended framework.","The insights gained contribute to a deeper understanding of phase separation in complex systems, with potential applications in materials science and related fields.","We introduce a stable finite element approximation based on an obstacle formulation.","Subsequent numerical simulations demonstrate that the model allows for complex structures as seen in pigment patterns of animals and in porous polymeric materials."],"url":"http://arxiv.org/abs/2405.01947v1","category":"math.AP"} +{"created":"2024-05-03 09:19:24","title":"Shortcuts to adiabaticity in harmonic traps: a quantum-classical analog","abstract":"We present a new technique for efficiently transitioning a quantum system from an initial to a final stationary state in less time than is required by an adiabatic (quasi-static) process. Our approach makes use of Nelson's stochastic quantization, which represents the quantum system as a classical Brownian process. Thanks to this mathematical analogy, known protocols for classical overdamped systems can be translated into quantum protocols. 
In particular, one can use classical methods to find optimal quantum protocols that minimize both the time duration and some other cost function to be freely specified. We have applied this method to the time-dependent harmonic oscillator and tested it on two different cost functions: (i) the cumulative energy of the system over time and (ii) the dynamical phase of the wavefunction. In the latter case, it is possible to construct protocols that are \"adiabatically optimal\", i.e., they minimize their distance from an adiabatic process for a given duration.","sentences":["We present a new technique for efficiently transitioning a quantum system from an initial to a final stationary state in less time than is required by an adiabatic (quasi-static) process.","Our approach makes use of Nelson's stochastic quantization, which represents the quantum system as a classical Brownian process.","Thanks to this mathematical analogy, known protocols for classical overdamped systems can be translated into quantum protocols.","In particular, one can use classical methods to find optimal quantum protocols that minimize both the time duration and some other cost function to be freely specified.","We have applied this method to the time-dependent harmonic oscillator and tested it on two different cost functions: (i) the cumulative energy of the system over time and (ii) the dynamical phase of the wavefunction.","In the latter case, it is possible to construct protocols that are \"adiabatically optimal\", i.e., they minimize their distance from an adiabatic process for a given duration."],"url":"http://arxiv.org/abs/2405.01946v1","category":"quant-ph"} +{"created":"2024-05-03 09:07:01","title":"Giant effective $g$-factor due to spin bifurcations in polariton condensates","abstract":"We predict giant susceptibility of spin-bifurcating polariton condensates to externally applied permanent magnetic field. 
In the presence of spin-anisotropic polariton-polariton interactions, the condensate spontaneously takes an elliptically polarised state, whose perturbation dynamics can be interpreted in terms of the presence of strong effective magnetic field significantly surpassing the external one. Surprisingly, this behaviour of the addressed strongly out-of-equilibrium system in the vicinity of a critical point exhibits intriguing analogy with the second-order phase transition. The predicted field-enhancement effect can be utilized for creation of topologically nontrivial states of Bogoliubov's excitations existing on top of the polariton condensate.","sentences":["We predict giant susceptibility of spin-bifurcating polariton condensates to externally applied permanent magnetic field.","In the presence of spin-anisotropic polariton-polariton interactions, the condensate spontaneously takes an elliptically polarised state, whose perturbation dynamics can be interpreted in terms of the presence of strong effective magnetic field significantly surpassing the external one.","Surprisingly, this behaviour of the addressed strongly out-of-equilibrium system in the vicinity of a critical point exhibits intriguing analogy with the second-order phase transition.","The predicted field-enhancement effect can be utilized for creation of topologically nontrivial states of Bogoliubov's excitations existing on top of the polariton condensate."],"url":"http://arxiv.org/abs/2405.01941v1","category":"cond-mat.mes-hall"} +{"created":"2024-05-03 09:05:05","title":"On the Relative Completeness of Satisfaction-based Quantum Hoare Logic","abstract":"Quantum Hoare logic (QHL) is a formal verification tool specifically designed to ensure the correctness of quantum programs. There has been an ongoing challenge to achieve a relatively complete satisfaction-based QHL with while-loop since its inception in 2006. 
This paper presents a solution by proposing the first relatively complete satisfaction-based QHL with while-loop. The completeness is proved in two steps. First, we establish a semantics and proof system of Hoare triples with quantum programs and deterministic assertions. Then, by utilizing the weakest precondition of deterministic assertion, we construct the weakest preterm calculus of probabilistic expressions. The relative completeness of QHL is then obtained as a consequence of the weakest preterm calculus. Using our QHL, we formally verify the correctness of Deutsch's algorithm and quantum teleportation.","sentences":["Quantum Hoare logic (QHL) is a formal verification tool specifically designed to ensure the correctness of quantum programs.","There has been an ongoing challenge to achieve a relatively complete satisfaction-based QHL with while-loop since its inception in 2006.","This paper presents a solution by proposing the first relatively complete satisfaction-based QHL with while-loop.","The completeness is proved in two steps.","First, we establish a semantics and proof system of Hoare triples with quantum programs and deterministic assertions.","Then, by utilizing the weakest precondition of deterministic assertion, we construct the weakest preterm calculus of probabilistic expressions.","The relative completeness of QHL is then obtained as a consequence of the weakest preterm calculus.","Using our QHL, we formally verify the correctness of Deutsch's algorithm and quantum teleportation."],"url":"http://arxiv.org/abs/2405.01940v1","category":"cs.LO"} +{"created":"2024-05-03 09:04:09","title":"Conservative semi-lagrangian finite difference scheme for transport simulations using graph neural networks","abstract":"Semi-Lagrangian (SL) schemes are highly efficient for simulating transport equations and are widely used across various applications. 
Despite their success, designing genuinely multi-dimensional and conservative SL schemes remains a significant challenge. Building on our previous work [Chen et al., J. Comput. Phys., V490 112329, (2023)], we introduce a conservative machine-learning-based SL finite difference (FD) method that allows for extra-large time step evolution. At the core of our approach is a novel dynamical graph neural network designed to handle the complexities associated with tracking accurately upstream points along characteristics. This proposed neural transport solver learns the conservative SL FD discretization directly from data, improving accuracy and efficiency compared to traditional numerical schemes, while significantly simplifying algorithm implementation. We validate the method's effectiveness and efficiency through numerical tests on benchmark transport equations in both one and two dimensions, as well as the nonlinear Vlasov-Poisson system.","sentences":["Semi-Lagrangian (SL) schemes are highly efficient for simulating transport equations and are widely used across various applications.","Despite their success, designing genuinely multi-dimensional and conservative SL schemes remains a significant challenge.","Building on our previous work [Chen et al., J. 
Comput.","Phys., V490 112329, (2023)], we introduce a conservative machine-learning-based SL finite difference (FD) method that allows for extra-large time step evolution.","At the core of our approach is a novel dynamical graph neural network designed to handle the complexities associated with tracking accurately upstream points along characteristics.","This proposed neural transport solver learns the conservative SL FD discretization directly from data, improving accuracy and efficiency compared to traditional numerical schemes, while significantly simplifying algorithm implementation.","We validate the method's effectiveness and efficiency through numerical tests on benchmark transport equations in both one and two dimensions, as well as the nonlinear Vlasov-Poisson system."],"url":"http://arxiv.org/abs/2405.01938v1","category":"math.NA"} +{"created":"2024-05-03 08:44:31","title":"Novel Local Characteristic Decomposition Based Path-Conservative Central-Upwind Schemes","abstract":"We introduce local characteristic decomposition based path-conservative central-upwind schemes for (nonconservative) hyperbolic systems of balance laws. The proposed schemes are made to be well-balanced via a flux globalization approach, in which source terms are incorporated into the fluxes: This helps to enforce the well-balanced property when the resulting quasi-conservative system is solved using the local characteristic decomposition based central-upwind scheme recently introduced in [{\\sc A. Chertock, S. Chu, M. Herty, A. Kurganov, and M. Luk\\'{a}\\v{c}ov\\'{a}-Medvi{\\softd}ov\\'{a}}, J. Comput. Phys., 473 (2023), Paper No. 111718]. Nonconservative product terms are also incorporated into the global fluxes using a path-conservative technique. 
We illustrate the performance of the developed schemes by applying them to one- and two-dimensional compressible multifluid systems and thermal rotating shallow water equations.","sentences":["We introduce local characteristic decomposition based path-conservative central-upwind schemes for (nonconservative) hyperbolic systems of balance laws.","The proposed schemes are made to be well-balanced via a flux globalization approach, in which source terms are incorporated into the fluxes: This helps to enforce the well-balanced property when the resulting quasi-conservative system is solved using the local characteristic decomposition based central-upwind scheme recently introduced in [{\\sc A. Chertock, S. Chu, M. Herty, A. Kurganov, and M. Luk\\'{a}\\v{c}ov\\'{a}-Medvi{\\softd}ov\\'{a}}, J. Comput.","Phys., 473 (2023), Paper No. 111718].","Nonconservative product terms are also incorporated into the global fluxes using a path-conservative technique.","We illustrate the performance of the developed schemes by applying them to one-","and two-dimensional compressible multifluid systems and thermal rotating shallow water equations."],"url":"http://arxiv.org/abs/2405.01929v1","category":"math.NA"} +{"created":"2024-05-03 08:44:04","title":"SlotGAT: Slot-based Message Passing for Heterogeneous Graph Neural Network","abstract":"Heterogeneous graphs are ubiquitous to model complex data. There are urgent needs on powerful heterogeneous graph neural networks to effectively support important applications. We identify a potential semantic mixing issue in existing message passing processes, where the representations of the neighbors of a node $v$ are forced to be transformed to the feature space of $v$ for aggregation, though the neighbors are in different types. That is, the semantics in different node types are entangled together into node $v$'s representation. 
To address the issue, we propose SlotGAT with separate message passing processes in slots, one for each node type, to maintain the representations in their own node-type feature spaces. Moreover, in a slot-based message passing layer, we design an attention mechanism for effective slot-wise message aggregation. Further, we develop a slot attention technique after the last layer of SlotGAT, to learn the importance of different slots in downstream tasks. Our analysis indicates that the slots in SlotGAT can preserve different semantics in various feature spaces. The superiority of SlotGAT is evaluated against 13 baselines on 6 datasets for node classification and link prediction. Our code is at https://github.com/scottjiao/SlotGAT_ICML23/.","sentences":["Heterogeneous graphs are ubiquitous to model complex data.","There are urgent needs on powerful heterogeneous graph neural networks to effectively support important applications.","We identify a potential semantic mixing issue in existing message passing processes, where the representations of the neighbors of a node $v$ are forced to be transformed to the feature space of $v$ for aggregation, though the neighbors are in different types.","That is, the semantics in different node types are entangled together into node $v$'s representation.","To address the issue, we propose SlotGAT with separate message passing processes in slots, one for each node type, to maintain the representations in their own node-type feature spaces.","Moreover, in a slot-based message passing layer, we design an attention mechanism for effective slot-wise message aggregation.","Further, we develop a slot attention technique after the last layer of SlotGAT, to learn the importance of different slots in downstream tasks.","Our analysis indicates that the slots in SlotGAT can preserve different semantics in various feature spaces.","The superiority of SlotGAT is evaluated against 13 baselines on 6 datasets for node classification and link 
prediction.","Our code is at https://github.com/scottjiao/SlotGAT_ICML23/."],"url":"http://arxiv.org/abs/2405.01927v1","category":"cs.LG"} +{"created":"2024-05-03 08:23:39","title":"Lightweight Change Detection in Heterogeneous Remote Sensing Images with Online All-Integer Pruning Training","abstract":"Detection of changes in heterogeneous remote sensing images is vital, especially in response to emergencies like earthquakes and floods. Current homogenous transformation-based change detection (CD) methods often suffer from high computation and memory costs, which are not friendly to edge-computation devices like onboard CD devices at satellites. To address this issue, this paper proposes a new lightweight CD method for heterogeneous remote sensing images that employs the online all-integer pruning (OAIP) training strategy to efficiently fine-tune the CD network using the current test data. The proposed CD network consists of two visual geometry group (VGG) subnetworks as the backbone architecture. In the OAIP-based training process, all the weights, gradients, and intermediate data are quantized to integers to speed up training and reduce memory usage, where the per-layer block exponentiation scaling scheme is utilized to reduce the computation errors of network parameters caused by quantization. Second, an adaptive filter-level pruning method based on the L1-norm criterion is employed to further lighten the fine-tuning process of the CD network. 
Experimental results show that the proposed OAIP-based method attains similar detection performance (but with significantly reduced computation complexity and memory usage) in comparison with state-of-the-art CD methods.","sentences":["Detection of changes in heterogeneous remote sensing images is vital, especially in response to emergencies like earthquakes and floods.","Current homogenous transformation-based change detection (CD) methods often suffer from high computation and memory costs, which are not friendly to edge-computation devices like onboard CD devices at satellites.","To address this issue, this paper proposes a new lightweight CD method for heterogeneous remote sensing images that employs the online all-integer pruning (OAIP) training strategy to efficiently fine-tune the CD network using the current test data.","The proposed CD network consists of two visual geometry group (VGG) subnetworks as the backbone architecture.","In the OAIP-based training process, all the weights, gradients, and intermediate data are quantized to integers to speed up training and reduce memory usage, where the per-layer block exponentiation scaling scheme is utilized to reduce the computation errors of network parameters caused by quantization.","Second, an adaptive filter-level pruning method based on the L1-norm criterion is employed to further lighten the fine-tuning process of the CD network.","Experimental results show that the proposed OAIP-based method attains similar detection performance (but with significantly reduced computation complexity and memory usage) in comparison with state-of-the-art CD methods."],"url":"http://arxiv.org/abs/2405.01920v1","category":"cs.CV"} +{"created":"2024-05-03 17:37:00","title":"The injectivity radius of the compact Stiefel manifold under the Euclidean metric","abstract":"The injectivity radius of a manifold is an important quantity, both from a theoretical point of view and in terms of numerical applications. 
It is the largest possible radius within which all geodesics are unique and length-minimizing. In consequence, it is the largest possible radius within which calculations in Riemannian normal coordinates are well-defined. A matrix manifold that arises frequently in a wide range of practical applications is the compact Stiefel manifold of orthogonal $p$-frames in $\\mathbb{R}^n$. We observe that geodesics on this manifold are space curves of constant Frenet curvatures. Using this fact, we prove that the injectivity radius on the Stiefel manifold under the Euclidean metric is $\\pi$.","sentences":["The injectivity radius of a manifold is an important quantity, both from a theoretical point of view and in terms of numerical applications.","It is the largest possible radius within which all geodesics are unique and length-minimizing.","In consequence, it is the largest possible radius within which calculations in Riemannian normal coordinates are well-defined.","A matrix manifold that arises frequently in a wide range of practical applications is the compact Stiefel manifold of orthogonal $p$-frames in $\\mathbb{R}^n$. We observe that geodesics on this manifold are space curves of constant Frenet curvatures.","Using this fact, we prove that the injectivity radius on the Stiefel manifold under the Euclidean metric is $\\pi$."],"url":"http://arxiv.org/abs/2405.02268v1","category":"math.DG"} +{"created":"2024-05-03 17:24:08","title":"Comparing Personalized Relevance Algorithms for Directed Graphs","abstract":"We present an interactive Web platform that, given a directed graph, allows identifying the most relevant nodes related to a given query node. Besides well-established algorithms such as PageRank and Personalized PageRank, the demo includes Cyclerank, a novel algorithm that addresses some of their limitations by leveraging cyclic paths to compute personalized relevance scores. 
Our demo design enables two use cases: (a) algorithm comparison, comparing the results obtained with different algorithms, and (b) dataset comparison, for exploring and gaining insights into a dataset and comparing it with others. We provide 50 pre-loaded datasets from Wikipedia, Twitter, and Amazon and seven algorithms. Users can upload new datasets, and new algorithms can be easily added. By showcasing efficient algorithms to compute relevance scores in directed graphs, our tool helps to uncover hidden relationships within the data, which makes of it a valuable addition to the repertoire of graph analysis algorithms.","sentences":["We present an interactive Web platform that, given a directed graph, allows identifying the most relevant nodes related to a given query node.","Besides well-established algorithms such as PageRank and Personalized PageRank, the demo includes Cyclerank, a novel algorithm that addresses some of their limitations by leveraging cyclic paths to compute personalized relevance scores.","Our demo design enables two use cases: (a) algorithm comparison, comparing the results obtained with different algorithms, and (b) dataset comparison, for exploring and gaining insights into a dataset and comparing it with others.","We provide 50 pre-loaded datasets from Wikipedia, Twitter, and Amazon and seven algorithms.","Users can upload new datasets, and new algorithms can be easily added.","By showcasing efficient algorithms to compute relevance scores in directed graphs, our tool helps to uncover hidden relationships within the data, which makes of it a valuable addition to the repertoire of graph analysis algorithms."],"url":"http://arxiv.org/abs/2405.02261v1","category":"cs.IR"} +{"created":"2024-05-03 17:22:15","title":"Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows","abstract":"Domain experts can play a crucial role in guiding data scientists to optimize machine learning models while ensuring 
contextual relevance for downstream use. However, in current workflows, such collaboration is challenging due to differing expertise, abstract documentation practices, and lack of access and visibility into low-level implementation artifacts. To address these challenges and enable domain expert participation, we introduce CellSync, a collaboration framework comprising (1) a Jupyter Notebook extension that continuously tracks changes to dataframes and model metrics and (2) a Large Language Model powered visualization dashboard that makes those changes interpretable to domain experts. Through CellSync's cell-level dataset visualization with code summaries, domain experts can interactively examine how individual data and modeling operations impact different data segments. The chat features enable data-centric conversations and targeted feedback to data scientists. Our preliminary evaluation shows that CellSync provides transparency and promotes critical discussions about the intents and implications of data operations.","sentences":["Domain experts can play a crucial role in guiding data scientists to optimize machine learning models while ensuring contextual relevance for downstream use.","However, in current workflows, such collaboration is challenging due to differing expertise, abstract documentation practices, and lack of access and visibility into low-level implementation artifacts.","To address these challenges and enable domain expert participation, we introduce CellSync, a collaboration framework comprising (1) a Jupyter Notebook extension that continuously tracks changes to dataframes and model metrics and (2) a Large Language Model powered visualization dashboard that makes those changes interpretable to domain experts.","Through CellSync's cell-level dataset visualization with code summaries, domain experts can interactively examine how individual data and modeling operations impact different data segments.","The chat features enable data-centric 
conversations and targeted feedback to data scientists.","Our preliminary evaluation shows that CellSync provides transparency and promotes critical discussions about the intents and implications of data operations."],"url":"http://arxiv.org/abs/2405.02260v1","category":"cs.HC"} +{"created":"2024-05-03 17:12:59","title":"QCD analysis of valence structure functions using deep inelastic lepton-nucleon scattering","abstract":"A new ''$\\mathtt{SK24}$'' non-singlet QCD analysis of the structure functions at the NNLO approximation is performed, utilizing the global fit of the data from various charged lepton scattering experiments. We extract the valence parton distribution functions (PDFs) and provide a parametrization of them, along with the correlated errors for a wide range of $x$ and $Q^2$. We compare valence PDFs and their uncertainties with those from different PDF sets provided by various groups. We also obtain valence PDFs and the strong coupling constant $\\alpha_{s}(M_Z^2)$, taking into account the nuclear correction concerning large $x$ as well as the target mass correction (TMC) and higher twist (HT) effects at the NNLO. In the large $x$ region, we extract the higher twist contributions of $xF_3(x,Q^2)$, $F_2^p (x,Q^2)$, and $F_2^d(x,Q^2)$. We determine $\\alpha_{s}(M_Z^2)$ without and with considering the TMC and HT corrections and perform a comparison with the world average of $\\alpha_{s}(M_Z^2)$ and other reported results. The extracted results concerning valence PDFs with their uncertainties and $\\alpha_{s}(M_Z^2)$ value agree with available theoretical models.","sentences":["A new ''$\\mathtt{SK24}$'' non-singlet QCD analysis of the structure functions at the NNLO approximation is performed, utilizing the global fit of the data from various charged lepton scattering experiments.","We extract the valence parton distribution functions (PDFs) and provide a parametrization of them, along with the correlated errors for a wide range of $x$ and $Q^2$. 
We compare valence PDFs and their uncertainties with those from different PDF sets provided by various groups.","We also obtain valence PDFs and the strong coupling constant $\\alpha_{s}(M_Z^2)$, taking into account the nuclear correction concerning large $x$ as well as the target mass correction (TMC) and higher twist (HT) effects at the NNLO.","In the large $x$ region, we extract the higher twist contributions of $xF_3(x,Q^2)$, $F_2^p (x,Q^2)$, and $F_2^d(x,Q^2)$. We determine $\\alpha_{s}(M_Z^2)$ without and with considering the TMC and HT corrections and perform a comparison with the world average of $\\alpha_{s}(M_Z^2)$ and other reported results.","The extracted results concerning valence PDFs with their uncertainties and $\\alpha_{s}(M_Z^2)$ value agree with available theoretical models."],"url":"http://arxiv.org/abs/2405.02254v1","category":"hep-ph"} +{"created":"2024-05-03 16:51:18","title":"Subgraph2vec: A random walk-based algorithm for embedding knowledge graphs","abstract":"Graph is an important data representation which occurs naturally in the real world applications \\cite{goyal2018graph}. Therefore, analyzing graphs provides users with better insights in different areas such as anomaly detection \\cite{ma2021comprehensive}, decision making \\cite{fan2023graph}, clustering \\cite{tsitsulin2023graph}, classification \\cite{wang2021mixup} and etc. However, most of these methods require high levels of computational time and space. We can use other ways like embedding to reduce these costs. Knowledge graph (KG) embedding is a technique that aims to achieve the vector representation of a KG. It represents entities and relations of a KG in a low-dimensional space while maintaining the semantic meanings of them. There are different methods for embedding graphs including random walk-based methods such as node2vec, metapath2vec and regpattern2vec. However, most of these methods bias the walks based on a rigid pattern usually hard-coded in the algorithm. 
In this work, we introduce \\textit{subgraph2vec} for embedding KGs where walks are run inside a user-defined subgraph. We use this embedding for link prediction and prove our method has better performance in most cases in comparison with the previous ones.","sentences":["Graph is an important data representation which occurs naturally in the real world applications \\cite{goyal2018graph}.","Therefore, analyzing graphs provides users with better insights in different areas such as anomaly detection \\cite{ma2021comprehensive}, decision making \\cite{fan2023graph}, clustering \\cite{tsitsulin2023graph}, classification \\cite{wang2021mixup} and etc.","However, most of these methods require high levels of computational time and space.","We can use other ways like embedding to reduce these costs.","Knowledge graph (KG) embedding is a technique that aims to achieve the vector representation of a KG.","It represents entities and relations of a KG in a low-dimensional space while maintaining the semantic meanings of them.","There are different methods for embedding graphs including random walk-based methods such as node2vec, metapath2vec and regpattern2vec.","However, most of these methods bias the walks based on a rigid pattern usually hard-coded in the algorithm.","In this work, we introduce \\textit{subgraph2vec} for embedding KGs where walks are run inside a user-defined subgraph.","We use this embedding for link prediction and prove our method has better performance in most cases in comparison with the previous ones."],"url":"http://arxiv.org/abs/2405.02240v1","category":"cs.LG"} +{"created":"2024-05-03 15:57:22","title":"Position Paper: Rethinking Empirical Research in Machine Learning: Addressing Epistemic and Methodological Challenges of Experimentation","abstract":"We warn against a common but incomplete understanding of empirical research in machine learning (ML) that leads to non-replicable results, makes findings unreliable, and threatens to undermine progress 
in the field. To overcome this alarming situation, we call for more awareness of the plurality of ways of gaining knowledge experimentally but also of some epistemic limitations. In particular, we argue most current empirical ML research is fashioned as confirmatory research while it should rather be considered exploratory.","sentences":["We warn against a common but incomplete understanding of empirical research in machine learning (ML) that leads to non-replicable results, makes findings unreliable, and threatens to undermine progress in the field.","To overcome this alarming situation, we call for more awareness of the plurality of ways of gaining knowledge experimentally but also of some epistemic limitations.","In particular, we argue most current empirical ML research is fashioned as confirmatory research while it should rather be considered exploratory."],"url":"http://arxiv.org/abs/2405.02200v1","category":"cs.LG"} +{"created":"2024-05-03 15:48:48","title":"Tangentially Active Polymers in Cylindrical Channels","abstract":"We present an analytical and computational study characterizing the structural and dynamical properties of an active filament confined in cylindrical channels. We first outline the effects of the interplay between confinement and polar self-propulsion on the conformation of the chains. We observe that the scaling of the polymer size in the channel, quantified by the end-to-end distance, shows different anomalous behaviours at different confinement and activity conditions. Interestingly, we show that the universal relation, describing the ratio between the end-to-end distance of passive polymer chains in cylindrical channels and in bulk is broken by activity. 
Finally, we show that the long-time diffusion coefficient under confinement can be rationalised by an analytical model, that takes into account the presence of the channel and the elongated nature of the polymer.","sentences":["We present an analytical and computational study characterizing the structural and dynamical properties of an active filament confined in cylindrical channels.","We first outline the effects of the interplay between confinement and polar self-propulsion on the conformation of the chains.","We observe that the scaling of the polymer size in the channel, quantified by the end-to-end distance, shows different anomalous behaviours at different confinement and activity conditions.","Interestingly, we show that the universal relation, describing the ratio between the end-to-end distance of passive polymer chains in cylindrical channels and in bulk is broken by activity.","Finally, we show that the long-time diffusion coefficient under confinement can be rationalised by an analytical model, that takes into account the presence of the channel and the elongated nature of the polymer."],"url":"http://arxiv.org/abs/2405.02192v1","category":"cond-mat.soft"} +{"created":"2024-05-03 15:45:08","title":"Edge-length preserving embeddings of graphs between normed spaces","abstract":"The concept of graph flattenability, initially formalized by Belk and Connelly and later expanded by Sitharam and Willoughby, extends the question of embedding finite metric spaces into a given normed space. A finite simple graph $G=(V,E)$ is said to be $(X,Y)$-flattenable if any set of induced edge lengths from an embedding of $G$ into a normed space $Y$ can also be realised by an embedding of $G$ into a normed space $X$. This property, being minor-closed, can be characterized by a finite list of forbidden minors. 
Following the establishment of fundamental results about $(X,Y)$-flattenability, we identify sufficient conditions under which it implies independence with respect to the associated rigidity matroids for $X$ and $Y$. We show that the spaces $\\ell_2$ and $\\ell_\\infty$ serve as two natural extreme spaces of flattenability and discuss $(X, \\ell_p )$-flattenability for varying $p$. We provide a complete characterization of $(X,Y)$-flattenable graphs for the specific case when $X$ is 2-dimensional and $Y$ is infinite-dimensional.","sentences":["The concept of graph flattenability, initially formalized by Belk and Connelly and later expanded by Sitharam and Willoughby, extends the question of embedding finite metric spaces into a given normed space.","A finite simple graph $G=(V,E)$ is said to be $(X,Y)$-flattenable if any set of induced edge lengths from an embedding of $G$ into a normed space $Y$ can also be realised by an embedding of $G$ into a normed space $X$.","This property, being minor-closed, can be characterized by a finite list of forbidden minors.","Following the establishment of fundamental results about $(X,Y)$-flattenability, we identify sufficient conditions under which it implies independence with respect to the associated rigidity matroids for $X$ and $Y$. We show that the spaces $\\ell_2$ and $\\ell_\\infty$ serve as two natural extreme spaces of flattenability and discuss $(X, \\ell_p )$-flattenability for varying $p$. We provide a complete characterization of $(X,Y)$-flattenable graphs for the specific case when $X$ is 2-dimensional and $Y$ is infinite-dimensional."],"url":"http://arxiv.org/abs/2405.02189v1","category":"math.MG"} +{"created":"2024-05-03 15:31:18","title":"Metalearners for Ranking Treatment Effects","abstract":"Efficiently allocating treatments with a budget constraint constitutes an important challenge across various domains. 
In marketing, for example, the use of promotions to target potential customers and boost conversions is limited by the available budget. While much research focuses on estimating causal effects, there is relatively limited work on learning to allocate treatments while considering the operational context. Existing methods for uplift modeling or causal inference primarily estimate treatment effects, without considering how this relates to a profit maximizing allocation policy that respects budget constraints. The potential downside of using these methods is that the resulting predictive model is not aligned with the operational context. Therefore, prediction errors are propagated to the optimization of the budget allocation problem, subsequently leading to a suboptimal allocation policy. We propose an alternative approach based on learning to rank. Our proposed methodology directly learns an allocation policy by prioritizing instances in terms of their incremental profit. We propose an efficient sampling procedure for the optimization of the ranking model to scale our methodology to large-scale data sets. Theoretically, we show how learning to rank can maximize the area under a policy's incremental profit curve. 
Empirically, we validate our methodology and show its effectiveness in practice through a series of experiments on both synthetic and real-world data.","sentences":["Efficiently allocating treatments with a budget constraint constitutes an important challenge across various domains.","In marketing, for example, the use of promotions to target potential customers and boost conversions is limited by the available budget.","While much research focuses on estimating causal effects, there is relatively limited work on learning to allocate treatments while considering the operational context.","Existing methods for uplift modeling or causal inference primarily estimate treatment effects, without considering how this relates to a profit maximizing allocation policy that respects budget constraints.","The potential downside of using these methods is that the resulting predictive model is not aligned with the operational context.","Therefore, prediction errors are propagated to the optimization of the budget allocation problem, subsequently leading to a suboptimal allocation policy.","We propose an alternative approach based on learning to rank.","Our proposed methodology directly learns an allocation policy by prioritizing instances in terms of their incremental profit.","We propose an efficient sampling procedure for the optimization of the ranking model to scale our methodology to large-scale data sets.","Theoretically, we show how learning to rank can maximize the area under a policy's incremental profit curve.","Empirically, we validate our methodology and show its effectiveness in practice through a series of experiments on both synthetic and real-world data."],"url":"http://arxiv.org/abs/2405.02183v1","category":"cs.LG"} +{"created":"2024-05-03 15:22:04","title":"All-fiber microendoscopic polarization sensing at single-photon level aided by deep-learning","abstract":"The polarization of light conveys crucial information about the spatial ordering and optical 
properties of a specimen. However, precise polarization measurement in challenging conditions, including constrained spaces, low light levels, and high-speed scenarios, remains a severe challenge. Addressing this problem, we introduce a real-time polarization measurement method accurate down to a single-photon level that provides complete information about the polarization state. Free of moving components, the polarization sensor utilizes a few-mode fiber followed by a fiber array and a detector array. The calibration of the sensor relies on a neural network yielding unprecedented accuracy across all polarization states, including partially polarized light. We validate the approach by visualizing the polarization structure of a biological specimen. Our method offers an efficient and reliable solution for real-time polarization sensing and microendoscopy under low-light conditions.","sentences":["The polarization of light conveys crucial information about the spatial ordering and optical properties of a specimen.","However, precise polarization measurement in challenging conditions, including constrained spaces, low light levels, and high-speed scenarios, remains a severe challenge.","Addressing this problem, we introduce a real-time polarization measurement method accurate down to a single-photon level that provides complete information about the polarization state.","Free of moving components, the polarization sensor utilizes a few-mode fiber followed by a fiber array and a detector array.","The calibration of the sensor relies on a neural network yielding unprecedented accuracy across all polarization states, including partially polarized light.","We validate the approach by visualizing the polarization structure of a biological specimen.","Our method offers an efficient and reliable solution for real-time polarization sensing and microendoscopy under low-light conditions."],"url":"http://arxiv.org/abs/2405.02172v1","category":"physics.optics"} 
+{"created":"2024-05-03 15:19:11","title":"Fourier-Laplace transforms in polynomial Ornstein-Uhlenbeck volatility models","abstract":"We consider the Fourier-Laplace transforms of a broad class of polynomial Ornstein-Uhlenbeck (OU) volatility models, including the well-known Stein-Stein, Sch\\\"obel-Zhu, one-factor Bergomi, and the recently introduced Quintic OU models motivated by the SPX-VIX joint calibration problem. We show the connection between the joint Fourier-Laplace functional of the log-price and the integrated variance, and the solution of an infinite dimensional Riccati equation. Next, under some non-vanishing conditions of the Fourier-Laplace transforms, we establish an existence result for such Riccati equation and we provide a discretized approximation of the joint characteristic functional that is exponentially entire. On the practical side, we develop a numerical scheme to solve the stiff infinite dimensional Riccati equations and demonstrate the efficiency and accuracy of the scheme for pricing SPX options and volatility swaps using Fourier and Laplace inversions, with specific examples of the Quintic OU and the one-factor Bergomi models and their calibration to real market data.","sentences":["We consider the Fourier-Laplace transforms of a broad class of polynomial Ornstein-Uhlenbeck (OU) volatility models, including the well-known Stein-Stein, Sch\\\"obel-Zhu, one-factor Bergomi, and the recently introduced Quintic OU models motivated by the SPX-VIX joint calibration problem.","We show the connection between the joint Fourier-Laplace functional of the log-price and the integrated variance, and the solution of an infinite dimensional Riccati equation.","Next, under some non-vanishing conditions of the Fourier-Laplace transforms, we establish an existence result for such Riccati equation and we provide a discretized approximation of the joint characteristic functional that is exponentially entire.","On the practical side, we develop a numerical 
scheme to solve the stiff infinite dimensional Riccati equations and demonstrate the efficiency and accuracy of the scheme for pricing SPX options and volatility swaps using Fourier and Laplace inversions, with specific examples of the Quintic OU and the one-factor Bergomi models and their calibration to real market data."],"url":"http://arxiv.org/abs/2405.02170v1","category":"q-fin.MF"} +{"created":"2024-05-03 14:19:40","title":"Can We Identify Unknown Audio Recording Environments in Forensic Scenarios?","abstract":"Audio recordings may provide important evidence in criminal investigations. One such case is the forensic association of the recorded audio to the recording location. For example, a voice message may be the only investigative cue to narrow down the candidate sites for a crime. Up to now, several works provide tools for closed-set recording environment classification under relatively clean recording conditions. However, in forensic investigations, the candidate locations are case-specific. Thus, closed-set tools are not applicable without retraining on a sufficient amount of training samples for each case and respective candidate set. In addition, a forensic tool has to deal with audio material from uncontrolled sources with variable properties and quality. In this work, we therefore attempt a major step towards practical forensic application scenarios. We propose a representation learning framework called EnvId, short for environment identification. EnvId avoids case-specific retraining. Instead, it is the first tool for robust few-shot classification of unseen environment locations. We demonstrate that EnvId can handle forensically challenging material. It provides good quality predictions even under unseen signal degradations, environment characteristics or recording position mismatches. 
Our code and datasets will be made publicly available upon acceptance.","sentences":["Audio recordings may provide important evidence in criminal investigations.","One such case is the forensic association of the recorded audio to the recording location.","For example, a voice message may be the only investigative cue to narrow down the candidate sites for a crime.","Up to now, several works provide tools for closed-set recording environment classification under relatively clean recording conditions.","However, in forensic investigations, the candidate locations are case-specific.","Thus, closed-set tools are not applicable without retraining on a sufficient amount of training samples for each case and respective candidate set.","In addition, a forensic tool has to deal with audio material from uncontrolled sources with variable properties and quality. ","In this work, we therefore attempt a major step towards practical forensic application scenarios.","We propose a representation learning framework called EnvId, short for environment identification.","EnvId avoids case-specific retraining.","Instead, it is the first tool for robust few-shot classification of unseen environment locations.","We demonstrate that EnvId can handle forensically challenging material.","It provides good quality predictions even under unseen signal degradations, environment characteristics or recording position mismatches. ","Our code and datasets will be made publicly available upon acceptance."],"url":"http://arxiv.org/abs/2405.02119v1","category":"cs.SD"} +{"created":"2024-05-03 14:17:07","title":"On variable annuities with surrender charges","abstract":"In this paper we provide a theoretical analysis of Variable Annuities with a focus on the holder's right to an early termination of the contract. We obtain a rigorous pricing formula and the optimal exercise boundary for the surrender option. We also illustrate our theoretical results with extensive numerical experiments. 
The pricing problem is formulated as an optimal stopping problem with a time-dependent payoff which is discontinuous at the maturity of the contract and non-smooth. This structure leads to non-monotonic optimal stopping boundaries which we prove nevertheless to be continuous and regular in the sense of diffusions for the stopping set. The lack of monotonicity of the boundary makes it impossible to use classical methods from optimal stopping. Also more recent results about Lipschitz continuous boundaries are not applicable in our setup. Thus, we contribute a new methodology for non-monotone stopping boundaries.","sentences":["In this paper we provide a theoretical analysis of Variable Annuities with a focus on the holder's right to an early termination of the contract.","We obtain a rigorous pricing formula and the optimal exercise boundary for the surrender option.","We also illustrate our theoretical results with extensive numerical experiments.","The pricing problem is formulated as an optimal stopping problem with a time-dependent payoff which is discontinuous at the maturity of the contract and non-smooth.","This structure leads to non-monotonic optimal stopping boundaries which we prove nevertheless to be continuous and regular in the sense of diffusions for the stopping set.","The lack of monotonicity of the boundary makes it impossible to use classical methods from optimal stopping.","Also more recent results about Lipschitz continuous boundaries are not applicable in our setup.","Thus, we contribute a new methodology for non-monotone stopping boundaries."],"url":"http://arxiv.org/abs/2405.02115v1","category":"q-fin.MF"} +{"created":"2024-05-03 13:35:34","title":"Sharp Maximal function estimates for Multilinear pseudo-differential operators of type (0,0)","abstract":"In this paper, we study sharp maximal function estimates for multilinear pseudo-differential operators. 
Our target is operators of type (0, 0) for which a differentiation does not make any decay of the associated symbol. Analogous results for operators of type (\\rho, \\rho), 0 < \\rho < 1, appeared in an earlier work of the authors, but a different approach is given for \\rho = 0","sentences":["In this paper, we study sharp maximal function estimates for multilinear pseudo-differential operators.","Our target is operators of type (0, 0) for which a differentiation does not make any decay of the associated symbol.","Analogous results for operators of type (\\rho, \\rho), 0 <","\\rho < 1, appeared in an earlier work of the authors, but a different approach is given for \\rho = 0"],"url":"http://arxiv.org/abs/2405.02093v1","category":"math.AP"} +{"created":"2024-05-03 13:29:40","title":"Master equations with indefinite nonlinearities","abstract":"In this paper, we consider the following indefinite fully fractional heat equation involving the master operator \\begin{equation*} (\\partial_t -\\Delta)^{s} u(x,t) = x_1u^p(x,t)\\ \\ \\mbox{in}\\ \\R^n\\times\\R , \\end{equation*} where $s\\in(0,1)$, and $-\\infty < p < \\infty$. Under mild conditions, we prove that there is no positive bounded solutions. To this end, we first show that the solutions are strictly increasing along $x_1$ direction by employing the direct method of moving planes. Then by constructing an unbounded sub-solution, we derive the nonexistence of bounded solutions. To circumvent the difficulties caused by the fully fractional master operator, we introduced some new ideas and novel approaches that, as we believe, will become useful tool in studying a variety of other fractional elliptic and parabolic problems.","sentences":["In this paper, we consider the following indefinite fully fractional heat equation involving the master operator \\begin{equation*} (\\partial_t -\\Delta)^{s} u(x,t) = x_1u^p(x,t)\\ \\ \\mbox{in}\\ \\R^n\\times\\R , \\end{equation*} where $s\\in(0,1)$, and $-\\infty < p < \\infty$. 
Under mild conditions, we prove that there is no positive bounded solutions.","To this end, we first show that the solutions are strictly increasing along $x_1$ direction by employing the direct method of moving planes.","Then by constructing an unbounded sub-solution, we derive the nonexistence of bounded solutions. ","To circumvent the difficulties caused by the fully fractional master operator, we introduced some new ideas and novel approaches that, as we believe, will become useful tool in studying a variety of other fractional elliptic and parabolic problems."],"url":"http://arxiv.org/abs/2405.02091v1","category":"math.AP"} +{"created":"2024-05-03 13:23:09","title":"Testing for an Explosive Bubble using High-Frequency Volatility","abstract":"Based on a continuous-time stochastic volatility model with a linear drift, we develop a test for explosive behavior in financial asset prices at a low frequency when prices are sampled at a higher frequency. The test exploits the volatility information in the high-frequency data. The method consists of devolatizing log-asset price increments with realized volatility measures and performing a supremum-type recursive Dickey-Fuller test on the devolatized sample. The proposed test has a nuisance-parameter-free asymptotic distribution and is easy to implement. We study the size and power properties of the test in Monte Carlo simulations. A real-time date-stamping strategy based on the devolatized sample is proposed for the origination and conclusion dates of the explosive regime. Conditions under which the real-time date-stamping strategy is consistent are established. 
The test and the date-stamping strategy are applied to study explosive behavior in cryptocurrency and stock markets.","sentences":["Based on a continuous-time stochastic volatility model with a linear drift, we develop a test for explosive behavior in financial asset prices at a low frequency when prices are sampled at a higher frequency.","The test exploits the volatility information in the high-frequency data.","The method consists of devolatizing log-asset price increments with realized volatility measures and performing a supremum-type recursive Dickey-Fuller test on the devolatized sample.","The proposed test has a nuisance-parameter-free asymptotic distribution and is easy to implement.","We study the size and power properties of the test in Monte Carlo simulations.","A real-time date-stamping strategy based on the devolatized sample is proposed for the origination and conclusion dates of the explosive regime.","Conditions under which the real-time date-stamping strategy is consistent are established.","The test and the date-stamping strategy are applied to study explosive behavior in cryptocurrency and stock markets."],"url":"http://arxiv.org/abs/2405.02087v1","category":"econ.EM"} +{"created":"2024-05-03 13:10:16","title":"MVP-Shot: Multi-Velocity Progressive-Alignment Framework for Few-Shot Action Recognition","abstract":"Recent few-shot action recognition (FSAR) methods achieve promising performance by performing semantic matching on learned discriminative features. However, most FSAR methods focus on single-scale (e.g., frame-level, segment-level, \\etc) feature alignment, which ignores that human actions with the same semantic may appear at different velocities. To this end, we develop a novel Multi-Velocity Progressive-alignment (MVP-Shot) framework to progressively learn and align semantic-related action features at multi-velocity levels. 
Concretely, a Multi-Velocity Feature Alignment (MVFA) module is designed to measure the similarity between features from support and query videos with different velocity scales and then merge all similarity scores in a residual fashion. To avoid the multiple velocity features deviating from the underlying motion semantic, our proposed Progressive Semantic-Tailored Interaction (PSTI) module injects velocity-tailored text information into the video feature via feature interaction on channel and temporal domains at different velocities. The above two modules compensate for each other to predict query categories more accurately under the few-shot settings. Experimental results show our method outperforms current state-of-the-art methods on multiple standard few-shot benchmarks (i.e., HMDB51, UCF101, Kinetics, and SSv2-small).","sentences":["Recent few-shot action recognition (FSAR) methods achieve promising performance by performing semantic matching on learned discriminative features.","However, most FSAR methods focus on single-scale (e.g., frame-level, segment-level, \\etc) feature alignment, which ignores that human actions with the same semantic may appear at different velocities.","To this end, we develop a novel Multi-Velocity Progressive-alignment (MVP-Shot) framework to progressively learn and align semantic-related action features at multi-velocity levels.","Concretely, a Multi-Velocity Feature Alignment (MVFA) module is designed to measure the similarity between features from support and query videos with different velocity scales and then merge all similarity scores in a residual fashion.","To avoid the multiple velocity features deviating from the underlying motion semantic, our proposed Progressive Semantic-Tailored Interaction (PSTI) module injects velocity-tailored text information into the video feature via feature interaction on channel and temporal domains at different velocities.","The above two modules compensate for each other to predict query 
categories more accurately under the few-shot settings.","Experimental results show our method outperforms current state-of-the-art methods on multiple standard few-shot benchmarks (i.e., HMDB51, UCF101, Kinetics, and SSv2-small)."],"url":"http://arxiv.org/abs/2405.02077v1","category":"cs.CV"} +{"created":"2024-05-03 12:48:21","title":"Few-sample Variational Inference of Bayesian Neural Networks with Arbitrary Nonlinearities","abstract":"Bayesian Neural Networks (BNNs) extend traditional neural networks to provide uncertainties associated with their outputs. On the forward pass through a BNN, predictions (and their uncertainties) are made either by Monte Carlo sampling network weights from the learned posterior or by analytically propagating statistical moments through the network. Though flexible, Monte Carlo sampling is computationally expensive and can be infeasible or impractical under resource constraints or for large networks. While moment propagation can ameliorate the computational costs of BNN inference, it can be difficult or impossible for networks with arbitrary nonlinearities, thereby restricting the possible set of network layers permitted with such a scheme. In this work, we demonstrate a simple yet effective approach for propagating statistical moments through arbitrary nonlinearities with only 3 deterministic samples, enabling few-sample variational inference of BNNs without restricting the set of network layers used. 
Furthermore, we leverage this approach to demonstrate a novel nonlinear activation function that we use to inject physics-informed prior information into output nodes of a BNN.","sentences":["Bayesian Neural Networks (BNNs) extend traditional neural networks to provide uncertainties associated with their outputs.","On the forward pass through a BNN, predictions (and their uncertainties) are made either by Monte Carlo sampling network weights from the learned posterior or by analytically propagating statistical moments through the network.","Though flexible, Monte Carlo sampling is computationally expensive and can be infeasible or impractical under resource constraints or for large networks.","While moment propagation can ameliorate the computational costs of BNN inference, it can be difficult or impossible for networks with arbitrary nonlinearities, thereby restricting the possible set of network layers permitted with such a scheme.","In this work, we demonstrate a simple yet effective approach for propagating statistical moments through arbitrary nonlinearities with only 3 deterministic samples, enabling few-sample variational inference of BNNs without restricting the set of network layers used.","Furthermore, we leverage this approach to demonstrate a novel nonlinear activation function that we use to inject physics-informed prior information into output nodes of a BNN."],"url":"http://arxiv.org/abs/2405.02063v1","category":"cs.LG"} +{"created":"2024-05-03 12:20:24","title":"Sampling to Achieve the Goal: An Age-aware Remote Markov Decision Process","abstract":"Age of Information (AoI) has been recognized as an important metric to measure the freshness of information. Central to this consensus is that minimizing AoI can enhance the freshness of information, thereby facilitating the accuracy of subsequent decision-making processes. However, to date the direct causal relationship that links AoI to the utility of the decision-making process is unexplored. 
To fill this gap, this paper provides a sampling-control co-design problem, referred to as an age-aware remote Markov Decision Process (MDP) problem, to explore this unexplored relationship. Our framework revisits the sampling problem in [1] with a refined focus: moving from AoI penalty minimization to directly optimizing goal-oriented remote decision-making process under random delay. We derive that the age-aware remote MDP problem can be reduced to a standard MDP problem without delays, and reveal that treating AoI solely as a metric for optimization is not optimal in achieving remote decision making. Instead, AoI can serve as important side information to facilitate remote decision making.","sentences":["Age of Information (AoI) has been recognized as an important metric to measure the freshness of information.","Central to this consensus is that minimizing AoI can enhance the freshness of information, thereby facilitating the accuracy of subsequent decision-making processes.","However, to date the direct causal relationship that links AoI to the utility of the decision-making process is unexplored.","To fill this gap, this paper provides a sampling-control co-design problem, referred to as an age-aware remote Markov Decision Process (MDP) problem, to explore this unexplored relationship.","Our framework revisits the sampling problem in [1] with a refined focus: moving from AoI penalty minimization to directly optimizing goal-oriented remote decision-making process under random delay.","We derive that the age-aware remote MDP problem can be reduced to a standard MDP problem without delays, and reveal that treating AoI solely as a metric for optimization is not optimal in achieving remote decision making.","Instead, AoI can serve as important side information to facilitate remote decision making."],"url":"http://arxiv.org/abs/2405.02042v1","category":"cs.IT"} +{"created":"2024-05-03 11:08:04","title":"HoloGS: Instant Depth-based 3D Gaussian Splatting with 
Microsoft HoloLens 2","abstract":"In the fields of photogrammetry, computer vision and computer graphics, the task of neural 3D scene reconstruction has led to the exploration of various techniques. Among these, 3D Gaussian Splatting stands out for its explicit representation of scenes using 3D Gaussians, making it appealing for tasks like 3D point cloud extraction and surface reconstruction. Motivated by its potential, we address the domain of 3D scene reconstruction, aiming to leverage the capabilities of the Microsoft HoloLens 2 for instant 3D Gaussian Splatting. We present HoloGS, a novel workflow utilizing HoloLens sensor data, which bypasses the need for pre-processing steps like Structure from Motion by instantly accessing the required input data i.e. the images, camera poses and the point cloud from depth sensing. We provide comprehensive investigations, including the training process and the rendering quality, assessed through the Peak Signal-to-Noise Ratio, and the geometric 3D accuracy of the densified point cloud from Gaussian centers, measured by Chamfer Distance. We evaluate our approach on two self-captured scenes: An outdoor scene of a cultural heritage statue and an indoor scene of a fine-structured plant. 
Our results show that the HoloLens data, including RGB images, corresponding camera poses, and depth sensing based point clouds to initialize the Gaussians, are suitable as input for 3D Gaussian Splatting.","sentences":["In the fields of photogrammetry, computer vision and computer graphics, the task of neural 3D scene reconstruction has led to the exploration of various techniques.","Among these, 3D Gaussian Splatting stands out for its explicit representation of scenes using 3D Gaussians, making it appealing for tasks like 3D point cloud extraction and surface reconstruction.","Motivated by its potential, we address the domain of 3D scene reconstruction, aiming to leverage the capabilities of the Microsoft HoloLens 2 for instant 3D Gaussian Splatting.","We present HoloGS, a novel workflow utilizing HoloLens sensor data, which bypasses the need for pre-processing steps like Structure from Motion by instantly accessing the required input data i.e. the images, camera poses and the point cloud from depth sensing.","We provide comprehensive investigations, including the training process and the rendering quality, assessed through the Peak Signal-to-Noise Ratio, and the geometric 3D accuracy of the densified point cloud from Gaussian centers, measured by Chamfer Distance.","We evaluate our approach on two self-captured scenes: An outdoor scene of a cultural heritage statue and an indoor scene of a fine-structured plant.","Our results show that the HoloLens data, including RGB images, corresponding camera poses, and depth sensing based point clouds to initialize the Gaussians, are suitable as input for 3D Gaussian Splatting."],"url":"http://arxiv.org/abs/2405.02005v1","category":"cs.CV"} +{"created":"2024-05-03 10:50:30","title":"Cooperation and Federation in Distributed Radar Point Cloud Processing","abstract":"The paper considers the problem of human-scale RF sensing utilizing a network of resource-constrained MIMO radars with low range-azimuth resolution. 
The radars operate in the mmWave band and obtain time-varying 3D point cloud (PC) information that is sensitive to body movements. They also observe the same scene from different views and cooperate while sensing the environment using a sidelink communication channel. Conventional cooperation setups allow the radars to mutually exchange raw PC information to improve ego sensing. The paper proposes a federation mechanism where the radars exchange the parameters of a Bayesian posterior measure of the observed PCs, rather than raw data. The radars act as distributed parameter servers to reconstruct a global posterior (i.e., federated posterior) using Bayesian tools. The paper quantifies and compares the benefits of radar federation with respect to cooperation mechanisms. Both approaches are validated by experiments with a real-time demonstration platform. Federation makes minimal use of the sidelink communication channel (20 {\\div} 25 times lower bandwidth use) and is less sensitive to unresolved targets. 
On the other hand, cooperation reduces the mean absolute target estimation error of about 20%.","sentences":["The paper considers the problem of human-scale RF sensing utilizing a network of resource-constrained MIMO radars with low range-azimuth resolution.","The radars operate in the mmWave band and obtain time-varying 3D point cloud (PC) information that is sensitive to body movements.","They also observe the same scene from different views and cooperate while sensing the environment using a sidelink communication channel.","Conventional cooperation setups allow the radars to mutually exchange raw PC information to improve ego sensing.","The paper proposes a federation mechanism where the radars exchange the parameters of a Bayesian posterior measure of the observed PCs, rather than raw data.","The radars act as distributed parameter servers to reconstruct a global posterior (i.e., federated posterior) using Bayesian tools.","The paper quantifies and compares the benefits of radar federation with respect to cooperation mechanisms.","Both approaches are validated by experiments with a real-time demonstration platform.","Federation makes minimal use of the sidelink communication channel (20 {\\div} 25 times lower bandwidth use) and is less sensitive to unresolved targets.","On the other hand, cooperation reduces the mean absolute target estimation error of about 20%."],"url":"http://arxiv.org/abs/2405.01995v1","category":"cs.LG"} +{"created":"2024-05-03 10:40:25","title":"A comparison of regression models for static and dynamic prediction of a prognostic outcome during admission in electronic health care records","abstract":"Objective Hospitals register information in the electronic health records (EHR) continuously until discharge or death. As such, there is no censoring for in-hospital outcomes. 
We aimed to compare different dynamic regression modeling approaches to predict central line-associated bloodstream infections (CLABSI) in EHR while accounting for competing events precluding CLABSI. Materials and Methods We analyzed data from 30,862 catheter episodes at University Hospitals Leuven from 2012 and 2013 to predict 7-day risk of CLABSI. Competing events are discharge and death. Static models at catheter onset included logistic, multinomial logistic, Cox, cause-specific hazard, and Fine-Gray regression. Dynamic models updated predictions daily up to 30 days after catheter onset (i.e. landmarks 0 to 30 days), and included landmark supermodel extensions of the static models, separate Fine-Gray models per landmark time, and regularized multi-task learning (RMTL). Model performance was assessed using 100 random 2:1 train-test splits. Results The Cox model performed worst of all static models in terms of area under the receiver operating characteristic curve (AUC) and calibration. Dynamic landmark supermodels reached peak AUCs between 0.741-0.747 at landmark 5. The Cox landmark supermodel had the worst AUCs (<=0.731) and calibration up to landmark 7. Separate Fine-Gray models per landmark performed worst for later landmarks, when the number of patients at risk was low. Discussion and Conclusion Categorical and time-to-event approaches had similar performance in the static and dynamic settings, except Cox models. 
Ignoring competing risks caused problems for risk prediction in the time-to-event framework (Cox), but not in the categorical framework (logistic regression).","sentences":["Objective Hospitals register information in the electronic health records (EHR) continuously until discharge or death.","As such, there is no censoring for in-hospital outcomes.","We aimed to compare different dynamic regression modeling approaches to predict central line-associated bloodstream infections (CLABSI) in EHR while accounting for competing events precluding CLABSI.","Materials and Methods We analyzed data from 30,862 catheter episodes at University Hospitals Leuven from 2012 and 2013 to predict 7-day risk of CLABSI.","Competing events are discharge and death.","Static models at catheter onset included logistic, multinomial logistic, Cox, cause-specific hazard, and Fine-Gray regression.","Dynamic models updated predictions daily up to 30 days after catheter onset (i.e. landmarks 0 to 30 days), and included landmark supermodel extensions of the static models, separate Fine-Gray models per landmark time, and regularized multi-task learning (RMTL).","Model performance was assessed using 100 random 2:1 train-test splits.","Results The Cox model performed worst of all static models in terms of area under the receiver operating characteristic curve (AUC) and calibration.","Dynamic landmark supermodels reached peak AUCs between 0.741-0.747 at landmark 5.","The Cox landmark supermodel had the worst AUCs (<=0.731) and calibration up to landmark 7.","Separate Fine-Gray models per landmark performed worst for later landmarks, when the number of patients at risk was low.","Discussion and Conclusion Categorical and time-to-event approaches had similar performance in the static and dynamic settings, except Cox models.","Ignoring competing risks caused problems for risk prediction in the time-to-event framework (Cox), but not in the categorical framework (logistic 
regression)."],"url":"http://arxiv.org/abs/2405.01986v1","category":"stat.AP"} +{"created":"2024-05-03 09:54:20","title":"Bayesian approach to coherent combination of single photon beams","abstract":"We theoretically investigate the performance of coherent beam combination of two light beams under relative phase fluctuations in the photon starved regime. We apply a first-principles approach using the optimal Bayesian phase correction protocol. We analyze the efficiency of beam combination as a function of the phase fluctuations strength.","sentences":["We theoretically investigate the performance of coherent beam combination of two light beams under relative phase fluctuations in the photon starved regime.","We apply a first-principles approach using the optimal Bayesian phase correction protocol.","We analyze the efficiency of beam combination as a function of the phase fluctuations strength."],"url":"http://arxiv.org/abs/2405.01973v1","category":"quant-ph"} +{"created":"2024-05-03 09:49:11","title":"Hydrologic Cycle Weakening in Hothouse Climates","abstract":"The hydrologic cycle has wide impacts on the ocean salinity and circulation, carbon and nitrogen cycles, and the ecosystem. Under anthropogenic global warming, previous studies showed that the intensification of the hydrologic cycle is a robust feature. Whether this trend persists in hothouse climates, however, is unknown. Here we show in climate models that mean precipitation first increases with rising surface temperature, but the precipitation trend reverses when the surface is hotter than ~320-330 K. This non-monotonic phenomenon is robust to the cause of warming, convection scheme, ocean dynamics, atmospheric mass, planetary rotation, gravity, and stellar spectrum. 
The weakening occurs because of the existence of an upper limitation of outgoing longwave emission and the continuously increasing shortwave absorption by H2O, and is consistent with atmospheric dynamics featuring the strong increase of atmospheric stratification and dramatic reduction of convective mass flux. These results have wide implications for the climate evolutions of Earth, Venus, and potentially habitable exoplanets.","sentences":["The hydrologic cycle has wide impacts on the ocean salinity and circulation, carbon and nitrogen cycles, and the ecosystem.","Under anthropogenic global warming, previous studies showed that the intensification of the hydrologic cycle is a robust feature.","Whether this trend persists in hothouse climates, however, is unknown.","Here we show in climate models that mean precipitation first increases with rising surface temperature, but the precipitation trend reverses when the surface is hotter than ~320-330 K. This non-monotonic phenomenon is robust to the cause of warming, convection scheme, ocean dynamics, atmospheric mass, planetary rotation, gravity, and stellar spectrum.","The weakening occurs because of the existence of an upper limitation of outgoing longwave emission and the continuously increasing shortwave absorption by H2O, and is consistent with atmospheric dynamics featuring the strong increase of atmospheric stratification and dramatic reduction of convective mass flux.","These results have wide implications for the climate evolutions of Earth, Venus, and potentially habitable exoplanets."],"url":"http://arxiv.org/abs/2405.01969v1","category":"physics.ao-ph"} +{"created":"2024-05-03 09:33:20","title":"Improved distance correlation estimation","abstract":"Distance correlation is a novel class of multivariate dependence measure, taking positive values between 0 and 1, and applicable to random vectors of arbitrary dimensions, not necessarily equal. 
It offers several advantages over the well-known Pearson correlation coefficient, the most important is that distance correlation equals zero if and only if the random vectors are independent. There are two different estimators of the distance correlation available in the literature. The first one, proposed by Sz\\'ekely et al. (2007), is based on an asymptotically unbiased estimator of the distance covariance which turns out to be a V-statistic. The second one builds on an unbiased estimator of the distance covariance proposed in Sz\\'ekely et al. (2014), proved to be an U-statistic by Sz\\'ekely and Huo (2016). This study evaluates their efficiency (mean squared error) and compares computational times for both methods under different dependence structures. Under conditions of independence or near-independence, the V-estimates are biased, while the U-estimator frequently cannot be computed due to negative values. To address this challenge, a convex linear combination of the former estimators is proposed and studied, yielding good results regardless of the level of dependence.","sentences":["Distance correlation is a novel class of multivariate dependence measure, taking positive values between 0 and 1, and applicable to random vectors of arbitrary dimensions, not necessarily equal.","It offers several advantages over the well-known Pearson correlation coefficient, the most important is that distance correlation equals zero if and only if the random vectors are independent. ","There are two different estimators of the distance correlation available in the literature.","The first one, proposed by Sz\\'ekely et al. (2007), is based on an asymptotically unbiased estimator of the distance covariance which turns out to be a V-statistic.","The second one builds on an unbiased estimator of the distance covariance proposed in Sz\\'ekely et al. 
(2014), proved to be an U-statistic by Sz\\'ekely and Huo (2016).","This study evaluates their efficiency (mean squared error) and compares computational times for both methods under different dependence structures.","Under conditions of independence or near-independence, the V-estimates are biased, while the U-estimator frequently cannot be computed due to negative values.","To address this challenge, a convex linear combination of the former estimators is proposed and studied, yielding good results regardless of the level of dependence."],"url":"http://arxiv.org/abs/2405.01958v1","category":"stat.CO"} +{"created":"2024-05-03 09:31:03","title":"Saturation rank for nilradical of parabolic subalgebras in Type A","abstract":"Let $\\mfp(d)$ be a standard parabolic subalgebra of $\\mfsl_{n+1}(K)$ and $\\mfu$ be the corresponding nilradical defined over an algebraically closed field $K$ of characteristic $p>0$. We construct a finite connected quiver $Q(d)$, through which we provide a combinatorial characterization of the centralizer $c_{\\mfu}(x(d))$ of the Richardson element $x(d)$. We specifically focus on the centralizer when the levi factor of $\\mfp(d)$ is determined by either one or two simple roots. This allows us to demonstrate that, under certain mild restrictions, the saturation rank of $\\mfu$ equals the semisimple rank of the algebraic $K$-group $\\SL_{n+1}(K)$.","sentences":["Let $\\mfp(d)$ be a standard parabolic subalgebra of $\\mfsl_{n+1}(K)$ and $\\mfu$ be the corresponding nilradical defined over an algebraically closed field $K$ of characteristic $p>0$. 
We construct a finite connected quiver $Q(d)$, through which we provide a combinatorial characterization of the centralizer $c_{\\mfu}(x(d))$ of the Richardson element $x(d)$.","We specifically focus on the centralizer when the levi factor of $\\mfp(d)$ is determined by either one or two simple roots.","This allows us to demonstrate that, under certain mild restrictions, the saturation rank of $\\mfu$ equals the semisimple rank of the algebraic $K$-group $\\SL_{n+1}(K)$."],"url":"http://arxiv.org/abs/2405.01956v1","category":"math.RT"} +{"created":"2024-05-03 08:41:36","title":"A Modular, Tendon Driven Variable Stiffness Manipulator with Internal Routing for Improved Stability and Increased Payload Capacity","abstract":"Stability and reliable operation under a spectrum of environmental conditions is still an open challenge for soft and continuum style manipulators. The inability to carry sufficient load and effectively reject external disturbances are two drawbacks which limit the scale of continuum designs, preventing widespread adoption of this technology. To tackle these problems, this work details the design and experimental testing of a modular, tendon driven bead-style continuum manipulator with tunable stiffness. By embedding the ability to independently control the stiffness of distinct sections of the structure, the manipulator can regulate it's posture under greater loads of up to 1kg at the end-effector, with reference to the flexible state. Likewise, an internal routing scheme vastly improves the stability of the proximal segment when operating the distal segment, reducing deviations by at least 70.11%. Operation is validated when gravity is both tangential and perpendicular to the manipulator backbone, a feature uncommon in previous designs. 
The findings presented in this work are key to the development of larger scale continuum designs, demonstrating that flexibility and tip stability under loading can co-exist without compromise.","sentences":["Stability and reliable operation under a spectrum of environmental conditions is still an open challenge for soft and continuum style manipulators.","The inability to carry sufficient load and effectively reject external disturbances are two drawbacks which limit the scale of continuum designs, preventing widespread adoption of this technology.","To tackle these problems, this work details the design and experimental testing of a modular, tendon driven bead-style continuum manipulator with tunable stiffness.","By embedding the ability to independently control the stiffness of distinct sections of the structure, the manipulator can regulate it's posture under greater loads of up to 1kg at the end-effector, with reference to the flexible state.","Likewise, an internal routing scheme vastly improves the stability of the proximal segment when operating the distal segment, reducing deviations by at least 70.11%.","Operation is validated when gravity is both tangential and perpendicular to the manipulator backbone, a feature uncommon in previous designs.","The findings presented in this work are key to the development of larger scale continuum designs, demonstrating that flexibility and tip stability under loading can co-exist without compromise."],"url":"http://arxiv.org/abs/2405.01925v1","category":"cs.RO"} +{"created":"2024-05-03 17:14:51","title":"Electron Drag Effect on Thermal Conductivity in Two-dimensional Semiconductors","abstract":"Two-dimensional (2D) materials have shown great potential in applications as transistors, where thermal dissipation becomes crucial because of the increasing energy density. 
Although thermal conductivity of 2D materials has been extensively studied, interactions between nonequilibrium electrons and phonons, which can be strong when high electric fields and heat current coexist, are not considered. In this work, we systematically study the electron drag effect, where nonequilibrium electrons impart momenta to phonons and influence the thermal conductivity, in 2D semiconductors using ab initio simulations. We find that, at room temperature, electron drag can significantly increase thermal conductivity by decreasing phonon-electron scattering in 2D semiconductors, while its impact in three-dimensional (3D) semiconductors is negligible. We attribute this difference to the large electron-phonon scattering phase space and higher contribution to thermal conductivity by drag-active phonons. Our work elucidates the fundamental physics underlying coupled electron-phonon transport in materials of various dimensionalities.","sentences":["Two-dimensional (2D) materials have shown great potential in applications as transistors, where thermal dissipation becomes crucial because of the increasing energy density.","Although thermal conductivity of 2D materials has been extensively studied, interactions between nonequilibrium electrons and phonons, which can be strong when high electric fields and heat current coexist, are not considered.","In this work, we systematically study the electron drag effect, where nonequilibrium electrons impart momenta to phonons and influence the thermal conductivity, in 2D semiconductors using ab initio simulations.","We find that, at room temperature, electron drag can significantly increase thermal conductivity by decreasing phonon-electron scattering in 2D semiconductors, while its impact in three-dimensional (3D) semiconductors is negligible.","We attribute this difference to the large electron-phonon scattering phase space and higher contribution to thermal conductivity by drag-active phonons.","Our work elucidates 
the fundamental physics underlying coupled electron-phonon transport in materials of various dimensionalities."],"url":"http://arxiv.org/abs/2405.02257v1","category":"cond-mat.mtrl-sci"} +{"created":"2024-05-03 16:23:41","title":"Multispectral Fine-Grained Classification of Blackgrass in Wheat and Barley Crops","abstract":"As the burden of herbicide resistance grows and the environmental repercussions of excessive herbicide use become clear, new ways of managing weed populations are needed. This is particularly true for cereal crops, like wheat and barley, that are staple food crops and occupy a globally significant portion of agricultural land. Even small improvements in weed management practices across these major food crops worldwide would yield considerable benefits for both the environment and global food security. Blackgrass is a major grass weed which causes particular problems in cereal crops in north-west Europe, a major cereal production area, because it has high levels of herbicide resistance and is well adapted to agronomic practice in this region. With the use of machine vision and multispectral imaging, we investigate the effectiveness of state-of-the-art methods to identify blackgrass in wheat and barley crops. As part of this work, we provide a large dataset with which we evaluate several key aspects of blackgrass weed recognition. Firstly, we determine the performance of different CNN and transformer-based architectures on images from unseen fields. Secondly, we demonstrate the role that different spectral bands have on the performance of weed classification. Lastly, we evaluate the role of dataset size in classification performance for each of the models trialled. 
We find that even with a fairly modest quantity of training data an accuracy of almost 90% can be achieved on images from unseen fields.","sentences":["As the burden of herbicide resistance grows and the environmental repercussions of excessive herbicide use become clear, new ways of managing weed populations are needed.","This is particularly true for cereal crops, like wheat and barley, that are staple food crops and occupy a globally significant portion of agricultural land.","Even small improvements in weed management practices across these major food crops worldwide would yield considerable benefits for both the environment and global food security.","Blackgrass is a major grass weed which causes particular problems in cereal crops in north-west Europe, a major cereal production area, because it has high levels of herbicide resistance and is well adapted to agronomic practice in this region.","With the use of machine vision and multispectral imaging, we investigate the effectiveness of state-of-the-art methods to identify blackgrass in wheat and barley crops.","As part of this work, we provide a large dataset with which we evaluate several key aspects of blackgrass weed recognition.","Firstly, we determine the performance of different CNN and transformer-based architectures on images from unseen fields.","Secondly, we demonstrate the role that different spectral bands have on the performance of weed classification.","Lastly, we evaluate the role of dataset size in classification performance for each of the models trialled.","We find that even with a fairly modest quantity of training data an accuracy of almost 90% can be achieved on images from unseen fields."],"url":"http://arxiv.org/abs/2405.02218v1","category":"cs.CV"} +{"created":"2024-05-03 15:57:16","title":"Alfv\u00e9n Wave Conversion to Low Frequency Fast Magnetosonic Waves in Magnetar Magnetospheres","abstract":"Rapid shear motion of magnetar crust can launch Alfv\\'{e}n waves into the 
magnetosphere. The dissipation of the Alfv\\'{e}n waves has been theorized to power the X-ray bursts characteristic of magnetars. However, the process by which Alfv\\'{e}n waves convert their energy to X-rays is unclear. Recent work has suggested that energetic fast magnetosonic (fast) waves can be produced as a byproduct of Alfv\\'{e}n waves propagating on curved magnetic field lines; their subsequent dissipation may power X-ray bursts. In this work, we investigate the production of fast waves by performing axisymmetric force-free simulations of Alfv\\'{e}n waves propagating in a dipolar magnetosphere. For Alfv\\'{e}n wave trains that do not completely fill the flux tube confining them, we find a fast wave dominated by a low frequency component with a wavelength defined by the bouncing time of the Alfv\\'{e}n waves. In contrast, when the wave train is long enough to completely fill the flux tube, and the Alfv\\'{e}n waves overlap significantly, the energy is quickly converted into a fast wave with a higher frequency that corresponds to twice the Alfv\\'{e}n wave frequency. We investigate how the energy, duration, and wavelength of the initial Alfv\\'{e}n wave train affect the conversion efficiency to fast waves. 
For modestly energetic star quakes, we see that the fast waves that are produced will become non-linear well within the magnetosphere, and we comment on the X-ray emission that one may expect from such events.","sentences":["Rapid shear motion of magnetar crust can launch Alfv\\'{e}n waves into the magnetosphere.","The dissipation of the Alfv\\'{e}n waves has been theorized to power the X-ray bursts characteristic of magnetars.","However, the process by which Alfv\\'{e}n waves convert their energy to X-rays is unclear.","Recent work has suggested that energetic fast magnetosonic (fast) waves can be produced as a byproduct of Alfv\\'{e}n waves propagating on curved magnetic field lines; their subsequent dissipation may power X-ray bursts.","In this work, we investigate the production of fast waves by performing axisymmetric force-free simulations of Alfv\\'{e}n waves propagating in a dipolar magnetosphere.","For Alfv\\'{e}n wave trains that do not completely fill the flux tube confining them, we find a fast wave dominated by a low frequency component with a wavelength defined by the bouncing time of the Alfv\\'{e}n waves.","In contrast, when the wave train is long enough to completely fill the flux tube, and the Alfv\\'{e}n waves overlap significantly, the energy is quickly converted into a fast wave with a higher frequency that corresponds to twice the Alfv\\'{e}n wave frequency.","We investigate how the energy, duration, and wavelength of the initial Alfv\\'{e}n wave train affect the conversion efficiency to fast waves.","For modestly energetic star quakes, we see that the fast waves that are produced will become non-linear well within the magnetosphere, and we comment on the X-ray emission that one may expect from such events."],"url":"http://arxiv.org/abs/2405.02199v1","category":"astro-ph.HE"} +{"created":"2024-05-03 15:25:06","title":"Hausdorff dimension of the exceptional set of the law of large numbers in Pierce expansions","abstract":"The digits of Pierce 
expansion adhere to the law of large numbers. It is known that the Hausdorff dimension of the set of exceptions to the law of large numbers is $1$. We offer an elementary proof of this fact by adapting Jun Wu's method used in the context of Engel expansions.","sentences":["The digits of Pierce expansion adhere to the law of large numbers.","It is known that the Hausdorff dimension of the set of exceptions to the law of large numbers is $1$. We offer an elementary proof of this fact by adapting Jun Wu's method used in the context of Engel expansions."],"url":"http://arxiv.org/abs/2405.02174v1","category":"math.NT"} +{"created":"2024-05-03 13:39:30","title":"Numerical validation of an adaptive model for the determination of nonlinear-flow regions in highly heterogeneous porous media","abstract":"An adaptive model for the description of flows in highly heterogeneous porous media is developed in~\\cite{FP21,FP23}. There, depending on the magnitude of the fluid's velocity, the constitutive law linking velocity and pressure gradient is selected between two possible options, one better adapted to slow motion and the other to fast motion. 
We propose here to validate further this adaptive approach by means of more extensive numerical experiments, including a three-dimensional case, as well as to use such approach to determine a partition of the domain into slow- and fast-flow regions.","sentences":["An adaptive model for the description of flows in highly heterogeneous porous media is developed in~\\cite{FP21,FP23}.","There, depending on the magnitude of the fluid's velocity, the constitutive law linking velocity and pressure gradient is selected between two possible options, one better adapted to slow motion and the other to fast motion.","We propose here to validate further this adaptive approach by means of more extensive numerical experiments, including a three-dimensional case, as well as to use such approach to determine a partition of the domain into slow- and fast-flow regions."],"url":"http://arxiv.org/abs/2405.02094v1","category":"math.NA"} +{"created":"2024-05-03 12:40:55","title":"Radiative and mechanical energies in galaxies I. Contributions of molecular shocks and PDRs in 3C 326 N","abstract":"Context: Atomic and molecular lines in galaxies offer insights into energy budgets and feedback mechanisms. Aims: This study establishes a new framework for interpreting these lines and deducing energy budgets from observations. Methods: Atomic and molecular lines detected in a given object are assumed to result from the combination of distributions of shocks and photo-dissociation regions (PDR). Using the Paris-Durham shock code and the Meudon PDR code, emissions are computed over a wide range of parameters. Total emissions are calculated using probability distribution functions, with a defined distance metric based on observed and predicted intensity ratios. Results: We analyze the radio galaxy 3C 326 N, finding both shocks and PDRs necessary to explain the line fluxes. Viable solutions occur only at low densities ($\\rm n_H < 100 cm^{-3}$), indicating emission from diffuse interstellar matter. 
The optimal solution involves low-velocity shocks (5-20 km/s) in PDRs illuminated by UV radiation ten times stronger than in the solar neighborhood. The H$_2$ 0-0 S(0) $28 \\mu$m, [CII] $158 \\mu$m, and [OI] $63 \\mu$m lines originate from PDRs, while other H$_2$ lines mostly come from shocks. The reprocessed radiative and mechanical energies are $\\rm {L_{UV} = 6.3\\times10^9 L_\\odot}$ and $\\rm {L_K = 3.9\\times10^8 L_\\odot}$, respectively, in agreement with 3C 326 N's infrared luminosity, and consistent with 1% of the AGN jet kinetic power dissipated in the interstellar medium. Conclusions: This study demonstrates that the radiative and mechanical energy budgets of galaxies can be derived from observations of atomic and molecular lines alone. It highlights the unexpected importance of the diffuse medium for 3C 326 N. Comparison with new JWST data for 3C 326 N shows striking agreement, opening new prospects for predicting and interpreting extragalactic observations.","sentences":["Context: Atomic and molecular lines in galaxies offer insights into energy budgets and feedback mechanisms.","Aims:","This study establishes a new framework for interpreting these lines and deducing energy budgets from observations.","Methods: Atomic and molecular lines detected in a given object are assumed to result from the combination of distributions of shocks and photo-dissociation regions (PDR).","Using the Paris-Durham shock code and the Meudon PDR code, emissions are computed over a wide range of parameters.","Total emissions are calculated using probability distribution functions, with a defined distance metric based on observed and predicted intensity ratios.","Results:","We analyze the radio galaxy 3C 326 N, finding both shocks and PDRs necessary to explain the line fluxes.","Viable solutions occur only at low densities ($\\rm n_H < 100 cm^{-3}$), indicating emission from diffuse interstellar matter.","The optimal solution involves low-velocity shocks (5-20 km/s) in PDRs 
illuminated by UV radiation ten times stronger than in the solar neighborhood.","The H$_2$ 0-0 S(0) $28 \\mu$m, [CII] $158 \\mu$m, and [OI] $63 \\mu$m lines originate from PDRs, while other H$_2$ lines mostly come from shocks.","The reprocessed radiative and mechanical energies are $\\rm {L_{UV} = 6.3\\times10^9 L_\\odot}$ and $\\rm {L_K = 3.9\\times10^8 L_\\odot}$, respectively, in agreement with 3C 326 N's infrared luminosity, and consistent with 1% of the AGN jet kinetic power dissipated in the interstellar medium.","Conclusions: This study demonstrates that the radiative and mechanical energy budgets of galaxies can be derived from observations of atomic and molecular lines alone.","It highlights the unexpected importance of the diffuse medium for 3C 326 N. Comparison with new JWST data for 3C 326 N shows striking agreement, opening new prospects for predicting and interpreting extragalactic observations."],"url":"http://arxiv.org/abs/2405.02058v1","category":"astro-ph.GA"} +{"created":"2024-05-03 11:26:47","title":"Comparative analysis of spin wave imaging using nitrogen vacancy centers and time resolved magneto-optical measurements","abstract":"Spin waves, the fundamental excitations in magnetic materials, are promising candidates for realizing low-dissipation information processing in spintronics. The ability to visualize and manipulate coherent spin-wave transport is crucial for the development of spin wave-based devices. We use a recently discovered method utilizing nitrogen vacancy (NV) centers, point defects in the diamond lattice, to measure spin waves in thin film magnetic insulators by detecting their magnetic stray field. We experimentally demonstrate enhanced contrast in the detected wavefront amplitudes by imaging spin waves underneath a reference stripline and phenomenologically model the results. 
By extracting the spin wave dispersion and comparing NV center based spin wave measurements to spin wave imaging conducted through the well-established time-resolved magneto-optical Kerr effect, we discuss the advantages and limitations of employing NV centers as spin wave sensors.","sentences":["Spin waves, the fundamental excitations in magnetic materials, are promising candidates for realizing low-dissipation information processing in spintronics.","The ability to visualize and manipulate coherent spin-wave transport is crucial for the development of spin wave-based devices.","We use a recently discovered method utilizing nitrogen vacancy (NV) centers, point defects in the diamond lattice, to measure spin waves in thin film magnetic insulators by detecting their magnetic stray field.","We experimentally demonstrate enhanced contrast in the detected wavefront amplitudes by imaging spin waves underneath a reference stripline and phenomenologically model the results.","By extracting the spin wave dispersion and comparing NV center based spin wave measurements to spin wave imaging conducted through the well-established time-resolved magneto-optical Kerr effect, we discuss the advantages and limitations of employing NV centers as spin wave sensors."],"url":"http://arxiv.org/abs/2405.02014v1","category":"cond-mat.mes-hall"} +{"created":"2024-05-03 11:18:47","title":"The Trade-off between Performance, Efficiency, and Fairness in Adapter Modules for Text Classification","abstract":"Current natural language processing (NLP) research tends to focus on only one or, less frequently, two dimensions - e.g., performance, privacy, fairness, or efficiency - at a time, which may lead to suboptimal conclusions and often overlooking the broader goal of achieving trustworthy NLP. Work on adapter modules (Houlsby et al., 2019; Hu et al., 2021) focuses on improving performance and efficiency, with no investigation of unintended consequences on other aspects such as fairness. 
To address this gap, we conduct experiments on three text classification datasets by either (1) finetuning all parameters or (2) using adapter modules. Regarding performance and efficiency, we confirm prior findings that the accuracy of adapter-enhanced models is roughly on par with that of fully finetuned models, while training time is substantially reduced. Regarding fairness, we show that adapter modules result in mixed fairness across sensitive groups. Further investigation reveals that, when the standard fine-tuned model exhibits limited biases, adapter modules typically do not introduce extra bias. On the other hand, when the finetuned model exhibits increased bias, the impact of adapter modules on bias becomes more unpredictable, introducing the risk of significantly magnifying these biases for certain groups. Our findings highlight the need for a case-by-case evaluation rather than a one-size-fits-all judgment.","sentences":["Current natural language processing (NLP) research tends to focus on only one or, less frequently, two dimensions - e.g., performance, privacy, fairness, or efficiency - at a time, which may lead to suboptimal conclusions and often overlooking the broader goal of achieving trustworthy NLP.","Work on adapter modules (Houlsby et al., 2019;","Hu et al., 2021) focuses on improving performance and efficiency, with no investigation of unintended consequences on other aspects such as fairness.","To address this gap, we conduct experiments on three text classification datasets by either (1) finetuning all parameters or (2) using adapter modules.","Regarding performance and efficiency, we confirm prior findings that the accuracy of adapter-enhanced models is roughly on par with that of fully finetuned models, while training time is substantially reduced.","Regarding fairness, we show that adapter modules result in mixed fairness across sensitive groups.","Further investigation reveals that, when the standard fine-tuned model exhibits limited 
biases, adapter modules typically do not introduce extra bias.","On the other hand, when the finetuned model exhibits increased bias, the impact of adapter modules on bias becomes more unpredictable, introducing the risk of significantly magnifying these biases for certain groups.","Our findings highlight the need for a case-by-case evaluation rather than a one-size-fits-all judgment."],"url":"http://arxiv.org/abs/2405.02010v1","category":"cs.CL"} +{"created":"2024-05-03 10:16:13","title":"Upper tails of subgraph counts in directed random graphs","abstract":"The upper tail problem in a sparse Erd\\H{o}s-R\\'enyi graph asks for the probability that the number of copies of some fixed subgraph exceeds its expected value by a constant factor. We study the analogous problem for oriented subgraphs in directed random graphs. By adapting the proof of Cook, Dembo, and Pham, we reduce this upper tail problem to the asymptotic of a certain variational problem over edge weighted directed graphs. We give upper and lower bounds for the solution to the corresponding variational problem, which differ by a constant factor of at most $2$. We provide a host of subgraphs where the upper and lower bounds coincide, giving the solution to the upper tail problem. Examples of such digraphs include triangles, stars, directed $k$-cycles, and balanced digraphs.","sentences":["The upper tail problem in a sparse Erd\\H{o}s-R\\'enyi graph asks for the probability that the number of copies of some fixed subgraph exceeds its expected value by a constant factor.","We study the analogous problem for oriented subgraphs in directed random graphs.","By adapting the proof of Cook, Dembo, and Pham, we reduce this upper tail problem to the asymptotic of a certain variational problem over edge weighted directed graphs.","We give upper and lower bounds for the solution to the corresponding variational problem, which differ by a constant factor of at most $2$. 
We provide a host of subgraphs where the upper and lower bounds coincide, giving the solution to the upper tail problem.","Examples of such digraphs include triangles, stars, directed $k$-cycles, and balanced digraphs."],"url":"http://arxiv.org/abs/2405.01980v1","category":"math.PR"} +{"created":"2024-05-03 09:01:29","title":"Cut elimination for Cyclic Proofs: A Case Study in Temporal Logic","abstract":"We consider modal logic extended with the well-known temporal operator `eventually' and provide a cut-elimination procedure for a cyclic sequent calculus that captures this fragment. The work showcases an adaptation of the reductive cut-elimination method to cyclic calculi. Notably, the proposed algorithm applies to a cyclic proof and directly outputs a cyclic cut-free proof without appealing to intermediate machinery for regularising the end proof.","sentences":["We consider modal logic extended with the well-known temporal operator `eventually' and provide a cut-elimination procedure for a cyclic sequent calculus that captures this fragment.","The work showcases an adaptation of the reductive cut-elimination method to cyclic calculi.","Notably, the proposed algorithm applies to a cyclic proof and directly outputs a cyclic cut-free proof without appealing to intermediate machinery for regularising the end proof."],"url":"http://arxiv.org/abs/2405.01935v1","category":"cs.LO"} +{"created":"2024-05-03 08:33:58","title":"Task-Driven Computational Framework for Simultaneously Optimizing Design and Mounted Pose of Modular Reconfigurable Manipulators","abstract":"Modular reconfigurable manipulators enable quick adaptation and versatility to address different application environments and tailor to the specific requirements of the tasks. 
Task performance significantly depends on the manipulator's mounted pose and morphology design, therefore posing the need of methodologies for selecting suitable modular robot configurations and mounted pose that can address the specific task requirements and required performance. Morphological changes in modular robots can be derived through a discrete optimization process involving the selective addition or removal of modules. In contrast, the adjustment of the mounted pose operates within a continuous space, allowing for smooth and precise alterations in both orientation and position. This work introduces a computational framework that simultaneously optimizes modular manipulators' mounted pose and morphology. The core of the work is that we design a mapping function that \\textit{implicitly} captures the morphological state of manipulators in the continuous space. This transformation function unifies the optimization of mounted pose and morphology within a continuous space. Furthermore, our optimization framework incorporates an array of performance metrics, such as minimum joint effort and maximum manipulability, and considerations for trajectory execution error and physical and safety constraints. 
To highlight our method's benefits, we compare it with previous methods that framed such problem as a combinatorial optimization problem and demonstrate its practicality in selecting the modular robot configuration for executing a drilling task with the CONCERT modular robotic platform.","sentences":["Modular reconfigurable manipulators enable quick adaptation and versatility to address different application environments and tailor to the specific requirements of the tasks.","Task performance significantly depends on the manipulator's mounted pose and morphology design, therefore posing the need of methodologies for selecting suitable modular robot configurations and mounted pose that can address the specific task requirements and required performance.","Morphological changes in modular robots can be derived through a discrete optimization process involving the selective addition or removal of modules.","In contrast, the adjustment of the mounted pose operates within a continuous space, allowing for smooth and precise alterations in both orientation and position.","This work introduces a computational framework that simultaneously optimizes modular manipulators' mounted pose and morphology.","The core of the work is that we design a mapping function that \\textit{implicitly} captures the morphological state of manipulators in the continuous space.","This transformation function unifies the optimization of mounted pose and morphology within a continuous space.","Furthermore, our optimization framework incorporates an array of performance metrics, such as minimum joint effort and maximum manipulability, and considerations for trajectory execution error and physical and safety constraints.","To highlight our method's benefits, we compare it with previous methods that framed such problem as a combinatorial optimization problem and demonstrate its practicality in selecting the modular robot configuration for executing a drilling task with the CONCERT modular robotic 
platform."],"url":"http://arxiv.org/abs/2405.01923v1","category":"cs.RO"} +{"created":"2024-05-03 16:48:21","title":"A second-order semi-Lagrangian exponential scheme with application to the shallow-water equations on the rotating sphere","abstract":"In this work, we study and extend a class of semi-Lagrangian exponential methods, which combine exponential time integration techniques, suitable for integrating stiff linear terms, with a semi-Lagrangian treatment of nonlinear advection terms. Partial differential equations involving both processes arise for instance in atmospheric circulation models. Through a truncation error analysis, we first show that previously formulated semi-Lagrangian exponential schemes are limited to first-order accuracy due to the discretization of the linear term; we then formulate a new discretization leading to a second-order accurate method. Also, a detailed stability study, both considering a linear stability analysis and an empirical simulation-based one, is conducted to compare several Eulerian and semi-Lagrangian exponential schemes, as well as a well-established semi-Lagrangian semi-implicit method, which is used in operational atmospheric models. Numerical simulations of the shallow-water equations on the rotating sphere, considering standard and challenging benchmark test cases, are performed to assess the orders of convergence, stability properties, and computational cost of each method. 
The proposed second-order semi-Lagrangian exponential method was shown to be more stable and accurate than the previously formulated schemes of the same class at the expense of larger wall-clock times; however, the method is more stable and has a similar cost compared to the well-established semi-Lagrangian semi-implicit; therefore, it is a competitive candidate for potential operational applications in atmospheric circulation modeling.","sentences":["In this work, we study and extend a class of semi-Lagrangian exponential methods, which combine exponential time integration techniques, suitable for integrating stiff linear terms, with a semi-Lagrangian treatment of nonlinear advection terms.","Partial differential equations involving both processes arise for instance in atmospheric circulation models.","Through a truncation error analysis, we first show that previously formulated semi-Lagrangian exponential schemes are limited to first-order accuracy due to the discretization of the linear term; we then formulate a new discretization leading to a second-order accurate method.","Also, a detailed stability study, both considering a linear stability analysis and an empirical simulation-based one, is conducted to compare several Eulerian and semi-Lagrangian exponential schemes, as well as a well-established semi-Lagrangian semi-implicit method, which is used in operational atmospheric models.","Numerical simulations of the shallow-water equations on the rotating sphere, considering standard and challenging benchmark test cases, are performed to assess the orders of convergence, stability properties, and computational cost of each method.","The proposed second-order semi-Lagrangian exponential method was shown to be more stable and accurate than the previously formulated schemes of the same class at the expense of larger wall-clock times; however, the method is more stable and has a similar cost compared to the well-established semi-Lagrangian semi-implicit; therefore, 
it is a competitive candidate for potential operational applications in atmospheric circulation modeling."],"url":"http://arxiv.org/abs/2405.02237v1","category":"math.NA"} +{"created":"2024-05-03 16:28:05","title":"Discretization Error of Fourier Neural Operators","abstract":"Operator learning is a variant of machine learning that is designed to approximate maps between function spaces from data. The Fourier Neural Operator (FNO) is a common model architecture used for operator learning. The FNO combines pointwise linear and nonlinear operations in physical space with pointwise linear operations in Fourier space, leading to a parameterized map acting between function spaces. Although FNOs formally involve convolutions of functions on a continuum, in practice the computations are performed on a discretized grid, allowing efficient implementation via the FFT. In this paper, the aliasing error that results from such a discretization is quantified and algebraic rates of convergence in terms of the grid resolution are obtained as a function of the regularity of the input. 
Numerical experiments that validate the theory and describe model stability are performed.","sentences":["Operator learning is a variant of machine learning that is designed to approximate maps between function spaces from data.","The Fourier Neural Operator (FNO) is a common model architecture used for operator learning.","The FNO combines pointwise linear and nonlinear operations in physical space with pointwise linear operations in Fourier space, leading to a parameterized map acting between function spaces.","Although FNOs formally involve convolutions of functions on a continuum, in practice the computations are performed on a discretized grid, allowing efficient implementation via the FFT.","In this paper, the aliasing error that results from such a discretization is quantified and algebraic rates of convergence in terms of the grid resolution are obtained as a function of the regularity of the input.","Numerical experiments that validate the theory and describe model stability are performed."],"url":"http://arxiv.org/abs/2405.02221v1","category":"math.NA"} +{"created":"2024-05-03 16:27:39","title":"Designed Dithering Sign Activation for Binary Neural Networks","abstract":"Binary Neural Networks emerged as a cost-effective and energy-efficient solution for computer vision tasks by binarizing either network weights or activations. However, common binary activations, such as the Sign activation function, abruptly binarize the values with a single threshold, losing fine-grained details in the feature outputs. This work proposes an activation that applies multiple thresholds following dithering principles, shifting the Sign activation function for each pixel according to a spatially periodic threshold kernel. Unlike literature methods, the shifting is defined jointly for a set of adjacent pixels, taking advantage of spatial correlations. 
Experiments over the classification task demonstrate the effectiveness of the designed dithering Sign activation function as an alternative activation for binary neural networks, without increasing the computational cost. Further, DeSign balances the preservation of details with the efficiency of binary operations.","sentences":["Binary Neural Networks emerged as a cost-effective and energy-efficient solution for computer vision tasks by binarizing either network weights or activations.","However, common binary activations, such as the Sign activation function, abruptly binarize the values with a single threshold, losing fine-grained details in the feature outputs.","This work proposes an activation that applies multiple thresholds following dithering principles, shifting the Sign activation function for each pixel according to a spatially periodic threshold kernel.","Unlike literature methods, the shifting is defined jointly for a set of adjacent pixels, taking advantage of spatial correlations.","Experiments over the classification task demonstrate the effectiveness of the designed dithering Sign activation function as an alternative activation for binary neural networks, without increasing the computational cost.","Further, DeSign balances the preservation of details with the efficiency of binary operations."],"url":"http://arxiv.org/abs/2405.02220v1","category":"cs.CV"} +{"created":"2024-05-03 16:04:09","title":"Discrete harmonic maps between hyperbolic surfaces","abstract":"Given a topological cell decomposition of a closed surface equipped with edge weights, we consider the Dirichlet energy of any geodesic realization of the 1-skeleton graph to a hyperbolic surface. By minimizing the energy over all possible hyperbolic structures and over all realizations within a fixed homotopy class, one obtains a discrete harmonic map into an optimal hyperbolic surface. 
We characterize the extremum by showing that at the optimal hyperbolic structure, the discrete harmonic map and the edge weights are induced from a weighted Delaunay decomposition.","sentences":["Given a topological cell decomposition of a closed surface equipped with edge weights, we consider the Dirichlet energy of any geodesic realization of the 1-skeleton graph to a hyperbolic surface.","By minimizing the energy over all possible hyperbolic structures and over all realizations within a fixed homotopy class, one obtains a discrete harmonic map into an optimal hyperbolic surface.","We characterize the extremum by showing that at the optimal hyperbolic structure, the discrete harmonic map and the edge weights are induced from a weighted Delaunay decomposition."],"url":"http://arxiv.org/abs/2405.02205v1","category":"math.GT"} +{"created":"2024-05-03 15:09:11","title":"Measurement of atmospheric neutrino oscillation parameters using convolutional neural networks with 9.3 years of data in IceCube DeepCore","abstract":"The DeepCore sub-detector of the IceCube Neutrino Observatory provides access to neutrinos with energies above approximately 5 GeV. Data taken between 2012-2021 (3,387 days) are utilized for an atmospheric $\\nu_\\mu$ disappearance analysis that studied 150,257 neutrino-candidate events with reconstructed energies between 5-100 GeV. An advanced reconstruction based on a convolutional neural network is applied, providing increased signal efficiency and background suppression, resulting in a measurement with both significantly increased statistics compared to previous DeepCore oscillation results and high neutrino purity. For the normal neutrino mass ordering, the atmospheric neutrino oscillation parameters and their 1$\\sigma$ errors are measured to be $\\Delta$m$^2_{32}$ = $2.40\\substack{+0.05 \\\\ -0.04} \\times 10^{-3} \\textrm{ eV}^2$ and sin$^2$$\\theta_{23}$=$0.54\\substack{+0.04 \\\\ -0.03}$. 
The results are the most precise to date using atmospheric neutrinos, and are compatible with measurements from other neutrino detectors including long-baseline accelerator experiments.","sentences":["The DeepCore sub-detector of the IceCube Neutrino Observatory provides access to neutrinos with energies above approximately 5 GeV. Data taken between 2012-2021 (3,387 days) are utilized for an atmospheric $\\nu_\\mu$ disappearance analysis that studied 150,257 neutrino-candidate events with reconstructed energies between 5-100 GeV. An advanced reconstruction based on a convolutional neural network is applied, providing increased signal efficiency and background suppression, resulting in a measurement with both significantly increased statistics compared to previous DeepCore oscillation results and high neutrino purity.","For the normal neutrino mass ordering, the atmospheric neutrino oscillation parameters and their 1$\\sigma$ errors are measured to be $\\Delta$m$^2_{32}$ = $2.40\\substack{+0.05 \\\\ -0.04} \\times 10^{-3} \\textrm{ eV}^2$ and sin$^2$$\\theta_{23}$=$0.54\\substack{+0.04 \\\\ -0.03}$.","The results are the most precise to date using atmospheric neutrinos, and are compatible with measurements from other neutrino detectors including long-baseline accelerator experiments."],"url":"http://arxiv.org/abs/2405.02163v1","category":"hep-ex"} +{"created":"2024-05-03 14:44:17","title":"New Angular Momentum Conservation Laws for Gauge Fields in QED","abstract":"Quantum electrodynamics (QED) deals with the relativistic interaction of bosonic gauge fields and fermionic charged particles. In QED, global conservation laws of angular momentum for light-matter interactions are well-known. However, local conservation laws, i.e. the conservation law of angular momentum at every point in space, remain unexplored. 
Here, we use the QED Lagrangian and Noether's theorem to derive a new local conservation law of angular momentum for Dirac-Maxwell fields in the form of the continuity relation for linear momentum. We separate this local conservation law into four coupled motion equations for spin and orbital angular momentum (OAM) densities. We introduce a helicity current tensor, OAM current tensor, and spin-orbit torque in the motion equations to shed light on on the local dynamics of spin-OAM interaction and angular momentum exchange between Maxwell-Dirac fields. We elucidate how our results translate to classical electrodynamics using the example of plane wave interference as well as a dual-mode optical fiber. Our results shine light on phenomena related to the spin of gauge bosons.","sentences":["Quantum electrodynamics (QED) deals with the relativistic interaction of bosonic gauge fields and fermionic charged particles.","In QED, global conservation laws of angular momentum for light-matter interactions are well-known.","However, local conservation laws, i.e. 
the conservation law of angular momentum at every point in space, remain unexplored.","Here, we use the QED Lagrangian and Noether's theorem to derive a new local conservation law of angular momentum for Dirac-Maxwell fields in the form of the continuity relation for linear momentum.","We separate this local conservation law into four coupled motion equations for spin and orbital angular momentum (OAM) densities.","We introduce a helicity current tensor, OAM current tensor, and spin-orbit torque in the motion equations to shed light on on the local dynamics of spin-OAM interaction and angular momentum exchange between Maxwell-Dirac fields.","We elucidate how our results translate to classical electrodynamics using the example of plane wave interference as well as a dual-mode optical fiber.","Our results shine light on phenomena related to the spin of gauge bosons."],"url":"http://arxiv.org/abs/2405.02143v1","category":"quant-ph"} +{"created":"2024-05-03 14:20:51","title":"Second radial eigenfunctions to a fractional Dirichlet problem and uniqueness for a semilinear equation","abstract":"We analyze the shape of radial second Dirichlet eigenfunctions of fractional Schr\\\"odinger type operators of the form $(-\\Delta)^s +V$ in the unit ball $B$ in $\\mathbb{R}^N$ with a nondecreasing radial potential $V$. Specifically, we show that the eigenspace corresponding to the second radial eigenvalue is simple and spanned by an eigenfunction $u$ which changes sign precisely once in the radial variable and does not have zeroes anywhere else in $B$. Moreover, by a new Hopf type lemma for supersolutions to a class of degenerate mixed boundary value problems, we show that $u$ has a nonvanishing fractional boundary derivative on $\\partial B$. We apply this result to prove uniqueness and nondegeneracy of positive ground state solutions to the problem $(-\\Delta)^s u+\\lambda u=u^p$ on ${B}$, $\\; u=0$ on $\\mathbb{R}^N\\setminus B$. 
Here $s\\in (0,1)$, $\\lambda\\geq 0$ and $p>1$ is strictly smaller than the critical Sobolev exponent.","sentences":["We analyze the shape of radial second Dirichlet eigenfunctions of fractional Schr\\\"odinger type operators of the form $(-\\Delta)^s +V$ in the unit ball $B$ in $\\mathbb{R}^N$ with a nondecreasing radial potential $V$. Specifically, we show that the eigenspace corresponding to the second radial eigenvalue is simple and spanned by an eigenfunction $u$ which changes sign precisely once in the radial variable and does not have zeroes anywhere else in $B$. Moreover, by a new Hopf type lemma for supersolutions to a class of degenerate mixed boundary value problems, we show that $u$ has a nonvanishing fractional boundary derivative on $\\partial B$. We apply this result to prove uniqueness and nondegeneracy of positive ground state solutions to the problem $(-\\Delta)^s u+\\lambda u=u^p$ on ${B}$, $\\;","u=0$ on $\\mathbb{R}^N\\setminus B$. Here $s\\in (0,1)$, $\\lambda\\geq 0$ and $p>1$ is strictly smaller than the critical Sobolev exponent."],"url":"http://arxiv.org/abs/2405.02120v1","category":"math.AP"} +{"created":"2024-05-03 13:48:05","title":"Forecasting Ferry Passenger Flow Using Long-Short Term Memory Neural Networks","abstract":"With recent studies related to Neural Networks being used on different forecasting and time series investigations, this study aims to expand these contexts to ferry passenger traffic. The primary objective of the study is to investigate and evaluate an LSTM-based Neural Networks' capability to forecast ferry passengers of two ports in the Philippines. The proposed model's fitting and evaluation of the passenger flow forecasting of the two ports is based on monthly passenger traffic from 2016 to 2022 data that was acquired from the Philippine Ports Authority (PPA). This work uses Mean Absolute Percentage Error (MAPE) as its primary metric to evaluate the model's forecasting capability. 
The proposed LSTM-based Neural Networks model achieved 72% forecasting accuracy to the Batangas port ferry passenger data and 74% forecasting accuracy to the Mindoro port ferry passenger data. Using Keras and Scikit-learn Python libraries, this work concludes a reasonable forecasting performance of the presented LSTM model. Aside from these notable findings, this study also recommends further investigation and studies on employing other statistical, machine learning, and deep learning methods on forecasting ferry passenger flows.","sentences":["With recent studies related to Neural Networks being used on different forecasting and time series investigations, this study aims to expand these contexts to ferry passenger traffic.","The primary objective of the study is to investigate and evaluate an LSTM-based Neural Networks' capability to forecast ferry passengers of two ports in the Philippines.","The proposed model's fitting and evaluation of the passenger flow forecasting of the two ports is based on monthly passenger traffic from 2016 to 2022 data that was acquired from the Philippine Ports Authority (PPA).","This work uses Mean Absolute Percentage Error (MAPE) as its primary metric to evaluate the model's forecasting capability.","The proposed LSTM-based Neural Networks model achieved 72% forecasting accuracy to the Batangas port ferry passenger data and 74% forecasting accuracy to the Mindoro port ferry passenger data.","Using Keras and Scikit-learn Python libraries, this work concludes a reasonable forecasting performance of the presented LSTM model.","Aside from these notable findings, this study also recommends further investigation and studies on employing other statistical, machine learning, and deep learning methods on forecasting ferry passenger flows."],"url":"http://arxiv.org/abs/2405.02098v1","category":"cs.LG"} +{"created":"2024-05-03 13:25:26","title":"Computational issues in Optimization for Deep networks","abstract":"The paper aims to investigate 
relevant computational issues of deep neural network architectures with an eye to the interaction between the optimization algorithm and the classification performance. In particular, we aim to analyze the behaviour of state-of-the-art optimization algorithms in relationship to their hyperparameters setting in order to detect robustness with respect to the choice of a certain starting point in ending on different local solutions. We conduct extensive computational experiments using nine open-source optimization algorithms to train deep Convolutional Neural Network architectures on an image multi-class classification task. Precisely, we consider several architectures by changing the number of layers and neurons per layer, in order to evaluate the impact of different width and depth structures on the computational optimization performance.","sentences":["The paper aims to investigate relevant computational issues of deep neural network architectures with an eye to the interaction between the optimization algorithm and the classification performance.","In particular, we aim to analyze the behaviour of state-of-the-art optimization algorithms in relationship to their hyperparameters setting in order to detect robustness with respect to the choice of a certain starting point in ending on different local solutions.","We conduct extensive computational experiments using nine open-source optimization algorithms to train deep Convolutional Neural Network architectures on an image multi-class classification task.","Precisely, we consider several architectures by changing the number of layers and neurons per layer, in order to evaluate the impact of different width and depth structures on the computational optimization performance."],"url":"http://arxiv.org/abs/2405.02089v1","category":"math.OC"} +{"created":"2024-05-03 13:12:14","title":"CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks","abstract":"Direct access to 
transition state energies at low computational cost unlocks the possibility of accelerating catalyst discovery. We show that the top performing graph neural network potential trained on the OC20 dataset, a related but different task, is able to find transition states energetically similar (within 0.1 eV) to density functional theory (DFT) 91% of the time with a 28x speedup. This speaks to the generalizability of the models, having never been explicitly trained on reactions, the machine learned potential approximates the potential energy surface well enough to be performant for this auxiliary task. We introduce the Open Catalyst 2020 Nudged Elastic Band (OC20NEB) dataset, which is made of 932 DFT nudged elastic band calculations, to benchmark machine learned model performance on transition state energies. To demonstrate the efficacy of this approach, we replicated a well-known, large reaction network with 61 intermediates and 174 dissociation reactions at DFT resolution (40 meV). In this case of dense NEB enumeration, we realize even more computational cost savings and used just 12 GPU days of compute, where DFT would have taken 52 GPU years, a 1500x speedup. Similar searches for complete reaction networks could become routine using the approach presented here. Finally, we replicated an ammonia synthesis activity volcano and systematically found lower energy configurations of the transition states and intermediates on six stepped unary surfaces. 
This scalable approach offers a more complete treatment of configurational space to improve and accelerate catalyst discovery.","sentences":["Direct access to transition state energies at low computational cost unlocks the possibility of accelerating catalyst discovery.","We show that the top performing graph neural network potential trained on the OC20 dataset, a related but different task, is able to find transition states energetically similar (within 0.1 eV) to density functional theory (DFT) 91% of the time with a 28x speedup.","This speaks to the generalizability of the models, having never been explicitly trained on reactions, the machine learned potential approximates the potential energy surface well enough to be performant for this auxiliary task.","We introduce the Open Catalyst 2020 Nudged Elastic Band (OC20NEB) dataset, which is made of 932 DFT nudged elastic band calculations, to benchmark machine learned model performance on transition state energies.","To demonstrate the efficacy of this approach, we replicated a well-known, large reaction network with 61 intermediates and 174 dissociation reactions at DFT resolution (40 meV).","In this case of dense NEB enumeration, we realize even more computational cost savings and used just 12 GPU days of compute, where DFT would have taken 52 GPU years, a 1500x speedup.","Similar searches for complete reaction networks could become routine using the approach presented here.","Finally, we replicated an ammonia synthesis activity volcano and systematically found lower energy configurations of the transition states and intermediates on six stepped unary surfaces.","This scalable approach offers a more complete treatment of configurational space to improve and accelerate catalyst discovery."],"url":"http://arxiv.org/abs/2405.02078v1","category":"cond-mat.mtrl-sci"} +{"created":"2024-05-03 13:08:56","title":"A Federated Learning Benchmark on Tabular Data: Comparing Tree-Based Models and Neural 
Networks","abstract":"Federated Learning (FL) has lately gained traction as it addresses how machine learning models train on distributed datasets. FL was designed for parametric models, namely Deep Neural Networks (DNNs).Thus, it has shown promise on image and text tasks. However, FL for tabular data has received little attention. Tree-Based Models (TBMs) have been considered to perform better on tabular data and they are starting to see FL integrations. In this study, we benchmark federated TBMs and DNNs for horizontal FL, with varying data partitions, on 10 well-known tabular datasets. Our novel benchmark results indicates that current federated boosted TBMs perform better than federated DNNs in different data partitions. Furthermore, a federated XGBoost outperforms all other models. Lastly, we find that federated TBMs perform better than federated parametric models, even when increasing the number of clients significantly.","sentences":["Federated Learning (FL) has lately gained traction as it addresses how machine learning models train on distributed datasets.","FL was designed for parametric models, namely Deep Neural Networks (DNNs).Thus, it has shown promise on image and text tasks.","However, FL for tabular data has received little attention.","Tree-Based Models (TBMs) have been considered to perform better on tabular data and they are starting to see FL integrations.","In this study, we benchmark federated TBMs and DNNs for horizontal FL, with varying data partitions, on 10 well-known tabular datasets.","Our novel benchmark results indicates that current federated boosted TBMs perform better than federated DNNs in different data partitions.","Furthermore, a federated XGBoost outperforms all other models.","Lastly, we find that federated TBMs perform better than federated parametric models, even when increasing the number of clients significantly."],"url":"http://arxiv.org/abs/2405.02074v1","category":"cs.LG"} +{"created":"2024-05-03 
12:58:57","title":"Histogram-Based Federated XGBoost using Minimal Variance Sampling for Federated Tabular Data","abstract":"Federated Learning (FL) has gained considerable traction, yet, for tabular data, FL has received less attention. Most FL research has focused on Neural Networks while Tree-Based Models (TBMs) such as XGBoost have historically performed better on tabular data. It has been shown that subsampling of training data when building trees can improve performance but it is an open problem whether such subsampling can improve performance in FL. In this paper, we evaluate a histogram-based federated XGBoost that uses Minimal Variance Sampling (MVS). We demonstrate the underlying algorithm and show that our model using MVS can improve performance in terms of accuracy and regression error in a federated setting. In our evaluation, our model using MVS performs better than uniform (random) sampling and no sampling at all. It achieves both outstanding local and global performance on a new set of federated tabular datasets. 
Federated XGBoost using MVS also outperforms centralized XGBoost in half of the studied cases.","sentences":["Federated Learning (FL) has gained considerable traction, yet, for tabular data, FL has received less attention.","Most FL research has focused on Neural Networks while Tree-Based Models (TBMs) such as XGBoost have historically performed better on tabular data.","It has been shown that subsampling of training data when building trees can improve performance but it is an open problem whether such subsampling can improve performance in FL.","In this paper, we evaluate a histogram-based federated XGBoost that uses Minimal Variance Sampling (MVS).","We demonstrate the underlying algorithm and show that our model using MVS can improve performance in terms of accuracy and regression error in a federated setting.","In our evaluation, our model using MVS performs better than uniform (random) sampling and no sampling at all.","It achieves both outstanding local and global performance on a new set of federated tabular datasets.","Federated XGBoost using MVS also outperforms centralized XGBoost in half of the studied cases."],"url":"http://arxiv.org/abs/2405.02067v1","category":"cs.LG"} +{"created":"2024-05-03 12:56:34","title":"WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights","abstract":"The advances in the Neural Radiance Fields (NeRF) research offer extensive applications in diverse domains, but protecting their copyrights has not yet been researched in depth. Recently, NeRF watermarking has been considered one of the pivotal solutions for safely deploying NeRF-based 3D representations. However, existing methods are designed to apply only to implicit or explicit NeRF representations. In this work, we introduce an innovative watermarking method that can be employed in both representations of NeRF. This is achieved by fine-tuning NeRF to embed binary messages in the rendering process. 
In detail, we propose utilizing the discrete wavelet transform in the NeRF space for watermarking. Furthermore, we adopt a deferred back-propagation technique and introduce a combination with the patch-wise loss to improve rendering quality and bit accuracy with minimum trade-offs. We evaluate our method in three different aspects: capacity, invisibility, and robustness of the embedded watermarks in the 2D-rendered images. Our method achieves state-of-the-art performance with faster training speed over the compared state-of-the-art methods.","sentences":["The advances in the Neural Radiance Fields (NeRF) research offer extensive applications in diverse domains, but protecting their copyrights has not yet been researched in depth.","Recently, NeRF watermarking has been considered one of the pivotal solutions for safely deploying NeRF-based 3D representations.","However, existing methods are designed to apply only to implicit or explicit NeRF representations.","In this work, we introduce an innovative watermarking method that can be employed in both representations of NeRF.","This is achieved by fine-tuning NeRF to embed binary messages in the rendering process.","In detail, we propose utilizing the discrete wavelet transform in the NeRF space for watermarking.","Furthermore, we adopt a deferred back-propagation technique and introduce a combination with the patch-wise loss to improve rendering quality and bit accuracy with minimum trade-offs.","We evaluate our method in three different aspects: capacity, invisibility, and robustness of the embedded watermarks in the 2D-rendered images.","Our method achieves state-of-the-art performance with faster training speed over the compared state-of-the-art methods."],"url":"http://arxiv.org/abs/2405.02066v1","category":"cs.CV"} +{"created":"2024-05-03 12:42:40","title":"Federated Learning for Tabular Data using TabNet: A Vehicular Use-Case","abstract":"In this paper, we show how Federated Learning (FL) can be applied to 
vehicular use-cases in which we seek to classify obstacles, irregularities and pavement types on roads. Our proposed framework utilizes FL and TabNet, a state-of-the-art neural network for tabular data. We are the first to demonstrate how TabNet can be integrated with FL. Moreover, we achieve a maximum test accuracy of 93.6%. Finally, we reason why FL is a suitable concept for this data set.","sentences":["In this paper, we show how Federated Learning (FL) can be applied to vehicular use-cases in which we seek to classify obstacles, irregularities and pavement types on roads.","Our proposed framework utilizes FL and TabNet, a state-of-the-art neural network for tabular data.","We are the first to demonstrate how TabNet can be integrated with FL.","Moreover, we achieve a maximum test accuracy of 93.6%.","Finally, we reason why FL is a suitable concept for this data set."],"url":"http://arxiv.org/abs/2405.02060v1","category":"cs.LG"} +{"created":"2024-05-03 11:39:25","title":"Fast Algorithms for Spiking Neural Network Simulation with FPGAs","abstract":"Using OpenCL-based high-level synthesis, we create a number of spiking neural network (SNN) simulators for the Potjans-Diesmann cortical microcircuit for a high-end Field-Programmable Gate Array (FPGA). Our best simulators simulate the circuit 25\\% faster than real-time, require less than 21 nJ per synaptic event, and are bottle-necked by the device's on-chip memory. Speed-wise they compare favorably to the state-of-the-art GPU-based simulators and their energy usage is lower than any other published result. This result is the first for simulating the circuit on a single hardware accelerator. We also extensively analyze the techniques and algorithms we implement our simulators with, many of which can be realized on other types of hardware. 
Thus, this article is of interest to any researcher or practitioner interested in efficient SNN simulation, whether they target FPGAs or not.","sentences":["Using OpenCL-based high-level synthesis, we create a number of spiking neural network (SNN) simulators for the Potjans-Diesmann cortical microcircuit for a high-end Field-Programmable Gate Array (FPGA).","Our best simulators simulate the circuit 25\\% faster than real-time, require less than 21 nJ per synaptic event, and are bottle-necked by the device's on-chip memory.","Speed-wise they compare favorably to the state-of-the-art GPU-based simulators and their energy usage is lower than any other published result.","This result is the first for simulating the circuit on a single hardware accelerator.","We also extensively analyze the techniques and algorithms we implement our simulators with, many of which can be realized on other types of hardware.","Thus, this article is of interest to any researcher or practitioner interested in efficient SNN simulation, whether they target FPGAs or not."],"url":"http://arxiv.org/abs/2405.02019v1","category":"cs.NE"} +{"created":"2024-05-03 11:06:37","title":"M${^2}$Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation","abstract":"This paper presents a novel self-supervised two-frame multi-camera metric depth estimation network, termed M${^2}$Depth, which is designed to predict reliable scale-aware surrounding depth in autonomous driving. Unlike the previous works that use multi-view images from a single time-step or multiple time-step images from a single camera, M${^2}$Depth takes temporally adjacent two-frame images from multiple cameras as inputs and produces high-quality surrounding depth. We first construct cost volumes in spatial and temporal domains individually and propose a spatial-temporal fusion module that integrates the spatial-temporal information to yield a strong volume presentation. 
We additionally combine the neural prior from SAM features with internal features to reduce the ambiguity between foreground and background and strengthen the depth edges. Extensive experimental results on nuScenes and DDAD benchmarks show M${^2}$Depth achieves state-of-the-art performance. More results can be found in https://heiheishuang.xyz/M2Depth .","sentences":["This paper presents a novel self-supervised two-frame multi-camera metric depth estimation network, termed M${^2}$Depth, which is designed to predict reliable scale-aware surrounding depth in autonomous driving.","Unlike the previous works that use multi-view images from a single time-step or multiple time-step images from a single camera, M${^2}$Depth takes temporally adjacent two-frame images from multiple cameras as inputs and produces high-quality surrounding depth.","We first construct cost volumes in spatial and temporal domains individually and propose a spatial-temporal fusion module that integrates the spatial-temporal information to yield a strong volume presentation.","We additionally combine the neural prior from SAM features with internal features to reduce the ambiguity between foreground and background and strengthen the depth edges.","Extensive experimental results on nuScenes and DDAD benchmarks show M${^2}$Depth achieves state-of-the-art performance.","More results can be found in https://heiheishuang.xyz/M2Depth ."],"url":"http://arxiv.org/abs/2405.02004v1","category":"cs.CV"} +{"created":"2024-05-03 11:01:26","title":"Simulation of stopping vortices in the flow past a mounted wedge","abstract":"This work is concerned with the numerical investigation of the dynamics of stopping vortex formation in the uniform flow past a wedge mounted on a wall for channel Reynolds number $Re_c=1560$. 
The streamfunction-vorticity ($\\psi$-$\\omega$) formulation of the transient Navier-Stokes (N-S) equations have been utilized for simulating the flow and has been discretized using a fourth order spatially and second order temporally accurate compact finite difference method on a nonuniform Cartesian grid developed by the author. The results are validated by comparing the simulated results of the early evolution of the flow with the experimental visualization of a well-known laboratory experiment of \\cite{pullin1980} and a grid-independence study. The development of the stopping vortex and its effect on the starting vortex are discussed in details. The stopping flow is analysed in the light of the time interval through which the inlet velocity of the flow is decelerated. The criterion for the development of a clean vortex is provided in terms of the impulse associated with the deceleration. Our study revealed that the strength of the stopping vortex depends upon the rapidity of deceleration. 
The vorticity distribution along the diameter of the core of the stopping vortex is seen to follow a Gaussian profile.","sentences":["This work is concerned with the numerical investigation of the dynamics of stopping vortex formation in the uniform flow past a wedge mounted on a wall for channel Reynolds number $Re_c=1560$.","The streamfunction-vorticity ($\\psi$-$\\omega$) formulation of the transient Navier-Stokes (N-S) equations have been utilized for simulating the flow and has been discretized using a fourth order spatially and second order temporally accurate compact finite difference method on a nonuniform Cartesian grid developed by the author.","The results are validated by comparing the simulated results of the early evolution of the flow with the experimental visualization of a well-known laboratory experiment of \\cite{pullin1980} and a grid-independence study.","The development of the stopping vortex and its effect on the starting vortex are discussed in details.","The stopping flow is analysed in the light of the time interval through which the inlet velocity of the flow is decelerated.","The criterion for the development of a clean vortex is provided in terms of the impulse associated with the deceleration.","Our study revealed that the strength of the stopping vortex depends upon the rapidity of deceleration.","The vorticity distribution along the diameter of the core of the stopping vortex is seen to follow a Gaussian profile."],"url":"http://arxiv.org/abs/2405.02000v1","category":"physics.flu-dyn"} +{"created":"2024-05-03 10:19:17","title":"Global regularity for solutions of magnetohydrodynamic equations with large initial data","abstract":"We study the existence of a strong solution to the initial value problem for the magnetohydrodynamic equations in $\\mathbb{R}^N, N\\geq 3$. 
We obtain a global in-time strong solution without any smallness assumptions on the initial data.","sentences":["We study the existence of a strong solution to the initial value problem for the magnetohydrodynamic equations in $\\mathbb{R}^N, N\\geq 3$.","We obtain a global in-time strong solution without any smallness assumptions on the initial data."],"url":"http://arxiv.org/abs/2405.01982v1","category":"math.AP"} +{"created":"2024-05-03 10:00:36","title":"Introducing a microstructure-embedded autoencoder approach for reconstructing high-resolution solution field from reduced parametric space","abstract":"In this study, we develop a novel multi-fidelity deep learning approach that transforms low-fidelity solution maps into high-fidelity ones by incorporating parametric space information into a standard autoencoder architecture. It is shown that, due to the integration of parametric space data, this method requires significantly less training data to achieve effective performance in predicting high-fidelity solution from the low-fidelity one. In this study, our focus is on a 2D steady-state heat transfer analysis in highly heterogeneous materials microstructure, where the spatial distribution of heat conductivity coefficients for two distinct materials is condensed. Subsequently, the boundary value problem is solved on the coarsest grid using a pre-trained physics-informed neural operator network. Afterward, the calculated low-fidelity result is upscaled using the newly designed enhanced autoencoder. The novelty of the developed enhanced autoencoder lies in the concatenation of heat conductivity maps of different resolutions to the decoder segment in distinct steps. We then compare the outcomes of developed algorithm with the corresponding finite element results, standard U-Net architecture as well as other upscaling approaches such as interpolation functions of varying orders and feedforward neural networks (FFNN). 
The analysis of the results based on the new approach demonstrates superior performance compared to other approaches in terms of computational cost and error on the test cases. Therefore, as a potential supplement to neural operators networks, our architecture upscales low-fidelity solutions to high-fidelity ones while preserving critical details that are often lost in conventional upscaling methods, especially at sharp interfaces, such as those encountered with interpolation methods.","sentences":["In this study, we develop a novel multi-fidelity deep learning approach that transforms low-fidelity solution maps into high-fidelity ones by incorporating parametric space information into a standard autoencoder architecture.","It is shown that, due to the integration of parametric space data, this method requires significantly less training data to achieve effective performance in predicting high-fidelity solution from the low-fidelity one.","In this study, our focus is on a 2D steady-state heat transfer analysis in highly heterogeneous materials microstructure, where the spatial distribution of heat conductivity coefficients for two distinct materials is condensed.","Subsequently, the boundary value problem is solved on the coarsest grid using a pre-trained physics-informed neural operator network.","Afterward, the calculated low-fidelity result is upscaled using the newly designed enhanced autoencoder.","The novelty of the developed enhanced autoencoder lies in the concatenation of heat conductivity maps of different resolutions to the decoder segment in distinct steps.","We then compare the outcomes of developed algorithm with the corresponding finite element results, standard U-Net architecture as well as other upscaling approaches such as interpolation functions of varying orders and feedforward neural networks (FFNN).","The analysis of the results based on the new approach demonstrates superior performance compared to other approaches in terms of computational 
cost and error on the test cases.","Therefore, as a potential supplement to neural operators networks, our architecture upscales low-fidelity solutions to high-fidelity ones while preserving critical details that are often lost in conventional upscaling methods, especially at sharp interfaces, such as those encountered with interpolation methods."],"url":"http://arxiv.org/abs/2405.01975v1","category":"cs.CE"} +{"created":"2024-05-03 09:37:33","title":"Rescale-Invariant Federated Reinforcement Learning for Resource Allocation in V2X Networks","abstract":"Federated Reinforcement Learning (FRL) offers a promising solution to various practical challenges in resource allocation for vehicle-to-everything (V2X) networks. However, the data discrepancy among individual agents can significantly degrade the performance of FRL-based algorithms. To address this limitation, we exploit the node-wise invariance property of ReLU-activated neural networks, with the aim of reducing data discrepancy to improve learning performance. Based on this property, we introduce a backward rescale-invariant operation to develop a rescale-invariant FRL algorithm. 
Simulation results demonstrate that the proposed algorithm notably enhances both convergence speed and convergent performance.","sentences":["Federated Reinforcement Learning (FRL) offers a promising solution to various practical challenges in resource allocation for vehicle-to-everything (V2X) networks.","However, the data discrepancy among individual agents can significantly degrade the performance of FRL-based algorithms.","To address this limitation, we exploit the node-wise invariance property of ReLU-activated neural networks, with the aim of reducing data discrepancy to improve learning performance.","Based on this property, we introduce a backward rescale-invariant operation to develop a rescale-invariant FRL algorithm.","Simulation results demonstrate that the proposed algorithm notably enhances both convergence speed and convergent performance."],"url":"http://arxiv.org/abs/2405.01961v1","category":"eess.SP"} +{"created":"2024-05-03 09:30:37","title":"Well-posedness of Kolmogorov-Fokker-Planck equations with unbounded drift","abstract":"We consider Kolmogorov-Fokker-Planck equations with unbounded drift terms which are only measurable in time and locally H\\\"older continuous in space. 
In particular, we extend the parametrix method to this setting and we prove existence and uniqueness of measure solutions to the associated Cauchy problem, as well as the equivalence with the corresponding stochastic formulation.","sentences":["We consider Kolmogorov-Fokker-Planck equations with unbounded drift terms which are only measurable in time and locally H\\\"older continuous in space.","In particular, we extend the parametrix method to this setting and we prove existence and uniqueness of measure solutions to the associated Cauchy problem, as well as the equivalence with the corresponding stochastic formulation."],"url":"http://arxiv.org/abs/2405.01955v1","category":"math.AP"} +{"created":"2024-05-03 08:32:37","title":"A note on the Fermi Golden Rule constant for the pure power NLS","abstract":"We provide a detailed proof that the Nonlinear Fermi Golden Rule coefficient that appears in our recent proof of the asymptotic stability of ground states for the pure power Nonlinear Schr\\\"odinger equations in $\\mathbb{R}$ with exponent $0<|p-3|\\ll 1$ is nonzero.","sentences":["We provide a detailed proof that the Nonlinear Fermi Golden Rule coefficient that appears in our recent proof of the asymptotic stability of ground states for the pure power Nonlinear Schr\\\"odinger equations in $\\mathbb{R}$ with exponent $0<|p-3|\\ll 1$ is nonzero."],"url":"http://arxiv.org/abs/2405.01922v1","category":"math.AP"} +{"created":"2024-05-03 17:09:52","title":"A Parameter-Masked Mock Data Challenge for Beyond-Two-Point Galaxy Clustering Statistics","abstract":"The last few years have seen the emergence of a wide array of novel techniques for analyzing high-precision data from upcoming galaxy surveys, which aim to extend the statistical analysis of galaxy clustering data beyond the linear regime and the canonical two-point (2pt) statistics. 
We test and benchmark some of these new techniques in a community data challenge \"Beyond-2pt\", initiated during the Aspen 2022 Summer Program \"Large-Scale Structure Cosmology beyond 2-Point Statistics,\" whose first round of results we present here. The challenge dataset consists of high-precision mock galaxy catalogs for clustering in real space, redshift space, and on a light cone. Participants in the challenge have developed end-to-end pipelines to analyze mock catalogs and extract unknown (\"masked\") cosmological parameters of the underlying $\\Lambda$CDM models with their methods. The methods represented are density-split clustering, nearest neighbor statistics, BACCO power spectrum emulator, void statistics, LEFTfield field-level inference using effective field theory (EFT), and joint power spectrum and bispectrum analyses using both EFT and simulation-based inference. In this work, we review the results of the challenge, focusing on problems solved, lessons learned, and future research needed to perfect the emerging beyond-2pt approaches. The unbiased parameter recovery demonstrated in this challenge by multiple statistics and the associated modeling and inference frameworks supports the credibility of cosmology constraints from these methods. 
The challenge data set is publicly available and we welcome future submissions from methods that are not yet represented.","sentences":["The last few years have seen the emergence of a wide array of novel techniques for analyzing high-precision data from upcoming galaxy surveys, which aim to extend the statistical analysis of galaxy clustering data beyond the linear regime and the canonical two-point (2pt) statistics.","We test and benchmark some of these new techniques in a community data challenge \"Beyond-2pt\", initiated during the Aspen 2022 Summer Program \"Large-Scale Structure Cosmology beyond 2-Point Statistics,\" whose first round of results we present here.","The challenge dataset consists of high-precision mock galaxy catalogs for clustering in real space, redshift space, and on a light cone.","Participants in the challenge have developed end-to-end pipelines to analyze mock catalogs and extract unknown (\"masked\") cosmological parameters of the underlying $\\Lambda$CDM models with their methods.","The methods represented are density-split clustering, nearest neighbor statistics, BACCO power spectrum emulator, void statistics, LEFTfield field-level inference using effective field theory (EFT), and joint power spectrum and bispectrum analyses using both EFT and simulation-based inference.","In this work, we review the results of the challenge, focusing on problems solved, lessons learned, and future research needed to perfect the emerging beyond-2pt approaches.","The unbiased parameter recovery demonstrated in this challenge by multiple statistics and the associated modeling and inference frameworks supports the credibility of cosmology constraints from these methods.","The challenge data set is publicly available and we welcome future submissions from methods that are not yet represented."],"url":"http://arxiv.org/abs/2405.02252v1","category":"astro-ph.CO"} +{"created":"2024-05-03 15:57:26","title":"Regularized Q-learning through Robust 
Averaging","abstract":"We propose a new Q-learning variant, called 2RA Q-learning, that addresses some weaknesses of existing Q-learning methods in a principled manner. One such weakness is an underlying estimation bias which cannot be controlled and often results in poor performance. We propose a distributionally robust estimator for the maximum expected value term, which allows us to precisely control the level of estimation bias introduced. The distributionally robust estimator admits a closed-form solution such that the proposed algorithm has a computational cost per iteration comparable to Watkins' Q-learning. For the tabular case, we show that 2RA Q-learning converges to the optimal policy and analyze its asymptotic mean-squared error. Lastly, we conduct numerical experiments for various settings, which corroborate our theoretical findings and indicate that 2RA Q-learning often performs better than existing methods.","sentences":["We propose a new Q-learning variant, called 2RA Q-learning, that addresses some weaknesses of existing Q-learning methods in a principled manner.","One such weakness is an underlying estimation bias which cannot be controlled and often results in poor performance.","We propose a distributionally robust estimator for the maximum expected value term, which allows us to precisely control the level of estimation bias introduced.","The distributionally robust estimator admits a closed-form solution such that the proposed algorithm has a computational cost per iteration comparable to Watkins' Q-learning.","For the tabular case, we show that 2RA Q-learning converges to the optimal policy and analyze its asymptotic mean-squared error.","Lastly, we conduct numerical experiments for various settings, which corroborate our theoretical findings and indicate that 2RA Q-learning often performs better than existing methods."],"url":"http://arxiv.org/abs/2405.02201v1","category":"math.OC"} +{"created":"2024-05-03 15:47:07","title":"Non-Destructive 
Peat Analysis using Hyperspectral Imaging and Machine Learning","abstract":"Peat, a crucial component in whisky production, imparts distinctive and irreplaceable flavours to the final product. However, the extraction of peat disrupts ancient ecosystems and releases significant amounts of carbon, contributing to climate change. This paper aims to address this issue by conducting a feasibility study on enhancing peat use efficiency in whisky manufacturing through non-destructive analysis using hyperspectral imaging. Results show that shot-wave infrared (SWIR) data is more effective for analyzing peat samples and predicting total phenol levels, with accuracies up to 99.81%.","sentences":["Peat, a crucial component in whisky production, imparts distinctive and irreplaceable flavours to the final product.","However, the extraction of peat disrupts ancient ecosystems and releases significant amounts of carbon, contributing to climate change.","This paper aims to address this issue by conducting a feasibility study on enhancing peat use efficiency in whisky manufacturing through non-destructive analysis using hyperspectral imaging.","Results show that shot-wave infrared (SWIR) data is more effective for analyzing peat samples and predicting total phenol levels, with accuracies up to 99.81%."],"url":"http://arxiv.org/abs/2405.02191v1","category":"cs.CV"} +{"created":"2024-05-03 15:28:44","title":"Imitation Learning in Discounted Linear MDPs without exploration assumptions","abstract":"We present a new algorithm for imitation learning in infinite horizon linear MDPs dubbed ILARL which greatly improves the bound on the number of trajectories that the learner needs to sample from the environment. In particular, we remove exploration assumptions required in previous works and we improve the dependence on the desired accuracy $\\epsilon$ from $\\mathcal{O}\\br{\\epsilon^{-5}}$ to $\\mathcal{O}\\br{\\epsilon^{-4}}$. 
Our result relies on a connection between imitation learning and online learning in MDPs with adversarial losses. For the latter setting, we present the first result for infinite horizon linear MDP which may be of independent interest. Moreover, we are able to provide a strengthen result for the finite horizon case where we achieve $\\mathcal{O}\\br{\\epsilon^{-2}}$. Numerical experiments with linear function approximation shows that ILARL outperforms other commonly used algorithms.","sentences":["We present a new algorithm for imitation learning in infinite horizon linear MDPs dubbed ILARL which greatly improves the bound on the number of trajectories that the learner needs to sample from the environment.","In particular, we remove exploration assumptions required in previous works and we improve the dependence on the desired accuracy $\\epsilon$ from $\\mathcal{O}\\br{\\epsilon^{-5}}$ to $\\mathcal{O}\\br{\\epsilon^{-4}}$. Our result relies on a connection between imitation learning and online learning in MDPs with adversarial losses.","For the latter setting, we present the first result for infinite horizon linear MDP which may be of independent interest.","Moreover, we are able to provide a strengthen result for the finite horizon case where we achieve $\\mathcal{O}\\br{\\epsilon^{-2}}$. Numerical experiments with linear function approximation shows that ILARL outperforms other commonly used algorithms."],"url":"http://arxiv.org/abs/2405.02181v1","category":"cs.LG"} +{"created":"2024-05-03 10:46:19","title":"Soft Label PU Learning","abstract":"PU learning refers to the classification problem in which only part of positive samples are labeled. Existing PU learning methods treat unlabeled samples equally. However, in many real tasks, from common sense or domain knowledge, some unlabeled samples are more likely to be positive than others. 
In this paper, we propose soft label PU learning, in which unlabeled data are assigned soft labels according to their probabilities of being positive. Considering that the ground truth of TPR, FPR, and AUC are unknown, we then design PU counterparts of these metrics to evaluate the performances of soft label PU learning methods within validation data. We show that these new designed PU metrics are good substitutes for the real metrics. After that, a method that optimizes such metrics is proposed. Experiments on public datasets and real datasets for anti-cheat services from Tencent games demonstrate the effectiveness of our proposed method.","sentences":["PU learning refers to the classification problem in which only part of positive samples are labeled.","Existing PU learning methods treat unlabeled samples equally.","However, in many real tasks, from common sense or domain knowledge, some unlabeled samples are more likely to be positive than others.","In this paper, we propose soft label PU learning, in which unlabeled data are assigned soft labels according to their probabilities of being positive.","Considering that the ground truth of TPR, FPR, and AUC are unknown, we then design PU counterparts of these metrics to evaluate the performances of soft label PU learning methods within validation data.","We show that these new designed PU metrics are good substitutes for the real metrics.","After that, a method that optimizes such metrics is proposed.","Experiments on public datasets and real datasets for anti-cheat services from Tencent games demonstrate the effectiveness of our proposed method."],"url":"http://arxiv.org/abs/2405.01990v1","category":"cs.LG"} +{"created":"2024-05-03 09:42:10","title":"A Deep Learning Approach in RIS-based Indoor Localization","abstract":"In the domain of RIS-based indoor localization, our work introduces two distinct approaches to address real-world challenges. 
The first method is based on deep learning, employing a Long Short-Term Memory (LSTM) network. The second, a novel LSTM-PSO hybrid, strategically takes advantage of deep learning and optimization techniques. Our simulations encompass practical scenarios, including variations in RIS placement and the intricate dynamics of multipath effects, all in Non-Line-of-Sight conditions. Our methods can achieve very high reliability, obtaining centimeter-level accuracy for the 98th percentile (worst case) in a different set of conditions, including the presence of the multipath effect. Furthermore, our hybrid approach showcases remarkable resolution, achieving sub-millimeter-level accuracy in numerous scenarios.","sentences":["In the domain of RIS-based indoor localization, our work introduces two distinct approaches to address real-world challenges.","The first method is based on deep learning, employing a Long Short-Term Memory (LSTM) network.","The second, a novel LSTM-PSO hybrid, strategically takes advantage of deep learning and optimization techniques.","Our simulations encompass practical scenarios, including variations in RIS placement and the intricate dynamics of multipath effects, all in Non-Line-of-Sight conditions.","Our methods can achieve very high reliability, obtaining centimeter-level accuracy for the 98th percentile (worst case) in a different set of conditions, including the presence of the multipath effect.","Furthermore, our hybrid approach showcases remarkable resolution, achieving sub-millimeter-level accuracy in numerous scenarios."],"url":"http://arxiv.org/abs/2405.01965v1","category":"eess.SP"} +{"created":"2024-05-03 09:02:17","title":"An Attention Based Pipeline for Identifying Pre-Cancer Lesions in Head and Neck Clinical Images","abstract":"Early detection of cancer can help improve patient prognosis by early intervention. 
Head and neck cancer is diagnosed in specialist centres after a surgical biopsy, however, there is a potential for these to be missed leading to delayed diagnosis. To overcome these challenges, we present an attention based pipeline that identifies suspected lesions, segments, and classifies them as non-dysplastic, dysplastic and cancerous lesions. We propose (a) a vision transformer based Mask R-CNN network for lesion detection and segmentation of clinical images, and (b) Multiple Instance Learning (MIL) based scheme for classification. Current results show that the segmentation model produces segmentation masks and bounding boxes with up to 82% overlap accuracy score on unseen external test data and surpassing reviewed segmentation benchmarks. Next, a classification F1-score of 85% on the internal cohort test set. An app has been developed to perform lesion segmentation taken via a smart device. Future work involves employing endoscopic video data for precise early detection and prognosis.","sentences":["Early detection of cancer can help improve patient prognosis by early intervention.","Head and neck cancer is diagnosed in specialist centres after a surgical biopsy, however, there is a potential for these to be missed leading to delayed diagnosis.","To overcome these challenges, we present an attention based pipeline that identifies suspected lesions, segments, and classifies them as non-dysplastic, dysplastic and cancerous lesions.","We propose (a) a vision transformer based Mask R-CNN network for lesion detection and segmentation of clinical images, and (b) Multiple Instance Learning (MIL) based scheme for classification.","Current results show that the segmentation model produces segmentation masks and bounding boxes with up to 82% overlap accuracy score on unseen external test data and surpassing reviewed segmentation benchmarks.","Next, a classification F1-score of 85% on the internal cohort test set.","An app has been developed to perform lesion 
segmentation taken via a smart device.","Future work involves employing endoscopic video data for precise early detection and prognosis."],"url":"http://arxiv.org/abs/2405.01937v1","category":"cs.CV"} +{"created":"2024-05-03 17:58:52","title":"Special matrices over finite fields and their applications to quantum error-correcting codes","abstract":"The matrix-product (MP) code $\\mathcal{C}_{A,k}:=[\\mathcal{C}_{1},\\mathcal{C}_{2},\\ldots,\\mathcal{C}_{k}]\\cdot A$ with a non-singular by column (NSC) matrix $A$ plays an important role in constructing good quantum error-correcting codes. In this paper, we study the MP code when the defining matrix $A$ satisfies the condition that $AA^{\\dag}$ is $(D,\\tau)$-monomial. We give an explicit formula for calculating the dimension of the Hermitian hull of a MP code. We provide the necessary and sufficient conditions that a MP code is Hermitian dual-containing (HDC), almost Hermitian dual-containing (AHDC), Hermitian self-orthogonal (HSO), almost Hermitian self-orthogonal (AHSO), and Hermitian LCD, respectively. We theoretically determine the number of all possible ways involving the relationships among the constituent codes to yield a MP code with these properties, respectively. We give alternative necessary and sufficient conditions for a MP code to be AHDC and AHSO, respectively, and show several cases where a MP code is not AHDC or AHSO. 
We provide the construction methods of HDC and AHDC MP codes, including those with optimal minimum distance lower bounds.","sentences":["The matrix-product (MP) code $\\mathcal{C}_{A,k}:=[\\mathcal{C}_{1},\\mathcal{C}_{2},\\ldots,\\mathcal{C}_{k}]\\cdot A$ with a non-singular by column (NSC) matrix $A$ plays an important role in constructing good quantum error-correcting codes.","In this paper, we study the MP code when the defining matrix $A$ satisfies the condition that $AA^{\\dag}$ is $(D,\\tau)$-monomial.","We give an explicit formula for calculating the dimension of the Hermitian hull of a MP code.","We provide the necessary and sufficient conditions that a MP code is Hermitian dual-containing (HDC), almost Hermitian dual-containing (AHDC), Hermitian self-orthogonal (HSO), almost Hermitian self-orthogonal (AHSO), and Hermitian LCD, respectively.","We theoretically determine the number of all possible ways involving the relationships among the constituent codes to yield a MP code with these properties, respectively.","We give alternative necessary and sufficient conditions for a MP code to be AHDC and AHSO, respectively, and show several cases where a MP code is not AHDC or AHSO.","We provide the construction methods of HDC and AHDC MP codes, including those with optimal minimum distance lower bounds."],"url":"http://arxiv.org/abs/2405.02285v1","category":"cs.IT"} +{"created":"2024-05-03 17:53:37","title":"How Rare are TESS Free-Floating Planets?","abstract":"Recently, Kunimoto et al. claimed that a short-lived signal in the Transiting Exoplanet Survey Satellite (TESS) Sector 61 database was caused by a microlensing event with a terrestrial-mass free-floating planet (FFP) lens. In this study, we investigate TESS's ability to detect microlensing FFPs by considering the detailed source information (e.g., distance and radius), the TESS photometric accuracy, and finite-source effects. 
Using the FFP mass function from microlensing surveys toward the Galactic bulge, we find that only $0.0018$ microlensing events are expected to be detected in TESS Sector 61 for the entire planetary mass range. The reported signal is unlikely to be a real microlensing event, which is consistent with the evidence from the long-term OGLE data that the signal was likely due to a stellar flare. By extrapolating our result to fainter stars until $T = 16$ mag and adopting a possible optimized search algorithm, we find that only $\\sim 1$ FFP events can be detected in the entire TESS mission within the first 7 years. Significant improvments of our understanding of FFPs still requires future satellite missions, such as Roman and Earth 2.0, which can detect thousands of FFPs.","sentences":["Recently, Kunimoto et al. claimed that a short-lived signal in the Transiting Exoplanet Survey Satellite (TESS) Sector 61 database was caused by a microlensing event with a terrestrial-mass free-floating planet (FFP) lens.","In this study, we investigate TESS's ability to detect microlensing FFPs by considering the detailed source information (e.g., distance and radius), the TESS photometric accuracy, and finite-source effects.","Using the FFP mass function from microlensing surveys toward the Galactic bulge, we find that only $0.0018$ microlensing events are expected to be detected in TESS Sector 61 for the entire planetary mass range.","The reported signal is unlikely to be a real microlensing event, which is consistent with the evidence from the long-term OGLE data that the signal was likely due to a stellar flare.","By extrapolating our result to fainter stars until $T = 16$ mag and adopting a possible optimized search algorithm, we find that only $\\sim 1$ FFP events can be detected in the entire TESS mission within the first 7 years.","Significant improvments of our understanding of FFPs still requires future satellite missions, such as Roman and Earth 2.0, which can detect 
thousands of FFPs."],"url":"http://arxiv.org/abs/2405.02279v1","category":"astro-ph.EP"} +{"created":"2024-05-03 16:20:16","title":"Influence of a slow moving vehicle on traffic: Well-posedness and approximation for a mildly non-local model","abstract":"In this paper, we propose a macroscopic model that describes the influence of a slow moving large vehicle on road traffic. The model consists of a scalar conservation law with a non-local constraint on the flux. The constraint level depends on the trajectory of the slower vehicle which is given by an ODE depending on the downstream traffic density. After proving well-posedness, we first build a finite volume scheme and prove its convergence, and then investigate numerically this model by performing a series of tests. In particular, the link with the limit local problem of [M. L. Delle Monache and P. Goatin, J. Differ. Equ. 257 (2014), 4015--4029] is explored numerically.","sentences":["In this paper, we propose a macroscopic model that describes the influence of a slow moving large vehicle on road traffic.","The model consists of a scalar conservation law with a non-local constraint on the flux.","The constraint level depends on the trajectory of the slower vehicle which is given by an ODE depending on the downstream traffic density.","After proving well-posedness, we first build a finite volume scheme and prove its convergence, and then investigate numerically this model by performing a series of tests.","In particular, the link with the limit local problem of [M. L. Delle Monache and P. Goatin, J. Differ.","Equ. 257 (2014), 4015--4029] is explored numerically."],"url":"http://arxiv.org/abs/2405.02215v1","category":"math.AP"} +{"created":"2024-05-03 13:21:27","title":"Muon g-2, Long-Range Muon Spin Force, and Neutrino Oscillations","abstract":"Recent studies have proposed using a geocentric muon spin force to account for the $(g-2)_\\mu$ anomaly, with the long-range force mediator being a light axion-like particle. 
The mediator exhibits a CP-violating scalar coupling to nucleons and a normal derivative coupling to muons. Due to the weak symmetry, this axion inevitably couples to neutrinos, providing potential impact on neutrino oscillations. By utilizing neutrino data from BOREXINO, IceCube DeepCore, Super-Kamiokande, and SNO, we have identified that atmospheric neutrino data can impose stringent constraints on the long-range muon spin force model and the $(g-2)_\\mu$ parameter space. Additionally, solar neutrino data places a strong limit on the model but provides a weaker constraint on the $(g-2)_\\mu$ parameter space due to a sign mismatch. With optimized data analysis techniques and the potential from future experiments, such as JUNO, Hyper-Kamiokande, SNO+, and IceCube PINGU, there exists a promising opportunity to achieve even greater sensitivities. Indeed, neutrino oscillations offer a robust and distinctive cross-check for the model, offering stringent constraints on the $(g-2)_\\mu$ parameter space.","sentences":["Recent studies have proposed using a geocentric muon spin force to account for the $(g-2)_\\mu$ anomaly, with the long-range force mediator being a light axion-like particle.","The mediator exhibits a CP-violating scalar coupling to nucleons and a normal derivative coupling to muons.","Due to the weak symmetry, this axion inevitably couples to neutrinos, providing potential impact on neutrino oscillations.","By utilizing neutrino data from BOREXINO, IceCube DeepCore, Super-Kamiokande, and SNO, we have identified that atmospheric neutrino data can impose stringent constraints on the long-range muon spin force model and the $(g-2)_\\mu$ parameter space.","Additionally, solar neutrino data places a strong limit on the model but provides a weaker constraint on the $(g-2)_\\mu$ parameter space due to a sign mismatch.","With optimized data analysis techniques and the potential from future experiments, such as JUNO, Hyper-Kamiokande, SNO+, and IceCube PINGU, there 
exists a promising opportunity to achieve even greater sensitivities.","Indeed, neutrino oscillations offer a robust and distinctive cross-check for the model, offering stringent constraints on the $(g-2)_\\mu$ parameter space."],"url":"http://arxiv.org/abs/2405.02084v1","category":"hep-ph"} +{"created":"2024-05-03 11:03:03","title":"Optimizing Robot Dispersion on Grids: with and without Fault Tolerance","abstract":"The introduction and study of dispersing mobile robots across the nodes of an anonymous graph have recently gained traction and have been explored within various graph classes and settings. While optimal dispersion solution was established for {\\em oriented} grids [Kshemkalyani et al., WALCOM 2020], a significant unresolved question pertains to whether achieving optimal dispersion is feasible on an {\\em unoriented} grid. This paper investigates the dispersion problem on unoriented grids, considering both non-faulty and faulty robots. The challenge posed by unoriented grids lies in the absence of a clear sense of direction for a single robot moving between nodes, as opposed to the straightforward navigation of oriented grids. We present three deterministic algorithms tailored to our robot model. The first and second algorithms deal with the dispersion of faulty and non-faulty robots, ensuring both time and memory optimization in oriented and unoriented grids, respectively. Faulty robots that are prone to crashing at any time, causing permanent failure. In both settings, we achieve dispersion in $O(\\sqrt{n})$ rounds while requiring $O(\\log n)$ bits of memory per robot. The third algorithm tackles faulty robots prone to crash faults in an unoriented grid. In this scenario, our algorithm operates within $O(\\sqrt{n} \\log n)$ time and uses $O(\\sqrt{n} \\log n)$ bits of memory per robot. 
The robots need to know the value of $n$ for termination.","sentences":["The introduction and study of dispersing mobile robots across the nodes of an anonymous graph have recently gained traction and have been explored within various graph classes and settings.","While optimal dispersion solution was established for {\\em oriented} grids [Kshemkalyani et al., WALCOM 2020], a significant unresolved question pertains to whether achieving optimal dispersion is feasible on an {\\em unoriented} grid.","This paper investigates the dispersion problem on unoriented grids, considering both non-faulty and faulty robots.","The challenge posed by unoriented grids lies in the absence of a clear sense of direction for a single robot moving between nodes, as opposed to the straightforward navigation of oriented grids. ","We present three deterministic algorithms tailored to our robot model.","The first and second algorithms deal with the dispersion of faulty and non-faulty robots, ensuring both time and memory optimization in oriented and unoriented grids, respectively.","Faulty robots that are prone to crashing at any time, causing permanent failure.","In both settings, we achieve dispersion in $O(\\sqrt{n})$ rounds while requiring $O(\\log n)$ bits of memory per robot.","The third algorithm tackles faulty robots prone to crash faults in an unoriented grid.","In this scenario, our algorithm operates within $O(\\sqrt{n} \\log n)$ time and uses $O(\\sqrt{n} \\log n)$ bits of memory per robot.","The robots need to know the value of $n$ for termination."],"url":"http://arxiv.org/abs/2405.02002v1","category":"cs.DC"} +{"created":"2024-05-03 09:10:40","title":"CRCL at SemEval-2024 Task 2: Simple prompt optimizations","abstract":"We present a baseline for the SemEval 2024 task 2 challenge, whose objective is to ascertain the inference relationship between pairs of clinical trial report sections and statements. 
We apply prompt optimization techniques with LLM Instruct models provided as a Language Model-as-a-Service (LMaaS). We observed, in line with recent findings, that synthetic CoT prompts significantly enhance manually crafted ones.","sentences":["We present a baseline for the SemEval 2024 task 2 challenge, whose objective is to ascertain the inference relationship between pairs of clinical trial report sections and statements.","We apply prompt optimization techniques with LLM Instruct models provided as a Language Model-as-a-Service (LMaaS).","We observed, in line with recent findings, that synthetic CoT prompts significantly enhance manually crafted ones."],"url":"http://arxiv.org/abs/2405.01942v1","category":"cs.CL"} +{"created":"2024-05-03 09:01:30","title":"Theoretical study of the excitation function in the CN + C2H6 hydrogen transfer reaction. Effect of vibrational excitation","abstract":"To gain insight into the dynamics of the CN + C2H6 gas-phase reaction, quasi-classical trajectory (QCT) calculations were performed on a full-dimensional analytical potential energy surface. This reaction presents very high exothermicity, -22.20 kcal/mol, and it is practically barrierless, with a barrier height of 0.23 kcal/mol, being an early transition state reaction. The V-shape form of the excitation function is characteristic of non-threshold reactions. The pronounced increase observed at lower energies can be attributed to the substantial increase in the impact parameter within this energy regime. 
Vibrational excitations by one quantum of stretching and bending modes give rise to excitation functions that present a similar V-shaped profile.","sentences":["To gain insight into the dynamics of the CN + C2H6 gas-phase reaction, quasi-classical trajectory (QCT) calculations were performed on a full-dimensional analytical potential energy surface.","This reaction presents very high exothermicity, -22.20 kcal/mol, and it is practically barrierless, with a barrier height of 0.23 kcal/mol, being an early transition state reaction.","The V-shape form of the excitation function is characteristic of non-threshold reactions.","The pronounced increase observed at lower energies can be attributed to the substantial increase in the impact parameter within this energy regime.","Vibrational excitations by one quantum of stretching and bending modes give rise to excitation functions that present a similar V-shaped profile."],"url":"http://arxiv.org/abs/2405.01936v1","category":"physics.chem-ph"} +{"created":"2024-05-03 17:58:31","title":"Study of energy backflow in unidirectional monochromatic and space-time waves","abstract":"Backflow, or retropropagation, is a counterintuitive phenomenon whereby for a forward-propagating wave the energy locally propagates backward. In this study, energy backflow has been examined in connection with (a) (2+1)-dimensional unidirectional scalar and vector-valued monochromatic waves; (b) a (2+1)D scalar spatiotemporal wavepacket constructed by using an appropriate temporal frequency spectrum; (c) a scalar closed-form analytical unidirectional version of the Focus Wave Mode -- a localized pulse propagating luminally and without spread. 
Furthermore, an extended class of (2+1)D and (3+1)D finite-energy unidirectional spatiotemporally localized wave packets has been derived.","sentences":["Backflow, or retropropagation, is a counterintuitive phenomenon whereby for a forward-propagating wave the energy locally propagates backward.","In this study, energy backflow has been examined in connection with (a) (2+1)-dimensional unidirectional scalar and vector-valued monochromatic waves; (b) a (2+1)D scalar spatiotemporal wavepacket constructed by using an appropriate temporal frequency spectrum; (c) a scalar closed-form analytical unidirectional version of the Focus Wave Mode -- a localized pulse propagating luminally and without spread.","Furthermore, an extended class of (2+1)D and (3+1)D finite-energy unidirectional spatiotemporally localized wave packets has been derived."],"url":"http://arxiv.org/abs/2405.02284v1","category":"physics.optics"} +{"created":"2024-05-03 17:47:31","title":"Early flash-ionization lines in SN 2024ggi revealed by high-resolution spectroscopy","abstract":"We present an analysis of very early high-resolution spectroscopic observations of the nearby core-collapse (CC) supernova (SN) 2024ggi, a Type II SN that occurred in the galaxy NGC 3621, at a distance of 7.11 Mpc ($z\\approx0.002435$). These observations represent the earliest high-resolution spectroscopy of a CCSN ever made. We analyze the very early-phase spectroscopic evolution of SN 2024ggi obtained in a short interval at 26.6 and 33.8h after the SN first light. Observations were obtained with the high-resolution spectrograph MIKE ($R\\approx22600-28000$) at the 6.5m Magellan Clay Telescope, located at the Las Campanas Observatory, during the night of 2024-04-12UT. We constrain emission line features in the early-phase spectroscopic evolution of SN 2024ggi. 
We analyze the evolution of main spectroscopic features and the occurrence of high-ionization emission lines, by estimating their full width at half maximum (FWHM), equivalent width (EW), and blueshift velocities. We then compare our results to other early-time observations of CCSNe. The spectra show strong and narrow features of Balmer emission lines and of high-ionization species of HeI, HeII, NIII, CIII, together with relatively broader emission features of NIV and CIV. Some of these features become broader or disappear in the interval of 8h, indicating the rapid changes in the early evolution of CCSNe flash-ionization features. The HeII, CIV, NIV and Balmer emission lines have asymmetric Lorentzian profiles, with the HeII $\\lambda4686$ broad component showing blue wings that extends up to $\\sim-1000$ km s$^{-1}$. We also measure a CSM expansion velocity of $\\sim 79 \\ \\textrm{km} \\ \\textrm{s}^{-1}$ from the blueshift in the H$\\alpha$ emission profile, and a total extinction in the line of sight of $E(B-V)=0.16$ mag. 
Finally, we note many similarities of SN 2024ggi to the early evolution of SN 2023ixf.","sentences":["We present an analysis of very early high-resolution spectroscopic observations of the nearby core-collapse (CC) supernova (SN) 2024ggi, a Type II SN that occurred in the galaxy NGC 3621, at a distance of 7.11 Mpc ($z\\approx0.002435$).","These observations represent the earliest high-resolution spectroscopy of a CCSN ever made.","We analyze the very early-phase spectroscopic evolution of SN 2024ggi obtained in a short interval at 26.6 and 33.8h after the SN first light.","Observations were obtained with the high-resolution spectrograph MIKE ($R\\approx22600-28000$) at the 6.5m Magellan Clay Telescope, located at the Las Campanas Observatory, during the night of 2024-04-12UT.","We constrain emission line features in the early-phase spectroscopic evolution of SN 2024ggi.","We analyze the evolution of main spectroscopic features and the occurrence of high-ionization emission lines, by estimating their full width at half maximum (FWHM), equivalent width (EW), and blueshift velocities.","We then compare our results to other early-time observations of CCSNe.","The spectra show strong and narrow features of Balmer emission lines and of high-ionization species of HeI, HeII, NIII, CIII, together with relatively broader emission features of NIV and CIV.","Some of these features become broader or disappear in the interval of 8h, indicating the rapid changes in the early evolution of CCSNe flash-ionization features.","The HeII, CIV, NIV and Balmer emission lines have asymmetric Lorentzian profiles, with the HeII $\\lambda4686$ broad component showing blue wings that extends up to $\\sim-1000$ km s$^{-1}$.","We also measure a CSM expansion velocity of $\\sim 79 \\ \\textrm{km} \\ \\textrm{s}^{-1}$ from the blueshift in the H$\\alpha$ emission profile, and a total extinction in the line of sight of $E(B-V)=0.16$ mag.","Finally, we note many similarities of SN 2024ggi to the 
early evolution of SN 2023ixf."],"url":"http://arxiv.org/abs/2405.02274v1","category":"astro-ph.HE"} +{"created":"2024-05-03 17:04:23","title":"Reversible single-pulse laser-induced phase change of Sb$_2$S$_3$ thin films: multi-physics modeling and experimental demonstrations","abstract":"Phase change materials (PCMs) have gained a tremendous interest as a means to actively tune nanophotonic devices through the large optical modulation produced by their amorphous to crystalline reversible transition. Recently, materials such as Sb$_2$S$_3$ emerged as particularly promising low loss PCMs, with both large refractive index modulations and transparency in the visible and NIR. Controlling the local and reversible phase transition in this material is of major importance for future applications, and an appealing method to do so is to exploit pulsed lasers. Yet, the physics and limits involved in the optical switching of Sb$_2$S$_3$ are not yet well understood. Here, we investigate the reversible laser-induced phase transition of Sb$_2$S$_3$, focusing specifically on the mechanisms that drive the optically induced amorphization, with multi-physics considerations including the optical and thermal properties of the PCM and its environment. We theoretically and experimentally determine the laser energy threshold for reversibly changing the phase of the PCM, not only between fully amorphous and crystalline states but also between partially recrystallized states. We then reveal the non-negligible impact of the material's polycrystallinity and anisotropy on the power thresholds for optical switching. Finally, we address the challenges related to laser amorphization of thick Sb$_2$S$_3$ layers, as well as strategies to overcome them. 
These results enable a qualitative and quantitative understanding of the physics behind the optically-induced reversible change of phase in Sb$_2$S$_3$ layers.","sentences":["Phase change materials (PCMs) have gained a tremendous interest as a means to actively tune nanophotonic devices through the large optical modulation produced by their amorphous to crystalline reversible transition.","Recently, materials such as Sb$_2$S$_3$ emerged as particularly promising low loss PCMs, with both large refractive index modulations and transparency in the visible and NIR.","Controlling the local and reversible phase transition in this material is of major importance for future applications, and an appealing method to do so is to exploit pulsed lasers.","Yet, the physics and limits involved in the optical switching of Sb$_2$S$_3$ are not yet well understood.","Here, we investigate the reversible laser-induced phase transition of Sb$_2$S$_3$, focusing specifically on the mechanisms that drive the optically induced amorphization, with multi-physics considerations including the optical and thermal properties of the PCM and its environment.","We theoretically and experimentally determine the laser energy threshold for reversibly changing the phase of the PCM, not only between fully amorphous and crystalline states but also between partially recrystallized states.","We then reveal the non-negligible impact of the material's polycrystallinity and anisotropy on the power thresholds for optical switching.","Finally, we address the challenges related to laser amorphization of thick Sb$_2$S$_3$ layers, as well as strategies to overcome them.","These results enable a qualitative and quantitative understanding of the physics behind the optically-induced reversible change of phase in Sb$_2$S$_3$ layers."],"url":"http://arxiv.org/abs/2405.02249v1","category":"physics.app-ph"} +{"created":"2024-05-03 16:56:46","title":"Measurement of the flow harmonic correlations via multi-particle 
symmetric and asymmetric cumulants in Au+Au collisions at $\\sqrt{s_{NN}}$ = 200 GeV","abstract":"We study multi-particle azimuthal correlations in Au+Au collisions at $\\sqrt{s_{NN}} = 200$ GeV. We use initial conditions obtained from a Monte-Carlo Glauber model and evolve them within a viscous relativistic hydrodynamics framework that eventually gives way to a transport model in the late hadronic stage of the evolution. We compute the multi-particle symmetric and asymmetric cumulants and present the results for their sensitivity to the shear and bulk viscosities during the hydrodynamic evolution. We show that these observables are more sensitive to the transport coefficients than the traditional flow observables.","sentences":["We study multi-particle azimuthal correlations in Au+Au collisions at $\\sqrt{s_{NN}} = 200$ GeV.","We use initial conditions obtained from a Monte-Carlo Glauber model and evolve them within a viscous relativistic hydrodynamics framework that eventually gives way to a transport model in the late hadronic stage of the evolution.","We compute the multi-particle symmetric and asymmetric cumulants and present the results for their sensitivity to the shear and bulk viscosities during the hydrodynamic evolution.","We show that these observables are more sensitive to the transport coefficients than the traditional flow observables."],"url":"http://arxiv.org/abs/2405.02245v1","category":"nucl-th"} +{"created":"2024-05-03 16:43:17","title":"Influence of Polymer on Shock-Induced Pore Collapse: Hotspot Criticality through Reactive Molecular Dynamics","abstract":"The shock initiation of energetic materials is mediated by the localization of mechanical energy into hotspots. These originate through the interaction of the shock and material microstructure; the most potent hotspots are formed by the collapse of porosity. 
Recent work using molecular dynamics (MD) has shed light on the molecular mechanisms responsible for the shock-to-deflagration transition following pore collapse in pure energetic materials. However, explosive formulations are composites of energetic crystals and a polymer binder, which differs from the prior focus on pure materials. The role of polymer phases on hotspot formation and its criticality is not well-understood. We use reactive MD simulations to investigate the role of polystyrene and polyvinyl nitrate films around pores in the shock-induced pore collapse of RDX. The polymer affects the hotspots' temperature and their criticality. While the presence of inert polymer often delays or hinders chemical reactions of the energetic material, certain geometries accelerate chemistry. The simulations provide a mechanistic understanding of these phenomena.","sentences":["The shock initiation of energetic materials is mediated by the localization of mechanical energy into hotspots.","These originate through the interaction of the shock and material microstructure; the most potent hotspots are formed by the collapse of porosity.","Recent work using molecular dynamics (MD) has shed light on the molecular mechanisms responsible for the shock-to-deflagration transition following pore collapse in pure energetic materials.","However, explosive formulations are composites of energetic crystals and a polymer binder, which differs from the prior focus on pure materials.","The role of polymer phases on hotspot formation and its criticality is not well-understood.","We use reactive MD simulations to investigate the role of polystyrene and polyvinyl nitrate films around pores in the shock-induced pore collapse of RDX.","The polymer affects the hotspots' temperature and their criticality.","While the presence of inert polymer often delays or hinders chemical reactions of the energetic material, certain geometries accelerate chemistry.","The simulations provide a mechanistic 
understanding of these phenomena."],"url":"http://arxiv.org/abs/2405.02234v1","category":"cond-mat.mtrl-sci"} +{"created":"2024-05-03 16:31:25","title":"Lectures on Resurgence in Integrable Field Theories","abstract":"There has been recently considerable progress in understanding the nature of perturbation theory in UV free and gapped $2d$ integrable field theories with renormalon singularities. Thanks to Bethe ansatz and large $N$ techniques, non-perturbative corrections can also be computed and lead to the reconstruction of the trans-series for the free energy in presence of a chemical potential. This is an ideal arena to test resurgence in QFT and determine if and how the exact result can be reconstructed from the knowledge of the perturbative series only. In these notes we give a pedagogical introduction to this subject starting from the basics. In the first lecture we give an overview of applications in QFT of Borel resummations before the advent of resurgence. The second lecture introduces the key concepts of resurgence and finally in the third lecture we discuss a specific application in the context of the principal chiral field model. 
Extended version of three lectures given at IHES and review talks given at Les Diablerets and Mainz, in 2023.","sentences":["There has been recently considerable progress in understanding the nature of perturbation theory in UV free and gapped $2d$ integrable field theories with renormalon singularities.","Thanks to Bethe ansatz and large $N$ techniques, non-perturbative corrections can also be computed and lead to the reconstruction of the trans-series for the free energy in presence of a chemical potential.","This is an ideal arena to test resurgence in QFT and determine if and how the exact result can be reconstructed from the knowledge of the perturbative series only.","In these notes we give a pedagogical introduction to this subject starting from the basics.","In the first lecture we give an overview of applications in QFT of Borel resummations before the advent of resurgence.","The second lecture introduces the key concepts of resurgence and finally in the third lecture we discuss a specific application in the context of the principal chiral field model.","Extended version of three lectures given at IHES and review talks given at Les Diablerets and Mainz, in 2023."],"url":"http://arxiv.org/abs/2405.02224v1","category":"hep-th"} +{"created":"2024-05-03 16:28:20","title":"Unitarity in the non-relativistic regime and implications for dark matter","abstract":"Unitarity sets upper limits on partial-wave elastic and inelastic cross-sections, which are often violated by perturbative computations. We discuss the dynamics underlying these limits in the non-relativistic regime, namely long-range interactions, and show how the resummation of the elastic 2-particle-irreducible diagrams arising from squaring inelastic processes unitarizes inelastic cross-sections. 
Our results are model-independent, apply to all partial waves, and affect elastic and inelastic cross-sections, with extensive implications for new physics scenarios, including dark-matter freeze-out and self-interactions.","sentences":["Unitarity sets upper limits on partial-wave elastic and inelastic cross-sections, which are often violated by perturbative computations.","We discuss the dynamics underlying these limits in the non-relativistic regime, namely long-range interactions, and show how the resummation of the elastic 2-particle-irreducible diagrams arising from squaring inelastic processes unitarizes inelastic cross-sections.","Our results are model-independent, apply to all partial waves, and affect elastic and inelastic cross-sections, with extensive implications for new physics scenarios, including dark-matter freeze-out and self-interactions."],"url":"http://arxiv.org/abs/2405.02222v1","category":"hep-ph"} +{"created":"2024-05-03 16:11:00","title":"Pseudoscalar Higgs plus jet production at Next-to-Next-to-Leading Order in QCD","abstract":"We present a calculation of pseudoscalar Higgs production in association with a jet at Next-to-Next-to Leading Order (NNLO) accuracy in QCD. We work in an effective field theory in which $m_t \\rightarrow \\infty$ resulting in effective operators which couple the pseudoscalar to gluons and (massless) quarks. We have calculated all of the relevant amplitudes for the two-loop, one-loop and tree-level contributions. As a cross-check of our calculation we have re-calculated all of the scalar Higgs plus parton amplitudes and perform a detailed comparison to the literature. In order to regulate the infra-red singularities present at this order we employ the $N-$jettiness slicing method. In addition to a detailed validation of our calculation at this order we investigate LHC phenomenology for a selection of pseudoscalar Higgs masses. 
Our results are implemented into the parton-level Monte Carlo code MCFM.","sentences":["We present a calculation of pseudoscalar Higgs production in association with a jet at Next-to-Next-to Leading Order (NNLO) accuracy in QCD.","We work in an effective field theory in which $m_t \\rightarrow \\infty$ resulting in effective operators which couple the pseudoscalar to gluons and (massless) quarks.","We have calculated all of the relevant amplitudes for the two-loop, one-loop and tree-level contributions.","As a cross-check of our calculation we have re-calculated all of the scalar Higgs plus parton amplitudes and perform a detailed comparison to the literature.","In order to regulate the infra-red singularities present at this order we employ the $N-$jettiness slicing method.","In addition to a detailed validation of our calculation at this order we investigate LHC phenomenology for a selection of pseudoscalar Higgs masses.","Our results are implemented into the parton-level Monte Carlo code MCFM."],"url":"http://arxiv.org/abs/2405.02210v1","category":"hep-ph"} +{"created":"2024-05-03 15:42:27","title":"Unraveling p-type and n-type interfaces in Superconducting Infinite-Layer Nickelate thin films","abstract":"After decades of research, superconductivity was finally found in nickel-based analogs of superconducting cuprates, with infinite-layer (IL) structure. These results are so far restricted to thin films in the case of IL-nickelates. Therefore, the nature of the interface with the substrate, and how it couples with the thin film properties is still an open question. Here, using scanning transmission electron microscopy (STEM)- electron energy loss spectroscopy (EELS) and four-dimensional (4D)-STEM, a novel chemically sharp p-type interface is observed in a series of superconducting IL-praseodymium nickelate samples, and a comparative study is carried out with the previously reported n-type interface obtained in other samples. 
Both interfaces have strong differences, with the p-type interface being highly polar. In combination with ab-initio calculations, we find that the influence of the interface on the electronic structure is local, and does not extend beyond 2-3 unit cells into the thin film. This decouples the direct influence of the interface in driving the superconductivity, and indicates that the IL-nickelate thin films do not have a universal interface model. Insights into the spatial hole-distribution in SC samples, provided by monochromated EELS and total reflection-hard x-ray photoemission spectroscopy, suggest that this particular distribution might be directly influencing superconductivity.","sentences":["After decades of research, superconductivity was finally found in nickel-based analogs of superconducting cuprates, with infinite-layer (IL) structure.","These results are so far restricted to thin films in the case of IL-nickelates.","Therefore, the nature of the interface with the substrate, and how it couples with the thin film properties is still an open question.","Here, using scanning transmission electron microscopy (STEM)- electron energy loss spectroscopy (EELS) and four-dimensional (4D)-STEM, a novel chemically sharp p-type interface is observed in a series of superconducting IL-praseodymium nickelate samples, and a comparative study is carried out with the previously reported n-type interface obtained in other samples.","Both interfaces have strong differences, with the p-type interface being highly polar.","In combination with ab-initio calculations, we find that the influence of the interface on the electronic structure is local, and does not extend beyond 2-3 unit cells into the thin film.","This decouples the direct influence of the interface in driving the superconductivity, and indicates that the IL-nickelate thin films do not have a universal interface model.","Insights into the spatial hole-distribution in SC samples, provided by monochromated EELS and 
total reflection-hard x-ray photoemission spectroscopy, suggest that this particular distribution might be directly influencing superconductivity."],"url":"http://arxiv.org/abs/2405.02186v1","category":"cond-mat.supr-con"} +{"created":"2024-05-03 15:17:15","title":"The role of LRG1 and LRG2's monopole in inferring the DESI 2024 BAO cosmology","abstract":"Dark Energy Spectroscopic Instrument (DESI) collaboration recently release the first year data (DR1) of baryon acoustic oscillations (BAO) in galaxy, quasar and Lyman-$\\alpha$ forest tracers. When combined the CMB and SNIa data with DESI BAO, its cosmological implication shows a hint of thawing dark energy behavior. The official distance measurements for cosmology analysis are given in terms of the comoving distances along ($D_H$), and perpendicular to ($D_M$), the line of sight to the observer. We notice that there are $1\\sim2\\sigma$ deviations in $D_M$ and $D_H$ from the values of Planck cosmology in the luminous red galaxies (LRG) bin 1 and bin 2, namely LRG1 and LRG2. In this paper, we want to study the role of LRG1 and LRG2 in driving the DESI 2024 BAO cosmology away from Planck cosmology. Since the angle-averaged distance $D_V$ and the ratio of transverse and line-of-sight comoving distances $F_{\\rm AP}=D_M/D_H$ are more directly related with the measured monopole and quadrupole components of galaxy power spectrum or correlation function, we use $D_V$ and $F_{\\rm AP}$ instead of the officially adopted $D_M$ and $D_H$. The purpose of this data vector transformation is to isolate the influence of monopoles in LRG1 and LRG2 in driving the deviation from $w=-1$. We find that by removing the $D_V$ data point in LRG2, the DESI + CMB + SNIa data compilation recovers $w=-1$ within $2\\sigma$ contour. Similarly, the exclusion of the $D_V$ data point from LRG1 shifts the contour in the $w_0/w_a$ plane toward $w=-1$, though no intersection is observed. 
This underscores the preference of both the LRG1 and LRG2 BAO monopole components for the thawing dark energy model, with LRG2 exhibiting a stronger preference compared to LRG1. Alongside this paper, we provide the $D_V$ and $F_{\\rm AP}$ data as well as their covariance.","sentences":["Dark Energy Spectroscopic Instrument (DESI) collaboration recently release the first year data (DR1) of baryon acoustic oscillations (BAO) in galaxy, quasar and Lyman-$\\alpha$ forest tracers.","When combined the CMB and SNIa data with DESI BAO, its cosmological implication shows a hint of thawing dark energy behavior.","The official distance measurements for cosmology analysis are given in terms of the comoving distances along ($D_H$), and perpendicular to ($D_M$), the line of sight to the observer.","We notice that there are $1\\sim2\\sigma$ deviations in $D_M$ and $D_H$ from the values of Planck cosmology in the luminous red galaxies (LRG) bin 1 and bin 2, namely LRG1 and LRG2.","In this paper, we want to study the role of LRG1 and LRG2 in driving the DESI 2024 BAO cosmology away from Planck cosmology.","Since the angle-averaged distance $D_V$ and the ratio of transverse and line-of-sight comoving distances $F_{\\rm AP}=D_M/D_H$ are more directly related with the measured monopole and quadrupole components of galaxy power spectrum or correlation function, we use $D_V$ and $F_{\\rm AP}$ instead of the officially adopted $D_M$ and $D_H$. The purpose of this data vector transformation is to isolate the influence of monopoles in LRG1 and LRG2 in driving the deviation from $w=-1$. 
We find that by removing the $D_V$ data point in LRG2, the DESI + CMB + SNIa data compilation recovers $w=-1$ within $2\\sigma$ contour.","Similarly, the exclusion of the $D_V$ data point from LRG1 shifts the contour in the $w_0/w_a$ plane toward $w=-1$, though no intersection is observed.","This underscores the preference of both the LRG1 and LRG2 BAO monopole components for the thawing dark energy model, with LRG2 exhibiting a stronger preference compared to LRG1.","Alongside this paper, we provide the $D_V$ and $F_{\\rm AP}$ data as well as their covariance."],"url":"http://arxiv.org/abs/2405.02168v1","category":"astro-ph.CO"} +{"created":"2024-05-03 14:41:01","title":"Multi-rate Runge-Kutta methods: stability analysis and applications","abstract":"We present an approach for the efficient implementation of self-adjusting multi-rate Runge-Kutta methods and we extend the previously available stability analyses of these methods to the case of an arbitrary number of sub-steps for the active components. We propose a physically motivated model problem that can be used to assess the stability of different multi-rate versions of standard Runge-Kutta methods and the impact of different interpolation methods for the latent variables. 
Finally, we present the results of several numerical experiments, performed with implementations of the proposed methods in the framework of the \\textit{OpenModelica} open-source modelling and simulation software, which demonstrate the efficiency gains deriving from the use of the proposed multi-rate approach for physical modelling problems with multiple time scales.","sentences":["We present an approach for the efficient implementation of self-adjusting multi-rate Runge-Kutta methods and we extend the previously available stability analyses of these methods to the case of an arbitrary number of sub-steps for the active components.","We propose a physically motivated model problem that can be used to assess the stability of different multi-rate versions of standard Runge-Kutta methods and the impact of different interpolation methods for the latent variables.","Finally, we present the results of several numerical experiments, performed with implementations of the proposed methods in the framework of the \\textit{OpenModelica} open-source modelling and simulation software, which demonstrate the efficiency gains deriving from the use of the proposed multi-rate approach for physical modelling problems with multiple time scales."],"url":"http://arxiv.org/abs/2405.02139v1","category":"math.NA"} +{"created":"2024-05-03 14:39:31","title":"Intermittent thermal convection in jammed emulsions","abstract":"We study the process of thermal convection in jammed emulsions with a yield-stress rheology. We find that heat transfer occurs via an intermittent mechanism, whereby intense short-lived convective \"heat bursts\" are spaced out by long-lasting conductive periods. This behaviour is the result of a sequence of fluidization-rigidity transitions, rooted in a non-trivial interplay between emulsion yield-stress rheology and plastic activity, which we characterize via a statistical analysis of the dynamics at the droplet scale. 
We also show that droplets' coalescence induced during heat bursts leads to a spatially heterogeneous phase-inversion of the emulsion which eventually supports a sustained convective state.","sentences":["We study the process of thermal convection in jammed emulsions with a yield-stress rheology.","We find that heat transfer occurs via an intermittent mechanism, whereby intense short-lived convective \"heat bursts\" are spaced out by long-lasting conductive periods.","This behaviour is the result of a sequence of fluidization-rigidity transitions, rooted in a non-trivial interplay between emulsion yield-stress rheology and plastic activity, which we characterize via a statistical analysis of the dynamics at the droplet scale.","We also show that droplets' coalescence induced during heat bursts leads to a spatially heterogeneous phase-inversion of the emulsion which eventually supports a sustained convective state."],"url":"http://arxiv.org/abs/2405.02135v1","category":"physics.flu-dyn"} +{"created":"2024-05-03 14:25:30","title":"Floquet dynamics of ultracold atoms in optical lattices with a parametrically modulated trapping potential","abstract":"Experiments with ultracold atoms in optical lattices usually involve a weak parabolic trapping potential which merely serves to confine the atoms, but otherwise remains negligible. In contrast, we suggest a different class of experiments in which the presence of a stronger trap is an essential part of the set-up. Because the trap-modified on-site energies exhibit a slowly varying level spacing, similar to that of an anharmonic oscillator, an additional time-periodic trap modulation with judiciously chosen parameters creates nonlinear resonances which enable efficient Floquet engineering. We employ a Mathieu approximation for constructing the near-resonant Floquet states in an accurate manner and demonstrate the emergence of effective ground states from the resonant trap eigenstates. 
Moreover, we show that the population of the Floquet states is strongly affected by the phase of a sudden turn-on of the trap modulation, which leads to significantly modified and rich dynamics. As a guideline for further studies, we argue that the deliberate population of only the resonance-induced effective ground states will allow one to realize Floquet condensates which follow classical periodic orbits, thus providing challenging future perspectives for the investigation of the quantum-classical correspondence.","sentences":["Experiments with ultracold atoms in optical lattices usually involve a weak parabolic trapping potential which merely serves to confine the atoms, but otherwise remains negligible.","In contrast, we suggest a different class of experiments in which the presence of a stronger trap is an essential part of the set-up.","Because the trap-modified on-site energies exhibit a slowly varying level spacing, similar to that of an anharmonic oscillator, an additional time-periodic trap modulation with judiciously chosen parameters creates nonlinear resonances which enable efficient Floquet engineering.","We employ a Mathieu approximation for constructing the near-resonant Floquet states in an accurate manner and demonstrate the emergence of effective ground states from the resonant trap eigenstates.","Moreover, we show that the population of the Floquet states is strongly affected by the phase of a sudden turn-on of the trap modulation, which leads to significantly modified and rich dynamics.","As a guideline for further studies, we argue that the deliberate population of only the resonance-induced effective ground states will allow one to realize Floquet condensates which follow classical periodic orbits, thus providing challenging future perspectives for the investigation of the quantum-classical correspondence."],"url":"http://arxiv.org/abs/2405.02125v1","category":"cond-mat.quant-gas"} +{"created":"2024-05-03 14:18:21","title":"New analytical and 
geometrical aspects on Trudinger-Moser type inequality in 2D","abstract":"The present survey is devoted to results on Trudinger-Moser inequalities in two dimension. We give a brief overview of the history of these celebrated inequalities and, starting from the geometric problem that motivated Moser's original work, we discuss the connection between Onofri's inequality for the unit sphere and sharp inequalities on Euclidean domains. Finally, we present recent results and new insights into nonlocal interaction energy functionals in two dimension, involving logarithmic kernels.","sentences":["The present survey is devoted to results on Trudinger-Moser inequalities in two dimension.","We give a brief overview of the history of these celebrated inequalities and, starting from the geometric problem that motivated Moser's original work, we discuss the connection between Onofri's inequality for the unit sphere and sharp inequalities on Euclidean domains.","Finally, we present recent results and new insights into nonlocal interaction energy functionals in two dimension, involving logarithmic kernels."],"url":"http://arxiv.org/abs/2405.02118v1","category":"math.AP"} +{"created":"2024-05-03 14:17:23","title":"Intriguing aspects of light baryon resonances","abstract":"We discuss that some light baryon resonances exhibit properties which cannot be described when attributing a three-valence quark structure to them. Besides pointing out the hadron resonances which clearly require description beyond the quark model, we focus on the third $s_{11},~ N^*$ state and its decay to final states consisting of the lightest hyperon resonances which have a partial width comparable to that for the decay to $\\pi N$. Such properties of the mentioned nucleon resonance get manifested in the cross sections and other observables related to processes producing the lightest hyperon resonances. 
We show that all these findings arise from the strong association of the baryon resonances to the dynamics among the ground-state hadrons.","sentences":["We discuss that some light baryon resonances exhibit properties which cannot be described when attributing a three-valence quark structure to them.","Besides pointing out the hadron resonances which clearly require description beyond the quark model, we focus on the third $s_{11},~ N^*$ state and its decay to final states consisting of the lightest hyperon resonances which have a partial width comparable to that for the decay to $\\pi N$. Such properties of the mentioned nucleon resonance get manifested in the cross sections and other observables related to processes producing the lightest hyperon resonances.","We show that all these findings arise from the strong association of the baryon resonances to the dynamics among the ground-state hadrons."],"url":"http://arxiv.org/abs/2405.02116v1","category":"hep-ph"} +{"created":"2024-05-03 13:24:04","title":"X-ray observations of the Zwicky 3146 galaxy cluster reveal a 3.5 keV excess","abstract":"In this note, we present spectral fits of the well-documented sloshing cool-core cluster Zwicky 3146 ($z=0.291$), to test the existence of the highly speculated 3.5 keV line. We report excesses at $>3\\sigma$ significance at $E=3.575$ keV, yielding a flux $F = 8.73_{-2.22}^{+2.17}$ $\\times 10^{-6}$ photons cm$^{-2}$ s$^{-1}$, in \\textit{XMM-Newton}, and $E=3.55$ keV, with a flux $F = 10.0_{-2.96}^{+3.05}$ $\\times 10^{-6}$ photons cm$^{-2}$ s$^{-1}$ in \\textit{Chandra}. We explore the possibility that the 3.5 keV excess is correlated to the presence of cold gas within the cluster, based on optical and sub-mm literature analyses. 
Following the launch of the X-ray Imaging and Spectroscopy Mission (XRISM), high resolution spectroscopy ($\\leq 7$ eV) will reveal in unprecedented detail, the origin of this unidentified feature, for which Zwicky 3146 should be considered a viable target, due to the strength of the feature in two independent X-ray telescopes, opening a new window into plasma or charge exchange studies in galaxy clusters.","sentences":["In this note, we present spectral fits of the well-documented sloshing cool-core cluster Zwicky 3146 ($z=0.291$), to test the existence of the highly speculated 3.5 keV line.","We report excesses at $>3\\sigma$ significance at $E=3.575$ keV, yielding a flux $F = 8.73_{-2.22}^{+2.17}$ $\\times 10^{-6}$ photons cm$^{-2}$","s$^{-1}$, in \\textit{XMM-Newton}, and $E=3.55$ keV, with a flux $F = 10.0_{-2.96}^{+3.05}$ $\\times 10^{-6}$ photons cm$^{-2}$ s$^{-1}$ in \\textit{Chandra}.","We explore the possibility that the 3.5 keV excess is correlated to the presence of cold gas within the cluster, based on optical and sub-mm literature analyses.","Following the launch of the X-ray Imaging and Spectroscopy Mission (XRISM), high resolution spectroscopy ($\\leq 7$ eV) will reveal in unprecedented detail, the origin of this unidentified feature, for which Zwicky 3146 should be considered a viable target, due to the strength of the feature in two independent X-ray telescopes, opening a new window into plasma or charge exchange studies in galaxy clusters."],"url":"http://arxiv.org/abs/2405.02088v1","category":"astro-ph.HE"} +{"created":"2024-05-03 12:36:23","title":"The CO-dark molecular gas in the cold HI arc","abstract":"The CO-dark molecular gas (DMG), which refers to the molecular gas not traced by CO emission, is crucial for the evolution of the interstellar medium (ISM). 
While the gas properties of DMG have been widely explored in the Solar neighborhood, whether or not they are similar in the outer disk regions of the Milky Way is still not well understood. In this Letter, we confirm the existence of DMG toward a cold HI arc structure at 13 kpc away from the Galactic center with both OH emission and HI narrow self-absorption (HINSA). This is the first detection of HINSA in the outer disk region, in which the HINSA fraction ($N_{\\rm HINSA}$/$N_{\\rm H_2}$ = 0.022$\\pm$0.011) is an order of magnitude higher than the average value observed in nearby evolved dark clouds, but is consistent with that of the early evolutionary stage of dark clouds. The inferred H$_2$ column density from both extinction and OH emission ($N_{\\rm H_2} \\approx 10^{20}$ cm$^{-2}$) is an order of magnitude higher than previously estimated. Although the ISM environmental parameters are expected to be different between the outer Galactic disk regions and the Solar neighborhood, we find that the visual extinction ($A_{\\rm V}$ = 0.19$\\pm$0.03 mag), H$_2$-gas density ($n_{\\rm H_2} = 91\\pm46$ cm$^{-3}$), and molecular fraction (58\\%$\\pm$28\\%) of the DMG are rather similar to those of nearby diffuse molecular clouds. 
The existence of DMG associated with the expanding HI supershell supports a scenario where the expansion of supershells may trigger the formation of molecular clouds within a crossing timescale of the shock wave ($\\sim$10$^6$ yr).","sentences":["The CO-dark molecular gas (DMG), which refers to the molecular gas not traced by CO emission, is crucial for the evolution of the interstellar medium (ISM).","While the gas properties of DMG have been widely explored in the Solar neighborhood, whether or not they are similar in the outer disk regions of the Milky Way is still not well understood.","In this Letter, we confirm the existence of DMG toward a cold HI arc structure at 13 kpc away from the Galactic center with both OH emission and HI narrow self-absorption (HINSA).","This is the first detection of HINSA in the outer disk region, in which the HINSA fraction ($N_{\\rm HINSA}$/$N_{\\rm H_2}$ = 0.022$\\pm$0.011) is an order of magnitude higher than the average value observed in nearby evolved dark clouds, but is consistent with that of the early evolutionary stage of dark clouds.","The inferred H$_2$ column density from both extinction and OH emission ($N_{\\rm H_2} \\approx 10^{20}$ cm$^{-2}$) is an order of magnitude higher than previously estimated.","Although the ISM environmental parameters are expected to be different between the outer Galactic disk regions and the Solar neighborhood, we find that the visual extinction ($A_{\\rm V}$ = 0.19$\\pm$0.03 mag), H$_2$-gas density ($n_{\\rm H_2} = 91\\pm46$ cm$^{-3}$), and molecular fraction (58\\%$\\pm$28\\%) of the DMG are rather similar to those of nearby diffuse molecular clouds.","The existence of DMG associated with the expanding HI supershell supports a scenario where the expansion of supershells may trigger the formation of molecular clouds within a crossing timescale of the shock wave ($\\sim$10$^6$ yr)."],"url":"http://arxiv.org/abs/2405.02055v1","category":"astro-ph.GA"} +{"created":"2024-05-03 
12:25:42","title":"Cu$_x$Al$_{1-x}$ films as Alternatives to Copper for Advanced Interconnect Metallization","abstract":"Cu$_x$Al$_{1-x}$ thin films with $0.2 \\le x \\le 0.7$ have been studied as potential alternatives for the metallization of advanced interconnects. First-principles simulations were used to obtain the Cu$_x$Al$_{1-x}$ electronic structure and cohesive energy to benchmark different intermetallics and their prospects for interconnect metallization. Next, thin Cu$_x$Al$_{1-x}$ films were deposited by PVD with thicknesses in the range between 3 and 28 nm. The lowest resistivities of 9.5 $\\mu\\Omega$cm were obtained for 28 nm thick stoichiometric CuAl and CuAl$_2$ after 400$^\\circ$C post-deposition annealing. Based on the experimental results, we discuss the main challenges for the studied aluminides from an interconnect point of view, namely the control of the film stoichiometry, the phase separation observed for off-stoichiometric CuAl and CuAl$_2$, as well as the presence of a nonstoichiometric surface oxide.","sentences":["Cu$_x$Al$_{1-x}$ thin films with $0.2 \\le x \\le 0.7$ have been studied as potential alternatives for the metallization of advanced interconnects.","First-principles simulations were used to obtain the Cu$_x$Al$_{1-x}$ electronic structure and cohesive energy to benchmark different intermetallics and their prospects for interconnect metallization.","Next, thin Cu$_x$Al$_{1-x}$ films were deposited by PVD with thicknesses in the range between 3 and 28 nm.","The lowest resistivities of 9.5 $\\mu\\Omega$cm were obtained for 28 nm thick stoichiometric CuAl and CuAl$_2$ after 400$^\\circ$C post-deposition annealing.","Based on the experimental results, we discuss the main challenges for the studied aluminides from an interconnect point of view, namely the control of the film stoichiometry, the phase separation observed for off-stoichiometric CuAl and CuAl$_2$, as well as the presence of a nonstoichiometric surface
oxide."],"url":"http://arxiv.org/abs/2405.02046v1","category":"cond-mat.mtrl-sci"} +{"created":"2024-05-03 11:56:19","title":"The primitive spectrum of C$^*$-algebras of \u00e9tale groupoids with abelian isotropy","abstract":"Given a Hausdorff locally compact \\'etale groupoid $\\G$, we describe as a topological space the part of the primitive spectrum of $C^*(\\G)$ obtained by inducing one-dimensional representations of amenable isotropy groups of $\\G$. When $\\G$ is amenable, second countable, with abelian isotropy groups, our result gives the description of $\\Prim C^*(\\G)$ conjectured by Van~Wyk and Williams. This, in principle, completely determines the ideal structure of a large class of separable C$^*$-algebras, including the transformation group C$^*$-algebras defined by amenable actions of discrete groups with abelian stabilizers and the C$^*$-algebras of higher rank graphs.","sentences":["Given a Hausdorff locally compact \\'etale groupoid $\\G$, we describe as a topological space the part of the primitive spectrum of $C^*(\\G)$ obtained by inducing one-dimensional representations of amenable isotropy groups of $\\G$. When $\\G$ is amenable, second countable, with abelian isotropy groups, our result gives the description of $\\Prim C^*(\\G)$ conjectured by Van~Wyk and Williams.","This, in principle, completely determines the ideal structure of a large class of separable C$^*$-algebras, including the transformation group C$^*$-algebras defined by amenable actions of discrete groups with abelian stabilizers and the C$^*$-algebras of higher rank graphs."],"url":"http://arxiv.org/abs/2405.02025v1","category":"math.OA"} +{"created":"2024-05-03 11:54:16","title":"STX-Vote: Improving Reliability with Bit Voting in Synchronous Transmission-based IoT Networks","abstract":"Industrial Internet of Things (IIoT) networks must meet strict reliability, latency, and low energy consumption requirements. 
However, traditional low-power wireless protocols are ineffective in finding a sweet spot for balancing these performance metrics. Recently, network flooding protocols based on Synchronous Transmissions (STX) have been proposed for better performance in reliability-critical IIoT, where simultaneous transmissions are possible without packet collisions. STX-based protocols can offer a competitive edge over routing-based protocols, particularly in dependability. However, they notably suffer from the beating effect, a physical layer phenomenon that results in sinusoidal interference across a packet and, consequently, packet loss. Thus, we introduce STX-Vote, an error correction scheme that can handle errors caused by beating effects. Importantly, we utilize transmission redundancy already inherent within STX protocols so do not incur additional on-air overhead. Through simulation, we demonstrate STX-Vote can provide a 40% increase in reliability. We subsequently implement STX-Vote on nRF52840-DK devices and perform extensive experiments. 
The results confirm that STX-Vote improves reliability by 25-28% for BLE 5 PHYs and 8% for IEEE 802.15.4; thus, it can complement existing error correction schemes.","sentences":["Industrial Internet of Things (IIoT) networks must meet strict reliability, latency, and low energy consumption requirements.","However, traditional low-power wireless protocols are ineffective in finding a sweet spot for balancing these performance metrics.","Recently, network flooding protocols based on Synchronous Transmissions (STX) have been proposed for better performance in reliability-critical IIoT, where simultaneous transmissions are possible without packet collisions.","STX-based protocols can offer a competitive edge over routing-based protocols, particularly in dependability.","However, they notably suffer from the beating effect, a physical layer phenomenon that results in sinusoidal interference across a packet and, consequently, packet loss.","Thus, we introduce STX-Vote, an error correction scheme that can handle errors caused by beating effects.","Importantly, we utilize transmission redundancy already inherent within STX protocols so do not incur additional on-air overhead.","Through simulation, we demonstrate STX-Vote can provide a 40% increase in reliability.","We subsequently implement STX-Vote on nRF52840-DK devices and perform extensive experiments.","The results confirm that STX-Vote improves reliability by 25-28% for BLE 5 PHYs and 8% for IEEE 802.15.4; thus, it can complement existing error correction schemes."],"url":"http://arxiv.org/abs/2405.02022v1","category":"cs.NI"} +{"created":"2024-05-03 11:17:07","title":"On the (in)consistency of perturbation theory at finite temperature","abstract":"A well-known difficulty of perturbative approaches to quantum field theory at finite temperature is the necessity to address theoretical constraints that are not present in the vacuum theory. 
In this work, we use lattice simulations of scalar correlation functions in massive $\\phi^{4}$ theory to analyse the extent to which these constraints affect the perturbative predictions. We find that the standard perturbative predictions deteriorate even in the absence of infrared divergences at relatively low temperatures, and that this is directly connected to the analytic structure of the propagators used in the expansion. This suggests that the incorporation of non-perturbative thermal effects in the propagators is essential for a consistent perturbative formulation of scalar quantum field theories at finite temperature. By utilising the spectral constraints imposed on finite-temperature correlation functions, we explore how these effects manifest themselves in the lattice data, and discuss why the presence of distinct thermoparticle excitations provides a potential resolution to these issues.","sentences":["A well-known difficulty of perturbative approaches to quantum field theory at finite temperature is the necessity to address theoretical constraints that are not present in the vacuum theory.","In this work, we use lattice simulations of scalar correlation functions in massive $\\phi^{4}$ theory to analyse the extent to which these constraints affect the perturbative predictions.","We find that the standard perturbative predictions deteriorate even in the absence of infrared divergences at relatively low temperatures, and that this is directly connected to the analytic structure of the propagators used in the expansion.","This suggests that the incorporation of non-perturbative thermal effects in the propagators is essential for a consistent perturbative formulation of scalar quantum field theories at finite temperature.","By utilising the spectral constraints imposed on finite-temperature correlation functions, we explore how these effects manifest themselves in the lattice data, and discuss why the presence of distinct thermoparticle excitations 
provides a potential resolution to these issues."],"url":"http://arxiv.org/abs/2405.02009v1","category":"hep-ph"} +{"created":"2024-05-03 11:11:13","title":"Analysing PolSAR data from vegetation by using the subaperture decomposition approach","abstract":"A common assumption in radar remote sensing studies for vegetation is that radar returns originate from a target made up by a set of uniformly distributed isotropic scatterers. Nonetheless, several studies in the literature have noted that orientation effects and heterogeneities have a noticeable impact in backscattering signatures according to the specific vegetation type and sensor frequency. In this paper we have employed the subaperture decomposition technique (i.e. a time-frequency analysis) and the 3-D Barakat degree of polarisation to assess the variation of the volume backscattering power as a function of the azimuth look angle. Three different datasets, i.e. multi-frequency indoor acquisitions over short vegetation samples, and P-band airborne data and L-band satellite data over boreal and tropical forest, respectively, have been employed in this study. We have argued that despite depolarising effects may be only sensed through a small portion of the synthetic aperture, they can lead to overestimated retrievals of the volume scattering for the full resolution image. This has direct implications in the existing model-based and model-free polarimetric SAR decompositions.","sentences":["A common assumption in radar remote sensing studies for vegetation is that radar returns originate from a target made up by a set of uniformly distributed isotropic scatterers.","Nonetheless, several studies in the literature have noted that orientation effects and heterogeneities have a noticeable impact in backscattering signatures according to the specific vegetation type and sensor frequency.","In this paper we have employed the subaperture decomposition technique (i.e.
a time-frequency analysis) and the 3-D Barakat degree of polarisation to assess the variation of the volume backscattering power as a function of the azimuth look angle.","Three different datasets, i.e. multi-frequency indoor acquisitions over short vegetation samples, and P-band airborne data and L-band satellite data over boreal and tropical forest, respectively, have been employed in this study.","We have argued that despite depolarising effects may be only sensed through a small portion of the synthetic aperture, they can lead to overestimated retrievals of the volume scattering for the full resolution image.","This has direct implications in the existing model-based and model-free polarimetric SAR decompositions."],"url":"http://arxiv.org/abs/2405.02007v1","category":"eess.SP"} +{"created":"2024-05-03 10:52:57","title":"Optical spectroscopic and photometric classification of the X-ray transient EP240309a (EP J115415.8-501810) as an intermediate polar","abstract":"We report on optical follow-up observations of an X-ray source initially detected by the Einstein Probe mission. Our investigations categorize the source as an intermediate polar, a class of magnetic cataclysmic variables, exhibiting an orbital period of 3.7614(4) hours and a white dwarf spin period of 3.97 minutes. The orbital period was identified through TESS observations, while our high-speed photometric data, obtained using the 1.9m and Lesedi 1.0m telescopes at the South African Astronomical Observatory, revealed both the spin and beat periods.
Additionally, we present orbitally phase-resolved spectroscopic observations using the 1.9m telescope, specifically centered on the Hbeta emission line, which reveal two emission components that exhibit Doppler variations throughout the orbital cycle.","sentences":["We report on optical follow-up observations of an X-ray source initially detected by the Einstein Probe mission.","Our investigations categorize the source as an intermediate polar, a class of magnetic cataclysmic variables, exhibiting an orbital period of 3.7614(4) hours and a white dwarf spin period of 3.97 minutes.","The orbital period was identified through TESS observations, while our high-speed photometric data, obtained using the 1.9m and Lesedi 1.0m telescopes at the South African Astronomical Observatory, revealed both the spin and beat periods.","Additionally, we present orbitally phase-resolved spectroscopic observations using the 1.9m telescope, specifically centered on the Hbeta emission line, which reveal two emission components that exhibit Doppler variations throughout the orbital cycle."],"url":"http://arxiv.org/abs/2405.01996v1","category":"astro-ph.HE"} +{"created":"2024-05-03 10:04:22","title":"PINT: Maximum-likelihood estimation of pulsar timing noise parameters","abstract":"PINT is a pure-Python framework for high-precision pulsar timing developed on top of widely used and well-tested Python libraries, supporting both interactive and programmatic data analysis workflows. We present a new frequentist framework within PINT to characterize the single-pulsar noise processes present in pulsar timing datasets. This framework enables the parameter estimation for both uncorrelated and correlated noise processes as well as the model comparison between different timing and noise models. We demonstrate the efficacy of the new framework by applying it to simulated datasets as well as a real dataset of PSR B1855+09. 
We also briefly describe the new features implemented in PINT since it was first described in the literature.","sentences":["PINT is a pure-Python framework for high-precision pulsar timing developed on top of widely used and well-tested Python libraries, supporting both interactive and programmatic data analysis workflows.","We present a new frequentist framework within PINT to characterize the single-pulsar noise processes present in pulsar timing datasets.","This framework enables the parameter estimation for both uncorrelated and correlated noise processes as well as the model comparison between different timing and noise models.","We demonstrate the efficacy of the new framework by applying it to simulated datasets as well as a real dataset of PSR B1855+09.","We also briefly describe the new features implemented in PINT since it was first described in the literature."],"url":"http://arxiv.org/abs/2405.01977v1","category":"astro-ph.IM"} +{"created":"2024-05-03 09:43:53","title":"All optical control of bubble and skyrmion breathing","abstract":"Controlling the dynamics of topologically protected spin objects by all optical means promises enormous potential for future spintronic applications. Excitation of bubbles and skyrmions in ferrimagnetic [Fe(0.35 nm)/Gd(0.40 nm)]$_{160}$ multilayers by ultrashort laser pulses leads to a periodic modulation of the core diameter of these spin objects, the so-called breathing mode. We demonstrate versatile amplitude and phase control of this breathing using a double excitation scheme, where the observed dynamics is controlled by the excitation delay. We gain insight into both the time scale on which the breathing mode is launched and the role of the spin object size on the dynamics. 
Our results demonstrate that ultrafast optical excitation allows for precise tuning of the spin dynamics of trivial and non-trivial spin objects, showing a possible control strategy in device applications.","sentences":["Controlling the dynamics of topologically protected spin objects by all optical means promises enormous potential for future spintronic applications.","Excitation of bubbles and skyrmions in ferrimagnetic [Fe(0.35 nm)/Gd(0.40 nm)]$_{160}$ multilayers by ultrashort laser pulses leads to a periodic modulation of the core diameter of these spin objects, the so-called breathing mode.","We demonstrate versatile amplitude and phase control of this breathing using a double excitation scheme, where the observed dynamics is controlled by the excitation delay.","We gain insight into both the time scale on which the breathing mode is launched and the role of the spin object size on the dynamics.","Our results demonstrate that ultrafast optical excitation allows for precise tuning of the spin dynamics of trivial and non-trivial spin objects, showing a possible control strategy in device applications."],"url":"http://arxiv.org/abs/2405.01966v1","category":"cond-mat.mes-hall"} +{"created":"2024-05-03 09:29:28","title":"Enhancement of swimmer diffusion through regular kicks: analytic mapping of a scale independent parameter space","abstract":"Depending on their mechanism of self-propulsion, active particles can exhibit a time-dependent, often periodic, propulsion velocity. The precise propulsion velocity profile determines their mean square displacement and their effective diffusion coefficient at long times. Here we demonstrate that any periodic propulsion profile results in a larger diffusion coefficient than the corresponding case with constant propulsion velocity. We investigate in detail the case of periodic exponentially decaying velocity pulses, expected in propulsion mechanisms based on sudden absorption of finite amounts of energy. 
We show both analytically and with numerical simulations that in these cases the effective diffusion coefficient can be arbitrarily enhanced with respect to the case with constant velocity equal to the average speed. Our results may help interpret in a new light observations on the diffusion enhancement of active particles.","sentences":["Depending on their mechanism of self-propulsion, active particles can exhibit a time-dependent, often periodic, propulsion velocity.","The precise propulsion velocity profile determines their mean square displacement and their effective diffusion coefficient at long times.","Here we demonstrate that any periodic propulsion profile results in a larger diffusion coefficient than the corresponding case with constant propulsion velocity.","We investigate in detail the case of periodic exponentially decaying velocity pulses, expected in propulsion mechanisms based on sudden absorption of finite amounts of energy.","We show both analytically and with numerical simulations that in these cases the effective diffusion coefficient can be arbitrarily enhanced with respect to the case with constant velocity equal to the average speed.","Our results may help interpret in a new light observations on the diffusion enhancement of active particles."],"url":"http://arxiv.org/abs/2405.01954v1","category":"cond-mat.soft"} +{"created":"2024-05-03 08:51:33","title":"Revealing H$_2$O dissociation in WASP-76~b through combined high- and low-resolution transmission spectroscopy","abstract":"Numerous chemical constraints have been possible for exoplanetary atmospheres thanks to high-resolution spectroscopy (HRS) from ground-based facilities as well as low-resolution spectroscopy (LRS) from space. These two techniques have complementary strengths, and hence combined HRS and LRS analyses have the potential for more accurate abundance constraints and increased sensitivity to trace species. 
In this work we retrieve the atmosphere of the ultra-hot Jupiter WASP-76~b, using high-resolution CARMENES/CAHA and low-resolution HST WFC3 and Spitzer observations of the primary eclipse. As such hot planets are expected to have a substantial fraction of H$_2$O dissociated, we conduct retrievals including both H$_2$O and OH. We explore two retrieval models, one with self-consistent treatment of H$_2$O dissociation and another where H$_2$O and OH are vertically-homogeneous. Both models constrain H$_2$O and OH, with H$_2$O primarily detected by LRS and OH through HRS, highlighting the strengths of each technique and demonstrating the need for combined retrievals to fully constrain chemical compositions. We see only a slight preference for the H$_2$O-dissociation model given that the photospheric constraints for both are very similar, indicating $\\log(\\mathrm{OH/H_2O}) = 0.7^{+0.3}_{-0.3}$ at 1.5~mbar, showing that the majority of the H$_2$O in the photosphere is dissociated. However, the bulk O/H and C/O ratios inferred from the models differs significantly, and highlights the challenge of constraining bulk compositions from photospheric abundances with strong vertical chemical gradients. 
Further observations with JWST and ground-based facilities may help shed more light on these processes.","sentences":["Numerous chemical constraints have been possible for exoplanetary atmospheres thanks to high-resolution spectroscopy (HRS) from ground-based facilities as well as low-resolution spectroscopy (LRS) from space.","These two techniques have complementary strengths, and hence combined HRS and LRS analyses have the potential for more accurate abundance constraints and increased sensitivity to trace species.","In this work we retrieve the atmosphere of the ultra-hot Jupiter WASP-76~b, using high-resolution CARMENES/CAHA and low-resolution HST WFC3 and Spitzer observations of the primary eclipse.","As such hot planets are expected to have a substantial fraction of H$_2$O dissociated, we conduct retrievals including both H$_2$O and OH.","We explore two retrieval models, one with self-consistent treatment of H$_2$O dissociation and another where H$_2$O and OH are vertically-homogeneous.","Both models constrain H$_2$O and OH, with H$_2$O primarily detected by LRS and OH through HRS, highlighting the strengths of each technique and demonstrating the need for combined retrievals to fully constrain chemical compositions.","We see only a slight preference for the H$_2$O-dissociation model given that the photospheric constraints for both are very similar, indicating $\\log(\\mathrm{OH/H_2O}) = 0.7^{+0.3}_{-0.3}$ at 1.5~mbar, showing that the majority of the H$_2$O in the photosphere is dissociated.","However, the bulk O/H and C/O ratios inferred from the models differs significantly, and highlights the challenge of constraining bulk compositions from photospheric abundances with strong vertical chemical gradients.","Further observations with JWST and ground-based facilities may help shed more light on these processes."],"url":"http://arxiv.org/abs/2405.01933v1","category":"astro-ph.EP"} +{"created":"2024-05-03 08:23:48","title":"On the computation of moments in 
the Super-Transition-Arrays model for radiative opacity calculations","abstract":"In the Super-Transition-Array statistical method for the computation of radiative opacity of hot dense matter, the moments of the absorption or emission features involve partition functions with reduced degeneracies, occurring through the calculation of averages of products of subshell populations. In the present work, we discuss several aspects of the computation of such peculiar partition functions, insisting on the precautions that must be taken in order to avoid numerical difficulties. In a previous work, we derived a formula for supershell partition functions, which takes the form of a functional of the distribution of energies within the supershell and allows for fast and accurate computations, truncating the number of terms in the expansion. The latter involves coefficients for which we obtained a recursion relation and an explicit formula. We show that such an expansion can be combined with the recurrence relation for shifted partition functions. We also propose, neglecting the effect of fine structure as a first step, a positive-definite formula for the Super-Transition-Array moments of any order, providing an insight into the asymmetry and sharpness of the latter. The corresponding formulas are free of alternating sums. 
Several ways to speed up the calculations are also presented.","sentences":["In the Super-Transition-Array statistical method for the computation of radiative opacity of hot dense matter, the moments of the absorption or emission features involve partition functions with reduced degeneracies, occurring through the calculation of averages of products of subshell populations.","In the present work, we discuss several aspects of the computation of such peculiar partition functions, insisting on the precautions that must be taken in order to avoid numerical difficulties.","In a previous work, we derived a formula for supershell partition functions, which takes the form of a functional of the distribution of energies within the supershell and allows for fast and accurate computations, truncating the number of terms in the expansion.","The latter involves coefficients for which we obtained a recursion relation and an explicit formula.","We show that such an expansion can be combined with the recurrence relation for shifted partition functions.","We also propose, neglecting the effect of fine structure as a first step, a positive-definite formula for the Super-Transition-Array moments of any order, providing an insight into the asymmetry and sharpness of the latter.","The corresponding formulas are free of alternating sums.","Several ways to speed up the calculations are also presented."],"url":"http://arxiv.org/abs/2405.01921v1","category":"physics.atom-ph"}