What is Time Series Analysis? Time series analysis comprises statistical methods for analyzing a sequence of data points collected over an interval of time to identify interesting patterns and trends.
Abstract:In MIMO systems, the presence of phase noise is a significant factor that can degrade performance. For MIMO testbeds build from SDR devices, phase noise cannot be ignored, particular in applications that require phase synchronization. This is especially relevant in MIMO systems that employ digital beamforming, where precise phase alignment is crucial. Accordingly, accurate phase noise modelling of SDR devices is essential. However, the information provided in data sheets for different SDR models varies widely and is often insufficient for comprehensive characterization of their phase noise performance. While numerical simulations of PLL phase noise behavior are documented in the literature, there is a lack of extensive measurements supported by appropriate system modelling. In this work, we present a practical phase noise modeling methodology applied to an SDR from the USRP X310 series. Based on measurement data, we derive estimates of key PLL performance indicators such as cycle-to-cycle jitter, oscillator constants, and PLL bandwidth. Furthermore, we propose a parametric model for the phase noise PSD of the PLL circuit and provide corresponding parameter estimates. This model can be used for further investigation into the impact of phase noise on MIMO system performance implemented by similar SDR devices.
Abstract:With advancements in computing and communication technologies, the Internet of Things (IoT) has seen significant growth. IoT devices typically collect data from various sensors, such as temperature, humidity, and energy meters. Much of this data is temporal in nature. Traditionally, data from IoT devices is centralized for analysis, but this approach introduces delays and increased communication costs. Federated learning (FL) has emerged as an effective alternative, allowing for model training across distributed devices without the need to centralize data. In many applications, such as smart home energy and environmental monitoring, the data collected by IoT devices across different locations can exhibit significant variation in trends and seasonal patterns. Accurately forecasting such non-stationary, non-linear time-series data is crucial for applications like energy consumption estimation and weather forecasting. However, these data variations can severely impact prediction accuracy. The key contributions of this paper are: (1) Investigating how non-linear, non-stationary time-series data distributions, like generalized extreme value (gen-extreme) and log norm distributions, affect FL performance. (2) Analyzing how different detrending techniques for non-linear time-series data influence the forecasting model's performance in a FL setup. We generated several synthetic time-series datasets using non-linear data distributions and trained an LSTM-based forecasting model using both centralized and FL approaches. Additionally, we evaluated the impact of detrending on real-world datasets with non-linear time-series data distributions. Our experimental results show that: (1) FL performs worse than centralized approaches when dealing with non-linear data distributions. (2) The use of appropriate detrending techniques improves FL performance, reducing loss across different data distributions.
Abstract:Progress in a research field can be hard to assess, in particular when many concurrent methods are proposed in a short period of time. This is the case in digital pathology, where many foundation models have been released recently to serve as feature extractors for tile-level images, being used in a variety of downstream tasks, both for tile- and slide-level problems. Benchmarking available methods then becomes paramount to get a clearer view of the research landscape. In particular, in critical domains such as healthcare, a benchmark should not only focus on evaluating downstream performance, but also provide insights about the main differences between methods, and importantly, further consider uncertainty and robustness to ensure a reliable usage of proposed models. For these reasons, we introduce THUNDER, a tile-level benchmark for digital pathology foundation models, allowing for efficient comparison of many models on diverse datasets with a series of downstream tasks, studying their feature spaces and assessing the robustness and uncertainty of predictions informed by their embeddings. THUNDER is a fast, easy-to-use, dynamic benchmark that can already support a large variety of state-of-the-art foundation, as well as local user-defined models for direct tile-based comparison. In this paper, we provide a comprehensive comparison of 23 foundation models on 16 different datasets covering diverse tasks, feature analysis, and robustness. The code for THUNDER is publicly available at https://github.com/MICS-Lab/thunder.
Abstract:Reinforcement Learning (RL) has demonstrated its potential to improve the reasoning ability of Large Language Models (LLMs). One major limitation of most existing Reinforcement Finetuning (RFT) methods is that they are on-policy RL in nature, i.e., data generated during the past learning process is not fully utilized. This inevitably comes at a significant cost of compute and time, posing a stringent bottleneck on continuing economic and efficient scaling. To this end, we launch the renaissance of off-policy RL and propose Reincarnating Mix-policy Proximal Policy Gradient (ReMix), a general approach to enable on-policy RFT methods like PPO and GRPO to leverage off-policy data. ReMix consists of three major components: (1) Mix-policy proximal policy gradient with an increased Update-To-Data (UTD) ratio for efficient training; (2) KL-Convex policy constraint to balance the trade-off between stability and flexibility; (3) Policy reincarnation to achieve a seamless transition from efficient early-stage learning to steady asymptotic improvement. In our experiments, we train a series of ReMix models upon PPO, GRPO and 1.5B, 7B base models. ReMix shows an average Pass@1 accuracy of 52.10% (for 1.5B model) with 0.079M response rollouts, 350 training steps and achieves 63.27%/64.39% (for 7B model) with 0.007M/0.011M response rollouts, 50/75 training steps, on five math reasoning benchmarks (i.e., AIME'24, AMC'23, Minerva, OlympiadBench, and MATH500). Compared with 15 recent advanced models, ReMix shows SOTA-level performance with an over 30x to 450x reduction in training cost in terms of rollout data volume. In addition, we reveal insightful findings via multifaceted analysis, including the implicit preference for shorter responses due to the Whipping Effect of off-policy discrepancy, the collapse mode of self-reflection behavior under the presence of severe off-policyness, etc.
Abstract:This paper tackles the urgent need for efficient energy management in healthcare facilities, where fluctuating demands challenge operational efficiency and sustainability. Traditional methods often prove inadequate, causing inefficiencies and higher costs. To address this, the study presents an AI-based framework combining Long Short-Term Memory (LSTM), genetic algorithm (GA), and SHAP (Shapley Additive Explanations), specifically designed for healthcare energy management. Although LSTM is widely used for time-series forecasting, its application in healthcare energy prediction remains underexplored. The results reveal that LSTM significantly outperforms ARIMA and Prophet models in forecasting complex, non-linear demand patterns. LSTM achieves a Mean Absolute Error (MAE) of 21.69 and Root Mean Square Error (RMSE) of 29.96, far better than Prophet (MAE: 59.78, RMSE: 81.22) and ARIMA (MAE: 87.73, RMSE: 125.22), demonstrating superior performance. The genetic algorithm is applied to optimize model parameters and improve load balancing strategies, enabling adaptive responses to real-time energy fluctuations. SHAP analysis further enhances model transparency by explaining the influence of different features on predictions, fostering trust in decision-making processes. This integrated LSTM-GA-SHAP approach offers a robust solution for improving forecasting accuracy, boosting energy efficiency, and advancing sustainability in healthcare facilities. Future research may explore real-time deployment and hybridization with reinforcement learning for continuous optimization. Overall, the study establishes a solid foundation for using AI in healthcare energy management, highlighting its scalability, efficiency, and resilience potential.
Abstract:Fault diagnosis in Cyber-Physical Systems (CPSs) is essential for ensuring system dependability and operational efficiency by accurately detecting anomalies and identifying their root causes. However, the manual modeling of faulty behaviors often demands extensive domain expertise and produces models that are complex, error-prone, and difficult to interpret. To address this challenge, we present a novel unsupervised fault diagnosis methodology that integrates collective anomaly detection in multivariate time series, process mining, and stochastic simulation. Initially, collective anomalies are detected from low-level sensor data using multivariate time-series analysis. These anomalies are then transformed into structured event logs, enabling the discovery of interpretable process models through process mining. By incorporating timing distributions into the extracted Petri nets, the approach supports stochastic simulation of faulty behaviors, thereby enhancing root cause analysis and behavioral understanding. The methodology is validated using the Robotic Arm Dataset (RoAD), a widely recognized benchmark in smart manufacturing. Experimental results demonstrate its effectiveness in modeling, simulating, and classifying faulty behaviors in CPSs. This enables the creation of comprehensive fault dictionaries that support predictive maintenance and the development of digital twins for industrial environments.
Abstract:Are Large Language Models (LLMs) a new form of strategic intelligence, able to reason about goals in competitive settings? We present compelling supporting evidence. The Iterated Prisoner's Dilemma (IPD) has long served as a model for studying decision-making. We conduct the first ever series of evolutionary IPD tournaments, pitting canonical strategies (e.g., Tit-for-Tat, Grim Trigger) against agents from the leading frontier AI companies OpenAI, Google, and Anthropic. By varying the termination probability in each tournament (the "shadow of the future"), we introduce complexity and chance, confounding memorisation. Our results show that LLMs are highly competitive, consistently surviving and sometimes even proliferating in these complex ecosystems. Furthermore, they exhibit distinctive and persistent "strategic fingerprints": Google's Gemini models proved strategically ruthless, exploiting cooperative opponents and retaliating against defectors, while OpenAI's models remained highly cooperative, a trait that proved catastrophic in hostile environments. Anthropic's Claude emerged as the most forgiving reciprocator, showing remarkable willingness to restore cooperation even after being exploited or successfully defecting. Analysis of nearly 32,000 prose rationales provided by the models reveals that they actively reason about both the time horizon and their opponent's likely strategy, and we demonstrate that this reasoning is instrumental to their decisions. This work connects classic game theory with machine psychology, offering a rich and granular view of algorithmic decision-making under uncertainty.
Abstract:Understanding temporal patterns in online search behavior is crucial for real-time marketing and trend forecasting. Google Trends offers a rich proxy for public interest, yet the high dimensionality and noise of its time-series data present challenges for effective clustering. This study evaluates three unsupervised clustering approaches, Symbolic Aggregate approXimation (SAX), enhanced SAX (eSAX), and Topological Data Analysis (TDA), applied to 20 Google Trends keywords representing major consumer categories. Our results show that while SAX and eSAX offer fast and interpretable clustering for stable time series, they struggle with volatility and complexity, often producing ambiguous ``catch-all'' clusters. TDA, by contrast, captures global structural features through persistent homology and achieves more balanced and meaningful groupings. We conclude with practical guidance for using symbolic and topological methods in consumer analytics and suggest that hybrid approaches combining both perspectives hold strong potential for future applications.
Abstract:Satellite image time-series analysis demands fine-grained spatial-temporal reasoning, which remains a challenge for existing multimodal large language models (MLLMs). In this work, we study the capabilities of MLLMs on a novel task that jointly targets temporal change understanding and future scene generation, aiming to assess their potential for modeling complex multimodal dynamics over time. We propose TAMMs, a Temporal-Aware Multimodal Model for satellite image change understanding and forecasting, which enhances frozen MLLMs with lightweight temporal modules for structured sequence encoding and contextual prompting. To guide future image generation, TAMMs introduces a Semantic-Fused Control Injection (SFCI) mechanism that adaptively combines high-level semantic reasoning and structural priors within an enhanced ControlNet. This dual-path conditioning enables temporally consistent and semantically grounded image synthesis. Experiments demonstrate that TAMMs outperforms strong MLLM baselines in both temporal change understanding and future image forecasting tasks, highlighting how carefully designed temporal reasoning and semantic fusion can unlock the full potential of MLLMs for spatio-temporal understanding.
Abstract:Recurrent neural networks (RNNs), particularly LSTMs, are effective for time-series tasks like sentiment analysis and short-term stock prediction. However, their computational complexity poses challenges for real-time deployment in resource constrained environments. While FPGAs offer a promising platform for energy-efficient AI acceleration, existing tools mainly target feed-forward networks, and LSTM acceleration typically requires full custom implementation. In this paper, we address this gap by leveraging the open-source and extensible FINN framework to enable the generalized deployment of LSTMs on FPGAs. Specifically, we leverage the Scan operator from the Open Neural Network Exchange (ONNX) specification to model the recurrent nature of LSTM computations, enabling support for mixed quantisation within them and functional verification of LSTM-based models. Furthermore, we introduce custom transformations within the FINN compiler to map the quantised ONNX computation graph to hardware blocks from the HLS kernel library of the FINN compiler and Vitis HLS. We validate the proposed tool-flow by training a quantised ConvLSTM model for a mid-price stock prediction task using the widely used dataset and generating a corresponding hardware IP of the model using our flow, targeting the XCZU7EV device. We show that the generated quantised ConvLSTM accelerator through our flow achieves a balance between performance (latency) and resource consumption, while matching (or bettering) inference accuracy of state-of-the-art models with reduced precision. We believe that the generalisable nature of the proposed flow will pave the way for resource-efficient RNN accelerator designs on FPGAs.