Publications
Here is the list of publications grouped by year.
2025
- [SoSym] How fair are we? From conceptualization to automated assessment of fairness definitions. Giordano d’Aloisio, Claudio Di Sipio, Antinisca Di Marco, and Davide Di Ruscio. Software and Systems Modeling, 2025.
Fairness is a critical concept in ethics and social domains, but it is also a challenging property to engineer in software systems. With the increasing use of machine learning in software systems, researchers have been developing techniques to automatically assess the fairness of software systems. Nonetheless, a significant proportion of these techniques rely upon pre-established fairness definitions, metrics, and criteria, which may fail to encompass the wide-ranging needs and preferences of users and stakeholders. To overcome this limitation, we propose a novel approach, called MODNESS, that enables users to customize and define their fairness concepts using a dedicated modeling environment. Our approach guides the user through the definition of new fairness concepts also in emerging domains, and the specification and composition of metrics for its evaluation. Ultimately, MODNESS generates the source code to implement fair assessment based on these custom definitions. In addition, we elucidate the process we followed to collect and analyze relevant literature on fairness assessment in software engineering (SE). We compare MODNESS with the selected approaches and evaluate how they support the distinguishing features identified by our study. Our findings reveal that i) most of the current approaches do not support user-defined fairness concepts; ii) our approach can cover two additional application domains not addressed by currently available tools, i.e., mitigating bias in recommender systems for software engineering and Arduino software component recommendations; iii) MODNESS demonstrates the capability to overcome the limitations of the only two other Model-Driven Engineering-based approaches for fairness assessment.
@article{d2024fair, title = {How fair are we? From conceptualization to automated assessment of fairness definitions}, author = {d'Aloisio, Giordano and Di Sipio, Claudio and Di Marco, Antinisca and Di Ruscio, Davide}, journal = {Software and Systems Modeling}, pages = {1--27}, year = {2025}, publisher = {Springer Berlin Heidelberg}, doi = {https://doi.org/10.1007/s10270-025-01277-2} }
- [FAIRNESS] How Do Generative Models Draw a Software Engineer? A Case Study on Stable Diffusion Bias. Tosin Fadahunsi, Giordano d’Aloisio, Antinisca Di Marco, and Federica Sarro. In Companion of the IEEE/ACM International Conference on Software Analysis, Evolution, and Reengineering, Mar 2025. Best Paper Award.
Generative models are nowadays widely used to generate graphical content used for multiple purposes, e.g. web, art, advertisement. However, it has been shown that the images generated by these models could reinforce societal biases already existing in specific contexts. In this paper, we focus on understanding if this is the case when one generates images related to various software engineering tasks. In fact, the Software Engineering (SE) community is not immune from gender and ethnicity disparities, which could be amplified by the use of these models. Hence, if used without consciousness, artificially generated images could reinforce these biases in the SE domain. Specifically, we perform an extensive empirical evaluation of the gender and ethnicity bias exposed by three versions of the Stable Diffusion (SD) model (a very popular open-source text-to-image model) - SD 2, SD XL, and SD 3 - towards SE tasks. We obtain 6,720 images by feeding each model with two sets of prompts describing different software-related tasks: one set includes the Software Engineer keyword, and one set does not include any specification of the person performing the task. Next, we evaluate the gender and ethnicity disparities in the generated images. Results show how all models are significantly biased towards male figures when representing software engineers. On the contrary, while SD 2 and SD XL are strongly biased towards White figures, SD 3 is slightly more biased towards Asian figures. Nevertheless, all models significantly under-represent Black and Arab figures, regardless of the prompt style used. The results of our analysis highlight severe concerns about adopting those models to generate content for SE tasks and open the field for future research on bias mitigation in this context.
@inproceedings{fadahunsi_how_2025, title = {How {Do} {Generative} {Models} {Draw} a {Software} {Engineer}? {A} {Case} {Study} on {Stable} {Diffusion} {Bias}}, copyright = {All rights reserved}, shorttitle = {How {Do} {Generative} {Models} {Draw} a {Software} {Engineer}?}, url = {https://doi.org/10.48550/arXiv.2501.09014}, booktitle = {Companion of the {IEEE}/{ACM} {International} {Conference} on {Software} {Analysis}, {Evolution}, and {Reengineering}}, urldate = {2025-01-21}, author = {Fadahunsi, Tosin and d'Aloisio, Giordano and Di Marco, Antinisca and Sarro, Federica}, month = mar, year = {2025}, keywords = {Computer Science - Artificial Intelligence, Computer Science - Software Engineering}, }
- [SANER] On the Compression of Language Models for Code: An Empirical Study on CodeBERT. Giordano d’Aloisio, Luca Traini, Federica Sarro, and Antinisca Di Marco. In IEEE/ACM International Conference on Software Analysis, Evolution, and Reengineering, Mar 2025.
Language models have proven successful across a wide range of software engineering tasks, but their significant computational costs often hinder their practical adoption. To address this challenge, researchers have begun applying various compression strategies to improve the efficiency of language models for code. These strategies aim to optimize inference latency and memory usage, though often at the cost of reduced model effectiveness. However, there is still a significant gap in understanding how these strategies influence the efficiency and effectiveness of language models for code. Here, we empirically investigate the impact of three well-known compression strategies – knowledge distillation, quantization, and pruning – across three different classes of software engineering tasks: vulnerability detection, code summarization, and code search. Our findings reveal that the impact of these strategies varies greatly depending on the task and the specific compression method employed. Practitioners and researchers can use these insights to make informed decisions when selecting the most appropriate compression strategy, balancing both efficiency and effectiveness based on their specific needs.
@inproceedings{daloisio_compression_2024, title = {On the {Compression} of {Language} {Models} for {Code}: {An} {Empirical} {Study} on {CodeBERT}}, copyright = {All rights reserved}, shorttitle = {On the {Compression} of {Language} {Models} for {Code}}, booktitle = {{IEEE}/{ACM} {International} {Conference} on {Software} {Analysis}, {Evolution}, and {Reengineering}}, urldate = {2025-03-11}, publisher = {arXiv}, author = {d'Aloisio, Giordano and Traini, Luca and Sarro, Federica and Di Marco, Antinisca}, year = {2025}, keywords = {Computer Science - Artificial Intelligence, Computer Science - Performance, Computer Science - Software Engineering} }
2024
- [ESEM] FRINGE: context-aware FaiRness engineerING in complex software systEms. Fabio Palomba, Andrea Di Sorbo, Davide Di Ruscio, Filomena Ferrucci, Gemma Catolino, Giammaria Giordano, Dario Di Dario, Gianmario Voria, Viviana Pentangelo, Maria Tortorella, Arnaldo Sgueglia, Claudio Di Sipio, Giordano D’Aloisio, and Antinisca Di Marco. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, Mar 2024.
Machine learning (ML) is essential in modern technology, driving complex data-driven decisions. By 2025, daily data generation will exceed 463 exabytes, increasing ML’s influence and ethical risks of data exploitation and discrimination. The European Union’s Artificial Intelligence Act highlights the need for ethical AI solutions. Project Fringe (context-aware FaiRness engineerING in complex software systEms) addresses software fairness in ML-intensive systems that collect data through interconnected devices. Fringe aims to provide software engineers, data scientists, and ML experts with methodologies and software engineering solutions to improve fairness in ML systems. The goals of the project include developing a metamodel for ML fairness, a fairness-aware monitoring infrastructure, contextual solutions for identifying fairness issues, and automated recommendation systems to design fairness properties throughout the software development lifecycle.
@inproceedings{10.1145/3674805.3695394, author = {Palomba, Fabio and Di Sorbo, Andrea and Di Ruscio, Davide and Ferrucci, Filomena and Catolino, Gemma and Giordano, Giammaria and Di Dario, Dario and Voria, Gianmario and Pentangelo, Viviana and Tortorella, Maria and Sgueglia, Arnaldo and Di Sipio, Claudio and D'Aloisio, Giordano and Di Marco, Antinisca}, title = {FRINGE: context-aware FaiRness engineerING in complex software systEms}, year = {2024}, isbn = {9798400710476}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, doi = {https://doi.org/10.1145/3674805.3695394}, booktitle = {Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement}, pages = {608--612}, numpages = {5}, keywords = {Ethical Artificial Intelligence, Software Engineering for Artificial Intelligence, Software fairness engineering}, location = {Barcelona, Spain}, series = {ESEM '24} }
- [ESEM] Exploring LLM-Driven Explanations for Quantum Algorithms. Giordano d’Aloisio, Sophie Fortz, Carol Hanna, Daniel Fortunato, Avner Bensoussan, Eñaut Mendiluze Usandizaga, and Federica Sarro. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, Mar 2024.
Background: Quantum computing is a rapidly growing new programming paradigm that brings significant changes to the design and implementation of algorithms. Understanding quantum algorithms requires knowledge of physics and mathematics, which can be challenging for software developers. Aims: In this work, we provide a first analysis of how LLMs can support developers’ understanding of quantum code. Method: We empirically analyse and compare the quality of explanations provided by three widely adopted LLMs (Gpt3.5, Llama2, and Tinyllama) using two different human-written prompt styles for seven state-of-the-art quantum algorithms. We also analyse how consistent LLM explanations are over multiple rounds and how LLMs can improve existing descriptions of quantum algorithms. Results: Llama2 provides the highest quality explanations from scratch, while Gpt3.5 emerged as the LLM best suited to improve existing explanations. In addition, we show that adding a small amount of context to the prompt significantly improves the quality of explanations. Finally, we observe how explanations are qualitatively and syntactically consistent over multiple rounds. Conclusions: This work highlights promising results, and opens challenges for future research in the field of LLMs for quantum code explanation. Future work includes refining the methods through prompt optimisation and parsing of quantum code explanations, as well as carrying out a systematic assessment of the quality of explanations.
@inproceedings{d2024exploring, title = {Exploring LLM-Driven Explanations for Quantum Algorithms}, author = {d'Aloisio, Giordano and Fortz, Sophie and Hanna, Carol and Fortunato, Daniel and Bensoussan, Avner and Usandizaga, E{\~n}aut Mendiluze and Sarro, Federica}, booktitle = {Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement}, pages = {475--481}, year = {2024}, doi = {https://doi.org/10.1145/3674805.3690753}, }
- [EDTConf] Engineering a Digital Twin for Diagnosis and Treatment of Multiple Sclerosis. Giordano D’Aloisio, Alessandro Di Matteo, Alessia Cipriani, Daniele Lozzi, Enrico Mattei, Gennaro Zanfardino, Antinisca Di Marco, and Giuseppe Placidi. In Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems, Mar 2024.
Multiple sclerosis (MS) is a complex, chronic, and heterogeneous disease of the central nervous system that affects 3 million people globally. The multifactorial nature of MS necessitates an adaptive and personalized approach to diagnosis, monitoring, and treatment. This paper proposes a novel Digital Twin for Multiple Sclerosis (DTMS) designed to integrate diverse data sources, including Magnetic resonance imaging (MRI), clinical biomarkers, and digital health metrics, into a unified predictive model. The DTMS aims to enhance the precision of MS management by providing real-time, individualized insights into disease progression and treatment efficacy. Through a federated learning approach, the DTMS leverages explainable AI to offer reliable and personalized therapeutic recommendations, ultimately striving to delay disability and improve patient outcomes. This comprehensive digital framework represents a significant advancement in the application of AI and digital twins in the field of neurology, promising a more tailored and effective management strategy for MS.
@inproceedings{d2024engineering, title = {Engineering a Digital Twin for Diagnosis and Treatment of Multiple Sclerosis}, author = {D'Aloisio, Giordano and Di Matteo, Alessandro and Cipriani, Alessia and Lozzi, Daniele and Mattei, Enrico and Zanfardino, Gennaro and Di Marco, Antinisca and Placidi, Giuseppe}, booktitle = {Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems}, pages = {364--369}, year = {2024}, doi = {https://doi.org/10.1145/3652620.3688249} }
- [JSS] Uncovering gender gap in academia: A comprehensive analysis within the software engineering community. Andrea D’Angelo, Giordano d’Aloisio, Francesca Marzi, Antinisca Di Marco, and Giovanni Stilo. Journal of Systems and Software, Mar 2024.
Gender gap in education has gained considerable attention in recent years, as it carries profound implications for the academic community. However, while the problem has been tackled from a student perspective, research is still lacking from an academic point of view. In this work, our main objective is to address this unexplored area by shedding light on the intricate dynamics of gender gap within the Software Engineering (SE) community. To this aim, we first review how the problem of gender gap in the SE community and in academia has been addressed by the literature so far. Results show that men in SE build more tightly-knit clusters but less global co-authorship relations than women, but the networks do not exhibit homophily. Concerning academic promotions, the Software Engineering community presents a higher bias in promotions to Associate Professors and a smaller bias in promotions to Full Professors than the overall Informatics community.
@article{DANGELO2024112162, title = {Uncovering gender gap in academia: A comprehensive analysis within the software engineering community}, journal = {Journal of Systems and Software}, pages = {112162}, year = {2024}, issn = {0164-1212}, doi = {https://doi.org/10.1016/j.jss.2024.112162}, url = {https://www.sciencedirect.com/science/article/pii/S0164121224002073}, author = {D’Angelo, Andrea and d’Aloisio, Giordano and Marzi, Francesca and {Di Marco}, Antinisca and Stilo, Giovanni}, keywords = {Gender gap, Gender bias, Academia, Italy, Informatics, Software engineering} }
- [SSBSE] GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation. Jingzhi Gong, Sisi Li, Giordano d’Aloisio, Zishuo Ding, Yulong Ye, William B Langdon, and Federica Sarro. In International Symposium on Search Based Software Engineering, Mar 2024. Challenge Track Winner.
Tuning the parameters and prompts for improving AI-based text-to-image generation has remained a substantial yet unaddressed challenge. Hence we introduce GreenStableYolo, which improves the parameters and prompts for Stable Diffusion to both reduce GPU inference time and increase image generation quality using NSGA-II and Yolo. Our experiments show that despite a relatively slight trade-off (18%) in image quality compared to StableYolo (which only considers image quality), GreenStableYolo achieves a substantial reduction in inference time (266% less) and a 526% higher hypervolume, thereby advancing the state-of-the-art for text-to-image generation.
@inproceedings{gong2024greenstableyolo, title = {GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation}, author = {Gong, Jingzhi and Li, Sisi and d’Aloisio, Giordano and Ding, Zishuo and Ye, Yulong and Langdon, William B and Sarro, Federica}, booktitle = {International Symposium on Search Based Software Engineering}, pages = {70--76}, year = {2024}, organization = {Springer Nature Switzerland Cham}, doi = {https://doi.org/10.1007/978-3-031-64573-0_7}, url = {https://doi.org/10.1007/978-3-031-64573-0_7} }
- [ICPE] Grammar-Based Anomaly Detection of Microservice Systems Execution Traces. Andrea D’Angelo and Giordano d’Aloisio. In Companion of the 15th ACM/SPEC International Conference on Performance Engineering, Mar 2024. Best Data Challenge Award.
Microservice architectures are a widely adopted architectural pattern for large-scale applications. Given the large adoption of these systems, several works have been proposed to detect performance anomalies by analysing execution traces. However, most of the proposed approaches rely on machine learning (ML) algorithms to detect anomalies. While ML methods may be effective in detecting anomalies, the training and deployment of these systems has been shown to be less efficient in terms of time, computational resources, and energy required. In this paper, we propose a novel approach based on context-free grammars for anomaly detection of microservice systems execution traces. We employ the SAX encoding to transform execution traces into strings. Then, we select strings encoding anomalies, and for each possible anomaly, we build a context-free grammar using the Sequitur grammar induction algorithm. We test our approach on two real-world datasets and compare it with a Logistic Regression classifier. We show that our approach is more efficient in terms of training time (15 seconds), with a minimum loss in effectiveness of 5% compared to the Logistic Regression baseline.
@inproceedings{10.1145/3629527.3651844, author = {D'Angelo, Andrea and d'Aloisio, Giordano}, title = {Grammar-Based Anomaly Detection of Microservice Systems Execution Traces}, year = {2024}, isbn = {9798400704451}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3629527.3651844}, doi = {https://doi.org/10.1145/3629527.3651844}, booktitle = {Companion of the 15th ACM/SPEC International Conference on Performance Engineering}, pages = {77--81}, numpages = {5}, keywords = {anomaly detection, context-free grammar, execution traces, micro service system}, location = {London, United Kingdom}, series = {ICPE '24 Companion} }
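The SAX encoding step mentioned in the abstract above can be sketched as follows. This is only an illustrative, textbook-style version of SAX (z-normalisation, piecewise aggregate approximation, then discretisation against Gaussian breakpoints), not the authors' implementation. The function name `sax_encode` and its parameters are assumptions made for this example, and the Sequitur grammar-induction step is not covered.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def sax_encode(series, n_segments, alphabet_size):
    """Encode a numeric series as a SAX string (hypothetical helper, sketch only)."""
    if n_segments <= 0 or len(series) % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")

    # 1. Z-normalise the series (a constant series maps to all zeros).
    mean, std = fmean(series), pstdev(series)
    z = [(x - mean) / std if std > 0 else 0.0 for x in series]

    # 2. Piecewise Aggregate Approximation: average equal-width segments.
    width = len(z) // n_segments
    paa = [fmean(z[i:i + width]) for i in range(0, len(z), width)]

    # 3. Breakpoints that split the standard normal into equiprobable regions.
    normal = NormalDist()
    breakpoints = [normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]

    # 4. Map each segment mean to the letter of the region it falls into.
    return "".join(chr(ord("a") + bisect_right(breakpoints, v)) for v in paa)


# A steadily increasing series visits each region once, in order.
print(sax_encode([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4, alphabet_size=4))  # prints "abcd"
```

In a setting like the one described, each execution trace (for example, a sequence of response times) would be turned into such a string before any grammar is induced; the segment count and alphabet size used here are arbitrary choices for the example.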
2023
- [ECSA] Data-Driven Analysis of Gender Fairness in the Software Engineering Academic Landscape. Giordano d’Aloisio, Andrea D’Angelo, Francesca Marzi, Diana Di Marco, Giovanni Stilo, and Antinisca Di Marco. In European Conference on Software Architecture - ECSA, Mar 2023.
@inproceedings{d2023data, title = {Data-Driven Analysis of Gender Fairness in the Software Engineering Academic Landscape}, author = {d'Aloisio, Giordano and D'Angelo, Andrea and Marzi, Francesca and Di Marco, Diana and Stilo, Giovanni and Di Marco, Antinisca}, booktitle = {European Conference on Software Architecture - ECSA}, year = {2023} }
- [QUALIFIER@ECSA] Towards a Prediction of Machine Learning Training Time to Support Continuous Learning Systems Development. Francesca Marzi, Giordano d’Aloisio, Antinisca Di Marco, and Giovanni Stilo. In International Workshop on Quality in Software Architecture - QUALIFIER, Mar 2023.
@inproceedings{marzi2023towards, title = {Towards a Prediction of Machine Learning Training Time to Support Continuous Learning Systems Development}, author = {Marzi, Francesca and d'Aloisio, Giordano and Di Marco, Antinisca and Stilo, Giovanni}, booktitle = {International Workshop on Quality in Software Architecture - QUALIFIER}, year = {2023} }
- [FASE] Democratizing Quality-Based Machine Learning Development through Extended Feature Models. Giordano d’Aloisio, Antinisca Di Marco, and Giovanni Stilo. In Fundamental Approaches to Software Engineering, Mar 2023.
ML systems have become an essential tool for experts of many domains, data scientists and researchers, allowing them to find answers to many complex business questions starting from raw datasets. Nevertheless, the development of ML systems able to satisfy the stakeholders’ needs requires an appropriate amount of knowledge about the ML domain. Over the years, several solutions have been proposed to automate the development of ML systems. However, an approach taking into account the new quality concerns needed by ML systems (like fairness, interpretability, privacy, and others) is still missing.
@inproceedings{daloisio_democratizing_2023, address = {Cham}, series = {Lecture {Notes} in {Computer} {Science}}, title = {Democratizing {Quality}-{Based} {Machine} {Learning} {Development} through {Extended} {Feature} {Models}}, copyright = {All rights reserved}, isbn = {978-3-031-30826-0}, doi = {https://doi.org/10.1007/978-3-031-30826-0_5}, language = {en}, booktitle = {Fundamental {Approaches} to {Software} {Engineering}}, publisher = {Springer Nature Switzerland}, author = {d’Aloisio, Giordano and Di Marco, Antinisca and Stilo, Giovanni}, editor = {Lambers, Leen and Uchitel, Sebastián}, year = {2023}, keywords = {Feature Models, Low-code development, Machine Learning System, Software Product Line, Software Quality}, pages = {88--110}, }
- [IP&M] Debiaser for Multiple Variables to enhance fairness in classification tasks. Giordano d’Aloisio, Andrea D’Angelo, Antinisca Di Marco, and Giovanni Stilo. Information Processing & Management, Mar 2023.
Nowadays, ensuring that search and recommendation systems are fair and do not discriminate against any population group has become of paramount importance. This is also highlighted by some of the sustainable development goals proposed by the United Nations. Those systems typically rely on machine learning algorithms that solve the classification task. Although the problem of fairness has been widely addressed in binary classification, fairness in multi-class classification problems needs further investigation, as well-established solutions are lacking. For these reasons, in this paper, we present the Debiaser for Multiple Variables (DEMV), an approach able to mitigate unbalanced groups bias (i.e., bias caused by an unequal distribution of instances in the population) in both binary and multi-class classification problems with multiple sensitive variables. The proposed method is compared, under several conditions, with a set of well-established baselines using different categories of classifiers. First, we conduct a specific study to identify the best generation strategies and their impact on DEMV’s ability to improve fairness. Then, we evaluate our method on a heterogeneous set of datasets and show how it outperforms the established algorithms of the literature in the multi-class classification setting and in the binary classification setting when more than two sensitive variables are involved. Finally, based on the conducted experiments, we discuss strengths and weaknesses of our method and of the other baselines.
@article{daloisio_debiaser_2023, title = {Debiaser for {Multiple} {Variables} to enhance fairness in classification tasks}, volume = {60}, copyright = {All rights reserved}, issn = {0306-4573}, url = {https://www.sciencedirect.com/science/article/pii/S0306457322003272}, doi = {https://doi.org/10.1016/j.ipm.2022.103226}, language = {en}, number = {2}, urldate = {2022-12-22}, journal = {Information Processing & Management}, author = {d’Aloisio, Giordano and D’Angelo, Andrea and Di Marco, Antinisca and Stilo, Giovanni}, year = {2023}, keywords = {Machine learning, Multi-class classification, Preprocessing algorithm, Bias and Fairness, Equality}, pages = {103226} }
- [IJDRR] The toolkit disaster preparedness for pre-disaster planning. Donato Di Ludovico, Chiara Capannolo, and Giordano d’Aloisio. International Journal of Disaster Risk Reduction, Mar 2023.
The University of L’Aquila “Territori Aperti” (Open Territories) project deals with the topics of prevention and management of natural disasters and the reconstruction and development processes in the affected areas. One of its tasks is developing research on the Toolkit Disaster Preparedness (TDP) aimed at Pre-Disaster Planning. The TDP is structured in this study as a support for the construction of Recovery Strategies and Actions, and concerns the collection and analysis of good practices on post-disaster reconstruction management (Experience Sheets (ESs)), their elaboration into Disaster Preparedness Recommendation Sheets (DPRSs), and the transposition of these into the Recovery Plan. The methodology for the construction of the Recovery Plan was structured in two macro-activities. The first concerns structuring the Toolkit and the related set of sheets (ESs→DPRSs). The second concerns the transfer of the DPRSs to the Recovery Strategies, so that the recommendations and success measures of the former become the actions of the latter. The Toolkit methodology was applied to the case studies of the Abruzzo 2009 earthquake and the Central Italy 2016-17 earthquake. The next steps of the research will concern testing the methodology in the second macro-activity, i.e. the construction of the Recovery Plan, again in the territorial context of the two aforementioned areas.
@article{di_ludovico_toolkit_2023, title = {The toolkit disaster preparedness for pre-disaster planning}, volume = {96}, copyright = {All rights reserved}, issn = {2212-4209}, url = {https://www.sciencedirect.com/science/article/pii/S2212420923003692}, doi = {https://doi.org/10.1016/j.ijdrr.2023.103889}, language = {en}, journal = {International Journal of Disaster Risk Reduction}, author = {Di Ludovico, Donato and Capannolo, Chiara and d'Aloisio, Giordano}, year = {2023}, keywords = {Disasters, Pre-disaster planning, Preparedness, Recovery, Resilience, Toolkit}, pages = {103889} }
- [RRRR] A Decision Tree to Shepherd Scientists through Data Retrievability. Andrea Bianchi, Giordano d’Aloisio, Francesca Marzi, and Antinisca Di Marco. In Second Workshop on Reproducibility and Replication of Research Results, Mar 2023.
Reproducibility is a crucial aspect of scientific research that involves the ability to independently replicate experimental results by analysing the same data or repeating the same experiment. Over the years, many works have been proposed to make the results of experiments actually reproducible. However, very few address the importance of data reproducibility, defined as the ability of independent researchers to retain the same dataset used as input for experimentation. Properly addressing the problem of data reproducibility is crucial because often just providing a link to the data is not enough to make the results reproducible. In fact, proper metadata (e.g., preprocessing instructions) must also be provided to make a dataset fully reproducible. In this work, our aim is to fill this gap by proposing a decision tree to shepherd researchers through the reproducibility of their datasets. In particular, this decision tree guides researchers through identifying whether the dataset is actually reproducible and whether additional metadata (i.e., additional resources needed to reproduce the data) must also be provided. This decision tree will be the foundation of a future application that will automate the data reproduction process by automatically providing the necessary metadata based on the particular context (e.g., data availability, data preprocessing, and so on). It is worth noting that, in this paper, we detail the steps to make a dataset retrievable, while we will detail other crucial aspects for reproducibility (e.g., dataset documentation) in future works.
@inproceedings{bianchi2023decision, title = {A Decision Tree to Shepherd Scientists through Data Retrievability}, author = {Bianchi, Andrea and d'Aloisio, Giordano and Marzi, Francesca and Di Marco, Antinisca}, booktitle = {Second Workshop on Reproducibility and Replication of Research Results}, year = {2023}, doi = {https://doi.org/10.48550/arXiv.2304.05767} }
2022
- [BIAS@ECIR] Enhancing Fairness in Classification Tasks with Multiple Variables: A Data- and Model-Agnostic Approach. Giordano d’Aloisio, Giovanni Stilo, Antinisca Di Marco, and Andrea D’Angelo. In Advances in Bias and Fairness in Information Retrieval, Mar 2022.
Nowadays, ensuring that search and recommendation systems are fair and do not discriminate against any population group has become of paramount importance. Those systems typically rely on machine learning algorithms that solve the classification task. Although the problem of fairness has been widely addressed in binary classification, fairness in multi-class classification problems needs further investigation, as well-established solutions are lacking. For these reasons, in this paper, we present the Debiaser for Multiple Variables, a novel approach able to enhance fairness in both binary and multi-class classification problems. The proposed method is compared, under several conditions, with the well-established baseline. We evaluate our method on a heterogeneous data set and show how it outperforms the established algorithms in the multi-class classification setting, while maintaining good performance in binary classification. Finally, we present some limitations and future improvements.
@inproceedings{10.1007/978-3-031-09316-6_11, author = {d'Aloisio, Giordano and Stilo, Giovanni and Di Marco, Antinisca and D'Angelo, Andrea}, editor = {Boratto, Ludovico and Faralli, Stefano and Marras, Mirko and Stilo, Giovanni}, title = {Enhancing Fairness in Classification Tasks with Multiple Variables: A Data- and Model-Agnostic Approach}, booktitle = {Advances in Bias and Fairness in Information Retrieval}, year = {2022}, publisher = {Springer International Publishing}, address = {Cham}, pages = {117--129}, isbn = {978-3-031-09316-6}, doi = {https://doi.org/10.1007/978-3-031-09316-6_11}, }
- [ICSE-DS] Quality-Driven Machine Learning-based Data Science Pipeline Realization: a software engineering approach. Giordano d’Aloisio. In 2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), Mar 2022.
The recent wide adoption of data science approaches to decision making in several application domains (such as health, business and even education) opens new challenges in the engineering and implementation of these systems. Considering the big picture of data science, machine learning is the most widely used technique and, due to its characteristics, we believe that better engineering methodologies and tools are needed to realize innovative data-driven systems able to satisfy the emerging quality attributes (such as debiasing and fairness, explainability, privacy and ethics, sustainability). This research project will explore the following three pillars: i) identify key quality attributes, formalize them in the context of data science pipelines and study their relationships; ii) define a new software engineering approach for data-science systems development that assures compliance with quality requirements; iii) implement tools that guide IT professionals and researchers in the realization of ML-based data science pipelines, starting from requirements engineering. Moreover, in this paper we also present some details of the project, showing how feature models and model-driven engineering can be leveraged to realize it.
@inproceedings{9793779, author = {d’Aloisio, Giordano}, booktitle = {2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)}, title = {Quality-Driven Machine Learning-based Data Science Pipeline Realization: a software engineering approach}, year = {2022}, pages = {291--293}, doi = {https://doi.org/10.1109/ICSE-Companion55297.2022.9793779}, }
- Preprint: Modeling Quality and Machine Learning Pipelines through Extended Feature Models. Giordano d’Aloisio, Antinisca Di Marco, and Giovanni Stilo. Mar 2022. arXiv:2207.07528 [cs]
The recent increase in the complexity of Machine Learning (ML) methods has made it necessary to lighten both research and industry development processes. ML pipelines have become an essential tool for experts of many domains, data scientists, and researchers, allowing them to easily put together several ML models to cover the full analytic process starting from raw datasets. Over the years, several solutions have been proposed to automate the building of ML pipelines, most of them focused on semantic aspects and characteristics of the input dataset. However, an approach taking into account the new quality concerns needed by ML systems (like fairness, interpretability, privacy, etc.) is still missing. In this paper, we first identify, from the literature, key quality attributes of ML systems. Further, we propose a new engineering approach for quality ML pipelines by properly extending the Feature Models meta-model. The presented approach allows modeling ML pipelines, their quality requirements (on the whole pipeline and on single phases), and the quality characteristics of the algorithms used to implement each pipeline phase. Finally, we demonstrate the expressiveness of our model considering the classification problem.
@misc{daloisio_modeling_2022, title = {Modeling {Quality} and {Machine} {Learning} {Pipelines} through {Extended} {Feature} {Models}}, copyright = {All rights reserved}, booktitle = {arXiv preprint}, urldate = {2022-07-19}, publisher = {arXiv}, author = {d'Aloisio, Giordano and Di Marco, Antinisca and Stilo, Giovanni}, year = {2022}, note = {arXiv:2207.07528 [cs]}, keywords = {Computer Science - Machine Learning, Computer Science - Software Engineering} }
2021
- ILOG: SismaDL: an ontology to represent post-disaster regulation. Francesca Caroccia, Damiano D’Agostino, Giordano d’Aloisio, Antinisca Di Marco, and Giovanni Stilo. In 12th Workshop on Information Logistics and Digital Transformation, Mar 2021
The emergency caused by a natural disaster must be tackled promptly by public institutions. In this situation, Governments enact specific laws (i.e., decrees) to handle the emergency and the reconstruction of destroyed areas. This happened in 2009 and 2016, when the Italian Government issued several, very different, decrees to face the earthquakes of L’Aquila and Centro Italia, respectively. In this work, we propose SismaDL, an LKIF-based ontology that models the laws in the domain of natural disasters. SismaDL has been used to model the aforementioned laws to build a knowledge base useful to reason about why one regulation is less effective and efficient than the other. SismaDL is the first step of a wider project whose aims are to: i) compare laws in the domain of natural disasters; ii) integrate such laws in the Semantic Web; iii) evaluate the effectiveness of a post-disaster reconstruction law; iv) identify good practices to build a reference normative model of natural disaster regulation. This project is a founding step towards the development of accurate and timely IT systems for efficient and high-quality disaster management and reconstruction services to support Governments and local institutions in case of natural disasters.
@inproceedings{caroccia_sismadl_nodate, title = {{SismaDL}: an ontology to represent post-disaster regulation}, language = {en}, author = {Caroccia, Francesca and D'Agostino, Damiano and d'Aloisio, Giordano and Di Marco, Antinisca and Stilo, Giovanni}, pages = {14}, booktitle = {12th Workshop on Information Logistics and Digital Transformation}, year = {2021} }
2025
- FAIRNESS: How Do Generative Models Draw a Software Engineer? A Case Study on Stable Diffusion Bias. Tosin Fadahunsi, Giordano d’Aloisio, Antinisca Di Marco, and Federica Sarro. In Companion of the IEEE/ACM International Conference on Software Analysis, Evolution, and Reengineering, Mar 2025. Best Paper Award
Generative models are nowadays widely used to generate graphical content used for multiple purposes, e.g. web, art, advertisement. However, it has been shown that the images generated by these models could reinforce societal biases already existing in specific contexts. In this paper, we focus on understanding if this is the case when one generates images related to various software engineering tasks. In fact, the Software Engineering (SE) community is not immune from gender and ethnicity disparities, which could be amplified by the use of these models. Hence, if used without consciousness, artificially generated images could reinforce these biases in the SE domain. Specifically, we perform an extensive empirical evaluation of the gender and ethnicity bias exposed by three versions of the Stable Diffusion (SD) model (a very popular open-source text-to-image model) - SD 2, SD XL, and SD 3 - towards SE tasks. We obtain 6,720 images by feeding each model with two sets of prompts describing different software-related tasks: one set includes the Software Engineer keyword, and one set does not include any specification of the person performing the task. Next, we evaluate the gender and ethnicity disparities in the generated images. Results show how all models are significantly biased towards male figures when representing software engineers. On the contrary, while SD 2 and SD XL are strongly biased towards White figures, SD 3 is slightly more biased towards Asian figures. Nevertheless, all models significantly under-represent Black and Arab figures, regardless of the prompt style used. The results of our analysis highlight severe concerns about adopting those models to generate content for SE tasks and open the field for future research on bias mitigation in this context.
@inproceedings{fadahunsi_how_2025, title = {How {Do} {Generative} {Models} {Draw} a {Software} {Engineer}? {A} {Case} {Study} on {Stable} {Diffusion} {Bias}}, copyright = {All rights reserved}, shorttitle = {How {Do} {Generative} {Models} {Draw} a {Software} {Engineer}?}, url = {https://doi.org/10.48550/arXiv.2501.09014}, booktitle = {Companion of the {IEEE}/{ACM} {International} {Conference} on {Software} {Analysis}, {Evolution}, and {Reengineering}}, urldate = {2025-01-21}, author = {Fadahunsi, Tosin and d'Aloisio, Giordano and Di Marco, Antinisca and Sarro, Federica}, month = mar, year = {2025}, keywords = {Computer Science - Artificial Intelligence, Computer Science - Software Engineering}, }
2023
- QUALIFIER@ECSA: Towards a Prediction of Machine Learning Training Time to Support Continuous Learning Systems Development. Francesca Marzi, Giordano d’Aloisio, Antinisca Di Marco, and Giovanni Stilo. In International Workshop on Quality in Software Architecture - QUALIFIER, Mar 2023
@inproceedings{marzi2023towards, title = {Towards a Prediction of Machine Learning Training Time to Support Continuous Learning Systems Development}, author = {Marzi, Francesca and d'Aloisio, Giordano and Di Marco, Antinisca and Stilo, Giovanni}, booktitle = {International Workshop on Quality in Software Architecture - QUALIFIER}, year = {2023} }
- RRRR: A Decision Tree to Shepherd Scientists through Data Retrievability. Andrea Bianchi, Giordano d’Aloisio, Francesca Marzi, and Antinisca Di Marco. In Second Workshop on Reproducibility and Replication of Research Results, Mar 2023
Reproducibility is a crucial aspect of scientific research that involves the ability to independently replicate experimental results by analysing the same data or repeating the same experiment. Over the years, many works have been proposed to make the results of the experiments actually reproducible. However, very few address the importance of data reproducibility, defined as the ability of independent researchers to retain the same dataset used as input for experimentation. Properly addressing the problem of data reproducibility is crucial because often just providing a link to the data is not enough to make the results reproducible. In fact, proper metadata (e.g., preprocessing instructions) must also be provided to make a dataset fully reproducible. In this work, our aim is to fill this gap by proposing a decision tree to shepherd researchers through the reproducibility of their datasets. In particular, this decision tree guides researchers through identifying if the dataset is actually reproducible and if additional metadata (i.e., additional resources needed to reproduce the data) must also be provided. This decision tree will be the foundation of a future application that will automate the data reproduction process by automatically providing the necessary metadata based on the particular context (e.g., data availability, data preprocessing, and so on). It is worth noting that, in this paper, we detail the steps to make a dataset retrievable, while we will detail other crucial aspects for reproducibility (e.g., dataset documentation) in future works.
@inproceedings{bianchi2023decision, title = {A Decision Tree to Shepherd Scientists through Data Retrievability}, author = {Bianchi, Andrea and d'Aloisio, Giordano and Marzi, Francesca and Di Marco, Antinisca}, booktitle = {Second Workshop on Reproducibility and Replication of Research Results}, year = {2023}, doi = {https://doi.org/10.48550/arXiv.2304.05767} }
2022
- BIAS@ECIR: Enhancing Fairness in Classification Tasks with Multiple Variables: A Data- and Model-Agnostic Approach. Giordano d’Aloisio, Giovanni Stilo, Antinisca Di Marco, and Andrea D’Angelo. In Advances in Bias and Fairness in Information Retrieval, Mar 2022
Nowadays assuring that search and recommendation systems are fair and do not apply discrimination among any kind of population has become of paramount importance. Those systems typically rely on machine learning algorithms that solve the classification task. Although the problem of fairness has been widely addressed in binary classification, unfortunately, the fairness of multi-class classification problem needs to be further investigated lacking well-established solutions. For the aforementioned reasons, in this paper, we present the Debiaser for Multiple Variables, a novel approach able to enhance fairness in both binary and multi-class classification problems. The proposed method is compared, under several conditions, with the well-established baseline. We evaluate our method on a heterogeneous data set and prove how it overcomes the established algorithms in the multi-classification setting, while maintaining good performances in binary classification. Finally, we present some limitations and future improvements.
@inproceedings{10.1007/978-3-031-09316-6_11, author = {d'Aloisio, Giordano and Stilo, Giovanni and Di Marco, Antinisca and D'Angelo, Andrea}, editor = {Boratto, Ludovico and Faralli, Stefano and Marras, Mirko and Stilo, Giovanni}, title = {Enhancing Fairness in Classification Tasks with Multiple Variables: A Data- and Model-Agnostic Approach}, booktitle = {Advances in Bias and Fairness in Information Retrieval}, year = {2022}, publisher = {Springer International Publishing}, address = {Cham}, pages = {117--129}, isbn = {978-3-031-09316-6}, doi = {https://doi.org/10.1007/978-3-031-09316-6_11}, }
2025
- SANER: On the Compression of Language Models for Code: An Empirical Study on CodeBERT. Giordano d’Aloisio, Luca Traini, Federica Sarro, and Antinisca Di Marco. In IEEE/ACM International Conference on Software Analysis, Evolution, and Reengineering, 2025
Language models have proven successful across a wide range of software engineering tasks, but their significant computational costs often hinder their practical adoption. To address this challenge, researchers have begun applying various compression strategies to improve the efficiency of language models for code. These strategies aim to optimize inference latency and memory usage, though often at the cost of reduced model effectiveness. However, there is still a significant gap in understanding how these strategies influence the efficiency and effectiveness of language models for code. Here, we empirically investigate the impact of three well-known compression strategies – knowledge distillation, quantization, and pruning – across three different classes of software engineering tasks: vulnerability detection, code summarization, and code search. Our findings reveal that the impact of these strategies varies greatly depending on the task and the specific compression method employed. Practitioners and researchers can use these insights to make informed decisions when selecting the most appropriate compression strategy, balancing both efficiency and effectiveness based on their specific needs.
@inproceedings{daloisio_compression_2024, title = {On the {Compression} of {Language} {Models} for {Code}: {An} {Empirical} {Study} on {CodeBERT}}, copyright = {All rights reserved}, shorttitle = {On the {Compression} of {Language} {Models} for {Code}}, booktitle = {{IEEE}/{ACM} {International} {Conference} on {Software} {Analysis}, {Evolution}, and {Reengineering}}, urldate = {2025-03-11}, publisher = {arXiv}, author = {d'Aloisio, Giordano and Traini, Luca and Sarro, Federica and Di Marco, Antinisca}, year = {2025}, keywords = {Computer Science - Artificial Intelligence, Computer Science - Performance, Computer Science - Software Engineering} }
2024
- ESEM: FRINGE: context-aware FaiRness engineerING in complex software systEms. Fabio Palomba, Andrea Di Sorbo, Davide Di Ruscio, Filomena Ferrucci, Gemma Catolino, Giammaria Giordano, Dario Di Dario, Gianmario Voria, Viviana Pentangelo, Maria Tortorella, Arnaldo Sgueglia, Claudio Di Sipio, Giordano D’Aloisio, and Antinisca Di Marco. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2024
Machine learning (ML) is essential in modern technology, driving complex data-driven decisions. By 2025, daily data generation will exceed 463 exabytes, increasing ML’s influence and ethical risks of data exploitation and discrimination. The European Union’s Artificial Intelligence Act highlights the need for ethical AI solutions. Project Fringe (context-aware FaiRness engineerING in complex software systEms) addresses software fairness in ML-intensive systems that collect data through interconnected devices. Fringe aims to provide software engineers, data scientists, and ML experts with methodologies and software engineering solutions to improve fairness in ML systems. The goals of the project include developing a metamodel for ML fairness, a fairness-aware monitoring infrastructure, contextual solutions for identifying fairness issues, and automated recommendation systems to design fairness properties throughout the software development lifecycle.
@inproceedings{10.1145/3674805.3695394, author = {Palomba, Fabio and Di Sorbo, Andrea and Di Ruscio, Davide and Ferrucci, Filomena and Catolino, Gemma and Giordano, Giammaria and Di Dario, Dario and Voria, Gianmario and Pentangelo, Viviana and Tortorella, Maria and Sgueglia, Arnaldo and Di Sipio, Claudio and D'Aloisio, Giordano and Di Marco, Antinisca}, title = {FRINGE: context-aware FaiRness engineerING in complex software systEms}, year = {2024}, isbn = {9798400710476}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, doi = {https://doi.org/10.1145/3674805.3695394}, booktitle = {Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement}, pages = {608--612}, numpages = {5}, keywords = {Ethical Artificial Intelligence, Software Engineering for Artificial Intelligence, Software fairness engineering}, location = {Barcelona, Spain}, series = {ESEM '24} }
- ESEM: Exploring LLM-Driven Explanations for Quantum Algorithms. Giordano d’Aloisio, Sophie Fortz, Carol Hanna, Daniel Fortunato, Avner Bensoussan, Eñaut Mendiluze Usandizaga, and Federica Sarro. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2024
Background: Quantum computing is a rapidly growing new programming paradigm that brings significant changes to the design and implementation of algorithms. Understanding quantum algorithms requires knowledge of physics and mathematics, which can be challenging for software developers. Aims: In this work, we provide a first analysis of how LLMs can support developers’ understanding of quantum code. Method: We empirically analyse and compare the quality of explanations provided by three widely adopted LLMs (Gpt3.5, Llama2, and Tinyllama) using two different human-written prompt styles for seven state-of-the-art quantum algorithms. We also analyse how consistent LLM explanations are over multiple rounds and how LLMs can improve existing descriptions of quantum algorithms. Results: Llama2 provides the highest quality explanations from scratch, while Gpt3.5 emerged as the LLM best suited to improve existing explanations. In addition, we show that adding a small amount of context to the prompt significantly improves the quality of explanations. Finally, we observe how explanations are qualitatively and syntactically consistent over multiple rounds. Conclusions: This work highlights promising results and opens challenges for future research in the field of LLMs for quantum code explanation. Future work includes refining the methods through prompt optimisation and parsing of quantum code explanations, as well as carrying out a systematic assessment of the quality of explanations.
@inproceedings{d2024exploring, title = {Exploring LLM-Driven Explanations for Quantum Algorithms}, author = {d'Aloisio, Giordano and Fortz, Sophie and Hanna, Carol and Fortunato, Daniel and Bensoussan, Avner and Usandizaga, E{\~n}aut Mendiluze and Sarro, Federica}, booktitle = {Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement}, pages = {475--481}, year = {2024}, doi = {https://doi.org/10.1145/3674805.3690753}, }
- EDTConf: Engineering a Digital Twin for Diagnosis and Treatment of Multiple Sclerosis. Giordano D’Aloisio, Alessandro Di Matteo, Alessia Cipriani, Daniele Lozzi, Enrico Mattei, Gennaro Zanfardino, Antinisca Di Marco, and Giuseppe Placidi. In Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems, 2024
Multiple sclerosis (MS) is a complex, chronic, and heterogeneous disease of the central nervous system that affects 3 million people globally. The multifactorial nature of MS necessitates an adaptive and personalized approach to diagnosis, monitoring, and treatment. This paper proposes a novel Digital Twin for Multiple Sclerosis (DTMS) designed to integrate diverse data sources, including Magnetic resonance imaging (MRI), clinical biomarkers, and digital health metrics, into a unified predictive model. The DTMS aims to enhance the precision of MS management by providing real-time, individualized insights into disease progression and treatment efficacy. Through a federated learning approach, the DTMS leverages explainable AI to offer reliable and personalized therapeutic recommendations, ultimately striving to delay disability and improve patient outcomes. This comprehensive digital framework represents a significant advancement in the application of AI and digital twins in the field of neurology, promising a more tailored and effective management strategy for MS.
@inproceedings{d2024engineering, title = {Engineering a Digital Twin for Diagnosis and Treatment of Multiple Sclerosis}, author = {D'Aloisio, Giordano and Di Matteo, Alessandro and Cipriani, Alessia and Lozzi, Daniele and Mattei, Enrico and Zanfardino, Gennaro and Di Marco, Antinisca and Placidi, Giuseppe}, booktitle = {Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems}, pages = {364--369}, year = {2024}, doi = {https://doi.org/10.1145/3652620.3688249} }
- SSBSE: GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation. Jingzhi Gong, Sisi Li, Giordano d’Aloisio, Zishuo Ding, Yulong Ye, William B Langdon, and Federica Sarro. In International Symposium on Search Based Software Engineering, 2024. Challenge Track Winner
Tuning the parameters and prompts for improving AI-based text-to-image generation has remained a substantial yet unaddressed challenge. Hence we introduce GreenStableYolo, which improves the parameters and prompts for Stable Diffusion to both reduce GPU inference time and increase image generation quality using NSGA-II and Yolo. Our experiments show that despite a relatively slight trade-off (18%) in image quality compared to StableYolo (which only considers image quality), GreenStableYolo achieves a substantial reduction in inference time (266% less) and a 526% higher hypervolume, thereby advancing the state-of-the-art for text-to-image generation.
@inproceedings{gong2024greenstableyolo, title = {GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation}, author = {Gong, Jingzhi and Li, Sisi and d’Aloisio, Giordano and Ding, Zishuo and Ye, Yulong and Langdon, William B and Sarro, Federica}, booktitle = {International Symposium on Search Based Software Engineering}, pages = {70--76}, year = {2024}, publisher = {Springer Nature Switzerland}, address = {Cham}, doi = {https://doi.org/10.1007/978-3-031-64573-0_7}, url = {https://doi.org/10.1007/978-3-031-64573-0_7} }
- ICPE: Grammar-Based Anomaly Detection of Microservice Systems Execution Traces. Andrea D’Angelo and Giordano d’Aloisio. In Companion of the 15th ACM/SPEC International Conference on Performance Engineering, 2024. Best Data Challenge Award
Microservice architectures are a widely adopted architectural pattern for large-scale applications. Given the large adoption of these systems, several works have been proposed to detect performance anomalies starting from analysing the execution traces. However, most of the proposed approaches rely on machine learning (ML) algorithms to detect anomalies. While ML methods may be effective in detecting anomalies, the training and deployment of these systems have been shown to be less efficient in terms of time, computational resources, and energy required. In this paper, we propose a novel approach based on Context-free grammar for anomaly detection of microservice systems execution traces. We employ the SAX encoding to transform execution traces into strings. Then, we select strings encoding anomalies, and for each possible anomaly, we build a Context-free grammar using the Sequitur grammar induction algorithm. We test our approach on two real-world datasets and compare it with a Logistic Regression classifier. We show how our approach is more effective in terms of training time of 15 seconds with a minimum loss in effectiveness of 5% compared to the Logistic Regression baseline.
@inproceedings{10.1145/3629527.3651844, author = {D'Angelo, Andrea and d'Aloisio, Giordano}, title = {Grammar-Based Anomaly Detection of Microservice Systems Execution Traces}, year = {2024}, isbn = {9798400704451}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3629527.3651844}, doi = {https://doi.org/10.1145/3629527.3651844}, booktitle = {Companion of the 15th ACM/SPEC International Conference on Performance Engineering}, pages = {77--81}, numpages = {5}, keywords = {anomaly detection, context-free grammar, execution traces, micro service system}, location = {London, United Kingdom}, series = {ICPE '24 Companion} }
2023
- ECSA: Data-Driven Analysis of Gender Fairness in the Software Engineering Academic Landscape. Giordano d’Aloisio, Andrea D’Angelo, Francesca Marzi, Diana Di Marco, Giovanni Stilo, and Antinisca Di Marco. In European Conference on Software Architecture - ECSA, 2023
@inproceedings{d2023data, title = {Data-Driven Analysis of Gender Fairness in the Software Engineering Academic Landscape}, author = {d'Aloisio, Giordano and D'Angelo, Andrea and Marzi, Francesca and Di Marco, Diana and Stilo, Giovanni and Di Marco, Antinisca}, booktitle = {European Conference on Software Architecture - ECSA}, year = {2023} }
- FASE: Democratizing Quality-Based Machine Learning Development through Extended Feature Models. Giordano d’Aloisio, Antinisca Di Marco, and Giovanni Stilo. In Fundamental Approaches to Software Engineering, 2023
ML systems have become an essential tool for experts of many domains, data scientists and researchers, allowing them to find answers to many complex business questions starting from raw datasets. Nevertheless, the development of ML systems able to satisfy the stakeholders’ needs requires an appropriate amount of knowledge about the ML domain. Over the years, several solutions have been proposed to automate the development of ML systems. However, an approach taking into account the new quality concerns needed by ML systems (like fairness, interpretability, privacy, and others) is still missing.
@inproceedings{daloisio_democratizing_2023, address = {Cham}, series = {Lecture {Notes} in {Computer} {Science}}, title = {Democratizing {Quality}-{Based} {Machine} {Learning} {Development} through {Extended} {Feature} {Models}}, copyright = {All rights reserved}, isbn = {978-3-031-30826-0}, doi = {https://doi.org/10.1007/978-3-031-30826-0_5}, language = {en}, booktitle = {Fundamental {Approaches} to {Software} {Engineering}}, publisher = {Springer Nature Switzerland}, author = {d’Aloisio, Giordano and Di Marco, Antinisca and Stilo, Giovanni}, editor = {Lambers, Leen and Uchitel, Sebastián}, year = {2023}, keywords = {Feature Models, Low-code development, Machine Learning System, Software Product Line, Software Quality}, pages = {88--110}, }
2025
- SoSym: How fair are we? From conceptualization to automated assessment of fairness definitions. Giordano d’Aloisio, Claudio Di Sipio, Antinisca Di Marco, and Davide Di Ruscio. Software and Systems Modeling, 2025
Fairness is a critical concept in ethics and social domains, but it is also a challenging property to engineer in software systems. With the increasing use of machine learning in software systems, researchers have been developing techniques to automatically assess the fairness of software systems. Nonetheless, a significant proportion of these techniques rely upon pre-established fairness definitions, metrics, and criteria, which may fail to encompass the wide-ranging needs and preferences of users and stakeholders. To overcome this limitation, we propose a novel approach, called MODNESS, that enables users to customize and define their fairness concepts using a dedicated modeling environment. Our approach guides the user through the definition of new fairness concepts also in emerging domains, and the specification and composition of metrics for its evaluation. Ultimately, MODNESS generates the source code to implement fair assessment based on these custom definitions. In addition, we elucidate the process we followed to collect and analyze relevant literature on fairness assessment in software engineering (SE). We compare MODNESS with the selected approaches and evaluate how they support the distinguishing features identified by our study. Our findings reveal that i) most of the current approaches do not support user-defined fairness concepts; ii) our approach can cover two additional application domains not addressed by currently available tools, i.e., mitigating bias in recommender systems for software engineering and Arduino software component recommendations; iii) MODNESS demonstrates the capability to overcome the limitations of the only two other Model-Driven Engineering-based approaches for fairness assessment.
@article{d2024fair, title = {How fair are we? From conceptualization to automated assessment of fairness definitions}, author = {d'Aloisio, Giordano and Di Sipio, Claudio and Di Marco, Antinisca and Di Ruscio, Davide}, journal = {Software and Systems Modeling}, pages = {1--27}, year = {2025}, publisher = {Springer Berlin Heidelberg}, doi = {https://doi.org/10.1007/s10270-025-01277-2} }
2024
- JSS: Uncovering gender gap in academia: A comprehensive analysis within the software engineering community. Andrea D’Angelo, Giordano d’Aloisio, Francesca Marzi, Antinisca Di Marco, and Giovanni Stilo. Journal of Systems and Software, 2024
Gender gap in education has gained considerable attention in recent years, as it carries profound implications for the academic community. However, while the problem has been tackled from a student perspective, research is still lacking from an academic point of view. In this work, our main objective is to address this unexplored area by shedding light on the intricate dynamics of gender gap within the Software Engineering (SE) community. To this aim, we first review how the problem of gender gap in the SE community and in academia has been addressed by the literature so far. Results show that men in SE build more tightly knit clusters but fewer global co-authorship relations than women, although the networks do not exhibit homophily. Concerning academic promotions, the Software Engineering community presents a higher bias in promotions to Associate Professors and a smaller bias in promotions to Full Professors than the overall Informatics community.
@article{DANGELO2024112162, title = {Uncovering gender gap in academia: A comprehensive analysis within the software engineering community}, journal = {Journal of Systems and Software}, pages = {112162}, year = {2024}, issn = {0164-1212}, doi = {https://doi.org/10.1016/j.jss.2024.112162}, url = {https://www.sciencedirect.com/science/article/pii/S0164121224002073}, author = {D’Angelo, Andrea and d’Aloisio, Giordano and Marzi, Francesca and {Di Marco}, Antinisca and Stilo, Giovanni}, keywords = {Gender gap, Gender bias, Academia, Italy, Informatics, Software engineering} }
2023
- IP&M: Debiaser for Multiple Variables to enhance fairness in classification tasks. Giordano d’Aloisio, Andrea D’Angelo, Antinisca Di Marco, and Giovanni Stilo. Information Processing & Management, 2023
Nowadays, ensuring that search and recommendation systems are fair and do not discriminate against any population group has become of paramount importance. This is also highlighted by some of the sustainable development goals proposed by the United Nations. Those systems typically rely on machine learning algorithms that solve the classification task. Although fairness has been widely addressed in binary classification, the fairness of multi-class classification still lacks well-established solutions and needs further investigation. For these reasons, we present the Debiaser for Multiple Variables (DEMV), an approach able to mitigate unbalanced groups bias (i.e., bias caused by an unequal distribution of instances in the population) in both binary and multi-class classification problems with multiple sensitive variables. The proposed method is compared, under several conditions, with a set of well-established baselines using different categories of classifiers. First, we conduct a dedicated study to identify the best generation strategies and their impact on DEMV’s ability to improve fairness. Then, we evaluate our method on a heterogeneous set of datasets and show that it outperforms established algorithms from the literature in the multi-class classification setting and in the binary classification setting when more than two sensitive variables are involved. Finally, based on the conducted experiments, we discuss the strengths and weaknesses of our method and of the other baselines.
@article{daloisio_debiaser_2023, title = {Debiaser for {Multiple} {Variables} to enhance fairness in classification tasks}, volume = {60}, copyright = {All rights reserved}, issn = {0306-4573}, url = {https://www.sciencedirect.com/science/article/pii/S0306457322003272}, doi = {https://doi.org/10.1016/j.ipm.2022.103226}, language = {en}, number = {2}, urldate = {2022-12-22}, journal = {Information Processing \& Management}, author = {d’Aloisio, Giordano and D’Angelo, Andrea and Di Marco, Antinisca and Stilo, Giovanni}, year = {2023}, keywords = {Machine learning, Multi-class classification, Preprocessing algorithm, Bias and Fairness, Equality}, pages = {103226} }
- IJDRR: The toolkit disaster preparedness for pre-disaster planning. Donato Di Ludovico, Chiara Capannolo, and Giordano d’Aloisio. International Journal of Disaster Risk Reduction, 2023
The University of L’Aquila “Territori Aperti” (Open Territories) project deals with the topics of prevention and management of natural disasters and the reconstruction and development processes in the affected areas. One of its tasks is developing research on the Toolkit Disaster Preparedness (TDP) aimed at Pre-Disaster Planning. The TDP is structured in this study as a support for the construction of Recovery Strategies and Actions, and concerns the collection and analysis of good practices on post-disaster reconstruction management (Experience Sheets (ESs)), their elaboration into Disaster Preparedness Recommendation Sheets (DPRSs), and the transposition of these into the Recovery Plan. The methodology for the construction of the Recovery Plan was structured in two macro-activities. The first concerns structuring the Toolkit and the related set of sheets (ESs→DPRSs). The second concerns the transfer of the DPRSs to the Recovery Strategies, so that the recommendations and success measures of the former become the actions of the latter. The Toolkit methodology was applied to the case studies of the Abruzzo 2009 earthquake and the Central Italy 2016-17 earthquake. The next steps of the research will concern testing the methodology in the second macro-activity, i.e. the construction of the Recovery Plan, again in the territorial context of the two aforementioned areas.
@article{di_ludovico_toolkit_2023, title = {The toolkit disaster preparedness for pre-disaster planning}, volume = {96}, copyright = {All rights reserved}, issn = {2212-4209}, url = {https://www.sciencedirect.com/science/article/pii/S2212420923003692}, doi = {https://doi.org/10.1016/j.ijdrr.2023.103889}, language = {en}, journal = {International Journal of Disaster Risk Reduction}, author = {Di Ludovico, Donato and Capannolo, Chiara and d'Aloisio, Giordano}, year = {2023}, keywords = {Disasters, Pre-disaster planning, Preparedness, Recovery, Resilience, Toolkit}, pages = {103889} }