SOFTWARE ENGINEER

Vicente Barros

PhD Student @ University of Aveiro

About

Research Student focused on Software Development Lifecycle Automation

I am a PhD student in software engineering with a research fellowship at IEETA (Institute of Electronics and Informatics Engineering of Aveiro). My work focuses on software engineering and technology, with particular emphasis on automating the software development process. I am passionate about simplifying workflows and making complex systems more accessible to everyone!

Research Interests

  • Machine Learning for Software Engineering
  • Software Development Automation
  • Agentic Software Development

Experience

My professional journey through the years

2025 – Present
CURRENT POSITION

PhD Research Fellow

IEETA - Institute of Electronics and Informatics Engineering of Aveiro
Aveiro, Portugal

Development of, and guidance on, the GREG Portal Toolkit, a software platform that allows the medical research community to easily access and analyze health data and to generate Real-World Evidence (RWE) from observational health data.

KEY ACHIEVEMENTS

  • Coordinating the development of the catalogue system for the GREG Portal Toolkit, ensuring it meets the needs of the research community and integrates seamlessly with existing tools and workflows
  • Development of new tools for GREG, including the Dashboards and the Privacy Data Score
  • Validation of the new tool with end users and stakeholders, gathering feedback and iterating on the design to ensure it meets the needs of the research community
2024 – 2025

Research Fellow

IEETA - Institute of Electronics and Informatics Engineering of Aveiro
Aveiro, Portugal

Development of a new tool to support the EHDEN and GREG Portal, software that allows the medical research community to easily access and analyze health data. Focused on building user-friendly interfaces for complex health datasets, improving usability and accessibility for researchers and healthcare professionals.

KEY ACHIEVEMENTS

  • Rethinking of the software architecture to support multiple projects and provide flexibility for future tools
  • Development of the catalogue system, enabling users to easily browse and access available data visualization components
  • Validation of the new tool with end users and stakeholders, gathering feedback and iterating on the design to ensure it meets the needs of the research community
2023 – 2024

Undergraduate Research Fellow

IEETA - University of Aveiro
Aveiro, Portugal

Development of the new interface for the Montra tool, a software platform for health data visualization and analysis. Focused on creating user-friendly interfaces for complex health datasets, improving usability and accessibility for researchers and healthcare professionals.

KEY ACHIEVEMENTS

  • Analysis of the current state of the Montra tool and identification of areas for improvement
  • Migration of the Montra tool interface to React and TypeScript for improved performance and maintainability
  • Development of the catalogue system for the Montra tool, enabling users to easily browse and access available data visualization components

Technologies

Tools and frameworks I work with

~/tech-stack

$ ls ~/tech-stack

[Web]
React TypeScript JavaScript Astro Svelte Next.js Tailwind
[Backend]
Python Java C# .NET FastAPI Django Flask Express Node.js
[Database]
PostgreSQL MySQL MongoDB SQLite Redis MSSQL
[DevOps]
Docker Git Terraform Kafka

$ _

Projects

Research projects, web applications, and software engineering work

Catalogue System

Ongoing

Development of a catalogue system for the GREG Portal Toolkit, a software platform that allows the medical research community to easily access and analyze health data and to generate Real-World Evidence (RWE) from observational health data. The catalogue system enables users to easily browse and access available data, improving usability and accessibility for researchers and healthcare professionals.

React, TypeScript, Next.js, Health Data, FastAPI, Python, MongoDB

Odin

Completed

Personal knowledge hub platform that analyzes and organizes uploaded documents. Features include text extraction, metadata indexing, keyword-based search with highlighting, and an AI-powered chat interface for generating insights from single or multiple documents. Streamlines document management and enables users to extract actionable insights from their data.

React, TypeScript, Tailwind CSS, FastAPI, Python, AI, NLP

MTG Card Collection Manager

Completed

Web application for managing Magic: The Gathering card collections using the Scryfall API. Features include card search, deck building, collection tracking, and price monitoring. Built with modern frontend technologies to provide a smooth user experience for card game enthusiasts.

Next.js, TypeScript, Scryfall API, REST API

Palestine-Israel Conflict Visualization

Completed

Interactive data visualization project exploring the Palestine-Israel conflict through various datasets and timelines. Uses D3.js to create compelling visual narratives that help users understand complex historical and contemporary events. The project emphasizes data-driven storytelling and objective presentation of information.

D3.js, JavaScript, Data Visualization, Interactive Graphics

Black Bunny

Completed

Web application replacing the OHDSI tools WhiteRabbit and Usagi. The replaced applications were developed with Java interfaces and were used for data profiling and vocabulary mapping in the context of the Observational Health Data Sciences and Informatics (OHDSI) initiative. The new application, Black Bunny, is built with modern web technologies to provide a more user-friendly and efficient experience for researchers working with health data. For the vocabulary mapping, a novel approach was implemented using embeddings and machine learning techniques to suggest mappings between source and target vocabularies, improving the accuracy and efficiency of the mapping process.

React, TypeScript, FastAPI, Python, Machine Learning, Embeddings

Sobe e Desce Companion App

Completed

A companion app for the traditional Portuguese board game "Sobe e Desce". The app provides a digital interface for players to track their progress, manage game rules, and enhance the gaming experience. It includes features such as score tracking, rule reminders, and interactive game elements to make playing "Sobe e Desce" more engaging and enjoyable.

React, TypeScript, Tailwind CSS, Mobile Development

Teaching

Courses supported in software engineering and informatics projects

Introduction to Software Engineering (IES)

Teaching assistant role supporting the Introduction to Software Engineering coursework. Helped students understand fundamental concepts including the software development lifecycle, design patterns, testing methodologies, and collaborative development practices. Focused on practical application of software engineering principles through hands-on projects.

Level: undergraduate
Period: First Semester 2025
Students: 25

Software Engineering (ES)

Teaching assistant role supporting advanced software engineering coursework. Helped students understand CI/CD, DevOps practices, observability, and scalability. Focused on practical application of software engineering principles through hands-on projects and real-world case studies.

Level: graduate
Period: First Semester 2025
Students: 40

Informatics Project - BotBlocker Project (PI)

Supported students with their final project, BotBlocker, a tool designed to detect AI-generated content on social media networks. Provided guidance on project development, including architecture design, implementation strategies, and best practices for software development.

Level: undergraduate
Period: Second Semester 2025
Students: 4

Introduction to Software Engineering (IES)

Teaching assistant role supporting the Introduction to Software Engineering coursework. Helped students understand fundamental concepts including the software development lifecycle, design patterns, testing methodologies, and collaborative development practices. Focused on practical application of software engineering principles through hands-on projects.

Level: undergraduate
Period: First Semester 2024
Students: 25

Publications

Peer-reviewed research and scholarly work

Conference Papers (2)

1. An Embedding-Based Machine Learning Solution for Medical Concept Mapping

Barros, V., Paradinha, R., Almeida, J. R., & Oliveira, J. L. (2025). An Embedding-Based Machine Learning Solution for Medical Concept Mapping. In Proceedings - IEEE Symposium on Computer-Based Medical Systems.
Abstract

The integration of heterogeneous clinical datasets represents a fundamental challenge in contemporary biomedical research, particularly when reconciling multi-language and multi-institution data sources. The challenge of this procedure lies in the effort required to map the original concepts with their standard definitions. Various automated mapping solutions can assist researchers in this process, but the complexity grows when handling multi-language datasets, resulting in substantial manual work for translation and mapping. In this paper, we proposed a novel framework for clinical concept harmonisation that leverages vector-based embeddings and semantic search methodologies to enhance interoperability in multi-cohort studies. The methodology incorporates comprehensive data profiling, ontology-driven concept alignment, and machine learning-based vector search within a unified architecture. We demonstrate the efficacy of this approach through practical application to Alzheimer's disease (AD) research datasets from distinct institutions with different languages, achieving effective cross-lingual concept mapping while maintaining compatibility with established standardisation frameworks. © 2025 IEEE

Published

2. A Semantic-Driven for Cohort Data Harmonisation into OMOP CDM Schema

Paradinha, R., Barros, V., Almeida, J. R., & Oliveira, J. L. (2023). A Semantic-Driven for Cohort Data Harmonisation into OMOP CDM Schema. In Studies in Health Technology and Informatics.
Abstract

Clinical research often requires integrating data from diverse sources, which differ not only in structure but also in semantics and language. Traditional extract-transform-load (ETL) pipelines struggle to handle semantic variability and lack built-in support for multilingual or ontology-driven harmonisation. This fragmentation limits the interoperability and reuse of clinical datasets in large-scale analyses. In this paper, we propose an integrated framework that combines an embedding-based concept mapping engine with an automated ETL pipeline using Apache Airflow. The mapping engine uses transformer-based embeddings to align clinical terms with standard concepts, producing outputs in White Rabbit and Usagi-compatible formats to ensure backward interoperability. We validated the system using multilingual real-world datasets demonstrating its ability to handle heterogeneous inputs and maintain end-to-end reproducibility. © 2025 The Authors.

Published