I am a highly skilled AI researcher and Data Scientist with a Ph.D. in Mathematics/Statistical Learning from IMPA. My expertise encompasses machine learning, deep learning, and natural language processing. With a strong foundation in theoretical knowledge, I have successfully led and contributed to industry projects that have gained national recognition. I am passionate about leveraging AI to solve real-world problems and am dedicated to advancing the field through research and innovation.
Accurately predicting the volume of amniotic fluid is fundamental to assessing pregnancy risks, though the task usually requires many hours of laborious work by medical experts. In this paper, we present AmnioML, a machine learning solution that leverages deep learning and conformal prediction to output fast and accurate volume estimates and segmentation masks from fetal MRIs, achieving a Dice coefficient above 0.9. We also make available a novel, curated dataset of fetal MRIs with 853 exams and benchmark the performance of many recent deep learning architectures. In addition, we introduce a conformal prediction tool that yields narrow predictive intervals with theoretically guaranteed coverage, thus aiding doctors in detecting pregnancy risks and saving lives. A successful case study of AmnioML deployed in a medical setting is also reported. Real-world clinical benefits include up to a 20x reduction in segmentation time, with most segmentations deemed by doctors as not needing any further manual refinement. Furthermore, AmnioML’s volume predictions were found to be highly accurate in practice, with mean absolute error below 56 mL and tight predictive intervals, showcasing its impact in reducing pregnancy complications.
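For readers unfamiliar with the Dice coefficient quoted in the abstract above, the following is a minimal sketch of how it is computed for two binary segmentation masks, together with a naive voxel-counting volume estimate. It is not the AmnioML implementation: the function names, the flat 0/1 list representation of a mask, and the `voxel_volume_mm3` parameter are assumptions made for illustration only.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values of equal length
    (an illustrative representation, not the paper's data format).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks are empty; treating this as perfect agreement
        # is a common convention, assumed here.
        return 1.0
    return 2.0 * intersection / total


def mask_volume_ml(mask, voxel_volume_mm3):
    """Volume of the segmented region in millilitres (1 mL = 1000 mm^3)."""
    return sum(1 for v in mask if v) * voxel_volume_mm3 / 1000.0


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(predicted, reference))
    print(mask_volume_ml(predicted, voxel_volume_mm3=2.0))
```

A Dice value of 1 means the predicted and reference masks coincide exactly, and 0 means they do not overlap at all.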
Split Conformal Prediction for Dependent Data
Roberto I. Oliveira, Paulo Orenstein, Thiago Ramos, and 1 more author
Split conformal prediction (CP) is a popular tool to obtain predictive intervals from general statistical algorithms, with few assumptions beyond data exchangeability. We show that coverage guarantees from split CP can be extended to dependent processes, such as the class of stationary β-mixing processes, by adding a small coverage penalty. In particular, we show that the empirical coverage bounds for some β-mixing processes match the order of the bounds under exchangeability. The framework introduced also extends to non-stationary processes and to other CP methods, and experiments corroborate our coverage guarantees for split CP under dependent data.
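As background for the abstract above, here is a minimal sketch of standard split conformal prediction with absolute-residual scores, under the usual exchangeability assumption. It does not reproduce the paper's extension to β-mixing processes or its coverage penalty. The names `split_conformal_quantile`, `split_conformal_interval`, and `predict` are hypothetical and chosen for illustration.

```python
import math


def split_conformal_quantile(residuals, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration residual.

    If that rank exceeds n, the interval must be infinite to keep the
    marginal coverage guarantee.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(residuals)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Predictive interval for x_new at nominal level 1 - alpha.

    `predict` is any model already fitted on a separate training split;
    the calibration points must not have been used to fit it.
    """
    residuals = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = split_conformal_quantile(residuals, alpha)
    center = predict(x_new)
    return center - q, center + q


if __name__ == "__main__":
    def model(x):
        # Stand-in for a fitted regressor (illustrative only).
        return 2 * x

    interval = split_conformal_interval(
        model, [1, 2, 3, 4], [3, 3, 9, 6], x_new=5, alpha=0.5
    )
    print(interval)
```

The only step that depends on the data distribution is the choice of quantile rank, which is where a correction for dependent data would enter.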
ExactBoost: Directly Boosting the Margin in Combinatorial and Non-decomposable Metrics
Daniel Csillag, Carolina Piazza, Thiago Ramos, and 3 more authors
In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, 28–30 Mar 2022
Many classification algorithms require the use of surrogate losses when the intended loss function is combinatorial or non-decomposable. This paper introduces a fast and exact stagewise optimization algorithm, dubbed ExactBoost, that boosts stumps to the actual loss function. By developing a novel extension of margin theory to the non-decomposable setting, it is possible to provably bound the generalization error of ExactBoost for many important metrics with different levels of non-decomposability. Through extensive examples, it is shown that such theoretical guarantees translate to competitive empirical performance. In particular, when used as an ensembler, ExactBoost is able to significantly outperform other surrogate-based and exact algorithms available.