I am a software engineering researcher and I love what I do. In my investigations, I conduct experiments and develop tools to better understand and improve software development.
I received my MSc (2005) and DSc (2009) degrees in Computer Science from the University of São Paulo (ICMC-USP). Part of my PhD research was carried out at the University of California, Irvine (ICS-UCI) under the supervision of Cristina Lopes. In 2009, I started working at the Federal University of São Paulo (ICT-UNIFESP) as an Assistant Professor, and I was promoted to Associate Professor at the end of 2017. In 2016-2017 I went back to UCI as a Visiting Professor, where I am an Associated Researcher of the Institute for Software Research (ISR). Currently, my main research interests are software reuse, software testing, empirical software engineering, mining software repositories, and agile development. My Erdős number is 4 (Erdős - Specker - Lieberherr - Lopes - me).
With the advent of the Internet, the amount of information available online has become unimaginably large. For instance, as of September 2014, there were 1 billion websites online. It is thus fair to say that we can discover at least some piece of information about almost any topic we can think of. In fact, my young kids believe that everything they can imagine is on the Web. The Internet has become a kind of universal library. For a while this has been true for traditional content - such as text and video - and it is now becoming a reality for programming artifacts, in particular for source code. In fact, as of 2017, GitHub alone hosted more than 60 million projects, a large part of them open source. We are thus practically on the verge of creating a universal code library [1]! The content is there, but the problem is mainly twofold: (1) how to effectively reach it; and (2) once found, how to successfully integrate it into the developer's workspace. I believe we can improve software engineering in a number of ways once we adequately tackle these problems.
Since around 2007 I have been investigating different ways to facilitate code reuse from large-scale software repositories. In collaboration with Cristina Lopes from UCI, I have developed CodeGenie, a tool that uses test cases as an interface for code search (called test-driven code search). CodeGenie not only helps developers reach the desired code but also supports automatically weaving code candidates into the developer's local project [2]. We have also developed a query expansion approach to improve the recall of code search engines that take methods' interfaces as input to look for code (called interface-driven code search). The approach uses several types of thesauri, and in a recent study we found that it can significantly improve recall for interface-driven code search.
More recently we have been looking at essential properties of large code repositories, in particular how methods tend to repeat from project to project both in terms of their interfaces and their functions (what we call interface and functional redundancy). We have found that redundancy is quite prevalent in these repositories, which points to the feasibility of tools that take advantage of this kind of replication.
Software testing and empirical software engineering are two of my other passions. In my Master's and PhD research I extended traditional white-box testing techniques to address aspect-oriented programs. More recently I have investigated the impact of software testing education on coding skills (work that was later extended). In particular, we have found that students exposed to testing concepts can produce more correct software and that university instructors tend to have little knowledge about these same concepts. In collaboration with Alessandro Garcia from PUC-Rio and other researchers, I have looked into the effectiveness of two of the most popular agile techniques: pair programming and test-first programming. We have found that these agile practices can in fact improve software development when compared to solo programming and test-last programming.
Currently I'm focusing on investigating new ways to improve software development by using large-scale code repositories. In particular, we are developing a platform called GenieRevive, which will automatically recommend better software found in code corpora to improve existing projects.
Please contact me if you are interested in chatting about any of these topics. I am always open to new collaboration opportunities and I also have open positions for both Master's and PhD students!
[1] In February 2017 I gave a talk at UCI where I further elaborated on this subject.
[2] CodeGenie is not currently available for download, as it was developed as a prototype plug-in for an earlier version of the Eclipse IDE. However, we intend to make it available soon, as we are currently evolving it for newer versions of Eclipse.
A more complete list of my papers can be found at my author entry or my profile page. Here I list some of the most important ones in reverse chronological order.
1/2018 - Introduction to Programming
1/2018 - Advanced Topics in Software Engineering (graduate)
2/2017 - Object-Oriented Programming
2/2017 - Software Testing
1/2017 - Introduction to Programming
Thanks to Klérisson Paixão for sharing his homepage template.