GSoC project page

JAVADOC STYLE COVERAGE AND PARSER OPTIMIZATION

EXP

Open Source Development

I have always liked to code, and open source development gives one a purpose to code, an objective to accomplish, collaborators to collaborate with, and the much-needed formalism in one's work. Before GSoC, I already had some experience working with open source software. My first detailed exposure was to Apache POI, which I used to develop the backend for a mobile app during my internship at Larsen & Toubro Infotech. The attraction to open source was quite natural, and my first ever issue fix was for [Apache DataFu](https://datafu.incubator.apache.

Enough context; let's fast-forward to GSoC which, to my delight, came well within a year of my first official encounter with open source. I wanted to get into GSoC so much (because, as I said, I like to code) that I used up all 5 project proposal slots, and all of my proposals were for checkstyle. On 5 May 2017, the GSoC folks informed me that my proposal for Coverage of Documentation Comments Style Guide and performance optimization of Javadoc parser had been selected, and I was thrilled beyond words. Roughly the next 3 months were the official GSoC coding period. My project was a particularly big one, and one that required a lot of discussion for the collaborators to reach a common conclusion.

Participating in GSoC helped me gain credible open source experience at an incredible rate. GSoC adds serious value to open source development through two main components: deadlines and an official mentor. About deadlines: as I already mentioned, my project was a particularly big one, so I spent a good amount of time chasing them. You can't really understand real workflow, the real compromises you have to make in code, or real efficiency and smart work until you are put on a clock. About the mentor: working with one helps you speed up your work, and you gain directly from the mentor's experience, which is pretty great considering that mentors generally tend to be people who are very well versed in software development.

Overall, participating in GSoC was a pretty delectable experience for the code monster within me. I got to know several workflow patterns and, bottom line, I learned how to get the work done. To summarize, my GSoC was full of discussions, speculation, skimming through specifications, bug fixing, designing a grammar for javadoc, chasing deadlines, and meeting objectives.


Coding Standards

One of the great things about working with checkstyle is that you get to practise what you preach. Since checkstyle uses itself to validate its own source and enforces coding conventions and style quite thoroughly throughout the code, you learn more and more about industry-wide coding standards as you work on it. In fact, one of the objectives of my project was to cover the javadoc styling guidelines from Oracle. It was designed to be the second part of the project but wasn't actually pursued, since the javadoc parser needed further optimization first. The coding conventions upheld by checkstyle are reflected in the following checkstyle configuration XML file, which specifies the checks checkstyle runs before accepting any new code:

checkstyle_checks.xml

Checkstyle is used by several other repositories, one of them being Google Guava. The coverage of the coding style enforced by checkstyle for the Guava library can be seen here

See Also: Oracle code conventions, Oracle javadoc syntax (as recognized by its javadoc tool)


Software Testing

Code quality is of utmost importance at checkstyle. A legion of checks validates the code before it can be added to checkstyle. Even a rough look at an mvn verify run for checkstyle shows what I am talking about. There are more than 2300 unit tests (UTs) in checkstyle right now, and that number is certainly not going to decrease. If I remember correctly, there were no more than 1800 or 1900 tests when I first started with checkstyle, a few months before GSoC.

So, back to the verify lifecycle phase of Maven for checkstyle. First, all these tests are run. After that, checkstyle uses itself to validate its own source code, and then sevntu-checkstyle validates the source code as well. Next, pmd, findbugs and spotbugs try to catch potential bugs in the code. Finally, all newly added code must be covered by tests, which is asserted by cobertura. Cobertura apparently instruments the code, and hearing about code instrumentation always makes me go "woah!".

Coverage alone shouldn't be considered sufficient to gauge the quality of the test code, which I have personally experienced in my work. For that reason, the folks at checkstyle brought PIT mutation testing into the picture, which, as a side note, also happens to be my favourite test suite in checkstyle. PIT basically introduces mutations into the code, that is, it modifies or removes lines of code unpredictably, and if at least one UT fails for a mutation, that mutation is considered covered, or killed. Repositories can set whatever mutation coverage they find desirable for each module and for the whole code base. PIT tests aren't executed during the verify phase; they are executed on CI servers, but they can be run manually with the testing profiles using commands like mvn $PROFILE clean verify org.pitest:pitest-maven:mutationCoverage.

I haven't worked with enough repositories to say that checkstyle is at the positive extremum of testing in the open source spectrum, but I will say it anyway, cheeky me. And that is just the testing done locally. After you create a PR, which is the only way checkstyle accepts code from the general public, several more checks run on the updated code on about 10 different CI servers, which I cover in the section that follows. And if all this was not good enough, a separate regression testing tool is hosted at checkstyle/contribution. This regression testing tool was quite vital for my project and was further developed in it. Basically, we needed to make sure that changing the ANTLR grammar for javadoc didn't result in unexpected parse trees or break anything else. More on it below.
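To make the mutation-testing idea above concrete, here is a toy sketch in Python. It is not PIT and does not use PIT's API; the function names, the hand-written mutant and both test suites are made up purely for illustration. It shows why line coverage alone can be misleading: the weak suite executes every line of the code but still lets the mutant survive.

```python
"""Toy illustration of mutation testing (hypothetical names, not PIT)."""


def is_adult(age):
    """Original code under test."""
    return age >= 18


def is_adult_mutant(age):
    """Hand-written mutant: the operator '>=' was changed to '>'."""
    return age > 18


def weak_suite(fn):
    """Executes every line of fn, but never probes the boundary value 18."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    """Also checks the boundary, so it can tell the mutant apart."""
    return weak_suite(fn) and fn(18) is True


def mutant_killed(suite, original, mutant):
    """A mutant counts as killed when the suite passes on the original
    code but fails on the mutated code."""
    return suite(original) and not suite(mutant)


if __name__ == "__main__":
    print("weak suite kills mutant:",
          mutant_killed(weak_suite, is_adult, is_adult_mutant))
    print("strong suite kills mutant:",
          mutant_killed(strong_suite, is_adult, is_adult_mutant))
```

A real tool generates many such mutants automatically and reports the fraction that were killed; the sketch only shows the pass/fail logic for a single one.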

Random Links

Regression Testing tool

Interesting PMD failure

Sample cobertura report

Sample PIT report


CI

Precisely defining, differentiating and understanding continuous integration, continuous deployment and continuous delivery has been quite a task for me. I feel much more comfortable just grasping each concept intuitively from whatever its name points at. Anyway, theoretical discussion of these concepts is for some other day, and we can dive straight into the CI servers at checkstyle. Checkstyle uses about 10 CI servers as of Aug 2017, and the statuses of 8 of them for the main repository, as of 29 Aug 2017, can be seen below:

Description of a few CIs employed at checkstyle:

  • Travis-ci - It runs several tests on checkstyle in a UNIX environment with several -Duser.language and -Duser.region settings, i.e. locale settings, for Maven. Yes, checkstyle supports various languages. You can learn more about the locales checkstyle supports and the kinds of messages it generates by looking at the *.properties files in these directories

  • AppVeyor - It runs mvn verify with different settings, and the Maven site plugin, in a Windows environment.

  • TeamCity - It inspects code for several miscellaneous aspects which lead to poor design, or potential bugs or redundancy and so on.

  • Wercker - It executes no-exception tests and no-error tests by running checkstyle on various repositories. As a side note, Wercker has been bought by Oracle.

  • Shippable - Takes care of all the PIT mutation testing that is done on checkstyle.

  • Codecov - Checks code coverage.

Random Links

Wercker reveals PostgreSQL JDBC driver has syntax error in javadoc

AppVeyor helps identify bugs in ANTLR

AppVeyor helps identify bugs in checkstyle

Interesting Shippable failure


The ANTLR Tool

If I had to pick just one, the most exciting tool of all those I encountered in my GSoC project would be this one. My primary objective was javadoc parser optimization, and javadoc parsing is done using the ANTLR tool. To understand how ANTLR works, I first had to get a grasp of basic parsing theory. So, during the first week of my project, I started with Chomsky's hierarchy and kept skimming through Wikipedia and whatever resources I could get my hands on to understand how parsers work. My mentor had also given me access to The Definitive ANTLR 4 Reference, so I had nothing to complain about. As a side note, The Definitive ANTLR 4 Reference describes the workings of the powerful ANTLR tool very clearly and thoroughly; kudos to Terence Parr.

So basically, to generate a parser using the ANTLR tool, you need a lexer grammar, like this one, and a parser grammar, like this one, or alternatively a single combined grammar, and you feed it to the ANTLR tool using the right commands. ANTLR then gives you several options to visualize (there is a GUI too), process (via the visitor design pattern or listeners), modify and do many other things with the parse tree produced by the generated parser. I won't go any further into ANTLR, since a complete reference book already exists just for it.
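For readers who have not met the visitor pattern mentioned above, the following is a minimal sketch of the idea in Python. It does not use ANTLR or its generated classes: the Node class, the node kinds and the sample tree are all hypothetical stand-ins for a real parse tree, chosen only to show how a visitor walks a tree and reacts to the node types it cares about.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical parse-tree node: a kind, optional text, and children."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


class Visitor:
    """Dispatches to visit_<kind> when defined, else just visits children."""

    def visit(self, node):
        handler = getattr(self, "visit_" + node.kind, self.visit_children)
        return handler(node)

    def visit_children(self, node):
        for child in node.children:
            self.visit(child)


class TagCollector(Visitor):
    """Collects the names of all block tags found in the tree."""

    def __init__(self):
        self.tags = []

    def visit_block_tag(self, node):
        self.tags.append(node.text)
        self.visit_children(node)


def sample_tree():
    """A made-up tree for a comment with one @param and one @return tag."""
    return Node("javadoc", children=[
        Node("text", "Returns the sum."),
        Node("block_tag", "@param", [Node("text", "a first operand")]),
        Node("block_tag", "@return", [Node("text", "the sum")]),
    ])


if __name__ == "__main__":
    collector = TagCollector()
    collector.visit(sample_tree())
    print(collector.tags)
```

The point of the pattern is that the tree classes stay untouched while each new kind of processing lives in its own visitor class.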

Random Facts

SLOC count for Javadoc Parser grammar (as of Tue Aug 29 08:49:44 2017 +0530)

git log --numstat --oneline --pretty=format: --author="Piyush Sharma"  checkstyle/checkstyle/src/main/resources/com/puppycrawl/tools/checkstyle/grammars/javadoc/JavadocParser.g4 | gawk '{ add += $1; subs += $2;} END { printf "Lines Added: %s       Lines Removed: %s\n", add, subs}'
Lines Added: 417       Lines Removed: 398

SLOC count for Javadoc Lexer grammar (as of Tue Aug 29 08:49:44 2017 +0530)

git log --numstat --oneline --pretty=format: --author="Piyush Sharma"  checkstyle/checkstyle/src/main/resources/com/puppycrawl/tools/checkstyle/grammars/javadoc/JavadocLexer.g4 | gawk '{ add += $1; subs += $2;} END { printf "Lines Added: %s       Lines Removed: %s\n", add, subs}'
Lines Added: 19       Lines Removed: 11
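The gawk one-liners above simply add up the first two columns of the git log --numstat output. For anyone without gawk at hand, here is a rough Python equivalent of that summing step. It assumes the command output has already been captured as text; the function name and the sample input are made up for illustration and are not real data from the repository.

```python
def sum_numstat(output):
    """Sum the added and removed columns of `git log --numstat` output.

    Each data line has three whitespace-separated fields: added count,
    removed count, and path. Blank lines are skipped, and a '-' count
    (which git prints for binary files) is treated as 0.
    """
    added = removed = 0
    for line in output.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # blank separator lines produced by --pretty=format:
        added += int(parts[0]) if parts[0].isdigit() else 0
        removed += int(parts[1]) if parts[1].isdigit() else 0
    return added, removed


if __name__ == "__main__":
    # Made-up sample input, only to show the expected shape of the data.
    sample = "\n10\t4\tsrc/Example.g4\n\n7\t0\tsrc/Example.g4\n-\t-\tlogo.png\n"
    add, subs = sum_numstat(sample)
    print("Lines Added: %s       Lines Removed: %s" % (add, subs))
```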

Random Links

The Definitive ANTLR 4 Reference

ANTLR documentation

ANTLR github


Regression Testing

Regression testing took at least 20% of the time in the first half of the project. Checkstyle already had a regression testing tool, but it was not capable of testing changes in the ANTLR grammar. Thus, we needed to upgrade the regression testing tool so that it could test changes in the ANTLR grammar, and an issue was created for this. Initially it was proposed that we could hack our way out by creating a javadoc check that logs every line it encounters as a violation, which would have allowed us to use the existing tool. But there were several nuances attached to this approach, and there were implementation and design concerns too: to formally provide this regression testing option to whoever needed it, checkstyle would have had to incorporate that puppet check into its main repository, which seemed pretty questionable. Cherry-picking a commit containing the puppet check, or applying a patch, didn't look very reliable either.

So, I tried creating a new script in Groovy for this purpose, which would have given the script some much appreciated platform independence, but it had performance issues and was just not fast enough. I also tried turning that Groovy script into a proper Java tool, but the problem persisted: it simply wasn't fast enough. Finally, Richard Veach, a member at checkstyle, churned out a bash script, which was the one chosen. Even with it, initially, when the javadoc parser was much less optimized than it is now, regression testing used to take quite a lot of time, sometimes even more than 7 or 8 hours, and that without running checkstyle on some of the bigger repositories.

A modified version of Richard's script can be found at my fork here. The script has been modified to work with the branch docparseprof at my checkstyle fork. Along with giving the difference counts that the original script also gives, it neatly lays out the performance of the Javadoc parser in master and in patch, separately, in terms of the total time certain methods took to execute. This is a pretty useful feature for someone who is primarily concerned with javadoc parser optimization.
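The kind of master-versus-patch comparison described above can be sketched in a few lines of Python. This is not the actual script, which is written in bash; the function name, the method names and the timings below are all hypothetical, and the sketch assumes the per-method totals have already been collected into two dictionaries.

```python
def compare_timings(master, patch):
    """Return (method, master_ms, patch_ms, change_percent) rows.

    A negative change means the patch is faster. Methods that are
    missing from either side are skipped.
    """
    rows = []
    for method in sorted(master.keys() & patch.keys()):
        before, after = master[method], patch[method]
        change = (after - before) / before * 100 if before else 0.0
        rows.append((method, before, after, round(change, 1)))
    return rows


if __name__ == "__main__":
    # Made-up totals in milliseconds, for illustration only.
    master_ms = {"parse": 2000, "lex": 500}
    patch_ms = {"parse": 1500, "lex": 500}
    for row in compare_timings(master_ms, patch_ms):
        print("%-10s master=%6d ms  patch=%6d ms  change=%6.1f%%" % row)
```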

Not having had significant experience with bash scripts before, I found working on this script quite a positive experience. Right now the modifications have not been merged into master and remain in my fork, but hopefully they will be merged soon; an issue already exists for this at the corresponding repository.

Eventually we also realized that, along with this new script, the existing Groovy script, which does regression testing based on the violations logged by checks, should be run too. Poorly or incorrectly implemented checks, or checks that rely on particular elements in the parse trees, may produce difference counts at times, and one can't always predict which checks will be affected by changes in the Javadoc grammar. Thus, regression testing for all javadoc checks, along with regression testing for the parse trees produced, was the norm for all commits that affected the grammar.
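The notion of a difference count used throughout this section can be illustrated with a small Python sketch. It is not the bash or Groovy tooling from checkstyle/contribution; the function name and the sample dumps are hypothetical. It assumes the parse trees produced by master and by the patch have already been dumped as text, one dump per source file, and it counts how many dump lines differ for each file.

```python
import difflib


def difference_counts(master, patch):
    """Map each file whose dumps differ to its number of differing lines.

    master and patch are dicts from file path to parse-tree dump text.
    A file present on only one side is compared against an empty dump.
    """
    counts = {}
    for path in sorted(master.keys() | patch.keys()):
        old = master.get(path, "").splitlines()
        new = patch.get(path, "").splitlines()
        # ndiff prefixes removed lines with "- " and added lines with "+ ".
        changed = sum(1 for line in difflib.ndiff(old, new)
                      if line.startswith(("- ", "+ ")))
        if changed:
            counts[path] = changed
    return counts


if __name__ == "__main__":
    # Made-up dumps, for illustration only.
    master_dumps = {"A.java": "JAVADOC\n  TEXT\n  EOF",
                    "B.java": "JAVADOC\n  EOF"}
    patch_dumps = {"A.java": "JAVADOC\n  HTML_ELEMENT\n  EOF",
                   "B.java": "JAVADOC\n  EOF"}
    print(difference_counts(master_dumps, patch_dumps))
```

An empty result means the grammar change left every parse tree untouched, which is the outcome one hopes for with a pure optimization.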

Random Links

Groovy script that failed

The puppet check

Tool that actually produces the final diff report

ANTLR regression test report produced with my modified script


Challenges

  • Several objectives had to be achieved in the short span of the project period. Chasing deadlines created a greater need for parallelism in work.

  • The grammar draws on the HTML specification and the Oracle documentation guidelines. Interpreting them and reaching a common conclusion with all collaborators required a great deal of discussion, and thus a great deal of time.

  • A few times the organization admins were highly occupied, which resulted in delayed review cycles.

  • Work division and PR management. The regression testing reports became unmanageable for the reviewers a few times because of high difference counts. This taught me the importance of dividing work, which also speeds up review cycles and acceptance.

  • No single formal specification to look at. The javadoc parse tree was supposed to be in sync with the javadoc tool, and the javadoc tool should be based on the javadoc syntax given in the Oracle docs, but it isn't that simple. There were several nuances attached, and at times we felt a strong need for an exhaustive set of production rules for the 'javadoc' language.

  • We had to improvise as we went along because there is no formally established grammar for javadoc. For example, have a look at issues/4752, where it was first conceived that we should support a feature, a lot of time was spent digging through the available specifications, and it was finally realized that there was no point in supporting it. Of course, such discussions are necessary, and I would even call them vital, but deadlines have no respect for them.