Friday, October 24, 2008

The impact of project reviews

Having recently finished the first iteration of our DueDates project, Daniel Tian and I set up a frozen copy of the project code and a new wiki page requesting reviews of our project. Since classmates in our software engineering class were doing the same, a "review trading" system was implemented. Each person was given three other projects to review, and each project would be reviewed by six or seven people in hopes of eliciting a broad spectrum of opinions and corrections.

Getting the most out of a review

Our reviewers had busy schedules and only a limited amount of time to spend on the project. To make sure the different segments of the code and documentation were adequately reviewed, we broke up the project into several pieces and assigned two or three of our reviewers to each piece. We knew our reviewers ahead of time, but on reflection, the people requesting a review of a more widely known project would probably not know all of their reviewers' identities. In that case I would break up the reviews perhaps by first letter of login name: A-H for one section, I-Q for another, and R-Z for the last. This of course is not without weaknesses. What if no one is in the A-H range? What if the people in that range are assigned to review GUI code when they are much better versed in databases or documentation? That mismatch could be good or bad. These are all issues that should be kept in mind when requesting a review. In any case we broke up the project so that each person should at least attempt to run the project, but then concentrate their review on a few closely related pieces. The assignments can be seen here.
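The login-letter split described above can be sketched in a few lines of Java. This is only an illustration and not something we actually used: the class name ReviewAssigner, the section labels, and the sample logins are all made up for the example. Every section is created up front, so an empty range (the "what if no one is in A-H" problem) shows up as an empty list instead of silently disappearing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: splits reviewers into three sections by login initial.
final class ReviewAssigner {
  // Section labels are placeholders; a real request would name packages or documents.
  static final String[] SECTIONS = {"A-H", "I-Q", "R-Z"};

  // Returns the section label for a login, or null if it does not start with A-Z.
  static String sectionFor(String login) {
    if (login == null || login.isEmpty()) {
      return null;
    }
    char first = Character.toUpperCase(login.charAt(0));
    if (first >= 'A' && first <= 'H') {
      return SECTIONS[0];
    }
    if (first >= 'I' && first <= 'Q') {
      return SECTIONS[1];
    }
    if (first >= 'R' && first <= 'Z') {
      return SECTIONS[2];
    }
    return null; // digits, punctuation, and so on
  }

  // Groups logins by section, keeping empty sections so gaps are visible.
  static Map<String, List<String>> assign(List<String> logins) {
    Map<String, List<String>> bySection = new LinkedHashMap<String, List<String>>();
    for (String section : SECTIONS) {
      bySection.put(section, new ArrayList<String>());
    }
    for (String login : logins) {
      String section = sectionFor(login);
      if (section != null) {
        bySection.get(section).add(login);
      }
    }
    return bySection;
  }
}
```

Calling ReviewAssigner.assign with the made-up logins "alice", "bob", and "zed" puts the first two in A-H and the last in R-Z, and leaves I-Q empty, which is exactly the kind of gap the requester would need to notice and fix by hand.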

Though we knew many of our errors beforehand, and the review was merely a formality in some ways, our reviewers found several good improvements to be made. Erin Kim made several comments that suggested better ways to identify the correct tables when searching the UH Manoa Library. Creighton Okada and Robin Raqueno pointed out that the use of arguments could be made more general in error reporting, and that some errors were slightly misleading. Daniel Arakaki left a good number of error reports regarding problems and shortcomings of the Javadocs. Anthony Du suggested that the CommandLineParser class possibly be refactored to only deal with the parameter parsing, and not so much the actions associated with the parameters. This is a suggestion we definitely have been putting some thought into as well. None of them noted that our project had no unit testing at all. I did not see any comments from Scheller Sanchez; however, I'm not sure what the issue was in this case. (edit: I mistakenly said Tyler Wolff did not comment, but forgot that he had emailed me directly.) It could very well be that I did not set up the review correctly through Google Projects, as I was a bit confused myself on several parts of the review process, as I'll mention shortly.

My reviews of others

I was charged with reviewing duedates projects silver, gold, and yellow. As is often the case with looking at someone else's code, I waffled between "thoroughly impressed" and "wtf is going on here." Though I had specific assignments to concentrate on, I found that I quickly wanted to review a lot of areas. As each review went along, I applied what I'd found in previous projects to the current one. Having already seen the common small errors, I could quickly find and comment on them when I saw them again. I found myself often making comments that basically amounted to "you might consider doing it this way as that's how I did it in my project". I made many comments regarding code that I did not feel was as general and easily extended as it should be to account for the growth of the project.

Difficulties with Google Project

I find it amazing and wonderful that systems like this exist. It is so easy to set up, and of course free, but using the system does not always go as planned. In one case I "lost" some of my review comments somehow and began starting over, only to find them reappear later. I also began receiving emails from other projects about other people's comments on those projects, apparently because I had started the review process, which again I found strange since I knew I wasn't the first reviewer in that case. Daniel and I also found out as a result of this process that our discussion group was not set up correctly and we were not receiving emails regarding comments on our own project. All of this does take a bit of practice and getting used to, it seems!

Final thoughts

The review process is important for any project, whether it be software engineering or baking a cake. A project needs a variety of opinions and an organized way of collecting them. We're lucky to have tools like Google Project Hosting to help us with these tasks, even if they do not always work as smoothly as we might wish. Hopefully future reviews, and I know there will be plenty, will be smoother and even more productive.

Monday, October 20, 2008

Introducing the Due Dates application.

Having outgrown stacks and added some new tools to my knowledge base, it is now time to start on a new project that I will be working on for at least the next two months. It is called Due Dates and is an open source Java application that can be used to track the due dates of borrowed items, such as library books. You can read all about the project, download it, and contribute here. Right now the application is fairly simple, with only terminal output and access to a single lender, the University of Hawaii at Manoa library system, but in the future I hope to expand it to allow personalized settings, customizable reporting such as email and instant messaging, and a large database of procedures for accessing lenders.

Getting started

The project was started by Daniel Tian and me. We began by setting up a project hosting site at Google Code and familiarized ourselves with the features and usage of the site. I added some basic issues to the issue tracking system and also began a rough skeleton of my initial program structure. I felt that there should be a static class that outlined the basic features for querying a lender. This LenderQuery class would then be extended for each lender and implement the procedures for querying that lender's website.
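To make the LenderQuery idea a little more concrete, here is a minimal sketch of how such a base class and one subclass might look. It is not the actual Due Dates code. I have modeled the base class as an abstract class, since it is meant to be extended, and the method names, the CannedLenderQuery subclass, and its sample titles are all invented for illustration. A real subclass would query the lender's website instead of returning canned data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: the common outline for querying any lender.
abstract class LenderQuery {
  private final String lenderName;

  protected LenderQuery(String lenderName) {
    this.lenderName = lenderName;
  }

  String getLenderName() {
    return lenderName;
  }

  // Each lender-specific subclass supplies its own lookup procedure.
  abstract List<String> fetchBorrowedTitles(String userId, String password);

  // Shared behavior built on top of the lender-specific lookup.
  boolean hasBorrowedItems(String userId, String password) {
    return !fetchBorrowedTitles(userId, password).isEmpty();
  }
}

// Stand-in subclass that returns fixed data instead of contacting a website.
class CannedLenderQuery extends LenderQuery {
  CannedLenderQuery() {
    super("Example Library");
  }

  @Override
  List<String> fetchBorrowedTitles(String userId, String password) {
    List<String> titles = new ArrayList<String>();
    titles.add("Sample Title One");
    titles.add("Sample Title Two");
    return titles;
  }
}
```

The point of the design is that code which reports on borrowed items only needs to know about LenderQuery, so adding a new lender should mean adding one new subclass and nothing else.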

First meeting

Daniel and I were able to make quick progress on implementing a semi-functional system at our first meeting; however, we sacrificed some of the good software engineering practices we'd been recently learning. We almost immediately set to hacking at our HttpUnit example code without really finalizing a design plan and didn't do any unit testing along the way. We did run into a number of problems with setting up our environment to properly recognize all the necessary libraries. However, by the end of our first meeting we managed to commit code that was capable of at least determining if there were books checked out of the UH library. We had also learned a lot about library dependencies and endured some SVN problems, which turned out to be introduced by changes in some of our sample code. After three or four hours of hacking, we only had one commit to show for ourselves, another mistake in our process, and had forgotten (and continued to forget) to label each commit with the relevant issue being addressed. It was a successful first meeting, but it definitely made me aware of how hard it is to break old habits and develop good software engineering processes.

Second Meeting

At our second meeting we were using a different computer and fixed some of the issues with dependencies and SVN that had not previously been noticed. We also discussed some of the problems with the approach we used in the previous meeting and introduced some new issues to the issue tracking system. After a short meeting we were able to run the application with increased success, having implemented the ability to list the books on loan (which I had checked out that morning). Armed with a false sense of accomplishment, we made the grave mistake of waiting several more days before working on the project again.

Some time later

Yesterday, I began making some modifications and refactoring the code. I added a BorrowedItem class to the project and cleared up some of the class and method names to be more descriptive of their tasks. I practiced doing small commits often, rather than large, multiple-file changes; however, I almost universally forgot to state which issue I was working on in my commit message. The issues needed to be broken up and thought out better anyway, but I simply always forgot to add the "Issue-N" header. And I also completely forgot one commit message!

Later on that evening Daniel and I got together online and discussed what we needed to finish and change. He did some more code modification to try to improve toward our goal of making the program as extensible as possible. He also helped me with some mistakes I had made with designating the library dependencies. In the meantime I worked up a couple of wiki guides, one for Due Dates users, and another for potential developers.

Lessons and Experiences

Overall this was a very good experience and really demonstrates the power of tools like SVN and issue tracking. However, it also showed me that I have a long way to go toward becoming an effective software engineer and using those tools wisely. SVN makes it very easy to work independently from the rest of your team, but it is no substitute for good communication. The main issues I believe I need to work on are effective design, creating unit tests, and getting a better understanding of how to properly set up libraries so that another user can quickly begin to use or modify my code. Also, quick progress in the beginning doesn't mean I should reward myself with three days off! Overall though it has been a positive and rewarding experience.

Wednesday, October 8, 2008

I see more collaboration in my future

To add a little experience to my work with Google project hosting, as discussed in my previous entry, John Zhou has reciprocated the sharing of code and set up his own stack project page. Similar to my work on the stack-johnson project, I was able to easily set up stack-johnzhou in my workspace and make some modifications. His code seemed clean and tested. The output location was correct as well. I just added the simple clear method to the stack class and ran verify. Of course, Emma coverage dropped since I added no tests for the new method, but the project is otherwise still able to compile and verify.

Overall, most of the tools used with SVN were very simple to set up and use. However, I'm sure there's much more to be discovered about them, which helps prove that a tool can be easy to use, yet immensely powerful with some experience. The difference between this method of collaborative programming and the process I used with Arthur Shum on our CodeRuler project is immediately obvious and profound. Arthur's and my manual sharing and combining of code for that project was tedious and time consuming. Also, during this project, John showed me how to use Ant tools in Eclipse, rather than the command line, just by right clicking an xml file and selecting Run As... Ant. I felt downright silly for not having found that earlier!

Making a collaborative effort.

A programmer working on his or her own can only accomplish so much before the limitations of time and talent begin to make the completion of a large project infeasible. To succeed in the modern programming environment requires collaboration between several or even thousands of individuals. Furthermore, for such a collaboration to be successful it is necessary for code to be shared, updated, and shared again using reliable and efficient methods. One way of accomplishing this is through SVN (Subversion), a version control system. By creating a central repository of code and implementing safe ways of distributing code and tracking changes, it is possible for a large number of people to work together on a single system.

Google: a generous overlord

One place to host an SVN repository is Google project hosting. Anyone can, for free, create and manage their own open-source project using this site. To help my classmates and me get comfortable with the Subversion process, my software engineering professor set up a project containing the familiar Stack project. In order to easily be able to check out and update files from this repository, I downloaded a tool called TortoiseSVN. After a quick install and reboot (which I always hate because it can really break the productive process) I was able to create my own private space to work on the code downloaded from the Google project page.

Making my mark

Having downloaded the code, which at this point had been modified by most of the students in the class, I was worried there wouldn't be much for me to accomplish. However, I found that Stack still did not have a simple clear method to empty it, so I added that. What's more, the project was set to output to a /build/classes directory, rather than /bin, an issue that had previously caused strange problems in Eclipse. Having fixed those issues, I was able to quickly (after finding the cryptic password again which for some reason was not saving) upload the code along with a message about my changes. During my work with this project it also appeared to have acquired an empty and mysterious extra stack-johnson directory, which I deleted.
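For readers who have not seen the class project, a clear method on a stack can be as small as the sketch below. This is not the course's Stack code. ClearableStack and its ArrayList backing are assumptions made just for this example, and the real class may store its elements differently.

```java
import java.util.ArrayList;
import java.util.EmptyStackException;
import java.util.List;

// Hypothetical stack, backed by an ArrayList, with the added clear method.
class ClearableStack<T> {
  private final List<T> elements = new ArrayList<T>();

  void push(T item) {
    elements.add(item);
  }

  // Removes and returns the most recently pushed item.
  T pop() {
    if (elements.isEmpty()) {
      throw new EmptyStackException();
    }
    return elements.remove(elements.size() - 1);
  }

  boolean isEmpty() {
    return elements.isEmpty();
  }

  // The new method: empties the stack in one call.
  void clear() {
    elements.clear();
  }
}
```

A matching unit test would push a couple of items, call clear, and then check that isEmpty returns true.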

An offering to the world

Having successfully modified and updated code from a Google project, it was time for me to create my own project and, to help test it, put it at the mercy of a fellow student, John Zhou, and the world, if it is interested. My latest personal iteration is no longer available only from an archaic zip file in one of my other blog posts, but can also be found here. I also created a discussion group here. This whole online project world will also make it easier for me to work not just with others, but with myself in different locations, on different computers, which is often what I do. I definitely look forward to adding another layer of control and organization to my lessons on software engineering.

Wednesday, October 1, 2008

An incomplete Emma example

I mentioned in my previous entry that Emma is a good tool to see the coverage of JUnit tests. However, it is only a guideline, not an absolute measure of good testing. Simply being able to claim 100% coverage does not mean that all possible states in the program have been tested or that the tests were formulated correctly.

As a simple, if rather contrived, example, this distribution contains an artificial error in the pop method of the stack class. Because a modulo four has been inserted into part of the code, pop operates normally when the stack is small, and Emma continues to report 100% coverage. Pushing three objects onto the stack and popping once works fine. Pushing five objects onto the stack and popping once, which could be considered part of the same equivalence class of tests, might fail, depending on the pushed objects. The test suite will fail if the showNewError test in the TestStack class is not ignored, but the previous tests did not catch this error.
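The sketch below shows one way a modulo four could produce this behavior. It is a reconstruction for illustration only, not the code in the distribution: ModFourStack is a made-up name, and the exact place where the real code applies the modulo may differ.

```java
import java.util.ArrayList;
import java.util.EmptyStackException;
import java.util.List;

// Hypothetical stack with an artificial bug in pop.
class ModFourStack {
  private final List<Object> elements = new ArrayList<Object>();

  void push(Object item) {
    elements.add(item);
  }

  Object pop() {
    if (elements.isEmpty()) {
      throw new EmptyStackException();
    }
    // Artificial bug: correct for sizes 1 through 4, wrong once size reaches 5.
    int top = (elements.size() - 1) % 4;
    return elements.remove(top);
  }
}
```

A test that pushes three distinct objects and pops once executes every line of pop and gets the right answer, so a coverage tool has nothing left to flag. With five distinct objects pushed, the index wraps around to 0 and pop returns the first object pushed instead of the last. If all five objects are equal, the wrong element is indistinguishable from the right one, which matches the "depending on the pushed objects" caveat.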

Though it is a rather contrived bug, it shows how important thorough testing is and that a programmer cannot simply rely on coverage tools to indicate good testing has been done. Such tools can tell you that tests are missing, but they do not tell you if you're really done testing. That requires 110% effort and a report of 500% coverage or more!