Tuesday, May 14, 2013

Refactoring: Replace List With Object

I was reminded today of a refactoring which I have done several times, but which I hadn't specifically remembered from the Refactoring book. So, I looked it up, and of course it is there.

Ah well, we can't all invent something new.

Anyway, the refactoring in question is "Replace Array With Object", though I've been seeing it as "Replace List With Object". I first encountered this in C#, and recently again in C++. I feel the refactoring is more elegant in those languages than in Java or C (or Go, for that matter), primarily because lists in C# and C++ have array-like syntax available, whereas in Java they do not.

Consider a naive data structure which represents the "nuclear family":


enum Parents {
  DAD_INDEX = 0,
  MOM_INDEX = 1
};
list<Person> family = GetFamilyMembersParentsFirst();
Person dad = family[DAD_INDEX];
Person mom = family[MOM_INDEX];
int numChildren = family.size() - 2;


That is the client code. It assumes exactly one dad and one mom, stored at fixed positions at the front of the list.

How will we handle families with more or fewer than two parents, or same-sex parents?

Replace List With Object to the rescue!! We need a class which represents the family as a unit. So, first off let's subclass list:


class Family: public list<Person>
{
};


Now, change GetFamilyMembersParentsFirst() to return a Family instead of a list<Person>:


Family GetFamilyMembersParentsFirst() {...}


Now, we can change the client to use Family:


enum Parents {
  DAD_INDEX = 0,
  MOM_INDEX = 1
};
Family family = GetFamilyMembersParentsFirst();
Person dad = family[DAD_INDEX];
Person mom = family[MOM_INDEX];
int numChildren = family.size() - 2;


Finally, we can start adding behaviors, like GetNumChildren(). This can be done by extracting the relevant behavior from the client code and moving it to the Family class:


class Family: public list<Person>
{
public:
    int GetNumChildren()
    {
        return size() - 2;
    }
};

...

Family family = GetFamilyMembersParentsFirst();
...
int numChildren = family.GetNumChildren();


Now Family can answer our questions about a given family unit. Hooray!

Now, let's take care of those pesky 'mom' and 'dad' variables. Since we're not changing how the class works yet, we'll just extract 'family[DAD_INDEX]' and 'family[MOM_INDEX]' to functions, and move them to the Family class:


class Family: public list<Person>{
public:
     Person Dad()
     {
         return (*this)[DAD_INDEX];
     }
     Person Mom()
     {
         return (*this)[MOM_INDEX];
     }
};
...
Family family = GetFamilyMembersParentsFirst();
Person dad = family.Dad();
Person mom = family.Mom();
int numChildren = family.GetNumChildren();


Hey, now we can add setters:


class Family: public list<Person>
{
private:
    Person dad;
    Person mom;
    list<Person> children;
public:
    void Dad(Person dad) { this->dad = dad; }
    void Mom(Person mom) { this->mom = mom; }
    void AddChild(Person child) { children.push_back(child); }
    ...
};


And use them within GetFamilyMembersParentsFirst():


Family GetFamilyMembersParentsFirst()
{
    Family family;
    ...
    family.Dad(GetMaleTaxPayer());
    family.Mom(GetFemaleTaxPayer());
    ...
}


Finally, we can change Family to not subclass list<Person>:


class Family
{
...
};


And everything still compiles and works! (Exactly what we would expect out of a good refactoring.)
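
For reference, here is a minimal sketch of where Family might end up once it no longer derives from list<Person>. It is not the exact code from the steps above: Person is a stand-in struct invented for the example, std::vector replaces the unspecified list type, the const getters are my own choice, and the names in MakeExampleFamily() are made up.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in; the real Person type is not shown in the post.
struct Person {
    std::string name;
};

// Family owns its members directly instead of inheriting from a list.
class Family {
public:
    // Setters, as introduced in the refactoring.
    void Dad(Person dad) { dad_ = std::move(dad); }
    void Mom(Person mom) { mom_ = std::move(mom); }
    void AddChild(Person child) { children_.push_back(std::move(child)); }

    // Getters that replace family[DAD_INDEX] and family[MOM_INDEX].
    const Person& Dad() const { return dad_; }
    const Person& Mom() const { return mom_; }

    // Replaces the client-side "family.size() - 2" arithmetic.
    int GetNumChildren() const { return static_cast<int>(children_.size()); }

private:
    Person dad_;
    Person mom_;
    std::vector<Person> children_;
};

// Example construction, in the spirit of GetFamilyMembersParentsFirst().
Family MakeExampleFamily() {
    Family family;
    family.Dad(Person{"Alex"});
    family.Mom(Person{"Sam"});
    family.AddChild(Person{"Clara"});
    family.AddChild(Person{"Charlotte"});
    return family;
}
```

Because the client only ever calls Dad(), Mom() and GetNumChildren(), it does not notice that the parents moved out of the list and into their own fields.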

Tuesday, December 4, 2012

Why Should I Limit My Work-In-Progress? A Tale Of Babies

Clara and Charlotte were born two days after Valentine's Day. Even though I'd already welcomed two other little girls into this world, this event was a bit scary. It was my wife's first C-Section, the first time any of our kids stayed in the NICU, and the first time there were two babies at one time.

Most parents with both multiples and a single child will tell you that the first year with two babies is more than twice as difficult as with a single baby. Why? With one baby, there's a time every day when someone's taking a nap. When there's screaming, no one else is getting woken up. They pull only your hair, not each other's.

Having twin babies can seem easier at times, too. They can entertain themselves earlier than one baby can. But, that really only lasts a few minutes. Making your house child-proof isn't any harder, either; all of the dangerous things still have to be found and picked up every few hours. (Toddlers aren't particularly careful where they leave tiny hazards.)

Your work items are like your babies. They require attention, care and often love in order to graduate from In Progress to Done (or even Handed Off). It will often seem like you can get more done by having more babies. But, having multiple tasks or stories raises your overhead by forcing context switching and increasing the likelihood of mistakes being found later in your process ("How'd you get that plastic in your mouth? Oh, right; your sister distracted me for three seconds."). You cannot manage one task if you have another one interrupting you constantly.

"But," you might say, "I'm getting so much done!" It's possible, but unlikely, that you're seeing a large amount of work getting done. More likely is that you're fooling yourself. How are you measuring progress?

My recommendation is to adopt a Kanban board without Work-In-Progress limits and start measuring what your current velocity is. Then start to experiment. Play with different limits. Adjust your definition of 'In Progress' to include Blocked items.

Sometimes, you're 'stuck' with a high WIP. But, unless it's your kids, you can almost always assert a priority and give the most important task your whole attention and love.

Friday, May 18, 2012

Agile Does Not Mean Being Micromanaged


Today I joined some friends (ex-coworkers) in a goodbye party for an ex-coworker who is moving on to greener pastures. I haven't seen several of these people in a few years, so naturally they asked "What are you doing now?" And I told them that I am an agile coach. And I got asked "And you haven't killed yourself yet?"

Clearly, something is wrong here.

A little background: the company I used to work for was acquired and 'Agile' was imposed upon my old friends. This is an 'Agile' which may be far too familiar to some of you, but I've only heard stories of it before. At the party, two separate people told me they thought 'Agile' was intended to turn programmers into replaceable sprockets who need not think. For example, if they come across a bug in the system, they are supposed to create a defect in the tracking system and move on. Someone will add it to a sprint sometime. Or not. All decisions are made by project managers, and handed down.

The Agile Manifesto and the Agile Principles are a good place to start to validate whether an organization is acting in an agile way. "Individuals and interactions over processes and tools" from the manifesto, and "Build projects around motivated individuals" from the principles both give us an indication of the role of management in these environments: contribute, don't control.

Sunday, May 6, 2012

Why We Refactor

I have always enjoyed Mike Taylor's Reinvigorated Programmer, probably because he so obviously enjoys several things I also enjoy (Doctor Who, programming, Buffy The Vampire Slayer, sushi). I was perusing some of his posts from the last year and came across the following comment regarding the process of refactoring code:
"the purpose of all that shuffling is really only to get the deck clear so we can concentrate properly on the hard bits without getting distracted by crud".


I think this is missing the point. Refactoring gives us tools for rigorously transforming code. There are several reasons why this might be useful. One reason is that the programmer before us was an incompetent buffoon. Another reason is that we work in and with teams, and collaborative work tends to get messy with time. We need to perform housecleaning so the technical debt doesn't cause us to go bankrupt.

These reasons are largely in line with what Mike is suggesting, I think. However, there's another reason to perform refactorings on a regular basis, and it is crucial to how we work as programmers.

The code you wrote six months ago (or six weeks ago) solved one particular problem pretty well. This is probably not exactly the problem you're trying to solve anymore. The requirements may have changed (accounting suddenly really _does_ want to run reports against itemized receipts), or maybe you just saw a better way to code the whole thing. Now you've got to pull all the affected abstractions up to the new world order. Classes start to fit together differently. Design patterns get replaced by different patterns, or disappear altogether. Engines suddenly find themselves servicing wholly new systems.

The way forward is with the refactoring tools. Instead of rewriting the system to conform to our new way of thinking, we can transform it. We can mold it so it continues to fit our needs. And we prefer this to a rewrite because it is less costly, we can deliver with only half of the transformation done (usually) without unduly impacting features, and the new designs practically write themselves.

I think "the hard bits" Mike references are exactly where refactoring shines. Making sure the performance profile is acceptable is a refactoring task. Ensuring the application is extensible through plugins is an exercise in refactoring. Extracting out the timer behaviors from the license module to be reusable is refactoring; so is reconciling the 3 different timers you just found in the system, because they were disguised as something else. Refactoring touches nearly everything we do, so we really need to make sure we do it well.

Tuesday, May 1, 2012

How Do You Name Your Tests?


How do you name your classes? You probably think of the noun or verb which best describes the behavior you want; perhaps there are some technical details which slip through like the name of a design pattern. But generally, a class' name should tell you what that class is responsible for. (Edit: Jeff Langr and Tim Ottinger have written an excellent article on the mechanics of how to name your tests.)

Methods are easier to name, perhaps because we do it so much more often. Usually, you'll decide that the method is important because of the return value or the side effects on the enclosing class. Then try to communicate that in as few characters as possible.

Test fixtures and methods are different. A test name doesn't communicate what the test does; it communicates what SOMETHING ELSE does.

Let's look at some common naming conventions used in tests. Our System Under Test will be a Video Store billing system.


TEST_F(BillingTest, WhenRentingARegularMovie_GoodCustomerGetsOneFrequentRenterPoint)
TEST_F(BillingTest, CustomerWithNoLateFeesGetsOneFrequentRenterPointWhenRentingOneRegularMovie)


These two examples are pretty standard fixture names. Specifically, they just tell you the system being tested at the macro level. This may be common at the early stage of TDD because you haven't extracted subsystems yet. This is expected to change as development continues.


TEST_F(CustomerTest, WhenRentingARegularMovieGoodCustomerGetsOneFrequentRenterPoint)


This example names the fixture after the subsystem being tested. There's duplication between the test method name and the fixture name because there's a subset of Customers which we're concerned with. This will happen eventually to every test named this way.


TEST_F(GoodCustomerTest, WhenRentingARegularMovieGetsOneFrequentRenterPoint)


This example learned a small lesson, and named the fixture after the general condition of the subsystem, thus eliminating the duplication.


TEST_F(CustomerFrequentRenterPointTest, WhenRentingARegularMovieGoodCustomerGetsOne)


This example tries to describe the effect as if it were the system under test. It is common to confuse the two, but it confuses the writer of the tests as much as the reader.


TEST_F(CustomerInGoodStandingRentingOneRegularMovie, GetsOneFrequentRenterPoint)


The last example demonstrates two features of good naming. The first derives from the fact that the fixture is used to share setup between tests, and so can be explicit about what is being shared.

The second feature is that the test method tells you what behavior to expect before the condition under which to expect it. This puts the emphasis of the test on the property of the system which the test is exercising. You would expect tests in the same fixture to be easy to compare against each other. For example:

TEST_F(CustomerInGoodStandingRentingOneRegularMovie, GetsOneFrequentRenterPoint)
TEST_F(CustomerInGoodStandingRentingOneRegularMovie, Pays1_95ForOneDay)
TEST_F(CustomerInGoodStandingRentingOneRegularMovie, Pays2_95ForTwoDays)

Tests document our systems' expected behavior, and names play an important part in that. You want names which help navigate through that documentation. Put the most important information about your system at the first place the next coder will look: the beginning of the test method name. And if you've got underscores in your name to separate the parts, you're probably doing it wrong.
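
As a rough illustration of that last convention, here is a framework-free sketch in which the fixture is a plain struct whose members hold the shared setup, and each test is a member function named for the expected behavior. Customer, Movie, the pricing rule (1.95 for the first day plus 1.00 for each additional day) and the one-point-per-rental rule are all assumptions inferred from the test names, not a real billing system. With Google Test the struct would become a fixture class and the member functions would become TEST_F bodies.

```cpp
// Hypothetical domain types; amounts are in cents to avoid floating point.
enum class MovieKind { Regular };

struct Movie {
    MovieKind kind;
};

class Customer {
public:
    // Assumed rule: 195 cents for the first day, 100 cents per extra day,
    // and one frequent renter point per regular rental.
    void Rent(const Movie& movie, int days) {
        if (movie.kind == MovieKind::Regular) {
            chargeInCents_ += 195 + 100 * (days - 1);
            frequentRenterPoints_ += 1;
        }
    }
    int ChargeInCents() const { return chargeInCents_; }
    int FrequentRenterPoints() const { return frequentRenterPoints_; }

private:
    int chargeInCents_ = 0;
    int frequentRenterPoints_ = 0;
};

// The fixture name states the shared setup ...
struct CustomerInGoodStandingRentingOneRegularMovie {
    Customer customer;                // freshly created, so no late fees
    Movie movie{MovieKind::Regular};

    // ... and each test name leads with the expected behavior.
    bool GetsOneFrequentRenterPoint() {
        customer.Rent(movie, 1);
        return customer.FrequentRenterPoints() == 1;
    }
    bool Pays1_95ForOneDay() {
        customer.Rent(movie, 1);
        return customer.ChargeInCents() == 195;
    }
    bool Pays2_95ForTwoDays() {
        customer.Rent(movie, 2);
        return customer.ChargeInCents() == 295;
    }
};
```

Each test gets a fresh fixture, so the three checks stay independent of one another, and reading the fixture name followed by the method name gives a complete sentence about the system.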

Monday, March 12, 2012

6 Tips For Writing (Code)

I came across a list of 6 tips for writing by John Steinbeck: http://www.brainpickings.org/index.php/2012/03/12/john-steinbeck-six-tips-on-writing/ . This struck me as being somewhat applicable to programmers, so I tweeted such. And now, I want to expound on that idea.


Abandon the idea that you are ever going to finish. Lose track of the 400 pages and just write one page for each day. ...
This is all about having a maintainable pace. A maintainable pace is valuable to the writer and the programmer in part because it relieves burnout. But it is also valuable to the editor and Product Owner because it provides an unmatched ability to forecast delivery.

... Never correct or rewrite until the whole thing is down. Rewrite in process is usually found to be an excuse for not going on. ...
This is a partial quote, which indicates that the association between writing and programming is a bit loose. However, it does remind me of Kent Beck's "Make it work, make it right, make it fast." Don't refactor or redesign until you've got enough written to know what you're talking about. You may think you need a set of metric unit types, but unless you've got some code which tells you that, you're getting ahead of yourself. Solve only the problems you can prove that you have.


Forget your generalized audience. In the first place, the nameless, faceless audience will scare you to death and in the second place, unlike the theater, it doesn’t exist. ...
I've seen plenty of in-house code which was designed to withstand the application of a malicious or incompetent programmer. It's a waste of everyone's time. You cannot design around a malicious coworker, and you cannot protect yourself against an incompetent programmer. Instead, determine what kind of programmer your organization hires (you, for example), and write code for that person. Your real audience is you two weeks or six months after you wrote the code.


If a scene or a section gets the better of you and you still think you want it—bypass it and go on. When you have finished the whole you can come back to it and then you may find that the reason it gave trouble is because it didn’t belong there.
If you can't think of a good class/method/function/field name, maybe you don't really understand what you're trying to accomplish with it. If you're having problems making your code generic, stop. Come back later when you understand the problem better.


Beware of a scene that becomes too dear to you, dearer than the rest.
This is applicable in nearly every corner of life. Code or architecture which you are attached to becomes difficult to change. Change is the lifeblood of a programmer's process. The code changes as your understanding of the problem changes. The code changes as the customer's demands change. The code changes as you learn new techniques. Anchors will drown you in your seas of change.

If you are using dialogue—say it aloud as you write it. 
This relates to naming things, especially systems of things, and especially test methods. When you say the name of a method, it should be easy to turn into a sentence. The receiver is usually the object, the method is usually the verb and adverb. The parameters are usually the grammatical objects. When you follow this advice, you get a rich vocabulary for the domain. And you almost never succumb to Primitive Obsession.
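
A small sketch of what that can look like, using invented names throughout (Days, Movie, Customer, Rent): the call site reads close to a sentence, and wrapping the rental length in a Days type rather than passing a bare int is one way to steer clear of Primitive Obsession.

```cpp
#include <string>

// Hypothetical value type: a rental length is not just any int.
struct Days {
    int count;
};

struct Movie {
    std::string title;
};

class Customer {
public:
    // Receiver = subject, method = verb, parameters = grammatical objects:
    // "customer rents movie for two days".
    void Rent(const Movie& movie, Days length) {
        lastTitle_ = movie.title;
        daysRented_ += length.count;
    }
    int TotalDaysRented() const { return daysRented_; }
    const std::string& LastTitleRented() const { return lastTitle_; }

private:
    std::string lastTitle_;
    int daysRented_ = 0;
};
```

A call such as customer.Rent(Movie{"Example Movie"}, Days{2}) can be said aloud almost as written, and the compiler rejects a stray int where a Days is expected.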


So, there you have it: six tips for writing code. It's an imperfect but useful mapping from creative writing to programming. Programmers can learn much from authors, and maybe we can teach them a thing or two in return. But keep in mind that they've been at it for a few millennia. Us, not so much.

Monday, December 5, 2011

Reducing Extract Method on a Reduce Loop

Let's say you have code like this, wherein totalYs is being used for multiple purposes, including accumulating the Y values within the collection of Xs.


int totalYs = ...;
Collection<X> xs = ...;
... something that uses totalYs ...

for (X x : xs ) {
    totalYs += x.getY();

}
...
process(totalYs);


You can start isolating the Y accumulation (to extract to a method) by introducing a temporary variable.



int totalYs = ...; 

Collection<X> xs = ...;
...

int tmpTotalYs = 0;
for (X x : xs ) {
    totalYs += x.getY();

}
totalYs += tmpTotalYs;
...
process(totalYs);



Then, replace all instances of totalYs within the loop:




int totalYs = ...;
Collection<X> xs = ...;
...

int tmpTotalYs = 0;
for (X x : xs ) {
    tmpTotalYs += x.getY();
}

totalYs += tmpTotalYs;
...
process(totalYs);


Finally, you can extract your method easily, and inline the temporary variable:


int totalYs = ...; 

Collection<X> xs = ...;
...

totalYs += collectYs(xs);
...
process(totalYs);


Now, totalYs is much easier to manipulate. The lure of this transformation is that tests will pass at every step.

BTW, this will work for any associative operation with an identity value in place of addition.
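
To illustrate that generalization, here is a sketch in C++ rather than the Java-style code above. X and getY stand in for the elided types, and reduceYs and multiplyYs are names I made up. reduceYs takes the identity value and the associative operation as parameters; collectYs is then just the addition case with identity 0.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical element type standing in for X.
struct X {
    int y;
    int getY() const { return y; }
};

// Generalized extraction: fold the Y values with any associative
// operation, starting from that operation's identity value.
template <typename Op>
int reduceYs(const std::vector<X>& xs, int identity, Op op) {
    return std::accumulate(
        xs.begin(), xs.end(), identity,
        [&op](int acc, const X& x) { return op(acc, x.getY()); });
}

// The extracted method from the last step: addition with identity 0.
int collectYs(const std::vector<X>& xs) {
    return reduceYs(xs, 0, std::plus<int>());
}

// Same shape for multiplication, whose identity is 1.
int multiplyYs(const std::vector<X>& xs) {
    return reduceYs(xs, 1, std::multiplies<int>());
}
```

The identity value is what makes the intermediate step safe: the temporary starts at the identity, so combining it into totalYs before the loop has been switched over changes nothing. Associativity is what lets the loop's running total be regrouped into a separate sub-total that is combined once at the end.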

Chris