Lately I've begun work on Infosys again - a mixture of new features,
bug-fixes and an insane amount of refactoring. Because Infosys is a
pet-project of mine, partly done to experiment and learn, I have no
problem redoing all the core parts now and then - just for kicks.
However, the application also needs to work at the end of the day, and
unit-tests seem a fairly good idea (well, essentially they always do,
but sometimes more than others), so I've set about updating and creating
more unit-tests for Infosys.
Unit-testing is fairly easy to get going with if you've designed your
app properly - which essentially means you've avoided singletons and
globals, and used dependency injection. And you've either gone with OOP
or a functional paradigm (when your unit is a script 2,000 lines long
running in global scope, unit-testing just isn't an option).
The problem
However, even given the above, there may still be a snag or two. For
instance, there is still the good old question of the database. Also, if
you happen to have implementation details that are of protected
visibility, you might face some problems. And, of course, you might just
be in the middle of converting your app to a more reasonable
architecture.
Specifically, for Infosys, the problems I'm facing are:
- lack of data for testing database related classes
- lack of access to internals of classes
- tests must not change live data
The part of the code I'm working on now is an Active Record pattern
hierarchy of data-classes. Most of the methods in the classes I want to
test deal with data, one way or the other - so there's just no testing
without data. Had I been using a Data Mapper pattern, I wouldn't have
had the problem, seeing as then I'd just be able to directly set the
data in the objects. Instead I went for a hierarchy along the lines of:
    class DBObject
    {
        // all db data goes here
        protected $storage = array();

        // various code here, including data loading stuff such as:
        public function findById($id)
        {
            // code that queries the database goes here
        }
    }

    class User extends DBObject
    {
        // various code here that calls inherited methods
    }
Now, the code reuse is nice and I'm happy with that - however, I'm less
happy with not being able to set data directly on the objects. Actually,
I'm unhappy about not being able to set important data like IDs, which
are shielded off from the outside. Even if I were able to do that,
though, I would still face another problem: the data classes use
inherited methods to load other objects, and they also keep track of
state through protected properties. On top of that there's then the
typical problem of avoiding tests messing up the database.
Lack of data - solution
The most obvious solution to the problem is to mock or stub the database
access. Luckily, the data classes are all injected with a DB object that
handles all database queries - this means that if I replace the DB
properly, the data classes will never know the difference but I can
write and run exactly the tests I want, providing the data objects with
the data I need.
There are different ways to go about mocking out a class: you can either
use PHPUnit's built-in methods, or you can subclass the class you need to
test. Both have pros and cons - but normally I'd start with PHPUnit's
mocks, as one of the pros here is that it's easy to get started with.
This means using the inherited getMock() method to create a mock class
dynamically. Or, at least it used to mean that: now you can instead use
the getMockBuilder() method to chain some methods and end up with
the same result. The difference between the two?
    // getMock() way
    $mock = $this->getMock(
        'class_name',
        $methods_to_mock,
        $constructor_params,
        $new_class_name,
        $call_original_constructor, // defaults to true, pass false to disable
        $call_original_clone,       // defaults to true, pass false to disable
        $call_autoload              // defaults to true, pass false to disable
    );

    // getMockBuilder() way
    $mock = $this->getMockBuilder('class_name')
        ->disableOriginalConstructor()
        ->disableAutoload()
        ->getMock();
In case you only need to disable the constructor and autoload, the
getMockBuilder() way is so much more intuitive! It makes understanding
the test code a lot easier too, as you know what the parameters are for
the mock without looking them up.
So, armed with this, I could get to work mocking my DB class to inject
it into my data objects. Essentially, the basics I needed were:
    // build DB mock
    $mock = $this->getMockBuilder('DB')
        ->disableOriginalConstructor()
        ->getMock();
This works and the data objects happily accept the mock. In terms of
dealing with data it doesn't do too much, though. To fix that:
    // accept calls to query()
    $mock->expects($this->any())
        ->method('query')
        ->will($this->returnCallback(array($this, 'mockDBQuery')));

    // accept calls to exec()
    $mock->expects($this->any())
        ->method('exec')
        ->will($this->returnCallback(array($this, 'mockDBExec')));
The point of the above calls is twofold: first, I can provide data to
the objects through their normal channels, and second, I get a
chance to react to the SQL being run. As the SQL is mostly auto-generated,
it's fairly easy to set up rules for it. Currently I've just got a fairly
big method reacting to the various queries run, but I could easily set up
testing so that each test declares which queries it expects to be
run, and anything else would trigger a fail.
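The stricter setup mentioned above, where each test declares its expected queries and everything else fails, might be sketched roughly like this. The QueryExpectations class and its methods are hypothetical names made up for illustration (not part of Infosys or PHPUnit), and the sketch assumes the query arrives as a plain string:

```php
<?php
// Hypothetical helper: a per-test registry of the queries a test expects.
class QueryExpectations
{
    // Maps a query fragment to the rows that should be returned for it.
    protected $expected = array();

    // Fragments that have been matched by at least one query so far.
    protected $seen = array();

    // Declare that a query containing $fragment may run and returns $rows.
    public function expect($fragment, array $rows)
    {
        $this->expected[$fragment] = $rows;
        return $this;
    }

    // Callback target: return the declared rows, or fail on anything else.
    public function handle($query)
    {
        foreach ($this->expected as $fragment => $rows) {
            if (stripos($query, $fragment) !== false) {
                $this->seen[$fragment] = true;
                return $rows;
            }
        }
        throw new Exception('Unexpected query: ' . $query);
    }

    // Declared fragments that no query has matched yet.
    public function unmatched()
    {
        return array_values(
            array_diff(array_keys($this->expected), array_keys($this->seen))
        );
    }
}

$expectations = new QueryExpectations();
$expectations->expect('describe `users`', array(array('Field' => 'id')));
$rows = $expectations->handle('DESCRIBE `users`');
```

In a test, the object could be wired in with returnCallback(array($expectations, 'handle')), and unmatched() could be checked at the end to catch queries that were expected but never run.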
One note about the methods above: making a PHPUnit mock return different
data for multiple calls to the same method can be hard to figure out,
but a bit of googling shows at least two ways:
- using the at() method
- using a callback
The at() method allows you to specify when a given method should be
called and what should happen when it is called. This also allows you to
specify that the same method will be called several times with different
results - normally PHPUnit just overwrites previous expects() for the
same method. With at() you specify the order of calls - however, that is
also the downside: you need to specify exactly which order the calls are
made in. Should anything change, you then need to change your testing code
as well - even though the behaviour of your SUT remains the same (and
that behaviour is essentially the only thing you should care about).
So I'm using the second option, the callback. This works in the same
fashion as callbacks normally do in PHP - provide an array with an
object and the name of the method to call, and PHPUnit will call that
method whenever the mocked method is called, passing along the arguments
and handing the callback's return value back to the caller. That allows me
to do something like:
    public function mockDBQuery()
    {
        $args = func_get_args();
        $query = $args[0];
        $arguments = isset($args[1]) && is_array($args[1]) ? $args[1] : array();

        if ($query instanceof Select) {
            $arguments = $query->getArguments();
            $query = $query->assemble();
        }

        if (stripos($query, 'describe `users`') !== false) {
            return $this->returnUserTableInfo();
        }

        if (stripos($query, 'describe `roles`') !== false) {
            return $this->returnRoleTableInfo();
        }

        throw new Exception('Unexpected query: ' . $query);
    }

    protected function returnUserTableInfo()
    {
        return array(
            'data goes here',
        );
    }

    protected function returnRoleTableInfo()
    {
        return array(
            'data goes here',
        );
    }
Similarly, in the exec() method mock I can check that queries look right
and come with the correct number of parameters.
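As a rough illustration of that kind of check, here is a minimal sketch of what the exec() callback could delegate to. The function name checkExecCall() is hypothetical, and the sketch assumes positional "?" placeholders with no literal question marks elsewhere in the SQL, which may not match how Infosys builds its queries:

```php
<?php
// Hypothetical check for a mocked exec(): compares the number of "?"
// placeholders in the SQL with the number of arguments supplied.
function checkExecCall($query, array $arguments = array())
{
    $placeholders = substr_count($query, '?');

    if ($placeholders !== count($arguments)) {
        throw new Exception(sprintf(
            'Query "%s" has %d placeholders but got %d arguments',
            $query,
            $placeholders,
            count($arguments)
        ));
    }

    // Made-up return value: report one affected row so the caller carries on.
    return 1;
}

$affected = checkExecCall(
    'UPDATE `users` SET `name` = ? WHERE `id` = ?',
    array('Bob', 42)
);
```

Throwing from the callback makes the test fail at the exact call that went wrong, in the same way the "Unexpected query" exception does for query().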
Lack of access to internals of classes
The second problem I'm facing is that some of the behaviour of the data
objects differs based on internal state. More specifically, if they
have been loaded with data from the database, they're aware of that.
What it comes down to is essentially awareness of the object ID: it's only
set when loading the object or creating it in the database, to make sure it
cannot be set or changed from the outside (it would suck pretty bad to
load an object, accidentally change the ID, then save it as a new
object). While it's a good safeguard to have, it also creates problems, as
I'd need to create the object and then load it with data - even though I
have no need for that in my tests (it's best to minimize the amount of
code in tests, because you want to test as few things at a time as
possible, to be able to tell exactly what is breaking).
Lack of access to internals of classes - solution
To get around this problem, I went for the second solution outlined
above: subclassing. The beauty of this solution is that you can have
your cake and eat it: I am not changing the behaviour of my data objects
in any way, yet I still get to set them up exactly as I need them.
Example:
    class DBObject
    {
        protected $has_loaded = false;
        protected $storage = array();

        public function isLoaded()
        {
            return $this->has_loaded;
        }
    }

    class User extends DBObject
    {
        // various code and stuff to test here
    }

    class UserMock extends User
    {
        public function overrideHasLoaded($bool)
        {
            $this->has_loaded = !!$bool;
            return $this;
        }

        public function overrideId($id)
        {
            $this->storage['id'] = $id;
            return $this;
        }
    }
The User class doesn't have the overrideHasLoaded() or overrideId()
methods, and neither does the DBObject class - so by subclassing User and
adding those methods, I'm not changing the behaviour of User, which
means the results of my tests will be valid. This way I can get at the
internals of the User class without exposing any of it to normal
operations. The mock subclasses I create this way live with my tests so
they won't get autoloaded by mistake - which could cause problems by
exposing an interface that shouldn't exist.
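To show how such a subclass might be used in a test, here is a small self-contained sketch. The classes are trimmed copies of the ones above, and the describeState() method on User is a hypothetical stand-in for whatever state-dependent behaviour is actually under test:

```php
<?php
// Trimmed copies of the classes from the example, so this runs on its own.
class DBObject
{
    protected $has_loaded = false;
    protected $storage = array();

    public function isLoaded()
    {
        return $this->has_loaded;
    }
}

class User extends DBObject
{
    // Hypothetical method under test: its result depends on internal state.
    public function describeState()
    {
        return $this->isLoaded() ? 'existing #' . $this->storage['id'] : 'new';
    }
}

class UserMock extends User
{
    public function overrideHasLoaded($bool)
    {
        $this->has_loaded = !!$bool;
        return $this;
    }

    public function overrideId($id)
    {
        $this->storage['id'] = $id;
        return $this;
    }
}

// A fresh object behaves like an unsaved user...
$fresh = new UserMock();

// ...while the overrides put an object into the "loaded" state without
// touching the database at all.
$loaded = new UserMock();
$loaded->overrideId(42)->overrideHasLoaded(true);
```

Because describeState() is inherited unchanged from User, whatever the test observes on $loaded is the real behaviour of User in that state.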
Conclusion
There just really isn't any good reason to avoid unit-testing - not even
a technical one. In fact, once you get started testing stuff, it turns
out it's good fun to figure out how you can make sure your code is
tested! So get to it :)