This is the second in a bunch of posts I have in my head around UI automation testing. Go back and read the first for some more background on why I like this topic. I promise this one is a quicker read.
Fluent Interfaces are what all the cool kids are doing these days. If you’re not up to speed, here’s a link to get you started. Go ahead and read it – it’s an easy read, and I’ve got all the time in the world.
Welcome back. To review, a fluent interface aims to make code more readable with method chaining being a big part of how it accomplishes this. Fluent interfaces are self-referential and usually terminate with a void return.
In C#, extension methods are a convenient way to build fluent interfaces, although ordinary instance methods that return this work too. Extension methods are static methods that let you “bolt” functionality onto a class without having to inherit from it. That’s nifty in other scenarios as well – if you want to extend a sealed class, for instance. In the case of a fluent interface, the idea is to always return the instance you’re extending from your extension methods until you get to a completion. Completions return void, or if you’re not feeling pure they could simply return a different type with the result you’re looking for. You can also switch types as you move through a process, making this approach a great way to implement a workflow.
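As a minimal sketch of that idea, the block below bolts two chainable methods and one completion onto StringBuilder, which is a sealed class. The names FluentStringBuilderExtensions, Say, Shout and Finish are made up for illustration; none of them come from the test framework discussed in this post.

```csharp
using System;
using System.Text;

// Hypothetical extension methods, for illustration only.
public static class FluentStringBuilderExtensions
{
    // Returns the StringBuilder it extends so that calls can be chained.
    public static StringBuilder Say(this StringBuilder sb, string words)
    {
        if (sb.Length > 0)
        {
            sb.Append(' ');
        }
        return sb.Append(words);
    }

    // Builds on Say and keeps the chain going.
    public static StringBuilder Shout(this StringBuilder sb, string words)
    {
        return sb.Say(words.ToUpperInvariant() + "!");
    }

    // A "completion": ends the chain by returning a different type.
    public static string Finish(this StringBuilder sb)
    {
        return sb.ToString();
    }
}
```

With these in scope, new StringBuilder().Say("log in").Shout("done").Finish() reads left to right and produces the string "log in DONE!".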
Right now, I’m super-interested in UI automation for testing. Actually, I’m kind of obsessed. UI automation just feels like magic to me – work out your test case steps once, sprinkle a little code around and voila! You don’t have to click those buttons again – a computer will do the work for you. I’m sure a beachside life filled with Mai Tais and riches will follow. I love the idea of a fluent interface for writing test cases because the flow of the case is much more readable. That helps us reason about what a test is doing. Here’s an example:
[TestCase("redacted", "redacted", ExpectedResult = 4)]
public int TestSecurityTrimmedActionButtons(string UserName, string Password)
{
    int count;
    browser.LoginDriver()
        .Login(UserName, Password)
        .ActionsDriver()
        .ViewSubmission()
        .GetCountOfActionButtons(out count)
        .Logout();
    return count;
}
I think that’s pretty clear – login, view an item, figure out how many “action buttons” there are, and get out of Dodge. There’s a bit of code in behind each of those methods, but there’s pretty good separation of concerns and it’s pretty DRY. Not a terrible piece of work.
The API I’m building is designed around the concept of screen drivers – there’s a separate driver for every screen in the application being built, and I build extension methods for those drivers’ types. Note that the concept of a “screen” is a bit loose – sometimes I’m implementing drivers for controls or frames that appear on different screens – that’s all in the name of writing less code, so I don’t mind that my metaphor doesn’t completely stand up to scrutiny. The goal is more mai tais and riches, not more toiling over a hot keyboard. The drivers themselves are just shims that provide the scaffolding for the extension methods, but they could definitely do more if needed. All drivers inherit from ScreenDriverBase, shown here.
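public class ScreenDriverBase : IScreenshot
{
    public void Logout()
    {
        // Resolve the shared browser from the IoC container
        // (the type argument is assumed to be Selenium's IWebDriver).
        var browser = Mvx.Resolve<IWebDriver>();
        var logoutButton = browser.FindElement(By.ClassName("login-logout"));

        // Hover over the logout button, then click it.
        Actions actions = new Actions(browser);
        actions.MoveToElement(logoutButton);
        actions.Perform();
        logoutButton.Click();

        // Wait up to 30 seconds for the login widget to reappear.
        WebDriverWait wait = new WebDriverWait(browser, TimeSpan.FromSeconds(30));
        wait.Until((b) =>
        {
            return b.FindElement(By.Id("sfLoginWidgetWrp")) != null;
        });
    }
}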
You can’t help but notice that ScreenDriverBase implements the Logout() method. That’s probably a bit dirty, but to me it seemed like the logical place to put this code – you want to be able to log out at the end of any process, and each process is driven by a different screen or screens. I could have implemented this as an extension of ScreenDriverBase, but this actually feels more natural to me for this particular case. Sound off in the comments if you think this is a terrible, awful, rotten thing. I’m all ears.
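To make the driver idea concrete, here is a rough sketch of how the pieces could hang together. Everything in it is an assumption made for illustration: FakeBrowser stands in for the real browser and only records steps, and LoginScreenDriver, ActionsScreenDriver and DriverExtensions are hypothetical names, not the actual classes behind the tests in this post.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the real browser: records steps instead of driving a page.
public class FakeBrowser
{
    public List<string> Steps { get; } = new List<string>();
}

public class ScreenDriverBase
{
    public FakeBrowser Browser { get; }

    public ScreenDriverBase(FakeBrowser browser)
    {
        Browser = browser;
    }

    // Lives on the base class, so any driver can end a chain with it.
    public void Logout()
    {
        Browser.Steps.Add("logout");
    }
}

// Drivers are thin shims: they mostly give extension methods a type to hang on.
public class LoginScreenDriver : ScreenDriverBase
{
    public LoginScreenDriver(FakeBrowser browser) : base(browser) { }
}

public class ActionsScreenDriver : ScreenDriverBase
{
    public ActionsScreenDriver(FakeBrowser browser) : base(browser) { }
}

public static class DriverExtensions
{
    // Entry point: start a chain from the browser.
    public static LoginScreenDriver LoginDriver(this FakeBrowser browser)
    {
        return new LoginScreenDriver(browser);
    }

    // Returns the same driver so the chain stays on the login screen.
    public static LoginScreenDriver Login(this LoginScreenDriver driver, string userName, string password)
    {
        driver.Browser.Steps.Add("login:" + userName);
        return driver;
    }

    // Switching the return type moves the chain on to the next screen.
    public static ActionsScreenDriver ActionsDriver(this ScreenDriverBase driver)
    {
        return new ActionsScreenDriver(driver.Browser);
    }

    public static ActionsScreenDriver ViewSubmission(this ActionsScreenDriver driver)
    {
        driver.Browser.Steps.Add("view-submission");
        return driver;
    }
}
```

A chain such as browser.LoginDriver().Login("tester", "secret").ActionsDriver().ViewSubmission().Logout() then compiles only in an order that makes sense for the screens involved, and Logout() is available at the end of any of them because it sits on the base class.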
So that’s the setup. Here’s the magic.
I want to be able to take screenshots as I proceed through my test. That’s super-helpful, because even if a test passes, the screen could still have some layout issues that my robot helper isn’t going to notice. Screenshots let testers quickly flip through a bunch of pictures looking for anomalies. They’re also helpful when tests fail – the screenshots can show you the state of the screen when things went pear-shaped, which helps developers quickly diagnose what the problem is. I don’t want to implement a screenshot method on every class, and I don’t want to break my fluent interface, so I need a way to return to the type of the driver I’m currently working with, preferably from the instance of the type I already have. Finally, since I’m implementing my fluent interface with extension methods, I need to extend all my drivers and manage the return magic in an extension method, and I shouldn’t do that in a surprising way – life for the people after me will be more enjoyable if I put all like things in the same place while I’m enjoying mai tais on the beach.
Recall that generic methods let developers write methods with type parameters. The thing I hadn’t really thought about with generics before is that you can not only declare a method like
static void Swap<T>(ref T lhs, ref T rhs)
you can also declare a method like
public static T ScreenShot<T>(this ScreenDriverBase driver, string StepName) where T : class
Being able to have the return type be generic means that I can now have this extension method that applies to anything that inherits from ScreenDriverBase return whatever the specified type is. The declaration above isn’t ideal yet, because the where T : class means that I can give any reference type as a type parameter. That means I could use, say, object and the compiler would think that was groovy. From a design perspective, that’s not so groovy – I want to keep everybody on this fluent interface rail that I’ve put them on, so a better choice is
public static T ScreenShot<T>(this ScreenDriverBase driver, string StepName) where T : ScreenDriverBase
Now I’ve constrained everybody to passing in something that inherits from ScreenDriverBase ensuring that I get at least a ScreenDriverBase back from the screenshot method. To get the actual return I want, all I have to do is
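try
{
    // For a driver that doesn't implement IConvertible, ChangeType only
    // succeeds when the driver's runtime type is exactly T.
    return (T)Convert.ChangeType(driver, typeof(T));
}
catch (Exception)
{
    // default(T) is null for a reference type.
    return default(T);
}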
In fact, since I’ve already constrained T to be a class, and more specifically some inheritor of ScreenDriverBase, I can just do a safe cast and call it a day:
return driver as T;
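Technically neither way fails loudly: if the driver isn’t really a T, both hand back null, and the next call in the chain ends up working with a null driver. The first way is also the pickier of the two, because Convert.ChangeType only returns an object that doesn’t implement IConvertible when its runtime type is exactly T, while the cast also accepts subclasses of T. I believe I’ve boxed things in enough to be protected either way. Whichever way I go, my developers and testers can party over the returned driver. Testers can now pop screenshots in wherever makes sense for them with very little effort. Hoorah!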
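Here is a small, self-contained sketch of the generic-return idea, under some loud assumptions: the drivers are empty stand-ins, "taking a screenshot" just records the step name, and ActionsScreenDriver, LoginScreenDriver and ScreenshotExtensions are hypothetical names. ScreenShotAs follows the shape of the declaration above (renamed here so the two variants can sit side by side). Because T appears only in its return type, the compiler can’t infer it, so the caller has to name the driver type. The second method is a variant I’m adding for comparison, not something from the original code: it makes T the type of the receiver itself, which lets the compiler infer T and removes the need for a cast.

```csharp
using System;
using System.Collections.Generic;

public class ScreenDriverBase
{
    // Records step names in place of saving real image files.
    public List<string> Shots { get; } = new List<string>();
}

// Hypothetical drivers, for illustration only.
public class ActionsScreenDriver : ScreenDriverBase { }
public class LoginScreenDriver : ScreenDriverBase { }

public static class ScreenshotExtensions
{
    // Shape from the post: T is only the return type, so callers write
    // driver.ScreenShotAs<ActionsScreenDriver>("Login").
    public static T ScreenShotAs<T>(this ScreenDriverBase driver, string stepName)
        where T : ScreenDriverBase
    {
        driver.Shots.Add(stepName);
        return driver as T; // null when driver is not actually a T
    }

    // Variant (an assumption, not from the post): T is the receiver's own
    // static type, so it is inferred and the driver is returned as-is.
    public static T ScreenShot<T>(this T driver, string stepName)
        where T : ScreenDriverBase
    {
        driver.Shots.Add(stepName);
        return driver;
    }
}
```

With the second variant, a call like actions.ScreenShot("Login") hands back an ActionsScreenDriver without any type argument at the call site, and a mismatched type can’t be requested by accident. The first variant is more flexible, at the cost of returning null when the requested type doesn’t match the driver.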
[TestCase("submitter1", "submitter1", ExpectedResult = 4)]
[ScreenshotAction]
public int TestSecurityTrimmedActionButtons(string UserName, string Password)
{
    int count;
    browser.LoginDriver()
        .Login(UserName, Password)
        .ActionsDriver()
        .ScreenShot("Login")
        .ViewSubmission()
        .GetCountOfActionButtons(out count)
        .Logout();
    return count;
}