Java

@RTR_tech

As I write this, today is beach day @RenttheRunway (July 31, 2014).

What does that mean? It means our CEO, Jenn Hyman (@Jenn_RTR), decided that the whole team deserved a day of “Mandatory Fun” at an awesome beach with tennis courts and a pool and the whole nine yards, on the company's dime.

A few days before that Jenn spontaneously ordered an ice cream truck to come to the office and give everyone free ice cream. We're not talking 10 people here - we’re talking about the whole, enough-to-fill-multiple-busses, RTR office.

A few days before that there was a Hawaiian style BBQ/Luau at our Dream Fulfillment Center in Jersey (where your dresses are shipped from).

You get the idea.

But lots of startups have occasional crazy fun surprises like that (right?). What's really telling about a company is the day-to-day. This blog is about life @RTR_tech, not RTR in general. Make no mistake: this is a tech organization and it’s an awesome one.

Let me tell you a story. It's going to be a bit scattered, and probably long-winded, but I think you'll like it.

I promise I'll mention some tech along the way.

A little more than a year ago, I was living with my family in Ann Arbor, MI where I had worked for about 13 years. Ann Arbor is an awesome little college town, but my wife wanted to move back to NY where she grew up. As a prerequisite, I wanted to make sure I found a job in NYC where I could truly be happy. This was more difficult than you might imagine.

I was looking for a small-ish, startup-ish, flexible, open-minded tech company with outstanding people.

I eventually discovered a large list of NYC companies of this ilk. Cruising down the list in alphabetical order, I eventually came to Rent the Runway.

"A fashion/e-commerce company?," I thought, "meh."  I passed right over it.

Fortunately, it didn't end there. The CTO of another company (whose selfless kindness I should have repaid long ago) reconnected me with RTR_tech via CTO Camille Fournier (@skamille). I had a phone call with her and was extremely impressed. The rest is history. I also spoke to Vijay (@vjsubr), head of Analytics, Jenn Hyman and various awesome members of the tech team and I was convinced.

One year later, I can tell you that this company is far from just a "fashion/e-commerce company" and is anything but "meh."

These people are awesome and humble. They learn from their mistakes and they learn fast. They actively seek feedback. They listen deeply. They really, honestly care about their teammates. They have created a culture of openness, idea sharing, risk-taking, fun, friendliness, honesty, continuous improvement, teaching and in-the-end, success. And they pay me to hang out with them all day!

What follows is just some of the awesome stuff I’ve had the pleasure of living with @RTR_tech.

First off, we use modern technology tempered with a good bit of wisdom that the “Shiny New Thing” is not always as awesome as it seems. The grass is not always greener on the other side. We’ve adopted a lean-and-mean, modern Java web service framework called Dropwizard. Not Node. Not Scala. Not Python. They're all cool, don't get me wrong, but Java is tried and true and way cooler than you might think if you just take the time to get to know it. :) Plus: Java 8! And cool stuff like @AutoValue. We're working hard to move our front end code to Backbone. There's some Ruby in the mix for good measure. Plus, have you seen what renttherunway.com looks like? It's freaking beautiful! We work with awesome designers and we really care about making it not just work right, but also look great.

We have regular, weekly one-on-one meetings with managers, team leads and peers. These are not perfunctory tech status updates. In these meetings we talk openly about what's bugging us. We refine our goals and take steps towards achieving them. We work to carve out time for people to work on that open source contribution they've had in mind (making @AutoValue work with Hibernate perhaps). As a manager, one of my personal rules of thumb is that performance reviews should not contain any negative surprises (or only good surprises). I've learned to use this one-on-one time to make sure that's the case and that I'm guiding/teaching (and learning from!) my reports to avoid those negative surprises. Just a few days ago I was writing a performance review. When I really got into the meat of it, I realized that I had some new suggestions around goals the person could set, but that we hadn't talked about yet. Before I wrote those things into the review, I grabbed the person and talked through the suggestions. The next day, we had the Important Official Mid-Year Review meeting. I delivered my write-up, and watched the pride swell up in the individual. This is not to say there was no constructive feedback, but that that feedback had already been well established and internalized. That's how you do performance reviews. We do that here @RTR_tech. If you're on my team, you'll get a performance review like that. I really care - we really care - about our teammates as people. In the end, that's all we have. Java code is easy. People are hard. Help them refine and follow their goals. Help them thrive. Help them learn. Help us learn.

@RTR_tech we are a learning organization from top to bottom. As such we’re encouraged (by @skamille and others) to attend technical conferences. RTR_tech will send you to speak or just attend. And of course, @skamille, @markwunsch, and @OMGannaks held a panel discussion @RTR_teach - oops, @RTR_tech - about how to write successful conference presentation proposals, because… they're awesome.

Similarly, we do Drinks and Demos every Friday. This is our chance to share essentially whatever we want as engineers. Maybe it's @timx showing off the new RTR Unlimited product/feature his team just released or @ericqweinstein teaching us some new stuff about "The Rubies." It could be @CarloBarbara doing a case study of how to systematically debug a tough OutOfMemoryError, because, as @skamille says, "don't flail."

Speaking of "don't flail:" you can learn that and so much more by going to @skamille's weekly office hours and having a great discussion with her. I'm not exaggerating that every time you sit down to talk with her, you'll walk away having learned something or with a new perspective on something. She maintains these office hours as a way to remain accessible and keep the communication lines open despite the fact that our tech team is quite large and growing rapidly. How accessible? Our awesome intern Maude (@QcMaude - A Post Internship Look at RTR) has mentioned to me how great it is that she is able to openly talk to Camille as she would any other peer.

There are lots of big brains out there in the world though, and we can't send the entire tech team to the next “Edgy Tech Conference.” In order to fill that gap we bring in guest speakers from time to time. We’ve invited people like the creator of Backbone (@jashkenas) and (@mrb_bk), an engineer from Code Climate (which we use for JS/Ruby static analysis).

"What if," you say, "I get a crazy idea that I just want to try and I never have time to build it?" You'll have your chance! In the year I've been here, we've had a hack day, a (3-day) hack “week” and a full, five-day hack week the last week of August. On hack week you get to work on (more or less) whatever you want as long as it’s vaguely related to making RTR more awesome. Best of all, these hack week projects often turn into real, even huge, projects that alter the roadmap of the company! For example, on renttherunway.com we have a feature where users can upload their own photos. Unfortunately, the quality of these photos varies wildly. What if we had a way to automatically give each photo a quality score so that we could show the higher quality photos more prominently? I happen to have a background in machine vision. For my first hack day project, I built a first cut of such an image quality metric. It was simple, but actually showed visible improvements. (In a nutshell, the metric gave preference to images with 1 or 2 faces and images that are overall brighter.) Another example? Our huge new feature, a whole new subscription-based rental model called RTR Unlimited, started life as a hack week project.

But wait! There's more! A recent new hire asked me if RTR_tech does anything with open source. Indeed, we do! For starters, we are committers and some of the main promoters of the aforementioned Dropwizard. We also have more than one OSS project that we produced internally: Alchemy for A/B testing and Conduit for simplifying the use of message queues. I'm certain these projects are just the beginning. In fact, it's worth noting that our internal software development process is modeled to a degree on open source development. We use pull requests for everything, no matter how small, and naturally, have awesome unit test coverage. This is a great way to develop and combines the best of OSS development with the benefits of being in an office right next to your teammates. For one thing, I have found pervasive code reviews to be an excellent way to spin up new people on company standards or languages that are entirely new to them. It really works and has allowed me to remove one worry from my list of "oh man, new person, so much to cover."

All of this sounds fantastic, but we’re a startup so we must be working like dogs, right?

Well...

We have unlimited vacation. Now I admit to being a bit (a lot) cynical. On joining RTR_tech, I assumed that "unlimited vacation" was code for "guilt-based vacation." I was wrong. This is simply not the case. People take lots of time off, myself included, and I haven't seen any guilt-tripping at all. Remember, we're talking about real people here. Awesome people. It turns out, we respect each other’s need for time off. When @MichelleWernick goes to Paris, we all get excited for her and people step up to help fill the gap in QA. When @skamille goes to Hawaii, we don't flood her inbox; we step up and exercise our latent CTO superpowers. When she gets back, there's Hawaiian candy on the kitchen counter and funny stories about emergency room visits.

In fact, it doesn't end with unlimited vacation. We have stellar work/life balance in general. @RTR_tech we understand that the trick is to anticipate, plan, and course-correct a project early. We work hard to avoid feature creep and instead focus on quality. We work smarter, not harder. In this way, we deliver awesome software without people freaking out at the last minute, having to put out fires, and working 12-hour days. Sure, we're not perfect at it, but we've had major successes and we believe in it. We're continuously improving. (How about a Drinks and Demos presentation about what went right and what went wrong in that last big project release? OK!) We embody this in other ways too: Unlimited vacation begets unlimited maternity leave. (I've seen it more than once! It happens! It works! All companies should do it!) And after your maternity leave, maybe you want to have your newborn brought by the office every day so you can take time to feed it. Of course! Who wouldn't allow that? I have a son and a wife. I get to go home and eat dinner with them and when I return to the office the next day, renttherunway.com is still there and I get to do more cool work on it with my fantastic team! I'm more loyal for the stability it provides. I take the time to write a blog entry because of it, and I spend that much more energy recruiting the next awesome teammate because I love telling them about it! What if you don't have a family? What if you're young and unattached? RTR is in NYC! Go enjoy living in New York Effing City!

Did I mention we're encouraged to write articles for our tech blog? Perhaps you've seen said blog? These kinds of things are literally listed in our quarterly tech team goals. "1) Ship feature X so we can rent some more dresses. 2) Post 6 tech blog entries."

OK, I promise I'm almost done gushing, but stick with me a bit longer. There are so many more cool things to tell you.

I had a new junior engineer start recently. I gave him a week or two to settle in before I gave him his first major task. Let me set the stage: we're nearing completion of some initial research that uses Python to do Markov Chain Monte Carlo simulation for parameter inference on a Bayesian Network. (To predict the future. NBD.) I asked New Guy to determine if it's viable for us to port that to Java for production use. (The answer, two weeks later, appears to be yes, if a bit inelegantly, via Yadas (Yada Yada Yadas).) So that's a thing we do @RTR_tech. Rad, right?

Our interview process is an area we've recently been working to improve (because we're doing lots of it). So how are we doing? Yesterday, I got overwhelmingly positive feedback from a new hire. He thought that it was fantastic to have an RTR_tech engineer guiding him through the whole interview process, answering his questions, being open, honest and enthusiastic. Shortly thereafter, he took the gig. Sounds like it's working!

Did you notice that I used one or two (or forty-two) Twitter handles here? That's because I'd love for you to join our conversation! I want to hear from you! What are we missing? What more can we do? To my fellow RTR_tech-ers (is that a thing?): what is life @RTR_tech to you? Follow us on Twitter @RTR_tech! But don't stop there! Connect with these awesome people about awesome stuff: @skamille, @ericqweinstein, @markwunsch, @bhsdrew, @CarloBarbara, @timx, and many more (use transitive closure to find everyone!)….

This is life @RTR_tech.

P.S. - I really wanted to figure out how I could work in a joke about Talk Like A Pirate Day being a big deal at RTR, but, as any good engineer knows, sometimes you have to kill your little darlings.


A Rubyist's View of Modern Java

An Unpleasant Introduction

When I was in college (back in 2007), "learning Java" meant "forging Java applets in the dark heart of Mount Eclipse." As you would imagine, this involved a lot of import java.awt.* and public class SelfLoathing extends Applet and lava and blood and Uruk-hai. (Some of this is certainly my own failing; my boss, who knows much more about Java than I do, loves Eclipse.) I don't think anyone died as a direct result of writing old-school Java or attempting to use Eclipse, but I witnessed at least one grown man reduced to tears. I finished the semester and swore I would never write Java again.

The Language (Java 8, or "The Ocho")

Java has come a long way in seven years. A more comprehensive guide to those changes is available in this excellent series of posts over on the Parallel Universe blog; for the points most salient to Rubyists, Pythonistas, JavaScriveners, PHPeons, and others of more dynamically-typed persuasions, I've made a short list of the delightful bits of modern Java we're using here at Rent the Runway (TL;DR at the bottom).

To reverse each string in a list in Java 7 and earlier, you used to have to pull a stunt like this (assuming you weren't in an interview and were allowed to use Java's built-in String methods):
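
A minimal sketch of that kind of stunt, written in the pre-Java 8 style with an explicit loop. The class and method names (ReverseStringsJava7, reverseAll) and the sample words are made up for illustration, not taken from our codebase:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name is an assumption for illustration.
final class ReverseStringsJava7 {

    // Builds a new list by hand, reversing one string per loop iteration.
    static List<String> reverseAll(List<String> words) {
        List<String> reversed = new ArrayList<String>(words.size());
        for (String word : words) {
            // StringBuilder.reverse() is the built-in doing the real work.
            reversed.add(new StringBuilder(word).reverse().toString());
        }
        return reversed;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("dress", "runway", "java");
        System.out.println(reverseAll(words));
    }
}
```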

To a Rubyist, this is something of a cross between waiting in line at the DMV and burning a hecatomb: quite a bit of rigmarole to obtain a simple result (a driver's license, a favorable wind toward Ithaka). Now, you could write truly Javatastic Ruby:

However, you'd be much more likely to write:

Or, if you're feeling fancy:

To quote Eddie Izzard: et voilà. Truly, Ruby is an elegant language for a more civilized age. Could Java ever look like this? Surprisingly, the answer is yes! In Java 8—a.k.a. The Ocho™—you can now write:
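
Something along these lines, sketched with the Java 8 Stream API. As before, ReverseStringsJava8 and reverseAll are illustrative names I picked, not anything from a library:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; the name is an assumption for illustration.
final class ReverseStringsJava8 {

    // Maps each string to its reverse and collects the results into a new list.
    static List<String> reverseAll(List<String> words) {
        return words.stream()
                .map(word -> new StringBuilder(word).reverse().toString())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("dress", "runway", "java");
        System.out.println(reverseAll(words));
    }
}
```

The loop and the temporary list are gone; what is left reads as "map each word to its reverse," which is about as close to Ruby's map as Java gets.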

Is this as wond'rous as the one-line Ruby version in mine eyes? No. But it's leaps and bounds above the imperative/procedural style to which Java, the quintessential object-oriented language of the 20th (and now 21st) century, has been shackled for the better part of two decades. And this is really only scratching the surface of what's available in The Ocho™ (I'm gonna make this a thing, I swear): better type inference, method parameter reflection, and fibers (sort of) are just a few of the additions that are turning Java into modern Java (and making modern Java a really pleasant language to write).

The Ecosystem (Frameworks and Tooling)

I've found that even more important than the features of the language you write is the ecosystem in which it lives; one of the reasons I quit Haskell was because its package manager, Cabal, was dreadful. (It probably still is, though I haven't checked recently.) Java's package management systems are well-known and mature (I'm a fan of Maven myself), but the same can't be said for the myriad Java web frameworks out there.

First, I'm going to straight up skip J2EE: partly because I'm not old enough to remember it, but mostly because anyone who is probably doesn't want to.

Some people will tell you about "modern" web frameworks like Spring Web MVC or Play. Spring is (near as I can tell) not much more than a wrapper around all the things nobody likes about Java EE, and I haven't been able to discern any core philosophy with regard to best practices or patterns for solving common web service problems. And if this were the worst of these frameworks—if the design were confusing but the underlying technology sound—that would be one thing. I've actually witnessed Play, via some JNI-based Unforgivable Curse, segfault when you added a route. Which is, I suppose, what you get for wanting to do something insane like that.

Enter Dropwizard, which is not only sane, but absolutely delightful. Despite my limited Java superpowers, I got the "Hello, world!" service up and running in about fifteen minutes. Everything worked as expected. It made sense. Though it's a young project (started in 2011 and recently released version 0.7), there's excellent documentation and a helpful (though still somewhat small) community. I'm really pleased it's our framework of choice at Rent the Runway, and I've actually started going out of my way to read Java pull requests to learn more about it and how we use it.

Finally, I'd be remiss in my short survey of modern Java without a brief discussion of IDEs. As mentioned, I despise Eclipse and wouldn't mind seeing it blotted out of all human memory, but I strongly believe in knowing (and, ideally, liking) your tools, so I think anyone productive with Eclipse should continue to use it. That said, I picked IntelliJ when I came back to Java earlier this year, and it's excellent: on-the-fly warnings (which I generate liberally), code completion, automatic refactoring, and (best of all) immediate and smooth integration with Java 8. IntelliJ's certainly not a new tool, but its newest version (13) is, in my humble opinion, the editor of choice for modern Java.

TL;DR

  • Java 8 is a serious improvement on the core language, and I think programmers everywhere (particularly those writing Ruby, Python, JavaScript, and PHP) owe it another look;
  • Dropwizard is a first-class web service framework and the first Java web framework I'm really excited to dig into;
  • IntelliJ makes developing in Java (particularly Java 8) a breeze, and I wholeheartedly recommend it. (This is from a guy who spends 90% of his day in Vim.)

Am I going to switch to only writing Java tomorrow? Of course not. But with Java 8, Dropwizard, and IntelliJ, I'm interested in learning more about the language, its platform, and its tools, and I'm excited to build something great with them.


Cathartic Code Cleanup with @AutoValue


Sometimes removing code is more satisfying than adding it. For whatever reason, be it an appreciation for clean, elegant software, or a hint of OCD in my personality, I've been known to remove a lot of code in my commits. It's always a great feeling to deprecate that last usage of an old system, refactor duplicated functions, or eliminate some boilerplate code. The latest code cleanup mechanism we've been using on the Pricing team here at Rent the Runway is a tool called @AutoValue (that's a Java annotation, not a Twitter handle), developed and recently open-sourced by Google.

As the name implies, AutoValue helps automate the implementation of value classes: classes that exist just to encapsulate some fields/values and don't provide much additional logic. As an example value class, we have an 'Item' class at RTR which contains an ID, a barcode, and a few other fields.

A traditional Java implementation of a class like this can be a hundred lines of code (our Item class is almost that -- 94 lines), after you put in getters, setters, hashCode, equals, and toString functions. Most programmers will auto-generate these with their IDE, but the functions are a burden to review and maintain. The AutoValue documentation's description of these functions is spot on:

"Their wide expanses of boilerplate sharply decrease the signal-to-noise ratio of your code, and constitute probably the single greatest violation of the DRY (Don’t Repeat Yourself) principle we are ever forced to commit."

Why should a class that just holds a few fields take up a hundred lines of code?

That’s where AutoValue comes in. Our 94-line Item class was reduced to 21 lines once we AutoValue'd it. That’s a 78% cut.

Figure 1: Code bloat reduction after 30 minutes' usage of @AutoValue (before and after screenshots) ** Actual results may vary

But in all seriousness, if you take a close look at the before and after shots, you’ll notice the new class has exactly the info you want, and nothing else.

Here's the before shot, boilerplate galore:

Figure 2: Item.java - Without AutoValue

[gist]https://gist.github.com/rtannenbaum/9922218[/gist]

... and the after shot, simplified with AutoValue:

Figure 3: Item.java - With AutoValue

[gist]https://gist.github.com/rtannenbaum/9922291[/gist]

So how does this work?

All you have to do is create an abstract class annotated with @AutoValue, and include a static creator and getters for your fields. Behind the scenes, the AutoValue annotation processor generates derived source code, which you never have to see (but can if you want).

For example, here's the source code that AutoValue generated for our Item class, retrieved from a hidden compiler output directory:

Figure 4: Generated source code [gist]https://gist.github.com/rtannenbaum/9939177[/gist]
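
To make the pattern concrete without pulling in the library, here is a minimal hand-written sketch of the same shape. The two fields (id and barcode) come from the Item description above; their types, the HandWrittenItem name, and the null-check message are assumptions. With AutoValue, the abstract class would carry the @AutoValue annotation and the subclass would be generated for you rather than typed out:

```java
import java.util.Objects;

// The part you would write yourself: a static creator plus abstract getters.
abstract class Item {
    static Item create(long id, String barcode) {
        // With AutoValue this would instantiate the generated subclass instead.
        return new HandWrittenItem(id, barcode);
    }

    abstract long id();
    abstract String barcode();
}

// Hand-written stand-in for the generated code (hypothetical name).
final class HandWrittenItem extends Item {
    private final long id;
    private final String barcode;

    HandWrittenItem(long id, String barcode) {
        this.id = id;
        // Null check on a non-@Nullable field, done eagerly at construction.
        this.barcode = Objects.requireNonNull(barcode, "Null barcode");
    }

    @Override long id() { return id; }
    @Override String barcode() { return barcode; }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Item)) return false;
        Item that = (Item) o;
        return id == that.id() && barcode.equals(that.barcode());
    }

    @Override public int hashCode() { return Objects.hash(id, barcode); }

    @Override public String toString() {
        return "Item{id=" + id + ", barcode=" + barcode + "}";
    }
}
```

Everything in HandWrittenItem is the boilerplate you no longer have to write, review, or maintain.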

Some features, benefits, and uses

1. Prefer AutoValue over tuples. Tuples are often used as a convenient way to quickly represent a collection of fields, but the tuple obfuscates the fields’ meanings and relationships to one another. With AutoValue you can create a much better representation of your data, almost as quickly and succinctly.

2. JSON serialization is easy with Jackson annotations. Just annotate the creator with @JsonCreator and the fields with @JsonProperty, and your class will serialize to JSON.

Figure 5a: Jackson serialization [gist]https://gist.github.com/rtannenbaum/9922330[/gist]

Figure 5b: Serialized JSON [gist]https://gist.github.com/rtannenbaum/9922344[/gist]

3. AutoValue helps prevent null pointers. It includes null checks on every field by default, unless you specify @Nullable fields in the creator. It’s easy to forget null checks and propagate null pointer exceptions (see Figure 2 -- no null checks), so it’s nice that AutoValue takes care of this for you.

4. AutoValue makes your classes immutable. It’s easy to forget to do this too (see Figure 2 again -- no final fields; this is the last time I’ll pick on this code ...), so it’s nice that AutoValue takes care of this as well. (Here are some benefits of immutability)

If you do want mutable fields, AutoValue doesn’t support it directly, but it’s easy to work around. In our Item class, say we want to make ‘barcode’ mutable. Including a 'withBarcode' function kind of cheats (by making a new object), but is a clean way to change a field's value if you don’t mind the overhead of a new instance:

Figure 6: “Mutable” fields

[gist]https://gist.github.com/rtannenbaum/9922296[/gist]
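
Stripped of the AutoValue machinery, the "with" trick looks roughly like this. It is a plain-Java sketch under assumptions: the ImmutableItem name and the field types are invented for illustration, and only withBarcode corresponds to the function described above:

```java
import java.util.Objects;

// Hypothetical immutable value class; the name and field types are assumptions.
final class ImmutableItem {
    private final long id;
    private final String barcode;

    private ImmutableItem(long id, String barcode) {
        this.id = id;
        this.barcode = Objects.requireNonNull(barcode, "Null barcode");
    }

    static ImmutableItem create(long id, String barcode) {
        return new ImmutableItem(id, barcode);
    }

    long id() { return id; }
    String barcode() { return barcode; }

    // "Mutation" by copy: returns a new instance and leaves this one untouched.
    ImmutableItem withBarcode(String newBarcode) {
        return create(id, newBarcode);
    }
}
```

Callers that still hold the old instance keep seeing the old barcode, which is the whole point of staying immutable.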

5. … and many more in the AutoValue documentation.

Maven

You may need to tweak this slightly depending on your Maven configuration, but we integrated AutoValue into our build using the following pom.xml entries:

- auto-value dependency: [gist]https://gist.github.com/rtannenbaum/9922381[/gist]

- annotation processor plugin [gist]https://gist.github.com/rtannenbaum/9922396[/gist]

- compiler plugin, with annotation processing disabled in the compile phase (compilerArgument -proc:none) to prevent duplicate source files from being generated: [gist]https://gist.github.com/rtannenbaum/9922409[/gist]

That’s all. We hope you have fun reducing your code bloat too!


Our Stack. Scalability Is Key!


Intro

We use Java and we like it! When I say this to some people, they cringe. They look at me like I'm a dinosaur who doesn't realize his extinction is coming. I usually smirk; I get it. These are folks who are so in love with Ruby/Rails or Python/Django that they forget it's just a tool for solving a certain kind of problem. I can understand why: they are both great tools! I built my personal site in Rails and it was a lot of fun. It's easy to iterate, and the community makes adding value very simple. The problem is that when you fall in love with your hammer, everything looks like a nail. In this post, I'll tell you some of the problems we faced, and why they made a pure 100% Rails app a bad choice for us. Then we will talk about how we are thinking about leveraging Ruby in our stack. But first, a little history is required.

Our Old Stack

Over three years ago our founders hired some consultants to build an MVP for their vision. They came back with PHP/Drupal and MySQL, a monolithic architecture. It was functional, allowing the founders to start growing the business. Customers were trying it out, and loving the service. The business had legs and traffic was steadily rising. Fast forward to last year, and our infrastructure was handling the flow, but it was creaking loudly. This architecture was fragile, and it didn't scale well in any direction.

Some of the issues we had with the codebase included:

  • Monolithic code base. How do you make changes to one part without releasing everything else? How do you do this with 20 devs? 100?
  • How do you improve performance in this architecture? Caching will help some if your data is mostly static. Scaling horizontally at this level may not be enough to increase throughput: it's too coarse, and it requires logical separation to avoid DB concurrency issues.
  • Concerns weren't separated, so one component going down risks taking down the whole site. Ideally you have "swim lanes", basically silos where if any component in the tech stack goes down, only that silo is affected. The product detail page can go down without taking down checkout; the homepage can be unavailable but those on product pages don't notice.
  • How do you support multiple platforms? We don't want to write the same database logic in many different places, and the monolithic code base won't support apps for iPhone, Android, or iPad.
  • The Drupal code was spaghetti code. There was SQL in the views and controllers, which meant even making simple view changes was complex and risky. Adding insult to injury, the test coverage was non-existent. This made iterating a dangerous task.

Making it Scale

We like to think of scalability in terms of the AKF Scalability Cube[1]. There are three axes for scaling your application:

  1. The Y-axis (splitting the architecture out by service or function)
  2. The X-axis (creating N instances of a component, all replicas except one designated for writes, fronted by a load balancer)
  3. The Z-axis (splitting resources by user characteristics, e.g. the West Coast on one pool, the East on another)

The further along these axes you get, the better your application scales. We knew we needed high availability and concurrency, so we needed to scale in multiple directions.
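
As a toy illustration of how the three axes combine, here is a sketch of a router that picks a pool by service (Y) and region (Z), then round-robins across the replicas in that pool (X). Every name in it (ScaleCubeRouter, register, route, the service and host strings) is hypothetical, and it ignores the designated-writer replica entirely; it is meant to show the idea, not our actual load balancing:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical router; a sketch of the cube's axes, not production code.
final class ScaleCubeRouter {
    // One pool per (service, region) pair: Y axis and Z axis together.
    private final Map<String, List<String>> pools = new HashMap<>();
    // One round-robin cursor per pool: X axis.
    private final Map<String, AtomicInteger> cursors = new HashMap<>();

    void register(String service, String region, List<String> replicas) {
        String key = service + "/" + region;
        pools.put(key, replicas);
        cursors.put(key, new AtomicInteger());
    }

    // Returns the next replica for this service and region, cycling through the pool.
    String route(String service, String region) {
        String key = service + "/" + region;
        List<String> replicas = pools.get(key);
        if (replicas == null || replicas.isEmpty()) {
            throw new IllegalArgumentException("No pool registered for " + key);
        }
        int next = Math.floorMod(cursors.get(key).getAndIncrement(), replicas.size());
        return replicas.get(next);
    }
}
```

Adding a replica to a slow pool is an X-axis move, adding a new service name is a Y-axis move, and adding a new region is a Z-axis move; none of them requires touching the other pools.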

Here is how we made this happen in practice, some of which we are still iterating on:

  • (Y axis) We started dismantling the monolithic beast and migrating to a Service Oriented Architecture (SOA), based on Java backend web services.
    • RESTful web services delivering JSON
    • The concerns are well separated, a service has a specific job and does that job well
    • Easier for development teams to own a full stack of components. This allows us to create small teams which act on goals without spelling out direction, fostering innovation.
    • Allows for more frequent releases
    • Creates disaster isolation: if only that service goes down, the rest of the site should remain available where you don't have cross swim lane dependencies.
    • Why Java? It is easy to test, easier to scale, and more powerful than most high-level languages.
  • (X axis) We have pools of service instances fronted by load balancers.
    • For a service that is slower or more popular, throw more instances in the pool
    • Since the services themselves are smaller than one big app, you can better leverage hardware resources
  • (Y axis) The view is thin, and renders whatever the services provide.
    • Create different view layers for different platforms
    • Services don't care who the client is, they just respond to requests!
  • (All) Metrics gathering. You can't make it better if you can't measure it!

Ruby usage at RTR

When I think about SOA and scalability, I think about the JVM. It's stable, proven, and optimized. When you factor in the size of the community and tools available, it makes a great choice for writing your services. I couldn't see myself writing highly concurrent services in Ruby or Python, but maybe it's because I'm not as familiar with those languages. With that said, I haven't heard of many companies with similar performance requirements going down that route either. Going with Java comes with some overhead. It's verbose, and isn't as convention driven. This means it takes devs longer to write/read code and understand APIs. It also means you need strong leadership pushing solid practices, otherwise you can end up with multiple approaches. This is actually where I think Rails excels, convention over configuration speeds up development. So if you don't require high scalability or have a big dev team, Rails might be the way to go. It's definitely faster time to market, and a better tool for building an MVP.

As we continue to dismantle our monolithic beast and move away from PHP, we want something light and simple that we can control. For that reason, we are moving to Sinatra/Ruby. Stay tuned for more info on how that goes...

-@CarloBarbara

[1] AKF Cube


Dropwizard and Quartz Part 1: Scheduled Java Jobs In A Nice Package

It's very common to take simple, regular, or automated tasks, cook them into your favorite executable and add another entry to crontab, and off you go, because you have bigger fish to fry. And this is fine, sort of. It will work well, and your job(s) may continue to run for quite some time without errors. But what happens when you have lots of jobs? Jobs that run on multiple schedules? Jobs that must not run concurrently? Jobs that rely on flaky 3rd party services that fail every now and then? How are you going to manage and keep track of that easily? Over time it can become a nasty mess. The problem is that you have outgrown CRON and need something a little more sophisticated. There are a number of solutions out there: some are free, others can be expensive; some may be too simple, others too complex or cumbersome, and in the end they may be decent but not exactly what you want. You don't want to reinvent the wheel with a homegrown solution, but you will need something that is flexible and malleable enough for your needs. If, and this may be a big IF, you are a Java shop, or have jobs/tasks that are platform/language agnostic, Dropwizard + Quartz could be a great solution for you.

Quartz is a Java scheduler that is, in principle, very similar to good old CRON, but with a lot more bells and whistles. Dropwizard is a well-thought-out web services framework that provides an excellent wrapper for managing and keeping tabs on your scheduled jobs.

A little bit about Dropwizard

From dropwizard.codahale.com:
 Dropwizard is a Java framework for developing ops-friendly, high-performance, RESTful web services. Dropwizard pulls together stable, mature libraries from the Java ecosystem into a simple, light-weight package that lets you focus on getting things done.

That pretty much says it all. Using Jetty, Jersey, and Jackson, among other things, it makes building a web service very simple and gives you a number of nice features out of the box, like configuration and health checking, which we will discuss briefly below.

A little bit about Quartz

 From quartz-scheduler.org:
Quartz is a full-featured, open source job scheduling service that can be integrated with, or used along side virtually any Java EE or Java SE application - from the smallest stand-alone application to the largest e-commerce system. Quartz can be used to create simple or complex schedules for executing tens, hundreds, or even tens-of-thousands of jobs; jobs whose tasks are defined as standard Java components that are programmed to fulfill the requirements of your application. The Quartz Scheduler includes many enterprise-class features, such as JTA transactions and clustering. 

Quartz does everything CRON can do and much, much more. Jobs can be stateful or stateless and can be monitored throughout every step of their life cycle, and Quartz comes with all of the Java error and event handling goodness you need. There are 3 main components to Quartz: the scheduler (of which you can have many, but we'll be using one), triggers, and jobs. As mentioned above, Quartz has a clustered mode of operation where schedules and jobs can be shared among multiple instances, but we haven't played around with that (yet!).

What we want out of this

The goal here is to have a centralized system that:

  • Runs our jobs exactly when and as often as we want (Flexible)
  • Handles temporary failures (Robust)
  • Sends notifications of critical/permanent failures (Reliable)
  • Handles complex jobs using (almost) any 3rd party Java lib or service (Useful)
  • Allows non-tech personnel to see what's going on (User Friendly)
  • Is testable and maintainable (Quality)

To do this we use some of Dropwizard's and Quartz's awesomesauce:

  • Create a managed instance of a Quartz scheduler for graceful start-up, shutdown, etc
  • Use a Dropwizard Health check to watch our Quartz Scheduler
  • Use Quartz Scheduler and Job listeners to:
    • Track the current state of the system
    • Handle errors
    • Retry jobs that failed due to temporary issues (locked resources, timeouts, etc)
  • Use Dropwizard Configuration to set up our Quartz Scheduler
  • Use Quartz Job XML files to instantiate our jobs and triggers
  • Add web resources that interact with our managed Quartz Scheduler.
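
To make the "retry jobs that failed due to temporary issues" idea a bit more concrete, here is a minimal sketch in plain Java. It deliberately uses nothing from Quartz or Dropwizard, so it is not how our listeners are implemented; it only shows the policy we are after: retry temporary failures a bounded number of times with a growing pause, and surface anything else right away. The names RetryingRunner and TemporaryFailure are made up for this illustration.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Quartz or Dropwizard. */
final class RetryingRunner {

    /** Marks a failure that is worth retrying (locked resource, timeout, ...). */
    static final class TemporaryFailure extends RuntimeException {
        TemporaryFailure(String message) {
            super(message);
        }
    }

    /**
     * Runs the task, retrying only when it throws TemporaryFailure.
     * At most maxAttempts calls are made; the pause doubles after each failure.
     */
    static <T> T run(Callable<T> task, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (TemporaryFailure e) {
                if (attempt >= maxAttempts) {
                    // Out of attempts: treat it as permanent so someone gets notified.
                    throw new IllegalStateException("gave up after " + attempt + " attempts", e);
                }
                sleep(delay);
                delay *= 2;
            } catch (RuntimeException e) {
                throw e; // permanent failure: do not retry
            } catch (Exception e) {
                throw new IllegalStateException("permanent failure", e);
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```

A job body wrapped as RetryingRunner.run(task, 3, 500L) would be tried up to three times, pausing 500 ms and then 1000 ms between attempts. In the real service this decision would live in a job listener rather than around the job itself.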

Dropwizard + Quartz: The Nitty Gritty

The code samples below provide a skeleton to get Quartz up and running within a Dropwizard web service. In this post we will be breaking down the most basic parts we need to give us a simple foundation to build upon in later posts. The first step is to create your main Dropwizard Service class that kicks everything off.

[gist]http://gist.github.com/3121212.js[/gist]

Our Dropwizard managed Quartz class is responsible for starting, stopping and checking in on our Scheduler and its jobs.

[gist]http://gist.github.com/3121315[/gist]

Configuring Dropwizard & Quartz

Dropwizard has a straightforward configuration mechanism that uses YAML or JSON configuration files, making it easy to set environment and initialization parameters. We will be making use of this to set our Quartz Scheduler properties. This could be done in a separate quartz.properties file, but in most cases it is better to keep all of your environment settings in one place.

[gist]http://gist.github.com/3121429.js[/gist]

Dropwizard YAML Configuration for Quartz

The YAML configuration is pulled in when the service is kicked off, using a simple command line argument. For example:

java -jar myDropwizardQuartzService.jar server production.yml

[gist]http://gist.github.com/3121351[/gist]

Dropwizard - Quartz Healthcheck

Dropwizard has a simple metrics and health check system that makes keeping tabs on your services or service features very straightforward. As our managed Quartz Manager and its error handling get more complex, its state/health can be completely encapsulated, so that the health check doesn't need to be altered.

[gist]http://gist.github.com/3123161.js[/gist]
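
The encapsulation point can also be shown without either library. The sketch below is plain Java with made-up names (SchedulerManager, SchedulerHealthCheck), and a plain string stands in for the result object a real Dropwizard health check would return. The manager alone decides what "healthy" means, and the check only relays that verdict, so new failure modes can be added to the manager without touching the check.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the managed Quartz wrapper; names are illustrative. */
final class SchedulerManager {
    private final AtomicReference<String> problem = new AtomicReference<>();
    private volatile boolean started;

    void start() {
        started = true;
    }

    void stop() {
        started = false;
    }

    /** Called by error-handling code (for example a listener) when something goes wrong. */
    void reportProblem(String description) {
        problem.set(description);
    }

    void clearProblem() {
        problem.set(null);
    }

    /** The single place that decides what "healthy" means; null means healthy. */
    String unhealthyReason() {
        if (!started) {
            return "scheduler is not running";
        }
        return problem.get();
    }
}

/** Hypothetical check: it only relays the manager's verdict. */
final class SchedulerHealthCheck {
    private final SchedulerManager manager;

    SchedulerHealthCheck(SchedulerManager manager) {
        this.manager = manager;
    }

    String check() {
        String reason = manager.unhealthyReason();
        return reason == null ? "healthy" : "unhealthy: " + reason;
    }
}
```

If the manager later starts tracking, say, repeated job failures, only unhealthyReason() changes; the check class stays as it is.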

Creating Jobs Through XML

In this example we are instantiating/scheduling our jobs by listing XML files that describe the jobs themselves, any data we want to pass in, and their triggers. It is possible to have multiple XML files, as seen in the configuration example above. Each XML file can contain multiple jobs and triggers.

[gist]http://gist.github.com/3121364.js[/gist]

For more about Quartz jobs and triggers take a look at the examples and tutorials.

In later posts we will go into more detail on the following topics:

  • More about jobs, passing data, and using job and scheduler contexts
  • Scheduler and Job listeners
  • Handling and Retrying a job when it fails
  • Web resources and interface for the Quartz Service

Stay tuned!