Plans

This page collects a few things we hope to work on soon but haven't started yet. If you are interested in working on one of these or have thoughts on how it should work, send an email to the mailing list.

UI Improvements

The current jQuery JavaScript tree is quite slow, and a single tree containing every job shows far too much information at once.

Flow Display

You should be able to display job flows as full DAGs. SVG is an easy way to draw these graphs, but it requires laying out the nodes yourself. But hey, graph layout is fun.
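As a rough illustration of the layout problem, here is a minimal sketch of a layered layout: each job is placed one row below its deepest dependency, and the result is emitted as SVG. The function names (assign_layers, to_svg), the spacing parameters, and the dict-of-dependencies input format are all assumptions made for this example, not existing code. It makes no attempt to minimize edge crossings, which is where the real work in graph layout lies.

```python
from collections import defaultdict
from html import escape

def assign_layers(deps):
    """deps maps job -> list of jobs it depends on. Returns job -> row index,
    where a job's row is one more than the deepest row among its dependencies."""
    layers = {}
    def depth(job, path=()):
        if job in path:
            raise ValueError(f"dependency cycle through {job!r}")
        if job not in layers:
            parents = deps.get(job, [])
            layers[job] = 1 + max((depth(p, path + (job,)) for p in parents), default=-1)
        return layers[job]
    for job in deps:
        depth(job)
    return layers

def to_svg(deps, dx=140, dy=80, w=100, h=30):
    """Draw each job as a box and each dependency as a straight line."""
    layers = assign_layers(deps)
    rows = defaultdict(list)
    for job in sorted(layers):          # sort for a stable left-to-right order
        rows[layers[job]].append(job)
    pos = {job: (20 + i * dx, 20 + layer * dy)
           for layer, jobs in rows.items() for i, job in enumerate(jobs)}
    parts = []
    for job, parents in deps.items():
        for p in parents:
            (x1, y1), (x2, y2) = pos[p], pos[job]
            # bottom center of the dependency to top center of the dependent job
            parts.append(f'<line x1="{x1 + w // 2}" y1="{y1 + h}" '
                         f'x2="{x2 + w // 2}" y2="{y2}" stroke="black"/>')
    for job, (x, y) in pos.items():
        parts.append(f'<rect x="{x}" y="{y}" width="{w}" height="{h}" '
                     f'fill="white" stroke="black"/>')
        parts.append(f'<text x="{x + w // 2}" y="{y + h // 2 + 5}" '
                     f'text-anchor="middle">{escape(job)}</text>')
    width = max(x for x, _ in pos.values()) + w + 20
    height = max(y for _, y in pos.values()) + h + 20
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
            + "".join(parts) + "</svg>")

# Hypothetical four-job flow used only to exercise the sketch.
example_deps = {"load": [], "clean": ["load"], "stats": ["load"], "report": ["clean", "stats"]}
example_svg = to_svg(example_deps)
```

Writing example_svg to a file and opening it in a browser shows "load" on the top row, "clean" and "stats" side by side beneath it, and "report" at the bottom.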

Authentication

Currently we have none.

Gantt Chart for Flow Execution

Optimizing a flow of partially parallel, partially sequential jobs can be quite difficult. Having a simple visualization of which jobs are taking the longest and blocking everything else can help enormously.
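To make the idea concrete, here is a small sketch that computes when each job would start and finish, finds the chain of jobs that determines the total run time, and prints a crude text Gantt chart. It assumes each job starts as soon as all of its dependencies finish, that the flow is acyclic, and that durations are known; the function names and the sample flow are hypothetical.

```python
def schedule(durations, deps):
    """Return job -> (start, finish), assuming a job starts the moment
    its last dependency finishes. The flow must be acyclic."""
    times = {}
    def finish(job):
        if job not in times:
            start = max((finish(p) for p in deps.get(job, [])), default=0)
            times[job] = (start, start + durations[job])
        return times[job][1]
    for job in durations:
        finish(job)
    return times

def critical_path(times, deps):
    """Walk back from the last job to finish, always following the
    dependency that finished latest (the one that was blocking)."""
    job = max(times, key=lambda j: times[j][1])
    path = [job]
    while deps.get(job):
        job = max(deps[job], key=lambda p: times[p][1])
        path.append(job)
    return path[::-1]

def text_gantt(times):
    """One row per job: leading spaces for the wait, '#' for the run time."""
    rows = []
    for job, (start, end) in sorted(times.items(), key=lambda kv: kv[1]):
        rows.append(f"{job:<8}|" + " " * start + "#" * (end - start))
    return "\n".join(rows)

# Hypothetical durations (in minutes) and dependencies.
durations = {"load": 5, "clean": 2, "stats": 7, "report": 1}
deps = {"clean": ["load"], "stats": ["load"], "report": ["clean", "stats"]}
times = schedule(durations, deps)
print(text_gantt(times))
print("critical path:", " -> ".join(critical_path(times, deps)))
```

In this sample flow the chart makes it obvious that "stats" is what holds up "report", so speeding up "clean" would not shorten the flow at all.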

Time estimates for Flows

We already track past execution times for jobs. It would be good to roll these up into a projection (say, a moving average of the last few runs). Doing this for each job in a flow would let us give better percent-complete and projected-completion-time information.
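One possible shape for this, sketched under several assumptions: the estimate for a job is a simple moving average of its most recent run times, progress is measured as estimated work completed over total estimated work, and the projected time remaining follows the longest chain of unfinished jobs. The function names, the window size, and the input format (plain dicts of run times in seconds) are made up for this example.

```python
def estimate(history, window=3):
    """Simple moving average of the last `window` run times."""
    recent = history[-window:]
    if not recent:
        raise ValueError("no past runs to estimate from")
    return sum(recent) / len(recent)

def percent_complete(estimates, done, elapsed):
    """Finished jobs count for their full estimate; running jobs count for
    the time spent so far, capped at their estimate."""
    total = sum(estimates.values())
    credited = 0.0
    for job, est in estimates.items():
        credited += est if job in done else min(elapsed.get(job, 0.0), est)
    return 100.0 * credited / total

def time_remaining(estimates, deps, done, elapsed):
    """Projected time until the whole flow finishes: each unfinished job
    waits for its slowest dependency, then runs for its remaining estimate."""
    memo = {}
    def finish_in(job):
        if job not in memo:
            if job in done:
                memo[job] = 0.0
            else:
                wait = max((finish_in(p) for p in deps.get(job, [])), default=0.0)
                own = max(estimates[job] - elapsed.get(job, 0.0), 0.0)
                memo[job] = wait + own
        return memo[job]
    return max(finish_in(job) for job in estimates)

# Hypothetical run histories, in seconds.
history = {"load": [4, 6, 5], "clean": [2, 2], "stats": [10, 6, 7, 7, 7]}
estimates = {job: estimate(runs) for job, runs in history.items()}
deps = {"clean": ["load"], "stats": ["load"]}
done, elapsed = {"load"}, {"stats": 2.0}
print(percent_complete(estimates, done, elapsed))      # share of estimated work done
print(time_remaining(estimates, deps, done, elapsed))  # seconds until the flow ends
```

A moving average reacts slowly to sudden changes in input size; a weighted average or a median of recent runs would be equally easy to swap in.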

Core improvements

Refactor the overall job manager and graph execution layer a bit.

REST interface

We should add REST interfaces for the main job control functionality to make scripting easier.
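Nothing has been designed yet, so the following is only a sketch of what such an interface might look like. Every route here (GET /jobs, GET /jobs/NAME, POST /jobs/NAME/run, POST /jobs/NAME/cancel), the status strings, and the in-memory JOBS table are hypothetical; a real implementation would call into the job manager instead.

```python
import json
from http.server import BaseHTTPRequestHandler

# Hypothetical stand-in for the job manager's state.
JOBS = {"load": "idle", "report": "idle"}

def route(method, path):
    """Map a request to (HTTP status, JSON-serializable body)."""
    parts = [p for p in path.split("/") if p]
    if method == "GET" and parts == ["jobs"]:
        return 200, JOBS
    if len(parts) >= 2 and parts[0] == "jobs" and parts[1] not in JOBS:
        return 404, {"error": "unknown job"}
    if method == "GET" and len(parts) == 2 and parts[0] == "jobs":
        return 200, {"job": parts[1], "status": JOBS[parts[1]]}
    if (method == "POST" and len(parts) == 3 and parts[0] == "jobs"
            and parts[2] in ("run", "cancel")):
        JOBS[parts[1]] = "running" if parts[2] == "run" else "cancelled"
        return 202, {"job": parts[1], "status": JOBS[parts[1]]}
    return 404, {"error": "no such route"}

class Handler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper so that route() stays easy to test on its own."""
    def _reply(self):
        status, body = route(self.command, self.path)
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)
    do_GET = do_POST = _reply
```

Passing Handler to http.server.HTTPServer on a port of your choice and calling serve_forever() would expose these routes, after which a script could start a job with a single POST to /jobs/load/run from curl or any HTTP client.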

Fix or kill HDFS viewer

We have an HDFS viewer that is a little prettier than Hadoop's but is missing a lot of its functionality. It does allow per-filetype plugins.