More on multithreaded topological sorting

« Hard core multicore with TestNG

New blog »

More on multithreaded topological sorting

I received a few interesting comments on my previous entry regarding the new multithreaded topological sort I implemented in TestNG and there is one in particular from Rafael Naufal that I wanted to address:

The @Priority annotation couldn’t be adapted to say which free methods get scheduled first? BTW, the responsibility of knowing which nodes are free couldn’t be moved to the graph of test methods?

This is pushing my current algorithm even further in the sense that we not only want to schedule free nodes as they become available, we also want to schedule them in order of importance.

What makes a node more important than another? Its level of dependencies. The more a node is depended upon, the more beneficial it is to schedule it as soon as possible since this will end up freeing more nodes. Admittedly, you are still bounded by the size of your thread pool, but this is exactly what we want: increasing the pool size should lead to more parallelism and therefore better performance, but the current scheduling algorithm being fair (or random) means that we are not guaranteed to see this performance increase.

My first reaction was to modify my Executor but as it turns out, you can actually do this with the existing implementation. The constructor of all the Executors takes a
BlockingQueue in parameter, which is the queue that the Executor will use to process the workers. Unsurprisingly, there already is a priority queue called PriorityBlockingQueue.

All you need to do is to use that queue instead of the default one when you create your Executor and then make sure that the workers you pass it have a natural ordering. In this case, the weight of a worker is how many other workers depend on it, which is very easy to calculate.

On a related topic, I wanted to get a closer look at how the algorithm I described in my previous blog post actually works. I described the theory and I have tests that show that it seems to work as expected, but it occurred to me that I could actually “view it from the inside” with little effort.

First, I added a method called toDot which generates a Graphviz file representing the current graph. This turned out to be trivial:

/**
* @return a .dot file (GraphViz) version of this graph.
*/
public String toDot() {
  String FREE = "[style=filled color=yellow]";
  String RUNNING = "[style=filled color=green]";
  String FINISHED = "[style=filled color=grey]";
  StringBuilder result = new StringBuilder("digraph g {\n");
  Set<T> freeNodes = getFreeNodes();
  String color;
  for (T n : m_nodesReady) {
    color = freeNodes.contains(n) ? FREE : "";
    result.append("  " + getName(n) + color + "\n");
  }
  for (T n : m_nodesRunning) {
    color = freeNodes.contains(n) ? FREE : RUNNING;
    result.append("  " + getName(n) + color + "\n");
  }
  for (T n : m_nodesFinished) {
    result.append("  " + getName(n) + FINISHED+ "\n");
  }
  result.append("\n");
  for (T k : m_dependingOn.getKeys()) {
    List<T> nodes = m_dependingOn.get(k);
    for (T n : nodes) {
      String dotted = m_nodesFinished.contains(k) ? "style=dotted" : "";
      result.append("  " + getName(k) + " -> " + getName(n) + " [dir=back " + dotted + "]\n");
    }
  }
  result.append("}\n");
  return result.toString();
}

Then I modified the executor to dump the graph every time a worker terminates, and finally, I wrote a shell script to convert these dot files into images and to create an HTML file. I ran a simple test case, processed the files with the shell script and here is the final result.

A yellow node is “free”, green means that the node is “ready” (to be run in the thread pool), grey is “finished” and white nodes haven’t been processed yet. Dotted arrows represent dependencies that have been satisfied.

As you can see, the execution matches very closely what you would expect based on my description of the algorithm and I confirmed that changing the size of the thread pool creates different executions.

Tags: testng

This entry was posted on December 13, 2009, 12:45 pm and is filed under Java. You can follow any responses to this entry through RSS 2.0. Both comments and pings are currently closed.

7 Comments

#1 by Nish on December 13, 2009 - 1:21 pm

Your example always schedules every free test, which would happen with the random scheduler as well. Do you have an example where scheduling random free nodes differs from scheduling free nodes based on their dependencies?
#2 by Cedric on December 13, 2009 - 1:34 pm

Nish,
Take a look at this new graph:
beust.com/topo/Topo8.png
When a1 completes, it frees both b1 and c1, but with this scheme, c1 would have a weight of 0 (no dependees) while b1 has a weight of 2 (2 nodes depend on it), so b1 would get scheduled first, and when it completes, it will free 2 more nodes (b2 and b3).
#3 by Brian Slesinsky on January 17, 2010 - 2:08 pm

I can’t tell from these examples, but hopefully you’re counting the number of descendants of each node rather than the number of children?
#4 by Olivier on January 18, 2010 - 8:24 am

Hi Cedric,
I’ve used a similar technique to structure a large batch process recently. In my case, a task also has the ability to dynamically spawn various child tasks, that get reinjected in the dependency graph (probably not relevant for TestNG though).
Another interesting issue is to evaluate the maximum parallelism that can be achieved on the graph (under ideal circumstances), which gives an upper bound on the thread pool’s size. This amounts to calculating the independence number of the graph, which can be solved in linear time for a directed acyclic graph.
#5 by Daniel on January 19, 2010 - 9:08 pm

I ran into a similar problem trying to parallelize the eager initialization of Spring singleton beans in order to reduce application startup time [1]. I really like your implementation, especially the dot file generation as a visualization aid… so I stole the idea! Thanks for a great series of posts.

[1] github.com/gredler/spriths
#6 by Jack3d on June 5, 2011 - 6:00 pm

Thanks for the tip. I’ve been having the same issues and I thought I would never find any solution online! I’ve used your technique and it worked out great! Kudos!
#7 by Annie Ignacio on September 15, 2011 - 12:30 am

Great site! I have an eager solutions to those problems I’ve encountered and ruin my works. I modified the executor to dump the graph . Finally, I can wrote a good script to convert these dot files into images and to create an HTML file. I can a simple processed the files with the script, an interesting issues will be evaluated. Thanks for the idea…

Comments are closed.

Welcome

My name is Cédric Beust. I am a software engineer and the creator of the Java testing framework TestNG. If you like this blog, you should follow me on twitter or on Google+. You can also email me at cedric@beust.com.
Top entries

Agile people still don't get it

Coding challenge

Coding challenge wrap-up

Confessions of a reluctant switcher

Dynamic language, refactoring: pick one

Hardcore multicore with TestNG

Hardcore multicore with TestNG (2)

JUnit pain

Structural typing vs duck typing

The Fan Syndrome

The perils of duck typing

The return of statically typed languages

Why I love Ruby

Why Ruby on Rails won't become mainstream
Recent posts

Coding challenge: a sliding window map

Advanced dependency injection with Guice

A note on null pointers

Reinventing assertions

Apple maps

Dark Souls

You’ve been implementing main() wrong all this time

Video game update

Solution to the “Power story” challenge

Non coding challenge: a power story

Various ways to get randomness wrong

A new coding challenge

The only sure way to lose is not to play

Ushering TestNG into the new year

Ground control to major Tom
Archives

September 2012 (1)

August 2012 (2)

July 2012 (1)

June 2012 (1)

April 2012 (1)

March 2012 (4)

February 2012 (3)

January 2012 (1)

December 2011 (2)

November 2011 (3)

October 2011 (7)

September 2011 (2)

August 2011 (10)

July 2011 (8)

June 2011 (2)

May 2011 (7)

April 2011 (2)

March 2011 (7)

February 2011 (4)

January 2011 (6)

December 2010 (8)

November 2010 (2)

October 2010 (1)

September 2010 (4)

August 2010 (12)

July 2010 (10)

June 2010 (5)

May 2010 (7)

April 2010 (7)

March 2010 (8)

February 2010 (2)

January 2010 (2)

December 2009 (1)

November 2009 (4)

October 2009 (1)

September 2009 (2)

August 2009 (1)

July 2009 (2)

June 2009 (1)

April 2009 (4)

March 2009 (4)

February 2009 (1)

November 2008 (3)

October 2008 (4)

September 2008 (3)

August 2008 (1)

July 2008 (1)

June 2008 (5)

May 2008 (6)

April 2008 (1)

March 2008 (3)

February 2008 (1)

January 2008 (5)

November 2007 (3)

October 2007 (3)

September 2007 (5)

August 2007 (2)

July 2007 (3)

June 2007 (2)

May 2007 (5)

April 2007 (3)

March 2007 (4)

February 2007 (2)

January 2007 (6)

December 2006 (4)

November 2006 (3)

October 2006 (7)

September 2006 (2)

August 2006 (7)

July 2006 (9)

June 2006 (3)

May 2006 (5)

April 2006 (6)

March 2006 (7)

February 2006 (11)

January 2006 (7)

December 2005 (9)

November 2005 (9)

October 2005 (7)

September 2005 (9)

August 2005 (9)

July 2005 (8)

June 2005 (10)

May 2005 (10)

April 2005 (12)

March 2005 (11)

February 2005 (16)

January 2005 (12)

December 2004 (9)

November 2004 (9)

October 2004 (12)

September 2004 (14)

August 2004 (15)

July 2004 (15)

June 2004 (12)

May 2004 (12)

April 2004 (14)

March 2004 (20)

February 2004 (9)

January 2004 (16)

December 2003 (9)

November 2003 (9)

October 2003 (11)

September 2003 (13)

August 2003 (16)
Meta

Log in

Entries RSS

Comments RSS

WordPress.org

Otaku, Cedric's blog

Thoughts about software development