Naiad on YARN and Azure HDInsight

One of the most commonly asked questions about Naiad is, “How on earth do you run it on a cluster?” When we first released Naiad, the only solution available was to grit your teeth and run the distributed programs by hand, using scripts tailored to a particular cluster. In the meantime, Hadoop YARN has become a widely available framework for running data-processing applications on a cluster. This post describes how we ported the latest version of Naiad to run on top of Hadoop, and how this makes it easier to run Naiad programs on your data in Microsoft Azure.

Read more…

Announcing the Naiad 0.4 release

Since we released Naiad as open-source software on GitHub last October, we have been busy adding features that make it easier to use Naiad for your applications, in settings ranging from your laptop to a distributed cluster. In the coming weeks, we’ll be posting about what’s new. This post gathers together the things that have changed.

Read more…

Tuning the performance of Naiad. Part 1: the network

We have recently been talking about Naiad’s low latency (see, for example, Derek’s presentation at SOSP). If you have ever tried to coax good performance out of a distributed system, you may be wondering exactly how we get coordination latencies of less than 1ms across 64 computers with 8 parallel workers per machine. In this series of posts we’re going to reveal to our curious – and possibly skeptical – users what we did in gory detail.

Read more…

Naiad available on GitHub

We are pleased to announce that Naiad is now available on GitHub under the Apache 2.0 open source license. Naiad builds and runs using the .NET CLR on Windows, and Mono on Windows, Linux, and OS X.

Read more…

An introduction to timely dataflow

Regular readers of this blog might notice that the Naiad team is rather fond of defining new kinds of dataflow. Just yesterday, we did it again with a paper that defines “timely dataflow”. In this post, I’m going to introduce timely dataflow at a high level, and follow-up posts will talk more about the distributed Naiad system that we have built on top of it.

Read more…

Naiad: A Timely Dataflow System

We’ve just finished work on a paper that describes the distributed implementation of Naiad. I will be presenting it in November at the ACM Symposium on Operating Systems Principles (SOSP) in Nemacolin, PA. UPDATED: The video is now available to watch: (MP4) (Flash).

Read more…

Graph Analysis and Hilbert Space-filling Curves

Way back at the beginning of time, we had a post on performing PageRank on a 1.5B-edge graph of who-follows-who on Twitter. We talked a bit about how several big data systems don’t do quite as well as a 40-line, single-threaded C# program. There was also a promise to show how to make things go much, much faster. So we’re going to do that today.

Read more…

