The buffer, as mentioned in the previous design note, is Eve’s abstraction for media data, whether it is video, audio, or some other form of media that has not yet even been conceived. To recap, buffers store a format specification that details the media type, audio channel count, byte width, video frame size, and so on, along with the data itself. However, buffers need not store the data themselves, only a promise that the data still exists somewhere and is accessible by a knowledgeable object. This means that the data might not live locally inside the Buffer structure (e.g. it may be in GPU memory), it may be reachable only through a pointer (e.g. in a reference-counting scheme), or it might even be implicit (e.g. the frames are generated upon request).
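
To make that promise concrete, here is a minimal sketch of the kind of interface I have in mind. The names other than Buffer itself (MediaFormat, lockData, and so on) are illustrative rather than Eve’s actual API:

    import java.nio.ByteBuffer;

    // A minimal sketch; names other than Buffer are illustrative, not Eve's API.
    interface MediaFormat { /* media type, channel count, byte width, frame size, ... */ }

    interface Buffer {
        /** The format descriptor assigned at creation; it never changes. */
        MediaFormat format();

        /** Number of frames (or samples) held; buffers cannot be resized. */
        int frameCount();

        /**
         * Materialise the data for access. A plain in-memory buffer can return
         * a cheap view; a GPU-resident or generated buffer may need to transfer
         * or synthesise its contents here.
         */
        ByteBuffer lockData();

        /** Release the view obtained from lockData(). */
        void unlockData();
    }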

Buffers in Eve cannot be resized, though their data can be changed. Each buffer has exactly one assigned media format descriptor, which also cannot change. One buffer (or part of it) can be copied into another if and only if (a rough sketch of this check follows the list):

  1. the two Buffers’ media formats are as identical as possible, and one of:
    1. the two Buffers are of the exact same Java type
    2. there exists a correct marshaling method in the Compositor
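
In code, the direct-copy case might read something like the sketch below, reusing the hypothetical Buffer interface from earlier. For simplicity the format test is reduced to equals(); the real “as identical as possible” rule is looser:

    // Hypothetical helper; the format comparison is simplified to equals().
    final class BufferCopyRules {
        static boolean canCopyDirectly(Buffer source, Buffer target) {
            return source.format().equals(target.format())      // formats match (condition 1, simplified)
                && source.getClass().equals(target.getClass()); // exact same Java type (condition 1.1)
            // Failing 1.1, the copy can still go ahead if the Compositor provides
            // a marshaling method between the two Buffer types (condition 1.2).
        }
    }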

The marshaling methods convert other Buffer types to the Compositor’s Buffer type, and vice versa. Because these conversions can saturate the bus with memory traffic, Eve’s design allows them to be implemented manually, so that this potential bottleneck can be hand-optimised. As for the term “as identical as possible”, this will be revisited in a later post regarding media formats.
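
For example, the Compositor could expose a pair of conversion hooks along these lines; the signatures are my own sketch, not the actual interface:

    // Sketch of the marshaling hooks a Compositor might expose.
    interface Compositor {
        /** Can this compositor convert between the two Buffer implementations? */
        boolean canMarshal(Class<? extends Buffer> from, Class<? extends Buffer> to);

        /**
         * Convert a foreign Buffer into the compositor's preferred Buffer type.
         * This is the spot worth hand-optimising, since bulk memory transfer is
         * where bus saturation shows up.
         */
        Buffer marshal(Buffer source);

        /** Convert one of the compositor's Buffers back into the requested type. */
        Buffer unmarshal(Buffer source, Class<? extends Buffer> targetType);
    }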

The simplest way to represent a buffer of audio or video data is to allocate a contiguous block of space in system memory, whose size is the size per frame multiplied by the number of frames. Because this is such a common representation, Eve provides it as NativeBuffer, an implementation of the Buffer interface. As the name implies, NativeBuffer uses Java’s nio package to provide near-native performance on all operations. Furthermore, if special hardware is not used for the input, output, or compositing stages of the media pipeline, Buffers will never have to be (un)marshaled, because reflection can tell us at any point in the process whether or not incoming Buffers are already of the correct type.
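
A stripped-down illustration, reusing the hypothetical interface sketched earlier (the real NativeBuffer differs in detail):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    // Stripped-down illustration only; the real NativeBuffer differs in detail.
    class NativeBuffer implements Buffer {
        private final MediaFormat format;
        private final int frameCount;
        private final ByteBuffer data;

        NativeBuffer(MediaFormat format, int frameCount, int bytesPerFrame) {
            this.format = format;
            this.frameCount = frameCount;
            // One contiguous, natively ordered block: size per frame times frame count.
            this.data = ByteBuffer.allocateDirect(frameCount * bytesPerFrame)
                                  .order(ByteOrder.nativeOrder());
        }

        public MediaFormat format()  { return format; }
        public int frameCount()      { return frameCount; }
        public ByteBuffer lockData() { return data.duplicate(); }
        public void unlockData()     { /* nothing to release for plain memory */ }
    }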

The rendering pipeline of Eve follows the producer/consumer design pattern. In order to create output, timelines must be able to create a producer to render content in a one-use, low-cost way. The render pipeline then acts as the consumer for this producer, feeding frames from the timeline into the output (whatever that might be — the output is abstracted from the rendering pipeline).
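
In rough Java terms, the relationship looks something like this; Producer, Output, and nextFrame() are placeholder names rather than Eve’s real ones:

    // Rough shape of the producer/consumer relationship.
    interface Producer {
        /** Produce the next frame, or null when the timeline is exhausted. */
        Buffer nextFrame();
    }

    interface Output {
        void write(Buffer frame);
    }

    final class RenderPipeline {
        /** Consume frames from the timeline's producer, feed them to the output. */
        void render(Producer producer, Output output) {
            Buffer frame;
            while ((frame = producer.nextFrame()) != null) {
                output.write(frame);
            }
        }
    }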

The first potential drawback of the producer/consumer pattern involves passing around buffers, the designed abstraction of low-level media data for Eve’s purposes. Because we can implement the Buffer interface any way we choose, we can pass around buffers that never need to be copied, making the cost of moving data through the pipeline negligible relative to the cost of the actual work being done. The input layer, rendering layer, and output layer no longer have to agree on Buffer implementations, only that these Buffers have the same media type. The problem then lies in how we pass data between the layers of the pipeline.

The solution to this is twofold: first, introduce marshaling and unmarshaling methods for Buffers passing into and out of the rendering stage, respectively. This lets the rendering pipeline — the one with most of the heavy lifting to do — mandate which buffer format is best for its job. However, it would be a mistake (and go against the idea of loose coupling) to force the renderer to account for every type of buffer that the input layer could throw at it; indeed, years later, we want the exact same renderer to work with any new input layers that have been added since. So the second step is to create a common, “native” Buffer format to fall back on when all else fails. This nets us the performance benefit when the renderer knows how to efficiently encapsulate its input, the ease of programming when the input layer already uses this format, and the flexibility to use any input layer we want, since the renderer can request a conversion of the input buffer to the native format before encapsulation. Finally, this answers the unasked question: what kind of buffer should the output layer be passed?
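
Put together, the renderer’s intake step might look roughly like this sketch, where understands() stands in for whatever capability check the renderer actually performs:

    // Sketch of the renderer's intake: use the incoming Buffer directly when the
    // renderer understands it, otherwise fall back to the common native format.
    final class RenderIntake {
        private final Compositor compositor;

        RenderIntake(Compositor compositor) { this.compositor = compositor; }

        Buffer prepare(Buffer incoming) {
            if (understands(incoming)) {
                return incoming;               // fast path: encapsulate directly
            }
            // Fallback: request a conversion to the native format first.
            return compositor.marshal(incoming);
        }

        private boolean understands(Buffer b) {
            return b instanceof NativeBuffer;  // stand-in for a real capability check
        }
    }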

I’d like to close with a discussion on the hierarchical nature of Producers. Like the timelines they are meant to express, Producers too are used in a recursive fashion — each Clip creates a delegate Producer to pass into the render queue. Because these could be track or timeline clips, they could create their own hierarchy of Producers within. The one remaining problem with this hierarchical producer/consumer pattern is: how do we make it fast? I won’t lie: a plan is in the works, but it hasn’t been finalised. The hope is that the performance part of the code will be separated from the functionality part of the code, allowing Eve first to be correct, and then to be usable in real-time.
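
To make the recursion concrete, here is a toy version in which a timeline-backed clip simply plays its nested producers back to back. Real compositing is glossed over, and every name here is invented for illustration:

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // Illustrative only: a clip backed by other clips delegates to their
    // Producers, so the producer graph mirrors the timeline hierarchy.
    interface Clip {
        Producer createProducer();
    }

    final class TimelineClip implements Clip {
        private final List<Clip> nestedClips;

        TimelineClip(List<Clip> nestedClips) { this.nestedClips = nestedClips; }

        public Producer createProducer() {
            List<Producer> delegates = new ArrayList<Producer>();
            for (Clip clip : nestedClips) {
                delegates.add(clip.createProducer());   // the recursion happens here
            }
            return new CompositeProducer(delegates);
        }
    }

    // Plays its delegates back to back; real compositing is more involved.
    final class CompositeProducer implements Producer {
        private final Iterator<Producer> remaining;
        private Producer current;

        CompositeProducer(List<Producer> delegates) {
            this.remaining = delegates.iterator();
            this.current = remaining.hasNext() ? remaining.next() : null;
        }

        public Buffer nextFrame() {
            while (current != null) {
                Buffer frame = current.nextFrame();
                if (frame != null) {
                    return frame;
                }
                current = remaining.hasNext() ? remaining.next() : null;
            }
            return null;   // every nested producer is exhausted
        }
    }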

Here’s a quick run-down of the terminology that I’ll use in this discussion.

A clip is an object that represents a piece of media, such as “the first 23 seconds of video from vacation.avi” or “50,000 samples of an 8kHz sine wave”. A track is a container for clips of a certain media type (audio or video).  A timeline, then, can be seen simply as a named collection of tracks. Finally, an effect is a (length-preserving) transformation placed on a clip, such as a film grain effect, a saturation adjustment, or a volume normalisation.
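
Restated as toy Java types (purely illustrative; Eve’s real classes will differ), the four terms fit together like this:

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical shapes for the four terms.
    enum MediaType { AUDIO, VIDEO }

    interface Clip {
        MediaType mediaType();    // every clip is exactly one media type
        long lengthInFrames();    // e.g. 23 seconds of video, or 50,000 samples
    }

    interface Effect {
        // Length-preserving: the transformed clip plays for exactly as long.
        Buffer apply(Buffer frame);
    }

    final class Track {
        final MediaType type;                            // holds clips of one type only
        final List<Clip> clips = new ArrayList<Clip>();
        Track(MediaType type) { this.type = type; }
    }

    final class Timeline {
        final String name;                               // a named collection of tracks
        final List<Track> tracks = new ArrayList<Track>();
        Timeline(String name) { this.name = name; }
    }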

In an attempt to make editing large projects easier, and to facilitate re-use of common timeline elements (bumpers, episode introductions, credit sequences, and so on), Eve takes a hierarchical approach to the timeline of a video project. Put simply, timelines can reference other timelines (and parts of other timelines). This means you can insert the video and/or audio from one timeline into another and have it act as a sort of black box. Timeline clips, as these are called, can reference a particular track of another timeline, a different track of the current timeline, or the fully composited output of a given media type from a different timeline. A fair question: where’s the use case?

Say you want to add a colour cast to your entire production: simply create a new timeline, and add in the composited audio and video tracks from your main timeline. All that is left to do is to apply the colour cast effect to the correct clip, and you’re set. Notice that because timeline clips are a type of clip, and clips must choose one specific media type, there will be both a video and an audio clip in your new timeline.
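
Using the toy types from the sketch above, the workflow reads roughly like this; CompositedClip, addEffect(), and the grade() wrapper are all invented for the example:

    import java.util.ArrayList;
    import java.util.List;

    // Invented for this example: a clip that wraps the composited output of
    // another timeline for one media type, plus a place to hang effects.
    final class CompositedClip implements Clip {
        private final Timeline source;
        private final MediaType type;
        private final List<Effect> effects = new ArrayList<Effect>();

        CompositedClip(Timeline source, MediaType type) {
            this.source = source;
            this.type = type;
        }

        public MediaType mediaType()  { return type; }
        public long lengthInFrames()  { return 0; /* would be derived from the source timeline */ }
        void addEffect(Effect effect) { effects.add(effect); }
    }

    final class ColourCastExample {
        static Timeline grade(Timeline main, Effect colourCast) {
            Timeline graded = new Timeline("graded master");
            Track video = new Track(MediaType.VIDEO);
            Track audio = new Track(MediaType.AUDIO);
            graded.tracks.add(video);
            graded.tracks.add(audio);

            // One timeline clip per media type, treating the main timeline's
            // composited output as a black box.
            CompositedClip videoClip = new CompositedClip(main, MediaType.VIDEO);
            CompositedClip audioClip = new CompositedClip(main, MediaType.AUDIO);
            video.clips.add(videoClip);
            audio.clips.add(audioClip);

            // The colour cast goes on the video clip only; the audio is untouched.
            videoClip.addEffect(colourCast);
            return graded;
        }
    }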

A more industry-specific example: you could split the workload of a long project into several slices of time and define each as a new timeline. After being worked on separately, the timelines can be stitched together very simply at the end — just add each timeline to the master timeline in series, add in any boilerplate introduction or credit sequence timelines, and you’re ready to export. All this without having to copy around and assemble hundreds or thousands of clips. You get a cleaner workflow, the ability to add transitions for free and shuffle the order after the fact, and the semantic plus of being able to name the timelines for easier navigation. Finally, all of these last-minute decisions can be reversed more easily, allowing for more creative play.

Hopefully I’ve made all of this clear. Any criticism?

My name is Cameron Gorrie; I’m a third-year (at the time of writing) undergraduate Computer Science student at the University of Toronto. I’ve set up this blog as a reminder to myself: document everything! Hopefully, there will be the side-effect of showcasing some interesting things to anyone who wants to see them. What kinds of interesting things?

I’m writing a video editing system using Java, Swing, Gstreamer, and OpenGL. It’s tentatively called Eve, short for “the Extensible Video Editor” (right now, it’s “the Eventual Video Editor”; every computer scientist is also a comedian). So far a great deal of the back-end work is complete, and Gstreamer is even loading movie files on both Ubuntu and Windows Vista. Everything is pluggable, and the system has even been designed with the front-end/back-end paradigm in mind, so that the two can run in entirely different processes. I will be documenting some of the back-end choices I have made on this blog in the coming weeks, so if this has piqued your interest, make sure you check back.

I’m assisting in research on the usability of various parallel programming systems with Greg Wilson and another undergraduate student, Andriy Borzenko, at the University. The research is in its preliminary stages, and already I feel like I’m in over my head — but in a certain charming manner, as if I could dance around the details of mainframe systems programming and still get something out of it. It should be interesting!

In the summer of 2008, I interned for a software development company called The Jonah Group, developing large-scale business software. They’re a fantastic bunch of people and I recommend them if you’re looking for custom software or web development done right. Their track records for delivery and cost estimation are among the best in the business, if they aren’t leaving everyone else in the dust. I’m now working with them part-time.

Personally, I am somewhat of a photographer and create quite a lot of video. My research interests are AI, computer vision, parallel programming, and image analysis/manipulation. I’ve edited and helped film quite a few comedic short productions. Stay tuned.

Next up: design documentation for Eve, a necessary plug for Gstreamer, and some thoughts on parallel processing.
