DirectFB 2.0: Efficient 2D Vector Graphics and Media

Let's start with some extensions that can easily be added to the current graphics core.

= Rendering Extensions to 1.1 =
 * New features already in
  * affine coordinate transformations (3x2 fixed point matrix)
  * anti-aliased rendering of (out-)lines or polygon edges (any draw/blit)
  * color and alpha mask support (third blit operand A8, ARGB, RGB16, any...)
 * Proposed extensions
  * Blit/FillQuadrangles
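
As an illustration of the affine transformation item above, here is a minimal sketch of applying a 3x2 fixed point matrix to a point. The 16.16 format, the `DemoAffine` layout and all function names are assumptions made for this example; they are not taken from the DirectFB headers.

```c
#include <stdint.h>

/* Assumption: 16.16 fixed point, i.e. 65536 represents 1.0. */
typedef int32_t demo_fixed;
#define DEMO_FIXED_ONE 65536

/* Hypothetical 3x2 matrix layout:
 *   x' = xx * x + xy * y + x0
 *   y' = yx * x + yy * y + y0
 */
typedef struct {
    demo_fixed xx, xy, x0;
    demo_fixed yx, yy, y0;
} DemoAffine;

/* Multiply two 16.16 values using a 64 bit intermediate.
 * Division is used instead of a shift so negative values are well defined. */
demo_fixed demo_fixed_mul(demo_fixed a, demo_fixed b)
{
    return (demo_fixed)(((int64_t)a * (int64_t)b) / DEMO_FIXED_ONE);
}

/* Transform the point (x, y) and store the result in (*ox, *oy). */
void demo_affine_apply(const DemoAffine *m, demo_fixed x, demo_fixed y,
                       demo_fixed *ox, demo_fixed *oy)
{
    *ox = demo_fixed_mul(m->xx, x) + demo_fixed_mul(m->xy, y) + m->x0;
    *oy = demo_fixed_mul(m->yx, x) + demo_fixed_mul(m->yy, y) + m->y0;
}
```

With a matrix that scales by 2 and translates by (10, -5), the point (3, 4) maps to (16, 3).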

= Rendering, acceleration (and media) architecture for 2.0 =

In the long term we need a completely new interface with a redesigned render core (formerly "graphics core", or at least the major part) and new driver APIs. The old interface will still be available, either as a reimplementation using the new render core or with the old one integrated as another driver (model), like a GLES implementation would be.

Not even a draft:
 * Rendering a completely predefined scene, a bunch of predefined objects or just a bunch of primitives (three levels from retained to immediate).
 * RenderScene
 * RenderObjects
 * RenderElements and
 * SetAttributes

There's also a draft version.

There will always be hardware with only limited vector graphics support, if any. For such hardware it is DirectFB's job to provide decent acceleration by breaking higher level or unsupported vector graphics operations down into the hardware primitives.

== New architecture ==
The new rendering architecture is intended to scale across any graphics accelerator, from the most primitive (DMA copy/fill) to the most advanced, with or without OpenVG support.

It will be built on top of
 * A new accelerated rendering core (new driver interface)
 * the existing rendering architecture and
 * OpenVG/Cairo

While Cairo seems a good fit as the software-only implementation of the new interface, Cairo itself will also be implemented on top of that interface to leverage any basic or more advanced 2D acceleration available.

Same goes for OpenVG, but the intention in this case is to have a generic implementation of the new API which provides generic hardware acceleration.

Building OpenVG on top makes as much sense as using it underneath; it just depends on the hardware capabilities.

The new API goes a step further than OpenVG, Cairo and all those call based immediate rendering interfaces. It will finally allow the last step: moving all information to the GPU side, so that a single command is enough to render the complete frame of a user interface.

== Migration ==
The new core is just one (the native) implementation of the new API.

Before it "arrives" (is usable):

IWater.
     V           |
CairoDirectFB    |
     V           |
IDirectFBSurface ---'>> Old Graphics Core >> Hardware

Afterwards a lot of constellations make sense:

CairoWater
     V
  IWater         .--->>> New Graphics Core -->>>> Hardware/OpenVG
     V           |              vv
CairoDirectFB((---:))? (( Old Graphics Core >> Hardware ))?
     V           |
IDirectFBSurface ---'

The first fully featured IWater implementation (IWater_Cairo) will be based *on* Cairo, while in the end most people will use IWater_native (using the new rendering core) to implement Cairo or any other API on top, leveraging any acceleration available.

== Network and Extensions ==
Joints are universal entities for building networks of attributes, elements and other joints...

A network can be seen as a scene graph with a versatile structure and extensible types.

Different joint types exist to allow efficiency in all use cases:
 * Declaration of a complete network by using static buffers and initializers
 * Building a huge network quickly by connecting a few bigger networks, e.g. static buffers
 * Complete initialization of a network by just reading a single stream from disk/net, or mmapping a file
 * Dynamic management of the network including modifications of attributes and basic reconfiguration
 * Rapid reconfiguration of the whole network structure, adding and removing a lot
 * Loading and rendering parts of excessively big and complex scenes like maps easily without much overhead

The range of type values is limited and has been split into the following sub ranges:
 * Base types with official type range of 0x0000-0x3FFF and private range of 0x4000-0x7FFF.
 * Extended types with official type range of 0x8000-0xBFFF and private range of 0xC000-0xEFFF.
 * The remaining space starting at 0xF000 is RESERVED!
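
The split of the 16 bit type space listed above can be expressed as a small classification helper. This is only a sketch: the enum and function names are hypothetical, and only the numeric ranges come from the list.

```c
#include <stdint.h>

/* Hypothetical names for the sub ranges of the 16 bit joint type space. */
typedef enum {
    DEMO_RANGE_BASE_OFFICIAL,      /* 0x0000-0x3FFF */
    DEMO_RANGE_BASE_PRIVATE,       /* 0x4000-0x7FFF */
    DEMO_RANGE_EXTENDED_OFFICIAL,  /* 0x8000-0xBFFF */
    DEMO_RANGE_EXTENDED_PRIVATE,   /* 0xC000-0xEFFF */
    DEMO_RANGE_RESERVED            /* 0xF000-0xFFFF */
} DemoJointRange;

/* Map a joint type value to the sub range it falls into. */
DemoJointRange demo_joint_range(uint16_t type)
{
    if (type <= 0x3FFF) return DEMO_RANGE_BASE_OFFICIAL;
    if (type <= 0x7FFF) return DEMO_RANGE_BASE_PRIVATE;
    if (type <= 0xBFFF) return DEMO_RANGE_EXTENDED_OFFICIAL;
    if (type <= 0xEFFF) return DEMO_RANGE_EXTENDED_PRIVATE;
    return DEMO_RANGE_RESERVED;
}

/* Base types (official or private) occupy the lower half of the space. */
int demo_joint_is_base(uint16_t type)
{
    return type < 0x8000;
}
```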

The base types allow further traversal without knowing anything about the actual joint type, because the number of bytes to skip a joint is guaranteed to be given by the 'bytes' member of the common base header: WaterJoint.

Extended types are usually logical groups of other joint types, with small to large networks behind them. They are not meant to be processed any further if the type is not known to the traversing party (library, application, hardware...).

An extended joint can never appear without a known base "around" it, so it can always be jumped over without being handled at all.

If there's a stream of an extended type, like WJT_SHAPE_STREAM, the number of bytes to skip between each stream item is based on the size of its header: WaterShape. If that type of stream is not supported, only the number of bytes to skip, as set in the common base header (WaterJoint::bytes), is needed.
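
The skipping rule can be sketched as a walk over a flat buffer of joints. The header below is a hypothetical stand-in for WaterJoint: its exact fields are not given here, so the layout, the assumption that 'bytes' counts the whole joint including its header, and all `demo_` names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the common base header (WaterJoint).
 * Assumption: 'bytes' is the size of the whole joint, header included. */
typedef struct {
    uint16_t type;      /* joint type, see the type ranges above */
    uint16_t reserved;  /* padding in this sketch */
    uint32_t bytes;     /* number of bytes to skip to reach the next joint */
} DemoJointHeader;

/* Walk all joints in buf, counting how many have the type the caller knows.
 * Unknown joints are skipped using only the 'bytes' member.
 * Returns the total number of well formed joints visited. */
size_t demo_walk_joints(const uint8_t *buf, size_t len,
                        uint16_t known_type, size_t *known_count)
{
    size_t pos = 0, total = 0, known = 0;

    while (len - pos >= sizeof(DemoJointHeader)) {
        DemoJointHeader h;

        /* Copy the header out to avoid unaligned access. */
        memcpy(&h, buf + pos, sizeof h);

        /* Stop on a corrupt or truncated joint. */
        if (h.bytes < sizeof h || h.bytes > len - pos)
            break;

        if (h.type == known_type)
            known++;    /* a real traversal would process the joint here */

        total++;
        pos += h.bytes; /* jump over the joint, known or not */
    }

    if (known_count)
        *known_count = known;
    return total;
}
```

The point of the design is that a traversing party never needs to parse the body of a joint it does not understand; the base header alone is enough to reach the next one.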

What about CSS, networking, JavaScript, DOM, SVG, MPEG-4, MHEG etc.?
No problem, just add some new type(s) of joints (scene graph element), but outside of the Base API using extension mechanisms. Imagine GPUs that will walk the network and support things like applying a special kind of movement to an object or a complete tree as it is hierarchical.

This makes Water a universal architecture for step by step (migrating) acceleration of everything!

There will likely be an implementation of this new declarative drawing on TI DaVinci as well, doing the whole scene graph traversal on the DSP. Step by step, Water allows moving code from the CPU to the DSP, up to running most of the code of an MHEG engine on the DSP!

Using shader languages the same should be possible with GPUs. In DirectFB 2.0 multiple GPUs, or other kinds of processors from multi purpose DSPs to Water Hardware, will be managed at the same time. Imagine 200+ stream processors all thirsty wanting Water! Parallel rendering is one of the big topics of DirectFB 2.0!

== Micro Core and Modules ==
The new rendering core must be lightweight with good abstractions so basically any actual light- or heavyweight implementation, like transforming a curve into a line strip or emulating anti-aliasing with multiple passes, could be implemented as a module and plugged into the core. That way a lot of people will not complain about the added complexity too much. If the old drawing operations are enough, they can use the micro core with a few extensions if at all and will probably have the same amount of code as before.
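
To make the "curve into a line strip" case concrete, here is a minimal sketch of what such a module could compute: a quadratic Bezier curve flattened into a fixed number of line segments. The function and type names are hypothetical, and uniform subdivision is chosen only for brevity; a real module might pick the segment count adaptively from the curvature.

```c
#include <stddef.h>

/* Hypothetical point type for this sketch. */
typedef struct {
    double x, y;
} DemoPoint;

/* Flatten the quadratic Bezier curve (p0, p1, p2) into a line strip.
 * 'out' must have room for segments + 1 points.
 * Returns the number of points written, or 0 if segments is 0. */
size_t demo_flatten_quadratic(DemoPoint p0, DemoPoint p1, DemoPoint p2,
                              size_t segments, DemoPoint *out)
{
    size_t i;

    if (segments == 0)
        return 0;

    for (i = 0; i <= segments; i++) {
        double t = (double)i / (double)segments;
        double u = 1.0 - t;

        /* B(t) = u^2 * p0 + 2ut * p1 + t^2 * p2 */
        out[i].x = u * u * p0.x + 2.0 * u * t * p1.x + t * t * p2.x;
        out[i].y = u * u * p0.y + 2.0 * u * t * p1.y + t * t * p2.y;
    }

    return segments + 1;
}
```

The resulting points can then be handed to whatever line drawing primitive the hardware or the old drawing operations provide.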

Best case, of course, is when you have a lightweight core, because you can simply pass through all high level stuff to Cairo or OpenVG or even better dedicated hardware or firmware running on a DSP.

== Completeness of public APIs ==
We should take care that we will always be able to do simple mappings and then "translate" further via plugin modules, e.g. to render a gradient by using a lot of smaller elements with vertex coloring OR with a generated texture. That is what the driver and core need to negotiate as outlined in Managed Rendering.
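
As a sketch of the "generated texture" variant of the gradient example, the following code fills one row of ARGB pixels with a two stop linear gradient. A plugin module could then blit or stretch that row instead of asking the driver for native gradient support. All names are hypothetical, and the ARGB layout with 8 bits per channel is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Interpolate one 8 bit channel: a at num == 0, b at num == den, rounded. */
static uint32_t demo_lerp_channel(uint32_t a, uint32_t b,
                                  uint32_t num, uint32_t den)
{
    return (a * (den - num) + b * num + den / 2) / den;
}

/* Fill 'row' with 'width' ARGB pixels going linearly from 'from' to 'to'.
 * Assumption: width is small enough that 255 * width fits in 32 bits. */
void demo_gradient_row(uint32_t from, uint32_t to, uint32_t *row, size_t width)
{
    size_t i;

    for (i = 0; i < width; i++) {
        /* A single pixel row just gets the start color. */
        uint32_t den = (width > 1) ? (uint32_t)(width - 1) : 1;
        uint32_t num = (width > 1) ? (uint32_t)i : 0;
        uint32_t pixel = 0;
        int shift;

        /* Handle the B, G, R and A channels one byte at a time. */
        for (shift = 0; shift < 32; shift += 8) {
            uint32_t a = (from >> shift) & 0xFF;
            uint32_t b = (to >> shift) & 0xFF;

            pixel |= demo_lerp_channel(a, b, num, den) << shift;
        }

        row[i] = pixel;
    }
}
```

Whether this texture route or the "many small elements" route is cheaper depends on the hardware, which is exactly the kind of decision the driver and core would negotiate.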

Currently, only rendering related items are defined besides the generic network architecture. Further extensions could allow complete mapping of SVG 1.2, MPEG-4 and others to Water or a Water derived API. We currently have 16 bits to denote each type of a network joint (element), a lot of space to reserve officially for such extensions!

== Rendering Example ==
This demonstrates the entry points at the different stages of the managed rendering architecture. It ranges from single attribute / element per call (7 calls), over attribute / element streams (4 calls), up to rendering the complete scene with a single call (1 call!). The original Cairo example code took 10 calls, as opposed to one RenderShape. The latter one could be cached completely and might only require another DMA command reusing the cached commands from the previous run...

A more sophisticated example is the Water version of df_matrix.