Name

    NV_bindless_multi_draw_indirect

Name Strings

    GL_NV_bindless_multi_draw_indirect

Contact

    Christoph Kubisch, NVIDIA (ckubisch 'at' nvidia.com)

Contributors

    Jeff Bolz, NVIDIA
    Piers Daniell, NVIDIA
    Markus Tavenrath, NVIDIA
    Eric Werness, NVIDIA
    Vincent Zhang, NVIDIA

Status

    Complete

Version

    Last Modified Date: July 3rd, 2015
    Revision: 3

Number

    OpenGL Extension #432

Dependencies

    NV_vertex_buffer_unified_memory is required.
    
    OpenGL 4.3 or ARB_multi_draw_indirect is required.

    The extension is written against the OpenGL 4.3 Specification, Core Profile.
    
Overview

    This extension combines NV_vertex_buffer_unified_memory and 
    ARB_multi_draw_indirect to allow the processing of multiple drawing
    commands, whose vertex and index data can be sourced from arbitrary 
    buffer locations, by a single function call.

    The NV_vertex_buffer_unified_memory extension provided a mechanism to 
    specify vertex attrib and element array locations using GPU addresses.
    Prior to this extension, these addresses had to be set through explicit
    function calls. Now the ability to set the pointer addresses indirectly
    by extending the GL_ARB_draw_indirect mechanism has been added.
    
    Combined with other "bindless" extensions, such as NV_bindless_texture and
    NV_shader_buffer_load, it is now possible for the GPU to create draw
    commands that source all resource inputs, which are common to change 
    frequently between draw calls from the GPU: vertex and index buffers, 
    samplers, images and other shader input data stored in buffers.
    

New Procedures and Functions

        void MultiDrawArraysIndirectBindlessNV(enum mode, 
                                               const void *indirect, 
                                               sizei drawCount, 
                                               sizei stride, 
                                               int vertexBufferCount);

        void MultiDrawElementsIndirectBindlessNV(enum mode, 
                                                 enum type, 
                                                 const void *indirect, 
                                                 sizei drawCount, 
                                                 sizei stride, 
                                                 int vertexBufferCount);

New Tokens

    None.

Additions to Chapter 10 of the OpenGL 4.3 (Core) Specification (Vertex 
Specification and Drawing Commands)

    Additions to Section 10.5, "Drawing Commands Using Vertex Arrays"

    After the description of MultiDrawArraysIndirect and before the introduction of
    DrawElementsOneInstance, insert the following on p.311:

        The command

        void MultiDrawArraysIndirectBindlessNV(enum mode, 
                                               const void *indirect, 
                                               sizei drawCount, 
                                               sizei stride, 
                                               int vertexBufferCount);

    behaves similar to MultiDrawArraysIndirect, except that <indirect> is
    treated as an array of <drawCount> DrawArraysIndirectBindlessCommandNV
    structures.
    
    It has the same effect as:

        typedef struct {
          GLuint   index;
          GLuint   reserved; 
          GLuint64 address;
          GLuint64 length;
        } BindlessPtrNV; 
    
        typedef struct {
          DrawArraysIndirectCommand   cmd;
          BindlessPtrNV               vertexBuffers[];
        } DrawArraysIndirectBindlessCommandNV;

        if (<mode> is invalid)
            generate appropriate error
        else {
            GLuint64 vtxAddresses[MAX_VERTEX_ATTRIBS];
            GLsizeiptr vtxLengths[MAX_VERTEX_ATTRIBS];
            for (i = 0; i < MAX_VERTEX_ATTRIBS) {
              GetIntegerui64i_vNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV,i,&vtxAddresses[i]);
              GetInteger64i_v(VERTEX_ATTRIB_ARRAY_LENGTH_NV ,i,&vtxLengths[i]);
            }
        
            const ubyte * ptr = (const ubyte *)indirect;
            for (i = 0; i < drawCount; i++) {
                const DrawArraysIndirectBindlessCommandNV* cmdNV = 
                              (DrawArraysIndirectBindlessCommandNV*)ptr;
                
                if ( cmdNV->cmd.instanceCount != 0 ){
                    for (n = 0; n < vertexBufferCount; n++){
                      BufferAddressRangeNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV,
                                           cmdNV->vertexBuffers[n].index
                                           cmdNV->vertexBuffers[n].address,
                                           cmdNV->vertexBuffers[n].length);
                    }

                    DrawArraysIndirect(mode,
                                       (DrawArraysIndirectCommand*)ptr);
                }
                if (stride == 0)
                {
                    ptr +=  sizeof(DrawArraysIndirectCommand) + 
                           (sizeof(BindlessPtrNV) * vertexBufferCount);
                } else
                {
                    ptr += stride;
                }
            }
            
            for (i = 0; i < MAX_VERTEX_ATTRIBS) {
              BufferAddressRangeNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV,i,vtxAddresses[i],vtxLengths[i]);
            }
        }

    <drawCount> must be positive, otherwise an INVALID_VALUE error will be
    generated.

    After the description of MultiDrawElementsIndirect and before the introduction
    of MultiDrawElementsBaseVertex, insert the following on p.316:

        The command

        void MultiDrawElementsIndirectBindlessNV(enum mode, 
                                                 enum type, 
                                                 const void *indirect, 
                                                 sizei drawCount, 
                                                 sizei stride, 
                                                 int vertexBufferCount);

    behaves similar to MultiDrawElementsIndirect, except that <indirect> is
    treated as an array of <drawCount> DrawElementsIndirectBindlessCommandNV
    structures.
    
    It has the same effect as:
    
        typedef struct {
          GLuint   index;
          GLuint   reserved; 
          GLuint64 address;
          GLuint64 length;
        } BindlessPtrNV; 
    
        typedef struct {
          DrawElementsIndirectCommand cmd;
          GLuint                      reserved; 
          BindlessPtrNV               indexBuffer;
          BindlessPtrNV               vertexBuffers[];
        } DrawElementsIndirectBindlessCommandNV;

        if (<mode> or <type> is invalid)
            generate appropriate error
        else {
            GLuint64   idxAddress;
            GLsizeiptr idxLength;
            GetIntegerui64vNV(ELEMENT_ARRAY_ADDRESS_NV,&idxAddress);
            GetInteger64v(ELEMENT_ARRAY_LENGTH_NV,&idxLength);
            
            GLuint64 vtxAddresses[MAX_VERTEX_ATTRIBS];
            GLsizeiptr vtxLengths[MAX_VERTEX_ATTRIBS];
            for (i = 0; i < MAX_VERTEX_ATTRIBS) {
              GetIntegerui64i_vNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV,i,&vtxAddresses[i]);
              GetInteger64i_v(VERTEX_ATTRIB_ARRAY_LENGTH_NV ,i,&vtxLengths[i]);
            }
            
            const ubyte * ptr = (const ubyte *)indirect;
            for (i = 0; i < primcount; i++) {
                const DrawElementsIndirectBindlessCommandNV* cmdNV = 
                              (DrawElementsIndirectBindlessCommandNV*)ptr;
                
                if ( cmdNV->cmd.instanceCount != 0 ) {
                    BufferAddressRangeNV(ELEMENT_ARRAY_ADDRESS_NV,
                                         0,
                                         cmdNV->indexBuffer.address,
                                         cmdNV->indexBuffer.length);
                                         
                    for (n = 0; n < vertexBufferCount; n++){
                      BufferAddressRangeNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV,
                                           cmdNV->vertexBuffers[n].index
                                           cmdNV->vertexBuffers[n].address,
                                           cmdNV->vertexBuffers[n].length);
                    }

                    DrawElementsIndirect(mode,
                                         type,
                                         (DrawElementsIndirectCommand*)ptr);
                }
                
                if (stride == 0)
                {
                    ptr +=  sizeof(DrawElementsIndirectCommand) + 
                            sizeof(GLuint) +
                            sizeof(BindlessPtrNV) + 
                           (sizeof(BindlessPtrNV) * vertexBufferCount);
                } else
                {
                    ptr += stride;
                }
            }
            
            BufferAddressRangeNV(ELEMENT_ARRAY_ADDRESS_NV,0,idxAddress,idxLengths);
            for (i = 0; i < MAX_VERTEX_ATTRIBS) {
              BufferAddressRangeNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV,i,vtxAddresses[i],vtxLengths[i]);
            }
        }

    Modifications to Section 10.3.10 (p. 305) "Indirect Commands in Buffer Objects"

    Modify both instances of "DrawArraysIndirect, DrawElementsIndirect,
    MultiDrawArraysIndirect and MultiDrawElementsIndirect" on
    p. 305 to read "DrawArraysIndirect, DrawElementsIndirect,
    MultiDrawArraysIndirect, MultiDrawElementsIndirect, 
    MultiDrawArraysIndirectBindlessNV and MultiDrawElementsIndirectBindlessNV".


Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Errors

    INVALID_VALUE is generated by MultiDrawArraysIndirectBindlessNV or
    MultiDrawElementsIndirectBindlessNV if <drawCount> is negative.

    INVALID_VALUE is generated by MultiDrawArraysIndirectBindlessNV or
    MultiDrawElementsIndirectBindlessNV if <stride> is not a multipe of eight.
    
    INVALID_OPERATION is generated by MultiDrawArraysIndirectBindlessNV or
    MultiDrawElementsIndirectBindlessNV if the 
    VERTEX_ATTRIB_ARRAY_UNIFIED_NV client state is not enabled.
    
    INVALID_OPERATION is generated by MultiDrawElementsIndirectBindlessNV 
    if the ELEMENT_ARRAY_UNIFIED_NV client state is not enabled.

New State

    None.

New Implementation Dependent State

    None.

Issues

    1) BufferAddressRangeNV is defined to take GLsizeiptr as length.
       Should the length parameter inside the indirect structure be cast
       to GLsizeiptr when passed as argument?

    RESOLVED: When the indirect buffer is stored in host memory, the cast
    should be performed, otherwise, 64-bit should be assumed.

    2) BufferAddressRangeNV also covered fixed function vertex state, should
       this be exposed in this extension?

    NO, the named attributes aren't necessary, as it is very likely that 
    shaders are used to also handle the parameters of individual drawcalls.
    For fixed function shading the interleaving of related state changes
    with drawcalls would already be efficiently handled by regular
    MultiDraw or NV_vertex_buffer_unified_memory functionality.
    
    3) In which state should the vertex/index pointers be left after the 
       execution of the command?
       
    RESOLVED: All pointers should be reset to what was last
    specified by the client side using BufferAddressRangeNV.
    
    4) How does this extension relate to the newer NV_command_list?
    
    NV_command_list provides a more flexible draw indirect mechnanism. It
    can be more efficient if there are redundancies in the vertex and index
    buffer addresses, as it allows specifying addresses and drawcalls with
    different frequencies in the command stream.

Revision History

    Rev.    Date      Author    Changes
    ----  --------    --------  -----------------------------------------
     1                ckubisch  Internal revisions
     2    1/5/2015    ckubisch  bugfix, remove of "reserved" in
                                DrawArraysIndirectBindlessCommandNV
                                added reference to NV_command_list
     3    7/3/2015    ckubisch  bugfix, wrong math used in the "if (stride == 0)" 
                                pointer advancing. GLsizeiptr for length instead of
                                GLuint64