DirectShow filter with Kinect device

    Question

  • The Kinect device can act like a camera, but I was wondering if it's possible to create a capture filter, since its libraries can be accessed in C++?

    I would like to know where I can start on this task. I know how to open the device in code using the Kinect SDK, but how do I capture the video and route it to the filter so that other applications can use the Kinect's RGB camera as a normal camera device and record video?



    Monday, March 12, 2012 9:54 PM


All replies

  • The Thinker wrote:
    >
    >The Kinect device can act like a camera, but I was wondering if it's
    >possible to create a capture filter, since its libraries can be accessed in C++?
     
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Wednesday, March 14, 2012 5:29 AM
  • Well, he did post the code link in that thread, but can you help me track down possible problems with his code? It was working before in Beta 2, but I'm guessing he did something wrong, because now it acts like it's just loading in Skype and other programs, but it loads fine in AMCap in everyone's tests. Is there anything you can suggest, Tim, that would improve his filter, since the source is available?

    It looks like it loads like a DLL component, but he uses AMovie in his code. Is this the proper way to create a filter, or should a filter in Windows 7 be created another way?



    Wednesday, March 14, 2012 12:07 PM
  • The Thinker wrote:
    >
    >Well, he did post the code link in that thread, but can you help me
    >track down possible problems with his code? It was working before in
    >Beta 2, but I'm guessing he did something wrong, because now it acts
    >like it's just loading in Skype and other programs, but it loads fine
    >in AMCap in everyone's tests.
     
    Does Skype recognize it as a capture source?  The largest part of the
    difficulty is getting the registry set up so that it can be recognized as a
    capture source.
     
    If it's getting recognized, then there must be something in the format.  A
    debugger will help.
     
    >It looks like it loads like a DLL component, but he uses AMovie in his
    >code. Is this the proper way to create a filter, or should a filter in
    >Windows 7 be created another way?
     
    "AMovie", short for ActiveMovie, was the original name of the technology
    that was eventually released as DirectShow.  The means for registering
    filters has not changed.
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Friday, March 16, 2012 4:48 AM
  • It does still recognize it as a capture source in Skype and Microsoft Expression, but I get nothing but rotating circles in Skype, and in Expression nothing but it hanging on me. I could help you with Kinect code samples in C++ that open the feed; it seems you just have to open the stream and enable the RGB feed with the current SDK. If you can tell me how to pass the video feed from the Kinect to a capture program the correct way, or whether ActiveMovie was correct but the author needed to pass more information this time and so it didn't work, please tell me, Tim, what would help me the most. As said above, that thread has the source code download in C++ (if I didn't mention it was C++, I just did). If you were to help, I could easily find code to open the video stream/feed on the Kinect, because that's basic Kinect code and should be easy, but I think he left out the quality of the stream, like 1200 x 600 (not the actual size, but maybe he left out that property, which you can set).

    The biggest questions I'm trying to get answered are:

    1. Setting up the stream correctly in the filter (this excludes the Kinect code, which is already in the code; it should have a statement to load the RGB stream, which is color, and I don't remember the author of the thread putting in a specific size for the video, but I could easily find any reference materials you need) so that the end-user program will recognize it as a capture device/source.

    2. Possible refinements to the registry, like you said: the code above would probably need to be recognized as a camcorder-type capture device and receive the RGB video from the Kinect (easily done, but mostly settings I should know about so I can look them up).

    P.S. If you access the thread link above, try downloading the source before logging back in again, and it will not take so long to access the thread.





    • Edited by The Thinker Friday, March 16, 2012 12:42 PM
    Friday, March 16, 2012 12:27 PM
  • I needed to post this so you can understand better how the SDK works, because it has DirectShow functionality built into the SDK, but you have to use the proper parameters. I know with a little work I could get you a stream, but I'm guessing that if I post this and partial code, you might be able to see where I'm going with this:

    Kinect for Windows Architecture

    The SDK provides a sophisticated software library and tools to help
    developers use the rich form of Kinect-based natural input, which senses
    and reacts to real-world events.

    The Kinect Sensor and associated software library interact with your
    application, as shown in Figure 1.

    [Figure 1. Hardware and software interaction with an application]

    The components of the SDK are shown in Figure 2.

    [Figure 2. SDK Architecture]

    These components include the following:

    Kinect hardware
    The hardware components, including the Kinect Sensor and the USB hub,
    through which the sensor is connected to the computer.

    Kinect drivers
    The Windows drivers for the Kinect Sensor, which are installed as part of
    the SDK setup process as described in this document. The Kinect drivers
    support:

    • The Kinect Sensor's microphone array as a kernel-mode audio device that
      you can access through the standard audio APIs in Windows.
    • Streaming image and depth data.
    • Device enumeration functions that enable an application to use more
      than one Kinect Sensor connected to the computer.

    KinectAudio DirectX Media Object (DMO)
    The Kinect DMO that extends the microphone array support to expose
    beam-forming and source-localization functionality.

    Windows 7 standard APIs
    The audio, speech, and media APIs in Windows 7, as described in the
    Windows 7 SDK and the Microsoft Speech SDK.

    The NUI API

    The NUI API is the core of the Kinect for Windows API. It supports
    fundamental image and device management features, including the following:

    • Access to the Kinect Sensors that are connected to the computer.
    • Access to image and depth data streams from the Kinect Sensor.
    • Delivery of a processed version of image and depth data to support
      skeletal tracking.

    This SDK includes C++ and C# versions of the SkeletalViewer sample.
    SkeletalViewer shows how to use the NUI API in an application that
    captures data from the NUI camera, uses skeletal images, and processes
    sensor data. For more information, see "Skeletal Viewer Walkthrough" on
    the SDK website.


    NUI API Initialization

    The Kinect drivers support the use of multiple Kinect Sensors on a single
    computer. The NUI API includes functions that enumerate the sensors, so
    that you can determine how many Kinect Sensors are connected to the
    computer, get the name of a particular sensor, and individually open and
    set streaming characteristics for each sensor.

    Although the SDK supports an application using multiple Kinect Sensors,
    only one application can use each Kinect Sensor at any given time.

    Kinect Sensors: Enumeration and Access

    C++ and managed code applications enumerate the available Kinect Sensors,
    open a sensor, and initialize the NUI API in one of the following ways:

    To initialize the NUI API and use only one Kinect Sensor in a C++
    application (see the sketch below):

    1. Call NuiInitialize. This function initializes the first instance of a
       Kinect Sensor device on the system.
    2. Call other NUI functions to stream image and skeleton data and manage
       the cameras.
    3. Call NuiShutdown when use of the Kinect Sensor is complete.
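
    A minimal sketch of those three steps, assuming only the color stream is
    needed (the flag name is from the C++ NUI API; error handling trimmed):

        #include <windows.h>
        #include <NuiApi.h>   // Kinect for Windows SDK 1.x

        int main()
        {
            // Step 1: initialize the first Kinect on the system, color only.
            if (FAILED(NuiInitialize(NUI_INITIALIZE_FLAG_USES_COLOR)))
                return 1;

            // Step 2: stream data here (NuiImageStreamOpen /
            // NuiImageStreamGetNextFrame; see the stream sections below).

            // Step 3: shut down when use of the sensor is complete.
            NuiShutdown();
            return 0;
        }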



    To initialize the NUI API and use more than one Kinect Sensor in a C++
    application (see the sketch below):

    1. Call NuiGetSensorCount to determine how many sensors are available.
    2. Call NuiCreateSensorByIndex to create an instance for each Kinect
       Sensor that the application uses. This function returns an INuiSensor
       interface pointer for the instance.
    3. Call INuiSensor::NuiInitialize to initialize the NUI API for the
       Kinect Sensor.
    4. Call other methods on the INuiSensor interface to stream image and
       skeleton data and manage the Kinect Sensor.
    5. Call INuiSensor::NuiShutdown on an instance of a Kinect Sensor to
       close the NUI API when use of that sensor is complete.
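
    A sketch of that enumeration path (again color only; error handling
    abbreviated):

        #include <windows.h>
        #include <NuiApi.h>

        void UseAllSensors()
        {
            int count = 0;
            if (FAILED(NuiGetSensorCount(&count)))                     // step 1
                return;

            for (int i = 0; i < count; ++i)
            {
                INuiSensor *pSensor = NULL;
                if (FAILED(NuiCreateSensorByIndex(i, &pSensor)))       // step 2
                    continue;

                if (SUCCEEDED(pSensor->NuiInitialize(
                        NUI_INITIALIZE_FLAG_USES_COLOR)))              // step 3
                {
                    // step 4: stream data through methods on pSensor here
                    pSensor->NuiShutdown();                            // step 5
                }
                pSensor->Release();   // it is a COM interface pointer
            }
        }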



    To initialize the NUI API and use one or more Kinect Sensors in managed
    code:

    1. Get a sensor from the KinectSensor class.

        KinectSensor sensor = (from sensorToCheck in KinectSensor.KinectSensors
                               where sensorToCheck.Status == KinectStatus.Connected
                               select sensorToCheck).FirstOrDefault();

    2. Call KinectSensor.Start to initialize the NUI API for the Kinect
       Sensor.
    3. Call additional methods in the managed interface to stream image and
       skeleton data and to manage the Kinect Sensor.
    4. Call KinectSensor.Stop when use of the Kinect Sensor is complete.


    Initialization Options

    The NUI API processes data from the Kinect Sensor through a multistage
    pipeline. At initialization, the application specifies the subsystems
    that it uses, so that the runtime can start the required portions of the
    pipeline. An application can choose one or more of the following options:

    Color
    The application streams color image data from the sensor.

    Depth
    The application streams depth image data from the sensor.

    Depth and player index
    The application streams depth data from the sensor and requires the
    player index that the skeleton tracking engine generates.

    Skeleton
    The application uses skeleton position data.

    These options determine the valid stream types and resolutions for the
    application. For example, if an application does not indicate at
    initialization of the NUI API that it uses depth, it cannot later open a
    depth stream.


    NUI Image Data Streams: An Overview

    The NUI API provides the means to modify settings for the Kinect Sensor,
    and it enables you to access image data from the sensor.

    Stream data is delivered as a succession of still-image frames. When
    initializing the NUI API, the application identifies the streams it will
    use. It then opens those streams with additional stream-specific details,
    including stream resolution, image type, and the number of buffers that
    the runtime should use to store incoming frames. If the runtime fills all
    the buffers before the application retrieves and releases a frame, the
    runtime discards the oldest frame and reuses that buffer. As a result, it
    is possible for frames to be dropped. An application can request up to
    four buffers; two is adequate for most usage scenarios.
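
    For example, opening the color stream in C++ with a two-frame buffer
    looks roughly like this (a sketch; the optional event parameter is
    covered under the polling and event models below):

        HANDLE hColorStream = NULL;

        // dwFrameLimit (here 2) is the number of buffers the runtime keeps;
        // once they are all full, the oldest frame is dropped.
        HRESULT hr = NuiImageStreamOpen(
            NUI_IMAGE_TYPE_COLOR,            // image type
            NUI_IMAGE_RESOLUTION_640x480,    // stream resolution
            0,                               // frame flags
            2,                               // number of buffers
            NULL,                            // optional "frame ready" event
            &hColorStream);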


    An application has access to the following kinds of image data from the
    Kinect Sensor:

    • Color data
    • Depth data
    • Player segmentation data

    Color Image Data

    Color data is available in the following two formats:

    RGB color
    RGB color provides 32-bit, linear X8R8G8B8-formatted color bitmaps, in
    sRGB color space. To work with RGB data, an application should specify a
    color or color_YUV image type when it opens the stream.

    YUV color
    YUV color provides 16-bit, gamma-corrected linear UYVY-formatted color
    bitmaps, where the gamma correction in YUV space is equivalent to sRGB
    gamma in RGB space. Because the YUV stream uses 16 bits per pixel, this
    format uses less memory to hold bitmap data and allocates less buffer
    memory when the stream is opened. To work with YUV data, your application
    should specify the raw YUV image type when it opens the stream. YUV data
    is available only at the 640×480 resolution and only at 15 FPS.

    Both color formats are computed from the same camera data, so the YUV
    data and the RGB data represent the same image. Choose the data format
    that is most convenient given your application's implementation.


    The Kinect Sensor uses a USB connection to pass data to the PC, and that
    connection provides a limited amount of bandwidth. The Bayer color image data
    that the sensor returns at 1280×1024 is compressed and converted to RGB before
    transmission to the runtime. The runtime then decompresses the data before it
    passes the data to your application. The use of compression makes it possible to
    return color data at frame rates as high as 30 FPS, but the algorithm that is
    used leads to some loss of image fidelity.


    Depth Data

    The depth data stream provides frames in which each pixel contains the
    Cartesian distance (in millimeters) from the camera plane to the nearest
    object at that particular x and y coordinate in the depth sensor's field
    of view. There are two possible ranges for depth data: the default range
    and the near range, as determined by the values in the DepthRange
    enumeration. Use the DepthImageFormat enumeration to choose the data
    format.

    Applications can process data from a depth stream to support various
    custom features, such as tracking users' motions and identifying
    background objects to ignore during play.

    Each pixel in the depth stream uses 13 bits for depth data and 3 bits to
    identify a player. A depth data value of 0 indicates that no depth data
    is available at that position, because all the objects were either too
    close to the camera or too far away from it. When skeleton tracking is
    disabled, the 3 bits that identify a player are set to 0.


    Player Segmentation Data

    The Kinect system processes sensor data to identify up to six human
    figures in front of the Kinect Sensor and then creates the player
    segmentation map. This map is a bitmap in which the pixel values
    correspond to the player index of the person in the field of view who is
    closest to the camera at that pixel position. Players can be tracked or
    non-tracked: only tracked players have complete skeletal information with
    the spatial position of 20 body joints. A maximum of 2 players can be
    tracked at any time by the Kinect system. Your application can choose
    which players to track, or it can allow the system to choose 2 by
    default.

    Although the player segmentation data is a separate logical stream, in
    practice the depth data and player segmentation data are merged into a
    single frame:

    • The 13 high-order bits of each pixel represent the distance from the
      depth sensor to the closest object, in millimeters.
    • The 3 low-order bits of each pixel represent the player index of the
      tracked player who is visible at the pixel's x and y coordinates. These
      bits are treated as an integer value and are not used as flags in a bit
      field.

    A player index value of zero indicates that no player was found at that
    location. Values one and two identify players. Applications commonly use
    player segmentation data as a mask to isolate specific users or regions
    of interest from the raw color and depth images.
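
    Given that packing, pulling the two values out of one 16-bit depth pixel
    is a shift and a mask (a sketch; depthBuffer, width, x, and y stand in
    for whatever your frame copy uses):

        USHORT pixel   = depthBuffer[y * width + x]; // one depth-stream pixel
        USHORT depthMm = pixel >> 3;                 // 13 high bits: distance in mm
        USHORT player  = pixel & 0x0007;             // 3 low bits: player index, 0 = none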


    Retrieving Image Information

    Application code gets the latest frame of image data by calling a frame
    retrieval method and passing a buffer. If the latest frame of data is
    ready, it is copied into the buffer. If your code requests frames of data
    faster than new frames are available, you can choose whether to wait for
    the next frame or to return immediately and try again later. The NUI
    Image Camera API never provides the same frame of data more than once.

    Applications can use either of the following two usage models: polling or
    event.


    Polling Model

    The polling model is the simplest option for reading data frames. First,
    the application code opens the image stream. It then requests a frame and
    specifies how long to wait for the next frame of data (between 0 and an
    infinite number of milliseconds). The request method returns when a new
    frame of data is ready or when the wait time expires, whichever comes
    first. Specifying an infinite wait causes the call for frame data to
    block and to wait as long as necessary for the next frame.

    When the request returns successfully, the new frame is ready for
    processing. If the time-out value is set to zero, the application code
    can poll for completion of a new frame while it performs other work on
    the same thread. A C++ application calls NuiImageStreamOpen to open a
    color or depth stream and omits the optional event. To poll for color and
    depth frames, a C++ application calls NuiImageStreamGetNextFrame and a C#
    application calls ColorImageStream.OpenNextFrame.
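
    A polling sketch in C++, reusing the hColorStream handle from the earlier
    stream-opening example:

        const NUI_IMAGE_FRAME *pFrame = NULL;

        // 0 ms returns immediately if no frame is ready; INFINITE blocks.
        if (SUCCEEDED(NuiImageStreamGetNextFrame(hColorStream, 0, &pFrame)))
        {
            // ... read pFrame->pFrameTexture here ...
            NuiImageStreamReleaseFrame(hColorStream, pFrame);
        }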


    Event Model

    The event model supports the ability to integrate retrieval of an image
    frame into an application engine with more flexibility and more accuracy.

    In this model, C++ code passes an event handle to NuiImageStreamOpen.
    When a new frame of image data is ready, the event is signaled. Any
    waiting thread wakes and gets the frame of image data by calling
    NuiImageStreamGetNextFrame. During this time, the event is reset by the
    NUI Image Camera API.

    C# code uses the event model by hooking a KinectSensor.ColorFrameReady
    event or a similar event to an appropriate event handler. When a new
    frame of data is ready, the event is signaled, and the handler runs and
    calls ColorImageStream.OpenNextFrame to get the frame.
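
    A C++ sketch of the event model; this is the same pattern the
    KinectCam::Nui_GetCamFrame code later in this thread uses:

        HANDLE hFrameEvent  = CreateEvent(NULL, TRUE, FALSE, NULL);
        HANDLE hColorStream = NULL;

        NuiImageStreamOpen(NUI_IMAGE_TYPE_COLOR, NUI_IMAGE_RESOLUTION_640x480,
                           0, 2, hFrameEvent, &hColorStream);

        // Worker loop body: the runtime signals the event per ready frame.
        if (WaitForSingleObject(hFrameEvent, INFINITE) == WAIT_OBJECT_0)
        {
            const NUI_IMAGE_FRAME *pFrame = NULL;
            if (SUCCEEDED(NuiImageStreamGetNextFrame(hColorStream, 0, &pFrame)))
            {
                // ... copy the frame out ...
                NuiImageStreamReleaseFrame(hColorStream, pFrame);
            }
        }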



    Friday, March 16, 2012 12:51 PM
  • The Thinker wrote:
    >
    >I could help you with Kinect code samples in C++ that open the feed;
    >it seems you just have to open the stream and enable the RGB feed with
    >the current SDK. If you can tell me how to pass the video feed from
    >the Kinect to a capture program the correct way, or whether ActiveMovie
    >was correct ...
     
    You said it is recognized in Skype, so you must already have a source
    filter of some kind.  Usually, a capture source will be a "push source",
    based on the CPushSource sample code in the DirectShow SDK.  In such a
    filter, the lowest-level worker function is the video output pin's
    FillBuffer function.  It gets called with an IMediaSample that needs to
    be filled.  You copy data in there and return it.
     
    >If you were to help, I could easily find code to open the video
    >stream/feed on the Kinect, because that's basic Kinect code and should
    >be easy, but I think he left out the quality of the stream, like
    >1200 x 600 (not the actual size, but maybe he left out that property,
    >which you can set).
     
    The Kinect camera is 640x480.  That's the only size it feeds.  It can do
    either RGB32 or a YUV 4:2:2 format.
     
    >The biggest questions I'm trying to get answered are:
    >
    >1. Setting up the stream correctly in the filter (this excludes the
    >Kinect code, which is already in the code; it should have a statement
    >to load the RGB stream, which is color, and I don't remember the author
    >of the thread putting in a specific size for the video, but I could
    >easily find any reference materials you need) so that the end-user
    >program will recognize it as a capture device/source.
     
    In a push source filter, the output pin's GetMediaType function fills in a
    CMediaType structure that describes the format(s) you support.  It gets
    called for format 0, then format 1, etc.  You return formats until you run
    out.  If you only have one, you fail the second call.
     
    You need to call SetType (MEDIATYPE_Video), SetSubtype (either
    MEDIASUBTYPE_RGB32 or one of the YUV formats), SetFormatType (probably
    FORMAT_VideoInfo2), and you need to set up a VIDEOINFOHEADER2 with the
    actual size of the frames you will ship.  That includes a DIB header that
    you fill in.
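
    A sketch of the single-format case described above (RGB32, 640x480,
    30 fps assumed; CMediaType and the VFW_S_NO_MORE_ITEMS convention come
    from the DirectShow base classes, but this is not the thread's actual
    code):

        HRESULT CVCamStream::GetMediaType(int iPosition, CMediaType *pmt)
        {
            if (iPosition < 0) return E_INVALIDARG;
            if (iPosition > 0) return VFW_S_NO_MORE_ITEMS; // only one format

            VIDEOINFOHEADER2 *pvi = (VIDEOINFOHEADER2 *)
                pmt->AllocFormatBuffer(sizeof(VIDEOINFOHEADER2));
            if (pvi == NULL) return E_OUTOFMEMORY;
            ZeroMemory(pvi, sizeof(VIDEOINFOHEADER2));

            pvi->AvgTimePerFrame         = 10000000 / 30;  // 100-ns units
            pvi->dwBitRate               = 640 * 480 * 32 * 30;
            pvi->bmiHeader.biSize        = sizeof(BITMAPINFOHEADER); // the DIB header
            pvi->bmiHeader.biWidth       = 640;
            pvi->bmiHeader.biHeight      = 480;
            pvi->bmiHeader.biPlanes      = 1;
            pvi->bmiHeader.biBitCount    = 32;
            pvi->bmiHeader.biCompression = BI_RGB;
            pvi->bmiHeader.biSizeImage   = 640 * 480 * 4;

            pmt->SetType(&MEDIATYPE_Video);
            pmt->SetSubtype(&MEDIASUBTYPE_RGB32);
            pmt->SetFormatType(&FORMAT_VideoInfo2);
            pmt->SetTemporalCompression(FALSE);
            pmt->SetSampleSize(pvi->bmiHeader.biSizeImage);
            return S_OK;
        }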
     
    >2. Possible refinements to the registry, like you said: the code above
    >would probably need to be recognized as a camcorder-type capture device
    >and receive the RGB video from the Kinect (easily done, but mostly
    >settings I should know about so I can look them up).
     
    You said it was being recognized by Skype, didn't you?  If so, then this
    task is done.  If you're in the list of video capture devices, that's what
    you want.
     
    You may want to test with AMCap or GraphEdit first -- those are simpler
    environments.
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
    • Marked as answer by The Thinker Monday, March 19, 2012 3:16 PM
    Sunday, March 18, 2012 12:55 AM
  • Thanks, Tim, that makes a lot more sense than the hardware DSF talk. I should have mentioned that my dream is game development, so I'm trying to learn 3D concepts and about video loading, and what you're saying makes more sense than the hardware items.

    I will post this for Scott in case he figures out a solution before me.

    But from what I'm reading, I fill the IMediaSample with my image data and set the video's size, and it will work? One other question I forgot to ask: how would I set options that show up in the program using the filter (like you said, it's possible to change resolution, but what if I wanted to do that, or move the Kinect camera up and down, which would be simple tasks in the program)?



    Monday, March 19, 2012 3:15 PM
  • The Thinker wrote:
    >
    >But from what I'm reading, I fill the IMediaSample with my image data
    >and set the video's size, and it will work?
     
    The size of the video is set by negotiation when the pins are connected and
    remains unchanged from that point on.  Yes, you copy the frame data to the
    data pointer in the IMediaSample and set the number of bytes that you wrote.
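
    In code, that step is roughly this (a sketch; pms is the IMediaSample
    pointer, and pFrameData/frameBytes stand in for wherever your frame
    lives):

        BYTE *pDst = NULL;
        pms->GetPointer(&pDst);              // buffer owned by the IMediaSample
        long cb = min((long)frameBytes, pms->GetSize());
        memcpy(pDst, pFrameData, cb);
        pms->SetActualDataLength(cb);        // "the number of bytes you wrote"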
     
    >One other question I forgot to ask: how would I set options that show
    >up in the program using the filter (like you said, it's possible to
    >change resolution, but what if I wanted to do that, or move the Kinect
    >camera up and down, which would be simple tasks in the program)?
     
    You have to have documentation on the device to do that.  That is a custom
    feature provided by a custom interface.  You can't discover what it is
    without documentation.
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Tuesday, March 20, 2012 3:51 AM
  • KinectSensor.CameraElevationAngle is, I think, the property to set the camera elevation in C++ and managed code (VB.NET and C#), but I'm wondering how to expose this property through the filter, or at least pass values from the capture program to the filter to set it, like a camera properties dialog box for setting camera options. Or is this question moving more towards driver territory?



    Tuesday, March 20, 2012 7:15 PM
  • The Thinker wrote:
    >
    >KinectSensor.CameraElevationAngle is, I think, the property to set the
    >camera elevation in C++ and managed code (VB.NET and C#), but I'm
    >wondering how to expose this property through the filter, or at least
    >pass values from the capture program to the filter to set it, like a
    >camera properties dialog box for setting camera options, or is this
    >question moving more towards driver territory.
     
    I don't know what KinectSensor.CameraElevationAngle means.  That must be a
    property on an interface somewhere.
     
    Your DirectShow filter exposes properties to its client application by
    defining its own COM interface and implementing it.  You can add whatever
    interfaces you want.
     
    Adding property pages is more complicated, but there are examples in the
    DirectShow section of the Windows SDK that show how to do that.
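
    A hypothetical custom interface along those lines (the name, IID, and
    methods are made up for illustration; the Kinect SDK call it would
    forward to, NuiCameraElevationSetAngle, is real):

        // {B5C0F6D1-3A52-4E8A-9F5D-0D2C1B7E4A10} -- example IID only
        DEFINE_GUID(IID_IKinectCamControl,
            0xb5c0f6d1, 0x3a52, 0x4e8a,
            0x9f, 0x5d, 0x0d, 0x2c, 0x1b, 0x7e, 0x4a, 0x10);

        DECLARE_INTERFACE_(IKinectCamControl, IUnknown)
        {
            STDMETHOD(get_ElevationAngle)(THIS_ LONG *pDegrees) PURE;
            STDMETHOD(put_ElevationAngle)(THIS_ LONG degrees) PURE;
        };

    The filter would implement this alongside its DirectShow interfaces and
    return it from QueryInterface; the client calls put_ElevationAngle, which
    the filter forwards to the Kinect SDK.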
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Thursday, March 22, 2012 3:54 AM
  • KinectSensor is a reference for controlling, accessing, and receiving data from the Kinect sensor. CameraElevationAngle sets the Kinect's elevation angle; mostly I just want to set the elevation angle using this property, and the resolution, since the Kinect supports different resolutions.

    P.S.

    Your DirectShow filter exposes properties to its client application by
    defining its own COM interface and implementing it.  You can add whatever
    interfaces you want.

    Does this mean I could use the checkbox in managed applications to make it appear available to COM apps, like the picture below, and it will appear inside my app too?

    Or is this more for DLL files?




    • Edited by The Thinker Thursday, March 22, 2012 7:17 PM
    Thursday, March 22, 2012 7:16 PM
  • Tim wrote:
    >>P.S. Your DirectShow filter exposes properties to its client application by
    >>defining its own COM interface and implementing it.  You can add whatever
    >>interfaces you want.
     
    The Thinker wrote:
     
    >Does this mean I could use the checkbox in managed applications to make
    >it appear available to COM apps, like the picture below, and it will
    >appear inside my app too?
     
    Well, it's an odd question.  You would be writing a DirectShow filter -- a
    DLL that is loaded into a DirectShow graph and hooked up to other
    DirectShow filters.  The COM services you offer would only be consumed by a
    DirectShow graph.  You COULD load your filter into other applications, but
    they wouldn't have the infrastructure to do anything with it.
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Saturday, March 24, 2012 4:19 AM
  • Well, thanks, it was worth asking. How could I expose the CameraElevationAngle property to my end-user application by way of a dialog box? I just need to allow the user to see my Kinect camera in the app and then, when necessary, set properties from the Kinect SDK libraries from a dialog box. For instance, in the properties for the Kinect camera, I'd have a slider to change the CameraElevationAngle property, which I set for my camera.

    What's the simplest way to expose this from the filter to the end-user app? Other than that, you've explained everything else, so I'll leave it there if you can easily answer this without going through tons of code.

    BTW, thanks for all your help lately. I appreciate it!  ;)



    Sunday, March 25, 2012 11:50 PM
  • Never mind, I got the link correct, but still no video. I'm beginning to think something wasn't set correctly in his Kinect source section, but he could have left out a piece in the other section. His filter worked on and off in Skype, so I'm guessing he wasn't doing something properly.

    Could you take a look at his code in the link above, Tim? It looks correct, or at least the filter code does. I might be mistaken, and in the Kinect code you just initialize the device, create it with the properties you want, and feed video data to the filter. If it times you out, I could post his source onto CodePlex or Google Code so you could download it and see what's wrong. I think it would make a great addition to the Kinect SDK when it gets done, and everyone will use it.




    Tuesday, April 3, 2012 5:25 PM
  • BTW, Tim, I was talking about property pages above when I mentioned setting the camera angle for the Kinect; that was just the Kinect SDK's property for setting the camera's angle. I think I could probably just take video code from the Kinect SDK and transfer the video data to the filter, but do you have an example of how I can pass data to an IMediaSample correctly from a separate .cpp file? I'm just wondering what I might need to do, or do I simply call class.method(imediasampledata here)?




    • Edited by The Thinker Thursday, April 19, 2012 2:58 PM
    Thursday, April 19, 2012 2:53 PM
  • The Thinker wrote:
    >
    >...but do you have an example of how I can pass data to an IMediaSample
    >correctly from a separate .cpp file? I'm just wondering what I might
    >need to do, or do I simply call class.method(imediasampledata here).
     
    I don't understand the question.  You don't "pass data to" an IMediaSample.
    An IMediaSample is just a container that holds a pointer to data, a size,
    and a couple of timestamps.  There are methods to set all of those values.
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Sunday, April 22, 2012 2:39 AM
  • Here's the IMediaSample code I was talking about:

    HRESULT CVCamStream::FillBuffer(IMediaSample *pms)
    {
        REFERENCE_TIME rtNow;

        // Timestamp the sample from the negotiated average frame rate.
        REFERENCE_TIME avgFrameTime = ((VIDEOINFOHEADER*)m_mt.pbFormat)->AvgTimePerFrame;
        rtNow = m_rtLastTime;
        m_rtLastTime += avgFrameTime;
        pms->SetTime(&rtNow, &m_rtLastTime);
        pms->SetSyncPoint(TRUE);

        BYTE *pData;
        long lDataLen;
        pms->GetPointer(&pData);
        lDataLen = pms->GetSize();
        if (m_pParent->m_kinected)
        {
            // Grab one RGB32 frame from the Kinect, then copy it into the
            // sample, mirroring it and dropping the alpha byte
            // (RGB32 -> RGB24). Note the hard-coded 640x480.
            m_pParent->m_kinectCam.Nui_GetCamFrame(m_pParent->m_pBuffer, m_pParent->m_pBufferSize);
            int srcPos = 0;
            int destPos = 0;
            for (int y = 0; y < 480; y++)
            {
                for (int x = 0; x < 640; x++)
                {
                    if (destPos < lDataLen - 3)
                    {
                        if (g_flipImage)
                            srcPos = (x * 4) + ((479 - y) * 640 * 4);
                        else
                            srcPos = ((639 - x) * 4) + ((479 - y) * 640 * 4);
                        pData[destPos++] = m_pParent->m_pBuffer[srcPos];
                        pData[destPos++] = m_pParent->m_pBuffer[srcPos + 1];
                        pData[destPos++] = m_pParent->m_pBuffer[srcPos + 2];
                    }
                }
            }
        }
        else
        {
            // No Kinect available: fill the frame with random noise.
            for (int i = 0; i < lDataLen; ++i)
                pData[i] = rand();
        }
        return NOERROR;
    } // FillBuffer

    As you can see, this line gets one frame at a time from the Kinect stream, which was opened by a routine in another .cpp file:

    m_pParent->m_kinectCam.Nui_GetCamFrame(m_pParent->m_pBuffer, m_pParent->m_pBufferSize);

    Here's the code of that routine:

    void KinectCam::Nui_GetCamFrame(BYTE *frameBuffer, int frameSize)
    {
        const NUI_IMAGE_FRAME *pImageFrame = NULL;

        // Block until the runtime signals that a frame is ready.
        WaitForSingleObject(m_hNextVideoFrameEvent, INFINITE);
        HRESULT hr = NuiImageStreamGetNextFrame(
            m_pVideoStreamHandle,
            0,
            &pImageFrame);
        if (FAILED(hr))
        {
            return;
        }

        // Lock the frame texture and copy the pixels out.
        INuiFrameTexture *pTexture = pImageFrame->pFrameTexture;
        NUI_LOCKED_RECT LockedRect;
        pTexture->LockRect(0, &LockedRect, NULL, 0);
        if (LockedRect.Pitch != 0)
        {
            BYTE *pBuffer = (BYTE*)LockedRect.pBits;
            memcpy(frameBuffer, pBuffer, frameSize);
        }
        NuiImageStreamReleaseFrame(m_pVideoStreamHandle, pImageFrame);
    }

    Okay, my question is: is data being passed correctly to the filter, or is something amiss in the code above?

    If that's not it, I guess I will start another thread and post all of the code from all the .cpp files.

    If you're ever in KY, I will have to pay for lunch, because you've been very helpful ;)



    Tuesday, April 24, 2012 5:10 PM
  • The Thinker wrote:
    >
    >Here's the IMediaSample code I was talking about:
    >...
    >As you can see, this line gets one frame at a time from the Kinect
    >stream, which was opened by a routine in another .cpp file:
     
    Yes.
     
    >Okay, my question is: is data being passed correctly to the filter, or
    >is something amiss in the code above?
     
    It looks reasonable enough, although it's not very efficient.  It assumes
    the width is always 640 and the height is always 480, it assumes the
    output pin will always be RGB24, and it assumes the Kinect will always
    deliver RGB32.
     
    >If that's not it, I guess I will start another thread and post all of
    >the code from all the .cpp files.
     
    Why, what DO you see?  Have you stepped into this code with a debugger to
    prove that you are getting good images from the Kinect API?
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Thursday, April 26, 2012 3:58 AM
  • The problem is: in AMCap it works just fine, but Expression Encoder and Skype don't like the filter at all. Do they require a special setting, or do I need more stream settings in code to get them working? I found out that the author of the filter is using the DirectShow VCam filter sample with some minor modifications. I have checked the settings for the filter but can't figure out why it only works in AMCap.





    • Edited by The Thinker Thursday, April 26, 2012 12:10 PM
    Thursday, April 26, 2012 12:01 PM
  • The Thinker wrote:
    >
    >The problem is: in AMCap it works just fine, but Expression Encoder and
    >Skype don't like the filter at all.
     
    Does that mean they don't have it in their list, or that you can choose it
    but there's no video?
    --
    Tim Roberts, timr@probo.com
    Providenza & Boekelheide, Inc.
     

    Saturday, April 28, 2012 4:35 AM
  • I can choose it, but there's no video. In Skype it acts like it's loading, but no video appears. In Expression Encoder it acts like it's going to crash, because it eventually displays a "not responding" message. I could just get a Kinect for Windows to be sure, but I think it's a configuration error somewhere in the code.


    Saturday, April 28, 2012 12:15 PM
  • Woohoo!! I think I figured it out!!!!

    I was having the same problem as you: it worked in GraphEdit and AMCap, but it bombed in Google+/Skype/etc.  Probably because I had installed the Kinect SDK v1.5 instead of that old "Beta 2" version.

    I noticed that the VideoInfoHeader didn't have the AvgTimePerFrame or dwBitRate fields set.  As soon as I set those - shazam!
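
    For reference, those two fields live in the VIDEOINFOHEADER the filter builds for its media type; a sketch of setting them, assuming the 640x480 RGB24 at 30 fps this build hard-codes:

        VIDEOINFOHEADER *pvi = (VIDEOINFOHEADER *)pmt->Format();
        pvi->AvgTimePerFrame = 10000000 / 30;        // REFERENCE_TIME, 100-ns units
        pvi->dwBitRate       = 640 * 480 * 24 * 30;  // uncompressed bits per second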

    I compiled with Kinect SDK version 1.5, VS 2010, and Windows SDK 7.1.

    I realize that this thread is a few months old, but if you're around, give it a try and let me know if this "OrangeMod" version works for you too. 

    http://tachyon.zapto.org/foo/OrangeMod/

    --luke

    • Marked as answer by The Thinker Monday, September 3, 2012 8:54 PM
    Monday, September 3, 2012 8:39 AM
  • Thanks, I will try that out. Question: will this work with Expression Encoder too? If it does, I think you deserve a cookie. You will make many independent film makers and video bloggers happy if you do!  Edit: never mind, it partially works in Expression Encoder: it runs for a few seconds and then freezes up, unlike in Skype. It's so weird.




    • Edited by The Thinker Monday, September 3, 2012 8:55 PM
    Monday, September 3, 2012 8:45 PM
  • I didn't even know what Expression Encoder was until I googled it 20 minutes ago. I downloaded the trial version of Expression Encoder 4, and it did the same thing you described: froze after a couple of seconds.

    I was able to narrow it down to the line of code that is causing it:

    m_pParent->m_kinectCam.Nui_GetCamFrame(m_pParent->m_pBuffer, m_pParent->m_pBufferSize);

    in CVCamStream::FillBuffer in Vcam.cpp.

    I'm guessing it's a deadlock thing having to do with timing. Maybe EE4 is requesting frames at a different rate or something? The Nui_GetCamFrame() function is part of the Kinect SDK, so even if I could find a way to debug the filter while using EE4, I wouldn't be able to step into that function.

    I haven't tried the filter with Skype, but Google+ works, so it must be doing something differently. I'll play around with it a little, but I don't really have any good ideas on where to start.

    Monday, September 3, 2012 11:40 PM
  • I take that back: Nui_GetCamFrame() is part of KinectCam.cpp, which isn't part of the Kinect SDK. It contains a WaitForSingleObject() call; I bet that's what's blocking it. I'll see if I can figure out why it's locking...
    Monday, September 3, 2012 11:44 PM
  • Well, I threw two hours at it and didn't really get anywhere. If anyone wants to pick up where I left off, I'll write up what I found so far. My schedule is busy, so I may not have time to look at it again for a while; these notes might later prove useful to me as well.

    In KinectCam::Nui_GetCamFrame() there was no UnlockRect() to match LockRect() for the texture. That wasn't causing the freezing, but it seems like it's a good idea to have it there.

    The fact that the first few frames are good tells me that it's not a colorspace or media negotiation problem. I pondered the idea that maybe some of the problem is because this filter is based on a push source, which according to Microsoft is different from a live source. Without further study, I don't know enough about the two paradigms to make a judgment.

    I disabled the event (which doesn't seem to matter, because it still works perfectly fine in GraphEdit and Google+ without the event), and then set the timeout for NuiImageStreamGetNextFrame to 33 milliseconds. I then put in some code to log most of the Kinect SDK functions and ran it for 30 seconds with Google+, GraphStudio, and EE4.

    The logs are here: http://tachyon.zapto.org/foo/OrangeMod/

    I'm very happy that I made that log, because now my clueless factor has doubled.

    If you see something I don't, email me: lclemens@gmail.com

    Tuesday, September 4, 2012 2:30 AM
  • What do you mean by "set the dwBitRate" field? Did you have to modify the Kinect SDK source and recompile it or something? What is KinectCam.ax? Did you have to wrap the Kinect in your own DirectShow capture source wrapper, or was there already one in the SDK that you tweaked?

    Thanks.

    Related is this thread: http://social.msdn.microsoft.com/Forums/en-US/kinectsdk/thread/4ee6e7ca-123d-4838-82b6-e5816bf6529c


    • Edited by rogerdpack2 Wednesday, October 17, 2012 5:25 PM
    Wednesday, October 17, 2012 5:23 PM
  • This guy named Scott Orange wrote a DirectShow video capture source filter that he calls KinectCam.ax; his thread is located at the URL you added in your post. He has the source code somewhere on there, and that's what I was compiling, except I was using the latest Microsoft Kinect SDK instead of the beta version that he used.  Setting the dwBitRate field made it work for me with Google Hangout; I use it all the time now.  I wasn't able to get it to work with Expression Encoder 4 though, which is what "The Thinker" is trying to do.

    You can get my version here: http://tachyon.zapto.org/foo/OrangeMod/

    --luke

    Wednesday, October 17, 2012 6:38 PM
  • Skywalker, it appears the newer v1.6 has better camera features.


    Thursday, October 18, 2012 1:03 PM
  • Hi.  I have Scott's source.  Would you post yours as well?
    Friday, October 19, 2012 4:31 PM
  • I have made only minor changes at this point; Illskywalker has made more progress than me on getting something working.



    Saturday, October 20, 2012 4:41 PM
  • As for my source code....

    The only thing I have working is that it now works under Google Hangout & Skype with the 1.5 SDK.  I was trying unsuccessfully to get it to work with Expression Encoder 4.  I'll give you the source that I have, but I should warn you that it currently spits out a log file of all the Kinect SDK functions called - I was using that to try and track down why Expression Encoder 4 isn't working. Also, I hard-coded everything to 640x480.

    I zipped my source here: http://tachyon.zapto.org/foo/OrangeMod/

    --luke

    Tuesday, October 23, 2012 5:59 PM
  • I think Expression Encoder must require some special settings for its filters or something, because other cameras run fine compared to the filters I've tried for the Kinect. I would check the Expression Encoder SDK docs, Illskywalker, for different ways to program Expression Encoder.




    • Edited by The Thinker Tuesday, October 23, 2012 7:22 PM
    Tuesday, October 23, 2012 7:20 PM
  • I am a Microsoft employee working on my own Kinect DirectShow capture filter, and I have run across it working poorly in Expression Encoder too. I'm on decent terms with the guys who wrote EE, and am now working with them to see why it is or isn't working. Obviously, I can't go and read other people's source code for a Kinect capture filter, but I can certainly share my own thoughts about what should or should not be done in somebody else's code, if you ask the right questions. So: ask away here in the forum and I'll try to answer. By the way, if the filter is hanging, there's a very good chance that a WaitForXXX call is deadlocking your FillBuffer. You need to make sure to quit out of any wait when the filter's Stop() is called, no matter what.

    By the way, Expression Encoder has a weird quirk where it will call Run() and Stop() on your filter several times in rapid succession before finally calling Run(). This is certainly odd and more than likely will never be fixed; I don't know the reasoning behind it. But if your filter doesn't handle this gracefully, it will blow up EE or crash it.

    Another gotcha is that the latest Kinect SDK takes a LONG time to return from NuiShutdown. It may look like you are hung, but really it's just the kernel taking a long time to return to user mode.
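
    A sketch of one way to make the wait cancellable, per the advice above (m_hFrameEvent is the Kinect "frame ready" event; m_hStopEvent is a hypothetical manual-reset event the filter creates and signals from its Stop() override):

        // Inside FillBuffer, instead of WaitForSingleObject(frameEvent, INFINITE):
        HANDLE waits[2] = { m_hFrameEvent, m_hStopEvent };
        DWORD  which = WaitForMultipleObjects(2, waits, FALSE, 33 /* ~1 frame at 30 fps */);

        if (which == WAIT_OBJECT_0 + 1)   // Stop() signaled the event
            return S_FALSE;               // S_FALSE from FillBuffer ends the stream cleanly
        // On WAIT_TIMEOUT, repeat the previous frame rather than blocking forever.

        // And in the filter's Stop() path:
        //     SetEvent(m_hStopEvent);    // wake any FillBuffer stuck waiting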


    ~ Eric R., Senior Dev, MS Research. You can probably figure out more about me if you try!

    Wednesday, April 24, 2013 6:28 AM