IMFSourceReader: is it really possible to accurately seek and read samples from an MP3 source?

  • Question

  • Decoding an MP3 to PCM from position 0 in one go obviously works without problems. However, when I seek to a given position to read some samples, the seek is never accurate, no matter what I do, and consequently the output is incorrect.

    (My test is quite simple: I read blocks of 44100 samples until EOF, seeking before each read; then I compare the output with the original file using an audio editor and a phase scope. The first block is always correct, but over time the output drifts by nearly 20000 samples, which is clearly audible as a pseudo-delay effect.)

    According to the docs, you have to account for inexact seeking, and that's what I did. For instance, after seeking to some position, the exact position should start at offset 576 in the returned buffer. But when I check with an audio editor, the returned data is 252 samples later ...

    I've also checked the formulas that convert between REFERENCE_TIME and samples, but they don't seem to be the problem:

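    // REFERENCE_TIME counts 100-nanosecond units, i.e. 10,000,000 per second.
    int64 ReferenceTimeToSamples(REFERENCE_TIME time)
    {
    	// Multiply before dividing to limit floating-point rounding error.
    	return (int64)round((double)time * GetSampleRate() / 10000000.0);
    }

    REFERENCE_TIME SamplesToReferenceTimeD(double samples, double sampleRate)
    {
    	return (REFERENCE_TIME)round(samples * 10000000.0 / sampleRate);
    }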

    Question:

    Is exact sample retrieval possible when seeking randomly within an MP3 file? If so, what is the approach?

    Thank you!


    • Edited by Aybe One Friday, March 6, 2015 2:26 PM
    Friday, March 6, 2015 2:22 PM

Answers

  • What you are experiencing is expected, because the audio data of an MP3 (MPEG Layer III) frame can be spread over several physical frames. For a given bitrate and sampling rate you can calculate the frame length: it is either ‘n’ or ‘n+1’ bytes, depending on the padding bit. But that doesn’t mean the compressed data for a frame fits in exactly frame_size bytes; sometimes it needs more and sometimes less. When a frame needs less, the unused space stays available to the frames that follow, so a frame that needs more can have its data start in one or more earlier frames. You can find more about this if you search for “bit reservoir”.
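
    As a rough illustration of the frame-length calculation mentioned above, here is a minimal sketch. It assumes MPEG-1 Layer III (1152 samples per frame), for which the length in bytes is 144 * bitrate / sampling_rate, rounded down, plus one byte when the padding bit is set. `Mp3FrameLengthBytes` is a hypothetical helper name, not a Media Foundation function, and MPEG-2/2.5 streams would need a different constant.

```cpp
#include <cstdint>

// Hypothetical helper (not part of any library): byte length of one
// MPEG-1 Layer III frame. Assumes 1152 samples per frame, which gives
// 1152 / 8 = 144 bytes per (bitrate / sampleRate) unit.
std::uint32_t Mp3FrameLengthBytes(std::uint32_t bitrateBps,
                                  std::uint32_t sampleRateHz,
                                  bool padding)
{
    // Integer division rounds down; the padding bit adds the extra byte
    // that gives the 'n' versus 'n+1' lengths.
    return 144u * bitrateBps / sampleRateHz + (padding ? 1u : 0u);
}
```

    For example, at 128 kbit/s and 44100 Hz this yields 417 bytes without padding and 418 bytes with it. These are only the physical frame sizes; because of the bit reservoir, the data needed to decode a frame may begin before the frame itself.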

    So, when you seek, you end up decoding a frame that is missing some of the data it needs to decode properly, because that data is located in one or more previous frames.
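
    One common way to cope with this is to seek somewhat before the wanted position, decode from there, and discard samples until the target is reached. The sketch below only does the arithmetic for such a pre-roll and is not a complete solution. All names (`SeekPlan`, `PlanPreRollSeek`, `SamplesToSkip`) are hypothetical, the 1152 samples per frame assume MPEG-1 Layer III, and the number of pre-roll frames is an assumption to tune. It also assumes that the timestamp the reader reports for the first buffer after the seek is trustworthy, which should be verified, and it does not account for encoder or decoder delay.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical result type: where to seek, in 100-ns units.
struct SeekPlan {
    std::int64_t startSample;    // first sample of the pre-roll region
    std::int64_t seekTime100ns;  // position to request from the reader
};

// Pick a seek position a few whole frames before the target so the decoder
// has seen the earlier frames that a bit-reservoir frame may depend on.
SeekPlan PlanPreRollSeek(std::int64_t targetSample,
                         std::int64_t sampleRate,
                         std::int64_t preRollFrames)
{
    const std::int64_t kSamplesPerFrame = 1152;  // MPEG-1 Layer III assumption
    const std::int64_t targetFrame = targetSample / kSamplesPerFrame;
    const std::int64_t startFrame =
        std::max<std::int64_t>(0, targetFrame - preRollFrames);
    const std::int64_t startSample = startFrame * kSamplesPerFrame;
    // Integer division rounds the time down, so the request never lands
    // after the intended start sample.
    return { startSample, startSample * 10000000 / sampleRate };
}

// Given the timestamp actually reported for the first decoded buffer after
// the seek, compute how many decoded samples to drop before the target.
std::int64_t SamplesToSkip(std::int64_t targetSample,
                           std::int64_t reportedTime100ns,
                           std::int64_t sampleRate)
{
    // Round to the nearest sample when converting the timestamp back.
    const std::int64_t firstSample =
        (reportedTime100ns * sampleRate + 5000000) / 10000000;
    return std::max<std::int64_t>(0, targetSample - firstSample);
}
```

    With IMFSourceReader, the planned time would be passed to SetCurrentPosition, and the timestamp returned by the following ReadSample call would be fed to `SamplesToSkip`, since the reader does not promise to land exactly on the requested position.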

    • Marked as answer by Aybe One Thursday, March 19, 2015 10:52 AM
    Wednesday, March 18, 2015 11:11 PM

All replies

  • Thanks for the explanations!
    Thursday, March 19, 2015 10:52 AM