Writing generated audio for a video to AVAssetWriterInput, audio stutters

I am generating video from a Unity app on iOS. I am using iVidCap, which uses AVFoundation to do this. That side of things is all working fine. Essentially the video is rendered by using a texture render target and passing the frames to an Obj-C plugin.

Now I need to add audio to the video. The audio is going to be sound effects that occur at specific times, and maybe some background sound. The files being used are actually assets internal to the Unity app. I could probably write these out to phone storage and then generate an AVComposition, but my plan is to avoid that and composite the audio into float-format buffers (getting audio from AudioClips gives float format). I might be doing some runtime audio effects later on.
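As a rough sketch of that idea, mixing a mono sound effect into an interleaved stereo float mix buffer at a given sample offset could look like the snippet below. This is illustration only; the helper name, gain handling, and clamping are placeholders rather than code from the actual plugin.

#include <stddef.h>
#include <math.h>

// Hypothetical helper (not from the actual plugin): mix a mono sound effect
// into an interleaved stereo float mix buffer at a given start frame, with
// simple clamping to [-1, 1].
static void MixEffectIntoStereoBuffer(float* mix, size_t mixFrames,
                                      const float* effect, size_t effectFrames,
                                      size_t startFrame, float gain)
{
    for (size_t i = 0; i < effectFrames && (startFrame + i) < mixFrames; ++i) {
        float s = effect[i] * gain;
        size_t base = (startFrame + i) * 2;                               // interleaved stereo
        mix[base]     = fmaxf(-1.0f, fminf(1.0f, mix[base]     + s));     // left
        mix[base + 1] = fmaxf(-1.0f, fminf(1.0f, mix[base + 1] + s));     // right
    }
}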

After a few hours I managed to get audio recorded and playing back with the video... but it stutters.

Currently I am just generating a square wave for the duration of each video frame and writing it to an AVAssetWriterInput. Later I will generate the audio I actually want. If I generate one massive sample, I don't get the stuttering. If I write it in blocks (which I would much prefer to allocating one huge array), then the blocks of audio seem to clip each other:

[Image: "Glitch"]

I can't seem to work out why this is happening. I am pretty sure I am getting the timestamps on the audio buffers correct, but maybe I'm doing this whole part incorrectly. Or do I need some flags to get the video to sync to the audio? I can't see that this is the problem, since I can see the problem in a wave editor after extracting the audio data to a wav file.
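Roughly, the per-frame test tone is produced along these lines (a sketch only; the 440 Hz frequency and 0.25 amplitude are placeholder values, and the actual generator is not shown here):

// Sketch of the per-frame test tone (illustrative values only). One video
// frame's worth of audio is n = 44100 / frameRate interleaved stereo sample
// frames, which is then handed to the writeAudioBuffer: method shown below.
- (void) writeTestToneForNextFrame
{
    size_t n = (size_t)(44100 / frameRate);
    float* samples = (float*)malloc(n * 2 * sizeof(float));
    for (size_t i = 0; i < n; ++i) {
        float phase = fmodf((float)(sample_position_ + i) * (440.0f / 44100.0f), 1.0f);
        float s = (phase < 0.5f) ? 0.25f : -0.25f;    // square wave
        samples[2 * i]     = s;                       // left
        samples[2 * i + 1] = s;                       // right
    }
    [self writeAudioBuffer:samples sampleCount:n channelCount:2];
    // Note: per the SOLUTION below, CMBlockBufferCreateWithMemoryBlock does not
    // copy this data, so freeing (or reusing) the buffer here is part of the problem.
    free(samples);
}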

The relevant code for writing the audio:

- (id)init
{
    self = [super init];

    if (self) {

        // [snip]

        rateDenominator = 44100;
        rateMultiplier = rateDenominator / frameRate;

        sample_position_ = 0;
        audio_fmt_desc_ = nil;
        int nchannels = 2;
        AudioStreamBasicDescription audioFormat;
        bzero(&audioFormat, sizeof(audioFormat));
        audioFormat.mSampleRate = 44100;
        audioFormat.mFormatID   = kAudioFormatLinearPCM;
        audioFormat.mFramesPerPacket = 1;
        audioFormat.mChannelsPerFrame = nchannels;
        int bytes_per_sample = sizeof(float);
        audioFormat.mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsAlignedHigh;
        audioFormat.mBitsPerChannel = bytes_per_sample * 8;
        audioFormat.mBytesPerPacket = bytes_per_sample * nchannels;
        audioFormat.mBytesPerFrame = bytes_per_sample * nchannels;

        CMAudioFormatDescriptionCreate(kCFAllocatorDefault,
                                       &audioFormat,
                                       0,
                                       NULL,
                                       0,
                                       NULL,
                                       NULL,
                                       &audio_fmt_desc_
                                       );
    }

    return self;
}

-(BOOL) beginRecordingSession {

    NSError* error = nil;

    isAborted = false;
    abortCode = No_Abort;

    // Allocate the video writer object.  
    videoWriter = [[AVAssetWriter alloc] initWithURL:[self getVideoFileURLAndRemoveExisting:
        recordingPath] fileType:AVFileTypeMPEG4 error:&error];

    if (error) {
        NSLog(@"Start recording error: %@", error);
    }

    //Configure video compression settings.
    NSDictionary* videoCompressionProps = [NSDictionary dictionaryWithObjectsAndKeys:
                                           [NSNumber numberWithDouble:1024.0 * 1024.0], AVVideoAverageBitRateKey,
                                           [NSNumber numberWithInt:10],AVVideoMaxKeyFrameIntervalKey,
                                            nil ];

    //Configure video settings.
    NSDictionary* videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                   AVVideoCodecH264, AVVideoCodecKey,
                                   [NSNumber numberWithInt:frameSize.width], AVVideoWidthKey,
                                   [NSNumber numberWithInt:frameSize.height], AVVideoHeightKey,
                                   videoCompressionProps, AVVideoCompressionPropertiesKey,
                                   nil];

    // Create the video writer that is used to append video frames to the output video
    // stream being written by videoWriter.
    videoWriterInput = [[AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:videoSettings] retain];
    //NSParameterAssert(videoWriterInput);
    videoWriterInput.expectsMediaDataInRealTime = YES;

    // Configure settings for the pixel buffer adaptor.
    NSDictionary* bufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
                                      [NSNumber numberWithInt:kCVPixelFormatType_32ARGB], kCVPixelBufferPixelFormatTypeKey, nil];

    // Create the pixel buffer adaptor, used to convert the incoming video frames and 
    // append them to videoWriterInput.
    avAdaptor = [[AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:videoWriterInput sourcePixelBufferAttributes:bufferAttributes] retain];

    [videoWriter addInput:videoWriterInput];

    // <pb> Added audio input.
    sample_position_ = 0;
    AudioChannelLayout acl;
    bzero( &acl, sizeof(acl));
    acl.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;


    NSDictionary* audioOutputSettings = nil;          

        audioOutputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                               [ NSNumber numberWithInt: kAudioFormatMPEG4AAC ], AVFormatIDKey,
                               [ NSNumber numberWithInt: 2 ], AVNumberOfChannelsKey,
                               [ NSNumber numberWithFloat: 44100.0 ], AVSampleRateKey,
                               [ NSNumber numberWithInt: 64000 ], AVEncoderBitRateKey,
                               [ NSData dataWithBytes: &acl length: sizeof( acl ) ], AVChannelLayoutKey,
                               nil];

    audioWriterInput = [[AVAssetWriterInput 
                          assetWriterInputWithMediaType: AVMediaTypeAudio 
                          outputSettings: audioOutputSettings ] retain];

    //audioWriterInput.expectsMediaDataInRealTime = YES;
    audioWriterInput.expectsMediaDataInRealTime = NO; // seems to work slightly better

    [videoWriter addInput:audioWriterInput];

    rateDenominator = 44100;
    rateMultiplier = rateDenominator / frameRate;       

    // Add our video input stream source to the video writer and start it.
    [videoWriter startWriting];
    [videoWriter startSessionAtSourceTime:CMTimeMake(0, rateDenominator)];

    isRecording = true;
    return YES;
}    

- (int) writeAudioBuffer: (float*) samples sampleCount: (size_t) n channelCount: (size_t) nchans
{
    if ( ![self waitForAudioWriterReadiness]) {
        NSLog(@"WARNING: writeAudioBuffer dropped frame after wait limit reached.");
        return 0;
    }

    //NSLog(@"writeAudioBuffer");
    OSStatus status;
    CMBlockBufferRef bbuf = NULL;
    CMSampleBufferRef sbuf = NULL;

    size_t buflen = n * nchans * sizeof(float);
    // Create sample buffer for adding to the audio input.
    status = CMBlockBufferCreateWithMemoryBlock(
                                                kCFAllocatorDefault,
                                                samples,
                                                buflen,
                                                kCFAllocatorNull,
                                                NULL,
                                                0,
                                                buflen,
                                                0,
                                                &bbuf);

    if (status != noErr) {
        NSLog(@"CMBlockBufferCreateWithMemoryBlock error");
        return -1;
    }

    CMTime timestamp = CMTimeMake(sample_position_, 44100);
    sample_position_ += n;

    status = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, bbuf, TRUE, 0, NULL, audio_fmt_desc_, 1, timestamp, NULL, &sbuf);
    if (status != noErr) {
        NSLog(@"CMSampleBufferCreate error");
        return -1;
    }
    BOOL r = [audioWriterInput appendSampleBuffer:sbuf];
    if (!r) {
        NSLog(@"appendSampleBuffer error");
    }
    CFRelease(bbuf);
    CFRelease(sbuf);

    return 0;
}
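The waitForAudioWriterReadiness helper is not shown above; it is assumed here to simply poll the input's readiness with a bounded wait, something like the sketch below (an assumption, not the actual helper):

// Assumed implementation (the actual helper isn't shown in the question):
// poll the writer input's readiness with a bounded wait before giving up.
- (BOOL) waitForAudioWriterReadiness
{
    const int kMaxTries = 100;
    int tries = 0;
    while (![audioWriterInput isReadyForMoreMediaData] && tries < kMaxTries) {
        [NSThread sleepForTimeInterval:0.01];   // wait 10 ms and re-check
        ++tries;
    }
    return [audioWriterInput isReadyForMoreMediaData];
}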

Any ideas about what is going on?

Should I be creating/appending the samples in a different way?

Is it something to do with the AAC compression? It doesn't work at all if I try to use uncompressed audio (it throws).

As far as I can tell, I am calculating the PTS correctly. Why is this even required for the audio channel? Shouldn't the video be synced to the audio clock?

UPDATE: I've tried supplying the audio in fixed blocks of 1024 samples, since that is the size of the DCT used by the AAC compressor. It makes no difference.
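That attempt amounted to splitting the generated samples into 1024-frame blocks before appending, roughly as in the sketch below, where samples and totalFrames stand in for the real variables:

// Sketch of the 1024-frames-per-append attempt (not the actual code);
// 'samples' and 'totalFrames' stand in for the generated interleaved
// stereo buffer and its length in sample frames.
const size_t kChunkFrames = 1024;
for (size_t offset = 0; offset < totalFrames; offset += kChunkFrames) {
    size_t count = MIN(kChunkFrames, totalFrames - offset);
    [self writeAudioBuffer:samples + offset * 2   // 2 floats per stereo frame
               sampleCount:count
              channelCount:2];
}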

I've tried pushing all of the blocks in one go before writing any video. Doesn't work.

I've tried using CMSampleBufferCreate for the remaining blocks and CMAudioSampleBufferCreateWithPacketDescriptions only for the first block. No change.

And I've tried combinations of these. Still not right.

SOLUTION:

It looks like:

audioWriterInput.expectsMediaDataInRealTime = YES;

is essential, otherwise it gets confused, perhaps because the video input was set up with this flag. In addition, CMBlockBufferCreateWithMemoryBlock does NOT copy the sample data, even if you pass it the kCMBlockBufferAlwaysCopyDataFlag flag.

So you can create a block buffer with that call and then copy it with CMBlockBufferCreateContiguous to make sure you end up with a block buffer that holds its own copy of the audio data. Otherwise it keeps referencing the memory you originally passed in, and things get messed up.
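Concretely, the change inside writeAudioBuffer: looks roughly like the sketch below. Passing kCMBlockBufferAlwaysCopyDataFlag to CMBlockBufferCreateContiguous is an extra precaution to make the copy explicit; the solution text above only calls for the contiguous copy itself.

// Sketch of the fix: reference the caller's memory first, then make a
// contiguous copy so the block buffer owns the audio data, and build the
// sample buffer from the copy instead.
CMBlockBufferRef tmpBuf = NULL;
CMBlockBufferRef bbuf = NULL;
OSStatus status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
                                                     samples, buflen,
                                                     kCFAllocatorNull, NULL,
                                                     0, buflen, 0, &tmpBuf);
if (status == noErr) {
    // kCFAllocatorDefault as the block allocator plus the always-copy flag
    // forces a real copy of the sample data.
    status = CMBlockBufferCreateContiguous(kCFAllocatorDefault, tmpBuf,
                                           kCFAllocatorDefault, NULL,
                                           0, buflen,
                                           kCMBlockBufferAlwaysCopyDataFlag,
                                           &bbuf);
    CFRelease(tmpBuf);
}
// bbuf (the copy) is then passed to CMAudioSampleBufferCreateWithPacketDescriptions
// exactly as before.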

2 Answers

  • 0

Looks OK, although I would use CMBlockBufferCreateWithMemoryBlock, because it copies the samples. Is your code OK with not knowing when audioWriterInput has finished with them?

Also, shouldn't kAudioFormatFlagIsAlignedHigh be kAudioFormatFlagIsPacked?

  • 2

CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, bbuf, TRUE, 0, NULL, audio_fmt_desc_, 1, timestamp, NULL, &sbuf); should be CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, bbuf, TRUE, 0, NULL, audio_fmt_desc_, n, timestamp, NULL, &sbuf); That did it for me.
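In other words, the numSamples argument (the seventh parameter) has to be the number of sample frames contained in the block buffer rather than 1:

// The 7th argument (numSamples) must be the number of sample frames in bbuf,
// i.e. n, not 1; with 1 the sample buffer presumably reports the wrong
// duration, which matches the overlapping-chunks symptom.
status = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault,
                                                         bbuf,
                                                         TRUE,
                                                         0,
                                                         NULL,
                                                         audio_fmt_desc_,
                                                         n,          // was 1
                                                         timestamp,
                                                         NULL,
                                                         &sbuf);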
