Code analysis of voice-changing processing for PCM audio in Java

2020-12-20 03:38:44
OfStack

In this project, PCM voice audio data needed to be voice-changed. After struggling for a week, I finally found a framework implemented in pure Java: TarsosDSP. It is very powerful and supports real-time audio processing! Of course, I only use it to process files. The logic is actually the same.

TarsosDSP GitHub address: https://github.com/JorenSix/TarsosDSP. Integrate it into your own project.

The Java utility method is as follows:


  import java.io.ByteArrayInputStream;
  import java.io.InputStream;
  import javax.sound.sampled.AudioInputStream;
  import javax.sound.sampled.AudioSystem;
  import be.tarsos.dsp.AudioDispatcher;
  import be.tarsos.dsp.WaveformSimilarityBasedOverlapAdd;
  import be.tarsos.dsp.io.TarsosDSPAudioFormat;
  import be.tarsos.dsp.io.jvm.JVMAudioInputStream;
  import be.tarsos.dsp.resample.RateTransposer;

  /**
   * Voice changing.
   * @param rawPcmInputStream the original PCM data input stream
   * @param speedFactor speed factor in (0,2): greater than 1 speeds the speech up, less than 1 slows it down
   * @param rateFactor pitch factor in (0,2): greater than 1 lowers the pitch (deeper), less than 1 raises it (sharper)
   * @return the voice-changed PCM data input stream
   */
  public static InputStream speechPitchShift(final InputStream rawPcmInputStream, double speedFactor, double rateFactor) {
    // 16 kHz, 16-bit, mono, signed, little-endian PCM
    TarsosDSPAudioFormat format = new TarsosDSPAudioFormat(16000, 16, 1, true, false);
    AudioInputStream inputStream = new AudioInputStream(rawPcmInputStream, JVMAudioInputStream.toAudioFormat(format), AudioSystem.NOT_SPECIFIED);
    JVMAudioInputStream stream = new JVMAudioInputStream(inputStream);
    // WSOLA changes the speed of the speech without changing its pitch
    WaveformSimilarityBasedOverlapAdd w = new WaveformSimilarityBasedOverlapAdd(WaveformSimilarityBasedOverlapAdd.Parameters.speechDefaults(speedFactor, 16000));
    int inputBufferSize = w.getInputBufferSize();
    int overlap = w.getOverlap();
    AudioDispatcher dispatcher = new AudioDispatcher(stream, inputBufferSize, overlap);
    w.setDispatcher(dispatcher);
    AudioOutputToByteArray out = new AudioOutputToByteArray();
    // Processors run in the order they are added: time stretch, resample, collect
    dispatcher.addAudioProcessor(w);
    dispatcher.addAudioProcessor(new RateTransposer(rateFactor));
    dispatcher.addAudioProcessor(out);
    // run() is called directly, so it blocks until the whole stream is processed
    dispatcher.run();
    return new ByteArrayInputStream(out.getData());
  }
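The two factors interact, because resampling changes the length of the audio as well as its pitch. Below is a rough sketch of the arithmetic, assuming that the WSOLA step shortens the audio by 1/speedFactor and that RateTransposer then stretches the result by rateFactor. That is my reading of how the two processors behave, not something taken from the TarsosDSP documentation. `FactorMath` and its methods are hypothetical helpers and are not part of TarsosDSP.

```java
/** Hypothetical helper that estimates the net effect of speedFactor and rateFactor. */
public final class FactorMath {
    private FactorMath() {}

    /** Assumed pitch multiplier: a rateFactor above 1 lowers the pitch. */
    public static double pitchScale(double rateFactor) {
        return 1.0 / rateFactor;
    }

    /** Assumed duration multiplier: WSOLA divides by speedFactor, resampling multiplies by rateFactor. */
    public static double durationScale(double speedFactor, double rateFactor) {
        return rateFactor / speedFactor;
    }

    /** Factor for a pitch shift given in cents (1200 cents = one octave up). */
    public static double factorForCents(double cents) {
        return 1.0 / Math.pow(2.0, cents / 1200.0);
    }

    public static void main(String[] args) {
        // Passing the same value for both factors should keep the duration unchanged.
        double f = factorForCents(1200);
        System.out.println("factor = " + f);
        System.out.println("pitch scale = " + pitchScale(f));
        System.out.println("duration scale = " + durationScale(f, f));
    }
}
```

Under these assumptions, passing the same value for speedFactor and rateFactor changes only the pitch, while leaving rateFactor at 1 changes only the speed.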

The data collector (AudioOutputToByteArray), which gathers the processed audio into a byte array, is as follows:


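import java.io.ByteArrayOutputStream;
import be.tarsos.dsp.AudioEvent;
import be.tarsos.dsp.AudioProcessor;

public class AudioOutputToByteArray implements AudioProcessor {
  // volatile so that a thread waiting in getData() sees the updates
  private volatile boolean isDone = false;
  private volatile byte[] out = null;
  private ByteArrayOutputStream bos;
  public AudioOutputToByteArray() {
    bos = new ByteArrayOutputStream();
  }
  // Waits until processing has finished, then returns the collected PCM bytes
  public byte[] getData() {
    while (!isDone && out == null) {
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and stop waiting (the result may be null)
        Thread.currentThread().interrupt();
        break;
      }
    }
    return out;
  }
  @Override
  public boolean process(AudioEvent audioEvent) {
    // getByteBuffer() converts the float buffer to bytes, so call it only once
    byte[] buffer = audioEvent.getByteBuffer();
    bos.write(buffer, 0, buffer.length);
    return true;
  }
  @Override
  public void processingFinished() {
    // toByteArray() already returns a fresh copy, so clone() is not needed
    out = bos.toByteArray();
    bos = null;
    isDone = true;
  }
}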
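The bytes returned by getData() are raw PCM with no header, so whoever consumes them has to know the format. Here is a minimal sketch for inspecting them, assuming the same format as in the code above (16 kHz, 16-bit, mono, signed, little-endian). `PcmInfo` is a hypothetical helper and is not part of TarsosDSP.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for raw 16 kHz, 16-bit, mono, little-endian PCM. */
public final class PcmInfo {
    public static final int SAMPLE_RATE = 16000;   // samples per second
    public static final int BYTES_PER_SAMPLE = 2;  // 16 bits, one channel

    private PcmInfo() {}

    /** Decodes little-endian byte pairs into signed 16-bit samples. */
    public static short[] toSamples(byte[] pcm) {
        short[] samples = new short[pcm.length / BYTES_PER_SAMPLE];
        ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(samples);
        return samples;
    }

    /** Playback length in seconds at the assumed sample rate. */
    public static double durationSeconds(byte[] pcm) {
        return (pcm.length / BYTES_PER_SAMPLE) / (double) SAMPLE_RATE;
    }
}
```

Comparing durationSeconds for the input and the output is a quick way to check that a given speedFactor had the intended effect.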

The audio can be played with this utility method:


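  import java.io.InputStream;
  import javax.sound.sampled.AudioInputStream;
  import javax.sound.sampled.AudioSystem;
  import javax.sound.sampled.LineUnavailableException;
  import be.tarsos.dsp.AudioDispatcher;
  import be.tarsos.dsp.io.TarsosDSPAudioFormat;
  import be.tarsos.dsp.io.jvm.AudioPlayer;
  import be.tarsos.dsp.io.jvm.JVMAudioInputStream;

  /**
   * Plays PCM audio.
   *
   * Do not call this in a non-desktop environment... Who knows what might happen.
   * @param rawPcmInputStream the original PCM data input stream
   * @throws LineUnavailableException if no audio output line is available
   */
  public static void play(final InputStream rawPcmInputStream) throws LineUnavailableException {
    // 16 kHz, 16-bit, mono, signed, little-endian PCM
    TarsosDSPAudioFormat format = new TarsosDSPAudioFormat(16000, 16, 1, true, false);
    AudioInputStream inputStream = new AudioInputStream(rawPcmInputStream, JVMAudioInputStream.toAudioFormat(format), AudioSystem.NOT_SPECIFIED);
    JVMAudioInputStream stream = new JVMAudioInputStream(inputStream);
    // Buffers of 1024 samples with no overlap
    AudioDispatcher dispatcher = new AudioDispatcher(stream, 1024, 0);
    dispatcher.addAudioProcessor(new AudioPlayer(format, 1024));
    // Blocks until the whole stream has been played
    dispatcher.run();
  }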