
audio/audio generation

(6)
[Paper Review] Efficient Neural Audio Synthesis (ICML18) Title: Efficient Neural Audio Synthesis Authors: Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu Affiliation: DeepMind, Google Brain Venue: ICML 2018 Paper: https://arxiv.org/abs/1802.08435 - WaveRNN - A paper that puts a great deal of thought into how to reduce sampling time in sequential models - 1) uses an RNN architecture, 2) writes custom GPU kernels, 3) weigh..
[Paper Review] SampleRNN: An Unconditional End-to-End Neural Audio Generation Model (ICLR17) Paper title: SampleRNN: An Unconditional End-to-End Neural Audio Generation Model Authors: Soroush Mehri, Kundan Kumar, Ishaan Gulrajani, Rithesh Kumar, Shubham Jain, Jose Sotelo, Aaron Courville, Yoshua Bengio Affiliation: University of Montreal, IIT Kanpur, SSNCE Venue: ICLR 2017 Paper: https://arxiv.org/abs/1612.07837 Code: https://github.com/soroushmehr/sampleRNN_ICLR2017 Audio samples: https://soundcloud.com/samplernn/sets - S..
[Paper Review] GANSynth: Adversarial Neural Audio Synthesis (ICLR19) Paper title: GANSynth: Adversarial Neural Audio Synthesis Authors: Jesse Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, Chris Donahue, Adam Roberts Affiliation: Google AI Venue: ICLR 2019 Paper: https://arxiv.org/abs/1902.08710 Code: https://github.com/magenta/magenta/tree/main/magenta/models/gansynth Audio samples: https://storage.googleapis.com/magentadata/papers/gansynth/index.html - Let's synthesize high-quality audio with a GAN. WaveNe..
[Paper Review] Adversarial Audio Synthesis (ICLR19) Paper title: Adversarial Audio Synthesis Authors: Chris Donahue, Julian McAuley, Miller Puckette Affiliation: UC San Diego Venue: ICLR19 Paper: https://arxiv.org/abs/1802.04208 Code: https://github.com/chrisdonahue/wavegan Sound samples: https://chrisdonahue.com/wavegan_examples/ - Why isn't anyone using GANs for audio generation? By 2018-19, when this paper came out, GANs had already been around for several years and had advanced considerably. So the paper tries generating waveform audio with a GAN. - Proposes two models, WaveGAN and SpecGAN. The name..
[Paper Review] Enabling Factorized Piano Music Modeling and Generation with the Maestro Dataset (ICLR19) Title: Enabling Factorized Piano Music Modeling and Generation with the Maestro Dataset Authors: Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, Douglas Eck Affiliation: Google Brain, DeepMind Venue: ICLR 2019 Paper: https://arxiv.org/abs/1810.12247 Blog: https://magenta.tensorflow.org/maestro-wave2midi2wave (maestro) Additional results: https://storage.goog..
[Paper Review] WaveNet: A Generative Model for Raw Audio (arxiv16) Paper title: WaveNet: A Generative Model for Raw Audio Authors: Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu Affiliation: Google DeepMind, Google Paper: https://arxiv.org/abs/1609.03499 Web page: https://www.deepmind.com/blog/wavenet-a-generative-model-for-raw-audio - After DeepMind's van den Oord built PixelRNN [Oord16a] and PixelCNN [Oord16b]..