Journals

  • Li Li, Hirokazu Kameoka, Shota Inoue, and Shoji Makino, "FastMVAE: A fast optimization algorithm for the multichannel variational autoencoder method," IEEE Access, vol. 8, pp. 228740-228753, Dec. 2020. (pdf) (code)
  • Li Li, Hirokazu Kameoka, and Shoji Makino, "Majorization-minimization algorithm for discriminative non-negative matrix factorization," IEEE Access, vol. 8, pp. 227399-227408, Dec. 2020. (pdf)
  • Riki Takahashi, Li Li, Shoji Makino, and Takeshi Yamada, "VMInNet: Interpolation of virtual microphones in optimal latent space explored by auto encoder," Journal of Signal Processing, vol. 25, no. 6, pp. 245-250, Nov. 2021. (link)
  • Naoya Murashima, Hirokazu Kameoka, Li Li, Shogo Seki, and Shoji Makino, "Single-channel multispeaker separation with variational autoencoder spectrogram model," Journal of Signal Processing, vol. 25, no. 4, pp. 145-149, Jul. 2021. (link)
  • Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, and Kazuya Takeda, "Underdetermined source separation based on generalized multichannel variational autoencoder," IEEE Access, vol. 7, no. 1, pp. 168104-168115, Nov. 2019. (link)
  • Hirokazu Kameoka, Li Li, Shota Inoue, and Shoji Makino, "Supervised determined source separation with multichannel variational autoencoder," Neural Computation, vol. 31, no. 9, pp. 1891-1914, Sep. 2019. (pdf) (preprint) (demo) (code)
  • Hirokazu Kameoka, Takuya Higuchi, Mikihiro Tanaka, and Li Li, "Non-negative matrix factorization with basis clustering using cepstral distance regularization," IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 26, no. 6, pp. 1025-1036, Jun. 2018. (link) (demo)
Preprints

  • Li Li, Hirokazu Kameoka, and Shoji Makino, "FastMVAE2: On improving and accelerating the fast variational autoencoder-based source separation algorithm for determined mixtures," arXiv:2109.13496, Sep. 2021. (pdf)
Peer-reviewed International Conference Papers

  • Yuka Hashizume, Li Li, and Tomoki Toda, "Music similarity calculation of individual instrumental sounds using metric learning," in Proc. The 14th annual conference of Asia-Pacific Signal and Information Processing Association (APSIPA2022), Nov. 2022. (accepted)
  • Rui Wang, Li Li, and Tomoki Toda, "Direction-aware target speaker extraction with a dual-channel system based on conditional variational autoencoders in underdetermined conditions," in Proc. The 14th annual conference of Asia-Pacific Signal and Information Processing Association (APSIPA2022), Nov. 2022. (accepted)
  • Shuhei Yamaji, Taishi Nakashima, Nobutaka Ono, Li Li, and Hirokazu Kameoka, "Encoder re-training with mixture signals on FastMVAE method," in Proc. The 14th annual conference of Asia-Pacific Signal and Information Processing Association (APSIPA2022), Nov. 2022. (accepted)
  • Kana Goto, Tetsuya Ueda, Li Li, Takeshi Yamada, Shoji Makino, "Accelerating online algorithm using geometrically constrained independent vector analysis with iterative source steering," in Proc. The 14th annual conference of Asia-Pacific Signal and Information Processing Association (APSIPA2022), Nov. 2022. (accepted)
  • Shinya Furunaga, Kana Goto, Tetsuya Ueda, Li Li, Takeshi Yamada, Shoji Makino, "Numerical investigation of weight parameters for geometrically constrained independent vector analysis using vectorwise coordinate descent or iterative source steering," in Proc. The 17th International Workshop on Acoustic Signal Enhancement (IWAENC2022), F-06, Sep. 2022.
  • Kana Goto, Tetsuya Ueda, Li Li, Takeshi Yamada, Shoji Makino, "Geometrically constrained independent vector analysis with auxiliary function approach and iterative source steering," in Proc. The 2022 European Signal Processing Conference (EUSIPCO2022), pp. 757-761, Aug. 2022.
  • Li Li, Hirokazu Kameoka, and Shogo Seki, "HBP: An efficient block permutation solver using Hungarian algorithm and spectrogram inpainting for multichannel audio source separation," in Proc. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2022), pp. 516-520, May 2022. (pdf) (poster)
  • Hirokazu Kameoka, Shogo Seki, Li Li, and Chihiro Watanabe, "AttentionPIT: Soft permutation invariant training for audio source separation with attention mechanism," in Proc. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2022), pp. 706-710, May 2022. (link)
  • Shogo Seki, Hirokazu Kameoka, and Li Li, "Investigation and comparison of optimization methods for variational autoencoder-based underdetermined multichannel source separation," in Proc. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2022), pp. 511-515, May 2022. (link)
  • Hanako Segawa, Li Li, Shoji Makino, and Takeshi Yamada, "Extension of virtual microphone technique to multiple real microphones and investigation of the impact of phase and amplitude interpolation on speech enhancement," in Proc. The 13th annual conference of Asia-Pacific Signal and Information Processing Association (APSIPA2021), pp. 591-602, Dec. 2021. (link)
  • Sotaro Nakaoka, Li Li, Shoji Makino, and Takeshi Yamada, "Reducing algorithmic delay using low-overlap window for online Wave-U-Net," in Proc. The 13th annual conference of Asia-Pacific Signal and Information Processing Association (APSIPA2021), pp. 1210-1214, Dec. 2021. (link)
  • Shota Inoue, Hirokazu Kameoka, Li Li, and Shoji Makino, "SepNet: A deep separation matrix prediction network for multichannel audio source separation," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2021), pp. 191-195, Jun. 2021. (link)
  • Sotaro Nakaoka, Li Li, Shota Inoue, and Shoji Makino, "Teacher-student learning for low-latency online speech enhancement using Wave-U-Net," in Proc. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2021), pp. 661-665, Jun. 2021. (link)
  • Riki Takahashi, Li Li, Shoji Makino, and Takeshi Yamada, "VMInNet: Interpolation of virtual microphones in optimal latent space explored by autoencoder," in Proc. RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2021), pp. 93-96, Mar. 2021.
  • Naoya Murashima, Hirokazu Kameoka, Li Li, Shogo Seki, and Shoji Makino, "Single-channel multi-speaker separation via discriminative training of variational autoencoder spectrogram model," in Proc. RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2021), pp. 149-152, Mar. 2021.
  • Kana Goto, Li Li, Riki Takahashi, Shoji Makino, and Takeshi Yamada, "Study on geometrically constrained IVA with auxiliary function approach and VCD for in-car communication," in Proc. The 12th annual conference of Asia-Pacific Signal and Information Processing Association (APSIPA2020), pp. 858-862, Dec. 2020. (pdf)
  • Li Li, Kazuhito Koishida, and Shoji Makino, "Online directional speech enhancement using geometrically constrained independent vector analysis," in Proc. The 21st Annual Conference of the International Speech Communication Association (Interspeech2020), pp. 61-65, Oct. 2020. (pdf)
  • Li Li, Hirokazu Kameoka, and Shoji Makino, "Determined audio source separation with multichannel star generative adversarial network," in Proc. The 30th IEEE International Workshop on Machine Learning for Signal Processing (MLSP2020), Sep. 2020. (pdf)
  • Li Li, and Kazuhito Koishida, "Geometrically constrained independent vector analysis for directional speech enhancement," in Proc. 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2020), pp. 846-850, May 2020. (pdf) (presentation)
  • Riki Takahashi, Kouei Yamaoka, Li Li, Shoji Makino, Takeshi Yamada, and Mitsuo Matsumoto, "Underdetermined multichannel speech enhancement using time-frequency-bin-wise switching beamformer and gated CNN-based time-frequency mask for reverberant environments," in Proc. RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2020), Feb. 2020.
  • Li Li, Tomoki Toda, Kazuho Morikawa, Kazuhiro Kobayashi, and Shoji Makino, "Improving singing aid system for laryngectomees with statistical voice conversion and VAE-SPACE," in Proc. 20th International Society for Music Information Retrieval Conference (ISMIR2019), pp. 784-790, Nov. 2019. (pdf) (slide&demo)
  • Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, and Kazuya Takeda, "Generalized multichannel variational autoencoder for underdetermined source separation," in Proc. The 2019 European Signal Processing Conference (EUSIPCO2019), pp. 1973-1977, Sep. 2019. (link) (preprint)
  • Kouei Yamaoka, Li Li, Nobutaka Ono, Shoji Makino, and Takeshi Yamada, "CNN-based virtual microphone signal estimation for MPDR Beamforming in underdetermined situations," in Proc. The 2019 European Signal Processing Conference (EUSIPCO2019), pp. 1049-1053, Sep. 2019. (link)
  • Li Li, Kouei Yamaoka, Yuki Koshino, Mitsuo Matsumoto, and Shoji Makino, "Voice activity detection under high levels of noise using gated convolutional neural networks," in Proc. International Congress on Acoustics (ICA2019), pp. 2862-2869, Sep. 2019. (pdf)
  • Shota Inoue, Li Li, Hirokazu Kameoka, and Shoji Makino, "Joint separation, dereverberation and classification of mixed sources using multichannel variational autoencoder with auxiliary classifier," in Proc. International Congress on Acoustics (ICA2019), pp. 6988-6995, Sep. 2019. (pdf)
  • Li Li, Hirokazu Kameoka, and Shoji Makino, "Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier," in Proc. 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2019), pp. 546-550, May 2019. [IEEE Signal Processing Society Japan Student Conference Paper Award] (pdf) (preprint) (poster) (demo)
  • Shota Inoue, Hirokazu Kameoka, Li Li, Shogo Seki, and Shoji Makino, "Joint separation and dereverberation of reverberant mixtures with multichannel variational autoencoder," in Proc. 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2019), pp. 56-60, May 2019. (link) (poster)
  • Li Li, and Hirokazu Kameoka, "Deep clustering with gated convolutional networks," in Proc. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2018), pp. 16-20, Apr. 2018. (pdf) (slide)
  • Li Li, Hirokazu Kameoka, and Shoji Makino, "Mel-generalized cepstral regularization with discriminative non-negative matrix factorization," in Proc. The 27th IEEE International Workshop on Machine Learning for Signal Processing (MLSP2017), Sep. 2017. (pdf)
  • Li Li, Hirokazu Kameoka, Tomoki Toda, and Shoji Makino, "Speech enhancement using non-negative spectrogram models with mel-generalized cepstral regularization," in Proc. The 18th Annual Conference of the International Speech Communication Association (Interspeech2017), pp. 1998-2002, Aug. 2017. (pdf)
  • Li Li, Hirokazu Kameoka, and Shoji Makino, "Discriminative non-negative matrix factorization with majorization-minimization," in Proc. The 5th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA2017), pp. 141-145, Mar. 2017. (pdf)
  • Li Li, Hirokazu Kameoka, Takuya Higuchi, and Hiroshi Saruwatari, "Semi-supervised joint enhancement of spectral and cepstral sequences of noisy speech," in Proc. The 17th Annual Conference of the International Speech Communication Association (Interspeech2016), pp. 3753-3757, Sep. 2016. (pdf)
Invited talks

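  • Li Li, "Multichannel source separation based on signal independence," Tokai-Section Joint Conference on Electrical, Electronics, Information, and Related Engineering, [OS2] Interdisciplinary session by young researchers leading the next generation of acoustics, I6-1, Aug. 2022. (in Japanese)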
Non-reviewed Domestic Conference Papers

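  • Li Li, Shogo Seki, and Hirokazu Kameoka, "Fast multichannel variational autoencoder method based on a recurrent neural network source model," Proc. 2022 Autumn Meeting of the Acoustical Society of Japan, 1-Q-24, pp. 333-334, Sep. 2022. (in Japanese)
  • Kana Goto, Tetsuya Ueda, Li Li, Takeshi Yamada, and Shoji Makino, "Accelerating online directional speech enhancement using geometrically constrained independent vector analysis with iterative source steering," Proc. 2022 Autumn Meeting of the Acoustical Society of Japan, 1-2-2, pp. 157-160, Sep. 2022. (in Japanese)
  • Shuhei Yamaji, Taishi Nakashima, Nobutaka Ono, Li Li, and Hirokazu Kameoka, "Source separation based on the FastMVAE method with encoder re-training using mixture signals," Proc. 2022 Autumn Meeting of the Acoustical Society of Japan, 1-Q-30, pp. 355-358, Sep. 2022. (in Japanese)
  • 近藤祐斗, Li Li, Shogo Seki, and Hirokazu Kameoka, "Source model training for mitigating block permutation in the FastMVAE method," Proc. 2022 Autumn Meeting of the Acoustical Society of Japan, 2-2-2, pp. 179-182, Sep. 2022. (in Japanese)
  • Rui Wang, Li Li, and Tomoki Toda, "Direction-aware target speaker extraction with conditional variational autoencoders and its sensitivity to direction-of-arrival error," Proc. 2022 Autumn Meeting of the Acoustical Society of Japan, 2-2-6, pp. 195-196, Sep. 2022.
  • Yuka Hashizume, Li Li, and Tomoki Toda, "Evaluation of music similarity learning focusing on individual instrumental sounds," Proc. 2022 Autumn Meeting of the Acoustical Society of Japan, 3-1-5, pp. 1517-1518, Sep. 2022. (in Japanese)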
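  • Yuka Hashizume, Li Li, and Tomoki Toda, "Music similarity learning focusing on individual instrumental sounds," IPSJ SIG Technical Report, vol. 2022-MUS-134, no. 46, pp. 1-6, Jun. 2022. (in Japanese)
  • Yuka Hashizume, Li Li, and Tomoki Toda, "Music similarity calculation based on metric learning focusing on individual instrumental sources," Proc. 2022 Spring Meeting of the Acoustical Society of Japan, 2-9-12, pp. 1207-1208, Mar. 2022. (in Japanese)
  • Rui Wang, Li Li, and Tomoki Toda, "Target speaker extraction based on conditional variational autoencoder and directional information in underdetermined condition," IEICE Technical Report, vol. 121, no. 383, EA2021-76, pp. 76-81, Mar. 2022.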
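  • Li Li, Hirokazu Kameoka, and Shoji Makino, "Fast multichannel variational autoencoder method using ChimeraACVAE," Proc. 2021 Autumn Meeting of the Acoustical Society of Japan, 1-1-6, pp. 129-132, Sep. 2021. (in Japanese) [Awaya Prize Young Researcher Award]
  • Li Li, Hirokazu Kameoka, and Shogo Seki, "Frequency-domain block permutation solver based on the Hungarian algorithm and missing-band inpainting," Proc. 2021 Autumn Meeting of the Acoustical Society of Japan, 1-1-7, pp. 133-136, Sep. 2021. (in Japanese)
  • Hanako Segawa, Li Li, Shoji Makino, and Takeshi Yamada, "Evaluation of the impact of phase and amplitude interpolation on speech enhancement performance in virtual microphone interpolation," Proc. 2021 Autumn Meeting of the Acoustical Society of Japan, 1-1-11, pp. 147-150, Sep. 2021. (in Japanese)
  • Sotaro Nakaoka, Li Li, Shoji Makino, and Takeshi Yamada, "Reducing the algorithmic delay of online Wave-U-Net using a low-overlap window," Proc. 2021 Autumn Meeting of the Acoustical Society of Japan, 2-1Q-10, pp. 337-340, Sep. 2021. (in Japanese)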
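  • Naoya Murashima, Shoji Makino, Hirokazu Kameoka, Li Li, and Shogo Seki, "Speaker-dependent monaural speech separation via discriminative training of a variational autoencoder," Proc. 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-1, pp. 205-208, Mar. 2021. (in Japanese)
  • Shota Inoue, Hirokazu Kameoka, Li Li, and Shoji Makino, "SepNet: A separation matrix prediction network for fast multichannel source separation," Proc. 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-5, pp. 221-224, Mar. 2021. (in Japanese)
  • Sotaro Nakaoka, Shota Inoue, Li Li, and Shoji Makino, "Low-latency real-time speech enhancement with Wave-U-Net using teacher-student learning," Proc. 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-6, pp. 225-228, Mar. 2021. (in Japanese)
  • Kana Goto, Li Li, Riki Takahashi, Shoji Makino, and Takeshi Yamada, "Application of geometrically constrained independent vector analysis based on the auxiliary function approach to in-car speech enhancement," Proc. 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-13, pp. 249-252, Mar. 2021. (in Japanese)
  • Hanako Segawa, Riki Takahashi, Li Li, 陣在遼河, Shoji Makino, and Takeshi Yamada, "Application of the virtual microphone technique to a triangular microphone array in a car cabin," Proc. 2021 Spring Meeting of the Acoustical Society of Japan, 2-1-14, pp. 253-256, Mar. 2021. (in Japanese)
  • 樋口隼太, Li Li, Shota Inoue, Shoji Makino, and Takeshi Yamada, "A study on noise reduction with Wave-U-Net for in-car environments," Proc. IEICE General Conference, p. A-5-1, Mar. 2021. (in Japanese)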
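  • 多賀遥香, Shogo Seki, Li Li, Kazuya Takeda, and Tomoki Toda, "Generation of singing F0 patterns based on a variational autoencoder using a generalized command-response model," Proc. 2020 Autumn Meeting of the Acoustical Society of Japan, 1-2-16, pp. 731-732, Sep. 2020. (in Japanese)
  • Li Li, Hirokazu Kameoka, Shota Inoue, and Shoji Makino, "Source separation of arbitrary speakers using the multichannel variational autoencoder method," IEICE Technical Report, vol. 119, no. 334, EA2019-77, pp. 79-84, Dec. 2019. (in Japanese) [Student Research Encouragement Award] (link)
  • Li Li, Yuki Koshino, Mitsuo Matsumoto, and Shoji Makino, "Voice activity detection in adverse noise environments using gated CNN," IEICE Technical Report, vol. 118, no. 495, EA2018-102, pp. 19-24, Mar. 2019. (in Japanese) (link)
  • Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, and Kazuya Takeda, "Evaluation of underdetermined source separation using multichannel variational autoencoder," IEICE Technical Report, vol. 118, no. 495, EA2018-154, pp. 323-328, Mar. 2019. (in Japanese) (link)
  • Li Li, Hirokazu Kameoka, and Shoji Makino, "Fast semi-blind source separation using multichannel variational autoencoder with a source class classifier," Proc. 2019 Spring Meeting of the Acoustical Society of Japan, 1-6-10, pp. 201-204, Mar. 2019. (in Japanese) (pdf)
  • Riki Takahashi, Kouei Yamaoka, Li Li, Shoji Makino, and Takeshi Yamada, "Underdetermined speech enhancement combining a time-frequency switching beamformer and a gated CNN-based time-frequency mask," Proc. 2019 Spring Meeting of the Acoustical Society of Japan, 1-6-5, pp. 181-184, Mar. 2019. (in Japanese)
  • Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, and Kazuya Takeda, "Underdetermined source separation using multichannel variational autoencoder," Proc. 2019 Spring Meeting of the Acoustical Society of Japan, 1-6-20, pp. 229-230, Mar. 2019. (in Japanese)
  • Shota Inoue, Hirokazu Kameoka, Li Li, Shogo Seki, and Shoji Makino, "A unified approach to source separation and dereverberation using multichannel variational autoencoder," Proc. 2019 Spring Meeting of the Acoustical Society of Japan, 2-Q-32, pp. 399-402, Mar. 2019. (in Japanese)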
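  • Li Li and Hirokazu Kameoka, "Source separation by deep clustering with gated CNN," Proc. 2018 Spring Meeting of the Acoustical Society of Japan, 1-4-17, pp. 453-456, Mar. 2018. (in Japanese) (pdf)
  • Li Li, Hirokazu Kameoka, and Shoji Makino, "Basis training algorithm for discriminative NMF based on the auxiliary function approach," Proc. 2017 Spring Meeting of the Acoustical Society of Japan, 1-P-4, pp. 519-522, Mar. 2017. (in Japanese) [IEEE Signal Processing Society Tokyo Joint Chapter Student Award] (pdf)
  • 鄒雲漢, Li Li, and Hirokazu Kameoka, "Vocal tract spectrogram estimation with formant frequency contour factorization," Proc. 2017 Spring Meeting of the Acoustical Society of Japan, 1-Q-41, pp. 323-326, Mar. 2017.
  • Li Li, Hirokazu Kameoka, Takuya Higuchi, Hiroshi Saruwatari, and Shoji Makino, "Simultaneous enhancement of speech in the spectral and cepstral domains," IEICE Technical Report, vol. 116, no. 189, SP2016-32, pp. 29-32, Aug. 2016. (in Japanese) (link)
  • Li Li, Hirokazu Kameoka, Takuya Higuchi, and Hiroshi Saruwatari, "Speech enhancement using semi-supervised NMF with cepstral distance regularization," Proc. 2016 Spring Meeting of the Acoustical Society of Japan, 1-P-27, pp. 721-724, Mar. 2016. (in Japanese) [Excellent Student Presentation Award] (pdf)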
Theses

  • Ph.D.: "Study on audio source separation algorithms under various conditions, ranging from determined to more realistic conditions," Mar. 2021.
  • M.E.: "Cepstral Regularization and Discriminative Training Algorithm for Non-negative Audio Spectrogram Factorization," Mar. 2018.
