Ethan Chern

I am a PhD student at the Generative AI Research Lab (GAIR), Shanghai Jiao Tong University, advised by Prof. Pengfei Liu. My research interests are in LLM alignment, factuality, evaluation, and multimodal learning.

  • Alignment: Building empirical approaches to align LLMs for enhanced reasoning (math, code), honesty, and safety (Abel, Align on the Fly, Alignment for Honesty).
  • Factuality: Devising monitoring and evaluation techniques to mitigate factual errors generated by LLMs (FacTool, FELM).
  • Evaluation: Establishing scalable evaluation methodologies that bring existing LLM-based evaluation metrics in line with human expectations (ScaleEval, OlympicArena).
  • Multimodality: Building vision-language models (VLMs) to understand and generate multimodal content.

I hold a Master’s degree in Artificial Intelligence from the Language Technologies Institute, School of Computer Science at Carnegie Mellon University. During my Master’s years at CMU, I was fortunate to work with Prof. Graham Neubig and Dr. Pengfei Liu on the factuality of LLMs and summarization models.

I also had the opportunity to collaborate with Prof. Yu Tsao on multimodal speech enhancement and voice conversion.

During my undergraduate years, I worked with Prof. Ching-Yi Lai on quantum error correction and information theory.

Publications

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
Zhen Huang, Zengzhi Wang, … , Ethan Chern, … , Pengfei Liu
preprint. [arxiv] [github] [website]

BeHonest: Benchmarking Honesty of Large Language Models
Steffi Chern, Zhulin Hu, Yuqing Yang, Ethan Chern, Yuan Guo, Jiahe Jin, Binjie Wang, Pengfei Liu
preprint. [arxiv]

Reformatted Alignment
Run-Ze Fan, Xuefeng Li, Haoyang Zou, Junlong Li, Shwai He, Ethan Chern, Jiewen Hu, Pengfei Liu
preprint. [arxiv]

Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate
Steffi Chern, Ethan Chern, Graham Neubig, Pengfei Liu
preprint. [arxiv] [github]

Align on the Fly: Adapting Chatbot Behavior to Established Norms
Chunpu Xu, Steffi Chern, Ethan Chern, Ge Zhang, Zekun Wang, Ruibo Liu, Jing Li, Jie Fu, Pengfei Liu
preprint. [arxiv] [github] [website]

Alignment for Honesty
Yuqing Yang, Ethan Chern, Xipeng Qiu, Graham Neubig, Pengfei Liu
preprint. [arxiv] [github]

Generative AI for Math: Abel
Ethan Chern *, Haoyang Zou *, Xuefeng Li *, Jiewen Hu *, Kehua Feng, Junlong Li, Pengfei Liu
preprint. [github] [website]
* = Core contributors

FELM: Benchmarking Factuality Evaluation of Large Language Models
Shiqi Chen, Yiran Zhao, Jinghan Zhang, I-Chun Chern, Siyang Gao, Pengfei Liu, Junxian He
NeurIPS 2023. [arxiv] [github] [website]

FacTool: Factuality Detection in Generative AI - A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu
preprint. [arxiv] [github] [website]

Improving Factuality of Abstractive Summarization via Contrastive Reward Learning
I-Chun Chern, Zhiruo Wang, Sanjan Das, Bhavuk Sharma, Pengfei Liu, Graham Neubig
Third Workshop on Trustworthy Natural Language Processing at ACL 2023. [arxiv]

Audio‑Visual Speech Enhancement and Separation by Leveraging Multi‑Modal Self‑Supervised Embeddings
I-Chun Chern, Kuo‑Hsuan Hung, Yi‑Ting Chen, Tassadaq Hussain, Mandar Gogate, Amir Hussain, Yu Tsao, Jen-Cheng Hou
Advances in Multi-modal Hearing Assistive Technologies (AMHAT) at ICASSP 2023. [arxiv]

Voice Direction‑of-Arrival Conversion
I‑Chun Chern, Steffi Chern, Heng‑Cheng Kuo, Huan‑Hsin Tseng, Kuo‑Hsuan Hung, Yu Tsao
MLSP 2023.

Decoding of Quantum Data‑Syndrome Codes via Belief Propagation
Kao‑Yueh Kuo, I‑Chun Chern, Ching‑Yi Lai
ISIT 2021. [arxiv]