BBTv2: Towards a Gradient-Free Future with Large Language Models T Sun, Z He, H Qian, Y Zhou, XJ Huang, X Qiu Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 56* | 2022 |
DiffusionBERT: Improving generative masked language models with diffusion models Z He, T Sun, K Wang, X Huang, X Qiu arXiv preprint arXiv:2211.15029, 2022 | 55 | 2022 |
MOSS: Training conversational language models from synthetic data T Sun, X Zhang, Z He, P Li, Q Cheng, H Yan, X Liu, Y Shao, Q Tang, ... arXiv preprint arXiv:2307.15020 7, 2023 | 37 | 2023 |
Multitask pre-training of modular prompt for Chinese few-shot learning T Sun, Z He, Q Zhu, X Qiu, X Huang arXiv preprint arXiv:2210.07565, 2022 | 14 | 2022 |
Competition for gradient-free tuning of large language models: approaches, results, current challenges and future directions T Cao, L Chen, D Zhang, T Sun, Z He, X Qiu, X Xu, H Zhang National Science Review 10 (6), nwad124, 2023 | 3 | 2023 |
Can AI Assistants Know What They Don't Know? Q Cheng, T Sun, X Liu, W Zhang, Z Yin, S Li, L Li, K Chen, X Qiu arXiv preprint arXiv:2401.13275, 2024 | 2 | 2024 |
Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT Z He, X Ge, Q Tang, T Sun, Q Cheng, X Qiu arXiv preprint arXiv:2402.12201, 2024 | 1 | 2024 |
Generate Point Clouds with Multiscale Details from Graph-Represented Structures X Yang, Z He, C Jin arXiv preprint arXiv:2112.06433, 2021 | | 2021 |