
META-GENERATION METHOD FOR LARGE LANGUAGE MODELS

About this article

Received: 21/03/25                Revised: 26/06/25                Published: 28/06/25

Authors

Hoang Nhat Duong, Institute of Information Technology - Vietnam Academy of Science and Technology

Abstract


This study addresses the question: how can the accuracy and efficiency of natural language processing be enhanced by optimizing the output generation process? The goal is to develop a meta-generation method that improves the quality of large language model outputs through systematic feedback and refinement steps. The research methodology is structured around a three-stage process: (1) generating an initial output from the model, (2) collecting feedback to identify errors, and (3) refining the output based on that feedback to produce a more accurate result. A key innovation of the approach lies in decomposing the problem into smaller sub-tasks, generating multiple candidate outputs, and applying a reward model or voting mechanism to select the best answer. The results indicate that the meta-generation approach significantly improves model accuracy by incorporating step-by-step verification, feedback, and candidate selection. Where experimental data are available, they show that the refined model outperforms single-pass generation in output quality. The approach shows clear potential for enhancing the reasoning performance and output quality of language models.
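
To make the workflow concrete, the Python sketch below illustrates the three-stage loop and the candidate-selection step described in the abstract. It assumes a hypothetical call_llm helper standing in for any text-generation API, illustrative prompts, and a simple majority vote in place of a trained reward model; it is a minimal sketch of the idea, not the authors' implementation.

from collections import Counter

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g., a request to a chat-completion API)."""
    raise NotImplementedError

def meta_generate(task: str, num_candidates: int = 5, feedback_rounds: int = 2) -> str:
    candidates = []
    for _ in range(num_candidates):
        # Stage 1: generate an initial output from the model.
        answer = call_llm(f"Solve the task step by step:\n{task}")
        for _ in range(feedback_rounds):
            # Stage 2: collect feedback that identifies errors in the draft.
            feedback = call_llm(
                f"Task:\n{task}\n\nDraft answer:\n{answer}\n\n"
                "List any mistakes in the draft, or reply 'OK' if it is correct."
            )
            if feedback.strip().upper() == "OK":
                break
            # Stage 3: refine the draft based on the feedback.
            answer = call_llm(
                f"Task:\n{task}\n\nDraft answer:\n{answer}\n\nFeedback:\n{feedback}\n\n"
                "Rewrite the answer with the reported errors fixed."
            )
        candidates.append(answer.strip())
    # Candidate selection: majority vote over the refined candidates
    # (a learned reward model could rank the candidates here instead).
    best_answer, _ = Counter(candidates).most_common(1)[0]
    return best_answer

In this sketch the vote simply picks the most frequent refined answer; swapping in a reward model would mean scoring each candidate and returning the highest-scoring one.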

Keywords


Meta-generation; Chain-of-Thought; Reinforcement learning; Generator; Fine-tuning


DOI: https://doi.org/10.34238/tnu-jst.12364
