
10 papers on code generation, covering code evaluation, code search, code generation, surveys, and code or bug classification


Resource description:

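Columns: title, type, venue, summary, close-reading link

Title: Comparing large language models and human programmers for generating programming code
Type: code evaluation
Venue: arXiv
Summary: Evaluates the performance of seven LLMs at generating programming code, explores how different prompting strategies affect LLM coding performance, directly compares the programming ability of LLMs with that of human programmers, assesses the ability of LLMs to generate and translate code across programming languages, and examines the computational efficiency of LLMs and their ability to learn from past mistakes.

Title: A Comparison of the Effectiveness of ChatGPT and Co-Pilot for Generating Quality Python Code
Type: code evaluation
Venue: conference
Summary: Evaluates the effectiveness of ChatGPT and Copilot at solving LeetCode programming problems, and explores ChatGPT's ability to correct code after receiving feedback and its potential for improving code quality and performance.

Title: Program Code Generation with Generative AIs
Type: code evaluation
Venue: MDPI journal Algorithms (described as low-tier, not SCI-indexed)
Summary: Compares human-generated code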
Software Quality Journal (2024) 32:985–1005
https://doi.org/10.1007/s11219-024-09675-3
RESEARCH
LLM-BRC: A large language model-based bug report classification framework
Xiaoting Du (1,2) · Zhihao Liu (3) · Chenglong Li (3) · Xiangyue Ma (3) · Yingzhuo Li (1) · Xinyu Wang (1)
Accepted: 23 April 2024 / Published online: 24 May 2024
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024
Abstract
Deep learning frameworks serve as the cornerstone for constructing robust deep learning systems. However, bugs within these frameworks can have severe consequences, negatively affecting various applications. Accurately classifying and understanding these bugs is essential to ensure framework reliability. By doing so, developers can proactively take appropriate measures to mitigate potential risks associated with specific bug types in both current and future software releases. Despite the significance of bug report classification, existing methods fall short in terms of performance, rendering them impractical for real-world applications. To address this limitation, we propose a bug report classification framework for deep learning frameworks, called LLM–BRC, leveraging OpenAI's latest embedding model, text-embedding-ada-002. Our LLM–BRC framework achieves accuracy ranging from 92% to 98.75% in bug report classification for three deep learning frameworks: TensorFlow, MXNET, and PaddlePaddle. This represents a substantial improvement of 17.21% to 69.15% compared to existing methods. Furthermore, we conduct a comprehensive investigation into the impact of different bug report components and different models.
Keywords: Bug report classification · Deep learning framework · Large-language model
* Xiaoting Du: duxiaoting@bupt.edu.cn
Chenglong Li: li_chenglong@buaa.edu.cn
1 School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing, China
2 Shanghai Key Laboratory of Trustworthy Computing (East China Normal University), Shanghai 200062, China
3 School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
1 Introduction
Deep learning frameworks play a crucial role in building robust deep learning systems (Zhang et al., 2020). With the rapid advancement of deep learning technology, the demand for deep learning frameworks has experienced exponential growth (Guo et al., 2018). This expansion encompasses the incorporation of new interfaces, the enhancement of functionalities, and the optimization of compatibility with a wide array of hardware devices and underlying drivers. Throughout this evolutionary process, the continuous iteration of code and version updates inevitably introduces bugs into deep learning frameworks (Zhang et al., 2018). Bugs in deep learning frameworks can affect a far larger user base than bugs in specific deep learning models. Particularly in safety- and security-critical domains such as autonomous driving (Chen et al., 2015) and healthcare (Cai et al., 2014), the consequences of these bugs can be more severe. Therefore, ensuring the reliability of deep learning frameworks is of utmost importance.
Numerous studies have been conducted to gain insights into the characteristics of bugs in deep learning frameworks and provide assistance in their resolution. For instance, Jia et al. (2021) analyzed bugs in TensorFlow based on 202 bug fixes. The findings revealed that bugs in TensorFlow can be classified into 6 distinct categories based on symptoms and 11 distinct categories based on root causes. Islam et al. (2019) examined five deep learning libraries, namely Caffe (Jia et al., 2014), Keras (Lux & Bertini, 2019), TensorFlow (Girija, 2016), Theano (Team et al., 2016), and Torch (Collobert et al., 2002). They analyzed 2,716 posts from Stack Overflow and 500 bug fix commits from GitHub to identify commonly occurring bug types in deep learning frameworks. According to the classification results, there are five different bug types: API bugs, Coding bugs, Data bugs, Structural bugs, and Non model structural bugs. In Du et al. (2022), we classified bug reports in TensorFlow, MXNET, and PaddlePaddle based on fault-triggering conditions. Bugs were categorized into Bohrbugs (BOHs) and Mandelbugs (MANs), taking into account the conditions of fault activation and error propagation. Moreover, within the MAN category, bugs were further classified as either non-aging related Mandelbugs (NAMs) or aging-related bugs (ARBs).
However, the bug classification in the aforementioned studies was performed entirely by hand. As the number of bug reports in deep learning frameworks continues to increase, manually classifying all bug reports becomes impractical. Therefore, the development of bug report classification methods becomes essential. Xia et al. (2014) employed the bag-of-words model to represent bug reports and used machine learning classifiers to classify them. However, the bag-of-words model neglects the contextual semantic information present in bug reports, resulting in inadequate classification results. To address this limitation and effectively utilize the semantic information embedded within bug reports, we proposed the DeepSIM method in Du et al. (2021). DeepSIM employed a word2vec semantic model that was trained on over two million bug reports. However, the effectiveness of DeepSIM is hindered by the constrained size of the training corpus used for the semantic model. To address the aforementioned issues, we propose a Large Language Model-based Bug Report Classification framework (LLM–BRC) for deep learning frameworks. Large language models (LLMs), particularly GPT-3 and GPT-4 (Brown et al., 2020; Radford et al., 2018, 2019), have proven transformative in numerous fields and have made remarkable contributions in domains ranging from mathematics (Frieder et al., 2023) and communication (Guo et al., 2023) to even medicine (Nov et al., 2023). In particular, the prowess of LLMs lies in their ability to revolutionize text processing across diverse tasks, substantially propelling the fields of natural language understanding and generation to new heights (Ray, 2023). One of the core strengths of LLMs is their mastery of language representation through dense vector embeddings. By capturing intricate semantic meaning and contextual information, these embeddings allow for a more nuanced understanding of language and context-aware language processing.
In our framework, we leverage the text-embedding-ada-002 model, which is the second-generation embedding model announced by OpenAI on December 15, 2022, to represent bug reports and facilitate bug report classification. Based on this model, bug reports can be transformed into embeddings with 1,536 dimensions. These embedding vectors are then fed into a feed-forward neural network (FFN) for bug report classification. Unlike traditional machine learning classifiers, an FFN excels at capturing intricate patterns and dependencies within the data, enabling it to learn highly representative and discriminative features. This allows for enhanced accuracy of bug report classification and the ability to handle high-dimensional input data efficiently. Finally, the effectiveness of LLM–BRC is evaluated on bug reports from three deep learning frameworks.
In summary, this article makes the following main contributions.
1. We present LLM–BRC, a Large Language Model-based Bug Report Classification framework that combines a large language model with a deep learning classifier. With this method, we classify bugs in deep learning frameworks with an accuracy ranging from 92% to 98.75%.
2. We explore the factors influencing classification results, including information from different components of bug reports and types of language models, to further promote the practical application of this method.
3. To facilitate bug report classification research, we have open-sourced both the data and the method, which can be accessed at https://sites.google.com/view/llmbp/.
The rest of the paper is organized as follows. Section 2 presents the proposed approach. Section 3 provides an overview of the experimental setup. Section 4 describes the evaluation and analysis of the results. In Section 5, we discuss the threats to validity. Section 6 presents the related work. Finally, the last section concludes the paper.
2 Our approach
In this section, we describe a bug report classification framework called LLM–BRC. The overall procedure of LLM–BRC is depicted in Fig. 1. As shown in the figure, LLM–BRC comprises three sequential steps: data preparation, LLM-based bug report representation, and bug report classification. In the data preparation phase, we start by extracting information from bug reports in deep learning frameworks' GitHub repositories, using a custom-designed web crawler. Next, the preprocessed bug reports are fed into OpenAI's text-embedding-ada-002 model, which transforms the natural language text into dense embedding vector representations. These embeddings capture the semantic meaning and contextual information present in the bug reports. Finally, an FFN is constructed and trained using labeled bug reports. The FFN uses the embeddings to perform the bug report classification task. In the subsequent parts of this section, we provide a detailed explanation of each step of LLM–BRC.
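As a rough illustration of the final step, the sketch below runs a forward pass of a small feed-forward classifier over a 1,536-dimensional vector. The hidden size, the two-class output, the random initialization, and all function names are assumptions made for this example; the weights are untrained, so the predicted class is meaningless, and the snippet is not the authors' implementation.

```python
import math
import random
from typing import Dict, List

EMBED_DIM = 1536   # size of a text-embedding-ada-002 vector (from the text)
HIDDEN_DIM = 8     # assumed, kept tiny for illustration
NUM_CLASSES = 2    # assumed, e.g. bug vs. non-bug


def init_matrix(n_out: int, n_in: int, rng: random.Random) -> List[List[float]]:
    """Random weight matrix of shape (n_out, n_in) with small values."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights: List[List[float]], bias: List[float], x: List[float]) -> List[float]:
    """Fully connected layer: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ffn_forward(x: List[float], params: Dict[str, list]) -> List[float]:
    """Two-layer FFN: linear -> ReLU -> linear -> softmax."""
    hidden = [max(0.0, v) for v in linear(params["W1"], params["b1"], x)]
    return softmax(linear(params["W2"], params["b2"], hidden))


rng = random.Random(0)
params = {
    "W1": init_matrix(HIDDEN_DIM, EMBED_DIM, rng),
    "b1": [0.0] * HIDDEN_DIM,
    "W2": init_matrix(NUM_CLASSES, HIDDEN_DIM, rng),
    "b2": [0.0] * NUM_CLASSES,
}
# Stand-in for a real bug report embedding (random values, for illustration only).
embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
probs = ffn_forward(embedding, params)
predicted = max(range(NUM_CLASSES), key=lambda c: probs[c])
```

In practice the weights would be fitted on the labeled bug reports, typically with a deep learning library rather than hand-written loops.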
2.1 Data preparation
We begin data preparation by crawling bug reports, identified by their Bug-ID, from the GitHub repositories of TensorFlow, MXNET, and PaddlePaddle. This crawl covers a total of 3,110 bug reports from these three deep learning frameworks, which were labeled in our previous work (Du et al., 2022). Since text is the dominant feature contained in bug reports, we collect natural language information including title, description, and comments from each bug report. Among them, the title provides a concise summary of the entire bug report. The description section contains a detailed account of the issue, including observed software anomalies, the software runtime environment, reproduction steps, and other relevant details. Furthermore, the comment section comprises discussions and exchanges among developers, the report submitter, and other interested parties. These comments provide valuable insights and additional information related to the reported issue.
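One way the three text fields described above might be assembled into a single input string is sketched below. The `BugReport` class, the `build_text` helper, the newline separator, and the sample report are hypothetical and chosen only for this example; they are not taken from the authors' released tool.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class BugReport:
    """Natural-language fields collected from one issue (hypothetical container)."""
    bug_id: int
    title: str
    description: str
    comments: List[str] = field(default_factory=list)


def build_text(report: BugReport,
               components: Sequence[str] = ("title", "description", "comments")) -> str:
    """Concatenate the selected components into one string for embedding."""
    parts: List[str] = []
    if "title" in components:
        parts.append(report.title.strip())
    if "description" in components:
        parts.append(report.description.strip())
    if "comments" in components:
        parts.extend(c.strip() for c in report.comments)
    # Drop empty fields so the result has no blank lines.
    return "\n".join(p for p in parts if p)


# Made-up report used only to exercise the helper.
report = BugReport(
    bug_id=1,
    title="Memory grows during training",
    description="Resident memory increases every epoch until the process is killed.",
    comments=["Reproduced on a second machine."],
)
full_text = build_text(report)
title_only = build_text(report, components=("title",))
```

Selecting components through a parameter makes it straightforward to compare, for instance, title-only input against the full report.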
2.2 LLM‑based bug report representation
After extracting bug reports, we obtain a corpus of text data. To represent these texts effectively, we use a pre-trained large language model, text-embedding-ada-002. Applying text-embedding-ada-002 to the texts yields dense, low-dimensional embedding vectors that serve as compact representations of the original bug reports.
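The sketch below shows how such embeddings might be requested, assuming a client object that follows the OpenAI Python SDK (v1) convention of `client.embeddings.create(model=..., input=...)`, such as an `openai.OpenAI()` instance. The `embed_reports` helper and the `FakeEmbeddings` stand-in are hypothetical; the stand-in returns dummy vectors so that the example runs offline without an API key.

```python
from types import SimpleNamespace
from typing import List


def embed_reports(texts: List[str], client,
                  model: str = "text-embedding-ada-002") -> List[List[float]]:
    """Return one embedding vector per input text.

    `client` is assumed to expose `embeddings.create(model=..., input=...)`
    in the style of the OpenAI Python SDK (v1).
    """
    response = client.embeddings.create(model=model, input=texts)
    # Sort by index so the vectors line up with the input order.
    items = sorted(response.data, key=lambda item: item.index)
    return [list(item.embedding) for item in items]


class FakeEmbeddings:
    """Offline stand-in for the embeddings endpoint (dummy vectors, illustration only)."""

    def create(self, model: str, input: List[str]):
        data = [SimpleNamespace(index=i, embedding=[float(len(text))] * 1536)
                for i, text in enumerate(input)]
        return SimpleNamespace(data=data)


fake_client = SimpleNamespace(embeddings=FakeEmbeddings())
vectors = embed_reports(["crash on import", "memory leak in data loader"], fake_client)
```

Passing the client in as an argument keeps the helper testable and leaves credentials and network configuration outside the function.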
Specifically, the text-embedding-ada-002 model employs the Transformer architecture (Ashish et al., 2017) to convert input into a 1,536-dimensional vector. First, each input bug report is split into tokens. Next, the tokens pass through 96 decoder layers, each comprising a masked multi-head self-attention mechanism and a feed-forward neural network. The multi-head self-attention layer computes self-attention over the input sequence, generating a feature representation for each position in the sequence. The feed-forward network applies fully connected layers to the feature vector at each position, producing new feature representations. Its crucial role is to provide nonlinear transformations.
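The masked self-attention step described above can be sketched for a single head as follows. This is a minimal pure-Python illustration of scaled dot-product attention with a causal mask; it omits the learned per-head projections and the multi-head concatenation, and the toy input values are made up.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention(Q: Matrix, K: Matrix, V: Matrix) -> Matrix:
    """Single-head scaled dot-product attention with a causal mask."""
    d_k = len(K[0])
    output: Matrix = []
    for i, q in enumerate(Q):
        # Position i may only attend to positions 0..i (the causal mask).
        scores = [sum(a * b for a, b in zip(q, K[j])) / math.sqrt(d_k)
                  for j in range(i + 1)]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        row = [sum(w * V[j][c] for j, w in enumerate(weights))
               for c in range(len(V[0]))]
        output.append(row)
    return output


# Toy sequence of three tokens with 2-dimensional vectors (made-up values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = masked_attention(X, X, X)
```

Because of the mask, the first output row equals the first value vector: position 0 can attend only to itself.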
Fig. 1 Detailed structure of LLM-BRC
The decoder layers start by applying $h$ different linear projections to the Query, Key, and Value. The resulting attention values for each head $i$ are calculated as follows:

$$\mathrm{head}_i = \mathrm{Attention}(QW_i^Q, KW_i^K, VW_i^V) \tag{1}$$

where $Q$, $K$, and $V$ represent the query vector, key vector, and value vector, respectively. The attention mechanism used in the transformer employs scaled dot-product attention, which can be defined as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V \tag{2}$$

where $d_k$ represents the dimension of the query/key vectors.
The resulting attention values from all the heads are concatenated together, resulting in a single multi-head attention output:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)W^O \tag{3}$$

where $W^O$ is a weight matrix used to combine the multi-head attention outputs.
Additionally, the decoder includes a masked multi-head self-attention layer. This layer prevents the model from seeing future information during sequence prediction. Hence, the final output of the decoder can be represented as:

$$\mathrm{DecoderLayer}(y) = \mathrm{LN}\big(y + \mathrm{M}_{\mathrm{MHA}}(y) + \mathrm{MHA}(y, x) + \mathrm{FFN}(y)\big) \tag{4}$$

where $y$ represents the input sequential data, $x$ refers to the output sequence from the encoder, $\mathrm{MHA}$ denotes the multi-head self-attention layer, $\mathrm{FFN}$ represents the feed-forward layer, $\mathrm{LN}$ represents the layer normalization layer, and $\mathrm{M}_{\mathrm{MHA}}$ signifies the masked multi-head self-attention layer.
Finally, the output of the attention layer undergoes processing through a feed-forward neural network. The position-wise feed-forward network is a fully connected feed-forward neural network where each word at a position passes through the same network independently. It essentially consists of two fully connected layers. After passing through all the decoder layers, the final output is generated by the last decoder layer. This output contains the contextual information of the bug reports and serves as the ultimate embedding vector representation for bug reports. This embedding vector will be used for subsequent classification tasks.
2.3 Bug report classification
In this section, we conduct the bug report classification task at three levels, as depicted in Fig. 2. At the first level, we classify bug reports into two categories: bugs and non-bugs. As noted by Herzig et al. (2013), not all bug reports contain actual bugs. Therefore, bug reports related to requests for new features or enhancements, documentation issues (e.g., missing information, outdated documentation, or harmless warning outputs), compile-time issues (e.g., cmake errors or linking errors), operator errors, or duplicate reports are considered non-bugs and should be filtered out. At the second level, we classify bugs into Bohrbugs (BOHs) and Mandelbugs (MANs) based on the complexity of fault activation and/or error propagation conditions (Grottke & Trivedi, 2005). Finally, within the MAN category, we further differentiate between aging-related

Resource file list:

代码生成论文_20241021.zip (about 28 files)
  1. 代码生成论文_20241021/
  2. 代码生成论文_20241021/代码或bug分类/
  3. 代码生成论文_20241021/.DS_Store 6KB
  4. __MACOSX/代码生成论文_20241021/._.DS_Store 120B
  5. 代码生成论文_20241021/代码生成/
  6. 代码生成论文_20241021/代码评估/
  7. 代码生成论文_20241021/代码搜索/
  8. 代码生成论文_20241021/代码模型survey/
  9. 代码生成论文_20241021/代码或bug分类/.DS_Store 6KB
  10. __MACOSX/代码生成论文_20241021/代码或bug分类/._.DS_Store 120B
  11. 代码生成论文_20241021/代码或bug分类/LLMBRC A large language model-based bug report classification framework.pdf 2.56MB
  12. __MACOSX/代码生成论文_20241021/代码或bug分类/._LLMBRC A large language model-based bug report classification framework.pdf 418B
  13. 代码生成论文_20241021/代码评估/.DS_Store 6KB
  14. __MACOSX/代码生成论文_20241021/代码评估/._.DS_Store 120B
  15. 代码生成论文_20241021/代码评估/Program Code Generation with Generative AIs.pdf 480.83KB
  16. __MACOSX/代码生成论文_20241021/代码评估/._Program Code Generation with Generative AIs.pdf 425B
  17. 代码生成论文_20241021/代码评估/A_Comparison_of_the_Effectiveness_of_ChatGPT_and_Co-Pilot_for_Generating_Quality_Python_Code_Solutions.pdf 352.52KB
  18. __MACOSX/代码生成论文_20241021/代码评估/._A_Comparison_of_the_Effectiveness_of_ChatGPT_and_Co-Pilot_for_Generating_Quality_Python_Code_Solutions.pdf 510B
  19. 代码生成论文_20241021/代码评估/Comparing large language models and human programmers for generating programming code.pdf 2.04MB
  20. __MACOSX/代码生成论文_20241021/代码评估/._Comparing large language models and human programmers for generating programming code.pdf 340B
  21. 代码生成论文_20241021/代码搜索/.DS_Store 6KB
  22. __MACOSX/代码生成论文_20241021/代码搜索/._.DS_Store 120B
  23. 代码生成论文_20241021/代码搜索/Multimodal Representation for Neural Code Search.pdf 1019.4KB
  24. __MACOSX/代码生成论文_20241021/代码搜索/._Multimodal Representation for Neural Code Search.pdf 340B
  25. 代码生成论文_20241021/代码模型survey/A Survey on Large Language Models for Code Generation .pdf 2.33MB
  26. __MACOSX/代码生成论文_20241021/代码模型survey/._A Survey on Large Language Models for Code Generation .pdf 340B
  27. 代码生成论文_20241021/代码模型survey/.DS_Store 6KB
  28. __MACOSX/代码生成论文_20241021/代码模型survey/._.DS_Store 120B