1. What contributions have the authors mentioned in the paper "Combining graph-based learning with automated data collection for code vulnerability detection" ?
This paper presents FUNDED, a novel learning framework for building vulnerability detection models. To provide sufficient training data to build an effective deep learning model, the authors combine probabilistic learning and statistical assessments to automatically gather high-quality training samples from open-source projects. This provides many real-life vulnerable code training samples to complement the limited vulnerable code samples available in standard vulnerability databases.
2. What are the future works mentioned in the paper "Combining graph-based learning with automated data collection for code vulnerability detection" ?
Naturally, there is room for future work and further improvement. The authors leave this as their future work. Their future work will explore a language model that is specifically built for modeling program source code, such as code2vec [66]. Providing a theoretical proof of the underlying working mechanism of FUNDED is also left as future work.
3. How does the embedding layer compute the representation vector of a graph node?
The 100-dimensional embedding vector, h_v, of a graph node, v, is computed by the embedding layer by recursively aggregating and transforming the representation vectors of its neighboring nodes.
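The recursive aggregate-and-transform idea can be illustrated with a minimal sketch. This is not the paper's actual update function: the mean aggregation, the tanh nonlinearity, the shared weight matrix, and the names `aggregate_step` and `embed` are all assumptions chosen for illustration, and a 2-dimensional vector stands in for the 100-dimensional one.

```python
import math

def aggregate_step(h, neighbors, weight):
    """One round: average the neighbors' vectors, add the node's own
    vector, apply a linear transform, then tanh (illustrative choices)."""
    dim = len(next(iter(h.values())))
    new_h = {}
    for v, vec in h.items():
        nbrs = neighbors.get(v, [])
        if nbrs:
            # Aggregate: element-wise mean over neighboring node vectors.
            msg = [sum(h[u][i] for u in nbrs) / len(nbrs) for i in range(dim)]
        else:
            # Isolated node: fall back to its own vector as the message.
            msg = list(vec)
        combined = [a + b for a, b in zip(vec, msg)]
        # Transform: multiply by the weight matrix, then squash with tanh.
        new_h[v] = [
            math.tanh(sum(weight[i][j] * combined[j] for j in range(dim)))
            for i in range(dim)
        ]
    return new_h

def embed(h0, neighbors, weight, rounds=2):
    """Repeat the aggregation so information spreads over several hops."""
    h = h0
    for _ in range(rounds):
        h = aggregate_step(h, neighbors, weight)
    return h

# Hypothetical toy graph: two connected nodes with 2-dimensional vectors.
h0 = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
neighbors = {"a": ["b"], "b": ["a"]}
identity = [[1.0, 0.0], [0.0, 1.0]]
h1 = embed(h0, neighbors, identity, rounds=1)
```

Each additional round lets a node's vector reflect nodes one hop further away, which is what "recursively" refers to in the answer above.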
4. What is the effect of a higher classification threshold?
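Lowering the classification threshold (i.e., accepting a higher FPR) makes the model label more samples as vulnerability-relevant, thus increasing both true and false positives. Conversely, a higher classification threshold labels fewer samples as vulnerability-relevant, reducing both true and false positives.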
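The trade-off can be seen on a small made-up example. The scores and labels below are invented for illustration and do not come from the paper; `confusion_at` is a hypothetical helper that counts true and false positives at a given threshold.

```python
def confusion_at(scores, labels, threshold):
    """Count true and false positives when every sample scoring at or
    above the threshold is labeled vulnerability-relevant."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp, fp

# Invented model scores and ground-truth labels (1 = vulnerable).
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]

results = {}
for t in (0.9, 0.6, 0.3):
    results[t] = confusion_at(scores, labels, t)
    print(f"threshold={t}: TP={results[t][0]}, FP={results[t][1]}")
```

On this toy data, moving the threshold from 0.9 down to 0.3 raises the true positives from 1 to 3 and the false positives from 0 to 2, so both counts move together.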





