Scientific literature has grown exponentially over the last few decades. Building on advances in web-enabled tools and technologies, digital libraries now store and index millions of articles. Within this archived literature, thousands of new algorithms, mostly illustrated with pseudo-code, are published every year in Computer Science and related computational fields. An array of techniques has been deployed to retrieve information about these algorithms by indexing their pseudo-code and metadata from a vast pool of scholarly documents. Unfortunately, existing search engines are limited to indexing a textual description of each pseudo-code and cannot provide simple algorithm-specific information such as run-time complexity, evaluation performance (e.g., precision, recall, or F-measure), or the size of the data set the algorithm can effectively process.
In this paper, we propose a set of algorithms that extract information pertaining to the performance of the algorithms presented or discussed in a research article. Specifically, sentences that convey information about the efficiency of the corresponding algorithm are identified and extracted using a Recurrent Convolutional Neural Network (RCNN) model. On a dataset of 258 scholarly documents, originally downloaded from CiteSeerX and manually annotated by four experts, our proposed RCNN-based model achieves an encouraging F-measure of 77.65% and an accuracy of 76.35%.
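To make the reported metrics concrete, the following minimal sketch shows how F-measure and accuracy are conventionally computed for a binary sentence classifier, where label 1 marks a sentence that conveys efficiency information and label 0 marks any other sentence. The function name `binary_metrics` and the toy labels are illustrative assumptions; they are not the evaluation code or the data used in this work.

```python
def binary_metrics(gold, predicted):
    """Compute precision, recall, F-measure, and accuracy for 0/1 labels.

    gold      -- reference labels (1 = efficiency-related sentence)
    predicted -- labels produced by a classifier
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)  # true negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical toy labels for eight sentences (not taken from the dataset).
gold = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]
scores = binary_metrics(gold, predicted)
print(scores)
```

On these toy labels there are three true positives, one false positive, one false negative, and three true negatives, so all four scores equal 0.75. F-measure is the harmonic mean of precision and recall, which is why it can differ from accuracy when the two classes are imbalanced.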
Saeed-Ul Hassan, Iqra Safder, Junaid Sarfraz, Mohsin Ali and Suppawong Tuarob, “Detecting Target Text related to Algorithmic Efficiency in Full Text Digital Libraries using Recurrent Convolutional Neural Network Model”