Machine Learning Approach upon Text from Varied Publishing Formats
- 1, 2, 3: Department of Computer Science and Engineering, Bhilai Institute of Technology, Durg, CG, India
Res. J. Computer & IT Sci., Volume 4, Issue 11, Pages 16-20, November 20, 2016
This paper reports an approach to the challenges of converting documents from varied publishing formats into machine-readable formats, a research objective that falls within the field of information retrieval. Representing such documents in machine-readable form requires identifying, accessing, processing and finally representing their information in a way that makes it readily accessible to machines. Each of these steps poses its own challenges, which are discussed in detail below together with the approach proposed for solving the information retrieval task over huge text corpora.
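The four steps named in the abstract (identify, access, process, represent) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the extension-based format detection, the token-based processing and the JSON output record are all assumptions made for illustration, and only HTML and plain text are handled because the Python standard library has no PDF reader.

```python
import json
import re
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def identify(name):
    """Step 1 (assumed): guess the publishing format from the file extension."""
    known = {".html": "html", ".htm": "html", ".txt": "text"}
    return known.get(Path(name).suffix.lower(), "unknown")


def access(raw, fmt):
    """Step 2 (assumed): pull the plain text out of the raw content."""
    if fmt == "html":
        parser = _TextExtractor()
        parser.feed(raw)
        parser.close()
        return " ".join(parser.parts)
    return raw  # plain or unknown formats are passed through unchanged


def process(text):
    """Step 3 (assumed): lowercase the text and extract alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def represent(name, fmt, tokens):
    """Step 4 (assumed): emit a JSON record that a program can consume."""
    return json.dumps({"source": name, "format": fmt, "tokens": tokens})


def convert(name, raw):
    """Run the four steps in order on one document."""
    fmt = identify(name)
    return represent(name, fmt, process(access(raw, fmt)))


print(convert("sample.html", "<p>Machine <b>readable</b> text</p>"))
```

Keeping each step in its own function means that support for a further publishing format would only require changes to `identify` and `access`, while the processing and representation steps stay the same.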