Recognition of Automated Hand-written Digits on Document Images Making Use of Machine Learning Techniques

Abstract: The purpose of this study is to create an automated framework that recognizes handwritten digit strings. The digit strings are first segmented into individual digits. Each segmented digit is then passed to a digit recognition module, which completes the recognition of the string. The study applies several machine learning techniques to the digit string recognition problem, namely SVM, ANN, and CNN architectures. These models are trained on images of digit strings using HOG feature vectors. The deep learning methods process each image by sliding a fixed-size window over it and classifying each sub-image as a digit or a non-digit. Once segmentation is complete, the handwritten digits are recognized. To assess the methods, the models are first trained on data, and the digit data is then evaluated with each machine learning method. The experimental findings indicate that SVM and ANN have disadvantages in accuracy and efficiency when recognizing text images, whereas CNN performs better and is more accurate. This paper focuses on developing an effective system for automatically recognizing handwritten digits, and examines how machine learning and deep learning approaches such as SVM, ANN, and CNN adapt to various datasets. The test results show that the CNN approach is significantly more effective than the ANN and SVM approaches, ranking 71% higher. The proposed architecture has three major components: image pre-processing, feature extraction, and classification. The purpose of this study is to significantly improve the accuracy of handwritten digit recognition. As will be demonstrated, pre-processing and feature extraction are essential to obtaining the best accuracy.


I. INTRODUCTION
Pattern recognition and machine learning face significant challenges arising from modern technologies and streaming news sources [1]. Handwritten character recognition is becoming a popular research area owing to technological advances such as handwriting capture systems and powerful handheld computers [2], [3]. Because handwriting is highly variable, it is challenging to build a high-reliability recognition system capable of identifying any handwritten character given to an application. This paper addresses the difficulty of reading handwritten digits, i.e., the numbers 0 to 9. Handwritten digit recognition is a required functionality in various practical applications, including administration and finance [4]. These applications need a high recognition rate and the highest possible level of reliability. Unconstrained handwritten number recognition has been applied, with remarkable results, to reference numbers on receipts, handwritten records such as tax returns, and postal codes on postcards [5]-[7]. The term "constraint awareness" applies to an individual's belief that forces outside their influence constrain their actions. An unconstrained recognition method has several stages: pre-processing, feature extraction, classification, validation, and verification. Optical character recognition (OCR) is a subfield of artificial intelligence, and character recognition covers a wide variety of research fields [8]. OCR has many applications, for example reading images of authentication codes, automatic license plate recognition, and text content retrieval [9]. Studies of OCR systems have also identified several features for handwritten digit recognition. Although most of these features are standard, some systems use specialized features to increase classification performance. These include graphical methods and shadow-based and gradient-based features [10].
Although some researchers have examined images of handwritten numbers, only a few have discussed image pre-processing. For example, [11] proposed a hybrid model that combined two strong classifiers, a CNN and a Support Vector Machine (SVM); evaluated without pre-processing on the MNIST database, it reached a recognition accuracy of 94.4% with a 5.6% rejection rate. Image pre-processing methods such as scanning, segmentation, normalization, thinning, and inverted rectification can significantly affect image characteristics and results. Most image pre-processing techniques suppress noise and restore images, which makes the image easier to manipulate and further improves OCR accuracy.
Additionally, the handwriting of different individuals is more or less slanted. To address this, the image is transformed using elastic compression, which allows two samples representing the same digit to be matched. For decades, handwritten digit recognition has been studied extensively in OCR using several frameworks and classification algorithms. These include, but are not limited to, SVM, CNN, and Random Forest (RF) classifiers. Even so, the accuracy reported for identifying individual digits is usually about 95%. Because some classifiers cannot handle the original images or descriptions directly, feature extraction is a primary step for reducing the size of the data and abstracting the underlying content [12], [13].
Handwritten character recognition is a growing area of study that includes artificial intelligence, machine vision, and pattern recognition. A handwriting recognition algorithm can learn and detect characteristics from images and touch-screen devices and translate them to a machine-readable format.
Handwriting recognition systems are classified into two types: online and off-line. Both types may be used in applications that learn dynamically from user input while also learning off-line from data in parallel. Statistical techniques, structural methods, neural networks, and syntactic methods have all been used in online and off-line handwriting recognition. Some recognition systems identify strokes, while others recognize single characters or whole phrases. One example is a handwritten character recognition method based on neural networks with feature extraction [14].
The main aim is to use machine learning methods to explore automatic handwritten digit recognition. This study examines the recognition of distorted handwritten numbers. Because handwritten numbers vary widely in style, this study is primarily concerned with classifying handwritten numbers. Machine learning algorithms do very well with handwritten digits when the digits are well segmented, but existing methods have low segmentation performance, which reduces recognition when digit strings are considered. As a result, accurate segmentation techniques for digit strings in documents are critical for increasing the recognition rate of handwritten digit strings [15]. The success of the water reservoir method in handwritten digit recognition encouraged us to use it to assess vertical cuts for digit string segmentation by training on the regions between numbers [16].
According to Arthur Samuel, machine learning is a field of study that allows machines to learn without being explicitly programmed. This study uses such algorithms to learn from and make predictions on different datasets. Machine learning algorithms are used for tasks that are difficult to program by hand. These tasks include cybercrime monitoring, machine vision, global population prediction, email filtering, weather forecasting, OCR, diagnosis, and real-time decision-making. Machine learning methods are classified into separate groups, including supervised learning and unsupervised learning. In supervised learning, a labelled dataset provides guidance, and assumptions about the form of the output data can be made. There is a relationship between the inputs and the outputs, and the output can be estimated from the given data [17], [18].

II. RELATED WORKS
This section reviews work related to this study. Our work on digit recognition using machine learning techniques such as the support vector machine (SVM), the artificial neural network (ANN), and the convolutional neural network (CNN) was influenced by many similar studies [19]. The recognition of noisy digits is improved by using these three classifiers (SVM, ANN, and CNN). Prior work showed that SVM, ANN, and CNN systems can accurately recognize handwritten digits on document images [20]. In this study, these techniques are used to determine the best algorithm for handwritten digit recognition. Research in this area has identified certain pitfalls, so a preliminary study is necessary to understand prior work on segmentation techniques and the limits of current machine learning methods [21]. The literature shows a considerable increase in research on pre-processing, segmentation, feature extraction using specific techniques, and classification for digit recognition. The authors of [22] studied "Handwritten Word Recognition by Multi-view Analysis." Their study addresses the challenge of correctly separating handwritten words from a small vocabulary. The authors devised a method that evaluates words at three different estimation thresholds to arrive at a statistical response modelled on the human learning process. The authors of [24] studied "Handwriting Recognition on Type Documents." They used the Freeman chain code, which divides a field into nine subregions, with histogram normalization of the chain code for feature extraction, and artificial neural networks to classify the characters on the type documents. The authors of [25], [26] published a study on "Neural Networks for Handwritten English Alphabet Recognition."
They used neural networks to build a system that could recognize handwritten English letters. Each letter of the alphabet is represented by binary values that are passed through a basic feature extraction scheme and then fed into the neural network. The authors of [27] derived the properties of numerical and logical operators. They used SVM to identify and remove noise from the results. A feature extraction technique was applied to the NIST dataset, including uppercase, lowercase, and mixed upper- and lowercase characters. The authors of [28], "Sunspot drawings handwritten character recognition scheme based on deep learning," described a framework for using deep learning to recognize handwritten characters in scanned sunspot drawings. A convolutional neural network (CNN) performs the recognition on the handwritten text images. CNNs are a class of deep learning algorithms that are particularly well suited to training multi-layer neural networks. The proposed method targets sunspot drawings from the Chinese Academy, Yunnan, and the experimental findings show that it achieves a high recognition accuracy. The authors of [29], "A novel method for segmentation and acknowledgment of unconstrained handwritten numeral strings," introduced a new scheme for the segmentation and recognition of unconstrained handwritten numeral strings. The proposed system segments the input digits by combining components of the original image. The authors of [30] offer a method for extracting features from English handwritten characters. The data are classified according to their similarity to the feature vectors used in training and analysis. The authors of [31], "Modern, efficient algorithm for recognizing handwritten Hindi digits," propose a new algorithm for recognizing handwritten Hindi digits based on extracting a range of features from the topological and analytical attributes of the given digits.
The authors of [32], [33], in "Post-processing for off-line Chinese handwritten character string recognition," used pattern recognition to perform off-line Chinese handwritten character recognition. The challenging problems discussed in that analysis are an unconstrained writing style, a large variety of character types, and distinct geometric features for recognition. To address these problems, post-processing was used to improve the accuracy of character recognition.

A. Character Recognition Algorithms
Character recognition algorithms consist of three stages: image pre-processing, feature extraction, and classification. The stages are typically applied in series: pre-processing facilitates feature extraction, which in turn is needed for accurate classification. They operate as shown in Fig. 1:

B. Image Pre-processing
Image pre-processing is a critical part of the recognition pipeline for accurate character prediction. Noise reduction, image segmentation, cropping, scaling, and other techniques are often used. The recognition system accepts a scanned file as its initial input; JPG or BMP files are appropriate.
The digital capturing and transfer of an image also creates noise, making it difficult to determine what is a part of the object of interest. Given the issue of character recognition, we want to reduce as much noise as possible while retaining the character strokes necessary for proper classification.

C. Segmentation
During the segmentation stage, an image of a sequence of characters is divided into sub-images, each containing an individual character. Each character is then resized to 30×20 pixels.

D. Feature Extraction
The characteristics of input data are the observable properties of observations used to analyze or distinguish certain data instances. Feature extraction aims to find relevant features that distinguish cases that are unrelated to one another.

E. Classification and Recognition
This is the decision-making stage of the recognition system. The classifier has two hidden layers and is trained with a log-sigmoid activation function.
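The forward pass of such a classifier can be sketched as follows. This is a minimal illustration, not the configuration used in the study: the input size of 600 follows from the 30×20 character images mentioned above, while the hidden-layer sizes (64 and 32), the ten output classes, and the random initial weights are assumptions made only for the example.

```python
import numpy as np

def logsig(z):
    """Log-sigmoid activation: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(sizes, seed=0):
    """Create (weights, bias) pairs for consecutive layer sizes (hypothetical init)."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, params):
    """Propagate a batch of flattened images through every layer."""
    a = x
    for w, b in params:
        a = logsig(a @ w + b)
    return a

# 600 inputs (30x20 pixels), two hidden layers, 10 outputs (assumed sizes).
params = init_params([600, 64, 32, 10])
x = np.zeros((1, 600))          # one blank, flattened 30x20 image
scores = forward(x, params)     # one score per class, each in (0, 1)
```

The class with the largest score would be taken as the prediction; training (for example by backpropagation) is omitted here.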

F. Continuous Recognition of Hand-written Words Using Neural Networks
A method for the continuous recognition of handwritten words divides each word into triplets, each containing three letters. Two neighboring triplets share a pair of letters, so consecutive triplets overlap. The major challenge that recognition systems face is running operations at constant time intervals. The redundancy introduced by this overlap increases the recognition rate.
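The triplet decomposition can be illustrated with a short sketch. The function names `triplets` and `merge_triplets` are hypothetical, and the example works on text strings only to show the overlap structure; in a real recognizer each triplet would be a region of the word image.

```python
def triplets(word):
    """Split a word into overlapping three-letter groups (stride of one letter)."""
    return [word[i:i + 3] for i in range(max(len(word) - 2, 0))]

def merge_triplets(groups):
    """Rebuild the word: keep the first triplet, then the last letter of each next one."""
    if not groups:
        return ""
    return groups[0] + "".join(g[-1] for g in groups[1:])

parts = triplets("digit")
# Each pair of neighboring triplets shares two letters: "dig"/"igi" share "ig".
rebuilt = merge_triplets(parts)
```

Because every interior letter appears in up to three triplets, a letter misread in one triplet can be checked against its neighbors, which is the redundancy the text refers to.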

III. METHODOLOGY
This section discusses the methods and strategies used to build the models and how the models are trained and tested. It goes through the algorithms used and illustrates the proposed scheme.
Fig. 2: Block diagram of the proposed process.
Two distinct modes of analysis were used in this study: a literature review and an experiment.
Initially, a literature review was carried out to address RQ1 and determine the kind of data required to train and test machine learning methods. The aim of the literature review was to gain information about machine learning datasets and become familiar with the machine learning techniques available for training on them. The review also familiarized us with the databases used for research and evaluation, and gave us an understanding of the data pre-processing, segmentation, and machine learning methods used in the analysis. The following steps were taken in performing the literature review:
Step 1: Certain keywords were identified before generating the search string: "Handwritten Digit Recognition," "Handwritten Digit Segmentation," "Handwritten Digit Classification," "Machine Learning Methods," "Deep Learning," "Image Analysis on Text Files," "Support Vector Machine," "Artificial Neural Networks," "Convolutional Neural Networks," and "Preprocessing Handwritten Digits."
Step 2: Primary keywords were selected from the list of keywords and used to build the search string.
Step 3: The following search strings were generated to run the search in different digital repositories. The first search string is "Automatic handwriting recognition system and tool." The second search string is "Hand-written digit identification." The third search string is "Hand-written digit recognition neural network." The fourth search string is "classifier methods handwriting digit identification." The fifth search string is "Hand-written digit segmentation and identification." The sixth search string is "Hand-written digit recognition utilizing Deep Learning techniques." The seventh search string is "most powerful strategies for identifying handwritten."
Step 4: After compiling a list of articles, journals, and conference papers, inclusion and exclusion criteria were used to narrow the scope of the results. Inclusion criteria: the article must be from the past two decades; the title and abstract must refer to the problem domain; the document must be fully accessible in English. Exclusion criteria: no full-text paper is available online; the article is not published in English; the article does not pertain to computer science. Also excluded were reports on the identification of single digits, articles on handwritten digit identification using a biologically motivated hierarchical temporal memory model, articles on handwritten digits based on mathematical morphology, and articles on rule-based decision fusion.
Step 5: The experiment's findings section details the different machine learning methods applied to our study. While capturing related and background work, diverse data sets were analyzed, pre-processed, and segmented.
Following the collection of the necessary data from the literature review, data processing is performed. The literature was reviewed and the data analyzed using narrative synthesis. During the literature review, data were collected and summarized in a paragraph. The data collection results were recorded and incorporated into the research testing process.
Recognition of Handwritten Digits
The proposed handwritten digit recognition scheme takes an input image, pre-processes it, segments it, extracts its features, and classifies each digit based on the extracted features [34], [35]. A brief description of the feature generation process, the classifiers, and the digit recognition technique follows.
Fig. 3: A method for recognizing handwritten digits.

A. Water Reservoir Segmentation
In [36], a new system for segmenting touching handwritten numerals was introduced based on a water reservoir principle. The principle is used to identify the points where two or more digits touch or overlap. Water is poured from the top and from the bottom of the image, and the places where the water accumulates form the water reservoirs; the points where water is deposited are called reservoir points. These reservoir points are used to calculate segmentation points without the need to normalize or thin the data. The reservoir points are used to decide whether the obtained image is connected or separated, and all connected sub-images are extracted; in this way the touching numerals are segmented. A large reservoir is formed when two digits touch each other, and the reservoir determines the cutting lines. Reservoir attributes such as the center of gravity and height are considered when selecting a particular cutting point [37]. Initially, the reservoirs are determined from the shape of the image of the touching digits. These reservoirs are divided into top and bottom reservoirs [38], [39]. The shape of the reservoir and the position of the reservoir's base define the touching positions of the digits. Height, touching location, center of gravity, and closed loops in the reservoir are all factors in determining the best cutting point [40].
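A heavily simplified sketch of the top-reservoir idea is given below. It is not the algorithm of [36]: it reduces the image to a one-dimensional upper profile, ignores overhangs and closed loops, and treats the deepest column of each reservoir as a candidate cutting position. The function name `top_reservoirs` and the returned fields are hypothetical.

```python
import numpy as np

def top_reservoirs(ink):
    """Find cavities that would hold water poured from the top.

    ink: 2-D boolean array, True where a stroke pixel is present.
    Returns one dict per reservoir with its column span, base column,
    height (maximum water depth), and area (total water depth).
    """
    h, w = ink.shape
    has_ink = ink.any(axis=0)
    first_row = np.where(has_ink, ink.argmax(axis=0), h)  # top-most stroke row
    elevation = h - first_row                             # 0 for empty columns
    left = np.maximum.accumulate(elevation)
    right = np.maximum.accumulate(elevation[::-1])[::-1]
    depth = np.minimum(left, right) - elevation           # water depth per column

    reservoirs, c = [], 0
    while c < w:
        if depth[c] > 0:
            start = c
            while c < w and depth[c] > 0:
                c += 1
            span = depth[start:c]
            reservoirs.append({
                "start": start, "end": c - 1,
                "base_col": start + int(np.argmax(span)),
                "height": int(span.max()), "area": int(span.sum()),
            })
        else:
            c += 1
    return reservoirs

# Two tall strokes joined by a low bridge at column 3.
ink = np.zeros((6, 7), dtype=bool)
ink[1:, 0:3] = True
ink[4:, 3] = True
ink[1:, 4:7] = True
found = top_reservoirs(ink)
```

Bottom reservoirs can be obtained in this simplified model by calling `top_reservoirs(ink[::-1])`, that is, by flipping the image vertically.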

B. Extraction of Characteristics
Feature extraction is a technique for extracting the characteristics of characters from a reference image [41]. Feature extraction may be divided into two categories:
1. Statistical feature extraction. In statistical feature extraction, the feature vector is the sum of all the features extracted from each character and the correlation feature vectors between feature positions in the character image matrix.
2. Structural feature extraction. When a character's morphological features are extracted from an image matrix, the process is referred to as structural feature extraction. It accounts for ranges, circumference, regions, and other factors.
Feature extraction functions are used for indexing and labelling the dataset and for classifying and identifying handwritten digits.

C. Dataset Used
The dataset is needed for both training and testing [42]. It consists of color images and contains a total of 9096 images. We used 70% of the available images for training and the remaining 30% for testing.
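For concreteness, a 70/30 split of 9096 images gives 6367 training images and 2729 test images when the training count is rounded down. A minimal sketch of such a split follows; the shuffling, the seed, and the function name `split_indices` are assumptions, since the text does not say how the images were assigned.

```python
import numpy as np

def split_indices(n_images=9096, train_fraction=0.7, seed=0):
    """Shuffle image indices and cut them into training and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_images)
    n_train = int(n_images * train_fraction)  # rounded down
    return order[:n_train], order[n_train:]

train_idx, test_idx = split_indices()
```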

D. Feature Generation
Many strategies for feature generation have been suggested [43], [44]. In the proposed scheme, we use a variety of features to increase the recognition rate of handwritten digits by minimizing intra-class variability and maximizing inter-class variability. Certain global features, projection-based features, and features computed from the digit's contour and skeleton were included in the study.
Recognition of connected digits: every image is segmented into a sequence of connected-digit candidates, and each candidate (GC) is treated as a segmentation hypothesis that is expected to contain a digit or a fragment of a digit [45], [46]. If a GC is accepted, it is considered a digit; otherwise, it is considered a non-digit. To coordinate segments using GCA and DRV at the same time, the following perceptual rules are applied:
Accept if GC intersects the median line and fmax(XGC) >= tf
Reject otherwise
Fig. 5 shows correct and incorrect segmentation-recognition outcomes, which indicate whether the system's segmentation and recognition are working properly.

IV. RESULTS AND ANALYSIS
This section covers the techniques used for digit image segmentation and digit image recognition. After digitization, the image is converted to grayscale. Binary images are produced with Otsu's method, and each image is resized at a fixed aspect ratio so that all images have the same length and width. Finally, a morphological operator is used to remove noise from the image. After the pre-processing phase, the segmentation phase is carried out. We compared recognition techniques using the water reservoir algorithm, with the drop-fall algorithm used as the reference point for comparison. The segmented images are then passed to the pre-processing part of the digit recognition module, which generates classifier-compatible inputs; this part consists primarily of center-of-mass extraction and normalization. Finally, the digit classifier is used to identify the segmented digits. This section compares support vector machines, artificial neural networks, and convolutional neural networks to determine the best algorithm for high-performance recognition. The modules discussed above are described in detail in this section before the tests are presented.
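Otsu's method chooses the gray level that maximizes the between-class variance of the two pixel groups it separates. The sketch below is a plain NumPy illustration of that idea, assuming an 8-bit grayscale image; it is not taken from the authors' implementation, and the function name `otsu_threshold` is hypothetical.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the 8-bit threshold t that maximizes between-class variance.

    Pixels with value <= t form one class and pixels > t form the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of the lower class
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean of the lower class
    mu_total = mu[-1]
    numerator = (mu_total * omega - mu) ** 2
    denominator = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denominator > 1e-12                # skip thresholds with an empty class
    sigma_b[valid] = numerator[valid] / denominator[valid]
    return int(np.argmax(sigma_b))

# Dark strokes (value 50) on a light background (value 200).
gray = np.full((20, 20), 200, dtype=np.uint8)
gray[5:15, 8:12] = 50
t = otsu_threshold(gray)
strokes = gray <= t     # True where the image is darker than the threshold
```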

A. Digits Segmentation Custom Dataset
Segmentation is performed with the water reservoir method, and a dataset of digits and non-digits is used to validate the segmentation algorithms. The custom dataset is generated by extracting sample images from the CVL-Strings repository. We used a software program to carry out this processing. The module first processes the input images with the same pre-processing algorithms that are used by the main segmentation and recognition algorithms. A window then slides across the image with a horizontal step of ten pixels. The window has a height of 100 pixels and a width of 10 pixels, because the digit string images are always normalized to a height of 100 pixels. At each stop, the user names the sub-image covered by the window.
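The window extraction described above can be sketched as follows, assuming the image is a 2-D array that has already been normalized to a height of 100 pixels. The helper names `target_size` and `sliding_windows` are hypothetical, and dropping the leftover columns at the right edge is a choice made for this example.

```python
import numpy as np

def target_size(height, width, target_height=100):
    """Size after scaling to the target height while keeping the aspect ratio."""
    scale = target_height / height
    return target_height, max(1, round(width * scale))

def sliding_windows(image, win_width=10, step=10):
    """Yield (x offset, sub-image) for each window position, left to right."""
    _, w = image.shape
    for x in range(0, w - win_width + 1, step):
        yield x, image[:, x:x + win_width]

new_h, new_w = target_size(50, 120)     # a 50x120 image becomes 100x240
image = np.zeros((100, 45))
windows = list(sliding_windows(image))  # offsets 0, 10, 20, 30; last 5 columns unused
```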

B. Digit Recognition Custom Dataset
The custom dataset is used to evaluate the performance of the digit recognition algorithms. Because the experiments are carried out with the CVL Strings database, the recognition algorithms are trained and evaluated on individual digits taken from that database, which allows a fair comparison of SVM, ANN, and CNN classifier performance. Since the source database is a collection of digit strings, it was first segmented. The outputs of the segmentation module are passed to the digit recognition module, which was trained on the individual-digit database, and the individual digits are classified. Correctly labelled digits are then saved to a .mat file to form a new custom collection of digits. SVM and ANN with HOG feature extraction, and CNN, are used for the digit recognition module, as they have the best accuracy scores.
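The HOG features fed to the SVM and ANN summarize local gradient directions. The sketch below shows the core idea in simplified form: it builds one orientation histogram per cell and normalizes the whole vector once, whereas standard HOG normalizes over overlapping blocks. The cell size of 8 pixels, the 9 orientation bins, and the function name `hog_features` are assumptions, not values reported in the paper.

```python
import numpy as np

def hog_features(gray, cell=8, bins=9):
    """Concatenate magnitude-weighted orientation histograms of non-overlapping cells."""
    gray = gray.astype(float)
    gy, gx = np.gradient(gray)                      # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0    # unsigned orientation
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    h, w = gray.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            hist = np.bincount(bin_idx[r:r + cell, c:c + cell].ravel(),
                               weights=mag[r:r + cell, c:c + cell].ravel(),
                               minlength=bins)
            feats.append(hist)
    vec = np.concatenate(feats)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# A vertical edge: all gradient energy falls into the 0-degree bin.
img = np.zeros((16, 16))
img[:, 8:] = 255
vec = hog_features(img)
```

The resulting vector would then be passed to the classifier in place of the raw pixels.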

C. Pre-processing
In the pre-processing of the dataset, the color input images of handwritten digits are first converted to grayscale. Each image is then resized while maintaining its aspect ratio. The input images vary in vertical length, so they must be brought to a common height for the segmentation and classification modules to function correctly. Because the segmentation module generates a series of window images with the same dimensions, equalizing the heights is sufficient. The height is set to h=100, and the horizontal length is adjusted according to the aspect ratio of the image. The gray image is thresholded using the Otsu threshold [47], [48]. Finally, morphological processing techniques are applied to remove noise. The steps are as follows.
1. Convert the RGB image to a gray image. To obtain the binary image, the RGB image must first be converted to a gray image.
2. Apply a 3×3 blur filter. A blur filter is used to remove Gaussian noise from the gray image.
3. Convert the gray image to a binary image. The gray image must be separated into foreground and background, where the foreground consists of the letters extracted from the image. Thresholding is used because it is the simplest way to create a binary image, and we use the adaptive thresholding approach. Thresholding aims to classify each pixel as "dark" or "light." Adaptive thresholding is a form of thresholding that takes spatial variations in illumination into account. We use a technique for real-time adaptive thresholding based on the integral image of the input.
4. Remove regions with an area smaller than 150 pixels.
5. Merge the separate parts of each letter using a block size of 8×4 pixels.
6. Remove remaining noise such as lines. To improve recognition accuracy, only the essential regions should be kept, but the input images contain noise in the foreground, as can be seen from the test images. We therefore remove this noise from the images as follows. First, compute the histogram along the horizontal axis. If the peak value of the histogram is greater than 80, we infer that the image contains a line, and we move upward from the index of the histogram maximum for as long as the histogram value decreases. When the histogram value is greater than or equal to the previous histogram value, we select that point as the cutting position. We then crop the image at the y-coordinate obtained. The findings can be seen below.
7. Use an 8-connected path search to identify each letter. As seen in the figure below, three symbols are identified.
8. Create a separate image for each letter. Finally, order the letter images by the x-coordinate of each letter, because the labelling can change the letter sequence.
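Steps 4, 7, and 8 can be sketched together: label the 8-connected regions of the binary image, drop regions smaller than 150 pixels, and return the remaining letters ordered from left to right. This is an illustrative sketch only; the breadth-first search, the bounding-box cropping (which may include pixels of a neighboring region), and the function name `extract_letters` are assumptions.

```python
from collections import deque
import numpy as np

def extract_letters(binary, min_area=150):
    """Return cropped 8-connected regions of at least min_area pixels, left to right."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for r0 in range(h):
        for c0 in range(w):
            if not binary[r0, c0] or seen[r0, c0]:
                continue
            queue = deque([(r0, c0)])
            seen[r0, c0] = True
            area = 0
            rmin, rmax, cmin, cmax = r0, r0, c0, c0
            while queue:
                r, c = queue.popleft()
                area += 1
                rmin, rmax = min(rmin, r), max(rmax, r)
                cmin, cmax = min(cmin, c), max(cmax, c)
                for dr in (-1, 0, 1):          # visit all 8 neighbors
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < h and 0 <= cc < w
                                and binary[rr, cc] and not seen[rr, cc]):
                            seen[rr, cc] = True
                            queue.append((rr, cc))
            if area >= min_area:               # step 4: discard small regions
                boxes.append((cmin, rmin, cmax, rmax))
    boxes.sort()                               # step 8: order by x-coordinate
    return [binary[r1:r2 + 1, c1:c2 + 1] for c1, r1, c2, r2 in boxes]

page = np.zeros((40, 60), dtype=bool)
page[5:21, 30:46] = True     # region found first in scan order (256 pixels)
page[10:26, 2:15] = True     # left-most region (208 pixels)
page[35:38, 50:53] = True    # 9-pixel speck, removed as noise
letters = extract_letters(page)
```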

V. CONTRIBUTION
The primary objective of this research is to use a recognition system to identify distorted handwritten digits. Sub-images of attached digits, disjoint digits, and overlapped digits were created. A classifier was then trained on these sub-images and used to classify the individual digits in an image. This research demonstrates that machine learning approaches can be applied effectively to handwritten digits.
Consequently, future research on automated handwritten digit recognition, or on handwriting recognition and classification, can use our study as a starting point. Apart from that, the study's main contribution is introducing certain basic principles for the segmentation and interpretation of data. The experimental results show that a convolutional neural network is the best algorithm or classifier for automatically identifying handwritten digits on document images, and that its accuracy rate is high compared with the SVM and ANN algorithms.

VI. CONCLUSIONS
This study addresses the automated recognition of handwritten digits on document images using machine learning approaches. This is accomplished with a variety of techniques, including HOG, Otsu's method, segmentation, and recognition. For segmentation, we used several techniques, including vertical projection histograms, component analysis (PCA), segmented component analysis, and digit recognition. When segmenting handwritten digit strings with joined, overlapping, and disjoint digits, this approach has the benefit of yielding precise segment information. Three distinct variables determine the accuracy of the proposed recognition of handwritten digits on document images. The main objective of this research is to construct an automatic digit recognition system for document images. Segmenting adjacent digits is a major hurdle because of disjoint digits, overlapping digits, similar digits, digit ambiguity, unknown string length, and over- and under-segmentation. In addition, the architecture of the system is easy to understand. The water reservoir method is used to segment two connected handwritten digits, and the SVM, ANN, and CNN methods are compared to determine the appropriate approach for separating the two digits.
Handwritten digit string recognition is a three-step process: the water reservoir mechanism segments the connected digits, features are generated for the separated digits, and the segmented digits are recognized. The limitations included classifier reliability, the absence of obvious segmentation points on attached digits, and the preparation time required for performing statistical analysis on all possible combinations of the experiment variables. When SVM, ANN, and CNN are used for digit recognition, the difficulty lies in combining them with certain heuristic rules. In some cases, the unknown length of the digit strings or images makes recognition harder for the classifiers [49], [50]. However, by combining HVP and CA, the proposed framework enables machine learning techniques to recognize handwritten digits on document images automatically.
Although a new approach to cutting or segmenting digit strings has been proposed, the method has several shortcomings that need to be addressed. Further work is therefore warranted, including the following. To improve segmentation accuracy, multiple classification models can be used simultaneously. To simplify the process, it is preferable to minimize the number of hypotheses. To speed up the computation, it is advisable to use more advanced filters that exclude redundant segmentation hypotheses. Handwritten digit recognition nevertheless remains challenging.