From the very beginning, one of our goals at Lexunit was to become a major player in the field of Artificial Intelligence in Hungary. Ádám and István have been developing their expertise in AI ever since the possibilities of this field began to take form in contemporary IT.
After playing around with our own Pac-Man and Pong clones, we began to look for a real-life challenge. If we keep our eyes peeled, it's not difficult to find opportunities where Artificial Intelligence can provide help. It didn’t take long for the first idea to come up.
Our First AI Idea
At that time, we were developing a cloud-based system for accounting firms with features that allow their clients to register, a payment module and all sorts of other useful built-in tools. This is where the idea came from: "Let's create an invoice interpretation system for this package!"
Our theory was that a large chunk of accountants' working time is consumed by the monotonous task of entering invoice data, collected in many different formats, into their own system, so they practically have to retype everything. If our software could reliably recognize every data type through scanning and digital image analysis and sort it into the right category, we could make accountants' jobs much easier.
Invoices typically contain the following data:
● A unique invoice number (e.g. SA2019-01)
● Invoice date (e.g. 02/15/2019)
● Supply date (e.g. 02/15/2019)
● Payment deadline (e.g. 02/23/2019)
● Currency (e.g. HUF, EUR)
● Payment method (e.g. cash, bank transfer)
● Tax ID number (e.g. 25926585-2-42)
● Net amount (e.g. HUF 100,000,000)
● VAT rate (e.g. 27%)
● Total VAT (e.g. HUF 27,000,000)
● Gross amount (e.g. HUF 127,000,000)
● Company name (e.g. Lexunit Group Kft.)
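To make the list above concrete, here is a minimal sketch of how these fields could be held in code. The class name, field names and types are our own assumptions for illustration, not the structure SzamlAI actually uses.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class Invoice:
    """Hypothetical container for the invoice fields listed above."""
    invoice_number: str
    invoice_date: date
    supply_date: date
    payment_deadline: date
    currency: str
    payment_method: str
    tax_id: str
    net_amount: Decimal
    vat_rate: Decimal  # 0.27 stands for 27%
    total_vat: Decimal
    gross_amount: Decimal
    company_name: str


# Filled in with the example values from the list.
example = Invoice(
    invoice_number="SA2019-01",
    invoice_date=date(2019, 2, 15),
    supply_date=date(2019, 2, 15),
    payment_deadline=date(2019, 2, 23),
    currency="HUF",
    payment_method="bank transfer",
    tax_id="25926585-2-42",
    net_amount=Decimal("100000000"),
    vat_rate=Decimal("0.27"),
    total_vat=Decimal("27000000"),
    gross_amount=Decimal("127000000"),
    company_name="Lexunit Group Kft.",
)
```

Amounts are stored as Decimal rather than float so that sums such as net plus VAT stay exact.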
Practice and Testing in Software Development
Our goal at that time was only to practice and test problem-solving methods, so we had no idea what a jackpot we'd hit with this concept.
We received roughly thirty responses from contractor accountants to our in-house online survey, and we saw amazing numbers: It's not uncommon for accountants to process thousands or even ten thousand invoices a month, and the administration can take up to half of their working hours. Sure, they can hire an intern or use other half-solutions, but what about high-workload periods such as tax return deadlines, or high-profile projects where accountants would want to take personal responsibility for all the data?
In practice, the solution is to process the digital images of electronic invoices; printed ones must first go through industrial scanning. The procedure produces an image file, which our software processes once it's uploaded, sorting its data into the appropriate categories. The accountant then double-checks the result and corrects it where necessary, because the AI-based NLP technology, which we'll discuss later, will always make a few errors.
But why do we need this process in the first place? Invoice processing could be a simple programming task, but only in an ideal, imaginary world where invoice formats are perfectly standardized, i.e. where all official documents look exactly the same and the only different thing about them is their data content; they have the same blanks, the same font, the same colors, and so on. Then we could extract all the data with an “if/else” solution, which would be little different from pressing “Ctrl-F” when browsing an online text.
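As a rough illustration of that imaginary world, the sketch below extracts fields from a perfectly standardized invoice with fixed patterns. The labels ("Invoice number:", "Invoice date:", "Gross amount:") and the function name are assumptions made up for this example.

```python
import re

# Hypothetical fixed layout: every invoice uses exactly these labels.
FIXED_PATTERNS = {
    "invoice_number": re.compile(r"Invoice number:\s*(\S+)"),
    "invoice_date": re.compile(r"Invoice date:\s*(\d{2}/\d{2}/\d{4})"),
    "gross_amount": re.compile(r"Gross amount:\s*HUF\s*([\d,]+)"),
}


def extract_fixed_format(text: str) -> dict:
    """Look up each field by its fixed label; None if the label is missing."""
    result = {}
    for field, pattern in FIXED_PATTERNS.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


sample = """Invoice number: SA2019-01
Invoice date: 02/15/2019
Gross amount: HUF 127,000,000"""

fields = extract_fixed_format(sample)
```

The approach works only as long as every label is spelled and placed exactly as expected; an invoice that says "Inv. no." instead yields None for that field.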
In reality, however, the stack of invoices passing through the hands of an average accountant is a stochastic mess. You roughly know what to expect, but in practice every one of them might look different.
It's better to process such material with Artificial Intelligence because, despite the few errors we mentioned previously, in the long run a human doing the same job would make more of them.
And, of course, there’s an added benefit, the reason why we started the whole thing: with our software, the time spent on invoice processing is cut by a factor of four to five, counting every step, even revisions and fixes.
How Can the Process Be Refined to Decrease the Number of Errors?
This is where machine learning comes in, when we “rewire the threads of the neural network”. (Software engineers can come up with such wonderful metaphors if they spend enough time near the Black Box.)
But let's start with the basics. The data content of an ordinary invoice includes not only the numbers and letters on it, but also their size and location, as these dimensions also carry information from which the software can deduce where to categorize that data.
Further reading: “Chargrid: Towards Understanding 2D Documents” and “Scaling Up Machine Learning Algorithm for Form Recognition”.
In the past, one possible solution was heuristics: developers collected rules based on, for example, the relative position of words and numbers, or the observation that a number equal to the sum of previously recorded numbers is probably a grand total. The problem is that even with millions of such rules, the process never becomes fully automated, and the rule set is tedious to maintain.
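The grand-total rule mentioned above could look roughly like the sketch below. The function name is hypothetical and the brute-force search over combinations is our own simplification, shown only to give a feel for how such hand-written rules work.

```python
from itertools import combinations
from typing import Optional, Sequence


def guess_grand_total(amounts: Sequence[int]) -> Optional[int]:
    """Return the first amount that equals the sum of two or more
    amounts recorded before it, or None if there is no such amount."""
    for i, candidate in enumerate(amounts):
        earlier = amounts[:i]
        for size in range(2, len(earlier) + 1):
            if any(sum(combo) == candidate
                   for combo in combinations(earlier, size)):
                return candidate
    return None


# Net amount, VAT and gross amount in reading order.
total = guess_grand_total([100_000_000, 27_000_000, 127_000_000])
```

Even this single rule hides edge cases (subtotals, discounts, rounding), which hints at why a rule set of this kind grows without ever becoming complete.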
The point is to look for identification opportunities: is this a concrete number or a cardinal? Do these numbers correspond to a date format? Where is this word located? Does it stand alone or is it followed by other words? These are called “features”: the characteristics of a given data item.
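A minimal sketch of such feature extraction is shown below. The Token class, the normalized page coordinates and the exact feature names are assumptions for illustration; the real feature set is not described in this article.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    """One recognized word with its (assumed) normalized page position."""
    text: str
    x: float  # 0.0 = left edge, 1.0 = right edge
    y: float  # 0.0 = top edge, 1.0 = bottom edge


DATE_RE = re.compile(r"^\d{2}/\d{2}/\d{4}$")
NUMBER_RE = re.compile(r"^\d[\d,. ]*$")


def token_features(tokens, index):
    """Describe one token by the questions asked in the text above."""
    token = tokens[index]
    has_next = index + 1 < len(tokens)
    # "Followed by another word" here means: a next token on the same line.
    same_line = has_next and abs(tokens[index + 1].y - token.y) < 0.01
    return {
        "is_number": bool(NUMBER_RE.match(token.text)),
        "looks_like_date": bool(DATE_RE.match(token.text)),
        "x": token.x,
        "y": token.y,
        "followed_by_word": same_line,
    }


tokens = [
    Token("Date:", 0.1, 0.2),
    Token("02/15/2019", 0.3, 0.2),
    Token("127,000,000", 0.8, 0.9),
]
features = [token_features(tokens, i) for i in range(len(tokens))]
```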
From the feature data, we generate sequences, which are processed by the machine learning model, whose central element is the neural network. For each word, the model outputs a probability distribution over the data types it might correspond to. Tax number or grand total? VAT or invoice number?
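To show what a probability distribution per word means in practice, here is a small sketch that turns raw scores into probabilities with a softmax. Softmax is a common choice for the last step of a classifier, but it is our assumption here, as are the label names and the made-up scores.

```python
import math

# Hypothetical label set; the real system's categories may differ.
LABELS = ["tax_number", "grand_total", "vat", "invoice_number", "other"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum avoids overflow
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_distribution(scores):
    """Pair each label with its probability."""
    return dict(zip(LABELS, softmax(scores)))


raw_scores = [0.2, 3.1, 1.0, -0.5, 0.0]  # made-up scores for one word
distribution = label_distribution(raw_scores)
best_label = max(distribution, key=distribution.get)
```

Keeping the whole distribution, not just the best label, makes it possible to flag low-confidence words for the accountant's review.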
The end result can, of course, be edited by the accountant before approval, and the great thing about this method is that corrections provide new training data for the system. Each manual correction increases the accuracy of the following results.
The Success of Prediction
One of the keys to successful prediction is clean data, and this often gives us plenty to think about: three-quarters of working hours are spent making sure the input data is of the highest possible quality. We've already mentioned that in an ideal world, every invoice would have exactly the same format. Let's take the basic invoice format of a Hungarian invoice management website, szamlazz.hu, as a starting point. If all invoices were like this, 100% recognition accuracy could be reached fairly quickly, because there would be no noise and no variance in the input data. But if we train our system only on szamlazz.hu invoices, it will recognize that format very skillfully yet fail on others.
For the sake of example, let's suppose we're developing a system that recognizes vehicles in photos. If we show the AI 1,000 images during training, 950 of which depict cars, and then test the system with 10 images, 8 of which show helicopters, it will perform poorly. If we instead test it with 8 cars and 2 helicopters, it can reach an accuracy of 80 percent, but that figure is misleading: the system recognized only the cars and missed every helicopter. This is why we have to train the system on data with an appropriate distribution.
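The arithmetic behind the car and helicopter example can be checked with a few lines. The sketch assumes the worst case, a model that has learned to answer "car" for every image; the helper names are our own.

```python
def accuracy(predictions, truths):
    """Share of all test images that were labeled correctly."""
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)


def recall(predictions, truths, label):
    """Share of the images of one class that were labeled correctly."""
    relevant = [(p, t) for p, t in zip(predictions, truths) if t == label]
    return sum(p == t for p, t in relevant) / len(relevant)


truths = ["car"] * 8 + ["helicopter"] * 2
predictions = ["car"] * 10  # assumed model that always answers "car"

overall = accuracy(predictions, truths)
helicopter_recall = recall(predictions, truths, "helicopter")
car_recall = recall(predictions, truths, "car")
```

Overall accuracy comes out at 0.8 while the helicopter recall is 0.0, which is why a single accuracy figure says little when the classes are unevenly represented.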
(Eventually we got hired to do a project exactly like that: our software needed to analyze photos of cars as part of an insurance claiming process. You can read more about that here.)
But where can we get invoices in as many different formats as possible? The real invoices we were able to obtain didn't seem to be enough. We found 50 different types, but we felt we needed tens of thousands for training.
Therefore, we produced templates from the 50 types of invoices and created new ones with randomized company data, allowing us to multiply the material needed for training.
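The idea of multiplying training material from templates can be sketched as follows. The two text templates, the company names and the value ranges are invented for this example; our real templates were built from the 50 invoice types and are not reproduced here.

```python
import random
from string import Template

# Two made-up layouts standing in for templates derived from real invoices.
TEMPLATES = [
    Template("Invoice number: $number\nSeller: $company\nTotal: $currency $gross"),
    Template("$company\nNo. $number\nGross amount: $gross $currency"),
]
COMPANY_NAMES = ["Example Kft.", "Sample Bt.", "Demo Zrt."]  # fictional


def generate_invoice(rng):
    """Fill a randomly chosen template with randomized data.

    Returns the invoice text and the ground-truth fields, so every
    generated sample comes with its labels for training."""
    net = rng.randrange(1_000, 1_000_000)
    vat = net * 27 // 100  # 27% VAT, rounded down for simplicity
    fields = {
        "number": f"SA{rng.randrange(2015, 2020)}-{rng.randrange(1, 100):02d}",
        "company": rng.choice(COMPANY_NAMES),
        "currency": "HUF",
        "gross": net + vat,
    }
    text = rng.choice(TEMPLATES).substitute(fields)
    return text, fields


rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [generate_invoice(rng) for _ in range(5)]
```

Because the generator knows which value it placed in which slot, the labels come for free, which is the main appeal of synthetic data in this setting.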
As mentioned at the beginning of the article, the results were very encouraging, and it seemed like our software was handling the problem at the expected level, and its accuracy would only improve with regular use.
But what can we do with it? In software development, solving a problem is just part of our job; we also have to develop a product. Even though the end users of our system are accountants, the target audience might be the people who develop software for accountants. Hence, we would need fitting marketing and sales techniques, followed by continuous software iteration to integrate our AI feature optimally into existing accounting software. This would require continuous maintenance and UX work, which is not the main profile of Lexunit. We produce digital solutions to engineering problems. We had two options:
1. We make our project open source and trust that it will take off, spreading the name of Lexunit all over the world.
2. We make it available on our website as a live demo and case study.
We opted for the latter. If it were a final product, the software could be supplemented with a number of other obvious functions, such as real-time tax number verification with online databases or mathematical checks. We consider these to be “post-processing” functions, as the primary task of the project from our point of view was to demonstrate the practical abilities of the neural network. It succeeds in that, and what else we can do with the results is a different question.
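As an example of the mathematical checks mentioned above, the sketch below verifies that the extracted amounts agree with each other. The function name and the one-unit rounding tolerance are assumptions; this is not part of the demo.

```python
from decimal import Decimal


def amounts_are_consistent(net, vat_rate, total_vat, gross,
                           tolerance=Decimal("1")):
    """Check that net * rate matches the VAT and net + VAT matches the gross,
    allowing a small tolerance for rounding on the invoice."""
    expected_vat = net * vat_rate
    vat_ok = abs(expected_vat - total_vat) <= tolerance
    gross_ok = abs(net + total_vat - gross) <= tolerance
    return vat_ok and gross_ok


# The example amounts from the field list: 100,000,000 + 27% = 127,000,000.
ok = amounts_are_consistent(
    Decimal("100000000"), Decimal("0.27"),
    Decimal("27000000"), Decimal("127000000"),
)
# A gross amount misread by 100 units should fail the check.
bad = amounts_are_consistent(
    Decimal("100000000"), Decimal("0.27"),
    Decimal("27000000"), Decimal("127000100"),
)
```

A failed check does not say which field was misread, but it is a cheap way to send a suspicious invoice back to the accountant.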
The main point here is simply that accountants invest heaps of human resources into a low-skill task: data recording. With software solutions like ours, a lot of extra energy can be freed up, costs are reduced and revenue is increased: Two-thirds of the working time can be saved compared to human data recording.
Another key takeaway is that this, of course, works not just with invoices, but medical documents, delivery orders, CVs or notices of loss as well… This technology can be used in any situation where data needs to be extracted from a mass of printed or scanned documents in a similar format.
The “Proof of Concept” Period
Therefore, one of the things we tried was to get in touch with companies that listed NLP as required expertise in their job advertisements, since this suggests they might have a problem our project can solve. We found a company that downloads web content and categorizes articles by title, date, content type and other parameters, manually, for lack of a better method. Their service is built on this plethora of content, but they didn’t have the means to categorize it reliably.
They have the treasure map, but we have the key to the chest.
Since we have already explored all the dead ends and issues that might come up in connection with such features (address, date, etc.) with our software, SzamlAI, we were pretty confident we could help them.
But we had no guarantee, of course. One of the characteristics of this industry is that no one can know beforehand whether an innovative solution will work or not… We can only promise that we are able to examine a problem professionally.
How can we sign a contract to do business then? This is what the “Proof of Concept” period was invented for. The promise here is that we work on the project and showcase our progress at certain intervals: where we stand, what we’ve achieved… and the client can end the collaboration after any such presentation.
Vision and Open Business Management Approach to Artificial Intelligence
For a team like Lexunit, this is not a bad solution, because our main difficulty is usually convincing clients to let us through their doors. Once we’re inside, the work usually speaks for itself. It's hard to make a convincing case beforehand because the work is highly experimental. Many companies, of course, don’t have the knowledge to judge whether the project is going well, or whether they really need it at all.
Therefore, during our presentations, we place great emphasis on conveying the knowledge that will help them make the right decisions. Even though NLP is a tried-and-tested tool in appropriate environments, the mapping of AI uses is still at an early stage, making its possibilities limitless. That’s why you often need a vision and open business management approach that allows clients to assess their own professional boundaries and be able to take some controlled risk in order to gain value with these innovative solutions.
Are you interested in starting a similar project but would like to ask some questions first? Make the most of our free consultation service and contact us today. Click here.