Previous attempts to automate invoice processing, the last remaining manual step, relied on early OCR and invoice exchange technologies and have not quite delivered. However, new AI-based services like Oracle's 'Document Understanding' have stolen the show and promise to help free finance teams from the baggage of manual invoice processing!
From creation to transfer, processing and storage, while catering for physical and/or digital documents, the holy grail of end-to-end invoice workflow automation has arrived.
Hiring is difficult at the best of times, but even more so recently. Quite simply, you need to make the best use of time with the employees you already have.
A drive to get rid of human error leads to diminishing returns. More and more time is spent training and retraining, with increasing levels of oversight and pressure on team members. When the job is actively unenjoyable, people leave. As a result, certain levels of human error are accepted when trying to find a good balance.
Hiring, training, office space, equipment, software, checks and balances / overlapping workflows - all to copy data from one screen to another.
Together, these issues are difficult to overcome. Every invoice and business is different, and creating traditional software that caters to all differences is tricky.
SaaS solutions were a step in the right direction, but ultimately, a rigid and costly route to go down.
Another option is to integrate with exchange frameworks like Pan-European Public Procurement Online (PEPPOL) in the EU or (coming soon) Digital Business Networks Alliance (DBNA) in the US. Unfortunately, smaller suppliers cannot pivot to these technologies easily, making this a non-starter for larger organisations that interface with those same smaller suppliers.
Whether using traditional software, exchange frameworks or going completely manual, we have many problems with no universally agreed-upon solution. Deciding on the best route forward is about picking the 'least bad' option for your specific needs. Whatever solution is implemented will ultimately still leave unfulfilled requirements and gaps in your process.
Now, AI-based services are available to architects and developers. Pre-trained models make them simpler to implement than ever before, they are cheap enough to be practical, and their efficiency has improved as the products have matured. There's never been a better time to take advantage.
Oracle's Document Understanding is one such service. It is accurate, consistent and efficient, it scales to demand and, crucially, it is well priced. It's everything we need in one place without compromise.
How you implement can be as straightforward or as complex as you need. You can mould the solution to your business processes, not the other way around.
So what is Document Understanding? It's an AI-based service accessible via API that uses pre-trained or custom (customer-trained) models, with features like 'Key value extraction' tailored specifically to interpret text-based files.
There are pre-trained models to extract text from receipts, invoices, driver's licences, and passports, which are usable out of the box.
The Oracle Cloud console provides a UI to demo Document Understanding features, allowing us to see how it works first-hand with either the provided samples or your own real-world invoices.
Here is an example invoice presented in the Oracle Cloud console. On the left, the 'Key value extraction' feature has been selected, and an Oracle-provided example has been used for the input. We could just as easily have selected one of our own files from our local computer or one previously uploaded to Object Storage.
The results are on the right. As this example uses the pre-trained model, the value labels are defined by Oracle and already cover all relevant data points for standard invoices. This list of predefined labels may be extended by Oracle over time, as standards around invoices change.
An API, on its own, doesn't support a workflow, so where exactly does Document Understanding fit in? As it's an Oracle Cloud service, there are other cloud services that we can use to make the technology employed on either side interchangeable.
With that in mind, our latest example can be thought of as three silos:
We'll begin by quickly covering Microsoft Graph and Oracle Cloud Infrastructure.
Microsoft Graph offers a window into Microsoft's vast ecosystem of services through a single endpoint. It combines common user-centric functionality, such as managing Teams messages and Outlook emails, with administrator functionality, such as organisation and identity management, as well as functionality previously restricted to command line tools, such as domain configuration.
It's clear that Microsoft Graph will be the de facto framework that internally allows disparate Microsoft services to exchange information and, externally, enables customers to interact with any part of the Microsoft ecosystem programmatically.
Oracle Cloud Infrastructure (OCI) is, as the name suggests, Oracle's cloud computing platform, providing infrastructure-as-a-service technologies across its global network of data centres.
Over 100 individual services, delivered in 48 regions across 24 countries. Dubbed 'Gen 2 Cloud', OCI offers cost-efficient and high-performance compute, database, function, AI and other services at scale, and continues to grow rapidly due to customer demand at enterprise and start-up levels and everything in between.
An interesting point to note: Oracle and Microsoft are working closely together to create direct data links between their respective datacentres in the same regions. Microsoft also now deploy Oracle hardware directly into their own datacentres to offer Oracle services natively through the Azure console.
With that context in place, when looking at our invoice processing solution again, we've used Microsoft Graph to configure a 'subscription', essentially a customisable webhook that points to an endpoint that we expose to notify us when a new email is delivered to a specific inbox. The subscription was created using application credentials, so no user authentication or authorisation is needed.
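As a rough illustration of the subscription described above, the sketch below builds the JSON body that would be posted to the Graph `subscriptions` endpoint and shows the validation handshake Graph performs against a new notification URL. The mailbox address, notification URL, client state value and two-day lifetime are placeholders chosen for the example, not values from our solution; Graph caps how long a mail subscription can live, so the lifetime should be checked against the current documentation.

```python
from datetime import datetime, timedelta, timezone


def build_subscription_payload(mailbox, notification_url, client_state,
                               now=None, lifetime=timedelta(days=2)):
    """Build the body for POST https://graph.microsoft.com/v1.0/subscriptions.

    The two-day default lifetime is an assumption for this sketch; the
    subscription must be renewed before it expires.
    """
    now = now or datetime.now(timezone.utc)
    expires = (now + lifetime).astimezone(timezone.utc)
    return {
        # Only notify when a new message is created in the inbox.
        "changeType": "created",
        "notificationUrl": notification_url,
        "resource": f"users/{mailbox}/mailFolders('Inbox')/messages",
        "expirationDateTime": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Echoed back in every notification so we can reject forged calls.
        "clientState": client_state,
    }


def validation_reply(query_params):
    """When a subscription is created, Graph calls the endpoint with a
    validationToken query parameter and expects it echoed back as plain
    text with HTTP 200. Returns (status, headers, body) or None."""
    token = query_params.get("validationToken")
    if token is None:
        return None
    return 200, {"Content-Type": "text/plain"}, token
```

The payload would be sent with an application (client credentials) access token, which is why no signed-in user is involved.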
Once a notification tells us that a new email has arrived, we use the Microsoft Graph endpoint to retrieve additional information about the email, such as the from address, subject and content. We also retrieve the exact time it was received, as there can be a short delay between the email arriving and the notification being triggered. Finally, we capture any files attached to the email and store them locally.
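The notification handling could look something like the sketch below. It assumes the standard Graph change notification shape (a `value` array whose items carry `clientState` and `resourceData.id`) and the attachment list returned by the message's `attachments` endpoint, where file attachments carry base64 `contentBytes`. The function names are hypothetical, and the HTTP calls themselves are left out so the example stays self-contained.

```python
import base64
from pathlib import Path


def message_ids_from_notification(notification, expected_client_state):
    """Return the message ids from a Graph change notification, ignoring
    any item whose clientState does not match the one we registered."""
    ids = []
    for item in notification.get("value", []):
        if item.get("clientState") != expected_client_state:
            continue
        message_id = item.get("resourceData", {}).get("id")
        if message_id:
            ids.append(message_id)
    return ids


def message_url(mailbox, message_id):
    """URL that fetches only the fields we care about for one message."""
    fields = "from,subject,body,receivedDateTime,hasAttachments"
    return (f"https://graph.microsoft.com/v1.0/users/{mailbox}"
            f"/messages/{message_id}?$select={fields}")


def save_file_attachments(attachments, target_dir):
    """Decode file attachments from a GET .../attachments response and
    write them into target_dir. Returns the list of saved paths."""
    saved = []
    target = Path(target_dir)
    for att in attachments.get("value", []):
        # Skip item and reference attachments; only files carry contentBytes.
        if att.get("@odata.type") != "#microsoft.graph.fileAttachment":
            continue
        # Path(...).name drops any directory part from the supplied name.
        path = target / Path(att["name"]).name
        path.write_bytes(base64.b64decode(att["contentBytes"]))
        saved.append(path)
    return saved
```

Requesting `receivedDateTime` explicitly gives us the true arrival time, independent of when the notification reached us.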
Now that we have the email details and attachments, the attachments are sent to an OCI Object Storage Bucket for processing. An OCI Event service monitors the 'input' Object Storage Bucket for new files and, once triggered, starts an OCI Function, passing in details of the received file.
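To give a feel for what the Function receives, here is a minimal sketch that pulls the file details out of an Object Storage event. It assumes the usual event envelope, with the object name in `data.resourceName` and the bucket and namespace under `data.additionalDetails`; the helper name `parse_object_event` is hypothetical. Inside a real Function, the raw body would come from the handler's `data` argument.

```python
import json

# Event type emitted by Object Storage when a new object is created.
CREATE_OBJECT = "com.oraclecloud.objectstorage.createobject"


def parse_object_event(raw_body):
    """Extract namespace, bucket and object name from an Object Storage
    'create object' event. Returns None for any other event type."""
    event = json.loads(raw_body)
    if event.get("eventType") != CREATE_OBJECT:
        return None
    data = event.get("data", {})
    details = data.get("additionalDetails", {})
    return {
        "namespace": details.get("namespace"),
        "bucket": details.get("bucketName"),
        "object_name": data.get("resourceName"),
    }
```

Filtering on the event type keeps the Function from reacting to updates or deletes if the Event rule is later widened.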
The Function makes use of the OCI Python SDK, but Java, Ruby, Go, Node.js and C# languages are also supported. The Function code calls the Document Understanding API, passing in the input file location, processing configuration and the output location that we want Document Understanding to use when outputting the resulting JSON data.
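The three inputs mentioned above (input file location, processing configuration and output location) map onto a single 'processor job' request. The sketch below builds that request as a plain dictionary so it can be read without the SDK installed. The field names and enum values follow our reading of the Document Understanding REST reference and should be treated as assumptions to verify against the current documentation; the function name and arguments are hypothetical.

```python
def build_processor_job_body(compartment_id, namespace, input_bucket,
                             object_name, output_bucket, output_prefix):
    """Assemble a Document Understanding processor job request that runs
    key value extraction on one invoice held in Object Storage.

    Field names are assumptions based on the REST reference, not a
    verified copy of the production Function.
    """
    return {
        "compartmentId": compartment_id,
        # Where the service reads the invoice from.
        "inputLocation": {
            "sourceType": "OBJECT_STORAGE_LOCATIONS",
            "objectLocations": [{
                "namespaceName": namespace,
                "bucketName": input_bucket,
                "objectName": object_name,
            }],
        },
        # Where the service writes the resulting JSON.
        "outputLocation": {
            "namespaceName": namespace,
            "bucketName": output_bucket,
            "prefix": output_prefix,
        },
        # Pre-trained invoice model with key value extraction enabled.
        "processorConfig": {
            "processorType": "GENERAL",
            "documentType": "INVOICE",
            "features": [{"featureType": "KEY_VALUE_EXTRACTION"}],
        },
    }
```

In the Function itself, the equivalent request would be made through the OCI Python SDK, which also takes care of signing the call.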
As before, an OCI Event service monitors the 'output' Object Storage Bucket for new files, but this time, once triggered, it invokes a webhook configured to notify our system that a new JSON data file is waiting. Once retrieved, the JSON is parsed and inserted into tables, ready to be presented through our APEX-built application UI for additional processing.
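The parse-and-insert step can be sketched as follows. The JSON shape used here (`pages`, `documentFields`, `fieldLabel`, `fieldValue`) is an assumption about the key value extraction output and should be checked against a real result file. SQLite stands in for the database purely so the example runs anywhere, and the `invoice_fields` table is a hypothetical name, not our actual schema.

```python
import sqlite3


def flatten_key_values(result):
    """Turn the extraction result into (page, label, value, confidence)
    rows, keeping only simple key/value fields."""
    rows = []
    for page in result.get("pages", []):
        for field in page.get("documentFields", []):
            if field.get("fieldType") != "KEY_VALUE":
                continue  # line item groups would need separate handling
            label = field.get("fieldLabel") or {}
            value = field.get("fieldValue") or {}
            rows.append((page.get("pageNumber"), label.get("name"),
                         value.get("text"), label.get("confidence")))
    return rows


def store_rows(conn, source_file, rows):
    """Insert the flattened rows into a staging table for the UI to read."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoice_fields ("
        "source_file TEXT, page INTEGER, label TEXT, value TEXT, confidence REAL)")
    conn.executemany(
        "INSERT INTO invoice_fields VALUES (?, ?, ?, ?, ?)",
        [(source_file, *row) for row in rows])
    conn.commit()
```

Keeping the confidence score alongside each value lets the application flag low-confidence fields for a person to review rather than trusting every extraction blindly.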
In this blog, we've discussed the pain of invoice processing for many businesses today and highlighted previous attempts at moving the situation forward, along with where those solutions were lacking. We then brought some of Oracle's latest services into the picture to review how they work together to build out a viable long-term solution.
Look out for part 2 and our upcoming webinar, where we'll be bringing Oracle APEX into the mix, reviewing how the solution looks and works from the perspective of users and the strategic enhancements this approach can bring to businesses.