From Static PDFs to Structured Data: An AI Case Study in Logistics

The AI Moment That Started With a Simple PDF

When this client first came to us, they were not asking for AI. They were not chasing buzzwords or trying to add a flashy feature to their platform.

They had a very simple problem.

Their shipping line users were spending too much time filling out instruction forms.

On the surface, their container survey management platform was working well. Shipping line users created survey instructions. Surveyors performed inspections. Reviewers checked and approved reports. The workflow was structured and role-based.

But the first step, creating the survey instruction, was completely manual.

Every time a Bill of Lading arrived, someone had to open the PDF, search for key fields, and type everything into the system. BL number, shipper, consignee, container number, size, type, port of loading, port of discharge, cargo description, UN number, IMO class, vessel name, voyage number. Shipment after shipment, the same effort.

It was not complex work. It was repetitive work.

And that repetition was costing them time and accuracy.

The Manual Data Entry Trap AI Was Meant to Solve

When we studied their workflow more closely, we realised something important. The issue was not that their platform was outdated. It was modern and web-based. The problem was that the data entry process still felt like paperwork.

Users were not thinking strategically. They were copying from a PDF and pasting into a form. After a few entries, fatigue would naturally set in. Small errors would creep in. A container digit missed. A wrong port selected. A date formatted incorrectly.

Multiply that by dozens of shipments every day, and you start seeing operational impact.

That is when I asked my team a simple question. What if the Bill of Lading could fill the form on its own?

That question shaped the entire solution.

Why Basic OCR Was Not Enough for True AI Automation

OCR, or Optical Character Recognition, can extract text from PDFs or scanned images. But anyone who has worked in logistics knows that Bills of Lading do not follow one global template.

Some are clean digital exports. Some are scanned copies with low clarity. Some have merged tables. Some stretch across multiple pages. Field labels vary from carrier to carrier. One document says BL No., another says B/L Ref., another says Document Number.

If we only extracted text, we would still have to guess which part of the text mapped to which field. That would not solve the problem properly.

We needed contextual understanding.

After evaluating multiple options, we chose Mistral AI. What impressed us was not just its ability to read text, but its ability to interpret structured information from semi-structured documents. It could understand that a value next to a certain label represented a container number or a vessel name.

That made a big difference.

Designing an AI-Powered OCR Flow Inside an Existing Platform

We were clear about one thing. We did not want to disturb the existing platform workflow.

The process still had to feel familiar.

So we built it in a way that fit naturally into their system.

When a shipping line user uploads a Bill of Lading PDF, the backend sends the document to Mistral’s model. The model returns structured data in JSON format. Our .NET Core layer then validates and cleans the extracted data. Finally, the Angular frontend auto-populates the instruction form.
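As a rough illustration of the hand-off between the model and the validation layer, the sketch below parses a JSON response into a typed record. It is written in Python for brevity, whereas the production layer described here runs on .NET Core. The `BillOfLading` class, the field names, and the `parse_model_response` helper are all hypothetical, and the sample payload is made up. The call to the model itself is deliberately left out.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BillOfLading:
    # Hypothetical subset of the fields named in the article.
    bl_number: Optional[str] = None
    shipper: Optional[str] = None
    consignee: Optional[str] = None
    container_number: Optional[str] = None
    port_of_loading: Optional[str] = None
    port_of_discharge: Optional[str] = None
    vessel_name: Optional[str] = None
    voyage_number: Optional[str] = None


def parse_model_response(raw: str) -> BillOfLading:
    """Turn the model's JSON text into a typed record.

    Unknown keys are dropped and blank or non-string values become None,
    so the form layer only ever sees fields it knows how to render.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")

    known = {f.name for f in fields(BillOfLading)}
    cleaned = {}
    for key, value in payload.items():
        if key in known and isinstance(value, str) and value.strip():
            cleaned[key] = value.strip()
    return BillOfLading(**cleaned)


# Made-up sample response, including one key the form does not use.
sample = '{"bl_number": " BL-0001 ", "vessel_name": "EXAMPLE VESSEL", "page_count": 3}'
record = parse_model_response(sample)
```

Keeping every field optional means a partial extraction still produces a usable record, and the user simply fills in whatever is missing.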

The user can review, edit if needed, and submit.

What previously took several minutes of manual typing now happens in seconds.

But getting there was not straightforward.

Real-World AI Challenges in Document Extraction

In early testing, we saw variations in accuracy. Different layouts confused extraction in some cases. Port names were returned in slightly different formats. Some container numbers were misread due to scan quality. Multi-page documents sometimes caused partial extraction.

For a brief period, it felt like we had replaced manual errors with AI uncertainty.

That was when we decided to strengthen the system around the model instead of expecting the model to do everything perfectly.

We refined our prompts carefully. We defined strict JSON output schemas. We provided contextual examples to improve consistency. On the backend, we added regex validation for container numbers, date normalization logic, port mapping, commodity mapping, and confidence scoring. Only high-confidence values were auto-filled.
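The backend checks mentioned above could look roughly like the following. This is a Python sketch under stated assumptions, not the .NET Core code that was shipped. The function names, the list of accepted date formats, the day-first reading of ambiguous dates, and the tiny port alias table are all illustrative. The check-digit step follows ISO 6346 and goes a little beyond a plain regex; whether the real system included it is not stated here.

```python
import re
import string
from datetime import datetime
from typing import Optional

# ISO 6346 layout: 3-letter owner code, category letter (U, J or Z),
# 6-digit serial number, 1 check digit.
CONTAINER_RE = re.compile(r"[A-Z]{3}[UJZ]\d{7}")


def _build_letter_values() -> dict:
    # ISO 6346 maps A..Z to 10..38, skipping multiples of 11 (11, 22, 33).
    values, n = {}, 10
    for letter in string.ascii_uppercase:
        if n % 11 == 0:
            n += 1
        values[letter] = n
        n += 1
    return values


LETTER_VALUES = _build_letter_values()


def is_valid_container_number(raw: str) -> bool:
    """Check the format, then recompute the ISO 6346 check digit."""
    candidate = re.sub(r"[\s-]", "", raw).upper()
    if not CONTAINER_RE.fullmatch(candidate):
        return False
    total = 0
    for position, ch in enumerate(candidate[:10]):
        value = LETTER_VALUES[ch] if ch.isalpha() else int(ch)
        total += value * (2 ** position)
    return total % 11 % 10 == int(candidate[10])


# Assumed input formats; ambiguous numeric dates are read day-first.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y")


def normalize_date(raw: str) -> Optional[str]:
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


# Illustrative alias table mapping free-text port names to UN/LOCODE values.
PORT_ALIASES = {
    "SINGAPORE": "SGSIN",
    "PORT OF SINGAPORE": "SGSIN",
    "ROTTERDAM": "NLRTM",
    "PORT OF ROTTERDAM": "NLRTM",
}


def normalize_port(raw: str) -> Optional[str]:
    """Map a free-text port name to a code, or None if it is unknown."""
    key = re.sub(r"[^A-Z ]", " ", raw.upper())
    key = " ".join(key.split())
    return PORT_ALIASES.get(key)
```

Each helper returns `None` or `False` rather than guessing, so a value that fails validation can be left blank for the user instead of being auto-filled incorrectly.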

On the frontend, we highlighted AI-filled fields so users could easily review them. Manual edits were always allowed. Nothing was locked.
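One way to connect confidence scoring to the highlighted fields is to compute a small per-field state on the backend and let the form render it. The Python sketch below assumes each extracted field arrives as a value plus a confidence score between 0 and 1. The `FormField` shape, the `build_form_state` name, and the 0.85 threshold are hypothetical, since the actual cut-off is not given here.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical cut-off; the real value used in the project is not stated.
AUTO_FILL_THRESHOLD = 0.85


@dataclass
class FormField:
    value: Optional[str]  # None means the user enters this field manually
    ai_filled: bool       # True tells the UI to highlight the field for review


def build_form_state(
    extracted: Dict[str, dict],
    threshold: float = AUTO_FILL_THRESHOLD,
) -> Dict[str, FormField]:
    """Pre-fill only the fields whose confidence meets the threshold."""
    state = {}
    for name, item in extracted.items():
        value = item.get("value")
        confidence = item.get("confidence", 0.0)
        if value and confidence >= threshold:
            state[name] = FormField(value=value, ai_filled=True)
        else:
            # Low-confidence or empty values are left blank, not guessed.
            state[name] = FormField(value=None, ai_filled=False)
    return state
```

Because the state only pre-fills and flags fields, every value stays editable, which matches the rule that nothing is locked.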

This hybrid approach gave users confidence.

The Business Impact of AI-Driven Bill of Lading Processing

Once deployed, the impact was visible quite quickly.

Instruction creation time fell by more than 60 percent. Manual typing errors dropped significantly. Surveyors started receiving more structured and consistent instructions. Reviewers spent less time correcting data.

But the most important feedback was from the users themselves. They felt relief. The process felt smoother. Instead of dreading another long form, they could upload the document and move ahead.

For them, it was not about AI. It was about saving time and avoiding repetitive work.

From Static PDFs to Intelligent Data

For us as founders, projects like this are a reminder of why we build software in the first place. Real transformation does not always come from adding new dashboards or complex analytics. Sometimes it comes from removing a small but persistent friction point.

By combining OCR, contextual understanding from Mistral AI, backend validation in .NET Core, and a clean Angular interface, we helped our client convert static PDFs into structured, actionable data.

It was not about making the platform smarter on paper. It was about making daily work easier for real people.

And in logistics, where volume and precision matter so much, that kind of improvement creates real business value.
