An intelligent invoice data extractor built with **OCI Generative AI**, **LangChain**, and **Streamlit**. Upload any invoice PDF and this app will extract structured data like REF. NO., POLICY NO., DATES, etc. using multimodal LLMs.

| Component | Purpose |
|---|---|
| 🧠 OCI Generative AI | Vision + Text LLMs for extraction |
| 🧱 LangChain | Prompt orchestration and LLM chaining |
| 📦 Streamlit | Interactive UI and file handling |
| 🖼️ pdf2image | Converts PDF pages into JPEG images |
| 🧾 Pandas | CSV creation & table rendering |
| 🔐 Base64 | Encodes image bytes for embedding in the prompt |

---

## 🧠 How It Works

1. **User Uploads Invoice PDF**
   The file is uploaded and converted into an image using `pdf2image`. Upload single-page documents only.

2. **Initial Header Detection (LLaMA-3.2 Vision)**
   The first page is passed to the multimodal LLM, which returns a list of fields that are likely to be useful (e.g., "Policy No.", "Amount", "Underwriter").

3. **User Selects Fields and Types**
   A UI lets the user pick 3 fields from the detected list and specify their data types (Text, Number, etc.).

4. **Prompt Generation (Cohere Command R+)**
   The second LLM generates a custom system prompt to extract those fields as JSON.

5. **Full Invoice Extraction (LLaMA-3.2 Vision)**
   Each page image is passed to the multimodal LLM with the custom prompt, which returns JSON values for the requested fields.

6. **Data Saving & Display**
   All data is shown in a `st.dataframe()` and saved to CSV.

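
The non-LLM glue in steps 1, 5, and 6 can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the app's actual code: the helper names `to_data_url`, `parse_llm_json`, and `rows_to_csv` are hypothetical, the data-URL form is an assumption about how the Base64-encoded image reaches the model, and the `csv` module stands in for Pandas so the example has no third-party dependencies. The JSON parser tolerates replies that wrap the object in prose, which multimodal models sometimes do even when told not to.

```python
import base64
import csv
import io
import json


def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Base64-encode raw image bytes as a data URL (assumed prompt format)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def parse_llm_json(reply: str) -> dict:
    """Parse the JSON object found between the first '{' and the last '}'."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    return json.loads(reply[start:end + 1])


def rows_to_csv(rows: list, fields: list) -> str:
    """Serialize one dict per page to CSV text; missing fields stay blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical model reply with explanatory text around the JSON.
    reply = 'Here you go:\n{"Policy No.": "P-1", "Amount": "100"}'
    record = parse_llm_json(reply)
    print(rows_to_csv([record], ["Policy No.", "Amount", "Underwriter"]))
```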
---
## 📁 File Structure

```bash
.
├── app.py              # Main Streamlit app
├── requirements.txt    # Python dependencies
└── README.md           # This file
```

---

## 🔧 Setup

1. **Clone the repository**

   ```bash
   git clone <repository-url>
   cd <repository-folder>
   ```

2. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

3. **Run the app**

   ```bash
   streamlit run app.py
   ```

> ⚠️ **Important Configuration:**
>
> - Replace all instances of `<YOUR_COMPARTMENT_OCID_HERE>` with your actual **OCI Compartment OCID**.
> - Ensure you have access to **OCI Generative AI Services** with the correct permissions.

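
As an alternative to editing the placeholder in place, the OCID could be read from the environment at startup. The sketch below is an assumption, not something the app is known to do: the function `load_compartment_ocid` and the variable name `OCI_COMPARTMENT_OCID` are hypothetical. It only checks that the value is set, is not the unreplaced placeholder, and starts with the `ocid1.` prefix that OCI identifiers share; it does not verify permissions.

```python
import os

PLACEHOLDER = "<YOUR_COMPARTMENT_OCID_HERE>"


def load_compartment_ocid(env=os.environ, var="OCI_COMPARTMENT_OCID") -> str:
    """Return the compartment OCID from `env`, or raise with a clear message."""
    value = env.get(var, "").strip()
    if not value or value == PLACEHOLDER:
        raise RuntimeError(f"Set {var} to your OCI compartment OCID")
    if not value.startswith("ocid1."):
        raise RuntimeError(f"{var} does not look like an OCID: {value!r}")
    return value


if __name__ == "__main__":
    try:
        print(load_compartment_ocid())
    except RuntimeError as err:
        print(f"Configuration error: {err}")
```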
# Generate appropriate prompt based on selected or input fields
if elements:
    # Keys in the example are quoted so that the sample itself is valid JSON.
    system_message_cohere = SystemMessage(
        content=f"""
        Based on the following set of elements {elements}, with their respective types, extract their values and respond only in valid JSON format (no explanation):
        {', '.join([f'- {e[0]}' for e in elements])}
        For example:
        {{
            "{elements[0][0]}": "296969",
            "{elements[1][0]}": "296969",
            "{elements[2][0]}": "296969"
        }}
        """
    )
    ai_response_cohere = system_message_cohere
else:
    system_message_cohere = SystemMessage(
        content=f"""
        Generate a system prompt to extract fields based on user-defined elements: {user_prompt}.
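
The prompt construction in this hunk can also be sketched as a standalone function that does not depend on LangChain. This is an illustrative sketch, not the code in `app.py`: the name `build_extraction_prompt` is hypothetical, and it assumes `elements` is a list of `(field_name, field_type)` pairs, which is inferred from the `e[0]` indexing above. Building the example with `json.dumps` keeps the sample valid JSON for any number of fields, rather than exactly three.

```python
import json


def build_extraction_prompt(elements):
    """Build an extraction prompt from (field_name, field_type) pairs."""
    if not elements:
        raise ValueError("at least one field is required")
    # One bullet per field, with its declared type in parentheses.
    field_lines = "\n".join(f"- {name} ({ftype})" for name, ftype in elements)
    # Placeholder values only show the expected shape of the reply.
    example = json.dumps({name: "296969" for name, _ in elements}, indent=2)
    return (
        "Extract the values of the following fields and respond only in "
        "valid JSON format (no explanation):\n"
        f"{field_lines}\n"
        "For example:\n"
        f"{example}"
    )


if __name__ == "__main__":
    print(build_extraction_prompt([("Policy No.", "Text"), ("Amount", "Number")]))
```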