add OCR script, use direct API call instead of built-in image tool
parent 5729338167
commit 05a1d6ca13
2 changed files with 67 additions and 1 deletions
@@ -16,7 +16,13 @@ Log receipt images and expense details into the `wip.work_expenses` collection i
 
 ## Steps
 
-1. **Extract receipt info** — If the user sent an image, use the image tool to read the date, vendor, and amount. If the image tool fails, read the image with the `read` tool and try to extract the info visually. If you cannot confidently read the date, **ask the user** — never guess. If they only provided text, use that.
+1. **Extract receipt info** — If the user sent an image, run the OCR script to read the date, vendor, and amount:
+
+```bash
+python3 scripts/ocr_receipt.py /path/to/image.jpg
+```
+
+The script is located at `~/notes/skills/log-work-expense/scripts/ocr_receipt.py`. It returns JSON with `date`, `vendor`, and `amount` fields. If the output doesn't contain a confident date, **ask the user** — never guess. If they only provided text, use that.
 
 2. **Upload the receipt image to S2** — If the user provided an image, upload it to S2 and get the public URL. If they only provided a file or no image, skip this step.
 
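Note on consuming the script output: the new step 1 says the script returns JSON with `date`, `vendor`, and `amount`, and that the user must be asked when the date is not confident. The diff does not show how the script signals low confidence, so the sketch below assumes (hypothetically) that it prints one JSON object to stdout and that an absent, null, or empty field means "not confident". The helper names `parse_ocr_output` and `run_ocr` are illustrative and are not part of this commit.

```python
from __future__ import annotations

import json
import subprocess

# Field names taken from the skill text in this commit.
REQUIRED_FIELDS = ("date", "vendor", "amount")


def parse_ocr_output(raw: str) -> tuple[dict, list[str]]:
    """Parse the script's stdout and list the fields that still need user input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Unparseable output: treat every field as missing.
        return {}, list(REQUIRED_FIELDS)
    if not isinstance(data, dict):
        return {}, list(REQUIRED_FIELDS)
    # Assumption: an absent, null, or empty value means "not confident".
    missing = [f for f in REQUIRED_FIELDS if data.get(f) in (None, "")]
    return data, missing


def run_ocr(image_path: str, script: str = "scripts/ocr_receipt.py") -> tuple[dict, list[str]]:
    """Run the OCR script on one image and return (fields, missing_fields)."""
    result = subprocess.run(
        ["python3", script, image_path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        # Assumption: a non-zero exit code means the OCR call failed outright.
        return {}, list(REQUIRED_FIELDS)
    return parse_ocr_output(result.stdout)
```

If `missing` contains `"date"`, the caller would ask the user for the date rather than filling in a guess, which matches the "never guess" rule in step 1.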