Gabriel Muñoz - 21 days ago
JSON Question

Send a string of OCR text to a REST API

I am trying to work with RESTful APIs in Python.

After running OCR on a PDF, I want to send the text to a RESTful API and get back specific words along with their positions within the text. I have not managed to send the string of text to the API yet.

Code follows:

import requests
import PyPDF2
import json

url = "http://xxapi.xxapi.org/xxx.util.json"

pdfFileObj = open('/Users/xxx/pdftoOCR.pdf','rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
pageObj = pdfReader.getPage(0) # Pages are zero-indexed, so 0 is the first page

data = {"text": pageObj.extractText()}
data_json = json.dumps(data)
params = {'text':'string'}


r = requests.post(url, data=data_json, params=params)
r1 = json.loads(r.text)


Although I get a 200 response from the request, the data should come back in JSON format, and I need to poll a token URL to get it (which I don't know how to do either). I also don't think the request is correct: when I paste the token URL into the browser, I see an empty JSON file (no words, no positions), even though I know the piece of text I'm sending contains the desired words.

Thanks in advance! I am working on OS X with Python 3.5.

Answer

Well, many thanks to @Jose.Cordova.Alvear for resolving this issue:

import json
import requests

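url = "http://xxapi.xxapi.org/xxx.util.json"

# Open the PDF in binary mode; the with block closes it after the upload
with open('test.pdf', 'rb') as pdf:
    # Send the file itself as multipart/form-data under the field name 'file'
    payload = {'file': pdf}
    response = requests.post(url, files=payload)

# Python 3 syntax: print is a function
print(response.json())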
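
The question also asked how to poll the token URL, which the code above does not cover. Here is a rough sketch of one way to do it. The API's response format is not shown anywhere in this thread, so `token_url`, the helper names `poll_until_ready` and `fetch_json`, and the "a non-empty JSON body means the result is ready" check are all assumptions to adapt, not the API's documented behaviour.

```python
import time


def poll_until_ready(fetch, is_ready, interval=2.0, max_attempts=10, sleep=time.sleep):
    """Call fetch() repeatedly until is_ready(result) is true, then return result.

    fetch and is_ready are supplied by the caller, because the real API's
    response format is not shown in the question.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if is_ready(result):
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError("result not ready after %d attempts" % max_attempts)


def fetch_json(url):
    """GET a URL and decode its JSON body (requests is imported lazily)."""
    import requests
    response = requests.get(url)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return response.json()


# Hypothetical usage; token_url and the non-empty check are assumptions:
# result = poll_until_ready(lambda: fetch_json(token_url), is_ready=bool)
```

Passing `fetch` and `is_ready` in as arguments keeps the retry loop separate from the HTTP call, so the loop can be tried out with canned responses before pointing it at the real service.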