How to convert a PDF file into an audio file?

To convert a PDF file into an audio file, we use the PyPDF2 and pyttsx3 libraries. First we use PyPDF2 to read the text from the PDF, and then we use pyttsx3 to convert the extracted text into an audio file.

Let’s start with the PyPDF2 library.

PyPDF2: PyPDF2 is a pure-Python library built as a PDF toolkit, which we use here to read text from a PDF. It can extract document information, split documents page by page, merge documents page by page, and more.

pyttsx3: pyttsx3 is a text-to-speech library for Python. It provides functions that let the machine speak text aloud or save the speech to a file.
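As a small illustration of the kind of functions pyttsx3 offers, the sketch below adjusts the speaking rate and volume through the engine's setProperty method before speaking. The helper names configure_speaker and speak and the default values are assumptions made for this example, not part of the library.

```python
def configure_speaker(engine, rate=150, volume=1.0):
    """Queue a speaking rate (words per minute) and volume (0.0 to 1.0) on a pyttsx3 engine."""
    engine.setProperty('rate', rate)
    engine.setProperty('volume', volume)
    return engine


def speak(text, rate=150, volume=1.0):
    """Speak `text` aloud with the given rate and volume."""
    import pyttsx3  # imported here so configure_speaker works without a speech driver

    engine = configure_speaker(pyttsx3.init(), rate, volume)
    engine.say(text)        # queue the text
    engine.runAndWait()     # process the queue and block until speaking is done
```

With pyttsx3 installed, calling speak('Hello from pyttsx3', rate=120) should read the sentence a little slower than the assumed default.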

How to install the above libraries?

pip install PyPDF2

pip install pyttsx3
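To confirm that both installs worked, a quick check along the following lines can help. This is only a sketch using the standard library; the function name missing_modules is made up for this example.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(['PyPDF2', 'pyttsx3'])
if missing:
    print('Not installed yet:', ', '.join(missing))
else:
    print('PyPDF2 and pyttsx3 are both installed')
```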


Example:


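import pyttsx3, PyPDF2

# Written for PyPDF2 versions before 3.0 (pip install "PyPDF2<3.0").
# PyPDF2 3.0 removed PdfFileReader, numPages, getPage and extractText.
pdfObj = open('sample.pdf', 'rb')
pdfreader = PyPDF2.PdfFileReader(pdfObj)
speaker = pyttsx3.init()

page_texts = []
for page_num in range(pdfreader.numPages):
    text = pdfreader.getPage(page_num).extractText()  # extract the text of this page
    page_texts.append(text)

text = ' '.join(page_texts)                       # text of all pages combined
cleaned_text = text.strip().replace('\n', ' ')    # remove surrounding spaces and line breaks
print(cleaned_text)

# speaker.say(cleaned_text)                       # uncomment to let the speaker read the text aloud
speaker.save_to_file(cleaned_text, 'story.mp3')   # saved in one call so no page overwrites another
speaker.runAndWait()
speaker.stop()
pdfObj.close()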


In the above script we used:

  • pdfObj = open('sample.pdf', 'rb')
    Opens sample.pdf in binary read mode and stores the file object as pdfObj.

  • pdfreader = PyPDF2.PdfFileReader(pdfObj)
    Creates an object of the PdfFileReader class of the PyPDF2 module from the PDF file object, which gives us a PDF reader object.

  • speaker = pyttsx3.init()
    Gets a reference to an engine instance that will use the given driver. If the requested driver is already in use by another engine instance, that engine is returned. Otherwise, a new engine is created.

  • for page_num in range(pdfreader.numPages):
    Iterates over the PDF from the first page to the last page.

  • text = pdfreader.getPage(page_num).extractText()
    Extracts the text of each page and stores it in text.

  • cleaned_text = text.strip().replace('\n', ' ')
    Removes leading and trailing whitespace and replaces line breaks with spaces.

  • speaker.say(cleaned_text)
    Queues the text to be spoken aloud. This line is commented out in the script.

  • speaker.save_to_file(cleaned_text, 'story.mp3')
    Queues the text to be saved to the audio file 'story.mp3'.

  • speaker.runAndWait()
    Processes the queued commands and waits until they are finished. Without this call, nothing is spoken or saved.

  • speaker.stop()
    Stops the speaker object.
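PyPDF2 3.0 and later removed the PdfFileReader, numPages, getPage and extractText names in favour of PdfReader, reader.pages and extract_text(). The following is a minimal sketch of the same steps with the newer names, assuming PyPDF2 3.x is installed. The function names extract_pdf_text and pdf_to_audio are made up for this example.

```python
def extract_pdf_text(reader):
    """Join the cleaned text of every page of a PdfReader-like object."""
    pages = []
    for page in reader.pages:
        text = page.extract_text() or ''   # treat a page without text as empty
        pages.append(text.strip().replace('\n', ' '))
    return ' '.join(p for p in pages if p)  # skip pages that turned out empty


def pdf_to_audio(pdf_path, audio_path):
    """Read the PDF at pdf_path and save its text as speech in audio_path."""
    # Imported here so extract_pdf_text can be used without these packages.
    import pyttsx3
    from PyPDF2 import PdfReader

    with open(pdf_path, 'rb') as pdf_file:
        text = extract_pdf_text(PdfReader(pdf_file))

    speaker = pyttsx3.init()
    speaker.save_to_file(text, audio_path)  # one call, so all pages end up in the file
    speaker.runAndWait()
    speaker.stop()
    return text
```

A call such as pdf_to_audio('sample.pdf', 'story.mp3') would then cover the whole script. Using a with block also closes the PDF file automatically.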


PdfFileReader(): The class used to read the PDF. We pass it the PDF as the argument, either as an opened file object (as in the script above) or as a path. The PdfFileReader class provides many methods for interacting with the PDF, including the following.

getNumPages(): Calculates the number of pages in this PDF file. The numPages property used in the script gives the same value.

decrypt(password): When using an encrypted/secured PDF file with the PDF Standard encryption handler, this function will allow the file to be decrypted. It checks the given password against the document’s user password and owner password and then stores the resulting decryption key if either password is correct.

getDocumentInfo(): Retrieves the PDF file's document information dictionary (for example title and author), if it exists. The read-only documentInfo property accesses the same function.

getPageNumber(page): Retrieves the page number of a given PageObject.

extractText(): A method of the page object returned by getPage(), rather than of the reader itself. It extracts the text from that page.
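To make these methods more concrete, here is a hedged sketch that reports a few facts about a PDF. It is written against the PyPDF2 3.x names (is_encrypted, decrypt, metadata, pages), which correspond to isEncrypted, decrypt, getDocumentInfo and getNumPages in older releases. The helper name describe_pdf and the layout of the returned dictionary are assumptions for this example.

```python
def describe_pdf(reader, password=None):
    """Return the page count and title of a PdfReader-like object."""
    if reader.is_encrypted:
        if password is None:
            raise ValueError('PDF is encrypted; a password is required')
        if not reader.decrypt(password):  # a falsy result means the password was wrong
            raise ValueError('wrong password')

    info = reader.metadata  # may be None when the PDF has no document information
    return {
        'pages': len(reader.pages),
        'title': info.title if info is not None else None,
    }
```

With PyPDF2 3.x installed, a typical call would be describe_pdf(PdfReader('sample.pdf')) after from PyPDF2 import PdfReader.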


By adding a little bit of web scraping, the same script can be used to read text from sites like Wikipedia

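import requests
from bs4 import BeautifulSoup
import pyttsx3

url = 'https://en.wikipedia.org/wiki/Kabaddi'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
}

r = requests.get(url, headers=headers)     # send the browser-like User-Agent defined above
soup = BeautifulSoup(r.content, 'lxml')    # the 'lxml' parser needs the lxml package installed
speaker = pyttsx3.init()
speaker.setProperty('rate', 100)           # slow down the speaking rate

paragraphs = soup.find_all('p')            # the article text is in the <p> tags
cleaned = []
for p in paragraphs:
    text = p.text
    cleaned_text = text.strip().replace('\n', ' ')
    if cleaned_text:                       # skip empty paragraphs
        cleaned.append(cleaned_text)

article_text = ' '.join(cleaned)
speaker.say(article_text)                                # read the article aloud
speaker.save_to_file(article_text, 'Kabaddi_Wiki.mp3')   # one call, so no paragraph overwrites another
speaker.runAndWait()
speaker.stop()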
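Wikipedia paragraphs contain citation markers such as [1] that are distracting when read aloud. As an optional refinement, the sketch below strips them and drops empty paragraphs before the text is passed to the speaker. The function name clean_paragraphs and the regular expression are assumptions for this example, and they only cover simple numeric markers.

```python
import re

CITATION = re.compile(r'\[\d+\]')  # matches markers like [1] or [23]


def clean_paragraphs(paragraphs):
    """Return one string built from the non-empty, cleaned paragraph texts."""
    cleaned = []
    for text in paragraphs:
        text = CITATION.sub('', text)      # drop numeric citation markers
        text = ' '.join(text.split())      # collapse line breaks and repeated spaces
        if text:
            cleaned.append(text)
    return ' '.join(cleaned)
```

In the script above it could be called as clean_paragraphs(p.text for p in soup.find_all('p')), and the result passed to speaker.say and speaker.save_to_file.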



Thank you 😊 for reading. Please read other blogs. And also share with your friends and family.

