This repository contains a Flask web application that lets users upload PDF documents, extracts their text, generates preview images, and provides a chat-like interface for asking questions about the PDF content. It uses the PyMuPDF library for PDF processing and the Hugging Face Transformers library for question answering.
- Upload PDF documents.
- Extract text from PDFs and store it for further processing.
- Generate preview images for all pages in the PDF.
- Implement a chat interface for users to ask questions about the PDF content.
- Use a pre-trained question-answering model from Hugging Face to provide answers based on the extracted text.
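The text extraction and preview features listed above could be sketched with PyMuPDF roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `preview_name` and `extract_text_and_previews`, the `page_<n>.png` naming scheme, and the 72 DPI default are all assumptions made for the example. The `dpi` argument of `get_pixmap` assumes a reasonably recent PyMuPDF release.

```python
from pathlib import Path


def preview_name(page_index: int) -> str:
    """Hypothetical naming scheme: zero-based page index -> 'page_<n>.png'."""
    return f"page_{page_index + 1}.png"


def extract_text_and_previews(pdf_path: str, out_dir: str, dpi: int = 72):
    """Return (full_text, preview_paths) covering every page of the PDF."""
    # PyMuPDF is imported lazily so preview_name() works without it installed.
    import fitz

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    texts, previews = [], []
    with fitz.open(pdf_path) as doc:
        for index, page in enumerate(doc):
            texts.append(page.get_text())    # plain text of this page
            pix = page.get_pixmap(dpi=dpi)   # rasterize the page
            target = out / preview_name(index)
            pix.save(str(target))            # image format inferred from ".png"
            previews.append(str(target))
    return "\n".join(texts), previews
```

Calling `extract_text_and_previews("sample.pdf", "previews")` (with a hypothetical `sample.pdf`) would return the concatenated text and the list of generated image paths, which the web layer could then store and display.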
- Python 3.x
- Pip (Python package installer)
- Clone the repository to your local machine:
git clone https://github.com/Angitha10/Document-Ask.git
- Navigate to the project directory:
cd Document-Ask
- Create a virtual environment (optional but recommended):
python -m venv venv
- Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On Unix or MacOS:
source venv/bin/activate
- Install the required dependencies:
pip install -r requirements.txt
- Run the Flask application:
python app.py
- Open your web browser and go to http://127.0.0.1:5000/ to access the application.
- Upload a PDF document using the provided interface.
- The application will extract text from the PDF and generate preview images for each page.
- Access the chat interface by navigating to the /process/ route.
- Ask questions about the PDF content in the chat interface.
- The application uses a pre-trained question-answering model to answer each question from the extracted text.
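The question-answering step in the workflow above might look like the sketch below, which uses the Transformers `pipeline("question-answering")` API. The function name `answer_question`, the injectable `qa` parameter, and the fallback message are assumptions for illustration. The repository may load a specific model and structure this differently.

```python
from typing import Callable, Optional


def answer_question(question: str, context: str,
                    qa: Optional[Callable[..., dict]] = None) -> str:
    """Answer `question` from `context` with an extractive QA model."""
    if not question.strip() or not context.strip():
        # Hypothetical fallback message for empty input.
        return "Please upload a PDF and enter a question."
    if qa is None:
        # Lazy import: building the pipeline downloads a default model
        # the first time it runs.
        from transformers import pipeline
        qa = pipeline("question-answering")
    # The pipeline returns a dict that includes "answer" and "score" keys.
    result = qa(question=question, context=context)
    return result["answer"]
```

In a long-running web application it is usually preferable to build the pipeline once at startup and pass it in as `qa`, rather than constructing it on every request. The same parameter also makes the function easy to test with a stand-in callable.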
- This project uses the Flask web framework, PyMuPDF for PDF processing, and Hugging Face Transformers for question-answering.
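To show how the Flask side might receive an upload before handing it to PyMuPDF, here is a small sketch. The `/upload` route, the `file` form field, the `uploads` directory, and the names `allowed_file` and `create_app` are hypothetical and not taken from the repository. `secure_filename` is the Werkzeug helper commonly used for sanitizing uploaded filenames.

```python
import os

ALLOWED_EXTENSIONS = {"pdf"}  # assumption: only PDF uploads are accepted


def allowed_file(filename: str) -> bool:
    """True if the filename ends in an allowed extension (case-insensitive)."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


def create_app(upload_dir: str = "uploads"):
    """Build a small Flask app with a single, hypothetical /upload route."""
    # Imported lazily so allowed_file() can be used without Flask installed.
    from flask import Flask, jsonify, request
    from werkzeug.utils import secure_filename

    app = Flask(__name__)
    os.makedirs(upload_dir, exist_ok=True)

    @app.route("/upload", methods=["POST"])
    def upload():
        file = request.files.get("file")
        if file is None or not allowed_file(file.filename):
            return jsonify(error="Please upload a PDF file."), 400
        # Sanitize the client-supplied name before writing to disk.
        path = os.path.join(upload_dir, secure_filename(file.filename))
        file.save(path)
        return jsonify(saved=os.path.basename(path)), 201

    return app
```

Checking the extension before saving keeps non-PDF files away from the PDF processing step. A stricter version could also inspect the file contents.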
Feel free to contribute, report issues, or suggest improvements!

