Getting Started with elsAi - Document QA


Prerequisites:

1. Docker and Docker Compose
2. AWS Bedrock
3. AWS S3
4. AWS CLI

Procedure:

  1. Configure the AWS CLI with the AWS account you used to subscribe to the product. Run the command aws configure and enter the account's access key ID and secret access key.

  1. Use the following command to authenticate to Amazon Elastic Container Registry and download the container images.
    aws ecr get-login-password \
        --region us-east-1 | docker login \
        --username AWS \
        --password-stdin 709825985650.dkr.ecr.us-east-1.amazonaws.com
    docker pull 709825985650.dkr.ecr.us-east-1.amazonaws.com/optisol-business-solutions/elsai-docqa:v1.1

  1. Verify that the container image was pulled by running the command docker images.

  1. Configure Amazon Bedrock:
    1. Navigate to Amazon Bedrock in your AWS console.
    1. Select the Request Model Access button on the overview page.
    1. Select the following models:
      1. Titan Text Embeddings V2
      1. Mixtral 8x7B Instruct
    1. Review and submit the request

      (You may choose different models. If you do, note down the model IDs of the chosen models using this link and enter them in the .env file.)

    1. Generate AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN from AWS IAM with the AmazonBedrockFullAccess permission. Enter them in the .env file mentioned below.
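
    If you pick different models, the AWS CLI can list the model IDs that Bedrock offers in a region. The following is a minimal sketch, not part of elsAi: the helper names list_model_ids and has_model_id are hypothetical, and the check only confirms that an ID exists in the region, not that access to the model has been granted.

    ```shell
    # Hypothetical helper: list the foundation model IDs Bedrock offers in region $1.
    list_model_ids() {
      aws bedrock list-foundation-models --region "$1" \
        --query 'modelSummaries[].modelId' --output text
    }

    # Hypothetical helper: succeed if the model ID given as $1 appears in the
    # tab- or newline-separated list of IDs read from standard input.
    has_model_id() {
      tr '\t' '\n' | grep -Fxq -- "$1"
    }
    ```

    For example, list_model_ids us-east-1 | has_model_id 'amazon.titan-embed-text-v2:0' && echo "model ID found" prints a message when the ID is listed for that region.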

  1. Configure Amazon S3:
    1. Navigate to S3 in the AWS console.
    1. Create a new bucket.
    1. Upload your PDF files to the bucket.
    1. Create a Cognito Identity Pool ID and associate an IAM role with permissions to get objects from S3. Enter the ID in the .env file mentioned below.
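
    The bucket creation and upload steps can also be done with the AWS CLI from the prerequisites instead of the console. This is a sketch under assumptions: the helper names s3_uri and upload_pdf are hypothetical, the bucket name, region, and file names are placeholders you supply, and the Cognito Identity Pool still has to be created separately.

    ```shell
    # Hypothetical helper: build the s3:// URI for bucket $1 and object key $2.
    s3_uri() {
      printf 's3://%s/%s\n' "$1" "$2"
    }

    # Hypothetical helper: create the bucket and upload one PDF under the given key.
    # Arguments: bucket name, region, local PDF path, object key.
    upload_pdf() {
      bucket="$1"; region="$2"; local_pdf="$3"; key="$4"
      # "aws s3 mb" reports an error if the bucket already exists; skip it in that case.
      aws s3 mb "s3://${bucket}" --region "$region"
      aws s3 cp "$local_pdf" "$(s3_uri "$bucket" "$key")" --region "$region"
    }
    ```

    For example, upload_pdf my-docqa-bucket us-east-1 ./content.pdf test-folder/content.pdf stores the file under the same key that the sample request payload in the API Reference uses.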


  1. Create a .env file and paste the following content in it:
    BEDROCK_REGION= <your aws bedrock region>
    BEDROCK_MISTRAL_MODEL_ID= 'mistral.mixtral-8x7b-instruct-v0:1'
    BEDROCK_TITAN_TEXT_EMBED_MODEL_ID= 'amazon.titan-embed-text-v2:0'
    AWS_ACCESS_KEY_ID= <access key for bedrock aws account>
    AWS_SECRET_ACCESS_KEY= <secret key for bedrock aws account>
    AWS_SESSION_TOKEN= <session token for bedrock aws account>
    S3_BUCKET_NAME= <your s3 bucket name>
    S3_REGION= <your s3 region>
    COGNITO_IDENTITY_POOL_ID= <your cognito identity pool ID>
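
    Before starting the container, it can help to confirm that every variable above has been filled in. The following is a minimal sketch, not part of elsAi: the helper name check_env_file is hypothetical, and it treats a value that is empty or still starts with the "<" of a placeholder as missing.

    ```shell
    # Hypothetical helper: report keys from the template above that are missing,
    # empty, or still set to a <placeholder> in the .env file given as $1
    # (default: .env). Returns non-zero if any key needs attention.
    check_env_file() {
      env_file="${1:-.env}"
      missing=0
      for key in BEDROCK_REGION BEDROCK_MISTRAL_MODEL_ID \
                 BEDROCK_TITAN_TEXT_EMBED_MODEL_ID AWS_ACCESS_KEY_ID \
                 AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN S3_BUCKET_NAME \
                 S3_REGION COGNITO_IDENTITY_POOL_ID; do
        # A key counts as set when its value starts with a character that is
        # neither whitespace nor "<".
        if ! grep -Eq "^${key}=[[:space:]]*[^[:space:]<]" "$env_file" 2>/dev/null; then
          echo "missing or empty: ${key}"
          missing=1
        fi
      done
      return "$missing"
    }
    ```

    Running check_env_file in the directory that holds the .env file prints one line per variable that still needs a value and prints nothing when the file is complete.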

  1. Create a docker-compose.yml file and paste the following content in it:
    version: "3"
    services:
      genaiml-api-prod:
        image: 709825985650.dkr.ecr.us-east-1.amazonaws.com/optisol-business-solutions/elsai-docqa:v1.1
        env_file:
          - .env
        ports:
          - "4202:4202"
        restart: always

    Make sure you have both docker-compose.yml and .env files in the same directory.

  1. Run the command docker-compose up. The application will be up and running on port 4202.
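
    To start the container in the background and wait until it accepts connections, something like the following may be useful. This is a sketch under assumptions: the helper name wait_for_http is hypothetical, curl is assumed to be installed, and any HTTP response on the port (even an error status) is taken as a sign that the server is up.

    ```shell
    # Hypothetical helper: poll http://host:port/ once per second until it
    # answers, or give up after the given number of tries (default: 30).
    wait_for_http() {
      host="$1"; port="$2"; tries="${3:-30}"
      i=0
      while [ "$i" -lt "$tries" ]; do
        # curl exits with 0 as soon as the server returns any HTTP response.
        if curl -s -o /dev/null --max-time 2 "http://${host}:${port}/"; then
          return 0
        fi
        i=$((i + 1))
        sleep 1
      done
      return 1
    }
    ```

    For example, docker-compose up -d && wait_for_http 127.0.0.1 4202 && echo "elsAi Document QA is reachable" starts the service detached and prints a message once port 4202 responds. If it does not, docker-compose logs shows the container output.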


    API Reference

    First, upload the PDF file to an S3 bucket. Then, use the following endpoint to get answers to any questions about the file.

    API endpoint (POST request):

    http://127.0.0.1:4202/doc_pipeline/pdf/s3/chat

    Sample Request Payload:

    {
      "pdf_files": ["test-folder/content.pdf"],
      "question": "what percentage of CTC is given as retention bonus?"
    }

    Set pdf_files to the names (S3 object keys) of the files you uploaded to S3. You can send more than one file.
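
    The request can be sent from the command line with curl. The following is a minimal sketch: the helper names build_payload and ask_docqa are hypothetical, and the Content-Type header is an assumption, since the headers the API expects are not stated here.

    ```shell
    # Hypothetical helper: build the JSON payload from a question ($1) and one
    # or more S3 object keys (remaining arguments).
    # Note: this simple version does not escape quotes or backslashes.
    build_payload() {
      question="$1"; shift
      files=""
      for f in "$@"; do
        # Append each key as a quoted JSON string, comma-separated after the first.
        files="${files}${files:+, }\"${f}\""
      done
      printf '{"pdf_files": [%s], "question": "%s"}\n' "$files" "$question"
    }

    # Hypothetical helper: POST the payload to the endpoint documented above.
    ask_docqa() {
      curl -sS -X POST "http://127.0.0.1:4202/doc_pipeline/pdf/s3/chat" \
        -H "Content-Type: application/json" \
        -d "$(build_payload "$@")"
    }
    ```

    For example, ask_docqa "what percentage of CTC is given as retention bonus?" "test-folder/content.pdf" sends the sample request shown above. Pass additional object keys as extra arguments to query several files at once.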

    You will receive the following attributes in the response:

    Attribute              Description
    question               your question
    answer                 answer to your question
    file_names             your files
    reference              parts of the document from which the model derived the answer
    generated_questions    follow-up questions
    type                   type of question