Using a different chain, i.e., create_retrieval_chain, in custom tools due to RetrievalQA deprecation

Thread starter: Skyward (Guest)
I am using RetrievalQA to define custom tools for my RAG. According to the official documentation, RetrievalQA will be deprecated soon, and it is recommended to use other chains such as create_retrieval_chain. Could you provide guidance on the correct way to use create_retrieval_chain in custom tools? I am currently encountering errors.

I am copying below the example code from the official documentation (https://python.langchain.com/v0.2/docs/integrations/toolkits/document_comparison_toolkit/).

Code:
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_core.tools import Tool
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain.agents import AgentType, initialize_agent
from pydantic import BaseModel, Field

class DocumentInput(BaseModel):
    question: str = Field()

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")

tools = []
files = [
    # https://abc.xyz/investor/static/pdf/2023Q1_alphabet_earnings_release.pdf
    {
        "name": "alphabet-earnings",
        "path": "/Users/harrisonchase/Downloads/2023Q1_alphabet_earnings_release.pdf",
    },
    # https://digitalassets.tesla.com/tesla-contents/image/upload/IR/TSLA-Q1-2023-Update
    {
        "name": "tesla-earnings",
        "path": "/Users/harrisonchase/Downloads/TSLA-Q1-2023-Update.pdf",
    },
]

for file in files:
    loader = PyPDFLoader(file["path"])
    pages = loader.load_and_split()
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    docs = text_splitter.split_documents(pages)
    embeddings = OpenAIEmbeddings()
    retriever = FAISS.from_documents(docs, embeddings).as_retriever()

    # Wrap each retrieval chain in a Tool
    tools.append(
        Tool(
            args_schema=DocumentInput,
            name=file["name"],
            description=f"useful when you want to answer questions about {file['name']}",
            func=RetrievalQA.from_chain_type(llm=llm, retriever=retriever),
        )
    )

agent = initialize_agent(
    agent=AgentType.OPENAI_FUNCTIONS,
    tools=tools,
    llm=llm,
    verbose=True,
)

agent({"input": "User's Question"})

I have incorporated create_retrieval_chain as shown below, but I am getting an error.

Code:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

tools = []
system_prompt = (
    "Use the provided context to answer the question. If you don't know the answer, "
    "please state that you don't know. "
    "Keep the answer to-the-point and concise and use three sentences maximum. "
    "Context: {context} "
    "Question: {input}"
)

# Create a chat prompt template
prompt = ChatPromptTemplate.from_messages(
    [("system", system_prompt), ("human", "{input}")]
)

# Create the question-answering chain
question_answer_chain = create_stuff_documents_chain(llm, prompt)
retrieval_chain = create_retrieval_chain(retriever, question_answer_chain)

tools.append(
    Tool(
        args_schema=DocumentInput,
        name=file["name"],
        description=f"useful when you want to answer questions about {file['name']}",
        func=retrieval_chain,
    )
)

Error Log:
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In[1], line 71
     68     question_answer_chain = create_stuff_documents_chain(llm, prompt)
     69     retrieval_chain = create_retrieval_chain(retriever, question_answer_chain)
---> 71     tools.append(Tool(
     72             args_schema=DocumentInput,
     73             name=file["name"],
     74             description=f"useful when you want to answer questions about {file['name']}",
     75             func=retrieval_chain
     76         ))
     78     print(f'{file["name"]} loaded and a new collection {collection_name} created successfully! \n')
     79 else:

File ~\AppData\Local\anaconda3\envs\CSVpython311\Lib\site-packages\langchain_core\tools.py:671, in Tool.__init__(self, name, func, description, **kwargs)
    667 def __init__(
    668     self, name: str, func: Optional[Callable], description: str, **kwargs: Any
    669 ) -> None:
    670     """Initialize tool."""
--> 671     super(Tool, self).__init__(  # type: ignore[call-arg]
    672         name=name, func=func, description=description, **kwargs
    673     )

File ~\AppData\Local\anaconda3\envs\CSVpython311\Lib\site-packages\pydantic\main.py:341, in pydantic.main.BaseModel.__init__()

ValidationError: 1 validation error for Tool
func
  bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x0000020DB54AE850>), config={'run_name': 'retrieve_documents'})
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), config={'run_name': 'format_inputs'})
            | ChatPromptTemplate(input_variables=['context', 'input'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'input'], template="Use the mentioned description for this tool and the provided context to answer the question. If you don't know the answer, please state that you don't know.For questions related to Elm company, please note that 'Elm Company' is the updated name for 'AL ELM INFORMATION SECURITY COMPANY.' Both names can be used interchangeably to retrieve information.Keep the answer to-the-point and concise and use three sentence maximum.Context: {context}Question: {input}")), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])
            | ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x0000020DB463B450>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x0000020DB4C739D0>, model_name='gpt-4', temperature=1.0, model_kwargs={'top_p': 0.7}, openai_api_key=SecretStr('**********'), openai_proxy='')
            | StrOutputParser(), config={'run_name': 'stuff_documents_chain'})
  }) config={'run_name': 'retrieval_chain'} is not callable (type=type_error.callable; value=bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x0000020DB54AE850>), config={'run_name': 'retrieve_documents'})
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), config={'run_name': 'format_inputs'})
            | ChatPromptTemplate(input_variables=['context', 'input'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'input'], template="Use the mentioned description for this tool and the provided context to answer the question. If you don't know the answer, please state that you don't know.For questions related to Elm company, please note that 'Elm Company' is the updated name for 'AL ELM INFORMATION SECURITY COMPANY.' Both names can be used interchangeably to retrieve information.Keep the answer to-the-point and concise and use three sentence maximum.Context: {context}Question: {input}")), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])
            | ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x0000020DB463B450>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x0000020DB4C739D0>, model_name='gpt-4', temperature=1.0, model_kwargs={'top_p': 0.7}, openai_api_key=SecretStr('**********'), openai_proxy='')
            | StrOutputParser(), config={'run_name': 'stuff_documents_chain'})
  }) config={'run_name': 'retrieval_chain'})

I tried defining a custom chain using create_retrieval_chain as a replacement for RetrievalQA.
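In case it helps frame an answer, here is a minimal, self-contained sketch of what I think is going on and the workaround I am considering. The error log says the chain "is not callable": Tool's func field is validated as a plain callable, while the object returned by create_retrieval_chain is a Runnable that exposes .invoke(...) rather than __call__. Wrapping .invoke in a small function should satisfy the validator. Note that FakeRetrievalChain below is a stand-in class I made up to keep the sketch runnable without LangChain; it is not real LangChain code.

```python
from typing import Any, Callable, Dict


class FakeRetrievalChain:
    """Stand-in for the Runnable returned by create_retrieval_chain.

    Like the real chain, it has an `invoke` method but is NOT callable itself,
    which is why passing it directly as Tool(func=...) fails validation.
    """

    def invoke(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        # The real chain returns a dict with (at least) "input" and "answer" keys.
        return {"input": inputs["input"], "answer": f"answer to: {inputs['input']}"}


def make_tool_func(chain: FakeRetrievalChain) -> Callable[[str], str]:
    """Wrap the chain's invoke() in a plain function that Tool(func=...) accepts."""

    def run(question: str) -> str:
        # Map the tool's single string argument to the chain's input dict,
        # and pull out just the answer string for the agent.
        return chain.invoke({"input": question})["answer"]

    return run


retrieval_chain = FakeRetrievalChain()
tool_func = make_tool_func(retrieval_chain)
print(tool_func("What was Q1 revenue?"))  # -> answer to: What was Q1 revenue?
```

With the real chain, the idea would be `func=lambda q: retrieval_chain.invoke({"input": q})["answer"]` inside the Tool(...) call, but I have not confirmed this is the recommended pattern.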