I am working on a RAG application that provides a custom GPT bot service. I store the file URLs that GPT uses to answer the user's query.

I store the embeddings for each bot_id separately and retrieve them based on the bot_id in use.

When the user changes the file URLs, I delete the existing ChromaDB folder for that bot and recreate the embeddings from the new file URLs. Recreating the embeddings fails with the following error:
    Traceback (most recent call last):
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/client.py", line 438, in _validate_tenant_database
        self._admin_client.get_tenant(name=tenant)
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/client.py", line 486, in get_tenant
        return self._server.get_tenant(name=name)
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/segment.py", line 140, in get_tenant
        return self._sysdb.get_tenant(name=name)
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/mixins/sysdb.py", line 125, in get_tenant
        with self.tx() as cur:
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/impl/sqlite.py", line 131, in tx
        return TxWrapper(self._conn_pool, stack=self._tx_stack)
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/impl/sqlite.py", line 31, in __init__
        self._conn = conn_pool.connect()
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/impl/sqlite_pool.py", line 141, in connect
        new_connection = Connection(
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/db/impl/sqlite_pool.py", line 20, in __init__
        self._conn = sqlite3.connect(
    sqlite3.OperationalError: unable to open database file

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 1463, in wsgi_app
        response = self.full_dispatch_request()
      File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 872, in full_dispatch_request
        rv = self.handle_user_exception(e)
      File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 870, in full_dispatch_request
        rv = self.dispatch_request()
      File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 855, in dispatch_request
        return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
      File "/home/ubuntu/chatbot/main.py", line 460, in qa
        message = storeEmbeddings(embedding_model, raw_text, bot_id)
      File "/home/ubuntu/chatbot/embeddings.py", line 12, in storeEmbeddings
        db = Chroma.from_documents(
      File "/home/ubuntu/.local/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 778, in from_documents
        return cls.from_texts(
      File "/home/ubuntu/.local/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 714, in from_texts
        chroma_collection = cls(
      File "/home/ubuntu/.local/lib/python3.10/site-packages/langchain_community/vectorstores/chroma.py", line 120, in __init__
        self._client = chromadb.Client(_client_settings)
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/__init__.py", line 274, in Client
        return ClientCreator(tenant=tenant, database=database, settings=settings)
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/client.py", line 144, in __init__
        self._validate_tenant_database(tenant=tenant, database=database)
      File "/home/ubuntu/.local/lib/python3.10/site-packages/chromadb/api/client.py", line 447, in _validate_tenant_database
        raise ValueError(
    ValueError: Could not connect to tenant default_tenant. Are you sure it exists?

It seems to still be trying to access the old ChromaDB for that bot, even though the folder was deleted successfully. I deleted the folder using:
    import shutil
    shutil.rmtree("Embeddings/1001")

The function that creates and stores the embeddings:
    def storeEmbeddings(embedding, text, bot_id, embedding_folder):
        try:
            text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
            texts = text_splitter.create_documents([text])
            db = Chroma.from_documents(
                texts,
                embedding,
                persist_directory=embedding_folder+"//"+bot_id,
                client_settings=Settings(anonymized_telemetry=False,is_persistent=True,),
            )
            return sucessMessage
        except Exception as e:
            return str(e)

The strangest thing is that when I stop and restart the Python app at this point, it recreates the embeddings for this bot.
What's the best way to delete the existing ChromaDB embeddings and create new ones for the new documents?
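For reference, the innermost exception in the traceback above, `sqlite3.OperationalError: unable to open database file`, is the message the standard `sqlite3` module gives when the directory that should hold the database file does not exist. The sketch below reproduces only that behaviour with the standard library. The temporary directory, the file name `example.sqlite3`, and the helper `try_open` are made up for illustration, and whether this is exactly what happens inside ChromaDB after the folder is removed is an assumption, not something the traceback proves.

```python
import os
import sqlite3
import tempfile


def try_open(db_path):
    """Return None if the database opens, else the sqlite3 error message."""
    try:
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        conn.close()
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


# Hypothetical layout that mirrors "Embeddings/1001" from the question.
root = tempfile.mkdtemp()
bot_dir = os.path.join(root, "Embeddings", "1001")
db_path = os.path.join(bot_dir, "example.sqlite3")  # illustrative file name

# The parent directory does not exist yet, so sqlite3 cannot create the file.
missing_dir_error = try_open(db_path)

# Once the directory exists, the same path opens without error.
os.makedirs(bot_dir)
after_mkdir_error = try_open(db_path)

print(missing_dir_error)
print(after_mkdir_error)
```

If the same condition applies here, the thing to check would be whether the directory passed as `persist_directory` actually exists at the moment the database connection is opened.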
Source: https://stackoverflow.com/questions/780 ... -open-data