Commit e4949603 Author: Sun Junhua

Remove redundant code; redraw the flowcharts

Parent 9e1e4d70
# Contribution Guide
Welcome! We're a pretty friendly community, and we're thrilled that you want to help make this app even better. However, we ask that you follow some general guidelines to keep things organized around here.
1. Make sure an [issue](https://github.com/imClumsyPanda/langchain-ChatGLM/issues) is created for the bug you're about to fix, or feature you're about to add. Keep them as small as possible.
2. Please use `git pull --rebase` to fetch upstream updates and rebase your work on top of them.
3. Squash your work into well-formatted commits. Mention the issue being resolved in the commit message on a line all by itself, like `Fixes #<bug>` (refer to [Linking a pull request to an issue](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue) for more keywords you can use).
4. Push into `dev`. Mention which bug is being resolved in the description. A typical cycle is sketched below.
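In practice, a cycle under these guidelines might look like the following sketch (the `upstream` remote name and the number of commits to squash are illustrative assumptions, not project requirements):

```shell
# Fetch the latest upstream changes and rebase your work on top of them
git pull --rebase upstream dev

# Squash work-in-progress commits into one well-formatted commit
git rebase -i HEAD~3

# Write the commit message with the issue reference on its own line, e.g.:
#
#   Fix OCR failure on scanned PDFs
#
#   Fixes #123

git push origin dev
```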
# Local Knowledge Base Q&A Application Based on ChatGLM and Other Large Language Models

## Introduction

🌍 [_READ THIS IN ENGLISH_](README_en.md)

🤖️ A question answering application built on local knowledge bases using the ideas of [langchain](https://github.com/hwchase17/langchain). The goal is a knowledge base Q&A solution that is friendly to Chinese scenarios and open-source models and can run fully offline.

💡 Inspired by [GanymedeNil](https://github.com/GanymedeNil)'s project [document.ai](https://github.com/GanymedeNil/document.ai) and the [ChatGLM-6B Pull Request](https://github.com/THUDM/ChatGLM-6B/pull/216) created by [AlexZhangji](https://github.com/AlexZhangji), this project builds a local knowledge base Q&A application whose whole pipeline can run on open-source models. It now supports connecting [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B) and other large language models directly, or connecting Vicuna, Alpaca, LLaMA, Koala, RWKV and other models through the [fastchat](https://github.com/lm-sys/FastChat) API.

✅ By default this project uses [GanymedeNil/text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese/tree/main) for embeddings and [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B) as the LLM. Relying on these models, the whole stack can be **privately deployed offline** using only **open-source** models.

⛓️ The implementation works as shown in the figure below: load files -> read text -> split text -> vectorize text -> vectorize the question -> match the `top k` text vectors most similar to the question vector -> add the matched text to the `prompt` as context together with the question -> submit to the `LLM` to generate an answer.
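As a rough, framework-free illustration of the retrieval step (this is not the project's code; `embed` is a stand-in for a real embedding model such as text2vec-large-chinese):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in: deterministic pseudo-random unit vector per text
    rng = np.random.default_rng(abs(hash(text)) % (2 ** 32))
    v = rng.standard_normal(768)
    return v / np.linalg.norm(v)

def top_k_chunks(question: str, chunks: list, k: int = 3) -> list:
    # Vectorize each chunk and the question, then rank by cosine similarity
    q = embed(question)
    scores = [float(embed(c) @ q) for c in chunks]
    order = sorted(range(len(chunks)), key=lambda i: -scores[i])
    return [chunks[i] for i in order[:k]]

def build_prompt(question: str, context: list) -> str:
    # The matched chunks become the context sent to the LLM together with the question
    return "已知信息:\n" + "\n".join(context) + f"\n\n问题是:{question}"

chunks = ["第一段文本……", "第二段文本……", "第三段文本……"]
print(build_prompt("示例问题?", top_k_chunks("示例问题?", chunks)))
```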
📺 [Introduction video](https://www.bilibili.com/video/BV13M4y1e7cN/?share_source=copy_web&vd_source=e6c5aafe684f30fbe41925d61ca6d514)

![Implementation diagram](img/langchain+chatglm.png)

From the document-processing perspective, the flow is as follows:

![Implementation diagram 2](img/langchain+chatglm2.png)

🚩 This project does not involve any fine-tuning or training, but both can be used to improve its results.

🐳 Docker image: registry.cn-beijing.aliyuncs.com/isafetech/chatmydata:1.0 (thanks to @InkSong🌲)

💻 To run: docker run -d -p 80:7860 --gpus all registry.cn-beijing.aliyuncs.com/isafetech/chatmydata:1.0

🌐 [AutoDL image](https://www.codewithgpu.com/i/imClumsyPanda/langchain-ChatGLM/langchain-ChatGLM)

📓 [Run the project online on ModelWhale](https://www.heywhale.com/mw/project/643977aa446c45f4592a1e59)

## Changelog

See the [release notes](https://github.com/imClumsyPanda/langchain-ChatGLM/releases).

## Hardware Requirements
...@@ -149,22 +117,6 @@ $ pnpm i
$ npm run dev
```
The VUE front end looks like this:

1. `Chat` page

![](img/vue_0521_0.png)

2. `Knowledge Base Q&A` page

![](img/vue_0521_1.png)

3. `Bing Search` page

![](img/vue_0521_2.png)

The WebUI looks like this:

1. `Chat` tab

![](img/webui_0521_0.png)

2. `Knowledge Base Test (Beta)` tab

![](img/webui_0510_1.png)

3. `Model Configuration` tab

![](img/webui_0510_2.png)

The Web UI supports the following features:

1. Before launch, the Web UI automatically reads the `LLM` and `Embedding` model lists and the default model settings from `configs/model_config.py` and loads the models. To reload, select new models in the `Model Configuration` tab and click `Reload Model`;
...@@ -207,55 +159,3 @@ The Web UI supports the following features:
>3. Increase data volume: train ChatGLM-6B on a larger dataset to improve its performance.
>4. Introduce more evaluation metrics: additional metrics can expose the shortcomings and limitations of ChatGLM-6B.
>5. Improve the model architecture: a refined architecture can raise ChatGLM-6B's performance, for example a larger neural network or an improved convolutional structure.
## Roadmap

- [ ] Langchain applications
  - [x] Ingest unstructured documents (md, pdf, docx and txt are supported)
  - [x] OCR text recognition for jpg and png images
  - [x] Search engine integration
  - [ ] Local web page ingestion
  - [ ] Structured data ingestion (csv, Excel, SQL, etc.)
  - [ ] Knowledge graph / graph database integration
  - [ ] Agent implementation
- [x] Support more LLMs
  - [x] [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b)
  - [x] [THUDM/chatglm-6b](https://huggingface.co/THUDM/chatglm-6b)
  - [x] [THUDM/chatglm-6b-int8](https://huggingface.co/THUDM/chatglm-6b-int8)
  - [x] [THUDM/chatglm-6b-int4](https://huggingface.co/THUDM/chatglm-6b-int4)
  - [x] [THUDM/chatglm-6b-int4-qe](https://huggingface.co/THUDM/chatglm-6b-int4-qe)
  - [x] [ClueAI/ChatYuan-large-v2](https://huggingface.co/ClueAI/ChatYuan-large-v2)
  - [x] [fnlp/moss-moon-003-sft](https://huggingface.co/fnlp/moss-moon-003-sft)
  - [x] [bigscience/bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1)
  - [x] [bigscience/bloom-3b](https://huggingface.co/bigscience/bloom-3b)
  - [x] [baichuan-inc/baichuan-7B](https://huggingface.co/baichuan-inc/baichuan-7B)
  - [x] [lmsys/vicuna-13b-delta-v1.1](https://huggingface.co/lmsys/vicuna-13b-delta-v1.1)
  - [x] Calling LLMs through the [fastchat](https://github.com/lm-sys/FastChat) API
- [x] Support more embedding models
  - [x] [nghuyong/ernie-3.0-nano-zh](https://huggingface.co/nghuyong/ernie-3.0-nano-zh)
  - [x] [nghuyong/ernie-3.0-base-zh](https://huggingface.co/nghuyong/ernie-3.0-base-zh)
  - [x] [shibing624/text2vec-base-chinese](https://huggingface.co/shibing624/text2vec-base-chinese)
  - [x] [GanymedeNil/text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese)
  - [x] [moka-ai/m3e-small](https://huggingface.co/moka-ai/m3e-small)
  - [x] [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base)
- [ ] Web UI
  - [x] Gradio-based Web UI demo
  - [x] Streamlit-based Web UI demo
  - [x] Output and error messages
  - [x] Source citation display
  - [ ] Knowledge base management
    - [x] Select a knowledge base to start Q&A
    - [x] Upload files/folders to a knowledge base
    - [x] Knowledge base testing
    - [x] Delete files from a knowledge base
  - [x] Search-engine-based Q&A
- [ ] API support
  - [x] API deployment via fastapi
  - [ ] Web UI demo that calls the API
  - [x] VUE front end
## Project WeChat Group

<img src="img/qr_code_47.jpg" alt="QR code" width="300" height="300" />

🎉 This is the WeChat group for the langchain-ChatGLM project. If you are also interested in the project, you are welcome to join the group chat and take part in the discussion.
...@@ -108,7 +108,7 @@ async def upload_file(
        knowledge_base_id: str = Form(..., description="Knowledge Base Name", example="kb1"),
):
    if not validate_kb_name(knowledge_base_id):
        return BaseResponse(code=403, msg="知识库不存在", data=[])

    saved_path = get_doc_path(knowledge_base_id)
    if not os.path.exists(saved_path):
...@@ -125,7 +125,7 @@ async def upload_file(
        f.write(file_content)

    vs_path = get_vs_path(knowledge_base_id)
    vs_path, loaded_files = local_doc_qa.init_knowledge_vector_store([file_path], vs_path, knowledge_base_id)
    if len(loaded_files) > 0:
        file_status = f"文件 {file.filename} 已上传至新的知识库,并已加载知识库,请开始提问。"
        return BaseResponse(code=200, msg=file_status)
...@@ -158,7 +158,7 @@ async def upload_files(
            filelist.append(file_path)
    if filelist:
        vs_path = get_vs_path(knowledge_base_id)
        vs_path, loaded_files = local_doc_qa.init_knowledge_vector_store(filelist, vs_path, knowledge_base_id)
        if len(loaded_files):
            file_status = f"documents {', '.join([os.path.split(i)[-1] for i in loaded_files])} upload success"
            return BaseResponse(code=200, msg=file_status)
...@@ -175,9 +175,7 @@ async def list_kbs():
        folder
        for folder in os.listdir(KB_ROOT_PATH)
        if os.path.isdir(os.path.join(KB_ROOT_PATH, folder))
    ]
    return ListDocsResponse(data=all_doc_ids)
...@@ -315,23 +313,28 @@ async def local_doc_chat(
    ),
):
    vs_path = get_vs_path(knowledge_base_id)
    # if not os.path.exists(vs_path):
    #     # return BaseResponse(code=404, msg=f"Knowledge base {knowledge_base_id} not found")
    #     return ChatMessage(
    #         question=question,
    #         response=f"Knowledge base {knowledge_base_id} not found",
    #         history=history,
    #         source_documents=[],
    #     )
    # else:
    resp = {}
    for resp, history in local_doc_qa.get_knowledge_based_answer(
            query=question, knowledge_base_id=knowledge_base_id, chat_history=history, streaming=True
    ):
        pass
    source_documents = []
    if len(resp) > 0:
        source_documents = [
            f"""出处 [{inum + 1}] {os.path.split(doc.metadata['source'])[-1]}:\n\n{doc.page_content}\n\n"""
            # f"""相关度:{doc.metadata['score']}\n\n"""
            for inum, doc in enumerate(resp["source_documents"])
        ]
    return ChatMessage(
...@@ -418,7 +421,7 @@ async def stream_chat(websocket: WebSocket):
            last_print_len = 0
            for resp, history in local_doc_qa.get_knowledge_based_answer(
                    query=question, knowledge_base_id=knowledge_base_id, chat_history=history, streaming=True
            ):
                await asyncio.sleep(0)
                await websocket.send_text(resp["result"][last_print_len:])
......
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from vectorstores import MyFAISS, MyMilvus
from langchain.document_loaders import UnstructuredFileLoader, TextLoader, CSVLoader
from configs.model_config import *
import datetime
from textsplitter import ChineseTextSplitter
from typing import List, Optional
from utils import torch_gc
from tqdm import tqdm
from pypinyin import lazy_pinyin
...@@ -18,6 +18,9 @@ from langchain.docstore.document import Document
from functools import lru_cache
from textsplitter.zh_title_enhance import zh_title_enhance
from langchain.chains.base import Chain
import json
import re
from sinan import Sinan
# patch HuggingFaceEmbeddings to make it hashable
...@@ -58,7 +61,6 @@ def tree(filepath, ignore_dir_names=None, ignore_file_names=None):
def load_file(filepath, sentence_size=SENTENCE_SIZE, using_zh_title_enhance=ZH_TITLE_ENHANCE):
    if filepath.lower().endswith(".md"):
        loader = UnstructuredFileLoader(filepath, mode="elements")
        docs = loader.load()
...@@ -113,6 +115,12 @@ def generate_prompt(related_docs: List[str],
    return prompt


def extract_qa(query: str,
               prompt_template: str = EXTRACT_QA, ) -> str:
    prompt = prompt_template.replace("{question}", query)
    return prompt
def search_result2docs(search_results):
    docs = []
    for result in search_results:
...@@ -145,6 +153,7 @@ class LocalDocQA:
    def init_knowledge_vector_store(self,
                                    filepath: str or List[str],
                                    vs_path: str or os.PathLike = None,
                                    knowledge_base_id: str = None,
                                    sentence_size=SENTENCE_SIZE):
        loaded_files = []
        failed_files = []
...@@ -189,6 +198,16 @@ class LocalDocQA:
                    logger.info(f"{file} 未能成功加载")
        if len(docs) > 0:
            logger.info("文件加载完毕,正在生成向量库")
            if VECTOR_STORE == "milvus":
                MyMilvus.from_texts(docs, embedding=self.embeddings, collection_name=knowledge_base_id)
                # if MyMilvus.exist_collection(connection_args=MILVUS_CONNECTION_ARGS, collection_name=knowledge_base_id):
                # write the documents into milvus
                # milvus = MyMilvus.from_existing(embedding=self.embeddings, connection_args=None,
                #                                 collection_name=knowledge_base_id, text_field=None)
                # milvus.add_documents(docs)
                torch_gc()
            else:
                if vs_path and os.path.isdir(vs_path) and "index.faiss" in os.listdir(vs_path):
                    vector_store = load_vector_store(vs_path, self.embeddings)
                    vector_store.add_documents(docs)
...@@ -200,8 +219,8 @@ class LocalDocQA:
                                           "vector_store")
                    vector_store = MyFAISS.from_documents(docs, self.embeddings)  # docs is a list of Document
                torch_gc()
                vector_store.save_local(vs_path)
            return vs_path, loaded_files
        else:
            logger.info("文件均未成功加载,请检查依赖包或替换为其他文件再次上传。")
...@@ -229,18 +248,87 @@ class LocalDocQA:
            logger.error(e)
            return None, [one_title]
    def get_knowledge_based_answer(self, query, knowledge_base_id, chat_history=[], streaming: bool = STREAMING):
        related_docs_with_score = []
        if VECTOR_STORE == "milvus":
            if MyMilvus.exist_collection(connection_args=MILVUS_CONNECTION_ARGS, collection_name=knowledge_base_id):
                # connect to the existing collection
                vector_store = MyMilvus.from_existing(embedding=self.embeddings, connection_args=None,
                                                      collection_name=knowledge_base_id, text_field=None)
                import addressparser
                df = addressparser.transform([query])
                province = df.loc[0][0]
                city = df.loc[0][1]
                year_pattern = r'\d+年'
                year_extractor = re.compile(year_pattern)
                match = year_extractor.search(query)
                year = ""
                if match:
                    year = match.group(0).replace("年", "")
                else:
                    si = Sinan(query)
                    result = si.parse()
                    if result:
                        date_str = result["datetime"][0]
                        year = date_str[:4]
                expr = ""
                if year != "":
                    expr = "year == '" + year + "'"
                if province != "" and city == "":
                    if expr != "":
                        expr += " && "
                    expr += "province == '" + province + "'"
                if city != "":
                    if expr != "":
                        expr += " && "
                    expr += "city == '" + city + "'"
                # Alternative: have the LLM extract year, province and city from the question
                # pre_promot = extract_qa(query)
                #
                # year__result = self.llm_model_chain(
                #     {"prompt": pre_promot, "history": [], "streaming": streaming})
                # expr = ""
                # resp = ""
                # for answer_result in year__result['answer_result_stream']:
                #     resp = answer_result.llm_output["answer"]
                #
                # resp = resp.replace("\n", "")
                # pattern = re.compile(r'\{.+\}')
                #
                # match = pattern.search(resp)
                # if match:
                #     text = match.group()
                #     json_obj = json.loads(text)
                #     if "year" in json_obj and json_obj["year"] != "":
                #         expr = "year == '" + json_obj["year"] + "'"
                #     if "province" in json_obj and json_obj["province"] != "" and json_obj["city"] == "":
                #         if expr != "":
                #             expr += " && "
                #         expr += "province == '" + json_obj["province"] + "'"
                #     if "city" in json_obj and json_obj["city"] != "":
                #         if expr != "":
                #             expr += " && "
                #         expr += "city == '" + json_obj["city"] + "'"
                vector_store.chunk_size = self.chunk_size
                vector_store.chunk_conent = self.chunk_conent
                vector_store.score_threshold = self.score_threshold
                related_docs_with_score = vector_store.similarity_search_with_score(query, k=self.top_k, expr=expr)
        else:
            vector_store = load_vector_store(vs_path, self.embeddings)
            vector_store.chunk_size = self.chunk_size
            vector_store.chunk_conent = self.chunk_conent
            vector_store.score_threshold = self.score_threshold
            related_docs_with_score = vector_store.similarity_search_with_score(query, k=self.top_k)
        torch_gc()
        if len(related_docs_with_score) > 0:
            prompt = generate_prompt(related_docs_with_score, query)
            answer_result_stream_result = self.llm_model_chain(
                {"prompt": prompt, "history": chat_history, "streaming": streaming})
...@@ -252,6 +340,27 @@ class LocalDocQA:
                response = {"query": query,
                            "result": resp,
                            "source_documents": related_docs_with_score}
                yield response, history
        else:
            resp = {"query": query,
                    "result": "知识库中暂无相关内容",
                    "source_documents": related_docs_with_score}
            history = []
            yield resp, history
            # prompt = query
            # answer_result_stream_result = self.llm_model_chain(
            #     {"prompt": prompt, "history": chat_history, "streaming": streaming})
            #
            # for answer_result in answer_result_stream_result['answer_result_stream']:
            #     resp = answer_result.llm_output["answer"]
            #     history = answer_result.history
            #     history[-1][0] = query
            #     response = {"query": query,
            #                 "result": resp,
            #                 "source_documents": related_docs_with_score}
            #     yield response, history

    # query: the question text
    # vs_path: path to the knowledge base
......
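For reference, after this change callers pass a `knowledge_base_id` instead of a `vs_path`, and consume the streaming generator the same way `api.py` does above. A minimal sketch (assuming a `local_doc_qa` instance has already been initialized elsewhere):

```python
# Drain the streaming generator; the last yielded resp holds the final answer
history = []
resp = {}
for resp, history in local_doc_qa.get_knowledge_based_answer(
        query="2022年云南省的病虫情报?",
        knowledge_base_id="kb1",
        chat_history=history,
        streaming=True):
    pass

if resp:
    print(resp["result"])
    for doc in resp.get("source_documents", []):
        print(doc.metadata.get("source"))
```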
...@@ -18,11 +18,11 @@ embedding_model_dict = {
    "text2vec-base": "shibing624/text2vec-base-chinese",
    "text2vec": "GanymedeNil/text2vec-large-chinese",
    "m3e-small": "moka-ai/m3e-small",
    "m3e-base": "D:/project/models/m3e-base",
}

# Embedding model name
EMBEDDING_MODEL = "m3e-base"

# Embedding running device
EMBEDDING_DEVICE = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
...@@ -66,10 +66,16 @@ llm_model_dict = {
        "local_model_path": None,
        "provides": "ChatGLMLLMChain"
    },
    "chatglm2-6b-32k": {
        "name": "chatglm2-6b-32k",
        "pretrained_model_name": "THUDM/chatglm2-6b-32k",
        "local_model_path": "D:/project/models/chatglm2-6b-32k",
        "provides": "ChatGLMLLMChain"
    },
    "chatglm2-6b": {
        "name": "chatglm2-6b",
        "pretrained_model_name": "THUDM/chatglm2-6b",
        "local_model_path": "D:/project/models/chatglm2-6b",
        "provides": "ChatGLMLLMChain"
    },
    "chatglm2-6b-int4": {
...@@ -212,7 +218,7 @@ llm_model_dict = {
}

# LLM name
LLM_MODEL = "chatglm2-6b-32k"
# Load the model quantized to 8 bit
LOAD_IN_8BIT = False
# Load the model with bfloat16 precision. Requires NVIDIA Ampere GPU.
...@@ -236,12 +242,6 @@ LLM_DEVICE = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
# Default storage path for knowledge bases
KB_ROOT_PATH = os.path.join(os.path.dirname(os.path.dirname(__file__)), "knowledge_base")

# Number of cached knowledge bases; with ChatGLM2, ChatGLM2-int4 or ChatGLM2-int8, try '10' if retrieval quality is poor
CACHED_VS_NUM = 1
...@@ -252,10 +252,10 @@ SENTENCE_SIZE = 100
CHUNK_SIZE = 250

# Length of chat history passed to the LLM
LLM_HISTORY_LEN = 0

# Number of matched chunks returned by knowledge base retrieval
VECTOR_SEARCH_TOP_K = 1

# Relevance score threshold for retrieval, roughly in the range 0-1100; 0 disables it. Around 500 is recommended; in testing, values below 500 give more precise matches
VECTOR_SEARCH_SCORE_THRESHOLD = 500
...@@ -293,3 +293,46 @@ BING_SUBSCRIPTION_KEY = ""
# Use title detection to decide which texts are titles and flag them in metadata;
# then join each text with its parent title to enrich the text.
ZH_TITLE_ENHANCE = False
# Vector store selection
VECTOR_STORE = "milvus"

# Milvus configuration
MILVUS_CONNECTION_ARGS = {
    "host": "10.51.0.90",
    "port": "19530",
}
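To check that these settings point at a reachable Milvus server, a small sketch using the standard pymilvus client (pymilvus itself is an assumption here; it is not listed in this diff's requirements):

```python
from pymilvus import connections, utility

# Connect using the same host/port configured above
connections.connect(alias="default", **MILVUS_CONNECTION_ARGS)

# Each knowledge base is expected to map to one collection
print(utility.list_collections())
```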
# Context-based prompt template; be sure to keep "{question}" and "{context}"
PROMPT_TEMPLATE = """已知信息:
{context}

根据上述已知信息,简洁和专业的来回答用户的问题。如果无法从中得到答案,请说 “根据已知信息无法回答该问题” 或 “没有提供足够的相关信息”,不允许在答案中添加编造成分,答案请使用中文。 问题是:{question}"""
EXTRACT_QA = """
你是文字提取器,你要结构化的提取并补全用户描述中的年份、省、市/地级行政区,没有年份、省、市/地级行政区则使用空值("")代替。
年份(year)指:用户输入中有含义的年份信息,如:2022、2021、""等
省(province):用户输入中有含义的省一级信息,如:云南省、山西省、""等
市/地级行政区(city):用户输入中有含义的市一级信息,如:昆明市、成都市、""等
只能输出JSON格式,输出完毕后结束,不要生成新的用户输入,不要增加额外内容
示例模板:
用户输入:昆明市病虫情报第1期。
{
"年份":"",
"省":"云南省",
"市/地级行政区":"昆明市"
}
用户输入:蝗虫的发生情况。
{
"年份":"",
"省":"",
"市/地级行政区":""
}
请根据以下文本,按照模版格式输出内容。
用户输入:{question}
"""
## Changelog

**[2023/04/15]**

1. Restructure the project; keep the command line demo [cli_demo.py](../cli_demo.py) and the Web UI demo [webui.py](../webui.py) in the root directory;
2. Improve the Web UI: after launch it now first loads models according to the defaults in [configs/model_config.py](../configs/model_config.py), and error messages have been added;
3. Expand the FAQ.

**[2023/04/12]**

1. Replace the sample files in the Web UI to avoid files that Ubuntu cannot read due to encoding issues;
2. Replace the prompt template in `knowledge_based_chatglm.py` to prevent garbled ChatGLM output caused by a bilingual Chinese/English prompt.

**[2023/04/11]**

1. Add Web UI V0.1 (thanks to [@liangtongt](https://github.com/liangtongt));
2. Add FAQs to `README.md` (thanks to [@calcitem](https://github.com/calcitem) and [@bolongliu](https://github.com/bolongliu));
3. Automatically detect whether `cuda`, `mps` or `cpu` is available as the runtime device for the LLM and embedding models;
4. Add a check on `filepath` in `knowledge_based_chatglm.py`: besides a single file, a folder path is now also accepted; every file inside is traversed, and the command line reports whether each file loaded successfully.

**[2023/04/09]**

1. Use `RetrievalQA` from `langchain` instead of the previous `ChatVectorDBChain`, which noticeably reduces the out-of-memory stops that used to occur after two or three questions;
2. Add settings for `EMBEDDING_MODEL`, `VECTOR_SEARCH_TOP_K`, `LLM_MODEL`, `LLM_HISTORY_LEN` and `REPLY_WITH_SOURCE` in `knowledge_based_chatglm.py`;
3. Add `chatglm-6b-int4` and `chatglm-6b-int4-qe`, which need less GPU memory, as LLM options;
4. Fix code errors in `README.md` (thanks to [@calcitem](https://github.com/calcitem)).

**[2023/04/07]**

1. Fix the doubled GPU memory usage when loading the ChatGLM model (thanks to [@suc16](https://github.com/suc16) and [@myml](https://github.com/myml));
2. Add a GPU memory cleanup mechanism;
3. Add `nghuyong/ernie-3.0-nano-zh` and `nghuyong/ernie-3.0-base-zh` as embedding model options; they use less GPU memory than `GanymedeNil/text2vec-large-chinese` (thanks to [@lastrei](https://github.com/lastrei)).
## Issue with Installing Packages Using pip in Anaconda
## Problem
Recently, while running open-source code, I ran into an issue: after creating a virtual environment with conda and switching to it, packages installed with pip were "ineffective". Here, "ineffective" means that the packages installed with pip did not actually end up in the new environment.
------
## Analysis
1. First, create a test environment called test: `conda create -n test`
2. Activate the test environment: `conda activate test`
3. Use pip to install numpy: `pip install numpy`. You'll find that numpy already exists in the default environment.
```powershell
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: numpy in c:\programdata\anaconda3\lib\site-packages (1.20.3)
```
4. Check the information of pip: `pip show pip`
```powershell
Name: pip
Version: 21.2.4
Summary: The PyPA recommended tool for installing Python packages.
Home-page: https://pip.pypa.io/
Author: The pip developers
Author-email: distutils-sig@python.org
License: MIT
Location: c:\programdata\anaconda3\lib\site-packages
Requires:
Required-by:
```
5. We can see that the current pip lives in the default conda environment. This explains why packages installed directly with pip do not land in the new virtual environment: the pip being used belongs to the default environment, so the package either already exists there or gets installed straight into it.
------
## Solution
1. We can directly use the conda command to install new packages, but sometimes conda may not have certain packages/libraries, so we still need to use pip to install.
2. We can first use the conda command to install the pip package for the current virtual environment, and then use pip to install new packages.
```powershell
# Use conda to install the pip package
(test) PS C:\Users\Administrator> conda install pip
Collecting package metadata (current_repodata.json): done
Solving environment: done
....
done
# Display the information of the current pip, and find that pip is in the test environment
(test) PS C:\Users\Administrator> pip show pip
Name: pip
Version: 21.2.4
Summary: The PyPA recommended tool for installing Python packages.
Home-page: https://pip.pypa.io/
Author: The pip developers
Author-email: distutils-sig@python.org
License: MIT
Location: c:\programdata\anaconda3\envs\test\lib\site-packages
Requires:
Required-by:
# Now use pip to install the numpy package, and it is installed successfully
(test) PS C:\Users\Administrator> pip install numpy
Looking in indexes:
https://pypi.tuna.tsinghua.edu.cn/simple
Collecting numpy
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/4b/23/140ec5a509d992fe39db17200e96c00fd29603c1531ce633ef93dbad5e9e/numpy-1.22.2-cp39-cp39-win_amd64.whl (14.7 MB)
Installing collected packages: numpy
Successfully installed numpy-1.22.2
# Use pip list to view the currently installed packages, no problem
(test) PS C:\Users\Administrator> pip list
Package Version
------------ ---------
certifi 2021.10.8
numpy 1.22.2
pip 21.2.4
setuptools 58.0.4
wheel 0.37.1
wincertstore 0.2
```
## Supplement
1. The reason I didn't notice this problem before is probably that the packages I installed in virtual environments pinned specific versions, which overwrote the packages in the default environment. The real issue was a lack of careful observation :) ; otherwise I would have noticed `Successfully uninstalled numpy-xxx` (the **default version**) followed by `Successfully installed numpy-1.20.3` (the **specified version**).
2. During testing, I found that this issue shouldn't occur if the Python version is specified when creating the environment. My guess is that pip then gets installed into the virtual environment, whereas in our case no packages at all were installed, pip included, so the default environment's pip was used.
3. One open question: I believe I specified the Python version when creating virtual environments before and still ended up with the default environment's pip. However, I couldn't reproduce the issue on two different machines, which is what led to point 2 above.
4. After running into the problem in point 3, I solved it with `python -m pip install package-name`, adding `python -m` before pip. For the reason why, see this answer on [StackOverflow](https://stackoverflow.com/questions/41060382/using-pip-to-install-packages-to-anaconda-environment):
>1. If you have a non-conda pip as your default pip but conda python as your default python (as below):
>
>```shell
>>which -a pip
>/home/<user>/.local/bin/pip
>/home/<user>/.conda/envs/newenv/bin/pip
>/usr/bin/pip
>
>>which -a python
>/home/<user>/.conda/envs/newenv/bin/python
>/usr/bin/python
>```
>
>2. Then, instead of calling `pip install <package>` directly, you can use the module flag -m in python so that it installs with the anaconda python
>
>```shell
>python -m pip install <package>
>```
>
>3. This installs the package into the anaconda library directory rather than the library directory associated with the (non-anaconda) pip.
>4. The reason: the pip command references a specific pip file/shortcut (`which -a pip` tells you which one). Similarly, the python command references a specific python file (`which -a python` tells you which one). For one reason or another these two commands can fall out of sync, so your "default" pip lives in a different folder than your default python and is therefore associated with a different version of python.
>5. By contrast, the `python -m pip` construct does not use the shortcut that the pip command points to. Instead, it asks python to find its own pip version and uses that one to install the package.
## Command Line Tool

Windows: cli.bat
Linux: cli.sh

## Command List

### LLM management

List supported LLMs:

```shell
cli.bat llm ls
```

### Embedding management

List supported embedding models:

```shell
cli.bat embedding ls
```

### Start-up management

Show start-up options:

```shell
cli.bat start
```

Start the interactive command line:

```shell
cli.bat start cli
```

Start the web UI:

```shell
cli.bat start webui
```

Start the API service:

```shell
cli.bat start api
```
from langchain.docstore.document import Document
import feedparser
import html2text
import ssl
import time
import warnings


class RSS_Url_loader:
    def __init__(self, urls=None, interval=60):
        '''Accepts a single url string or a list of urls.'''
        self.urls = []
        self.interval = interval
        if urls is not None:
            try:
                if isinstance(urls, str):
                    urls = [urls]
                elif isinstance(urls, list):
                    pass
                else:
                    raise TypeError('urls must be a list or a string.')
                self.urls = urls
            except TypeError:
                warnings.warn('urls must be a list or a string.')

    # The scheduling code may need to pull in other classes; not exposed publicly for now.
    def scheduled_execution(self):
        # Yield a fresh batch of documents every `interval` seconds
        while True:
            docs = self.load()
            yield docs
            time.sleep(self.interval)

    def load(self):
        if hasattr(ssl, '_create_unverified_context'):
            ssl._create_default_https_context = ssl._create_unverified_context
        documents = []
        for url in self.urls:
            parsed = feedparser.parse(url)
            for entry in parsed.entries:
                if "content" in entry:
                    data = entry.content[0].value
                else:
                    data = entry.description or entry.summary
                data = html2text.html2text(data)
                metadata = {"title": entry.title, "link": entry.link}
                documents.append(Document(page_content=data, metadata=metadata))
        return documents


if __name__ == "__main__":
    # The urls should come from the config file or be provided in the UI.
    urls = ["https://www.zhihu.com/rss", "https://www.36kr.com/feed"]
    loader = RSS_Url_loader(urls)
    docs = loader.load()
    for doc in docs:
        print(doc)
from .image_loader import UnstructuredPaddleImageLoader
from .pdf_loader import UnstructuredPaddlePDFLoader
from .dialogue import (
Person,
Dialogue,
Turn,
DialogueLoader
)
__all__ = [
"UnstructuredPaddleImageLoader",
"UnstructuredPaddlePDFLoader",
"DialogueLoader",
]
import json
from abc import ABC
from typing import List
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
class Person:
def __init__(self, name, age):
self.name = name
self.age = age
class Dialogue:
"""
Build an abstract dialogue model using classes and methods to represent different dialogue elements.
This class serves as a fundamental framework for constructing dialogue models.
"""
def __init__(self, file_path: str):
self.file_path = file_path
self.turns = []
def add_turn(self, turn):
"""
Create an instance of a conversation participant
:param turn:
:return:
"""
self.turns.append(turn)
def parse_dialogue(self):
"""
The parse_dialogue function reads the specified dialogue file and parses each dialogue turn line by line.
For each turn, the function extracts the name of the speaker and the message content from the text,
creating a Turn instance. If the speaker is not already present in the participants dictionary,
a new Person instance is created. Finally, the parsed Turn instance is added to the Dialogue object.
Please note that this sample code assumes that each line in the file follows a specific format:
<speaker>:\r\n<message>\r\n\r\n. If your file has a different format or includes other metadata,
you may need to adjust the parsing logic accordingly.
"""
participants = {}
speaker_name = None
message = None
with open(self.file_path, encoding='utf-8') as file:
lines = file.readlines()
for i, line in enumerate(lines):
line = line.strip()
if not line:
continue
if speaker_name is None:
speaker_name, _ = line.split(':', 1)
elif message is None:
message = line
if speaker_name not in participants:
participants[speaker_name] = Person(speaker_name, None)
speaker = participants[speaker_name]
turn = Turn(speaker, message)
self.add_turn(turn)
# Reset speaker_name and message for the next turn
speaker_name = None
message = None
def display(self):
for turn in self.turns:
print(f"{turn.speaker.name}: {turn.message}")
def export_to_file(self, file_path):
with open(file_path, 'w', encoding='utf-8') as file:
for turn in self.turns:
file.write(f"{turn.speaker.name}: {turn.message}\n")
def to_dict(self):
dialogue_dict = {"turns": []}
for turn in self.turns:
turn_dict = {
"speaker": turn.speaker.name,
"message": turn.message
}
dialogue_dict["turns"].append(turn_dict)
return dialogue_dict
def to_json(self):
dialogue_dict = self.to_dict()
return json.dumps(dialogue_dict, ensure_ascii=False, indent=2)
def participants_to_export(self):
"""
participants_to_export
:return:
"""
participants = set()
for turn in self.turns:
participants.add(turn.speaker.name)
return ', '.join(participants)
class Turn:
def __init__(self, speaker, message):
self.speaker = speaker
self.message = message
class DialogueLoader(BaseLoader, ABC):
"""Load dialogue."""
def __init__(self, file_path: str):
"""Initialize with dialogue."""
self.file_path = file_path
dialogue = Dialogue(file_path=file_path)
dialogue.parse_dialogue()
self.dialogue = dialogue
def load(self) -> List[Document]:
"""Load from dialogue."""
documents = []
participants = self.dialogue.participants_to_export()
for turn in self.dialogue.turns:
metadata = {"source": f"Dialogue File:{self.dialogue.file_path},"
f"speaker:{turn.speaker.name},"
f"participant:{participants}"}
turn_document = Document(page_content=turn.message, metadata=metadata.copy())
documents.append(turn_document)
return documents
"""Loader that loads image files."""
from typing import List
from langchain.document_loaders.unstructured import UnstructuredFileLoader
from paddleocr import PaddleOCR
import os
import nltk
class UnstructuredPaddleImageLoader(UnstructuredFileLoader):
"""Loader that uses unstructured to load image files, such as PNGs and JPGs."""
def _get_elements(self) -> List:
def image_ocr_txt(filepath, dir_path="tmp_files"):
full_dir_path = os.path.join(os.path.dirname(filepath), dir_path)
if not os.path.exists(full_dir_path):
os.makedirs(full_dir_path)
filename = os.path.split(filepath)[-1]
ocr = PaddleOCR(use_angle_cls=True, lang="ch", use_gpu=False, show_log=False)
result = ocr.ocr(img=filepath)
ocr_result = [i[1][0] for line in result for i in line]
txt_file_path = os.path.join(full_dir_path, "%s.txt" % (filename))
with open(txt_file_path, 'w', encoding='utf-8') as fout:
fout.write("\n".join(ocr_result))
return txt_file_path
txt_file_path = image_ocr_txt(self.file_path)
from unstructured.partition.text import partition_text
return partition_text(filename=txt_file_path, **self.unstructured_kwargs)
if __name__ == "__main__":
import sys
sys.path.append(os.path.dirname(os.path.dirname(__file__)))
from configs.model_config import NLTK_DATA_PATH
nltk.data.path = [NLTK_DATA_PATH] + nltk.data.path
filepath = os.path.join(os.path.dirname(os.path.dirname(__file__)), "knowledge_base", "samples", "content", "test.jpg")
loader = UnstructuredPaddleImageLoader(filepath, mode="elements")
docs = loader.load()
for doc in docs:
print(doc)
"""Loader that loads image files."""
from typing import List
from langchain.document_loaders.unstructured import UnstructuredFileLoader
from paddleocr import PaddleOCR
import os
import fitz
import nltk
from configs.model_config import NLTK_DATA_PATH
nltk.data.path = [NLTK_DATA_PATH] + nltk.data.path
class UnstructuredPaddlePDFLoader(UnstructuredFileLoader):
"""Loader that uses unstructured to load image files, such as PNGs and JPGs."""
def _get_elements(self) -> List:
def pdf_ocr_txt(filepath, dir_path="tmp_files"):
full_dir_path = os.path.join(os.path.dirname(filepath), dir_path)
if not os.path.exists(full_dir_path):
os.makedirs(full_dir_path)
ocr = PaddleOCR(use_angle_cls=True, lang="ch", use_gpu=False, show_log=False)
doc = fitz.open(filepath)
txt_file_path = os.path.join(full_dir_path, f"{os.path.split(filepath)[-1]}.txt")
img_name = os.path.join(full_dir_path, 'tmp.png')
with open(txt_file_path, 'w', encoding='utf-8') as fout:
for i in range(doc.page_count):
page = doc[i]
text = page.get_text("")
fout.write(text)
fout.write("\n")
img_list = page.get_images()
for img in img_list:
pix = fitz.Pixmap(doc, img[0])
if pix.n - pix.alpha >= 4:
pix = fitz.Pixmap(fitz.csRGB, pix)
pix.save(img_name)
result = ocr.ocr(img_name)
ocr_result = [i[1][0] for line in result for i in line]
fout.write("\n".join(ocr_result))
if os.path.exists(img_name):
os.remove(img_name)
return txt_file_path
txt_file_path = pdf_ocr_txt(self.file_path)
from unstructured.partition.text import partition_text
return partition_text(filename=txt_file_path, **self.unstructured_kwargs)
if __name__ == "__main__":
import sys
sys.path.append(os.path.dirname(os.path.dirname(__file__)))
filepath = os.path.join(os.path.dirname(os.path.dirname(__file__)), "knowledge_base", "samples", "content", "test.pdf")
loader = UnstructuredPaddlePDFLoader(filepath, mode="elements")
docs = loader.load()
for doc in docs:
print(doc)
...@@ -17,7 +17,7 @@ class ChatGLMLLMChain(BaseAnswer, Chain, ABC):
    max_token: int = 10000
    temperature: float = 0.01
    # relevance (top-p sampling)
    top_p = 0.1
    # number of candidate tokens
    top_k = 10
    checkPoint: LoaderCheckPoint = None
......
...@@ -128,8 +128,9 @@ class LoaderCheckPoint:
            model = (
                LoaderClass.from_pretrained(checkpoint,
                                            config=self.model_config,
                                            # torch_dtype=torch.bfloat16 if self.bf16 else torch.float16,
                                            trust_remote_code=True)
                .quantize(8)
                .half()
                .cuda()
            )
......
pymupdf
paddlepaddle
paddleocr~=2.6.1.3
langchain==0.0.174
transformers==4.29.1
...@@ -17,9 +17,9 @@ uvicorn~=0.21.1
pypinyin~=0.48.0
click~=8.1.3
tabulate
feedparser~=6.0.10
azure-core
openai~=0.27.8
#accelerate~=0.18.0
#peft~=0.3.0
#bitsandbytes; platform_system != "Windows"
...@@ -30,7 +30,6 @@ openai
# In testing, ggml-vicuna-13b-1.1 is compatible with llama-cpp-python 0.1.63
# Note!!! This project controls model loading rather strictly, so compatibility with llama-cpp-python is poor and many of its options cannot be used;
# unless it is really necessary, avoid llama-cpp
pydantic~=1.10.7
starlette~=0.26.1
numpy~=1.23.5
...@@ -38,3 +37,4 @@ tqdm~=4.65.0
requests~=2.28.2
tenacity~=8.2.2
charset_normalizer==2.1.0
torch~=2.0.1+cu117
import sys
import os
sys.path.append(os.path.dirname(os.path.abspath(__file__)) + '/../../')
import asyncio
from argparse import Namespace
from models.loader.args import parser
from models.loader import LoaderCheckPoint
import models.shared as shared
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory, ReadOnlySharedMemory
from langchain.prompts import PromptTemplate
from langchain.agents import ZeroShotAgent, Tool, AgentExecutor
from typing import List, Set
class CustomLLMSingleActionAgent(ZeroShotAgent):
allowed_tools: List[str]
def __init__(self, *args, **kwargs):
super(CustomLLMSingleActionAgent, self).__init__(*args, **kwargs)
self.allowed_tools = kwargs['allowed_tools']
def get_allowed_tools(self) -> Set[str]:
return set(self.allowed_tools)
async def dispatch(args: Namespace):
args_dict = vars(args)
shared.loaderCheckPoint = LoaderCheckPoint(args_dict)
llm_model_ins = shared.loaderLLM()
template = """This is a conversation between a human and a bot:
{chat_history}
Write a summary of the conversation for {input}:
"""
prompt = PromptTemplate(
input_variables=["input", "chat_history"],
template=template
)
memory = ConversationBufferMemory(memory_key="chat_history")
readonlymemory = ReadOnlySharedMemory(memory=memory)
    summary_chain = LLMChain(
llm=llm_model_ins,
prompt=prompt,
verbose=True,
memory=readonlymemory, # use the read-only memory to prevent the tool from modifying the memory
)
tools = [
Tool(
name="Summary",
            func=summary_chain.run,
description="useful for when you summarize a conversation. The input to this tool should be a string, representing who will read this summary."
)
]
prefix = """Have a conversation with a human, answering the following questions as best you can. You have access to the following tools:"""
suffix = """Begin!
Question: {input}
{agent_scratchpad}"""
prompt = CustomLLMSingleActionAgent.create_prompt(
tools,
prefix=prefix,
suffix=suffix,
input_variables=["input", "agent_scratchpad"]
)
tool_names = [tool.name for tool in tools]
llm_chain = LLMChain(llm=llm_model_ins, prompt=prompt)
agent = CustomLLMSingleActionAgent(llm_chain=llm_chain, tools=tools, allowed_tools=tool_names)
agent_chain = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools)
agent_chain.run(input="你好")
agent_chain.run(input="你是谁?")
agent_chain.run(input="我们之前聊了什么?")
if __name__ == '__main__':
    args = parser.parse_args(args=['--model-dir', '/media/checkpoint/', '--model', 'vicuna-13b-hf', '--no-remote-model', '--load-in-8bit'])
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.run_until_complete(dispatch(args))
from configs.model_config import *
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
import nltk
from vectorstores import MyFAISS
from chains.local_doc_qa import load_file
nltk.data.path = [NLTK_DATA_PATH] + nltk.data.path
if __name__ == "__main__":
filepath = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))),
"knowledge_base", "samples", "content", "test.txt")
embeddings = HuggingFaceEmbeddings(model_name=embedding_model_dict[EMBEDDING_MODEL],
model_kwargs={'device': EMBEDDING_DEVICE})
docs = load_file(filepath, using_zh_title_enhance=True)
vector_store = MyFAISS.from_documents(docs, embeddings)
query = "指令提示技术有什么示例"
search_result = vector_store.similarity_search(query)
    print(search_result)
...@@ -26,11 +26,13 @@ class ChineseTextSplitter(CharacterTextSplitter):
    def split_text(self, text: str) -> List[str]:   ## this logic needs further optimization
        if self.pdf:
            text = re.sub(r"\n{4}", r"-", text)
            # text = re.sub('\s', " ", text)
            text = re.sub(r"\n\n", r"\n", text)
            # text = re.sub(r"\n", r"", text)

        text = re.sub(r'([;;!?。!?\?])([^”’])', r"\1\n\2", text)  # single-character sentence terminators
        text = re.sub(r'(\.{6})([^"’”」』])', r"\1\n\2", text)  # English ellipsis
        text = re.sub(r'(\…{2})([^"’”」』])', r"\1\n\2", text)  # Chinese ellipsis
        text = re.sub(r'([;;!?。!?\?]["’”」』]{0,2})([^;;!?,。!?\?])', r'\1\n\2', text)
...@@ -40,7 +42,7 @@ class ChineseTextSplitter(CharacterTextSplitter):
        ls = [i for i in text.split("\n") if i]
        for ele in ls:
            if len(ele) > self.sentence_size:
                ele1 = re.sub(r'([,,]["’”」』]{0,2})([^,,])', r'\1\n\2', ele)
                ele1_ls = ele1.split("\n")
                for ele_ele1 in ele1_ls:
                    if len(ele_ele1) > self.sentence_size:
......
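To see the effect of these rules, a small usage sketch (the constructor arguments mirror the fields used above; the exact signature in `textsplitter` may differ):

```python
from textsplitter import ChineseTextSplitter

splitter = ChineseTextSplitter(pdf=False, sentence_size=100)
# Each sentence-ending mark (;。!?, six dots, double ellipsis) starts a new chunk;
# chunks longer than sentence_size are split again on commas.
for sentence in splitter.split_text("今年病虫害偏重发生!请注意防治。具体措施如下;一是加强监测。"):
    print(sentence)
```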
...@@ -30,7 +30,7 @@ def under_non_alpha_ratio(text: str, threshold: float = 0.5):
def is_possible_title(
        text: str,
        title_max_word_length: int = 20,
        non_alpha_threshold: float = 0.2,
) -> bool:
    """Checks to see if the text passes all of the checks for a valid title.
...@@ -50,10 +50,10 @@ def is_possible_title(
        return False

    # If the text contains punctuation, it is not a title
    # ENDS_IN_PUNCT_PATTERN = r"[^\w\s]\Z"
    # ENDS_IN_PUNCT_RE = re.compile(ENDS_IN_PUNCT_PATTERN)
    # if ENDS_IN_PUNCT_RE.search(text) is not None:
    #     return False

    # The text must not exceed the configured length, 20 by default
    # NOTE(robinson) - splitting on spaces here instead of word tokenizing because it
...@@ -82,6 +82,9 @@ def is_possible_title(
    if not alpha_in_text_5:
        return False

    # Require an ideographic comma (、) within the first five characters
    if u'\u3001' not in text_5:
        return False

    return True
......
...@@ -57,14 +57,17 @@ class MyFAISS(FAISS, VectorStore):
            if i == -1 or 0 < self.score_threshold < scores[0][j]:
                # This happens when not enough docs are returned.
                continue
            # map the faiss index to a docstore id
            if i in self.index_to_docstore_id:
                _id = self.index_to_docstore_id[i]
                # proceed with the following steps
            else:
                continue
            # the data retrieved from faiss is assumed to live in the docstore; look up the text by id
            doc = self.docstore.search(_id)
            # if the matched text does not need context expansion, run the following
            if (not self.chunk_conent) or ("context_expand" in doc.metadata and not doc.metadata["context_expand"]):
                # check that doc is an instance of Document
                if not isinstance(doc, Document):
                    raise ValueError(f"Could not find document for id {_id}, got {doc}")
                doc.metadata["score"] = int(scores[0][j])
...@@ -73,6 +76,7 @@ class MyFAISS(FAISS, VectorStore):
            id_set.add(i)
            docs_len = len(doc.page_content)
            # pull neighbouring chunks from the full text to build up the context
            for k in range(1, max(i, store_len - i)):
                break_flag = False
                if "context_expand_method" in doc.metadata and doc.metadata["context_expand_method"] == "forward":
...@@ -85,6 +89,8 @@ class MyFAISS(FAISS, VectorStore):
                    if l not in id_set and 0 <= l < len(self.index_to_docstore_id):
                        _id0 = self.index_to_docstore_id[l]
                        doc0 = self.docstore.search(_id0)
                        # stop when the merged length exceeds the limit or the chunk belongs to a different source document
                        if docs_len + len(doc0.page_content) > self.chunk_size or doc0.metadata["source"] != \
                                doc.metadata["source"]:
                            break_flag = True
...@@ -100,6 +106,7 @@ class MyFAISS(FAISS, VectorStore):
        if len(id_set) == 0 and self.score_threshold > 0:
            return []
        id_list = sorted(list(id_set))
        # TODO: possibly redundant; isn't this already checked inside the loop?
        id_lists = self.seperate_list(id_list)
        for id_seq in id_lists:
            for id in id_seq:
......
from .MyFAISS import MyFAISS
from .MyMilvus import MyMilvus
{
  "name": "chatgpt-web",
  "version": "2.11.0",
  "lockfileVersion": 2,
  "requires": true,
  "packages": {
    "": {
      "name": "chatgpt-web",
      "version": "2.11.0",
      "dependencies": {
        "@traptitech/markdown-it-katex": "^3.6.0",
        "@vueuse/core": "^9.13.0",
...@@ -16,6 +16,7 @@
        "markdown-it": "^13.0.1",
        "naive-ui": "^2.34.3",
        "pinia": "^2.0.33",
        "qs": "^6.11.1",
        "vue": "^3.2.47",
        "vue-i18n": "^9.2.2",
        "vue-router": "^4.1.6"
...@@ -30,6 +31,7 @@
        "@types/markdown-it": "^12.2.3",
        "@types/markdown-it-link-attributes": "^3.0.1",
        "@types/node": "^18.14.6",
        "@types/qs": "^6.9.7",
        "@vitejs/plugin-vue": "^4.0.0",
        "autoprefixer": "^10.4.13",
        "axios": "^1.3.4",
...@@ -3046,6 +3048,12 @@
      "integrity": "sha512-Gj7cI7z+98M282Tqmp2K5EIsoouUEzbBJhQQzDE3jSIRk6r9gsz0oUokqIUR4u1R3dMHo0pDHM7sNOHyhulypw==",
      "dev": true
    },
    "node_modules/@types/qs": {
      "version": "6.9.7",
      "resolved": "https://registry.npmjs.org/@types/qs/-/qs-6.9.7.tgz",
      "integrity": "sha512-FGa1F62FT09qcrueBA6qYTrJPVDzah9a+493+o2PCXsesWHIn27G98TsSMs3WPNbZIEj4+VJf6saSFpvD+3Zsw==",
      "dev": true
    },
    "node_modules/@types/resolve": {
      "version": "1.17.1",
      "resolved": "https://registry.npmjs.org/@types/resolve/-/resolve-1.17.1.tgz",
...@@ -4055,7 +4063,6 @@
      "version": "1.0.2",
      "resolved": "https://registry.npmjs.org/call-bind/-/call-bind-1.0.2.tgz",
      "integrity": "sha512-7O+FbCihrB5WGbFYesctwmTKae6rOiIzmz1icreWJ+0aA7LJfuqhEso2T9ncpcFtzMQtzXf2QGGueWJGTYsqrA==",
      "dependencies": {
        "function-bind": "^1.1.1",
        "get-intrinsic": "^1.0.2"
...@@ -5972,8 +5979,7 @@
    "node_modules/function-bind": {
      "version": "1.1.1",
      "resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.1.tgz",
      "integrity": "sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A=="
    },
    "node_modules/function.prototype.name": {
      "version": "1.1.5",
...@@ -6024,7 +6030,6 @@
      "version": "1.2.0",
      "resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.2.0.tgz",
      "integrity": "sha512-L049y6nFOuom5wGyRc3/gdTLO94dySVKRACj1RmJZBQXlbTMhtNIgkWkUHq+jYmZvKf14EW1EoJnnjbmoHij0Q==",
      "dependencies": {
        "function-bind": "^1.1.1",
        "has": "^1.0.3",
...@@ -6240,7 +6245,6 @@
      "version": "1.0.3",
      "resolved": "https://registry.npmjs.org/has/-/has-1.0.3.tgz",
      "integrity": "sha512-f2dvO0VU6Oej7RkWJGrehjbzMAjFp5/VKPp5tTpWIV4JHHZK1/BxbFRtf/siA2SWTe09caDmVtYYzWEIbBS4zw==",
      "dependencies": {
        "function-bind": "^1.1.1"
      },
...@@ -6294,7 +6298,6 @@
      "version": "1.0.3",
      "resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.0.3.tgz",
      "integrity": "sha512-l3LCuF6MgDNwTDKkdYGEihYjt5pRPbEg46rtlmnSPlUbgmB8LOIrKJbYYFBSbnPaJexMKtiPO8hmeRjRz2Td+A==",
      "engines": {
        "node": ">= 0.4"
      },
...@@ -8367,7 +8370,6 @@
      "version": "1.12.3",
      "resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.12.3.tgz",
      "integrity": "sha512-geUvdk7c+eizMNUDkRpW1wJwgfOiOeHbxBR/hLXK1aT6zmVSO0jsQcs7fj6MGw89jC/cjGfLcNOrtMYtGqm81g==",
      "funding": {
        "url": "https://github.com/sponsors/ljharb"
      }
...@@ -8943,6 +8945,20 @@
        "teleport": ">=0.2.0"
      }
    },
    "node_modules/qs": {
      "version": "6.11.2",
      "resolved": "https://registry.npmjs.org/qs/-/qs-6.11.2.tgz",
      "integrity": "sha512-tDNIz22aBzCDxLtVH++VnTfzxlfeK5CbqohpSqpJgj1Wg/cQbStNAz3NuqCs5vV+pjBsK4x4pN9HlVh7rcYRiA==",
      "dependencies": {
        "side-channel": "^1.0.4"
      },
      "engines": {
        "node": ">=0.6"
      },
      "funding": {
        "url": "https://github.com/sponsors/ljharb"
      }
    },
    "node_modules/queue-microtask": {
      "version": "1.2.3",
      "resolved": "https://registry.npmjs.org/queue-microtask/-/queue-microtask-1.2.3.tgz",
...@@ -9609,7 +9625,6 @@
      "version": "1.0.4",
      "resolved": "https://registry.npmjs.org/side-channel/-/side-channel-1.0.4.tgz",
      "integrity": "sha512-q5XPytqFEIKHkGdiMIrY10mvLRvnQh42/+GoBlFW3b2LXLE2xxJpZFdm94we0BaoV3RwJyGqg5wS7epxTv0Zvw==",
      "dependencies": {
        "call-bind": "^1.0.0",
        "get-intrinsic": "^1.0.2",
...@@ -13560,6 +13575,12 @@
      "integrity": "sha512-Gj7cI7z+98M282Tqmp2K5EIsoouUEzbBJhQQzDE3jSIRk6r9gsz0oUokqIUR4u1R3dMHo0pDHM7sNOHyhulypw==",
      "dev": true
    },
    "@types/qs": {
      "version": "6.9.7",
      "resolved": "https://registry.npmjs.org/@types/qs/-/qs-6.9.7.tgz",
      "integrity": "sha512-FGa1F62FT09qcrueBA6qYTrJPVDzah9a+493+o2PCXsesWHIn27G98TsSMs3WPNbZIEj4+VJf6saSFpvD+3Zsw==",
      "dev": true
    },
    "@types/resolve": {
      "version": "1.17.1",
      "resolved": "https://registry.npmjs.org/@types/resolve/-/resolve-1.17.1.tgz",
...@@ -14285,7 +14306,6 @@
      "version": "1.0.2",
      "resolved": "https://registry.npmjs.org/call-bind/-/call-bind-1.0.2.tgz",
      "integrity": "sha512-7O+FbCihrB5WGbFYesctwmTKae6rOiIzmz1icreWJ+0aA7LJfuqhEso2T9ncpcFtzMQtzXf2QGGueWJGTYsqrA==",
      "requires": {
        "function-bind": "^1.1.1",
        "get-intrinsic": "^1.0.2"
...@@ -15677,8 +15697,7 @@
    "function-bind": {
      "version": "1.1.1",
      "resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.1.tgz",
      "integrity": "sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A=="
"integrity": "sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A==", "integrity": "sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A=="
"dev": true
}, },
"function.prototype.name": { "function.prototype.name": {
"version": "1.1.5", "version": "1.1.5",
...@@ -15714,7 +15733,6 @@ ...@@ -15714,7 +15733,6 @@
"version": "1.2.0", "version": "1.2.0",
"resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.2.0.tgz", "resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.2.0.tgz",
"integrity": "sha512-L049y6nFOuom5wGyRc3/gdTLO94dySVKRACj1RmJZBQXlbTMhtNIgkWkUHq+jYmZvKf14EW1EoJnnjbmoHij0Q==", "integrity": "sha512-L049y6nFOuom5wGyRc3/gdTLO94dySVKRACj1RmJZBQXlbTMhtNIgkWkUHq+jYmZvKf14EW1EoJnnjbmoHij0Q==",
"dev": true,
"requires": { "requires": {
"function-bind": "^1.1.1", "function-bind": "^1.1.1",
"has": "^1.0.3", "has": "^1.0.3",
...@@ -15869,7 +15887,6 @@ ...@@ -15869,7 +15887,6 @@
"version": "1.0.3", "version": "1.0.3",
"resolved": "https://registry.npmjs.org/has/-/has-1.0.3.tgz", "resolved": "https://registry.npmjs.org/has/-/has-1.0.3.tgz",
"integrity": "sha512-f2dvO0VU6Oej7RkWJGrehjbzMAjFp5/VKPp5tTpWIV4JHHZK1/BxbFRtf/siA2SWTe09caDmVtYYzWEIbBS4zw==", "integrity": "sha512-f2dvO0VU6Oej7RkWJGrehjbzMAjFp5/VKPp5tTpWIV4JHHZK1/BxbFRtf/siA2SWTe09caDmVtYYzWEIbBS4zw==",
"dev": true,
"requires": { "requires": {
"function-bind": "^1.1.1" "function-bind": "^1.1.1"
} }
...@@ -15904,8 +15921,7 @@ ...@@ -15904,8 +15921,7 @@
"has-symbols": { "has-symbols": {
"version": "1.0.3", "version": "1.0.3",
"resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.0.3.tgz", "resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.0.3.tgz",
"integrity": "sha512-l3LCuF6MgDNwTDKkdYGEihYjt5pRPbEg46rtlmnSPlUbgmB8LOIrKJbYYFBSbnPaJexMKtiPO8hmeRjRz2Td+A==", "integrity": "sha512-l3LCuF6MgDNwTDKkdYGEihYjt5pRPbEg46rtlmnSPlUbgmB8LOIrKJbYYFBSbnPaJexMKtiPO8hmeRjRz2Td+A=="
"dev": true
}, },
"has-tostringtag": { "has-tostringtag": {
"version": "1.0.0", "version": "1.0.0",
...@@ -17405,8 +17421,7 @@ ...@@ -17405,8 +17421,7 @@
"object-inspect": { "object-inspect": {
"version": "1.12.3", "version": "1.12.3",
"resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.12.3.tgz", "resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.12.3.tgz",
"integrity": "sha512-geUvdk7c+eizMNUDkRpW1wJwgfOiOeHbxBR/hLXK1aT6zmVSO0jsQcs7fj6MGw89jC/cjGfLcNOrtMYtGqm81g==", "integrity": "sha512-geUvdk7c+eizMNUDkRpW1wJwgfOiOeHbxBR/hLXK1aT6zmVSO0jsQcs7fj6MGw89jC/cjGfLcNOrtMYtGqm81g=="
"dev": true
}, },
"object-keys": { "object-keys": {
"version": "1.1.1", "version": "1.1.1",
...@@ -17760,6 +17775,14 @@ ...@@ -17760,6 +17775,14 @@
"integrity": "sha512-kV/CThkXo6xyFEZUugw/+pIOywXcDbFYgSct5cT3gqlbkBE1SJdwy6UQoZvodiWF/ckQLZyDE/Bu1M6gVu5lVw==", "integrity": "sha512-kV/CThkXo6xyFEZUugw/+pIOywXcDbFYgSct5cT3gqlbkBE1SJdwy6UQoZvodiWF/ckQLZyDE/Bu1M6gVu5lVw==",
"dev": true "dev": true
}, },
"qs": {
"version": "6.11.2",
"resolved": "https://registry.npmjs.org/qs/-/qs-6.11.2.tgz",
"integrity": "sha512-tDNIz22aBzCDxLtVH++VnTfzxlfeK5CbqohpSqpJgj1Wg/cQbStNAz3NuqCs5vV+pjBsK4x4pN9HlVh7rcYRiA==",
"requires": {
"side-channel": "^1.0.4"
}
},
"queue-microtask": { "queue-microtask": {
"version": "1.2.3", "version": "1.2.3",
"resolved": "https://registry.npmjs.org/queue-microtask/-/queue-microtask-1.2.3.tgz", "resolved": "https://registry.npmjs.org/queue-microtask/-/queue-microtask-1.2.3.tgz",
...@@ -18248,7 +18271,6 @@ ...@@ -18248,7 +18271,6 @@
"version": "1.0.4", "version": "1.0.4",
"resolved": "https://registry.npmjs.org/side-channel/-/side-channel-1.0.4.tgz", "resolved": "https://registry.npmjs.org/side-channel/-/side-channel-1.0.4.tgz",
"integrity": "sha512-q5XPytqFEIKHkGdiMIrY10mvLRvnQh42/+GoBlFW3b2LXLE2xxJpZFdm94we0BaoV3RwJyGqg5wS7epxTv0Zvw==", "integrity": "sha512-q5XPytqFEIKHkGdiMIrY10mvLRvnQh42/+GoBlFW3b2LXLE2xxJpZFdm94we0BaoV3RwJyGqg5wS7epxTv0Zvw==",
"dev": true,
"requires": { "requires": {
"call-bind": "^1.0.0", "call-bind": "^1.0.0",
"get-intrinsic": "^1.0.2", "get-intrinsic": "^1.0.2",
......
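
Taken together, these lockfile hunks add `qs@6.11.2` as a runtime dependency and `@types/qs@6.9.7` as a dev-only type package; that is why the `"dev": true` flag disappears from `qs`'s transitive dependencies (`side-channel`, `call-bind`, `get-intrinsic`, `function-bind`, `has`, `has-symbols`, `object-inspect`). A minimal sketch of how the front end might use `qs` follows; the endpoint and parameter names are illustrative assumptions, not part of this commit:

```typescript
// Hypothetical usage sketch: qs serializes (possibly nested) params
// into a URL query string. "/api/chat" and its fields are assumed
// for illustration only.
import qs from "qs";

const query = qs.stringify({ question: "你好", top_k: 5 });
// -> "question=%E4%BD%A0%E5%A5%BD&top_k=5"

fetch(`/api/chat?${query}`).then((res) => res.json());
```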