Unverified commit a0a8257b Author: imClumsyPanda Committer: GitHub

Merge pull request #51 from calcitem/readme

Update README
@@ -54,7 +54,18 @@ python knowledge_based_chatglm.py
- Tested so far with txt, docx, and md files; for other file formats, see the [langchain documentation](https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/unstructured_file.html). Documents that contain special characters are known to fail to load in some cases.
- On macOS 13.3 and above, the project may fail to run because of an incompatibility with pytorch.
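### FAQ
Q: How do I resolve `Resource punkt not found.`?
A: Download https://github.com/nltk/nltk_data/raw/gh-pages/packages/tokenizers/punkt.zip, unzip it, and place the extracted `punkt` folder under `tokenizers/` in one of the directories listed after `Searched in:` in the error message.
Q: How do I resolve `Resource averaged_perceptron_tagger not found.`?
A: Download https://github.com/nltk/nltk_data/blob/gh-pages/packages/taggers/averaged_perceptron_tagger.zip, unzip it, and place the extracted folder under `taggers/` in one of the directories listed after `Searched in:`.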
## DEMO
Taking the question `chatglm-6b 的局限性具体体现在哪里,如何实现改进` ("What are the specific limitations of chatglm-6b, and how can they be improved?") as an example:
Without using langchain to access local documents:
......
@@ -55,7 +55,18 @@ python knowledge_based_chatglm.py
- Tested so far with txt, docx, and md files; for other file formats, see the [langchain documentation](https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/unstructured_file.html). If a document contains special characters, the file may not load correctly.
- On macOS 13.3 and above, the project may not run properly because of an incompatibility with pytorch.
### FAQ
Q: How do I resolve `Resource punkt not found.`?
A: Download https://github.com/nltk/nltk_data/raw/gh-pages/packages/tokenizers/punkt.zip, unzip it, and place the extracted `punkt` folder under `tokenizers/` in one of the directories listed after `Searched in:` in the error message.
Q: How do I resolve `Resource averaged_perceptron_tagger not found.`?
A: Download https://github.com/nltk/nltk_data/blob/gh-pages/packages/taggers/averaged_perceptron_tagger.zip, unzip it, and place the extracted folder under `taggers/` in one of the directories listed after `Searched in:`.
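Both answers above come down to the same manual step: extract an already-downloaded zip into the matching sub-folder of an NLTK data directory. A minimal sketch using only the Python standard library follows. The helper name `install_nltk_resource` is hypothetical and not part of this project, and the directory you pass in should be one that your own error message lists after `Searched in:`.

```python
import zipfile
from pathlib import Path


def install_nltk_resource(zip_path, category, nltk_data_dir):
    """Extract a downloaded NLTK resource zip into <nltk_data_dir>/<category>/.

    zip_path      : local path of the archive, e.g. "punkt.zip"
    category      : sub-folder the resource belongs to, e.g. "tokenizers" or "taggers"
    nltk_data_dir : a directory printed after `Searched in:` (assumption: you
                    have write access to it)
    """
    target = Path(nltk_data_dir) / category
    # Create <nltk_data_dir>/<category>/ if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # The archive's own top-level folder (e.g. "punkt/") ends up
        # directly under the category folder.
        archive.extractall(target)
    return target
```

For example, assuming `punkt.zip` was saved to the current directory and `~/nltk_data` appears in the `Searched in:` list, `install_nltk_resource("punkt.zip", "tokenizers", Path.home() / "nltk_data")` would leave the resource at `~/nltk_data/tokenizers/punkt`. The tagger archive would use `"taggers"` as the category instead.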
## Roadmap
- [x] local knowledge based application with langchain + ChatGLM-6B
- [x] unstructured files loaded with langchain
- [ ] more different file format loaded with langchain
......