Merge branch 'master' of github.com:binary-husky/chatgpt_academic

This commit is contained in:
505030475 2023-06-19 14:52:25 +10:00
commit 951d5ec758
7 changed files with 135 additions and 65 deletions

View File

@ -16,7 +16,7 @@ To translate this project to arbitary language with GPT, read and run [`multi_la
> >
> 1.请注意只有**红颜色**标识的函数插件(按钮)才支持读取文件,部分插件位于插件区的**下拉菜单**中。另外我们以**最高优先级**欢迎和处理任何新插件的PR > 1.请注意只有**红颜色**标识的函数插件(按钮)才支持读取文件,部分插件位于插件区的**下拉菜单**中。另外我们以**最高优先级**欢迎和处理任何新插件的PR
> >
> 2.本项目中每个文件的功能都在自译解[`self_analysis.md`](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)详细说明。随着版本的迭代您也可以随时自行点击相关函数插件调用GPT重新生成项目的自我解析报告。常见问题汇总在[`wiki`](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)当中。[安装方法](#installation)。 > 2.本项目中每个文件的功能都在自译解[`self_analysis.md`](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)详细说明。随着版本的迭代您也可以随时自行点击相关函数插件调用GPT重新生成项目的自我解析报告。常见问题汇总在[`wiki`](https://github.com/binary-husky/gpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)当中。[安装方法](#installation)。
> >
> 3.本项目兼容并鼓励尝试国产大语言模型chatglm和RWKV, 盘古等等。支持多个api-key共存可在配置文件中填写如`API_KEY="openai-key1,openai-key2,api2d-key3"`。需要临时更换`API_KEY`时,在输入区输入临时的`API_KEY`然后回车键提交后即可生效。 > 3.本项目兼容并鼓励尝试国产大语言模型chatglm和RWKV, 盘古等等。支持多个api-key共存可在配置文件中填写如`API_KEY="openai-key1,openai-key2,api2d-key3"`。需要临时更换`API_KEY`时,在输入区输入临时的`API_KEY`然后回车键提交后即可生效。
@ -31,13 +31,13 @@ To translate this project to arbitary language with GPT, read and run [`multi_la
一键中英互译 | 一键中英互译 一键中英互译 | 一键中英互译
一键代码解释 | 显示代码、解释代码、生成代码、给代码加注释 一键代码解释 | 显示代码、解释代码、生成代码、给代码加注释
[自定义快捷键](https://www.bilibili.com/video/BV14s4y1E7jN) | 支持自定义快捷键 [自定义快捷键](https://www.bilibili.com/video/BV14s4y1E7jN) | 支持自定义快捷键
模块化设计 | 支持自定义强大的[函数插件](https://github.com/binary-husky/chatgpt_academic/tree/master/crazy_functions),插件支持[热更新](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97) 模块化设计 | 支持自定义强大的[函数插件](https://github.com/binary-husky/gpt_academic/tree/master/crazy_functions),插件支持[热更新](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)
[自我程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] [一键读懂](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)本项目的源代码 [自我程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] [一键读懂](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)本项目的源代码
[程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] 一键可以剖析其他Python/C/C++/Java/Lua/...项目树 [程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] 一键可以剖析其他Python/C/C++/Java/Lua/...项目树
读论文、[翻译](https://www.bilibili.com/video/BV1KT411x7Wn)论文 | [函数插件] 一键解读latex/pdf论文全文并生成摘要 读论文、[翻译](https://www.bilibili.com/video/BV1KT411x7Wn)论文 | [函数插件] 一键解读latex/pdf论文全文并生成摘要
Latex全文[翻译](https://www.bilibili.com/video/BV1nk4y1Y7Js/)、[润色](https://www.bilibili.com/video/BV1FT411H7c5/) | [函数插件] 一键翻译或润色latex论文 Latex全文[翻译](https://www.bilibili.com/video/BV1nk4y1Y7Js/)、[润色](https://www.bilibili.com/video/BV1FT411H7c5/) | [函数插件] 一键翻译或润色latex论文
批量注释生成 | [函数插件] 一键批量生成函数注释 批量注释生成 | [函数插件] 一键批量生成函数注释
Markdown[中英互译](https://www.bilibili.com/video/BV1yo4y157jV/) | [函数插件] 看到上面5种语言的[README](https://github.com/binary-husky/chatgpt_academic/blob/master/docs/README_EN.md)了吗? Markdown[中英互译](https://www.bilibili.com/video/BV1yo4y157jV/) | [函数插件] 看到上面5种语言的[README](https://github.com/binary-husky/gpt_academic/blob/master/docs/README_EN.md)了吗?
chat分析报告生成 | [函数插件] 运行后自动生成总结汇报 chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
[PDF论文全文翻译功能](https://www.bilibili.com/video/BV1KT411x7Wn) | [函数插件] PDF论文提取题目&摘要+翻译全文(多线程) [PDF论文全文翻译功能](https://www.bilibili.com/video/BV1KT411x7Wn) | [函数插件] PDF论文提取题目&摘要+翻译全文(多线程)
[Arxiv小助手](https://www.bilibili.com/video/BV1LM4y1279X) | [函数插件] 输入arxiv文章url即可一键翻译摘要+下载PDF [Arxiv小助手](https://www.bilibili.com/video/BV1LM4y1279X) | [函数插件] 输入arxiv文章url即可一键翻译摘要+下载PDF
@ -46,8 +46,8 @@ chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
⭐Arxiv论文精细翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/),迄今为止最好的论文翻译工具⭐ ⭐Arxiv论文精细翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/),迄今为止最好的论文翻译工具⭐
公式/图片/表格显示 | 可以同时显示公式的[tex形式和渲染形式](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png),支持公式、代码高亮 公式/图片/表格显示 | 可以同时显示公式的[tex形式和渲染形式](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png),支持公式、代码高亮
多线程函数插件支持 | 支持多线调用chatgpt一键处理[海量文本](https://www.bilibili.com/video/BV1FT411H7c5/)或程序 多线程函数插件支持 | 支持多线调用chatgpt一键处理[海量文本](https://www.bilibili.com/video/BV1FT411H7c5/)或程序
启动暗色gradio[主题](https://github.com/binary-husky/chatgpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题 启动暗色gradio[主题](https://github.com/binary-husky/gpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题
[多LLM模型](https://www.bilibili.com/video/BV1wT411p7yf)支持[API2D](https://api2d.com/)接口支持 | 同时被GPT3.5、GPT4、[清华ChatGLM](https://github.com/THUDM/ChatGLM-6B)、[复旦MOSS](https://github.com/OpenLMLab/MOSS)同时伺候的感觉一定会很不错吧? [多LLM模型](https://www.bilibili.com/video/BV1wT411p7yf)支持 | 同时被GPT3.5、GPT4、[清华ChatGLM](https://github.com/THUDM/ChatGLM-6B)、[复旦MOSS](https://github.com/OpenLMLab/MOSS)同时伺候的感觉一定会很不错吧?
更多LLM模型接入支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应),引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama)[RWKV](https://github.com/BlinkDL/ChatRWKV)和[盘古α](https://openi.org.cn/pangu/) 更多LLM模型接入支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应),引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama)[RWKV](https://github.com/BlinkDL/ChatRWKV)和[盘古α](https://openi.org.cn/pangu/)
更多新功能展示(图像生成等) …… | 见本文档结尾处 …… 更多新功能展示(图像生成等) …… | 见本文档结尾处 ……
@ -91,8 +91,8 @@ chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
1. 下载项目 1. 下载项目
```sh ```sh
git clone https://github.com/binary-husky/chatgpt_academic.git git clone https://github.com/binary-husky/gpt_academic.git
cd chatgpt_academic cd gpt_academic
``` ```
2. 配置API_KEY 2. 配置API_KEY
@ -113,6 +113,7 @@ conda activate gptac_venv # 激活anaconda环境
python -m pip install -r requirements.txt # 这个步骤和pip安装一样的步骤 python -m pip install -r requirements.txt # 这个步骤和pip安装一样的步骤
``` ```
<details><summary>如果需要支持清华ChatGLM/复旦MOSS作为后端请点击展开此处</summary> <details><summary>如果需要支持清华ChatGLM/复旦MOSS作为后端请点击展开此处</summary>
<p> <p>
@ -150,8 +151,8 @@ python main.py
1. 仅ChatGPT推荐大多数人选择 1. 仅ChatGPT推荐大多数人选择
``` sh ``` sh
git clone https://github.com/binary-husky/chatgpt_academic.git # 下载项目 git clone https://github.com/binary-husky/gpt_academic.git # 下载项目
cd chatgpt_academic # 进入路径 cd gpt_academic # 进入路径
nano config.py # 用任意文本编辑器编辑config.py, 配置 “Proxy” “API_KEY” 以及 “WEB_PORT” (例如50923) 等 nano config.py # 用任意文本编辑器编辑config.py, 配置 “Proxy” “API_KEY” 以及 “WEB_PORT” (例如50923) 等
docker build -t gpt-academic . # 安装 docker build -t gpt-academic . # 安装
@ -160,6 +161,7 @@ docker run --rm -it --net=host gpt-academic
#(最后一步-选择2在macOS/windows环境下只能用-p选项将容器上的端口(例如50923)暴露给主机上的端口 #(最后一步-选择2在macOS/windows环境下只能用-p选项将容器上的端口(例如50923)暴露给主机上的端口
docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
``` ```
P.S. 如果需要依赖Latex的插件功能请见Wiki
2. ChatGPT + ChatGLM + MOSS需要熟悉Docker 2. ChatGPT + ChatGLM + MOSS需要熟悉Docker
@ -188,10 +190,10 @@ docker-compose up
按照`config.py`中的说明配置API_URL_REDIRECT即可。 按照`config.py`中的说明配置API_URL_REDIRECT即可。
4. 远程云服务器部署(需要云服务器知识与经验)。 4. 远程云服务器部署(需要云服务器知识与经验)。
请访问[部署wiki-1](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97) 请访问[部署wiki-1](https://github.com/binary-husky/gpt_academic/wiki/%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97)
5. 使用WSL2Windows Subsystem for Linux 子系统)。 5. 使用WSL2Windows Subsystem for Linux 子系统)。
请访问[部署wiki-2](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2) 请访问[部署wiki-2](https://github.com/binary-husky/gpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2)
6. 如何在二级网址(如`http://localhost/subpath`)下运行。 6. 如何在二级网址(如`http://localhost/subpath`)下运行。
请访问[FastAPI运行说明](docs/WithFastapi.md) 请访问[FastAPI运行说明](docs/WithFastapi.md)
@ -220,7 +222,7 @@ docker-compose up
编写强大的函数插件来执行任何你想得到的和想不到的任务。 编写强大的函数插件来执行任何你想得到的和想不到的任务。
本项目的插件编写、调试难度很低只要您具备一定的python基础知识就可以仿照我们提供的模板实现自己的插件功能。 本项目的插件编写、调试难度很低只要您具备一定的python基础知识就可以仿照我们提供的模板实现自己的插件功能。
详情请参考[函数插件指南](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)。 详情请参考[函数插件指南](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)。
--- ---
# Latest Update # Latest Update
@ -228,7 +230,7 @@ docker-compose up
1. 对话保存功能。在函数插件区调用 `保存当前的对话` 即可将当前对话保存为可读+可复原的html文件 1. 对话保存功能。在函数插件区调用 `保存当前的对话` 即可将当前对话保存为可读+可复原的html文件
另外在函数插件区(下拉菜单)调用 `载入对话历史存档` ,即可还原之前的会话。 另外在函数插件区(下拉菜单)调用 `载入对话历史存档` ,即可还原之前的会话。
Tip不指定文件直接点击 `载入对话历史存档` 可以查看历史html存档缓存,点击 `删除所有本地对话历史记录` 可以删除所有html存档缓存 Tip不指定文件直接点击 `载入对话历史存档` 可以查看历史html存档缓存。
<div align="center"> <div align="center">
<img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" > <img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
</div> </div>
@ -251,38 +253,33 @@ Tip不指定文件直接点击 `载入对话历史存档` 可以查看历史h
<img src="https://user-images.githubusercontent.com/96192199/227504931-19955f78-45cd-4d1c-adac-e71e50957915.png" height="400" > <img src="https://user-images.githubusercontent.com/96192199/227504931-19955f78-45cd-4d1c-adac-e71e50957915.png" height="400" >
</div> </div>
5. 这是一个能够“自我译解”的开源项目 5. 译解其他开源项目
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/226936850-c77d7183-0749-4c1c-9875-fd4891842d0c.png" width="500" >
</div>
6. 译解其他开源项目,不在话下
<div align="center"> <div align="center">
<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" height="250" > <img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" height="250" >
<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" height="250" > <img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" height="250" >
</div> </div>
7. 装饰[live2d](https://github.com/fghrsh/live2d_demo)的小功能(默认关闭,需要修改`config.py` 6. 装饰[live2d](https://github.com/fghrsh/live2d_demo)的小功能(默认关闭,需要修改`config.py`
<div align="center"> <div align="center">
<img src="https://user-images.githubusercontent.com/96192199/236432361-67739153-73e8-43fe-8111-b61296edabd9.png" width="500" > <img src="https://user-images.githubusercontent.com/96192199/236432361-67739153-73e8-43fe-8111-b61296edabd9.png" width="500" >
</div> </div>
8. 新增MOSS大语言模型支持 7. 新增MOSS大语言模型支持
<div align="center"> <div align="center">
<img src="https://user-images.githubusercontent.com/96192199/236639178-92836f37-13af-4fdd-984d-b4450fe30336.png" width="500" > <img src="https://user-images.githubusercontent.com/96192199/236639178-92836f37-13af-4fdd-984d-b4450fe30336.png" width="500" >
</div> </div>
9. OpenAI图像生成 8. OpenAI图像生成
<div align="center"> <div align="center">
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/bc7ab234-ad90-48a0-8d62-f703d9e74665" width="500" > <img src="https://github.com/binary-husky/gpt_academic/assets/96192199/bc7ab234-ad90-48a0-8d62-f703d9e74665" width="500" >
</div> </div>
10. OpenAI音频解析与总结 9. OpenAI音频解析与总结
<div align="center"> <div align="center">
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/709ccf95-3aee-498a-934a-e1c22d3d5d5b" width="500" > <img src="https://github.com/binary-husky/gpt_academic/assets/96192199/709ccf95-3aee-498a-934a-e1c22d3d5d5b" width="500" >
</div> </div>
11. Latex全文校对纠错 10. Latex全文校对纠错
<div align="center"> <div align="center">
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" height="200" > ===> <img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" height="200" > ===>
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/476f66d9-7716-4537-b5c1-735372c25adb" height="200"> <img src="https://github.com/binary-husky/gpt_academic/assets/96192199/476f66d9-7716-4537-b5c1-735372c25adb" height="200">
@ -310,30 +307,32 @@ gpt_academic开发者QQ群-2610599535
- 已知问题 - 已知问题
- 某些浏览器翻译插件干扰此软件前端的运行 - 某些浏览器翻译插件干扰此软件前端的运行
- 官方Gradio目前有很多兼容性Bug请务必使用requirements.txt安装Gradio - 官方Gradio目前有很多兼容性Bug请务必使用`requirements.txt`安装Gradio
## 参考与学习 ## 参考与学习
``` ```
代码中参考了很多其他优秀项目中的设计,主要包括 代码中参考了很多其他优秀项目中的设计,顺序不分先后
# 项目1清华ChatGLM-6B: # 清华ChatGLM-6B:
https://github.com/THUDM/ChatGLM-6B https://github.com/THUDM/ChatGLM-6B
# 项目2清华JittorLLMs: # 清华JittorLLMs:
https://github.com/Jittor/JittorLLMs https://github.com/Jittor/JittorLLMs
# 项目3Edge-GPT: # ChatPaper:
https://github.com/acheong08/EdgeGPT
# 项目4ChuanhuChatGPT:
https://github.com/GaiZhenbiao/ChuanhuChatGPT
# 项目5ChatPaper:
https://github.com/kaixindelele/ChatPaper https://github.com/kaixindelele/ChatPaper
# 更多: # Edge-GPT:
https://github.com/acheong08/EdgeGPT
# ChuanhuChatGPT:
https://github.com/GaiZhenbiao/ChuanhuChatGPT
# Oobabooga one-click installer:
https://github.com/oobabooga/one-click-installers
# More
https://github.com/gradio-app/gradio https://github.com/gradio-app/gradio
https://github.com/fghrsh/live2d_demo https://github.com/fghrsh/live2d_demo
https://github.com/oobabooga/one-click-installers
``` ```

View File

@ -46,7 +46,7 @@ MAX_RETRY = 2
# 模型选择是 (注意: LLM_MODEL是默认选中的模型, 同时它必须被包含在AVAIL_LLM_MODELS切换列表中 ) # 模型选择是 (注意: LLM_MODEL是默认选中的模型, 同时它必须被包含在AVAIL_LLM_MODELS切换列表中 )
LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓ LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
AVAIL_LLM_MODELS = ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "newbing-free", "stack-claude"] AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "newbing-free", "stack-claude"]
# P.S. 其他可用的模型还包括 ["newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"] # P.S. 其他可用的模型还包括 ["newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
# 本地LLM模型如ChatGLM的执行方式 CPU/GPU # 本地LLM模型如ChatGLM的执行方式 CPU/GPU
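
The comment in this hunk says that `LLM_MODEL` is the default selection and must also appear in `AVAIL_LLM_MODELS`. A minimal sketch of that constraint as a startup check follows. `validate_model_config` is a hypothetical helper introduced here for illustration and is not part of the project.

```python
# Hypothetical startup check; validate_model_config is not a project function.
LLM_MODEL = "gpt-3.5-turbo"
AVAIL_LLM_MODELS = [
    "gpt-3.5-turbo-16k", "gpt-3.5-turbo", "api2d-gpt-3.5-turbo",
    "gpt-4", "api2d-gpt-4", "chatglm", "moss",
    "newbing", "newbing-free", "stack-claude",
]


def validate_model_config(default_model, available_models):
    """Raise ValueError if the default model is missing from the switch list."""
    if default_model not in available_models:
        raise ValueError(
            f"LLM_MODEL={default_model!r} must be listed in AVAIL_LLM_MODELS"
        )
    return default_model


# Passes silently for the values shown in the diff.
validate_model_config(LLM_MODEL, AVAIL_LLM_MODELS)
```

Failing early like this would surface a mismatched default at launch rather than when the model dropdown is first used.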

View File

@ -30,7 +30,7 @@ def 知识库问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
) )
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
from .crazy_utils import try_install_deps from .crazy_utils import try_install_deps
try_install_deps(['zh_langchain==0.2.0']) try_install_deps(['zh_langchain==0.2.1'])
# < --------------------读取参数--------------- > # < --------------------读取参数--------------- >
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg") if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
@ -84,7 +84,7 @@ def 读取知识库作答(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
chatbot.append(["依赖不足", "导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."]) chatbot.append(["依赖不足", "导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
from .crazy_utils import try_install_deps from .crazy_utils import try_install_deps
try_install_deps(['zh_langchain==0.2.0']) try_install_deps(['zh_langchain==0.2.1'])
# < ------------------- --------------- > # < ------------------- --------------- >
kai = knowledge_archive_interface() kai = knowledge_archive_interface()
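
Both plugin entry points bump the pinned requirement from `zh_langchain==0.2.0` to `zh_langchain==0.2.1`. The internals of `try_install_deps` are not shown in this diff, so the sketch below only illustrates how a caller might decide whether a pinned requirement is already satisfied. `split_pin` and `needs_install` are hypothetical names.

```python
from importlib import metadata


def split_pin(requirement):
    """Split 'name==version' into (name, version); version is None if unpinned."""
    name, sep, version = requirement.partition("==")
    return name.strip(), (version.strip() if sep else None)


def needs_install(requirement):
    """Return True if the package is missing or its version differs from the pin."""
    name, wanted = split_pin(requirement)
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return True
    return wanted is not None and installed != wanted
```

With an exact pin, an environment that still has 0.2.0 installed would be reported as needing an install.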

View File

@ -5,7 +5,7 @@ pj = os.path.join
ARXIV_CACHE_DIR = os.path.expanduser(f"~/arxiv_cache/") ARXIV_CACHE_DIR = os.path.expanduser(f"~/arxiv_cache/")
# =================================== 工具函数 =============================================== # =================================== 工具函数 ===============================================
沙雕GPT啊别犯这些低级翻译错误 = 'You must to translate "agent" to "智能体". ' 专业词汇声明 = 'If the term "agent" is used in this section, it should be translated to "智能体". '
def switch_prompt(pfg, mode): def switch_prompt(pfg, mode):
""" """
Generate prompts and system prompts based on the mode for proofreading or translating. Generate prompts and system prompts based on the mode for proofreading or translating.
@ -25,7 +25,7 @@ def switch_prompt(pfg, mode):
f"\n\n{frag}" for frag in pfg.sp_file_contents] f"\n\n{frag}" for frag in pfg.sp_file_contents]
sys_prompt_array = ["You are a professional academic paper writer." for _ in range(n_split)] sys_prompt_array = ["You are a professional academic paper writer." for _ in range(n_split)]
elif mode == 'translate_zh': elif mode == 'translate_zh':
inputs_array = [r"Below is a section from an English academic paper, translate it into Chinese." + 沙雕GPT啊别犯这些低级翻译错误 + inputs_array = [r"Below is a section from an English academic paper, translate it into Chinese. " + 专业词汇声明 +
r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " + r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " +
r"Answer me only with the translated text:" + r"Answer me only with the translated text:" +
f"\n\n{frag}" for frag in pfg.sp_file_contents] f"\n\n{frag}" for frag in pfg.sp_file_contents]
@ -146,7 +146,7 @@ def Latex英文纠错加PDF对比(txt, llm_kwargs, plugin_kwargs, chatbot, histo
from .latex_utils import Latex精细分解与转化, 编译Latex from .latex_utils import Latex精细分解与转化, 编译Latex
except Exception as e: except Exception as e:
chatbot.append([ f"解析项目: {txt}", chatbot.append([ f"解析项目: {txt}",
f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"]) f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return return
@ -205,7 +205,7 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
# <-------------- information about this plugin -------------> # <-------------- information about this plugin ------------->
chatbot.append([ chatbot.append([
"函数插件功能?", "函数插件功能?",
"对整个Latex项目进行翻译, 生成中文PDF。函数插件贡献者: Binary-Husky。注意事项: 目前仅支持GPT3.5/GPT4其他模型转化效果未知。目前对机器学习类文献转化效果最好其他类型文献转化效果未知。"]) "对整个Latex项目进行翻译, 生成中文PDF。函数插件贡献者: Binary-Husky。注意事项: 此插件Windows支持最佳Linux下必须使用Docker安装详见项目主README.md。目前仅支持GPT3.5/GPT4其他模型转化效果未知。目前对机器学习类文献转化效果最好其他类型文献转化效果未知。"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
@ -216,7 +216,7 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
from .latex_utils import Latex精细分解与转化, 编译Latex from .latex_utils import Latex精细分解与转化, 编译Latex
except Exception as e: except Exception as e:
chatbot.append([ f"解析项目: {txt}", chatbot.append([ f"解析项目: {txt}",
f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"]) f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return return
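
In `switch_prompt`, the `translate_zh` branch now adds a trailing space after "translate it into Chinese." and appends a conditional terminology note, so the sentences in the prompt no longer run together. The sketch below shows one way such a prompt list could be assembled from text fragments. `build_translate_prompts` and `TERM_NOTE` are illustrative names, and the fragments stand in for `pfg.sp_file_contents`.

```python
# Illustrative names; the wording of the strings is taken from the diff.
TERM_NOTE = (
    'If the term "agent" is used in this section, '
    'it should be translated to "智能体". '
)


def build_translate_prompts(fragments):
    """Prefix every LaTeX fragment with the same translation instructions."""
    prefix = (
        "Below is a section from an English academic paper, translate it into Chinese. "
        + TERM_NOTE
        + r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. "
        + "Answer me only with the translated text:"
    )
    return [prefix + "\n\n" + frag for frag in fragments]


prompts = build_translate_prompts([r"\section{Intro} An agent acts."])
```

Each instruction sentence ends with a space, so concatenation cannot produce fused text such as "Chinese.If".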

View File

@ -23,13 +23,38 @@ def split_worker(text, mask, pattern, flags=0):
mask[res.span()[0]:res.span()[1]] = PRESERVE mask[res.span()[0]:res.span()[1]] = PRESERVE
return text, mask return text, mask
def split_worker_reverse_caption(text, mask, pattern, flags=0): def split_worker_careful_brace(text, mask, pattern, flags=0):
""" """
Move caption area out of preserve area Move area into preserve area
""" """
pattern_compile = re.compile(pattern, flags) pattern_compile = re.compile(pattern, flags)
for res in pattern_compile.finditer(text): for res in pattern_compile.finditer(text):
mask[res.regs[1][0]:res.regs[1][1]] = TRANSFORM brace_level = -1
p = begin = end = res.regs[0][0]
for _ in range(1024*16):
if text[p] == '}' and brace_level == 0: break
elif text[p] == '}': brace_level -= 1
elif text[p] == '{': brace_level += 1
p += 1
end = p+1
mask[begin:end] = PRESERVE
return text, mask
def split_worker_reverse_careful_brace(text, mask, pattern, flags=0):
"""
Move area out of preserve area
"""
pattern_compile = re.compile(pattern, flags)
for res in pattern_compile.finditer(text):
brace_level = 0
p = begin = end = res.regs[1][0]
for _ in range(1024*16):
if text[p] == '}' and brace_level == 0: break
elif text[p] == '}': brace_level -= 1
elif text[p] == '{': brace_level += 1
p += 1
end = p
mask[begin:end] = TRANSFORM
return text, mask return text, mask
def split_worker_begin_end(text, mask, pattern, flags=0, limit_n_lines=42): def split_worker_begin_end(text, mask, pattern, flags=0, limit_n_lines=42):
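
The two `careful_brace` workers in this hunk replace a purely regex-based span with a character walk that tracks brace depth. A non-greedy pattern such as `\\caption\{(.*?)\}` stops at the first `}`, which is too early when the argument contains nested commands. The sketch below shows the same depth-tracking idea in isolation. `find_matching_brace` is a hypothetical helper, it adds an explicit bounds check, and like the code in the diff it does not treat escaped braces (`\{`, `\}`) specially.

```python
def find_matching_brace(text, open_pos):
    """Return the index of the '}' closing the '{' at open_pos, or -1 if none."""
    if text[open_pos] != "{":
        raise ValueError("open_pos must point at '{'")
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1  # unbalanced input: ran off the end without closing


sample = r"\caption{A \textbf{bold} claim} tail"
start = sample.index("{")
end = find_matching_brace(sample, start)
inner = sample[start + 1:end]  # the full caption argument, nested braces included
```

Returning `-1` on unbalanced input lets the caller skip the mask update instead of indexing past the end of the string.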
@ -97,17 +122,19 @@ def 寻找Latex主文件(file_manifest, mode):
else: else:
continue continue
raise RuntimeError('无法找到一个主Tex文件包含documentclass关键字') raise RuntimeError('无法找到一个主Tex文件包含documentclass关键字')
def rm_comments(main_file): def rm_comments(main_file):
new_file_remove_comment_lines = [] new_file_remove_comment_lines = []
for l in main_file.splitlines(): for l in main_file.splitlines():
# 删除整行的空注释 # 删除整行的空注释
if l.startswith("%") or (l.startswith(" ") and l.lstrip().startswith("%")): if l.lstrip().startswith("%"):
pass pass
else: else:
new_file_remove_comment_lines.append(l) new_file_remove_comment_lines.append(l)
main_file = '\n'.join(new_file_remove_comment_lines) main_file = '\n'.join(new_file_remove_comment_lines)
main_file = re.sub(r'(?<!\\)%.*', '', main_file) # 使用正则表达式查找半行注释, 并替换为空字符串 main_file = re.sub(r'(?<!\\)%.*', '', main_file) # 使用正则表达式查找半行注释, 并替换为空字符串
return main_file return main_file
def merge_tex_files_(project_foler, main_file, mode): def merge_tex_files_(project_foler, main_file, mode):
""" """
Merge Tex project recursively Merge Tex project recursively
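
The `rm_comments` change replaces the old test, which recognised only lines starting with `%` or indented with a space, with `l.lstrip().startswith("%")`, so tab-indented comment lines are now dropped as well. A standalone sketch of the two-stage removal follows. `strip_latex_comments` is an illustrative name, and the regex is the one shown in the diff.

```python
import re


def strip_latex_comments(source):
    """Drop full-line comments, then trailing comments not escaped as \\%."""
    kept = [
        line for line in source.splitlines()
        if not line.lstrip().startswith("%")
    ]
    text = "\n".join(kept)
    # (?<!\\)% matches a percent sign that is not preceded by a backslash.
    return re.sub(r"(?<!\\)%.*", "", text)


sample = "\t% full-line comment\nrate is 5\\% of total % trailing note\n\\section{A}"
cleaned = strip_latex_comments(sample)
```

The negative lookbehind keeps an escaped `\%` intact while still trimming the trailing comment on the same line.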
@ -138,17 +165,23 @@ def merge_tex_files(project_foler, main_file, mode):
main_file = rm_comments(main_file) main_file = rm_comments(main_file)
if mode == 'translate_zh': if mode == 'translate_zh':
# find paper documentclass
pattern = re.compile(r'\\documentclass.*\n') pattern = re.compile(r'\\documentclass.*\n')
match = pattern.search(main_file) match = pattern.search(main_file)
assert match is not None, "Cannot find documentclass statement!"
position = match.end() position = match.end()
add_ctex = '\\usepackage{ctex}\n' add_ctex = '\\usepackage{ctex}\n'
add_url = '\\usepackage{url}\n' if '{url}' not in main_file else '' add_url = '\\usepackage{url}\n' if '{url}' not in main_file else ''
main_file = main_file[:position] + add_ctex + add_url + main_file[position:] main_file = main_file[:position] + add_ctex + add_url + main_file[position:]
# 2 fontset=windows # fontset=windows
import platform import platform
if platform.system() != 'Windows': if platform.system() != 'Windows':
main_file = re.sub(r"\\documentclass\[(.*?)\]{(.*?)}", r"\\documentclass[\1,fontset=windows]{\2}",main_file) main_file = re.sub(r"\\documentclass\[(.*?)\]{(.*?)}", r"\\documentclass[\1,fontset=windows]{\2}",main_file)
main_file = re.sub(r"\\documentclass{(.*?)}", r"\\documentclass[fontset=windows]{\1}",main_file) main_file = re.sub(r"\\documentclass{(.*?)}", r"\\documentclass[fontset=windows]{\1}",main_file)
# find paper abstract
pattern = re.compile(r'\\begin\{abstract\}.*\n')
match = pattern.search(main_file)
assert match is not None, "Cannot find paper abstract section!"
return main_file return main_file
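
For `translate_zh`, `merge_tex_files` locates the `\documentclass` line and inserts `\usepackage{ctex}` (plus `\usepackage{url}` when it is absent) right after it. The diff now guards the match with `assert`. The sketch below swaps in an explicit `ValueError` instead, since assertions are skipped when Python runs with `-O`. `inject_ctex` is a hypothetical name.

```python
import re


def inject_ctex(tex):
    """Insert ctex (and url, if missing) right after the \\documentclass line."""
    match = re.search(r"\\documentclass.*\n", tex)
    if match is None:
        raise ValueError("Cannot find documentclass statement!")
    pos = match.end()
    extra = "\\usepackage{ctex}\n"
    if "{url}" not in tex:
        extra += "\\usepackage{url}\n"
    return tex[:pos] + extra + tex[pos:]


doc = "\\documentclass{article}\n\\begin{document}\nHi\n\\end{document}\n"
patched = inject_ctex(doc)
```

The same guard pattern applies to the abstract lookup added later in the hunk, where a missing `\begin{abstract}` would otherwise fail on `match.end()`.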
@ -185,14 +218,39 @@ def fix_content(final_tex, node_string):
if node_string.count('\_') > 0 and node_string.count('\_') > final_tex.count('\_'): if node_string.count('\_') > 0 and node_string.count('\_') > final_tex.count('\_'):
# walk and replace any _ without \ # walk and replace any _ without \
final_tex = re.sub(r"(?<!\\)_", "\\_", final_tex) final_tex = re.sub(r"(?<!\\)_", "\\_", final_tex)
if node_string.count('{') != node_string.count('}'):
if final_tex.count('{') != node_string.count('{'): def compute_brace_level(string):
final_tex = node_string # 出问题了,还原原文 # this function count the number of { and }
if final_tex.count('}') != node_string.count('}'): brace_level = 0
final_tex = node_string # 出问题了,还原原文 for c in string:
if c == "{": brace_level += 1
elif c == "}": brace_level -= 1
return brace_level
def join_most(tex_t, tex_o):
# this function joins the translated string with the original string when something goes wrong
p_t = 0
p_o = 0
def find_next(string, chars, begin):
p = begin
while p < len(string):
if string[p] in chars: return p, string[p]
p += 1
return None, None
while True:
res1, char = find_next(tex_o, ['{','}'], p_o)
if res1 is None: break
res2, char = find_next(tex_t, [char], p_t)
if res2 is None: break
p_o = res1 + 1
p_t = res2 + 1
return tex_t[:p_t] + tex_o[p_o:]
if compute_brace_level(final_tex) != compute_brace_level(node_string):
# 出问题了,还原部分原文,保证括号正确
final_tex = join_most(final_tex, node_string)
return final_tex return final_tex
def split_subprocess(txt, project_folder, return_dict): def split_subprocess(txt, project_folder, return_dict, opts):
""" """
break down latex file to a linked list, break down latex file to a linked list,
each node use a preserve flag to indicate whether it should each node use a preserve flag to indicate whether it should
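
The rewritten `fix_content` no longer restores the whole original node when brace counts differ. It compares the net brace depth of the two strings and, on a mismatch, keeps the translated text up to the last brace that lines up with the original, then appends the remainder of the original. The sketch below is a compact restatement of that idea under the same assumptions as the diff (braces are matched in order, escaped braces are not special-cased). It is not a copy of the project code.

```python
def brace_level(s):
    """Net brace depth: +1 for each '{', -1 for each '}'."""
    level = 0
    for ch in s:
        if ch == "{":
            level += 1
        elif ch == "}":
            level -= 1
    return level


def join_most(translated, original):
    """Keep the translated prefix whose braces align, then fall back to the original."""
    p_t = p_o = 0
    while True:
        # Next brace in the original, starting at p_o.
        o_hit = next(
            (i for i in range(p_o, len(original)) if original[i] in "{}"), None
        )
        if o_hit is None:
            break
        # The same brace character must appear in the translation after p_t.
        t_hit = translated.find(original[o_hit], p_t)
        if t_hit == -1:
            break
        p_o, p_t = o_hit + 1, t_hit + 1
    return translated[:p_t] + original[p_o:]


original = "a {b} c"
translated = "甲 {乙 丙"  # the model dropped the closing brace
repaired = join_most(translated, original)
```

Here the repaired string keeps the translated text before the opening brace and takes the rest from the original, so the result compiles even though part of it stays untranslated.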
@ -239,7 +297,8 @@ def split_subprocess(txt, project_folder, return_dict):
text, mask = split_worker(text, mask, r"\\vspace\{(.*?)\}") text, mask = split_worker(text, mask, r"\\vspace\{(.*?)\}")
text, mask = split_worker(text, mask, r"\\hspace\{(.*?)\}") text, mask = split_worker(text, mask, r"\\hspace\{(.*?)\}")
text, mask = split_worker(text, mask, r"\\end\{(.*?)\}") text, mask = split_worker(text, mask, r"\\end\{(.*?)\}")
# text, mask = split_worker_reverse_caption(text, mask, r"\\caption\{(.*?)\}", re.DOTALL) text, mask = split_worker_careful_brace(text, mask, r"\\hl\{(.*?)\}", re.DOTALL)
text, mask = split_worker_reverse_careful_brace(text, mask, r"\\caption\{(.*?)\}", re.DOTALL)
root = convert_to_linklist(text, mask) root = convert_to_linklist(text, mask)
# 修复括号 # 修复括号
@ -365,11 +424,12 @@ class LatexPaperSplit():
if mode == 'translate_zh': if mode == 'translate_zh':
pattern = re.compile(r'\\begin\{abstract\}.*\n') pattern = re.compile(r'\\begin\{abstract\}.*\n')
match = pattern.search(result_string) match = pattern.search(result_string)
assert match is not None, "Cannot find paper abstract section!"
position = match.end() position = match.end()
result_string = result_string[:position] + self.msg + msg + self.msg_declare + result_string[position:] result_string = result_string[:position] + self.msg + msg + self.msg_declare + result_string[position:]
return result_string return result_string
def split(self, txt, project_folder): def split(self, txt, project_folder, opts):
""" """
break down latex file to a linked list, break down latex file to a linked list,
each node use a preserve flag to indicate whether it should each node use a preserve flag to indicate whether it should
@ -381,7 +441,7 @@ class LatexPaperSplit():
return_dict = manager.dict() return_dict = manager.dict()
p = multiprocessing.Process( p = multiprocessing.Process(
target=split_subprocess, target=split_subprocess,
args=(txt, project_folder, return_dict)) args=(txt, project_folder, return_dict, opts))
p.start() p.start()
p.join() p.join()
self.nodes = return_dict['nodes'] self.nodes = return_dict['nodes']
@ -440,7 +500,7 @@ class LatexPaperFileGroup():
def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread', switch_prompt=None): def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread', switch_prompt=None, opts=[]):
import time, os, re import time, os, re
from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
from .latex_utils import LatexPaperFileGroup, merge_tex_files, LatexPaperSplit, 寻找Latex主文件 from .latex_utils import LatexPaperFileGroup, merge_tex_files, LatexPaperSplit, 寻找Latex主文件
@ -469,8 +529,10 @@ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin
f.write(merged_content) f.write(merged_content)
# <-------- 精细切分latex文件 ----------> # <-------- 精细切分latex文件 ---------->
chatbot.append((f"Latex文件融合完成", f'[Local Message] 正在精细切分latex文件这需要一段时间计算文档越长耗时越长请耐心等待。'))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
lps = LatexPaperSplit() lps = LatexPaperSplit()
res = lps.split(merged_content, project_folder) # 消耗时间的函数 res = lps.split(merged_content, project_folder, opts) # 消耗时间的函数
# <-------- 拆分过长的latex片段 ----------> # <-------- 拆分过长的latex片段 ---------->
pfg = LatexPaperFileGroup() pfg = LatexPaperFileGroup()
@ -567,7 +629,7 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f
current_dir = os.getcwd() current_dir = os.getcwd()
n_fix = 1 n_fix = 1
max_try = 32 max_try = 32
chatbot.append([f"正在编译PDF文档", f'编译已经开始。当前工作路径为{work_folder}如果程序停顿5分钟以上则大概率是卡死在Latex里面了。不幸卡死时请直接去该路径下取回翻译结果,或者重启之后再度尝试 ...']); yield from update_ui(chatbot=chatbot, history=history) chatbot.append([f"正在编译PDF文档", f'编译已经开始。当前工作路径为{work_folder}如果程序停顿5分钟以上请直接去该路径下取回翻译结果,或者重启之后再度尝试 ...']); yield from update_ui(chatbot=chatbot, history=history)
chatbot.append([f"正在编译PDF文档", '...']); yield from update_ui(chatbot=chatbot, history=history); time.sleep(1); chatbot[-1] = list(chatbot[-1]) # 刷新界面 chatbot.append([f"正在编译PDF文档", '...']); yield from update_ui(chatbot=chatbot, history=history); time.sleep(1); chatbot[-1] = list(chatbot[-1]) # 刷新界面
yield from update_ui_lastest_msg('编译已经开始...', chatbot, history) # 刷新Gradio前端界面 yield from update_ui_lastest_msg('编译已经开始...', chatbot, history) # 刷新Gradio前端界面

View File

@ -83,6 +83,15 @@ model_info = {
"tokenizer": tokenizer_gpt35, "tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35, "token_cnt": get_token_num_gpt35,
}, },
"gpt-3.5-turbo-16k": {
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
"endpoint": openai_endpoint,
"max_token": 1024*16,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
"gpt-4": { "gpt-4": {
"fn_with_ui": chatgpt_ui, "fn_with_ui": chatgpt_ui,

View File

@ -1,5 +1,5 @@
{ {
"version": 3.4, "version": 3.41,
"show_feature": true, "show_feature": true,
"new_feature": "新增最强Arxiv论文翻译插件 <-> 修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件 <-> 添加了OpenAI音频转文本总结插件 <-> 通过Slack添加对Claude的支持" "new_feature": "增加gpt-3.5-16k的支持 <-> 新增最强Arxiv论文翻译插件 <-> 修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件 <-> 添加了OpenAI音频转文本总结插件 <-> 通过Slack添加对Claude的支持"
} }
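
The `gpt-3.5-turbo-16k` entry registered in `model_info` in this commit reuses the GPT-3.5 UI functions, endpoint, and tokenizer and differs only in `max_token`, which is set to `1024*16`. The sketch below shows how a per-model token budget of this kind might be consulted. `MODEL_INFO` and `fits_context` are hypothetical, and the 4096 limit for `gpt-3.5-turbo` is an assumption because that entry is not shown in the diff.

```python
# Simplified registry; only max_token is modelled here.
MODEL_INFO = {
    "gpt-3.5-turbo": {"max_token": 4096},           # assumed, not shown in the diff
    "gpt-3.5-turbo-16k": {"max_token": 1024 * 16},  # value taken from the diff
}


def fits_context(model, n_tokens):
    """Return True if a prompt of n_tokens fits within the model's token limit."""
    return n_tokens <= MODEL_INFO[model]["max_token"]
```

Under these assumptions a 10,000-token prompt fits the 16k model but not the base one.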