Merge branch 'master' into improve_ui_master

This commit is contained in:
binary-husky 2023-07-09 20:47:53 +08:00
commit f725ab1b31
28 changed files with 871 additions and 1281 deletions

View File

@ -1,28 +1,34 @@
# 此Dockerfile适用于“无本地模型”的环境构建如果需要使用chatglm等本地模型请参考 docs/Dockerfile+ChatGLM
# 如何构建: 先修改 `config.py` 然后 docker build -t gpt-academic .
# 如何运行: docker run --rm -it --net=host gpt-academic
# 此Dockerfile适用于“无本地模型”的环境构建如果需要使用chatglm等本地模型或者latex运行依赖请参考 docker-compose.yml
# 如何构建: 先修改 `config.py` 然后 `docker build -t gpt-academic . `
# 如何运行(Linux下): `docker run --rm -it --net=host gpt-academic `
# 如何运行(其他操作系统选择任意一个固定端口50923): `docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic `
FROM python:3.11
# 非必要步骤更换pip源
RUN echo '[global]' > /etc/pip.conf && \
echo 'index-url = https://mirrors.aliyun.com/pypi/simple/' >> /etc/pip.conf && \
echo 'trusted-host = mirrors.aliyun.com' >> /etc/pip.conf
# 进入工作路径
WORKDIR /gpt
# 安装依赖
# 安装大部分依赖利用Docker缓存加速以后的构建
COPY requirements.txt ./
COPY ./docs/gradio-3.32.2-py3-none-any.whl ./docs/gradio-3.32.2-py3-none-any.whl
RUN pip3 install -r requirements.txt
# 装载项目文件
# 装载项目文件,安装剩余依赖
COPY . .
RUN pip3 install -r requirements.txt
# 可选步骤,用于预热模块
# 非必要步骤,用于预热模块
RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
# 启动
CMD ["python3", "-u", "main.py"]

View File

@ -1,24 +1,26 @@
> **Note**
>
> 2023.5.27 对Gradio依赖进行了调整Fork并解决了官方Gradio的若干Bugs。请及时**更新代码**并重新更新pip依赖。安装依赖时请严格选择`requirements.txt`中**指定的版本**
>
> `pip install -r requirements.txt`
> 2023.7.5: Gradio依赖调整。请及时**更新代码**
>
> 2023.7.8: pydantic出现兼容问题已修改 `requirements.txt`。安装依赖时,请严格选择`requirements.txt`中**指定的版本**
>
> `pip install -r requirements.txt`
# <img src="docs/logo.png" width="40" > GPT 学术优化 (GPT Academic)
**如果喜欢这个项目请给它一个Star如果你发明了更好用的快捷键或函数插件欢迎发pull requests**
# <div align=center><img src="docs/logo.png" width="40" > GPT 学术优化 (GPT Academic)</div>
**如果喜欢这个项目请给它一个Star如果您发明了好用的快捷键或函数插件欢迎发pull requests**
If you like this project, please give it a Star. If you've come up with more useful academic shortcuts or functional plugins, feel free to open an issue or pull request. We also have a README in [English|](docs/README_EN.md)[日本語|](docs/README_JP.md)[한국어|](https://github.com/mldljyh/ko_gpt_academic)[Русский|](docs/README_RS.md)[Français](docs/README_FR.md) translated by this project itself.
To translate this project to arbitary language with GPT, read and run [`multi_language.py`](multi_language.py) (experimental).
> **Note**
>
> 1.请注意只有**红色**标识的函数插件(按钮)才支持读取文件,部分插件位于插件区的**下拉菜单**中。另外我们以**最高优先级**欢迎和处理任何新插件的PR
> 1.请注意只有 **高亮(如红色)** 标识的函数插件(按钮)才支持读取文件,部分插件位于插件区的**下拉菜单**中。另外我们以**最高优先级**欢迎和处理任何新插件的PR
>
> 2.本项目中每个文件的功能都在自译解[`self_analysis.md`](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)详细说明。随着版本的迭代您也可以随时自行点击相关函数插件调用GPT重新生成项目的自我解析报告。常见问题汇总在[`wiki`](https://github.com/binary-husky/gpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)当中。[安装方法](#installation)。
>
> 3.本项目兼容并鼓励尝试国产大语言模型chatglm和RWKV, 盘古等等。支持多个api-key共存可在配置文件中填写如`API_KEY="openai-key1,openai-key2,api2d-key3"`。需要临时更换`API_KEY`时,在输入区输入临时的`API_KEY`然后回车键提交后即可生效。
> 3.本项目兼容并鼓励尝试国产大语言模型ChatGLM和Moss等等。支持多个api-key共存可在配置文件中填写如`API_KEY="openai-key1,openai-key2,api2d-key3"`。需要临时更换`API_KEY`时,在输入区输入临时的`API_KEY`然后回车键提交后即可生效。
@ -42,13 +44,13 @@ chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
[PDF论文全文翻译功能](https://www.bilibili.com/video/BV1KT411x7Wn) | [函数插件] PDF论文提取题目&摘要+翻译全文(多线程)
[Arxiv小助手](https://www.bilibili.com/video/BV1LM4y1279X) | [函数插件] 输入arxiv文章url即可一键翻译摘要+下载PDF
[谷歌学术统合小助手](https://www.bilibili.com/video/BV19L411U7ia) | [函数插件] 给定任意谷歌学术搜索页面URL让gpt帮你[写relatedworks](https://www.bilibili.com/video/BV1GP411U7Az/)
互联网信息聚合+GPT | [函数插件] 一键[让GPT从互联网获取信息](https://www.bilibili.com/video/BV1om4y127ck),再回答问题,让信息永不过时
⭐Arxiv论文精细翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/)迄今为止最好的论文翻译工具⭐
互联网信息聚合+GPT | [函数插件] 一键[让GPT从互联网获取信息](https://www.bilibili.com/video/BV1om4y127ck)回答问题,让信息永不过时
⭐Arxiv论文精细翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/)目前最好的论文翻译工具
公式/图片/表格显示 | 可以同时显示公式的[tex形式和渲染形式](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png),支持公式、代码高亮
多线程函数插件支持 | 支持多线调用chatgpt一键处理[海量文本](https://www.bilibili.com/video/BV1FT411H7c5/)或程序
启动暗色gradio[主题](https://github.com/binary-husky/gpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题
启动暗色[主题](https://github.com/binary-husky/gpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题
[多LLM模型](https://www.bilibili.com/video/BV1wT411p7yf)支持 | 同时被GPT3.5、GPT4、[清华ChatGLM](https://github.com/THUDM/ChatGLM-6B)、[复旦MOSS](https://github.com/OpenLMLab/MOSS)同时伺候的感觉一定会很不错吧?
更多LLM模型接入支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应),引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama)[RWKV](https://github.com/BlinkDL/ChatRWKV)和[盘古α](https://openi.org.cn/pangu/)
更多LLM模型接入支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应),引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama)和[盘古α](https://openi.org.cn/pangu/)
更多新功能展示(图像生成等) …… | 见本文档结尾处 ……
</div>
@ -85,9 +87,8 @@ chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
<img src="https://user-images.githubusercontent.com/96192199/232537274-deca0563-7aa6-4b5d-94a2-b7c453c47794.png" width="700" >
</div>
---
# Installation
## 安装-方法1:直接运行 (Windows, Linux or MacOS)
### 安装方法I:直接运行 (Windows, Linux or MacOS)
1. 下载项目
```sh
@ -140,7 +141,7 @@ AVAIL_LLM_MODELS = ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-
python main.py
```
## 安装-方法2使用Docker
### 安装方法II使用Docker
1. 仅ChatGPT推荐大多数人选择等价于docker-compose方案1
@ -171,7 +172,7 @@ docker-compose up
```
## 安装-方法3:其他部署姿势
### 安装方法III:其他部署姿势
1. 一键运行脚本。
完全不熟悉python环境的Windows用户可以下载[Release](https://github.com/binary-husky/gpt_academic/releases)中发布的一键运行脚本安装无本地模型的版本。
脚本的贡献来源是[oobabooga](https://github.com/oobabooga/one-click-installers)。
@ -194,11 +195,9 @@ docker-compose up
7. 如何在二级网址(如`http://localhost/subpath`)下运行。
请访问[FastAPI运行说明](docs/WithFastapi.md)
---
# Advanced Usage
## 自定义新的便捷按钮 / 自定义函数插件
1. 自定义新的便捷按钮(学术快捷键)
# Advanced Usage
### I自定义新的便捷按钮学术快捷键
任意文本编辑器打开`core_functional.py`,添加条目如下,然后重启程序即可。(如果按钮已经添加成功并可见,那么前缀、后缀都支持热修改,无需重启程序即可生效。)
例如
```
@ -214,15 +213,15 @@ docker-compose up
<img src="https://user-images.githubusercontent.com/96192199/226899272-477c2134-ed71-4326-810c-29891fe4a508.png" width="500" >
</div>
2. 自定义函数插件
### II自定义函数插件
编写强大的函数插件来执行任何你想得到的和想不到的任务。
本项目的插件编写、调试难度很低只要您具备一定的python基础知识就可以仿照我们提供的模板实现自己的插件功能。
详情请参考[函数插件指南](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)。
---
# Latest Update
## 新功能动态
### I新功能动态
1. 对话保存功能。在函数插件区调用 `保存当前的对话` 即可将当前对话保存为可读+可复原的html文件
另外在函数插件区(下拉菜单)调用 `载入对话历史存档` ,即可还原之前的会话。
@ -283,7 +282,7 @@ Tip不指定文件直接点击 `载入对话历史存档` 可以查看历史h
## 版本:
### II版本:
- version 3.5(Todo): 使用自然语言调用本项目的所有函数插件(高优先级)
- version 3.4: +arxiv论文翻译、latex论文批改功能
- version 3.3: +互联网信息综合功能
@ -305,11 +304,12 @@ gpt_academic开发者QQ群-2610599535
- 某些浏览器翻译插件干扰此软件前端的运行
- 官方Gradio目前有很多兼容性Bug请务必使用`requirement.txt`安装Gradio
## 主题
### III主题
1. `Chuanhu-Keldos-Green` [网址](https://github.com/GaiZhenbiao/ChuanhuChatGPT/)
## 参考与学习
### IV参考与学习
```
代码中参考了很多其他优秀项目中的设计,顺序不分先后:

View File

@ -1,17 +1,27 @@
# [step 1]>> 例如: API_KEY = "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" 此key无效
API_KEY = "sk-此处填API密钥" # 可同时填写多个API-KEY用英文逗号分割例如API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey1,fkxxxx-api2dkey2"
"""
以下所有配置也都支持利用环境变量覆写环境变量配置格式见docker-compose.yml
读取优先级环境变量 > config_private.py > config.py
--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
All the following configurations also support using environment variables to override,
and the environment variable configuration format can be seen in docker-compose.yml.
Configuration reading priority: environment variable > config_private.py > config.py
"""
# [step 1]>> API_KEY = "sk-123456789xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123456789"。极少数情况下还需要填写组织格式如org-123456789abcdefghijklmno的请向下翻找 API_ORG 设置项
API_KEY = "此处填API密钥" # 可同时填写多个API-KEY用英文逗号分割例如API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey3,azure-apikey4"
# [step 2]>> 改为True应用代理如果直接在海外服务器部署此处不修改
USE_PROXY = False
if USE_PROXY:
# 填写格式是 [协议]:// [地址] :[端口]填写之前不要忘记把USE_PROXY改成True如果直接在海外服务器部署此处不修改
# 例如 "socks5h://localhost:11284"
# [协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
# [地址] 懂的都懂不懂就填localhost或者127.0.0.1肯定错不了localhost意思是代理软件安装在本机上
# [端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
# 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5/http)、地址(localhost)和端口(11284)
"""
填写格式是 [协议]:// [地址] :[端口]填写之前不要忘记把USE_PROXY改成True如果直接在海外服务器部署此处不修改
<配置教程&视频教程> https://github.com/binary-husky/gpt_academic/issues/1>
[协议] 常见协议无非socks5h/http; 例如 v2**y ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
[地址] 懂的都懂不懂就填localhost或者127.0.0.1肯定错不了localhost意思是代理软件安装在本机上
[端口] 在代理软件的设置里找虽然不同的代理软件界面不一样但端口号都应该在最显眼的位置上
"""
# 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5h / http)、地址(localhost)和端口(11284)
proxies = {
# [协议]:// [地址] :[端口]
"http": "socks5h://localhost:11284", # 再例如 "http": "http://127.0.0.1:7890",
@ -20,82 +30,100 @@ if USE_PROXY:
else:
proxies = None
# [step 3]>> 多线程函数插件中默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次Pay-as-you-go users的限制是每分钟3500次
# 一言以蔽之免费用户填3OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询https://platform.openai.com/docs/guides/rate-limits/overview
# ------------------------------------ 以下配置可以优化体验, 但大部分场合下并不需要修改 ------------------------------------
# 重新URL重新定向实现更换API_URL的作用常规情况下不要修改!! 高危设置通过修改此设置您将把您的API-KEY和对话隐私完全暴露给您设定的中间人
# 格式 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
# 例如 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://reverse-proxy-url/v1/chat/completions"}
API_URL_REDIRECT = {}
# 多线程函数插件中默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次Pay-as-you-go users的限制是每分钟3500次
# 一言以蔽之免费5刀用户填3OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询https://platform.openai.com/docs/guides/rate-limits/overview
DEFAULT_WORKER_NUM = 3
# [step 4]>> 以下配置可以优化体验,但大部分场合下并不需要修改
# 对话窗的高度
CHATBOT_HEIGHT = 1115
# 代码高亮
CODE_HIGHLIGHT = True
# 窗口布局
LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
DARK_MODE = True # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
DARK_MODE = True # 暗色模式 / 亮色模式
# 发送请求到OpenAI后等待多久判定为超时
TIMEOUT_SECONDS = 30
# 网页的端口, -1代表随机端口
WEB_PORT = -1
# 如果OpenAI不响应网络卡顿、代理失败、KEY失效重试的次数限制
MAX_RETRY = 2
# 模型选择是 (注意: LLM_MODEL是默认选中的模型, 同时它必须被包含在AVAIL_LLM_MODELS切换列表中 )
# 模型选择是 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt35", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "newbing-free", "stack-claude"]
AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
# P.S. 其他可用的模型还包括 ["gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", "newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
# 本地LLM模型如ChatGLM的执行方式 CPU/GPU
LOCAL_MODEL_DEVICE = "cpu" # 可选 "cuda"
# 设置gradio的并行线程数不需要修改
CONCURRENT_COUNT = 100
# 是否在提交时自动清空输入框
AUTO_CLEAR_TXT = False
# 色彩主体,可选 ["Default", "Chuanhu-Keldos-Green"]
THEME = "Chuanhu-Keldos-Green"
THEME = "Default"
# 加一个live2d装饰
ADD_WAIFU = False
# 设置用户名和密码不需要修改相关功能不稳定与gradio版本和网络都相关如果本地使用不建议加这个
# [("username", "password"), ("username2", "password2"), ...]
AUTHENTICATION = []
# 重新URL重新定向实现更换API_URL的作用常规情况下不要修改!!
# 高危设置通过修改此设置您将把您的API-KEY和对话隐私完全暴露给您设定的中间人
# 格式 {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
# 例如 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "https://ai.open.com/api/conversation"}
API_URL_REDIRECT = {}
# 如果需要在二级路径下运行(常规情况下,不要修改!!需要配合修改main.py才能生效!
CUSTOM_PATH = "/"
# 如果需要使用newbing把newbing的长长的cookie放到这里
NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
# 从现在起,如果您调用"newbing-free"模型则无需填写NEWBING_COOKIES
NEWBING_COOKIES = """
your bing cookies here
"""
# 极少数情况下openai的官方KEY需要伴随组织编码格式如org-xxxxxxxxxxxxxxxxxxxxxxxx使用
API_ORG = ""
# 如果需要使用Slack Claude使用教程详情见 request_llm/README.md
SLACK_CLAUDE_BOT_ID = ''
SLACK_CLAUDE_USER_TOKEN = ''
# 如果需要使用AZURE 详情请见额外文档 docs\use_azure.md
AZURE_ENDPOINT = "https://你的api名称.openai.azure.com/"
AZURE_API_KEY = "填入azure openai api的密钥"
AZURE_API_VERSION = "填入api版本"
AZURE_ENGINE = "填入ENGINE"
AZURE_ENDPOINT = "https://你亲手写的api名称.openai.azure.com/"
AZURE_API_KEY = "填入azure openai api的密钥" # 建议直接在API_KEY处填写该选项即将被弃用
AZURE_ENGINE = "填入你亲手写的部署名" # 读 docs\use_azure.md
# 使用Newbing
NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
NEWBING_COOKIES = """
put your new bing cookies here
"""
# 阿里云实时语音识别 配置门槛较高 限高级用户使用 参考 https://help.aliyun.com/document_detail/450255.html
ENABLE_AUDIO = True
ALIYUN_TOKEN="" # 例如 f37f30e0f9934c34a992f6f64f7eba4f
ALIYUN_APPKEY="" # 例如 RoPlZrM88DnAFkZK
ALIYUN_APPKEY="" # 例如 RoPlZrM88DnAFkZK

View File

@ -352,6 +352,32 @@ def get_crazy_functions():
})
except:
print('Load function plugin failed')
try:
from crazy_functions.交互功能函数模板 import 交互功能模板函数
function_plugins.update({
"交互功能模板函数": {
"Color": "stop",
"AsButton": False,
"Function": HotReload(交互功能模板函数)
}
})
except:
print('Load function plugin failed')
# try:
# from crazy_functions.chatglm微调工具 import 微调数据集生成
# function_plugins.update({
# "黑盒模型学习: 微调数据集生成 (先上传数据集)": {
# "Color": "stop",
# "AsButton": False,
# "AdvancedArgs": True,
# "ArgsReminder": "针对数据集输入(如 绿帽子*深蓝色衬衫*黑色运动裤)给出指令,例如您可以将以下命令复制到下方: --llm_to_learn=azure-gpt-3.5 --prompt_prefix='根据下面的服装类型提示想象一个穿着者对这个人外貌、身处的环境、内心世界、过去经历进行描写。要求100字以内用第二人称。' --system_prompt=''",
# "Function": HotReload(微调数据集生成)
# }
# })
# except:
# print('Load function plugin failed')
try:
from crazy_functions.Latex输出PDF结果 import Latex英文纠错加PDF对比

View File

@ -0,0 +1,71 @@
from toolbox import CatchException, update_ui, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
import datetime, json
def fetch_items(list_of_items, batch_size):
for i in range(0, len(list_of_items), batch_size):
yield list_of_items[i:i + batch_size]
def string_to_options(arguments):
import argparse
import shlex
# Create an argparse.ArgumentParser instance
parser = argparse.ArgumentParser()
# Add command-line arguments
parser.add_argument("--llm_to_learn", type=str, help="LLM model to learn", default="gpt-3.5-turbo")
parser.add_argument("--prompt_prefix", type=str, help="Prompt prefix", default='')
parser.add_argument("--system_prompt", type=str, help="System prompt", default='')
parser.add_argument("--batch", type=int, help="System prompt", default=50)
# Parse the arguments
args = parser.parse_args(shlex.split(arguments))
return args
@CatchException
def 微调数据集生成(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
"""
txt 输入栏用户输入的文本例如需要翻译的一段话再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数如温度和top_p等一般原样传递下去就行
plugin_kwargs 插件模型的参数
chatbot 聊天显示框的句柄用于显示给用户
history 聊天历史前情提要
system_prompt 给gpt的静默提醒
web_port 当前软件运行的端口号
"""
history = [] # 清空历史,以免输入溢出
chatbot.append(("这是什么功能?", "[Local Message] 微调数据集生成"))
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
args = plugin_kwargs.get("advanced_arg", None)
if args is None:
chatbot.append(("没给定指令", "退出"))
yield from update_ui(chatbot=chatbot, history=history); return
else:
arguments = string_to_options(arguments=args)
dat = []
with open(txt, 'r', encoding='utf8') as f:
for line in f.readlines():
json_dat = json.loads(line)
dat.append(json_dat["content"])
llm_kwargs['llm_model'] = arguments.llm_to_learn
for batch in fetch_items(dat, arguments.batch):
res = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
inputs_array=[f"{arguments.prompt_prefix}\n\n{b}" for b in (batch)],
inputs_show_user_array=[f"Show Nothing" for _ in (batch)],
llm_kwargs=llm_kwargs,
chatbot=chatbot,
history_array=[[] for _ in (batch)],
sys_prompt_array=[arguments.system_prompt for _ in (batch)],
max_workers=10 # OpenAI所允许的最大并行过载
)
with open(txt+'.generated.json', 'a+', encoding='utf8') as f:
for b, r in zip(batch, res[1::2]):
f.write(json.dumps({"content":b, "summary":r}, ensure_ascii=False)+'\n')
promote_file_to_downloadzone(txt+'.generated.json', rename_file='generated.json', chatbot=chatbot)
return

View File

@ -211,22 +211,30 @@ def test_Latex():
# # for cookies, cb, hist, msg in silence_stdout(编译Latex)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
# cli_printer.print(cb) # print(cb)
def test_chatglm_finetune():
from crazy_functions.chatglm微调工具 import 微调数据集生成
txt = 'build/dev.json'
plugin_kwargs = {"advanced_arg":"--llm_to_learn=gpt-3.5-turbo --prompt_prefix='根据下面的服装类型提示想象一个穿着者对这个人外貌、身处的环境、内心世界、人设进行描写。要求100字以内用第二人称。' --system_prompt=''" }
for cookies, cb, hist, msg in (微调数据集生成)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
cli_printer.print(cb)
# test_解析一个Python项目()
# test_Latex英文润色()
# test_Markdown中译英()
# test_批量翻译PDF文档()
# test_谷歌检索小助手()
# test_总结word文档()
# test_下载arxiv论文并翻译摘要()
# test_解析一个Cpp项目()
# test_联网回答问题()
# test_解析ipynb文件()
# test_数学动画生成manim()
# test_Langchain知识库()
# test_Langchain知识库读取()
if __name__ == "__main__":
test_Latex()
# test_解析一个Python项目()
# test_Latex英文润色()
# test_Markdown中译英()
# test_批量翻译PDF文档()
# test_谷歌检索小助手()
# test_总结word文档()
# test_下载arxiv论文并翻译摘要()
# test_解析一个Cpp项目()
# test_联网回答问题()
# test_解析ipynb文件()
# test_数学动画生成manim()
# test_Langchain知识库()
# test_Langchain知识库读取()
# test_Latex()
test_chatglm_finetune()
input("程序完成,回车退出。")
print("退出。")

View File

@ -130,6 +130,11 @@ def request_gpt_model_in_new_thread_with_ui_alive(
yield from update_ui(chatbot=chatbot, history=[]) # 如果最后成功了,则删除报错信息
return final_result
def can_multi_process(llm):
if llm.startswith('gpt-'): return True
if llm.startswith('api2d-'): return True
if llm.startswith('azure-'): return True
return False
def request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
inputs_array, inputs_show_user_array, llm_kwargs,
@ -175,7 +180,7 @@ def request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
except: max_workers = 8
if max_workers <= 0: max_workers = 3
# 屏蔽掉 chatglm的多线程可能会导致严重卡顿
if not (llm_kwargs['llm_model'].startswith('gpt-') or llm_kwargs['llm_model'].startswith('api2d-')):
if not can_multi_process(llm_kwargs['llm_model']):
max_workers = 1
executor = ThreadPoolExecutor(max_workers=max_workers)

View File

@ -189,6 +189,18 @@ def rm_comments(main_file):
main_file = re.sub(r'(?<!\\)%.*', '', main_file) # 使用正则表达式查找半行注释, 并替换为空字符串
return main_file
def find_tex_file_ignore_case(fp):
dir_name = os.path.dirname(fp)
base_name = os.path.basename(fp)
if not base_name.endswith('.tex'): base_name+='.tex'
if os.path.exists(pj(dir_name, base_name)): return pj(dir_name, base_name)
# go case in-sensitive
import glob
for f in glob.glob(dir_name+'/*.tex'):
base_name_s = os.path.basename(fp)
if base_name_s.lower() == base_name.lower(): return f
return None
def merge_tex_files_(project_foler, main_file, mode):
"""
Merge Tex project recrusively
@ -197,15 +209,11 @@ def merge_tex_files_(project_foler, main_file, mode):
for s in reversed([q for q in re.finditer(r"\\input\{(.*?)\}", main_file, re.M)]):
f = s.group(1)
fp = os.path.join(project_foler, f)
if os.path.exists(fp):
# e.g., \input{srcs/07_appendix.tex}
with open(fp, 'r', encoding='utf-8', errors='replace') as fx:
c = fx.read()
else:
# e.g., \input{srcs/07_appendix}
assert os.path.exists(fp+'.tex'), f'即找不到{fp},也找不到{fp}.texTex源文件缺失'
with open(fp+'.tex', 'r', encoding='utf-8', errors='replace') as fx:
c = fx.read()
fp = find_tex_file_ignore_case(fp)
if fp:
with open(fp, 'r', encoding='utf-8', errors='replace') as fx: c = fx.read()
else:
raise RuntimeError(f'找不到{fp}Tex源文件缺失')
c = merge_tex_files_(project_foler, c, mode)
main_file = main_file[:s.span()[0]] + c + main_file[s.span()[1]:]
return main_file
@ -324,7 +332,7 @@ def split_subprocess(txt, project_folder, return_dict, opts):
# 吸收在42行以内的begin-end组合
text, mask = set_forbidden_text_begin_end(text, mask, r"\\begin\{([a-z\*]*)\}(.*?)\\end\{\1\}", re.DOTALL, limit_n_lines=42)
# 吸收匿名公式
text, mask = set_forbidden_text(text, mask, [ r"\$\$(.*?)\$\$", r"\\\[.*?\\\]" ], re.DOTALL)
text, mask = set_forbidden_text(text, mask, [ r"\$\$([^$]+)\$\$", r"\\\[.*?\\\]" ], re.DOTALL)
# 吸收其他杂项
text, mask = set_forbidden_text(text, mask, [ r"\\section\{(.*?)\}", r"\\section\*\{(.*?)\}", r"\\subsection\{(.*?)\}", r"\\subsubsection\{(.*?)\}" ])
text, mask = set_forbidden_text(text, mask, [ r"\\bibliography\{(.*?)\}", r"\\bibliographystyle\{(.*?)\}" ])

View File

@ -0,0 +1,63 @@
from toolbox import CatchException, update_ui
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
@CatchException
def 交互功能模板函数(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
"""
txt 输入栏用户输入的文本例如需要翻译的一段话再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数, 如温度和top_p等, 一般原样传递下去就行
plugin_kwargs 插件模型的参数, 如温度和top_p等, 一般原样传递下去就行
chatbot 聊天显示框的句柄用于显示给用户
history 聊天历史前情提要
system_prompt 给gpt的静默提醒
web_port 当前软件运行的端口号
"""
history = [] # 清空历史,以免输入溢出
chatbot.append(("这是什么功能?", "交互功能函数模板。在执行完成之后, 可以将自身的状态存储到cookie中, 等待用户的再次调用。"))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
state = chatbot._cookies.get('plugin_state_0001', None) # 初始化插件状态
if state is None:
chatbot._cookies['lock_plugin'] = 'crazy_functions.交互功能函数模板->交互功能模板函数' # 赋予插件锁定 锁定插件回调路径,当下一次用户提交时,会直接转到该函数
chatbot._cookies['plugin_state_0001'] = 'wait_user_keyword' # 赋予插件状态
chatbot.append(("第一次调用:", "请输入关键词, 我将为您查找相关壁纸, 建议使用英文单词, 插件锁定中,请直接提交即可。"))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
if state == 'wait_user_keyword':
chatbot._cookies['lock_plugin'] = None # 解除插件锁定,避免遗忘导致死锁
chatbot._cookies['plugin_state_0001'] = None # 解除插件状态,避免遗忘导致死锁
# 解除插件锁定
chatbot.append((f"获取关键词:{txt}", ""))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
page_return = get_image_page_by_keyword(txt)
inputs=inputs_show_user=f"Extract all image urls in this html page, pick the first 5 images and show them with markdown format: \n\n {page_return}"
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=inputs, inputs_show_user=inputs_show_user,
llm_kwargs=llm_kwargs, chatbot=chatbot, history=[],
sys_prompt="When you want to show an image, use markdown format. e.g. ![image_description](image_url). If there are no image url provided, answer 'no image url provided'"
)
chatbot[-1] = [chatbot[-1][0], gpt_say]
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
# ---------------------------------------------------------------------------------
def get_image_page_by_keyword(keyword):
import requests
from bs4 import BeautifulSoup
response = requests.get(f'https://wallhaven.cc/search?q={keyword}', timeout=2)
res = "image urls: \n"
for image_element in BeautifulSoup(response.content, 'html.parser').findAll("img"):
try:
res += image_element["data-src"]
res += "\n"
except:
pass
return res

View File

@ -14,17 +14,19 @@ def 解析docx(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot
doc = Document(fp)
file_content = "\n".join([para.text for para in doc.paragraphs])
else:
import win32com.client
word = win32com.client.Dispatch("Word.Application")
word.visible = False
# 打开文件
print('fp', os.getcwd())
doc = word.Documents.Open(os.getcwd() + '/' + fp)
# file_content = doc.Content.Text
doc = word.ActiveDocument
file_content = doc.Range().Text
doc.Close()
word.Quit()
try:
import win32com.client
word = win32com.client.Dispatch("Word.Application")
word.visible = False
# 打开文件
doc = word.Documents.Open(os.getcwd() + '/' + fp)
# file_content = doc.Content.Text
doc = word.ActiveDocument
file_content = doc.Range().Text
doc.Close()
word.Quit()
except:
raise RuntimeError('请先将.doc文档转换为.docx文档。')
print(file_content)
# private_upload里面的文件名在解压zip后容易出现乱码rar和7z格式正常故可以只分析文章内容不输入文件名

View File

@ -1,121 +1,107 @@
from toolbox import update_ui
from toolbox import update_ui, promote_file_to_downloadzone, gen_time_str
from toolbox import CatchException, report_execption, write_results_to_file
import re
import unicodedata
fast_debug = False
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from .crazy_utils import read_and_clean_pdf_text
from .crazy_utils import input_clipping
def is_paragraph_break(match):
"""
根据给定的匹配结果来判断换行符是否表示段落分隔
如果换行符前为句子结束标志句号感叹号问号且下一个字符为大写字母则换行符更有可能表示段落分隔
也可以根据之前的内容长度来判断段落是否已经足够长
"""
prev_char, next_char = match.groups()
# 句子结束标志
sentence_endings = ".!?"
# 设定一个最小段落长度阈值
min_paragraph_length = 140
if prev_char in sentence_endings and next_char.isupper() and len(match.string[:match.start(1)]) > min_paragraph_length:
return "\n\n"
else:
return " "
def normalize_text(text):
"""
通过把连字ligatures等文本特殊符号转换为其基本形式来对文本进行归一化处理
例如将连字 "fi" 转换为 "f" "i"
"""
# 对文本进行归一化处理,分解连字
normalized_text = unicodedata.normalize("NFKD", text)
# 替换其他特殊字符
cleaned_text = re.sub(r'[^\x00-\x7F]+', '', normalized_text)
return cleaned_text
def clean_text(raw_text):
"""
对从 PDF 提取出的原始文本进行清洗和格式化处理
1. 对原始文本进行归一化处理
2. 替换跨行的连词
3. 根据 heuristic 规则判断换行符是否是段落分隔并相应地进行替换
"""
# 对文本进行归一化处理
normalized_text = normalize_text(raw_text)
# 替换跨行的连词
text = re.sub(r'(\w+-\n\w+)', lambda m: m.group(1).replace('-\n', ''), normalized_text)
# 根据前后相邻字符的特点,找到原文本中的换行符
newlines = re.compile(r'(\S)\n(\S)')
# 根据 heuristic 规则,用空格或段落分隔符替换原换行符
final_text = re.sub(newlines, lambda m: m.group(1) + is_paragraph_break(m) + m.group(2), text)
return final_text.strip()
def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
import time, glob, os, fitz
print('begin analysis on:', file_manifest)
for index, fp in enumerate(file_manifest):
with fitz.open(fp) as doc:
file_content = ""
for page in doc:
file_content += page.get_text()
file_content = clean_text(file_content)
print(file_content)
file_write_buffer = []
for file_name in file_manifest:
print('begin analysis on:', file_name)
############################## <第 0 步切割PDF> ##################################
# 递归地切割PDF文件每一块尽量是完整的一个section比如introductionexperiment等必要时再进行切割
# 的长度必须小于 2500 个 Token
file_content, page_one = read_and_clean_pdf_text(file_name) # 尝试按照章节切割PDF
file_content = file_content.encode('utf-8', 'ignore').decode() # avoid reading non-utf8 chars
page_one = str(page_one).encode('utf-8', 'ignore').decode() # avoid reading non-utf8 chars
TOKEN_LIMIT_PER_FRAGMENT = 2500
prefix = "接下来请你逐文件分析下面的论文文件,概括其内容" if index==0 else ""
i_say = prefix + f'请对下面的文章片段用中文做一个概述,文件名是{os.path.relpath(fp, project_folder)},文章内容是 ```{file_content}```'
i_say_show_user = prefix + f'[{index}/{len(file_manifest)}] 请对下面的文章片段做一个概述: {os.path.abspath(fp)}'
chatbot.append((i_say_show_user, "[Local Message] waiting gpt response."))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
from request_llm.bridge_all import model_info
enc = model_info["gpt-3.5-turbo"]['tokenizer']
def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
paper_fragments = breakdown_txt_to_satisfy_token_limit_for_pdf(
txt=file_content, get_token_fn=get_token_num, limit=TOKEN_LIMIT_PER_FRAGMENT)
page_one_fragments = breakdown_txt_to_satisfy_token_limit_for_pdf(
txt=str(page_one), get_token_fn=get_token_num, limit=TOKEN_LIMIT_PER_FRAGMENT//4)
# 为了更好的效果我们剥离Introduction之后的部分如果有
paper_meta = page_one_fragments[0].split('introduction')[0].split('Introduction')[0].split('INTRODUCTION')[0]
############################## <第 1 步从摘要中提取高价值信息放到history中> ##################################
final_results = []
final_results.append(paper_meta)
if not fast_debug:
msg = '正常'
# ** gpt request **
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=i_say,
inputs_show_user=i_say_show_user,
llm_kwargs=llm_kwargs,
chatbot=chatbot,
history=[],
sys_prompt="总结文章。"
) # 带超时倒计时
############################## <第 2 步,迭代地历遍整个文章,提取精炼信息> ##################################
i_say_show_user = f'首先你在中文语境下通读整篇论文。'; gpt_say = "[Local Message] 收到。" # 用户提示
chatbot.append([i_say_show_user, gpt_say]); yield from update_ui(chatbot=chatbot, history=[]) # 更新UI
chatbot[-1] = (i_say_show_user, gpt_say)
history.append(i_say_show_user); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
if not fast_debug: time.sleep(2)
iteration_results = []
last_iteration_result = paper_meta # 初始值是摘要
MAX_WORD_TOTAL = 4096 * 0.7
n_fragment = len(paper_fragments)
if n_fragment >= 20: print('文章极长,不能达到预期效果')
for i in range(n_fragment):
NUM_OF_WORD = MAX_WORD_TOTAL // n_fragment
i_say = f"Read this section, recapitulate the content of this section with less than {NUM_OF_WORD} Chinese characters: {paper_fragments[i]}"
i_say_show_user = f"[{i+1}/{n_fragment}] Read this section, recapitulate the content of this section with less than {NUM_OF_WORD} Chinese characters: {paper_fragments[i][:200]}"
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(i_say, i_say_show_user, # i_say=真正给chatgpt的提问 i_say_show_user=给用户看的提问
llm_kwargs, chatbot,
history=["The main idea of the previous section is?", last_iteration_result], # 迭代上一次的结果
sys_prompt="Extract the main idea of this section with Chinese." # 提示
)
iteration_results.append(gpt_say)
last_iteration_result = gpt_say
all_file = ', '.join([os.path.relpath(fp, project_folder) for index, fp in enumerate(file_manifest)])
i_say = f'根据以上你自己的分析,对全文进行概括,用学术性语言写一段中文摘要,然后再写一段英文摘要(包括{all_file})。'
chatbot.append((i_say, "[Local Message] waiting gpt response."))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
if not fast_debug:
msg = '正常'
# ** gpt request **
############################## <第 3 步整理history提取总结> ##################################
final_results.extend(iteration_results)
final_results.append(f'Please conclude this paper discussed above。')
# This prompt is from https://github.com/kaixindelele/ChatPaper/blob/main/chat_paper.py
NUM_OF_WORD = 1000
i_say = """
1. Mark the title of the paper (with Chinese translation)
2. list all the authors' names (use English)
3. mark the first author's affiliation (output Chinese translation only)
4. mark the keywords of this article (use English)
5. link to the paper, Github code link (if available, fill in Github:None if not)
6. summarize according to the following four points.Be sure to use Chinese answers (proper nouns need to be marked in English)
- (1):What is the research background of this article?
- (2):What are the past methods? What are the problems with them? Is the approach well motivated?
- (3):What is the research methodology proposed in this paper?
- (4):On what task and what performance is achieved by the methods in this paper? Can the performance support their goals?
Follow the format of the output that follows:
1. Title: xxx\n\n
2. Authors: xxx\n\n
3. Affiliation: xxx\n\n
4. Keywords: xxx\n\n
5. Urls: xxx or xxx , xxx \n\n
6. Summary: \n\n
- (1):xxx;\n
- (2):xxx;\n
- (3):xxx;\n
- (4):xxx.\n\n
Be sure to use Chinese answers (proper nouns need to be marked in English), statements as concise and academic as possible,
do not have too much repetitive information, numerical values using the original numbers.
"""
# This prompt is from https://github.com/kaixindelele/ChatPaper/blob/main/chat_paper.py
file_write_buffer.extend(final_results)
i_say, final_results = input_clipping(i_say, final_results, max_token_limit=2000)
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=i_say,
inputs_show_user=i_say,
llm_kwargs=llm_kwargs,
chatbot=chatbot,
history=history,
sys_prompt="总结文章。"
) # 带超时倒计时
inputs=i_say, inputs_show_user='开始最终总结',
llm_kwargs=llm_kwargs, chatbot=chatbot, history=final_results,
sys_prompt= f"Extract the main idea of this paper with less than {NUM_OF_WORD} Chinese characters"
)
final_results.append(gpt_say)
file_write_buffer.extend([i_say, gpt_say])
############################## <第 4 步设置一个token上限> ##################################
_, final_results = input_clipping("", final_results, max_token_limit=3200)
yield from update_ui(chatbot=chatbot, history=final_results) # 注意这里的历史记录被替代了
chatbot[-1] = (i_say, gpt_say)
history.append(i_say); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
res = write_results_to_file(history)
chatbot.append(("完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
res = write_results_to_file(file_write_buffer, file_name=gen_time_str())
promote_file_to_downloadzone(res.split('\t')[-1], chatbot=chatbot)
yield from update_ui(chatbot=chatbot, history=final_results) # 刷新界面
@CatchException
@ -151,10 +137,7 @@ def 批量总结PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
return
# 搜索需要处理的文件清单
file_manifest = [f for f in glob.glob(f'{project_folder}/**/*.pdf', recursive=True)] # + \
# [f for f in glob.glob(f'{project_folder}/**/*.tex', recursive=True)] + \
# [f for f in glob.glob(f'{project_folder}/**/*.cpp', recursive=True)] + \
# [f for f in glob.glob(f'{project_folder}/**/*.c', recursive=True)]
file_manifest = [f for f in glob.glob(f'{project_folder}/**/*.pdf', recursive=True)]
# 如果没找到任何文件
if len(file_manifest) == 0:

View File

@ -6,7 +6,7 @@ def 同时问询(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt
"""
txt 输入栏用户输入的文本例如需要翻译的一段话再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数如温度和top_p等一般原样传递下去就行
plugin_kwargs 插件模型的参数如温度和top_p等一般原样传递下去就行
plugin_kwargs 插件模型的参数用于灵活调整复杂功能的各种参数
chatbot 聊天显示框的句柄用于显示给用户
history 聊天历史前情提要
system_prompt 给gpt的静默提醒
@ -35,7 +35,7 @@ def 同时问询_指定模型(txt, llm_kwargs, plugin_kwargs, chatbot, history,
"""
txt 输入栏用户输入的文本例如需要翻译的一段话再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数如温度和top_p等一般原样传递下去就行
plugin_kwargs 插件模型的参数如温度和top_p等一般原样传递下去就行
plugin_kwargs 插件模型的参数用于灵活调整复杂功能的各种参数
chatbot 聊天显示框的句柄用于显示给用户
history 聊天历史前情提要
system_prompt 给gpt的静默提醒

View File

@ -1,13 +1,12 @@
from toolbox import CatchException, update_ui
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
import datetime, re
import datetime
@CatchException
def 高阶功能模板函数(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
"""
txt 输入栏用户输入的文本例如需要翻译的一段话再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数如温度和top_p等一般原样传递下去就行
plugin_kwargs 插件模型的参数暂时没有用武之地
plugin_kwargs 插件模型的参数用于灵活调整复杂功能的各种参数
chatbot 聊天显示框的句柄用于显示给用户
history 聊天历史前情提要
system_prompt 给gpt的静默提醒
@ -19,34 +18,12 @@ def 高阶功能模板函数(txt, llm_kwargs, plugin_kwargs, chatbot, history, s
for i in range(5):
currentMonth = (datetime.date.today() + datetime.timedelta(days=i)).month
currentDay = (datetime.date.today() + datetime.timedelta(days=i)).day
i_say = f'历史中哪些事件发生在{currentMonth}{currentDay}日?用中文列举两条,然后分别给出描述事件的两个英文单词。' + '当你给出关键词时使用以下json格式{"KeyWords":[EnglishKeyWord1,EnglishKeyWord2]}'
i_say = f'历史中哪些事件发生在{currentMonth}{currentDay}日?列举两条并发送相关图片。发送图片时请使用Markdown将Unsplash API中的PUT_YOUR_QUERY_HERE替换成描述该事件的一个最重要的单词'
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=i_say, inputs_show_user=i_say,
llm_kwargs=llm_kwargs, chatbot=chatbot, history=[],
sys_prompt='输出格式示例1908年美国消防救援事业发展的“美国消防协会”成立。关键词{"KeyWords":["Fire","American"]}。'
sys_prompt="当你想发送一张照片时请使用Markdown, 并且不要有反斜线, 不要用代码块。使用 Unsplash API (https://source.unsplash.com/1280x720/? < PUT_YOUR_QUERY_HERE >)。"
)
gpt_say = get_images(gpt_say)
chatbot[-1] = (i_say, gpt_say)
history.append(i_say);history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新
def get_images(gpt_say):
def get_image_by_keyword(keyword):
import requests
from bs4 import BeautifulSoup
response = requests.get(f'https://wallhaven.cc/search?q={keyword}', timeout=2)
for image_element in BeautifulSoup(response.content, 'html.parser').findAll("img"):
if "data-src" in image_element: break
return image_element["data-src"]
for keywords in re.findall('{"KeyWords":\[(.*?)\]}', gpt_say):
keywords = [n.strip('"') for n in keywords.split(',')]
try:
description = keywords[0]
url = get_image_by_keyword(keywords[0])
img_tag = f"\n\n![{description}]({url})"
gpt_say += img_tag
except:
continue
return gpt_say
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新

View File

@ -6,7 +6,7 @@
version: '3'
services:
gpt_academic_nolocalllms:
image: ghcr.io/binary-husky/gpt_academic_nolocal:master
image: ghcr.io/binary-husky/gpt_academic_nolocal:master # (Auto Built by Dockerfile: docs/GithubAction+NoLocal)
environment:
# 请查阅 `config.py` 以查看所有的配置信息
API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '
@ -33,7 +33,7 @@ services:
version: '3'
services:
gpt_academic_with_chatglm:
image: ghcr.io/binary-husky/gpt_academic_chatglm_moss:master
image: ghcr.io/binary-husky/gpt_academic_chatglm_moss:master # (Auto Built by Dockerfile: docs/Dockerfile+ChatGLM)
environment:
# 请查阅 `config.py` 以查看所有的配置信息
API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,fkxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '
@ -63,7 +63,7 @@ services:
version: '3'
services:
gpt_academic_with_rwkv:
image: fuqingxu/gpt_academic:jittorllms # [option 2] 如果需要运行ChatGLM本地模型
image: fuqingxu/gpt_academic:jittorllms
environment:
# 请查阅 `config.py` 以查看所有的配置信息
API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,fkxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '
@ -111,7 +111,7 @@ services:
version: '3'
services:
gpt_academic_with_latex:
image: ghcr.io/binary-husky/gpt_academic_with_latex:master
image: ghcr.io/binary-husky/gpt_academic_with_latex:master # (Auto Built by Dockerfile: docs/GithubAction+NoLocal+Latex)
environment:
# 请查阅 `config.py` 以查看所有的配置信息
API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '

View File

@ -1667,5 +1667,294 @@
"段音频的主要内容": "The main content of the segment audio is",
"z$ 分别是空间直角坐标系中的三个坐标": "z$, respectively, are the three coordinates in the spatial rectangular coordinate system",
"这个是怎么识别的呢我也不清楚": "I'm not sure how this is recognized",
"从现在起": "From now on"
"从现在起": "From now on",
"连接bing搜索回答问题": "ConnectBingSearchAnswerQuestion",
"联网的ChatGPT_bing版": "OnlineChatGPT_BingEdition",
"Markdown翻译指定语言": "TranslateMarkdownToSpecifiedLanguage",
"Langchain知识库": "LangchainKnowledgeBase",
"Latex英文纠错加PDF对比": "CorrectEnglishInLatexWithPDFComparison",
"Latex输出PDF结果": "OutputPDFFromLatex",
"Latex翻译中文并重新编译PDF": "TranslateChineseToEnglishInLatexAndRecompilePDF",
"sprint亮靛": "SprintIndigo",
"寻找Latex主文件": "FindLatexMainFile",
"专业词汇声明": "ProfessionalTerminologyDeclaration",
"Latex精细分解与转化": "DecomposeAndConvertLatex",
"编译Latex": "CompileLatex",
"如果您是论文原作者": "If you are the original author of the paper",
"正在编译对比PDF": "Compiling the comparison PDF",
"将 \\include 命令转换为 \\input 命令": "Converting the \\include command to the \\input command",
"取评分最高者返回": "Returning the highest-rated one",
"不要修改!! 高危设置!通过修改此设置": "Do not modify!! High-risk setting! By modifying this setting",
"Tex源文件缺失": "Tex source file is missing!",
"6.25 加入判定latex模板的代码": "Added code to determine the latex template on June 25",
"正在精细切分latex文件": "Finely splitting the latex file",
"获取response失败": "Failed to get response",
"手动指定语言": "Manually specify the language",
"输入arxivID": "Enter arxivID",
"对输入的word文档进行摘要生成": "Generate a summary of the input word document",
"将指定目录下的PDF文件从英文翻译成中文": "Translate PDF files from English to Chinese in the specified directory",
"如果分析错误": "If the analysis is incorrect",
"尝试第": "Try the",
"用户填3": "User fills in 3",
"请在此处追加更细致的矫错指令": "Please append more detailed correction instructions here",
"为了防止大语言模型的意外谬误产生扩散影响": "To prevent the accidental spread of errors in large language models",
"前面是中文冒号": "The colon before is in Chinese",
"内含已经翻译的Tex文档": "Contains a Tex document that has been translated",
"成功啦": "Success!",
"刷新页面即可以退出UpdateKnowledgeArchive模式": "Refresh the page to exit UpdateKnowledgeArchive mode",
"或者不在环境变量PATH中": "Or not in the environment variable PATH",
"--读取文件": "--Read the file",
"才能继续下面的步骤": "To continue with the next steps",
"代理数据解析失败": "Proxy data parsing failed",
"详见项目主README.md": "See the main README.md of the project for details",
"临时存储用于调试": "Temporarily stored for debugging",
"屏蔽空行和太短的句子": "Filter out empty lines and sentences that are too short",
"gpt 多线程请求": "GPT multi-threaded request",
"编译已经开始": "Compilation has started",
"无法找到一个主Tex文件": "Cannot find a main Tex file",
"修复括号": "Fix parentheses",
"请您不要删除或修改这行警告": "Please do not delete or modify this warning",
"请登录OpenAI查看详情 https": "Please log in to OpenAI to view details at https",
"调用函数": "Call a function",
"请查看终端的输出或耐心等待": "Please check the output in the terminal or wait patiently",
"LatexEnglishCorrection+高亮修正位置": "Latex English correction + highlight correction position",
"行": "line",
"Newbing 请求失败": "Newbing request failed",
"转化PDF编译是否成功": "Check if the conversion to PDF and compilation were successful",
"建议更换代理协议": "Recommend changing the proxy protocol",
"========================================= 插件主程序1 =====================================================": "========================================= Plugin Main Program 1 =====================================================",
"终端": "terminal",
"请先上传文件素材": "Please upload file materials first",
"前面是中文逗号": "There is a Chinese comma in front",
"请尝试把以下指令复制到高级参数区": "Please try copying the following instructions to the advanced parameters section",
"翻译-": "Translation -",
"请耐心等待": "Please be patient",
"将前后断行符脱离": "Remove line breaks before and after",
"json等": "JSON, etc.",
"生成中文PDF": "Generate Chinese PDF",
"用红色标注处保留区": "Use red color to highlight the reserved area",
"对比PDF编译是否成功": "Compare if the PDF compilation was successful",
"回答完问题后": "After answering the question",
"其他操作系统表现未知": "Unknown performance on other operating systems",
"-构建知识库": "Build knowledge base",
"还原原文": "Restore original text",
"或者重启之后再度尝试": "Or try again after restarting",
"免费": "Free",
"仅在Windows系统进行了测试": "Tested only on Windows system",
"欢迎加REAME中的QQ联系开发者": "Feel free to contact the developer via QQ in REAME",
"当前知识库内的有效文件": "Valid files in the current knowledge base",
"您可以到Github Issue区": "You can go to the Github Issue area",
"刷新Gradio前端界面": "Refresh the Gradio frontend interface",
"吸收title与作者以上的部分": "Include the title and the above part of the author",
"给出一些判定模板文档的词作为扣分项": "Provide some words in the template document as deduction items",
"--读取参数": "-- Read parameters",
"然后进行问答": "And then perform question-answering",
"根据自然语言执行插件命令": "Execute plugin commands based on natural language",
"*{\\scriptsize\\textbf{警告": "*{\\scriptsize\\textbf{Warning",
"但请查收结果": "But please check the results",
"翻译内容可靠性无保障": "No guarantee of translation accuracy",
"寻找主文件": "Find the main file",
"消耗时间的函数": "Time-consuming function",
"当前语言模型温度设定": "Current language model temperature setting",
"这需要一段时间计算": "This requires some time to calculate",
"为啥chatgpt会把cite里面的逗号换成中文逗号呀": "Why does ChatGPT change commas inside 'cite' to Chinese commas?",
"发现已经存在翻译好的PDF文档": "Found an already translated PDF document",
"待提取的知识库名称id": "Knowledge base name ID to be extracted",
"文本碎片重组为完整的tex片段": "Reassemble text fragments into complete tex fragments",
"注意事项": "Notes",
"参数说明": "Parameter description",
"或代理节点": "Or proxy node",
"构建知识库": "Building knowledge base",
"报错信息如下. 如果是与网络相关的问题": "Error message as follows. If it is related to network issues",
"功能描述": "Function description",
"禁止移除或修改此警告": "Removal or modification of this warning is prohibited",
"Arixv翻译": "Arixv translation",
"读取优先级": "Read priority",
"包含documentclass关键字": "Contains the documentclass keyword",
"根据文本使用GPT模型生成相应的图像": "Generate corresponding images using GPT model based on the text",
"图像生成所用到的提示文本": "Prompt text used for image generation",
"Your account is not active. OpenAI以账户失效为由": "Your account is not active. OpenAI states that it is due to account expiration",
"快捷的调试函数": "Convenient debugging function",
"在多Tex文档中": "In multiple Tex documents",
"因此选择GenerateImage函数": "Therefore, choose the GenerateImage function",
"当前工作路径为": "The current working directory is",
"实际得到格式": "Obtained format in reality",
"这段代码定义了一个名为TempProxy的空上下文管理器": "This code defines an empty context manager named TempProxy",
"吸收其他杂项": "Absorb other miscellaneous items",
"请输入要翻译成哪种语言": "Please enter which language to translate into",
"的单词": "of the word",
"正在尝试自动安装": "Attempting automatic installation",
"如果有必要": "If necessary",
"开始下载": "Start downloading",
"项目Github地址 \\url{https": "Project GitHub address \\url{https",
"将根据报错信息修正tex源文件并重试": "The Tex source file will be corrected and retried based on the error message",
"发送至azure openai api": "Send to Azure OpenAI API",
"吸收匿名公式": "Absorb anonymous formulas",
"用该压缩包+ConversationHistoryArchive进行反馈": "Provide feedback using the compressed package + ConversationHistoryArchive",
"需要特殊依赖": "Requires special dependencies",
"还原部分原文": "Restore part of the original text",
"构建完成": "Build completed",
"解析arxiv网址失败": "Failed to parse arXiv URL",
"输入问题后点击该插件": "Click the plugin after entering the question",
"请求子进程": "Requesting subprocess",
"请务必用 pip install -r requirements.txt 指令安装依赖": "Please make sure to install the dependencies using the 'pip install -r requirements.txt' command",
"如果程序停顿5分钟以上": "If the program pauses for more than 5 minutes",
"转化PDF编译已经成功": "Conversion to PDF compilation was successful",
"虽然PDF生成失败了": "Although PDF generation failed",
"分析上述回答": "Analyze the above answer",
"吸收在42行以内的begin-end组合": "Absorb the begin-end combination within 42 lines",
"推荐http": "Recommend http",
"Latex没有安装": "Latex is not installed",
"用latex编译为PDF对修正处做高亮": "Compile to PDF using LaTeX and highlight the corrections",
"reverse 操作必须放在最后": "'reverse' operation must be placed at the end",
"AZURE OPENAI API拒绝了请求": "AZURE OPENAI API rejected the request",
"该项目的Latex主文件是": "The main LaTeX file of this project is",
"You are associated with a deactivated account. OpenAI以账户失效为由": "You are associated with a deactivated account. OpenAI considers it as an account expiration",
"它*必须*被包含在AVAIL_LLM_MODELS列表中": "It *must* be included in the AVAIL_LLM_MODELS list",
"未知指令": "Unknown command",
"尝试执行Latex指令失败": "Failed to execute the LaTeX command",
"摘要生成后的文档路径": "Path of the document after summary generation",
"GPT结果已输出": "GPT result has been outputted",
"使用Newbing": "Using Newbing",
"其他模型转化效果未知": "Unknown conversion effect of other models",
"P.S. 但愿没人把latex模板放在里面传进来": "P.S. Hopefully, no one passes a LaTeX template in it",
"定位主Latex文件": "Locate the main LaTeX file",
"后面是英文冒号": "English colon follows",
"文档越长耗时越长": "The longer the document, the longer it takes.",
"压缩包": "Compressed file",
"但通常不会出现在正文": "But usually does not appear in the body.",
"正在预热文本向量化模组": "Preheating text vectorization module",
"5刀": "5 dollars",
"提问吧! 但注意": "Ask questions! But be careful",
"发送至AZURE OPENAI API": "Send to AZURE OPENAI API",
"请仔细鉴别并以原文为准": "Please carefully verify and refer to the original text",
"如果需要使用AZURE 详情请见额外文档 docs\\use_azure.md": "If you need to use AZURE, please refer to the additional document docs\\use_azure.md for details",
"使用正则表达式查找半行注释": "Use regular expressions to find inline comments",
"只有第二步成功": "Only the second step is successful",
"P.S. 顺便把CTEX塞进去以支持中文": "P.S. By the way, include CTEX to support Chinese",
"安装方法https": "Installation method: https",
"则跳过GPT请求环节": "Then skip the GPT request process",
"请切换至“UpdateKnowledgeArchive”插件进行知识库访问": "Please switch to the 'UpdateKnowledgeArchive' plugin for knowledge base access",
"=================================== 工具函数 ===============================================": "=================================== Utility functions ===============================================",
"填入azure openai api的密钥": "Fill in the Azure OpenAI API key",
"上传Latex压缩包": "Upload LaTeX compressed file",
"远程云服务器部署": "Deploy to remote cloud server",
"用黑色标注转换区": "Use black color to annotate the conversion area",
"音频文件的路径": "Path to the audio file",
"必须包含documentclass": "Must include documentclass",
"再列出用户可能提出的三个问题": "List three more questions that the user might ask",
"根据需要切换prompt": "Switch the prompt as needed",
"将文件复制一份到下载区": "Make a copy of the file in the download area",
"次编译": "Second compilation",
"Latex文件融合完成": "LaTeX file merging completed",
"返回": "Return",
"后面是英文逗号": "Comma after this",
"对不同latex源文件扣分": "Deduct points for different LaTeX source files",
"失败啦": "Failed",
"编译BibTex": "Compile BibTeX",
"Linux下必须使用Docker安装": "Must install using Docker on Linux",
"报错信息": "Error message",
"删除或修改歧义文件": "Delete or modify ambiguous files",
"-预热文本向量化模组": "- Preheating text vectorization module",
"将每次对话记录写入Markdown格式的文件中": "Write each conversation record into a file in Markdown format",
"其他类型文献转化效果未知": "Unknown conversion effect for other types of literature",
"获取线程锁": "Acquire thread lock",
"使用英文": "Use English",
"如果存在调试缓存文件": "If there is a debug cache file",
"您需要首先调用构建知识库": "You need to call the knowledge base building first",
"原始PDF编译是否成功": "Whether the original PDF compilation is successful",
"生成 azure openai api请求": "Generate Azure OpenAI API requests",
"正在编译PDF": "Compiling PDF",
"仅调试": "Debug only",
"========================================= 插件主程序2 =====================================================": "========================================= Plugin Main Program 2 =====================================================",
"多线程翻译开始": "Multithreaded translation begins",
"出问题了": "There is a problem",
"版权归原文作者所有": "Copyright belongs to the original author",
"当前大语言模型": "Current large language model",
"目前对机器学习类文献转化效果最好": "Currently, the best conversion effect for machine learning literature",
"这个paper有个input命令文件名大小写错误": "This paper has an input command with a filename case error!",
"期望格式例如": "Expected format, for example",
"解决部分词汇翻译不准确的问题": "Resolve the issue of inaccurate translation for some terms",
"待注入的知识库名称id": "Name/ID of the knowledge base to be injected",
"精细切分latex文件": "Fine-grained segmentation of LaTeX files",
"永远给定None": "Always given None",
"work_folder = Latex预处理": "work_folder = LaTeX preprocessing",
"请直接去该路径下取回翻译结果": "Please directly go to the path to retrieve the translation results",
"寻找主tex文件": "Finding the main .tex file",
"模型参数": "Model parameters",
"返回找到的第一个": "Return the first one found",
"编译转化后的PDF": "Compile the converted PDF",
"\\SEAFILE_LOCALŅ03047\\我的资料库\\music\\Akie秋绘-未来轮廓.mp3": "\\SEAFILE_LOCALŅ03047\\My Library\\music\\Akie秋绘-未来轮廓.mp3",
"拆分过长的latex片段": "Splitting overly long LaTeX fragments",
"没有找到任何可读取文件": "No readable files found",
"暗色模式 / 亮色模式": "Dark mode / Light mode",
"检测到arxiv文档连接": "Detected arXiv document link",
"此插件Windows支持最佳": "This plugin has best support for Windows",
"from crazy_functions.虚空终端 import 终端": "from crazy_functions.null_terminal import Terminal",
"本地论文翻译": "Local paper translation",
"输出html调试文件": "Output HTML debugging file",
"以下所有配置也都支持利用环境变量覆写": "All the following configurations can also be overridden using environment variables",
"PDF文件所在的路径": "Path of the PDF file",
"也是可读的": "It is also readable",
"将消耗较长时间下载中文向量化模型": "Downloading Chinese vectorization model will take a long time",
"环境变量配置格式见docker-compose.yml": "See docker-compose.yml for the format of environment variable configuration",
"编译文献交叉引用": "Compile bibliographic cross-references",
"默认为default": "Default is 'default'",
"或者使用此插件继续上传更多文件": "Or use this plugin to continue uploading more files",
"该PDF由GPT-Academic开源项目调用大语言模型+Latex翻译插件一键生成": "This PDF is generated by the GPT-Academic open-source project using a large language model + LaTeX translation plugin",
"使用latexdiff生成论文转化前后对比": "Use latexdiff to generate before and after comparison of paper transformation",
"正在编译PDF文档": "Compiling PDF document",
"读取config.py文件中关于AZURE OPENAI API的信息": "Read the information about AZURE OPENAI API from the config.py file",
"配置教程&视频教程": "Configuration tutorial & video tutorial",
"临时地启动代理网络": "Temporarily start proxy network",
"临时地激活代理网络": "Temporarily activate proxy network",
"功能尚不稳定": "Functionality is unstable",
"默认为Chinese": "Default is Chinese",
"请查收结果": "Please check the results",
"将 chatglm 直接对齐到 chatglm2": "Align chatglm directly to chatglm2",
"中读取数据构建知识库": "Build a knowledge base by reading data in",
"用于给一小段代码上代理": "Used to proxy a small piece of code",
"分析结果": "Analysis results",
"依赖不足": "Insufficient dependencies",
"Markdown翻译": "Markdown translation",
"除非您是论文的原作者": "Unless you are the original author of the paper",
"test_LangchainKnowledgeBase读取": "test_LangchainKnowledgeBase read",
"将多文件tex工程融合为一个巨型tex": "Merge multiple tex projects into one giant tex",
"吸收iffalse注释": "Absorb iffalser comments",
"您接下来不能再使用其他插件了": "You can no longer use other plugins next",
"正在构建知识库": "Building knowledge base",
"需Latex": "Requires Latex",
"即找不到": "That is not found",
"保证括号正确": "Ensure parentheses are correct",
"= 2 通过一些Latex模板中常见": "= 2 through some common Latex templates",
"请立即终止程序": "Please terminate the program immediately",
"解压失败! 需要安装pip install rarfile来解压rar文件": "Decompression failed! Install 'pip install rarfile' to decompress rar files",
"请在此处给出自定义翻译命令": "Please provide custom translation command here",
"解压失败! 需要安装pip install py7zr来解压7z文件": "Decompression failed! Install 'pip install py7zr' to decompress 7z files",
"执行错误": "Execution error",
"目前仅支持GPT3.5/GPT4": "Currently only supports GPT3.5/GPT4",
"P.S. 顺便把Latex的注释去除": "P.S. Also remove comments from Latex",
"写出文件": "Write out the file",
"当前报错的latex代码处于第": "The current error in the LaTeX code is on line",
"主程序即将开始": "Main program is about to start",
"详情信息见requirements.txt": "See details in requirements.txt",
"释放线程锁": "Release thread lock",
"由于最为关键的转化PDF编译失败": "Due to the critical failure of PDF conversion and compilation",
"即将退出": "Exiting soon",
"尝试下载": "Attempting to download",
"删除整行的空注释": "Remove empty comments from the entire line",
"也找不到": "Not found either",
"从一批文件": "From a batch of files",
"编译结束": "Compilation finished",
"调用缓存": "Calling cache",
"只有GenerateImage和生成图像相关": "Only GenerateImage and image generation related",
"待处理的word文档路径": "Path of the word document to be processed",
"是否在提交时自动清空输入框": "Whether to automatically clear the input box upon submission",
"检查结果": "Check the result",
"生成时间戳": "Generate a timestamp",
"编译原始PDF": "Compile the original PDF",
"填入ENGINE": "Fill in ENGINE",
"填入api版本": "Fill in the API version",
"中文Bing版": "Chinese Bing version",
"当前支持的格式包括": "Currently supported formats include"
}

View File

@ -90,62 +90,29 @@
到现在为止,申请操作就完成了,需要记下来的有下面几个东西:
 密钥1或2都可以
● 密钥(对应AZURE_API_KEY1或2都可以
● 终结点
● 终结点 对应AZURE_ENDPOINT
 部署名对应AZURE_ENGINE不是模型名
● 部署名(不是模型名)
# 修改 config.py
```
AZURE_ENDPOINT = "填入终结点"
LLM_MODEL = "azure-gpt-3.5" # 指定启动时的默认模型当然事后从下拉菜单选也ok
AZURE_ENDPOINT = "填入终结点" # 见上述图片
AZURE_API_KEY = "填入azure openai api的密钥"
AZURE_API_VERSION = "2023-05-15" # 默认使用 2023-05-15 版本,无需修改
AZURE_ENGINE = "填入部署名"
```
# API的使用
接下来就是具体怎么使用API了还是可以参考官方文档[快速入门 - 开始通过 Azure OpenAI 服务使用 ChatGPT 和 GPT-4 - Azure OpenAI Service | Microsoft Learn](https://learn.microsoft.com/zh-cn/azure/cognitive-services/openai/chatgpt-quickstart?pivots=programming-language-python)
和openai自己的api调用有点类似都需要安装openai库不同的是调用方式
```
import openai
openai.api_type = "azure" #固定格式,无需修改
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT") #这里填入“终结点”
openai.api_version = "2023-05-15" #固定格式,无需修改
openai.api_key = os.getenv("AZURE_OPENAI_KEY") #这里填入“密钥1”或“密钥2”
response = openai.ChatCompletion.create(
engine="gpt-35-turbo", #这里填入的不是模型名,是部署名
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
{"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
{"role": "user", "content": "Do other Azure Cognitive Services support this too?"}
]
)
print(response)
print(response['choices'][0]['message']['content'])
AZURE_ENGINE = "填入部署名" # 见上述图片
```
需要注意的是:
1.  engine那里填入的是部署名不是模型名
2.  通过openai库获得的这个 response 和通过 request 库访问 url 获得的 response 不同,不需要 decode已经是解析好的 json 了,直接根据键值读取即可。
更细节的使用方法详见官方API文档。
# 关于费用
Azure OpenAI API 还是需要一些费用的免费订阅只有1个月有效期费用如下
![image.png](https://note.youdao.com/yws/res/18095/WEBRESOURCEeba0ab6d3127b79e143ef2d5627c0e44)
Azure OpenAI API 还是需要一些费用的免费订阅只有1个月有效期
具体可以可以看这个网址 [Azure OpenAI 服务 - 定价| Microsoft Azure](https://azure.microsoft.com/zh-cn/pricing/details/cognitive-services/openai-service/?cdn=disable)

51
main.py
View File

@ -4,13 +4,12 @@ def main():
import gradio as gr
if gr.__version__ not in ['3.28.3','3.32.2']: assert False, "需要特殊依赖,请务必用 pip install -r requirements.txt 指令安装依赖详情信息见requirements.txt"
from request_llm.bridge_all import predict
from toolbox import format_io, find_free_port, on_file_uploaded, on_report_generated, get_conf, ArgsGeneralWrapper, DummyWith
from toolbox import format_io, find_free_port, on_file_uploaded, on_report_generated, get_conf, ArgsGeneralWrapper, load_chat_cookies, DummyWith
# 建议您复制一个config_private.py放自己的秘密, 如API和代理网址, 避免不小心传github被别人看到
proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION, CHATBOT_HEIGHT, LAYOUT, API_KEY, AVAIL_LLM_MODELS = \
get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION', 'CHATBOT_HEIGHT', 'LAYOUT', 'API_KEY', 'AVAIL_LLM_MODELS')
proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION, CHATBOT_HEIGHT, LAYOUT, AVAIL_LLM_MODELS, AUTO_CLEAR_TXT = \
get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION', 'CHATBOT_HEIGHT', 'LAYOUT', 'AVAIL_LLM_MODELS', 'AUTO_CLEAR_TXT')
ENABLE_AUDIO, AUTO_CLEAR_TXT = \
get_conf('ENABLE_AUDIO', 'AUTO_CLEAR_TXT')
# 如果WEB_PORT是-1, 则随机选取WEB端口
PORT = find_free_port() if WEB_PORT <= 0 else WEB_PORT
if not AUTHENTICATION: AUTHENTICATION = None
@ -47,23 +46,23 @@ def main():
proxy_info = check_proxy(proxies)
gr_L1 = lambda: gr.Row().style()
gr_L2 = lambda scale: gr.Column(scale=scale)
gr_L2 = lambda scale, elem_id: gr.Column(scale=scale, elem_id=elem_id)
if LAYOUT == "TOP-DOWN":
gr_L1 = lambda: DummyWith()
gr_L2 = lambda scale: gr.Row()
gr_L2 = lambda scale, elem_id: gr.Row()
CHATBOT_HEIGHT /= 2
cancel_handles = []
with gr.Blocks(title="ChatGPT 学术优化", theme=set_theme, analytics_enabled=False, css=advanced_css) as demo:
gr.HTML(title_html)
cookies = gr.State({'api_key': API_KEY, 'llm_model': LLM_MODEL})
cookies = gr.State(load_chat_cookies())
with gr_L1():
with gr_L2(scale=2):
chatbot = gr.Chatbot(label=f"当前模型:{LLM_MODEL}")
chatbot.style(height=CHATBOT_HEIGHT)
with gr_L2(scale=2, elem_id="gpt-chat"):
chatbot = gr.Chatbot(label=f"当前模型:{LLM_MODEL}", elem_id="gpt-chatbot")
if LAYOUT == "TOP-DOWN": chatbot.style(height=CHATBOT_HEIGHT)
history = gr.State([])
with gr_L2(scale=1):
with gr.Accordion("输入区", open=True) as area_input_primary:
with gr_L2(scale=1, elem_id="gpt-panel"):
with gr.Accordion("输入区", open=True, elem_id="input-panel") as area_input_primary:
with gr.Row():
txt = gr.Textbox(show_label=False, placeholder="Input question here.").style(container=False)
with gr.Row():
@ -76,14 +75,14 @@ def main():
with gr.Row():
audio_mic = gr.Audio(source="microphone", type="numpy", streaming=True, show_label=False)
with gr.Row():
status = gr.Markdown(f"Tip: 按Enter提交, 按Shift+Enter换行。当前模型: {LLM_MODEL} \n {proxy_info}")
with gr.Accordion("基础功能区", open=True) as area_basic_fn:
status = gr.Markdown(f"Tip: 按Enter提交, 按Shift+Enter换行。当前模型: {LLM_MODEL} \n {proxy_info}", elem_id="state-panel")
with gr.Accordion("基础功能区", open=True, elem_id="basic-panel") as area_basic_fn:
with gr.Row():
for k in functional:
if ("Visible" in functional[k]) and (not functional[k]["Visible"]): continue
variant = functional[k]["Color"] if "Color" in functional[k] else "secondary"
functional[k]["Button"] = gr.Button(k, variant=variant)
with gr.Accordion("函数插件区", open=True) as area_crazy_fn:
with gr.Accordion("函数插件区", open=True, elem_id="plugin-panel") as area_crazy_fn:
with gr.Row():
gr.Markdown("插件可读取“输入区”文本/路径作为参数(上传文件自动修正路径)")
with gr.Row():
@ -105,7 +104,7 @@ def main():
with gr.Row():
with gr.Accordion("点击展开“文件上传区”。上传本地文件/压缩包供函数插件调用。", open=False) as area_file_up:
file_upload = gr.Files(label="任何文件, 但推荐上传压缩文件(zip, tar)", file_count="multiple")
with gr.Accordion("更换模型 & SysPrompt & 交互界面布局", open=(LAYOUT == "TOP-DOWN")):
with gr.Accordion("更换模型 & SysPrompt & 交互界面布局", open=(LAYOUT == "TOP-DOWN"), elem_id="interact-panel"):
system_prompt = gr.Textbox(show_label=True, placeholder=f"System Prompt", label="System prompt", value=initial_prompt)
top_p = gr.Slider(minimum=-0, maximum=1.0, value=1.0, step=0.01,interactive=True, label="Top-p (nucleus sampling)",)
temperature = gr.Slider(minimum=-0, maximum=2.0, value=1.0, step=0.01, interactive=True, label="Temperature",)
@ -114,7 +113,7 @@ def main():
md_dropdown = gr.Dropdown(AVAIL_LLM_MODELS, value=LLM_MODEL, label="更换LLM模型/请求源").style(container=False)
gr.Markdown(description)
with gr.Accordion("备选输入区", open=True, visible=False) as area_input_secondary:
with gr.Accordion("备选输入区", open=True, visible=False, elem_id="input-panel2") as area_input_secondary:
with gr.Row():
txt2 = gr.Textbox(show_label=False, placeholder="Input question here.", label="输入区2").style(container=False)
with gr.Row():
@ -181,9 +180,9 @@ def main():
return {chatbot: gr.update(label="当前模型:"+k)}
md_dropdown.select(on_md_dropdown_changed, [md_dropdown], [chatbot] )
# 随变按钮的回调函数注册
def route(k, *args, **kwargs):
def route(request: gr.Request, k, *args, **kwargs):
if k in [r"打开插件列表", r"请先从插件列表中选择"]: return
yield from ArgsGeneralWrapper(crazy_fns[k]["Function"])(*args, **kwargs)
yield from ArgsGeneralWrapper(crazy_fns[k]["Function"])(request, *args, **kwargs)
click_handle = switchy_bt.click(route,[switchy_bt, *input_combo, gr.State(PORT)], output_combo)
click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
cancel_handles.append(click_handle)
@ -191,12 +190,6 @@ def main():
stopBtn.click(fn=None, inputs=None, outputs=None, cancels=cancel_handles)
stopBtn2.click(fn=None, inputs=None, outputs=None, cancels=cancel_handles)
def init_cookie(cookies, chatbot):
# 为每一位访问的用户赋予一个独一无二的uuid编码
cookies.update({'uuid': uuid.uuid4()})
return cookies
demo.load(init_cookie, inputs=[cookies, chatbot], outputs=[cookies])
if ENABLE_AUDIO:
from crazy_functions.live_audio.audio_io import RealtimeAudioDistribution
rad = RealtimeAudioDistribution()
@ -204,6 +197,14 @@ def main():
rad.feed(cookies['uuid'].hex, audio)
audio_mic.stream(deal_audio, inputs=[audio_mic, cookies])
def init_cookie(cookies, chatbot):
# 为每一位访问的用户赋予一个独一无二的uuid编码
cookies.update({'uuid': uuid.uuid4()})
return cookies
demo.load(init_cookie, inputs=[cookies, chatbot], outputs=[cookies])
demo.load(lambda: 0, inputs=None, outputs=None, _js='()=>{ChatBotHeight();}')
# gradio的inbrowser触发不太稳定回滚代码到原始的浏览器打开函数
def auto_opentab_delay():
import threading, webbrowser, time

View File

@ -33,7 +33,7 @@ import pickle
import time
CACHE_FOLDER = "gpt_log"
blacklist = ['multi-language', 'gpt_log', '.git', 'private_upload', 'multi_language.py']
blacklist = ['multi-language', 'gpt_log', '.git', 'private_upload', 'multi_language.py', 'build', '.github', '.vscode', '__pycache__', 'venv']
# LANG = "TraditionalChinese"
# TransPrompt = f"Replace each json value `#` with translated results in Traditional Chinese, e.g., \"原始文本\":\"翻譯後文字\". Keep Json format. Do not answer #."
@ -301,6 +301,7 @@ def step_1_core_key_translate():
elif isinstance(node, ast.ImportFrom):
for n in node.names:
if contains_chinese(n.name): syntax.append(n.name)
# if node.module is None: print(node.module)
for k in node.module.split('.'):
if contains_chinese(k): syntax.append(k)
return syntax
@ -310,6 +311,7 @@ def step_1_core_key_translate():
for root, dirs, files in os.walk(directory_path):
if any([b in root for b in blacklist]):
continue
print(files)
for file in files:
if file.endswith('.py'):
file_path = os.path.join(root, file)
@ -505,6 +507,6 @@ def step_2_core_key_translate():
with open(file_path_new, 'w', encoding='utf-8') as f:
f.write(content)
os.remove(file_path)
step_1_core_key_translate()
step_2_core_key_translate()
print('Finished, checkout generated results at ./multi-language/')

View File

@ -16,15 +16,9 @@ from toolbox import get_conf, trimmed_format_exc
from .bridge_chatgpt import predict_no_ui_long_connection as chatgpt_noui
from .bridge_chatgpt import predict as chatgpt_ui
from .bridge_azure_test import predict_no_ui_long_connection as azure_noui
from .bridge_azure_test import predict as azure_ui
from .bridge_chatglm import predict_no_ui_long_connection as chatglm_noui
from .bridge_chatglm import predict as chatglm_ui
from .bridge_newbing import predict_no_ui_long_connection as newbing_noui
from .bridge_newbing import predict as newbing_ui
# from .bridge_tgui import predict_no_ui_long_connection as tgui_noui
# from .bridge_tgui import predict as tgui_ui
@ -51,10 +45,11 @@ class LazyloadTiktoken(object):
return encoder.decode(*args, **kwargs)
# Endpoint 重定向
API_URL_REDIRECT, = get_conf("API_URL_REDIRECT")
API_URL_REDIRECT, AZURE_ENDPOINT, AZURE_ENGINE = get_conf("API_URL_REDIRECT", "AZURE_ENDPOINT", "AZURE_ENGINE")
openai_endpoint = "https://api.openai.com/v1/chat/completions"
api2d_endpoint = "https://openai.api2d.net/v1/chat/completions"
newbing_endpoint = "wss://sydney.bing.com/sydney/ChatHub"
azure_endpoint = AZURE_ENDPOINT + f'openai/deployments/{AZURE_ENGINE}/chat/completions?api-version=2023-05-15'
# 兼容旧版的配置
try:
API_URL, = get_conf("API_URL")
@ -124,10 +119,10 @@ model_info = {
},
# azure openai
"azure-gpt35":{
"fn_with_ui": azure_ui,
"fn_without_ui": azure_noui,
"endpoint": get_conf("AZURE_ENDPOINT"),
"azure-gpt-3.5":{
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
"endpoint": azure_endpoint,
"max_token": 4096,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
@ -169,16 +164,6 @@ model_info = {
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
# newbing
"newbing": {
"fn_with_ui": newbing_ui,
"fn_without_ui": newbing_noui,
"endpoint": newbing_endpoint,
"max_token": 4096,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
}
@ -267,6 +252,23 @@ if "newbing-free" in AVAIL_LLM_MODELS:
})
except:
print(trimmed_format_exc())
if "newbing" in AVAIL_LLM_MODELS: # same with newbing-free
try:
from .bridge_newbingfree import predict_no_ui_long_connection as newbingfree_noui
from .bridge_newbingfree import predict as newbingfree_ui
# claude
model_info.update({
"newbing": {
"fn_with_ui": newbingfree_ui,
"fn_without_ui": newbingfree_noui,
"endpoint": newbing_endpoint,
"max_token": 4096,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
}
})
except:
print(trimmed_format_exc())
def LLM_CATCH_EXCEPTION(f):
"""

View File

@ -1,241 +0,0 @@
"""
该文件中主要包含三个函数
不具备多线程能力的函数
1. predict: 正常对话时使用具备完备的交互功能不可多线程
具备多线程调用能力的函数
2. predict_no_ui高级实验性功能模块调用不会实时显示在界面上参数简单可以多线程并行方便实现复杂的功能逻辑
3. predict_no_ui_long_connection在实验过程中发现调用predict_no_ui处理长文档时和openai的连接容易断掉这个函数用stream的方式解决这个问题同样支持多线程
"""
import logging
import traceback
import importlib
import openai
import time
# 读取config.py文件中关于AZURE OPENAI API的信息
from toolbox import get_conf, update_ui, clip_history, trimmed_format_exc
TIMEOUT_SECONDS, MAX_RETRY, AZURE_ENGINE, AZURE_ENDPOINT, AZURE_API_VERSION, AZURE_API_KEY = \
get_conf('TIMEOUT_SECONDS', 'MAX_RETRY',"AZURE_ENGINE","AZURE_ENDPOINT", "AZURE_API_VERSION", "AZURE_API_KEY")
def get_full_error(chunk, stream_response):
"""
获取完整的从Openai返回的报错
"""
while True:
try:
chunk += next(stream_response)
except:
break
return chunk
def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt='', stream = True, additional_fn=None):
"""
发送至azure openai api流式获取输出
用于基础的对话功能
inputs 是本次问询的输入
top_p, temperature是chatGPT的内部调优参数
history 是之前的对话列表注意无论是inputs还是history内容太长了都会触发token数量溢出的错误
chatbot 为WebUI中显示的对话列表修改它然后yeild出去可以直接修改对话界面内容
additional_fn代表点击的哪个按钮按钮见functional.py
"""
print(llm_kwargs["llm_model"])
if additional_fn is not None:
import core_functional
importlib.reload(core_functional) # 热更新prompt
core_functional = core_functional.get_core_functions()
if "PreProcess" in core_functional[additional_fn]: inputs = core_functional[additional_fn]["PreProcess"](inputs) # 获取预处理函数(如果有的话)
inputs = core_functional[additional_fn]["Prefix"] + inputs + core_functional[additional_fn]["Suffix"]
raw_input = inputs
logging.info(f'[raw_input] {raw_input}')
chatbot.append((inputs, ""))
yield from update_ui(chatbot=chatbot, history=history, msg="等待响应") # 刷新界面
payload = generate_azure_payload(inputs, llm_kwargs, history, system_prompt, stream)
history.append(inputs); history.append("")
retry = 0
while True:
try:
openai.api_type = "azure"
openai.api_version = AZURE_API_VERSION
openai.api_base = AZURE_ENDPOINT
openai.api_key = AZURE_API_KEY
response = openai.ChatCompletion.create(timeout=TIMEOUT_SECONDS, **payload);break
except:
retry += 1
chatbot[-1] = ((chatbot[-1][0], "获取response失败重试中。。。"))
retry_msg = f",正在重试 ({retry}/{MAX_RETRY}) ……" if MAX_RETRY > 0 else ""
yield from update_ui(chatbot=chatbot, history=history, msg="请求超时"+retry_msg) # 刷新界面
if retry > MAX_RETRY: raise TimeoutError
gpt_replying_buffer = ""
is_head_of_the_stream = True
if stream:
stream_response = response
while True:
try:
chunk = next(stream_response)
except StopIteration:
from toolbox import regular_txt_to_markdown; tb_str = '```\n' + trimmed_format_exc() + '```'
chatbot[-1] = (chatbot[-1][0], f"[Local Message] 远程返回错误: \n\n{tb_str} \n\n{regular_txt_to_markdown(chunk)}")
yield from update_ui(chatbot=chatbot, history=history, msg="远程返回错误:" + chunk) # 刷新界面
return
if is_head_of_the_stream and (r'"object":"error"' not in chunk):
# 数据流的第一帧不携带content
is_head_of_the_stream = False; continue
if chunk:
#print(chunk)
try:
if "delta" in chunk["choices"][0]:
if chunk["choices"][0]["finish_reason"] == "stop":
logging.info(f'[response] {gpt_replying_buffer}')
break
status_text = f"finish_reason: {chunk['choices'][0]['finish_reason']}"
gpt_replying_buffer = gpt_replying_buffer + chunk["choices"][0]["delta"]["content"]
history[-1] = gpt_replying_buffer
chatbot[-1] = (history[-2], history[-1])
yield from update_ui(chatbot=chatbot, history=history, msg=status_text) # 刷新界面
except Exception as e:
traceback.print_exc()
yield from update_ui(chatbot=chatbot, history=history, msg="Json解析不合常规") # 刷新界面
chunk = get_full_error(chunk, stream_response)
error_msg = chunk
yield from update_ui(chatbot=chatbot, history=history, msg="Json异常" + error_msg) # 刷新界面
return
def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="", observe_window=None, console_slience=False):
"""
发送至AZURE OPENAI API等待回复一次性完成不显示中间过程但内部用stream的方法避免中途网线被掐
inputs
是本次问询的输入
sys_prompt:
系统静默prompt
llm_kwargs
chatGPT的内部调优参数
history
是之前的对话列表
observe_window = None
用于负责跨越线程传递已经输出的部分大部分时候仅仅为了fancy的视觉效果留空即可observe_window[0]观测窗observe_window[1]看门狗
"""
watch_dog_patience = 5 # 看门狗的耐心, 设置5秒即可
payload = generate_azure_payload(inputs, llm_kwargs, history, system_prompt=sys_prompt, stream=True)
retry = 0
while True:
try:
openai.api_type = "azure"
openai.api_version = AZURE_API_VERSION
openai.api_base = AZURE_ENDPOINT
openai.api_key = AZURE_API_KEY
response = openai.ChatCompletion.create(timeout=TIMEOUT_SECONDS, **payload);break
except:
retry += 1
traceback.print_exc()
if retry > MAX_RETRY: raise TimeoutError
if MAX_RETRY!=0: print(f'请求超时,正在重试 ({retry}/{MAX_RETRY}) ……')
stream_response = response
result = ''
while True:
try: chunk = next(stream_response)
except StopIteration:
break
except:
chunk = next(stream_response) # 失败了,重试一次?再失败就没办法了。
if len(chunk)==0: continue
if not chunk.startswith('data:'):
error_msg = get_full_error(chunk, stream_response)
if "reduce the length" in error_msg:
raise ConnectionAbortedError("AZURE OPENAI API拒绝了请求:" + error_msg)
else:
raise RuntimeError("AZURE OPENAI API拒绝了请求" + error_msg)
if ('data: [DONE]' in chunk): break
delta = chunk["delta"]
if len(delta) == 0: break
if "role" in delta: continue
if "content" in delta:
result += delta["content"]
if not console_slience: print(delta["content"], end='')
if observe_window is not None:
# 观测窗,把已经获取的数据显示出去
if len(observe_window) >= 1: observe_window[0] += delta["content"]
# 看门狗,如果超过期限没有喂狗,则终止
if len(observe_window) >= 2:
if (time.time()-observe_window[1]) > watch_dog_patience:
raise RuntimeError("用户取消了程序。")
else: raise RuntimeError("意外Json结构"+delta)
if chunk['finish_reason'] == 'length':
raise ConnectionAbortedError("正常结束但显示Token不足导致输出不完整请削减单次输入的文本量。")
return result
def generate_azure_payload(inputs, llm_kwargs, history, system_prompt, stream):
"""
整合所有信息选择LLM模型生成 azure openai api请求为发送请求做准备
"""
conversation_cnt = len(history) // 2
messages = [{"role": "system", "content": system_prompt}]
if conversation_cnt:
for index in range(0, 2*conversation_cnt, 2):
what_i_have_asked = {}
what_i_have_asked["role"] = "user"
what_i_have_asked["content"] = history[index]
what_gpt_answer = {}
what_gpt_answer["role"] = "assistant"
what_gpt_answer["content"] = history[index+1]
if what_i_have_asked["content"] != "":
if what_gpt_answer["content"] == "": continue
messages.append(what_i_have_asked)
messages.append(what_gpt_answer)
else:
messages[-1]['content'] = what_gpt_answer['content']
what_i_ask_now = {}
what_i_ask_now["role"] = "user"
what_i_ask_now["content"] = inputs
messages.append(what_i_ask_now)
payload = {
"model": llm_kwargs['llm_model'],
"messages": messages,
"temperature": llm_kwargs['temperature'], # 1.0,
"top_p": llm_kwargs['top_p'], # 1.0,
"n": 1,
"stream": stream,
"presence_penalty": 0,
"frequency_penalty": 0,
"engine": AZURE_ENGINE
}
try:
print(f" {llm_kwargs['llm_model']} : {conversation_cnt} : {inputs[:100]} ..........")
except:
print('输入中可能存在乱码。')
return payload

View File

@ -22,8 +22,8 @@ import importlib
# config_private.py放自己的秘密如API和代理网址
# 读取时首先看是否存在私密的config_private配置文件不受git管控如果有则覆盖原config文件
from toolbox import get_conf, update_ui, is_any_api_key, select_api_key, what_keys, clip_history, trimmed_format_exc
proxies, API_KEY, TIMEOUT_SECONDS, MAX_RETRY = \
get_conf('proxies', 'API_KEY', 'TIMEOUT_SECONDS', 'MAX_RETRY')
proxies, TIMEOUT_SECONDS, MAX_RETRY, API_ORG = \
get_conf('proxies', 'TIMEOUT_SECONDS', 'MAX_RETRY', 'API_ORG')
timeout_bot_msg = '[Local Message] Request timeout. Network error. Please check proxy settings in config.py.' + \
'网络错误,检查代理服务器是否可用,以及代理设置的格式是否正确,格式须是[协议]://[地址]:[端口],缺一不可。'
@ -101,6 +101,8 @@ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="",
if (time.time()-observe_window[1]) > watch_dog_patience:
raise RuntimeError("用户取消了程序。")
else: raise RuntimeError("意外Json结构"+delta)
if json_data['finish_reason'] == 'content_filter':
raise RuntimeError("由于提问含不合规内容被Azure过滤。")
if json_data['finish_reason'] == 'length':
raise ConnectionAbortedError("正常结束但显示Token不足导致输出不完整请削减单次输入的文本量。")
return result
@ -205,6 +207,7 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
chunk = get_full_error(chunk, stream_response)
chunk_decoded = chunk.decode()
error_msg = chunk_decoded
openai_website = ' 请登录OpenAI查看详情 https://platform.openai.com/signup'
if "reduce the length" in error_msg:
if len(history) >= 2: history[-1] = ""; history[-2] = "" # 清除当前溢出的输入history[-2] 是本次输入, history[-1] 是本次输出
history = clip_history(inputs=inputs, history=history, tokenizer=model_info[llm_kwargs['llm_model']]['tokenizer'],
@ -214,9 +217,13 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
elif "does not exist" in error_msg:
chatbot[-1] = (chatbot[-1][0], f"[Local Message] Model {llm_kwargs['llm_model']} does not exist. 模型不存在, 或者您没有获得体验资格.")
elif "Incorrect API key" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] Incorrect API key. OpenAI以提供了不正确的API_KEY为由, 拒绝服务.")
chatbot[-1] = (chatbot[-1][0], "[Local Message] Incorrect API key. OpenAI以提供了不正确的API_KEY为由, 拒绝服务. " + openai_website)
elif "exceeded your current quota" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] You exceeded your current quota. OpenAI以账户额度不足为由, 拒绝服务.")
chatbot[-1] = (chatbot[-1][0], "[Local Message] You exceeded your current quota. OpenAI以账户额度不足为由, 拒绝服务." + openai_website)
elif "account is not active" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] Your account is not active. OpenAI以账户失效为由, 拒绝服务." + openai_website)
elif "associated with a deactivated account" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] You are associated with a deactivated account. OpenAI以账户失效为由, 拒绝服务." + openai_website)
elif "bad forward key" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] Bad forward key. API2D账户额度不足.")
elif "Not enough point" in error_msg:
@ -241,6 +248,8 @@ def generate_payload(inputs, llm_kwargs, history, system_prompt, stream):
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}"
}
if API_ORG.startswith('org-'): headers.update({"OpenAI-Organization": API_ORG})
if llm_kwargs['llm_model'].startswith('azure-'): headers.update({"api-key": api_key})
conversation_cnt = len(history) // 2

View File

@ -1,254 +0,0 @@
"""
========================================================================
第一部分来自EdgeGPT.py
https://github.com/acheong08/EdgeGPT
========================================================================
"""
from .edge_gpt import NewbingChatbot
load_message = "等待NewBing响应。"
"""
========================================================================
第二部分子进程Worker调用主体
========================================================================
"""
import time
import json
import re
import logging
import asyncio
import importlib
import threading
from toolbox import update_ui, get_conf, trimmed_format_exc
from multiprocessing import Process, Pipe
def preprocess_newbing_out(s):
pattern = r'\^(\d+)\^' # 匹配^数字^
sub = lambda m: '('+m.group(1)+')' # 将匹配到的数字作为替换值
result = re.sub(pattern, sub, s) # 替换操作
if '[1]' in result:
result += '\n\n```reference\n' + "\n".join([r for r in result.split('\n') if r.startswith('[')]) + '\n```\n'
return result
def preprocess_newbing_out_simple(result):
if '[1]' in result:
result += '\n\n```reference\n' + "\n".join([r for r in result.split('\n') if r.startswith('[')]) + '\n```\n'
return result
class NewBingHandle(Process):
def __init__(self):
super().__init__(daemon=True)
self.parent, self.child = Pipe()
self.newbing_model = None
self.info = ""
self.success = True
self.local_history = []
self.check_dependency()
self.start()
self.threadLock = threading.Lock()
def check_dependency(self):
try:
self.success = False
import certifi, httpx, rich
self.info = "依赖检测通过等待NewBing响应。注意目前不能多人同时调用NewBing接口有线程锁否则将导致每个人的NewBing问询历史互相渗透。调用NewBing时会自动使用已配置的代理。"
self.success = True
except:
self.info = "缺少的依赖如果要使用Newbing除了基础的pip依赖以外您还需要运行`pip install -r request_llm/requirements_newbing.txt`安装Newbing的依赖。"
self.success = False
def ready(self):
return self.newbing_model is not None
async def async_run(self):
# 读取配置
NEWBING_STYLE, = get_conf('NEWBING_STYLE')
from request_llm.bridge_all import model_info
endpoint = model_info['newbing']['endpoint']
while True:
# 等待
kwargs = self.child.recv()
question=kwargs['query']
history=kwargs['history']
system_prompt=kwargs['system_prompt']
# 是否重置
if len(self.local_history) > 0 and len(history)==0:
await self.newbing_model.reset()
self.local_history = []
# 开始问问题
prompt = ""
if system_prompt not in self.local_history:
self.local_history.append(system_prompt)
prompt += system_prompt + '\n'
# 追加历史
for ab in history:
a, b = ab
if a not in self.local_history:
self.local_history.append(a)
prompt += a + '\n'
# if b not in self.local_history:
# self.local_history.append(b)
# prompt += b + '\n'
# 问题
prompt += question
self.local_history.append(question)
print('question:', prompt)
# 提交
async for final, response in self.newbing_model.ask_stream(
prompt=question,
conversation_style=NEWBING_STYLE, # ["creative", "balanced", "precise"]
wss_link=endpoint, # "wss://sydney.bing.com/sydney/ChatHub"
):
if not final:
print(response)
self.child.send(str(response))
else:
print('-------- receive final ---------')
self.child.send('[Finish]')
# self.local_history.append(response)
def run(self):
"""
这个函数运行在子进程
"""
# 第一次运行,加载参数
self.success = False
self.local_history = []
if (self.newbing_model is None) or (not self.success):
# 代理设置
proxies, = get_conf('proxies')
if proxies is None:
self.proxies_https = None
else:
self.proxies_https = proxies['https']
# cookie
NEWBING_COOKIES, = get_conf('NEWBING_COOKIES')
try:
cookies = json.loads(NEWBING_COOKIES)
except:
self.success = False
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] 不能加载Newbing组件。NEWBING_COOKIES未填写或有格式错误。')
self.child.send('[Fail]')
self.child.send('[Finish]')
raise RuntimeError(f"不能加载Newbing组件。NEWBING_COOKIES未填写或有格式错误。")
try:
self.newbing_model = NewbingChatbot(proxy=self.proxies_https, cookies=cookies)
except:
self.success = False
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] 不能加载Newbing组件。{tb_str}')
self.child.send('[Fail]')
self.child.send('[Finish]')
raise RuntimeError(f"不能加载Newbing组件。")
self.success = True
try:
# 进入任务等待状态
asyncio.run(self.async_run())
except Exception:
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] Newbing失败 {tb_str}.')
self.child.send('[Fail]')
self.child.send('[Finish]')
def stream_chat(self, **kwargs):
"""
这个函数运行在主进程
"""
self.threadLock.acquire()
self.parent.send(kwargs) # 发送请求到子进程
while True:
res = self.parent.recv() # 等待newbing回复的片段
if res == '[Finish]':
break # 结束
elif res == '[Fail]':
self.success = False
break
else:
yield res # newbing回复的片段
self.threadLock.release()
"""
========================================================================
第三部分主进程统一调用函数接口
========================================================================
"""
global newbing_handle
newbing_handle = None
def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="", observe_window=None, console_slience=False):
"""
多线程方法
函数的说明请见 request_llm/bridge_all.py
"""
global newbing_handle
if (newbing_handle is None) or (not newbing_handle.success):
newbing_handle = NewBingHandle()
observe_window[0] = load_message + "\n\n" + newbing_handle.info
if not newbing_handle.success:
error = newbing_handle.info
newbing_handle = None
raise RuntimeError(error)
# 没有 sys_prompt 接口因此把prompt加入 history
history_feedin = []
for i in range(len(history)//2):
history_feedin.append([history[2*i], history[2*i+1]] )
watch_dog_patience = 5 # 看门狗 (watchdog) 的耐心, 设置5秒即可
response = ""
observe_window[0] = "[Local Message]: 等待NewBing响应中 ..."
for response in newbing_handle.stream_chat(query=inputs, history=history_feedin, system_prompt=sys_prompt, max_length=llm_kwargs['max_length'], top_p=llm_kwargs['top_p'], temperature=llm_kwargs['temperature']):
observe_window[0] = preprocess_newbing_out_simple(response)
if len(observe_window) >= 2:
if (time.time()-observe_window[1]) > watch_dog_patience:
raise RuntimeError("程序终止。")
return preprocess_newbing_out_simple(response)
def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt='', stream = True, additional_fn=None):
"""
单线程方法
函数的说明请见 request_llm/bridge_all.py
"""
chatbot.append((inputs, "[Local Message]: 等待NewBing响应中 ..."))
global newbing_handle
if (newbing_handle is None) or (not newbing_handle.success):
newbing_handle = NewBingHandle()
chatbot[-1] = (inputs, load_message + "\n\n" + newbing_handle.info)
yield from update_ui(chatbot=chatbot, history=[])
if not newbing_handle.success:
newbing_handle = None
return
if additional_fn is not None:
import core_functional
importlib.reload(core_functional) # 热更新prompt
core_functional = core_functional.get_core_functions()
if "PreProcess" in core_functional[additional_fn]: inputs = core_functional[additional_fn]["PreProcess"](inputs) # 获取预处理函数(如果有的话)
inputs = core_functional[additional_fn]["Prefix"] + inputs + core_functional[additional_fn]["Suffix"]
history_feedin = []
for i in range(len(history)//2):
history_feedin.append([history[2*i], history[2*i+1]] )
chatbot[-1] = (inputs, "[Local Message]: 等待NewBing响应中 ...")
response = "[Local Message]: 等待NewBing响应中 ..."
yield from update_ui(chatbot=chatbot, history=history, msg="NewBing响应缓慢尚未完成全部响应请耐心完成后再提交新问题。")
for response in newbing_handle.stream_chat(query=inputs, history=history_feedin, system_prompt=system_prompt, max_length=llm_kwargs['max_length'], top_p=llm_kwargs['top_p'], temperature=llm_kwargs['temperature']):
chatbot[-1] = (inputs, preprocess_newbing_out(response))
yield from update_ui(chatbot=chatbot, history=history, msg="NewBing响应缓慢尚未完成全部响应请耐心完成后再提交新问题。")
if response == "[Local Message]: 等待NewBing响应中 ...": response = "[Local Message]: NewBing响应异常请刷新界面重试 ..."
history.extend([inputs, response])
logging.info(f'[raw_input] {inputs}')
logging.info(f'[response] {response}')
yield from update_ui(chatbot=chatbot, history=history, msg="完成全部响应,请提交新问题。")

View File

@ -89,9 +89,6 @@ class NewBingHandle(Process):
if a not in self.local_history:
self.local_history.append(a)
prompt += a + '\n'
# if b not in self.local_history:
# self.local_history.append(b)
# prompt += b + '\n'
# 问题
prompt += question
@ -101,7 +98,7 @@ class NewBingHandle(Process):
async for final, response in self.newbing_model.ask_stream(
prompt=question,
conversation_style=NEWBING_STYLE, # ["creative", "balanced", "precise"]
wss_link=endpoint, # "wss://sydney.bing.com/sydney/ChatHub"
wss_link=endpoint, # "wss://sydney.bing.com/sydney/ChatHub"
):
if not final:
print(response)
@ -121,14 +118,26 @@ class NewBingHandle(Process):
self.local_history = []
if (self.newbing_model is None) or (not self.success):
# 代理设置
proxies, = get_conf('proxies')
proxies, NEWBING_COOKIES = get_conf('proxies', 'NEWBING_COOKIES')
if proxies is None:
self.proxies_https = None
else:
self.proxies_https = proxies['https']
if (NEWBING_COOKIES is not None) and len(NEWBING_COOKIES) > 100:
try:
cookies = json.loads(NEWBING_COOKIES)
except:
self.success = False
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] NEWBING_COOKIES未填写或有格式错误。')
self.child.send('[Fail]'); self.child.send('[Finish]')
raise RuntimeError(f"NEWBING_COOKIES未填写或有格式错误。")
else:
cookies = None
try:
self.newbing_model = NewbingChatbot(proxy=self.proxies_https)
self.newbing_model = NewbingChatbot(proxy=self.proxies_https, cookies=cookies)
except:
self.success = False
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
@ -143,7 +152,7 @@ class NewBingHandle(Process):
asyncio.run(self.async_run())
except Exception:
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] Newbing失败 {tb_str}.')
self.child.send(f'[Local Message] Newbing 请求失败,报错信息如下. 如果是与网络相关的问题建议更换代理协议推荐http或代理节点 {tb_str}.')
self.child.send('[Fail]')
self.child.send('[Finish]')
@ -151,18 +160,14 @@ class NewBingHandle(Process):
"""
这个函数运行在主进程
"""
self.threadLock.acquire()
self.parent.send(kwargs) # 发送请求子进程
self.threadLock.acquire() # 获取线程锁
self.parent.send(kwargs) # 请求子进程
while True:
res = self.parent.recv() # 等待newbing回复的片段
if res == '[Finish]':
break # 结束
elif res == '[Fail]':
self.success = False
break
else:
yield res # newbing回复的片段
self.threadLock.release()
res = self.parent.recv() # 等待newbing回复的片段
if res == '[Finish]': break # 结束
elif res == '[Fail]': self.success = False; break # 失败
else: yield res # newbing回复的片段
self.threadLock.release() # 释放线程锁
"""

View File

@ -1,4 +1,4 @@
from .bridge_newbing import preprocess_newbing_out, preprocess_newbing_out_simple
from .bridge_newbingfree import preprocess_newbing_out, preprocess_newbing_out_simple
from multiprocessing import Process, Pipe
from toolbox import update_ui, get_conf, trimmed_format_exc
import threading

View File

@ -1,409 +0,0 @@
"""
========================================================================
第一部分来自EdgeGPT.py
https://github.com/acheong08/EdgeGPT
========================================================================
"""
import argparse
import asyncio
import json
import os
import random
import re
import ssl
import sys
import uuid
from enum import Enum
from typing import Generator
from typing import Literal
from typing import Optional
from typing import Union
import websockets.client as websockets
DELIMITER = "\x1e"
# Generate random IP between range 13.104.0.0/14
FORWARDED_IP = (
f"13.{random.randint(104, 107)}.{random.randint(0, 255)}.{random.randint(0, 255)}"
)
HEADERS = {
"accept": "application/json",
"accept-language": "en-US,en;q=0.9",
"content-type": "application/json",
"sec-ch-ua": '"Not_A Brand";v="99", "Microsoft Edge";v="110", "Chromium";v="110"',
"sec-ch-ua-arch": '"x86"',
"sec-ch-ua-bitness": '"64"',
"sec-ch-ua-full-version": '"109.0.1518.78"',
"sec-ch-ua-full-version-list": '"Chromium";v="110.0.5481.192", "Not A(Brand";v="24.0.0.0", "Microsoft Edge";v="110.0.1587.69"',
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-model": "",
"sec-ch-ua-platform": '"Windows"',
"sec-ch-ua-platform-version": '"15.0.0"',
"sec-fetch-dest": "empty",
"sec-fetch-mode": "cors",
"sec-fetch-site": "same-origin",
"x-ms-client-request-id": str(uuid.uuid4()),
"x-ms-useragent": "azsdk-js-api-client-factory/1.0.0-beta.1 core-rest-pipeline/1.10.0 OS/Win32",
"Referer": "https://www.bing.com/search?q=Bing+AI&showconv=1&FORM=hpcodx",
"Referrer-Policy": "origin-when-cross-origin",
"x-forwarded-for": FORWARDED_IP,
}
HEADERS_INIT_CONVER = {
"authority": "edgeservices.bing.com",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
"accept-language": "en-US,en;q=0.9",
"cache-control": "max-age=0",
"sec-ch-ua": '"Chromium";v="110", "Not A(Brand";v="24", "Microsoft Edge";v="110"',
"sec-ch-ua-arch": '"x86"',
"sec-ch-ua-bitness": '"64"',
"sec-ch-ua-full-version": '"110.0.1587.69"',
"sec-ch-ua-full-version-list": '"Chromium";v="110.0.5481.192", "Not A(Brand";v="24.0.0.0", "Microsoft Edge";v="110.0.1587.69"',
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-model": '""',
"sec-ch-ua-platform": '"Windows"',
"sec-ch-ua-platform-version": '"15.0.0"',
"sec-fetch-dest": "document",
"sec-fetch-mode": "navigate",
"sec-fetch-site": "none",
"sec-fetch-user": "?1",
"upgrade-insecure-requests": "1",
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.69",
"x-edge-shopping-flag": "1",
"x-forwarded-for": FORWARDED_IP,
}
def get_ssl_context():
import certifi
ssl_context = ssl.create_default_context()
ssl_context.load_verify_locations(certifi.where())
return ssl_context
class NotAllowedToAccess(Exception):
pass
class ConversationStyle(Enum):
creative = "h3imaginative,clgalileo,gencontentv3"
balanced = "galileo"
precise = "h3precise,clgalileo"
CONVERSATION_STYLE_TYPE = Optional[
Union[ConversationStyle, Literal["creative", "balanced", "precise"]]
]
def _append_identifier(msg: dict) -> str:
"""
Appends special character to end of message to identify end of message
"""
# Convert dict to json string
return json.dumps(msg) + DELIMITER
def _get_ran_hex(length: int = 32) -> str:
"""
Returns random hex string
"""
return "".join(random.choice("0123456789abcdef") for _ in range(length))
class _ChatHubRequest:
"""
Request object for ChatHub
"""
def __init__(
self,
conversation_signature: str,
client_id: str,
conversation_id: str,
invocation_id: int = 0,
) -> None:
self.struct: dict = {}
self.client_id: str = client_id
self.conversation_id: str = conversation_id
self.conversation_signature: str = conversation_signature
self.invocation_id: int = invocation_id
def update(
self,
prompt,
conversation_style,
options,
) -> None:
"""
Updates request object
"""
if options is None:
options = [
"deepleo",
"enable_debug_commands",
"disable_emoji_spoken_text",
"enablemm",
]
if conversation_style:
if not isinstance(conversation_style, ConversationStyle):
conversation_style = getattr(ConversationStyle, conversation_style)
options = [
"nlu_direct_response_filter",
"deepleo",
"disable_emoji_spoken_text",
"responsible_ai_policy_235",
"enablemm",
conversation_style.value,
"dtappid",
"cricinfo",
"cricinfov2",
"dv3sugg",
]
self.struct = {
"arguments": [
{
"source": "cib",
"optionsSets": options,
"sliceIds": [
"222dtappid",
"225cricinfo",
"224locals0",
],
"traceId": _get_ran_hex(32),
"isStartOfSession": self.invocation_id == 0,
"message": {
"author": "user",
"inputMethod": "Keyboard",
"text": prompt,
"messageType": "Chat",
},
"conversationSignature": self.conversation_signature,
"participant": {
"id": self.client_id,
},
"conversationId": self.conversation_id,
},
],
"invocationId": str(self.invocation_id),
"target": "chat",
"type": 4,
}
self.invocation_id += 1
class _Conversation:
"""
Conversation API
"""
def __init__(
self,
cookies,
proxy,
) -> None:
self.struct: dict = {
"conversationId": None,
"clientId": None,
"conversationSignature": None,
"result": {"value": "Success", "message": None},
}
import httpx
self.proxy = proxy
proxy = (
proxy
or os.environ.get("all_proxy")
or os.environ.get("ALL_PROXY")
or os.environ.get("https_proxy")
or os.environ.get("HTTPS_PROXY")
or None
)
if proxy is not None and proxy.startswith("socks5h://"):
proxy = "socks5://" + proxy[len("socks5h://") :]
self.session = httpx.Client(
proxies=proxy,
timeout=30,
headers=HEADERS_INIT_CONVER,
)
for cookie in cookies:
self.session.cookies.set(cookie["name"], cookie["value"])
# Send GET request
response = self.session.get(
url=os.environ.get("BING_PROXY_URL")
or "https://edgeservices.bing.com/edgesvc/turing/conversation/create",
)
if response.status_code != 200:
response = self.session.get(
"https://edge.churchless.tech/edgesvc/turing/conversation/create",
)
if response.status_code != 200:
print(f"Status code: {response.status_code}")
print(response.text)
print(response.url)
raise Exception("Authentication failed")
try:
self.struct = response.json()
except (json.decoder.JSONDecodeError, NotAllowedToAccess) as exc:
raise Exception(
"Authentication failed. You have not been accepted into the beta.",
) from exc
if self.struct["result"]["value"] == "UnauthorizedRequest":
raise NotAllowedToAccess(self.struct["result"]["message"])
class _ChatHub:
"""
Chat API
"""
def __init__(self, conversation) -> None:
self.wss = None
self.request: _ChatHubRequest
self.loop: bool
self.task: asyncio.Task
print(conversation.struct)
self.request = _ChatHubRequest(
conversation_signature=conversation.struct["conversationSignature"],
client_id=conversation.struct["clientId"],
conversation_id=conversation.struct["conversationId"],
)
async def ask_stream(
self,
prompt: str,
wss_link: str,
conversation_style: CONVERSATION_STYLE_TYPE = None,
raw: bool = False,
options: dict = None,
) -> Generator[str, None, None]:
"""
Ask a question to the bot
"""
if self.wss and not self.wss.closed:
await self.wss.close()
# Check if websocket is closed
self.wss = await websockets.connect(
wss_link,
extra_headers=HEADERS,
max_size=None,
ssl=get_ssl_context()
)
await self._initial_handshake()
# Construct a ChatHub request
self.request.update(
prompt=prompt,
conversation_style=conversation_style,
options=options,
)
# Send request
await self.wss.send(_append_identifier(self.request.struct))
final = False
while not final:
objects = str(await self.wss.recv()).split(DELIMITER)
for obj in objects:
if obj is None or not obj:
continue
response = json.loads(obj)
if response.get("type") != 2 and raw:
yield False, response
elif response.get("type") == 1 and response["arguments"][0].get(
"messages",
):
resp_txt = response["arguments"][0]["messages"][0]["adaptiveCards"][
0
]["body"][0].get("text")
yield False, resp_txt
elif response.get("type") == 2:
final = True
yield True, response
async def _initial_handshake(self) -> None:
await self.wss.send(_append_identifier({"protocol": "json", "version": 1}))
await self.wss.recv()
async def close(self) -> None:
"""
Close the connection
"""
if self.wss and not self.wss.closed:
await self.wss.close()
class NewbingChatbot:
"""
Combines everything to make it seamless
"""
def __init__(
self,
cookies,
proxy
) -> None:
if cookies is None:
cookies = {}
self.cookies = cookies
self.proxy = proxy
self.chat_hub: _ChatHub = _ChatHub(
_Conversation(self.cookies, self.proxy),
)
async def ask(
self,
prompt: str,
wss_link: str,
conversation_style: CONVERSATION_STYLE_TYPE = None,
options: dict = None,
) -> dict:
"""
Ask a question to the bot
"""
async for final, response in self.chat_hub.ask_stream(
prompt=prompt,
conversation_style=conversation_style,
wss_link=wss_link,
options=options,
):
if final:
return response
await self.chat_hub.wss.close()
return None
async def ask_stream(
self,
prompt: str,
wss_link: str,
conversation_style: CONVERSATION_STYLE_TYPE = None,
raw: bool = False,
options: dict = None,
) -> Generator[str, None, None]:
"""
Ask a question to the bot
"""
async for response in self.chat_hub.ask_stream(
prompt=prompt,
conversation_style=conversation_style,
wss_link=wss_link,
raw=raw,
options=options,
):
yield response
async def close(self) -> None:
"""
Close the connection
"""
await self.chat_hub.close()
async def reset(self) -> None:
"""
Reset the conversation
"""
await self.close()
self.chat_hub = _ChatHub(_Conversation(self.cookies, self.proxy))

View File

@ -1,4 +1,5 @@
./docs/gradio-3.32.2-py3-none-any.whl
pydantic==1.10.11
tiktoken>=0.3.3
requests[socks]
transformers
@ -15,4 +16,4 @@ pymupdf
openai
numpy
arxiv
rich
rich

View File

@ -4,6 +4,7 @@ import time
import inspect
import re
import os
import gradio
from latex2mathml.converter import convert as tex2mathml
from functools import wraps, lru_cache
pj = os.path.join
@ -40,7 +41,7 @@ def ArgsGeneralWrapper(f):
"""
装饰器函数用于重组输入参数改变输入参数的顺序与结构
"""
def decorated(cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
def decorated(request: gradio.Request, cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
txt_passon = txt
if txt == "" and txt2 != "": txt_passon = txt2
# 引入一个有cookie的chatbot
@ -54,13 +55,21 @@ def ArgsGeneralWrapper(f):
'top_p':top_p,
'max_length': max_length,
'temperature':temperature,
'client_ip': request.client.host,
}
plugin_kwargs = {
"advanced_arg": plugin_advanced_arg,
}
chatbot_with_cookie = ChatBotWithCookies(cookies)
chatbot_with_cookie.write_list(chatbot)
yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
if cookies.get('lock_plugin', None) is None:
# 正常状态
yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
else:
# 处理个别特殊插件的锁定状态
module, fn_name = cookies['lock_plugin'].split('->')
f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
return decorated
@ -68,8 +77,21 @@ def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
"""
刷新用户界面
"""
assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时可用clear将其清空然后用for+append循环重新赋值。"
yield chatbot.get_cookies(), chatbot, history, msg
assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
cookies = chatbot.get_cookies()
# 解决插件锁定时的界面显示问题
if cookies.get('lock_plugin', None):
label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
chatbot_gr = gradio.update(value=chatbot, label=label)
if cookies.get('label', "") != label: cookies['label'] = label # 记住当前的label
elif cookies.get('label', None):
chatbot_gr = gradio.update(value=chatbot, label=cookies.get('llm_model', ""))
cookies['label'] = None # 清空label
else:
chatbot_gr = chatbot
yield cookies, chatbot_gr, history, msg
def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # 刷新界面
"""
@ -192,7 +214,7 @@ def write_results_to_file(history, file_name=None):
# remove everything that cannot be handled by utf8
f.write(content.encode('utf-8', 'ignore').decode())
f.write('\n\n')
res = '以上材料已经被写入' + os.path.abspath(f'./gpt_log/{file_name}')
res = '以上材料已经被写入:\t' + os.path.abspath(f'./gpt_log/{file_name}')
print(res)
return res
@ -445,8 +467,11 @@ def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
import shutil
if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
new_path = os.path.join(f'./gpt_log/', rename_file)
# 如果已经存在,先删除
if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
# 把文件复制过去
if not os.path.exists(new_path): shutil.copyfile(file, new_path)
# 将文件添加到chatbot cookie中避免多用户干扰
if chatbot:
if 'file_to_promote' in chatbot._cookies: current = chatbot._cookies['file_to_promote']
else: current = []
@ -505,16 +530,24 @@ def on_report_generated(cookies, files, chatbot):
chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
return cookies, report_files, chatbot
def load_chat_cookies():
API_KEY, LLM_MODEL, AZURE_API_KEY = get_conf('API_KEY', 'LLM_MODEL', 'AZURE_API_KEY')
if is_any_api_key(AZURE_API_KEY):
if is_any_api_key(API_KEY): API_KEY = API_KEY + ',' + AZURE_API_KEY
else: API_KEY = AZURE_API_KEY
return {'api_key': API_KEY, 'llm_model': LLM_MODEL}
def is_openai_api_key(key):
API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
return bool(API_MATCH_ORIGINAL)
def is_azure_api_key(key):
API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{32}$", key)
return bool(API_MATCH_ORIGINAL) or bool(API_MATCH_AZURE)
return bool(API_MATCH_AZURE)
def is_api2d_key(key):
if key.startswith('fk') and len(key) == 41:
return True
else:
return False
API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
return bool(API_MATCH_API2D)
def is_any_api_key(key):
if ',' in key:
@ -523,10 +556,10 @@ def is_any_api_key(key):
if is_any_api_key(k): return True
return False
else:
return is_openai_api_key(key) or is_api2d_key(key)
return is_openai_api_key(key) or is_api2d_key(key) or is_azure_api_key(key)
def what_keys(keys):
avail_key_list = {'OpenAI Key':0, "API2D Key":0}
avail_key_list = {'OpenAI Key':0, "Azure Key":0, "API2D Key":0}
key_list = keys.split(',')
for k in key_list:
@ -537,7 +570,11 @@ def what_keys(keys):
if is_api2d_key(k):
avail_key_list['API2D Key'] += 1
return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']}API2D Key {avail_key_list['API2D Key']}"
for k in key_list:
if is_azure_api_key(k):
avail_key_list['Azure Key'] += 1
return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个, Azure Key {avail_key_list['Azure Key']} 个, API2D Key {avail_key_list['API2D Key']}"
def select_api_key(keys, llm_model):
import random
@ -552,8 +589,12 @@ def select_api_key(keys, llm_model):
for k in key_list:
if is_api2d_key(k): avail_key_list.append(k)
if llm_model.startswith('azure-'):
for k in key_list:
if is_azure_api_key(k): avail_key_list.append(k)
if len(avail_key_list) == 0:
raise RuntimeError(f"您提供的api-key不满足要求不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源。")
raise RuntimeError(f"您提供的api-key不满足要求不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源右下角更换模型菜单中可切换openai,azure和api2d请求源")
api_key = random.choice(avail_key_list) # 随机负载均衡
return api_key

View File

@ -1,5 +1,5 @@
{
"version": 3.42,
"version": 3.44,
"show_feature": true,
"new_feature": "完善本地Latex矫错和翻译功能 <-> 增加gpt-3.5-16k的支持 <-> 新增最强Arxiv论文翻译插件 <-> 修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件 <-> 添加了OpenAI音频转文本总结插件 <-> 通过Slack添加对Claude的支持"
"new_feature": "[改善UI] 动态ChatBot窗口高度 <-> 修复Azure接口的BUG <-> 完善多语言模块 <-> 完善本地Latex矫错和翻译功能 <-> 增加gpt-3.5-16k的支持 <-> 新增最强Arxiv论文翻译插件 <-> 修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件"
}