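diff --git a/README.md b/README.md
index d894348..285c10d 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,5 @@
+
+
 # ChatGPT Academic Optimization
 **If you like this project, please give it a Star; if you have come up with better academic shortcuts, issues and pull requests are welcome.**
@@ -16,6 +18,14 @@ https://github.com/polarwinkel/mdtex2html
 The project uses OpenAI's gpt-3.5-turbo model; hoping gpt-4 access opens up soon 😂
 ```
+> **Note**
+>
+> 1. Please note that only the function plugins (buttons) marked in red support reading files. Translation and interpretation of PDF literature is not yet fully supported, and reading Word files is not supported yet.
+>
+> 2. The function of every file in this project is described in detail in the self-analysis report [`project_self_analysis.md`](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A). As versions iterate, you can also click the relevant function plugin at any time to have GPT regenerate the project's self-analysis report. Frequently asked questions are collected in the [`wiki`](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98).
+>
+> 3. If you are not comfortable with some of the Chinese-named functions, you can click the relevant function plugin at any time to have GPT generate an all-English version of the project source code in one click.
+
 Feature | Description
@@ -34,12 +44,13 @@ Chat analysis report generation | [Experimental] Automatically generates a summary report after running
 Formula display | Shows both the TeX source and the rendered form of a formula
 Image display | Can display images in markdown
 Markdown tables in GPT output | Can render markdown tables produced by GPT
+…… | ……
 - New interface
-
+
@@ -71,33 +82,57 @@ Chat analysis report generation | [Experimental] Automatically generates a summary report after running
-## Run directly (Windows or Linux or MacOS)
+## Run directly (Windows, Linux or MacOS)
-``` sh
-# Download the project
+### 1. Download the project
+```sh
 git clone https://github.com/binary-husky/chatgpt_academic.git
 cd chatgpt_academic
-# In config.py, configure the overseas proxy and the OpenAI API KEY
-- 1. If you are in mainland China, you need an overseas proxy to use the OpenAI API; you can set it in config.py.
-- 2. Configure the OpenAI API KEY. You need to register on the OpenAI website and obtain an API KEY. Once you have it, set it in config.py.
-# Install dependencies
-python -m pip install -r requirements.txt
-# Run
-python main.py
-
-# Test the experimental features
-## Test C++ project header analysis
-In the input area, enter ./crazy_functions/test_project/cpp/libJPG , then click "[实验] 解析整个C++项目(input输入项目根路径)"
-## Test writing an abstract for a LaTeX project
-In the input area, enter ./crazy_functions/test_project/latex/attention , then click "[实验] 读tex论文写摘要(input输入项目根路径)"
-## Test Python project analysis
-In the input area, enter ./crazy_functions/test_project/python/dqn , then click "[实验] 解析整个py项目(input输入项目根路径)"
-## Test self code analysis
-Click "[实验] 请解析并解构此项目本身"
-## Test the experimental function template (asks GPT for the squares of a few numbers); you can use this function as a template to implement more complex features
-Click "[实验] 实验功能函数模板"
 ```
+### 2. Configure API_KEY and proxy settings
+
+In `config.py`, configure the overseas proxy and the OpenAI API KEY, as described below
+```
+1. If you are in mainland China, you need an overseas proxy to use the OpenAI API reliably. Read config.py carefully for how to set it up (1. set USE_PROXY to True; 2. edit proxies as instructed there).
+2. Configure the OpenAI API KEY. You need to register on the OpenAI website and obtain an API KEY. Once you have it, set it in config.py.
+3. Issues related to the proxy network (timeouts, proxy not working) are collected at https://github.com/binary-husky/chatgpt_academic/issues/1
+```
+(P.S. At startup the program first checks for a private configuration file named `config_private.py` and uses its settings to override the same-named settings in `config.py`. So, if you understand this configuration loading logic, we strongly recommend creating a new file named `config_private.py` next to `config.py` and moving (copying) the settings from `config.py` into it. `config_private.py` is not tracked by git, which keeps your private information safer.)
+
+
+### 3. Install dependencies
+```sh
+# (Option 1) Recommended
+python -m pip install -r requirements.txt
+
+# (Option 2) If you use anaconda, the steps are similar:
+# (Option 2.1) conda create -n gptac_venv python=3.11
+# (Option 2.2) conda activate gptac_venv
+# (Option 2.3) python -m pip install -r requirements.txt
+
+# Note: use the official pip index or the Aliyun mirror; other mirrors (such as the Tsinghua mirror) may cause problems. To switch the index temporarily:
+# python -m pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
+```
+
+### 4. Run
+```sh
+python main.py
+```
+
+### 5. Test the experimental features
+```
+- Test C++ project header analysis
+ In the input area, enter `./crazy_functions/test_project/cpp/libJPG` , then click "[实验] 解析整个C++项目(input输入项目根路径)"
+- Test writing an abstract for a LaTeX project
+ In the input area, enter `./crazy_functions/test_project/latex/attention` , then click "[实验] 读tex论文写摘要(input输入项目根路径)"
+- Test Python project analysis
+ In the input area, enter `./crazy_functions/test_project/python/dqn` , then click "[实验] 解析整个py项目(input输入项目根路径)"
+- Test self code analysis
+ Click "[实验] 请解析并解构此项目本身"
+- Test the experimental function template (asks GPT what happened on this day in history); you can use this function as a template to implement more complex features
+ Click "[实验] 实验功能函数模板"
+```
 ## Using docker (Linux)
@@ -115,7 +150,7 @@ docker run --rm -it --net=host gpt-academic
 # Test the experimental features
 ## Test self code analysis
 Click "[实验] 请解析并解构此项目本身"
-## Test the experimental function template (asks GPT for the squares of a few numbers); you can use this function as a template to implement more complex features
+## Test the experimental function template (asks GPT what happened on this day in history); you can use this function as a template to implement more complex features
 Click "[实验] 实验功能函数模板"
 ## (Note: when running in docker, pay extra attention to the program's file access permissions)
 ## Test C++ project header analysis
@@ -127,6 +162,13 @@ In the input area, enter ./crazy_functions/test_project/python/dqn , then click "[
 ```
+## Other deployment options
+- Using WSL2 (Windows Subsystem for Linux)
+See [deployment wiki-1](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2)
+
+- Remote deployment with nginx
+See [deployment wiki-2](https://github.com/binary-husky/chatgpt_academic/wiki/%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E7%9A%84%E6%8C%87%E5%AF%BC)
+
 ## Customizing new shortcut buttons (custom academic shortcuts)
 Open functional.py, add an entry as shown below, then restart the program. (If the button has already been added and is visible, both the prefix and the suffix can be hot-edited and take effect without restarting.)
@@ -166,11 +208,12 @@ python check_proxy.py
 ## Compatibility tests
 ### Image display:
+
-
-
+
+
 ### If a program can read and analyze itself: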
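Reviewer note: the README change above describes settings in `config_private.py` overriding same-named settings in `config.py`. Below is a minimal sketch of that lookup order, assuming a per-name fallback. The helper `read_conf` and the module names `demo_config` and `demo_config_private` are hypothetical and exist only for this illustration; they are not the project's own `get_conf`.

```python
import importlib
import os
import sys
import tempfile


def read_conf(name, private_module, public_module):
    """Return `name` from the private module if it defines it, else from the public one."""
    try:
        return getattr(importlib.import_module(private_module), name)
    except (ImportError, AttributeError):
        # Private file missing, or it does not define this name: use the public default.
        return getattr(importlib.import_module(public_module), name)


# Build two throwaway config modules so the example runs anywhere (hypothetical names).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo_config.py"), "w") as f:
    f.write("USE_PROXY = False\nWEB_PORT = -1\n")
with open(os.path.join(demo_dir, "demo_config_private.py"), "w") as f:
    f.write("USE_PROXY = True\n")  # overrides only this one setting
sys.path.insert(0, demo_dir)
importlib.invalidate_caches()

use_proxy = read_conf("USE_PROXY", "demo_config_private", "demo_config")
web_port = read_conf("WEB_PORT", "demo_config_private", "demo_config")
print(use_proxy, web_port)
```

In this sketch the private file only needs to list the settings it changes; anything it leaves out falls through to the public file.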
diff --git a/check_proxy.py b/check_proxy.py
index 39c8972..a6919dd 100644
--- a/check_proxy.py
+++ b/check_proxy.py
@@ -21,6 +21,7 @@ def check_proxy(proxies):
 if __name__ == '__main__':
     import os; os.environ['no_proxy'] = '*' # avoid unexpected interference from the proxy network
-    try: from config_private import proxies # keep your own secrets here, such as the API key and proxy URL os.path.exists('config_private.py')
-    except: from config import proxies
-    check_proxy(proxies)
\ No newline at end of file
+    from toolbox import get_conf
+    proxies, = get_conf('proxies')
+    check_proxy(proxies)
+
\ No newline at end of file
diff --git a/config.py b/config.py
index 644bd4d..7fc73db 100644
--- a/config.py
+++ b/config.py
@@ -1,5 +1,5 @@
 # API_KEY = "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" this key is invalid
-API_KEY = "sk-此处填API秘钥"
+API_KEY = "sk-此处填API密钥"
 API_URL = "https://api.openai.com/v1/chat/completions"
 # Set to True to use the proxy
@@ -36,8 +36,3 @@ CONCURRENT_COUNT = 100
 # Set usernames and passwords
 AUTHENTICATION = [] # [("username", "password"), ("username2", "password2"), ...]
-
-# Check whether you forgot to edit config
-if len(API_KEY) != 51:
-    assert False, "正确的API_KEY密钥是51位,请在config文件中修改API密钥, 添加海外代理之后再运行。" + \
-        "(如果您刚更新过代码,请确保旧版config_private文件中没有遗留任何新增键值)"
diff --git a/functional.py b/functional.py
index e416063..2ed1507 100644
--- a/functional.py
+++ b/functional.py
@@ -2,58 +2,53 @@
 # The 'secondary' color corresponds to neutral_hue in theme.py
 # The 'stop' color corresponds to color_er in theme.py
 # The default button color is secondary
+from toolbox import clear_line_break
 def get_functionals():
     return {
         "英语学术润色": {
-            "Prefix": "Below is a paragraph from an academic paper. Polish the writing to meet the academic style, \
-improve the spelling, grammar, clarity, concision and overall readability. When neccessary, rewrite the whole sentence. \
-Furthermore, list all modification and explain the reasons to do so in markdown table.\n\n", # Prefix
-            "Suffix": "", # Suffix
-            "Color": "secondary", # Button color
+            # Prefix
+            "Prefix": r"Below is a paragraph from an academic paper. Polish the writing to meet the academic style, " +
+                      r"improve the spelling, grammar, clarity, concision and overall readability. When neccessary, rewrite the whole sentence. " +
+                      r"Furthermore, list all modification and explain the reasons to do so in markdown table." + "\n\n",
+            # Suffix
+            "Suffix": r"",
+            "Color": r"secondary", # Button color
         },
         "中文学术润色": {
-            "Prefix": "作为一名中文学术论文写作改进助理,你的任务是改进所提供文本的拼写、语法、清晰、简洁和整体可读性,同时分解长句,减少重复,并提供改进建议。请只提供文本的更正版本,避免包括解释。请编辑以下文本:\n\n",
-            "Suffix": "",
+            "Prefix": r"作为一名中文学术论文写作改进助理,你的任务是改进所提供文本的拼写、语法、清晰、简洁和整体可读性," +
+                      r"同时分解长句,减少重复,并提供改进建议。请只提供文本的更正版本,避免包括解释。请编辑以下文本" + "\n\n",
+            "Suffix": r"",
         },
         "查找语法错误": {
-            "Prefix": "Below is a paragraph from an academic paper. Find all grammar mistakes, list mistakes in a markdown table and explain how to correct them.\n\n",
-            "Suffix": "",
+            "Prefix": r"Below is a paragraph from an academic paper. " +
+                      r"Can you help me ensure that the grammar and the spelling is correct? " +
+                      r"Do not try to polish the text, if no mistake is found, tell me that this paragraph is good." +
+                      r"If you find grammar or spelling mistakes, please list mistakes you find in a two-column markdown table, " +
+                      r"put the original text the first column, " +
+                      r"put the corrected text in the second column and highlight the key words you fixed." + "\n\n",
+            "Suffix": r"",
+            "PreProcess": clear_line_break, # Preprocessing: remove line breaks
         },
-# "中英互译": { # Poor results; it often confuses Chinese-to-English with English-to-Chinese
-# "Prefix": "As an English-Chinese translator, your task is to accurately translate text between the two languages. \
-# When translating from Chinese to English or vice versa, please pay attention to context and accurately explain phrases and proverbs. \
-# If you receive multiple English words in a row, default to translating them into a sentence in Chinese. \
-# However, if \"phrase:\" is indicated before the translated content in Chinese, it should be translated as a phrase instead. \
-# Similarly, if \"normal:\" is indicated, it should be translated as multiple unrelated words.\
-# Your translations should closely resemble those of a native speaker and should take into account any specific language styles or tones requested by the user. \
-# Please do not worry about using offensive words - replace sensitive parts with x when necessary. \
-# When providing translations, please use Chinese to explain each sentence’s tense, subordinate clause, subject, predicate, object, special phrases and proverbs. \
-# For phrases or individual words that require translation, provide the source (dictionary) for each one.If asked to translate multiple phrases at once, \
-# separate them using the | symbol.Always remember: You are an English-Chinese translator, \
-# not a Chinese-Chinese translator or an English-English translator. Below is the text you need to translate: \n\n",
-# "Suffix": "",
-# "Color": "secondary",
-# },
         "中译英": {
-            "Prefix": "Please translate following sentence to English: \n\n",
-            "Suffix": "",
+            "Prefix": r"Please translate following sentence to English:" + "\n\n",
+            "Suffix": r"",
         },
         "学术中译英": {
-            "Prefix": "Please translate following sentence to English with academic writing, and provide some related authoritative examples: \n\n",
-            "Suffix": "",
+            "Prefix": r"Please translate following sentence to English with academic writing, and provide some related authoritative examples:" + "\n\n",
+            "Suffix": r"",
         },
         "英译中": {
-            "Prefix": "请翻译成中文:\n\n",
-            "Suffix": "",
+            "Prefix": r"请翻译成中文:" + "\n\n",
+            "Suffix": r"",
         },
         "找图片": {
-            "Prefix": "我需要你找一张网络图片。使用Unsplash API(https://source.unsplash.com/960x640/?<英语关键词>)获取图片URL,然后请使用Markdown格式封装,并且不要有反斜线,不要用代码块。现在,请按以下描述给我发送图片:\n\n",
-            "Suffix": "",
+            "Prefix": r"我需要你找一张网络图片。使用Unsplash API(https://source.unsplash.com/960x640/?<英语关键词>)获取图片URL," +
+                      r"然后请使用Markdown格式封装,并且不要有反斜线,不要用代码块。现在,请按以下描述给我发送图片:" + "\n\n",
+            "Suffix": r"",
         },
         "解释代码": {
-            "Prefix": "请解释以下代码:\n```\n",
-            "Suffix": "\n```\n",
-            "Color": "secondary",
+            "Prefix": r"请解释以下代码:" + "\n```\n",
+            "Suffix": "\n```\n",
         },
     }
diff --git a/main.py b/main.py
index 0537bd7..c69795a 100644
--- a/main.py
+++ b/main.py
@@ -1,11 +1,12 @@
 import os; os.environ['no_proxy'] = '*' # avoid unexpected interference from the proxy network
 import gradio as gr
 from predict import predict
-from toolbox import format_io, find_free_port, on_file_uploaded, on_report_generated
+from toolbox import format_io, find_free_port, on_file_uploaded, on_report_generated, get_conf
 # We recommend copying your settings into a config_private.py for your own secrets, such as the API key and proxy URL, so they are not accidentally pushed to GitHub and seen by others
-try: from config_private import proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION
-except: from config import proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION
+proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION = \
+    get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION')
+
 # If WEB_PORT is -1, pick a random web port
 PORT = find_free_port() if WEB_PORT <= 0 else WEB_PORT
@@ -42,18 +43,17 @@ with gr.Blocks(theme=set_theme, analytics_enabled=False) as demo:
     with gr.Row():
         with gr.Column(scale=2):
             chatbot = gr.Chatbot()
-            chatbot.style(height=1000)
+            chatbot.style(height=1150)
             chatbot.style()
             history = gr.State([])
         with gr.Column(scale=1):
             with gr.Row():
-                with gr.Column(scale=12):
-                    txt = gr.Textbox(show_label=False, placeholder="Input question here.").style(container=False)
-                with gr.Column(scale=1):
-                    with gr.Row():
-                        resetBtn = gr.Button("重置", variant="secondary")
-                        stopBtn = gr.Button("停止", variant="secondary")
-                        submitBtn = gr.Button("提交", variant="primary")
+                txt = gr.Textbox(show_label=False, placeholder="Input question here.").style(container=False)
+            with gr.Row():
+                submitBtn = gr.Button("提交", variant="primary")
+            with gr.Row():
+                resetBtn = gr.Button("重置", variant="secondary"); resetBtn.style(size="sm")
+                stopBtn = gr.Button("停止", variant="secondary"); stopBtn.style(size="sm")
             with gr.Row():
                 from check_proxy import check_proxy
                 statusDisplay = gr.Markdown(f"Tip: 按Enter提交, 按Shift+Enter换行。当前模型: {LLM_MODEL} \n {check_proxy(proxies)}")
@@ -62,7 +62,7 @@ with gr.Blocks(theme=set_theme, analytics_enabled=False) as demo:
                     variant = functional[k]["Color"] if "Color" in functional[k] else "secondary"
                     functional[k]["Button"] = gr.Button(k, variant=variant)
             with gr.Row():
-                gr.Markdown("注意:以下红颜色标识的函数插件需从input区读取路径作为参数.")
+                gr.Markdown("注意:以下“红颜色”标识的函数插件需从input区读取路径作为参数.")
             with gr.Row():
                 for k in crazy_functional:
                     variant = crazy_functional[k]["Color"] if "Color" in crazy_functional[k] else "secondary"
@@ -101,7 +101,7 @@ with gr.Blocks(theme=set_theme, analytics_enabled=False) as demo:
 # gradio's inbrowser trigger is not very reliable, so roll back to the original browser-opening function
 def auto_opentab_delay():
     import threading, webbrowser, time
-    print(f"URL http://localhost:{PORT}")
+    print(f"如果浏览器没有自动打开,请复制并转到以下URL: http://localhost:{PORT}")
     def open():
         time.sleep(2)
         webbrowser.open_new_tab(f'http://localhost:{PORT}')
diff --git a/predict.py b/predict.py
index 55a25e6..84036bc 100644
--- a/predict.py
+++ b/predict.py
@@ -20,10 +20,12 @@ import importlib
 # config_private.py holds your own secrets, such as the API key and proxy URL
 # On load, first check whether a private config_private file exists (not tracked by git); if it does, it overrides the original config file
-try: from config_private import proxies, API_URL, API_KEY, TIMEOUT_SECONDS, MAX_RETRY, LLM_MODEL
-except: from config import proxies, API_URL, API_KEY, TIMEOUT_SECONDS, MAX_RETRY, LLM_MODEL
+from toolbox import get_conf
+proxies, API_URL, API_KEY, TIMEOUT_SECONDS, MAX_RETRY, LLM_MODEL = \
+    get_conf('proxies', 'API_URL', 'API_KEY', 'TIMEOUT_SECONDS', 'MAX_RETRY', 'LLM_MODEL')
-timeout_bot_msg = '[local] Request timeout, network error. please check proxy settings in config.py.'
+timeout_bot_msg = '[Local Message] Request timeout. Network error. Please check proxy settings in config.py.' + \
+    '网络错误,检查代理服务器是否可用,以及代理设置的格式是否正确,格式须是[协议]://[地址]:[端口],缺一不可。'
 def get_full_error(chunk, stream_response):
     """
@@ -117,8 +119,9 @@ def predict(inputs, top_p, temperature, chatbot=[], history=[], system_prompt=''
     """
     if additional_fn is not None:
         import functional
-        importlib.reload(functional)
+        importlib.reload(functional) # hot-reload the prompts
         functional = functional.get_functionals()
+        if "PreProcess" in functional[additional_fn]: inputs = functional[additional_fn]["PreProcess"](inputs) # apply the preprocessing function, if there is one
         inputs = functional[additional_fn]["Prefix"] + inputs + functional[additional_fn]["Suffix"]
     if stream:
diff --git a/toolbox.py b/toolbox.py
index f0ec566..75dd8bc 100644
--- a/toolbox.py
+++ b/toolbox.py
@@ -1,15 +1,16 @@
-import markdown, mdtex2html, threading
+import markdown, mdtex2html, threading, importlib, traceback
 from show_math import convert as convert_math
 from functools import wraps
+import re
 def predict_no_ui_but_counting_down(i_say, i_say_show_user, chatbot, top_p, temperature, history=[], sys_prompt=''):
     """
     Calls the simple predict_no_ui interface while still keeping a little UI heartbeat; when the conversation is too long, it is automatically truncated by bisection
     """
     import time
-    try: from config_private import TIMEOUT_SECONDS, MAX_RETRY
-    except: from config import TIMEOUT_SECONDS, MAX_RETRY
     from predict import predict_no_ui
+    from toolbox import get_conf
+    TIMEOUT_SECONDS, MAX_RETRY = get_conf('TIMEOUT_SECONDS', 'MAX_RETRY')
     # With multiple threads, a mutable structure is needed to pass information between them
     # A list is the simplest mutable structure: the first slot holds the gpt output, the second carries error messages
     mutable = [None, '']
@@ -80,10 +81,9 @@ def CatchException(f):
         try:
             yield from f(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT)
         except Exception as e:
-            import traceback
             from check_proxy import check_proxy
-            try: from config_private import proxies
-            except: from config import proxies
+            from toolbox import get_conf
+            proxies, = get_conf('proxies')
             tb_str = regular_txt_to_markdown(traceback.format_exc())
             chatbot[-1] = (chatbot[-1][0], f"[Local Message] 实验性函数调用出错: \n\n {tb_str} \n\n 当前代理可用性: \n\n {check_proxy(proxies)}")
             yield chatbot, history, f'异常 {e}'
@@ -107,8 +107,8 @@ def text_divide_paragraph(text):
         # wtf input
         lines = text.split("\n")
         for i, line in enumerate(lines):
-            lines[i] = "<p>"+lines[i].replace(" ", "&nbsp;")+"</p>"
-        text = "\n".join(lines)
+            lines[i] = lines[i].replace(" ", "&nbsp;")
+        text = "</br>".join(lines)
         return text
 def markdown_convertion(txt):
@@ -218,3 +218,27 @@ def on_report_generated(files, chatbot):
     # files.extend(report_files)
     chatbot.append(['汇总报告如何远程获取?', '汇总报告已经添加到右侧文件上传区,请查收。'])
     return report_files, chatbot
+
+def get_conf(*args):
+    # We recommend copying your settings into a config_private.py for your own secrets, such as the API key and proxy URL, so they are not accidentally pushed to GitHub and seen by others
+    res = []
+    for arg in args:
+        try: r = getattr(importlib.import_module('config_private'), arg)
+        except: r = getattr(importlib.import_module('config'), arg)
+        res.append(r)
+        # When reading API_KEY, check whether you forgot to edit config
+        if arg=='API_KEY':
+            # A valid API_KEY is "sk-" followed by 48 upper/lower-case letters and digits
+            API_MATCH = re.match(r"sk-[a-zA-Z0-9]{48}$", r)
+            if API_MATCH:
+                print("您的 API_KEY 是: ", r, "\nAPI_KEY 导入成功")
+            else:
+                assert False, "正确的 API_KEY 是 'sk-' + '48 位大小写字母数字' 的组合,请在config文件中修改API密钥, 添加海外代理之后再运行。" + \
+                    "(如果您刚更新过代码,请确保旧版config_private文件中没有遗留任何新增键值)"
+    return res
+
+def clear_line_break(txt):
+    txt = txt.replace('\n', ' ')
+    txt = txt.replace('  ', ' ')
+    txt = txt.replace('  ', ' ')
+    return txt
\ No newline at end of file
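Reviewer note: the `PreProcess` key added in `functional.py` and consumed in `predict.py` can be exercised in isolation. The sketch below assumes an entry shaped like the ones in `get_functionals()`. The helper `build_prompt` and the sample entry are hypothetical, written only for this illustration; `clear_line_break` is restated with the same three replacements as the patch.

```python
def clear_line_break(txt):
    """Turn newlines into spaces, then collapse doubled spaces twice."""
    txt = txt.replace("\n", " ")
    txt = txt.replace("  ", " ")
    txt = txt.replace("  ", " ")
    return txt


def build_prompt(entry, inputs):
    """Apply the optional PreProcess hook, then wrap the text in Prefix and Suffix."""
    preprocess = entry.get("PreProcess")
    if preprocess is not None:
        inputs = preprocess(inputs)
    return entry["Prefix"] + inputs + entry.get("Suffix", "")


# Hypothetical entry, modelled on the grammar-check button in the patch.
entry = {
    "Prefix": "Check grammar:\n\n",
    "Suffix": "",
    "PreProcess": clear_line_break,
}
prompt = build_prompt(entry, "First line\nsecond   line")
print(repr(prompt))
```

Two passes of the double-space replacement only shrink runs of up to four spaces down to one; longer runs would need a regular expression such as `re.sub(r" +", " ", txt)`, which is a possible follow-up rather than part of this patch.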