官网:Vanna.AI – Personalized AI SQL Agent
本文以 mysql-ollama-chromadb 为例。
一、前置要求
1、安装 ollama,见官网:Ollama
自行拉取安装所需模型即可
2、安装 python ,官网下载即可
# 配置下全局镜像源
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
二、使用虚拟环境开发
cd your_project
# 这里使用 uv 来作为管理
pip install uv
# 安装依赖
uv pip install 'vanna[chromadb,ollama,mysql]'
# 进入虚拟环境
.venv\Scripts\activate
三、使用
1、编写代码,新建 main.py ,安装官方文档示例进行配置
使用 Ollama 和 ChromaDB 生成 SQL for MySQL – Vanna.AI
完整示例代码如下:(注意修改配置)
from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore
from vanna.flask import VannaFlaskApp
class MyVanna(ChromaDB_VectorStore, Ollama):
def __init__(self, config=None):
ChromaDB_VectorStore.__init__(self, config=config)
Ollama.__init__(self, config=config)
# 配置 ollama 模型
vn = MyVanna(config={'model': 'deepseek-r1:7b'})
# 连接数据库
vn.connect_to_mysql(host='my-host', dbname='my-db', user='my-user', password='my-password', port=123)
# ------------------ 训练信息 ------------------
# The information schema query may need some tweaking depending on your database. This is a good starting point.
# 读取表元信息
df_information_schema = vn.run_sql("SELECT * FROM INFORMATION_SCHEMA.COLUMNS")
# This will break up the information schema into bite-sized chunks that can be referenced by the LLM
plan = vn.get_training_plan_generic(df_information_schema)
plan
# If you like the plan, then uncomment this and run it to train
vn.train(plan=plan)
'''
# The following are methods for adding training data. Make sure you modify the examples to match your database.
# DDL statements are powerful because they specify table names, colume names, types, and potentially relationships
vn.train(ddl="""
CREATE TABLE IF NOT EXISTS my-table (
id INT PRIMARY KEY,
name VARCHAR(100),
age INT
)
""")
# Sometimes you may want to add documentation about your business terminology or definitions.
vn.train(documentation="Our business defines OTIF score as the percentage of orders that are delivered on time and in full")
# You can also add SQL queries to your training data. This is useful if you have some queries already laying around.
# You can just copy and paste those from your editor to begin generating new SQL.
vn.train(sql="SELECT * FROM my-table WHERE name = 'John Doe'")
# At any time you can inspect what training data the package is able to reference
training_data = vn.get_training_data()
training_data
# You can remove training data if there's obsolete/incorrect information.
vn.remove_training_data(id='1-ddl')
## Asking the AI
# Whenever you ask a new question, it will find the 10 most relevant pieces of training data and use it as part of the LLM prompt to generate the SQL.
vn.ask(question=...)
'''
# ------------------ 训练信息 ------------------
# 启动 WebUI
app = VannaFlaskApp(vn)
app.run()
注意:这里训练如果数据表有变更可以多次训练
2、使用WebUI
直接 python main.py
命令启动运行!
期间可能会下载其他依赖,耐心等待多试几次即可。这里可能需要依赖魔法网络,或可参考 huggingface 的镜像网站解决:参见 HF-Mirror

启动成功后会显示地址,打开即可!
WebUI 提供 :训练数据(DDL/文档/sql)、删除、问答训练标记等功能,从这里开始炼丹~
训练部分可参考文章:
How to Train – Vanna.AI Documentation
https://juejin.cn/post/7457182240075612201