使用vanna AI 进行text2sql的业务炼丹体验

本文以 mysql-ollama-chromadb 为例。

一、前置要求

1、安装 ollama，见官网：Ollama

自行拉取安装所需模型即可

2、安装 python ，官网下载即可

# 配置下全局镜像源
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

二、使用虚拟环境开发

cd your_project

# 这里使用 uv 来作为管理
pip install uv

# 安装依赖
uv pip install 'vanna[chromadb,ollama,mysql]'

# 进入虚拟环境
.venv\Scripts\activate

三、使用

1、编写代码，新建 main.py ，安装官方文档示例进行配置

使用 Ollama 和 ChromaDB 生成 SQL for MySQL – Vanna.AI

完整示例代码如下：（注意修改配置）

from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore
from vanna.flask import VannaFlaskApp


class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)

# 配置 ollama 模型
vn = MyVanna(config={'model': 'deepseek-r1:7b'})
# 连接数据库
vn.connect_to_mysql(host='my-host', dbname='my-db', user='my-user', password='my-password', port=123)



# ------------------ 训练信息 ------------------

# The information schema query may need some tweaking depending on your database. This is a good starting point.
# 读取表元信息
df_information_schema = vn.run_sql("SELECT * FROM INFORMATION_SCHEMA.COLUMNS")

# This will break up the information schema into bite-sized chunks that can be referenced by the LLM
plan = vn.get_training_plan_generic(df_information_schema)
plan

# If you like the plan, then uncomment this and run it to train
vn.train(plan=plan)

'''
# The following are methods for adding training data. Make sure you modify the examples to match your database.

# DDL statements are powerful because they specify table names, colume names, types, and potentially relationships
vn.train(ddl="""
    CREATE TABLE IF NOT EXISTS my-table (
        id INT PRIMARY KEY,
        name VARCHAR(100),
        age INT
    )
""")

# Sometimes you may want to add documentation about your business terminology or definitions.
vn.train(documentation="Our business defines OTIF score as the percentage of orders that are delivered on time and in full")

# You can also add SQL queries to your training data. This is useful if you have some queries already laying around. 
# You can just copy and paste those from your editor to begin generating new SQL.
vn.train(sql="SELECT * FROM my-table WHERE name = 'John Doe'")

# At any time you can inspect what training data the package is able to reference
training_data = vn.get_training_data()
training_data


# You can remove training data if there's obsolete/incorrect information. 
vn.remove_training_data(id='1-ddl')

## Asking the AI
# Whenever you ask a new question, it will find the 10 most relevant pieces of training data and use it as part of the LLM prompt to generate the SQL.
vn.ask(question=...)
'''

# ------------------ 训练信息 ------------------



# 启动 WebUI
app = VannaFlaskApp(vn)
app.run()

注意：这里训练如果数据表有变更可以多次训练

2、使用WebUI

直接 python main.py 命令启动运行！

期间可能会下载其他依赖，耐心等待多试几次即可。这里可能需要依赖魔法网络，或可参考 huggingface 的镜像网站解决：参见 HF-Mirror

启动成功后会显示地址，打开即可！

WebUI 提供：训练数据（DDL/文档/sql）、删除、问答训练标记等功能，从这里开始炼丹～

训练部分可参考文章：
How to Train – Vanna.AI Documentation
https://juejin.cn/post/7457182240075612201

使用vanna AI 进行text2sql的业务炼丹体验

一、前置要求

二、使用虚拟环境开发

三、使用

发表回复取消回复

近期文章

近期评论

恰饭链接

分类

归档

友情链接

分析统计

utm

一、前置要求

二、使用虚拟环境开发

三、使用

发表回复 取消回复

近期文章

近期评论

恰饭链接

分类

归档

友情链接

分析统计

utm

发表回复取消回复