Implementation:Openai Evals SnowflakeConnection
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Database Utilities |
| Last Updated | 2026-02-14 10:00 GMT |
Overview
Concrete database connection wrapper for Snowflake provided by the evals library.
Description
SnowflakeConnection is a utility class that simplifies connecting to and querying a Snowflake data warehouse. It provides lazy connection management, context-manager-based cursor handling, and a robust retry mechanism for transient database errors. The module also defines SnowflakeError, a custom exception raised when the initial connection fails.
Key behaviours:
- Lazy connection -- The actual
snowflake.connector.connect()call is deferred until the first operation that needs it. The private method_ensure_connected()checks whetherself.ctxis already set; if not, it establishes the connection. This avoids unnecessary connections when the object is instantiated but not immediately used. - Authentication modes -- Two authentication strategies are supported: (1) password-based authentication using explicit
user/passwordparameters or theSNOWFLAKE_USERNAME/SNOWFLAKE_PASSWORDenvironment variables; (2) browser-based SSO authentication (via the"externalbrowser"authenticator), which is automatically selected when neither a username nor password is available, falling back to the system$USERvariable. - Environment variable resolution -- Connection parameters (
account,user,password,database) are resolved by a priority chain: explicit constructor argument first, then the corresponding environment variable (SNOWFLAKE_ACCOUNT,SNOWFLAKE_USERNAME,SNOWFLAKE_PASSWORD,SNOWFLAKE_DATABASE). A private helper_first_not_none()implements this fallback logic. - Context manager -- The
__call__method is a@contextmanagerthat yields a database cursor and automatically closes it when the block exits, enabling thewith conn() as cs:pattern. - Query methods --
query()executes a single SQL statement (or batch viamany=True) and optionally returns results as a Pandas DataFrame (pandas_out=True) or a plain list (list_out=True).robust_query()wrapsquery()in a retry loop that catchesOperationalErrorandProgrammingError, sleeping 5 seconds between attempts, with an optionalmax_trialscap. - Logging suppression -- On connection, the Snowflake connector's verbose logging is suppressed to
WARNINGlevel, and theresult_batchlogger is set toCRITICALto hide a known benign error message about empty result sets.
Usage
Import SnowflakeConnection when you need to read from or write to a Snowflake warehouse within the evals framework (e.g. loading evaluation datasets, persisting results). The snowflake-connector-python package must be installed, and the appropriate environment variables must be set for the chosen authentication mode.
Code Reference
Source Location
- Repository: Openai_Evals
- File: evals/utils/snowflake.py
- Lines: 1-127
Signature
class SnowflakeError(Exception):
pass
class SnowflakeConnection:
def __init__(
self,
autocommit=True,
account: Optional[str] = None,
user: Optional[str] = None,
password: Optional[str] = None,
database: Optional[str] = None,
warehouse: Optional[str] = None,
paramstyle="pyformat",
):
def _ensure_connected(self):
def cursor(self, *args, **kwargs):
@contextmanager
def __call__(self, *args, **kwargs):
def query(self, *args, many=False, pandas_out=False, list_out=False, **kwargs):
def robust_query(self, max_trials: Optional[int] = None, *args, **kwargs):
Import
from evals.utils.snowflake import SnowflakeConnection, SnowflakeError
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| autocommit | bool |
No (default True) |
Whether to auto-commit each statement. Set to False for explicit transaction control.
|
| account | Optional[str] |
No | Snowflake account identifier. Falls back to SNOWFLAKE_ACCOUNT env var.
|
| user | Optional[str] |
No | Snowflake username. Falls back to SNOWFLAKE_USERNAME env var, then to system $USER (with browser auth).
|
| password | Optional[str] |
No | Snowflake password. Falls back to SNOWFLAKE_PASSWORD env var. If both user and password are None, browser-based SSO is used.
|
| database | Optional[str] |
No | Target database name. Falls back to SNOWFLAKE_DATABASE env var.
|
| warehouse | Optional[str] |
No (default None) |
Snowflake warehouse to use for compute. |
| paramstyle | str |
No (default "pyformat") |
Parameter binding style for SQL queries (e.g. "pyformat", "qmark").
|
Outputs
| Name | Type | Description |
|---|---|---|
| query result | None, pandas.DataFrame, or list |
query() returns None by default, a DataFrame if pandas_out=True, or a list of tuples if list_out=True.
|
| robust_query result | Same as query() |
Same as query(), but retries on transient OperationalError and ProgrammingError.
|
Usage Examples
from evals.utils.snowflake import SnowflakeConnection, SnowflakeError
# Basic connection using environment variables
conn = SnowflakeConnection(
warehouse="COMPUTE_WH",
)
# Execute a simple query, get results as a list
results = conn.query(
"SELECT id, name FROM my_table WHERE status = %(status)s",
{"status": "active"},
list_out=True,
)
print(results) # [(1, "Alice"), (2, "Bob"), ...]
# Get results as a Pandas DataFrame
df = conn.query(
"SELECT * FROM eval_results LIMIT 100",
pandas_out=True,
)
print(df.head())
# Use context manager for manual cursor control
with conn() as cursor:
cursor.execute("INSERT INTO logs (msg) VALUES (%s)", ("eval started",))
# Robust query with retry (up to 5 attempts)
conn.robust_query(
max_trials=5,
"INSERT INTO results (score) VALUES (%(score)s)",
{"score": 0.95},
)
# Explicit connection parameters (password auth)
conn = SnowflakeConnection(
account="my_org-my_account",
user="eval_service",
password="s3cret",
database="EVALS_DB",
warehouse="COMPUTE_WH",
)
# Handle connection errors
try:
conn._ensure_connected()
except SnowflakeError as e:
print(f"Connection failed: {e}")