Implementation:Iterative Dvc Testing Workspace Tests
| Knowledge Sources | |
|---|---|
| Domains | Testing, Workspace_Operations |
| Last Updated | 2026-02-10 10:00 GMT |
Overview
Reusable test suite classes for validating DVC workspace operations including imports, URL listing, and remote-targeted transfers. These classes are designed to be inherited by backend-specific test modules (e.g., S3, GCS, SSH) so that each storage backend can run the same set of workspace operation tests against its own fixtures. The module contains five primary test classes covering file/directory import, version-aware import, URL listing, URL-based get, and remote-targeted add/import workflows.
Source: dvc/testing/workspace_tests.py (401 lines)
Signature
class TestImport:
    def test_import(self, tmp_dir, dvc, workspace): ...
    def test_import_dir(self, tmp_dir, dvc, workspace, stage_md5, dir_md5): ...
    def test_import_empty_dir(self, tmp_dir, dvc, workspace, is_object_storage): ...

class TestImportURLVersionAware:
    def test_import_file(self, tmp_dir, dvc, remote_version_aware): ...
    def test_import_dir(self, tmp_dir, dvc, remote_version_aware): ...
    def test_import_no_download(self, tmp_dir, dvc, remote_version_aware, scm): ...

def match_files(fs, entries, expected): ...

class TestLsUrl:
    def test_file(self, cloud, fname): ...
    def test_dir(self, cloud): ...
    def test_recursive(self, cloud): ...
    def test_nonexistent(self, cloud): ...

class TestGetUrl:
    def test_get_file(self, cloud, tmp_dir): ...
    def test_get_dir(self, cloud, tmp_dir): ...
    def test_get_url_to_dir(self, cloud, tmp_dir, dname): ...
    def test_get_url_nonexistent(self, cloud): ...

class TestToRemote:
    def test_add_to_remote(self, tmp_dir, dvc, remote, workspace): ...
    def test_import_url_to_remote_file(self, tmp_dir, dvc, workspace, remote): ...
    def test_import_url_to_remote_dir(self, tmp_dir, dvc, workspace, remote): ...
Import
from dvc.testing.workspace_tests import TestImport, TestLsUrl, TestGetUrl, TestToRemote
Key Classes
TestImport
Tests basic import of files and directories from a workspace remote using dvc.imp_url("remote://workspace/..."). Includes three test methods:
| Method | Description |
|---|---|
| test_import | Imports a single file from the workspace remote and verifies content and clean status |
| test_import_dir | Imports a nested directory structure and verifies file contents, directory layout, and optional .dvc file content via stage_md5 / dir_md5 fixtures |
| test_import_empty_dir | Imports an empty directory, handling object storage backends (which use a trailing-slash empty file) vs. filesystem backends |
The class provides overridable fixtures (stage_md5, dir_md5, is_object_storage) that default to pytest.skip(), allowing backend-specific test modules to supply concrete values.
TestImportURLVersionAware
Tests version-aware imports that track version IDs (e.g., S3 object versioning). Covers file import, directory import, and no-download import modes. Key behaviors tested:
- Verifying can_push is False on version-aware outputs
- Detecting update availability via dvc.status() when the remote file changes
- Running dvc.update() and confirming the new version is fetched
- Checking that version_id changes across updates while def_path stays the same
- Testing no_download=True mode with subsequent dvc.pull() and Git tag-based checkout
TestLsUrl
Tests ls_url() for listing files and directories at external URLs. Uses the cloud fixture and the helper function match_files(fs, entries, expected) for assertion. Tests include:
- Parameterized file listing: Tests listing of files at paths "foo", "foo.dvc", and "dir/foo"
- Directory listing: Lists immediate children of a directory
- Recursive listing: Tests recursive listing with various maxdepth values (0, 1, 2, and unlimited)
- Nonexistent path: Verifies URLMissingError is raised
TestGetUrl
Tests Repo.get_url() for downloading files and directories from external URLs:
| Method | Description |
|---|---|
| test_get_file | Downloads a single file and verifies content |
| test_get_dir | Downloads a directory and verifies structure and content |
| test_get_url_to_dir | Parameterized test downloading into existing directories (".", "dir", "dir/subdir") |
| test_get_url_nonexistent | Verifies URLMissingError for nonexistent URLs |
TestToRemote
Tests to_remote=True workflows where data is transferred directly to a DVC remote without downloading locally:
| Method | Description |
|---|---|
| test_add_to_remote | Uses dvc.add(url, to_remote=True) to add a file directly to the remote cache; verifies the .dvc file is created but the local file does not exist, and the cached content matches |
| test_import_url_to_remote_file | Uses dvc.imp_url(url, to_remote=True) for a single file; verifies dependency tracking, hash info, and cached content |
| test_import_url_to_remote_dir | Uses dvc.imp_url(url, to_remote=True) for a directory; verifies the .dir manifest in the cache contains correct relpaths and each file part is stored correctly |
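The directory check can be pictured with an in-memory sketch. It assumes the .dir manifest is a JSON list of entries with md5 and relpath keys, and it uses a plain dict as a stand-in for the remote cache; build_dir_manifest, check_manifest, and object_store are hypothetical names for illustration, not DVC APIs.

```python
import hashlib
import json


def build_dir_manifest(files):
    """Build a manifest listing the md5 and relpath of each file.

    `files` maps a relative path to the file's bytes. The list-of-dicts
    layout with "md5" and "relpath" keys is an assumption of this sketch.
    """
    return [
        {"md5": hashlib.md5(data).hexdigest(), "relpath": relpath}
        for relpath, data in sorted(files.items())
    ]


def check_manifest(manifest_json, object_store, expected_relpaths):
    """Check that the manifest names exactly the expected relpaths and that
    every entry's hash resolves to an object with the matching content hash."""
    parts = json.loads(manifest_json)
    assert {part["relpath"] for part in parts} == set(expected_relpaths)
    for part in parts:
        data = object_store[part["md5"]]
        assert hashlib.md5(data).hexdigest() == part["md5"]
    return len(parts)


# Hypothetical directory contents and an in-memory "remote cache" keyed by md5.
files = {"foo": b"foo", "bar": b"bar", "baz": b"baz"}
manifest = build_dir_manifest(files)
object_store = {entry["md5"]: files[entry["relpath"]] for entry in manifest}

checked = check_manifest(json.dumps(manifest), object_store, files.keys())
```

The point of the two-step check is that a correct manifest alone is not enough: each listed hash must also resolve to a stored object, which mirrors the "each file part is stored correctly" condition in the table.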
Helper Function
def match_files(fs, entries, expected):
    """Assert that entries match expected by comparing normalized (path, isdir) tuples."""
    entries_content = {(fs.normpath(d["path"]), d["isdir"]) for d in entries}
    expected_content = {(fs.normpath(d["path"]), d["isdir"]) for d in expected}
    assert entries_content == expected_content
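As a quick illustration of the helper, the sketch below calls match_files with the standard-library posixpath module standing in for fs. This is an assumption made only because posixpath exposes a normpath function; the real tests pass a filesystem object. The entry dictionaries are made-up listing results, and listings_match is a hypothetical wrapper added for demonstration.

```python
import posixpath


def match_files(fs, entries, expected):
    """Assert that entries match expected by comparing normalized (path, isdir) tuples."""
    entries_content = {(fs.normpath(d["path"]), d["isdir"]) for d in entries}
    expected_content = {(fs.normpath(d["path"]), d["isdir"]) for d in expected}
    assert entries_content == expected_content


# Stand-in for a real filesystem object: only normpath() is needed here.
fs = posixpath

# Hypothetical listing result; order and redundant separators should not matter.
entries = [
    {"path": "dir//subdir/", "isdir": True},
    {"path": "dir/./foo", "isdir": False},
]
expected = [
    {"path": "dir/foo", "isdir": False},
    {"path": "dir/subdir", "isdir": True},
]

match_files(fs, entries, expected)  # passes silently


def listings_match(fs, entries, expected):
    """Return True/False instead of raising, for demonstration only."""
    try:
        match_files(fs, entries, expected)
    except AssertionError:
        return False
    return True
```

Comparing sets of normalized tuples makes the assertion insensitive to listing order and to cosmetic path differences, while still catching a file that is reported as a directory or vice versa.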
Dependencies
| Dependency | Usage |
|---|---|
| pytest | Test framework, fixtures, parametrize, skip |
| funcy.first | Retrieve first element from iterables (used in version-aware tests) |
| dvc.exceptions.URLMissingError | Expected exception for nonexistent URLs |
| dvc.repo.Repo | Repo.get_url() static method |
| dvc.repo.ls_url | ls_url() and parse_external_url() functions |
| dvc.utils.fs.remove | File removal utility |