Filesystem Walker
Random file system navigator.
Classes:
| Name | Description |
|---|---|
| `FSWalker` | Abstract file system walker. |
| `FSEntry` | Lightweight wrapper for os.DirEntry with only path and name. |
| `FSPachinkoPin` | Represents a 'pin' on the Pachinko board. |
| `PachinkoFSWalker` | Simulates a Pachinko machine. |
FSWalker()
dataclass
Abstract file system walker.
FSEntry(path: str, stem: str, ext: str, size: int)
dataclass
Lightweight wrapper for os.DirEntry that keeps only the path, stem, extension, and size.
Methods:
| Name | Description |
|---|---|
| `from_direntry` | Create a lightweight FSEntry from an os.DirEntry. |
| `__hash__` | Return the hash based on the file path. |
| `__fspath__` | Return the file system path representation. |
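The three methods above can be illustrated with a minimal sketch. `EntrySketch` is a hypothetical stand-in, not the library's `FSEntry`; the use of `os.path.splitext` to derive `stem` and `ext`, the `frozen=True` flag, and reading the size from `DirEntry.stat()` are all assumptions made for illustration.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen is an assumption; it keeps the path-based hash stable
class EntrySketch:
    """Hypothetical stand-in for FSEntry with the same four fields."""

    path: str
    stem: str
    ext: str
    size: int

    @classmethod
    def from_direntry(cls, entry: os.DirEntry) -> "EntrySketch":
        # Copy plain values out of the DirEntry so it does not need to be kept around.
        stem, ext = os.path.splitext(entry.name)
        return cls(path=entry.path, stem=stem, ext=ext, size=entry.stat().st_size)

    def __hash__(self) -> int:
        # Hash on the path only, as the method table describes.
        return hash(self.path)

    def __fspath__(self) -> str:
        # Lets the object be passed to open(), os.stat(), and similar functions.
        return self.path


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "report.txt"), "w") as fh:
        fh.write("hello")
    with os.scandir(tmp) as it:
        entries = [EntrySketch.from_direntry(e) for e in it if e.is_file()]

first = entries[0]
print(first.stem, first.ext, first.size)  # report .txt 5
```

Because `__fspath__` is defined, `os.fspath(first)` returns the stored path, so the object can be used wherever a path-like value is accepted.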
FSPachinkoPin(path: str, subdirs: list[str] = list(), files: list[FSEntry] = list(), is_scanned: bool = False, is_exhausted: bool = False)
dataclass
Represents a 'pin' on the Pachinko board.
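The `list()` defaults shown in the signature are presumably how the documentation renders per-instance default factories; a dataclass rejects a literal mutable default such as a list. The sketch below shows the usual way to declare such fields. `PinSketch` is a hypothetical stand-in, and its `files` field holds plain strings instead of `FSEntry` objects to stay self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class PinSketch:
    """Hypothetical stand-in for FSPachinkoPin: one directory on the board."""

    path: str
    # default_factory gives every pin its own fresh list.
    subdirs: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)
    is_scanned: bool = False
    is_exhausted: bool = False


a = PinSketch("/data")
b = PinSketch("/data/raw")
a.subdirs.append("/data/raw")
print(b.subdirs)  # []
```

Each pin owns its lists, so recording a subdirectory on one pin does not leak into another.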
PachinkoFSWalker(root: str, quota: DiversityQuota, validator: FileValidator, rng: Random, should_follow_symlink: bool, board: dict[str, FSPachinkoPin] = dict())
dataclass
Bases: FSWalker
Simulates a Pachinko machine.
For every file needed, we 'drop' a search cursor from the Root. It bounces randomly down directory paths until it settles on a file.
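The drop described above can be sketched on an in-memory board, without touching the disk. Everything here is hypothetical: the `Pin` class, the free function `drop`, the 50% descend probability, and the choice to remove a file once it has been picked are assumptions, not the behavior of `PachinkoFSWalker`.

```python
from __future__ import annotations

import random
from dataclasses import dataclass, field


@dataclass
class Pin:
    """Hypothetical pin: the subdirectories and files known for one directory."""

    subdirs: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)


def drop(board: dict[str, Pin], root: str, rng: random.Random) -> str | None:
    """Bounce down from root until a file is picked; None if the ball dead-ends."""
    current = root
    while True:
        pin = board[current]
        # Descend when there is nothing to pick here, or on a coin flip (assumed 50%).
        if pin.subdirs and (not pin.files or rng.random() < 0.5):
            current = rng.choice(pin.subdirs)
            continue
        if pin.files:
            # Assumption: a picked file is removed so it is never returned twice.
            return pin.files.pop(rng.randrange(len(pin.files)))
        return None  # empty leaf directory


board = {
    "/": Pin(subdirs=["/a", "/b"], files=["/readme.md"]),
    "/a": Pin(files=["/a/1.txt", "/a/2.txt"]),
    "/b": Pin(),
}
rng = random.Random(0)
picks = [drop(board, "/", rng) for _ in range(20)]
found = [p for p in picks if p is not None]
print(sorted(found))
```

In this sketch a ball that lands in the empty `/b` directory returns `None` every time, which is the kind of wasted drop that exhaustion tracking is meant to avoid.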
Methods:
| Name | Description |
|---|---|
| `__post_init__` | Initialize the board with the root pin. |
| `reset` | Reset the walker and quota for a new batch. |
| `walk` | Continuously drop balls until the board is empty. |
| `drop` | Drop a ball from the root. |
| `get_valid_subdirs` | Get valid subdirectories for a given pin. |
| `mark_exhausted` | Mark a pin and all its subdirs as exhausted. |
| `should_descend` | Decide whether to descend into a subdir or select a file. |
| `scan` | Only look at the OS file system when a ball hits a specific folder for the first time. |
__post_init__() -> None
Initialize the board with the root pin.
reset() -> None
Reset the walker and quota for a new batch.
walk() -> Iterator[FSEntry]
Continuously drop balls until the board is empty.
drop() -> FSEntry | None
Drop a ball from the root.
get_valid_subdirs(pin: FSPachinkoPin) -> list[str]
Get valid subdirectories for a given pin.
mark_exhausted(pin: FSPachinkoPin) -> None
Mark a pin and all its subdirs as exhausted.
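One way exhaustion tracking and subdirectory filtering could fit together is sketched below. The `Pin` class and both free functions are hypothetical; in particular, treating a subdirectory that is not yet on the board as still valid is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Pin:
    """Hypothetical pin holding only the fields this sketch needs."""

    path: str
    subdirs: list[str] = field(default_factory=list)
    is_exhausted: bool = False


def get_valid_subdirs(board: dict[str, Pin], pin: Pin) -> list[str]:
    # A subdir stays valid until it is known to be exhausted; unvisited ones count as valid.
    return [s for s in pin.subdirs if s not in board or not board[s].is_exhausted]


def mark_exhausted(board: dict[str, Pin], pin: Pin) -> None:
    # Iterative walk so deep trees do not hit the recursion limit.
    stack = [pin]
    while stack:
        current = stack.pop()
        current.is_exhausted = True
        stack.extend(board[s] for s in current.subdirs if s in board)


board = {
    "/": Pin("/", subdirs=["/a", "/b"]),
    "/a": Pin("/a", subdirs=["/a/x"]),
    "/a/x": Pin("/a/x"),
}
mark_exhausted(board, board["/a"])
print(get_valid_subdirs(board, board["/"]))  # ['/b']
```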
should_descend(*, has_subdirs: bool, has_files: bool) -> bool
Decide whether to descend into a subdir or select a file.
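A decision rule with this keyword-only shape might look like the sketch below. The `rng` and `p_descend` parameters are hypothetical additions so the function can stand alone (the documented method takes only `has_subdirs` and `has_files`), and the rule itself is an assumption.

```python
import random


def should_descend(
    *,
    has_subdirs: bool,
    has_files: bool,
    rng: random.Random,
    p_descend: float = 0.5,  # assumed probability, not taken from the library
) -> bool:
    """Return True to bounce into a subdirectory, False to pick a file here."""
    if not has_subdirs:
        return False  # nowhere to descend to
    if not has_files:
        return True  # nothing to pick here, so descending is the only move
    return rng.random() < p_descend  # both are possible: flip a weighted coin


rng = random.Random(42)
print(should_descend(has_subdirs=True, has_files=False, rng=rng))  # True
print(should_descend(has_subdirs=False, has_files=True, rng=rng))  # False
```

Keyword-only arguments make call sites self-describing, since two bare booleans in a row would be easy to swap by mistake.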
scan(pin: FSPachinkoPin) -> None
Only look at the OS file system when a ball hits a specific folder for the first time.
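Lazy scanning of this kind can be sketched with `os.scandir` and an `is_scanned` flag. The `Pin` class, the free function `scan`, and the `follow_symlinks` parameter are hypothetical stand-ins; how the library applies `should_follow_symlink` or its validator is not shown here.

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class Pin:
    """Hypothetical pin that caches one directory listing."""

    path: str
    subdirs: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)
    is_scanned: bool = False


def scan(pin: Pin, follow_symlinks: bool = False) -> None:
    """Read the directory from disk once; later calls are no-ops."""
    if pin.is_scanned:
        return  # listing already cached on the pin
    with os.scandir(pin.path) as it:
        for entry in it:
            if entry.is_dir(follow_symlinks=follow_symlinks):
                pin.subdirs.append(entry.path)
            elif entry.is_file(follow_symlinks=follow_symlinks):
                pin.files.append(entry.path)
    pin.is_scanned = True


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "a.txt"), "w").close()
    pin = Pin(path=tmp)
    scan(pin)
    # A file created after the first scan is not seen: the listing is cached.
    open(os.path.join(tmp, "b.txt"), "w").close()
    scan(pin)
    names = sorted(os.path.basename(p) for p in pin.files)

print(names)  # ['a.txt']
```

The trade-off in this sketch is that the cached listing can go stale if the directory changes during a walk, in exchange for touching each directory at most once.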