fix(lfx): read local files under S3 storage to unblock the Assistant#13862
fix(lfx): read local files under S3 storage to unblock the Assistant#13862Cristhianzl wants to merge 1 commit into
Conversation
WalkthroughThe PR updates file-reading and file-size helpers to treat existing absolute local paths as local during S3 mode. It also adds regression tests for local-path reads, S3-key routing, and parse_text_file_to_data with a real local file. ChangesLocal files during S3 storage mode
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 9✅ Passed checks (9 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Test Coverage AdvisorNo source changes detected without accompanying tests. Thanks for keeping coverage up! 🎉
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## release-1.11.0 #13862 +/- ##
==================================================
- Coverage 59.65% 59.07% -0.58%
==================================================
Files 2346 2347 +1
Lines 224374 225111 +737
Branches 31392 34472 +3080
==================================================
- Hits 133841 132981 -860
- Misses 89000 90595 +1595
- Partials 1533 1535 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lfx/src/lfx/base/data/storage_utils.py`:
- Around line 27-42: The helper _is_existing_local_file is classifying absolute
paths too narrowly by requiring the file to already exist, which lets missing
absolute local paths fall through to parse_storage_path() and surface the wrong
error. Update _is_existing_local_file so any absolute path is treated as a local
file candidate, and let the later filesystem read/stat path in the storage
helpers surface FileNotFoundError instead of ValueError. Make the same
adjustment anywhere this helper is used so absolute paths are handled
consistently before S3 key parsing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1bcfd800-c931-45b4-84dd-32bf3e60c90b
📒 Files selected for processing (3)
src/lfx/src/lfx/base/data/storage_utils.pysrc/lfx/tests/unit/base/data/test_storage_utils.pysrc/lfx/tests/unit/base/data/test_utils.py
| def _is_existing_local_file(file_path: str) -> bool: | ||
| """Return True when file_path is an absolute path to a real local file. | ||
|
|
||
| Under S3 storage some callers still hand us genuine local paths (e.g. the | ||
| Langflow Assistant injects the installed lfx components dir into a Directory | ||
| node). A real local file is never an S3 key, so it must be read from disk | ||
| instead of being parsed as "flow_id/filename". See issue #13798. | ||
|
|
||
| The check is restricted to ABSOLUTE paths on purpose: S3 keys are always | ||
| relative ("flow_id/filename"), so requiring an absolute path makes it | ||
| impossible to mistake a relative S3 key for a same-named file that happens | ||
| to exist relative to the process CWD. | ||
| """ | ||
| try: | ||
| path_obj = Path(file_path) | ||
| return path_obj.is_absolute() and path_obj.is_file() |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Classify absolute local paths before checking existence.
Absolute local paths are never valid S3 keys. With the current helper, a missing path like /tmp/missing.txt still falls through to parse_storage_path() and raises ValueError instead of the documented FileNotFoundError. Treat any absolute path as local here and let the filesystem read/stat surface the real error.
Suggested fix
-def _is_existing_local_file(file_path: str) -> bool:
- """Return True when file_path is an absolute path to a real local file.
+def _is_absolute_local_path(file_path: str) -> bool:
+ """Return True when file_path is an absolute local filesystem path.
@@
"""
try:
- path_obj = Path(file_path)
- return path_obj.is_absolute() and path_obj.is_file()
+ return Path(file_path).is_absolute()
except OSError:
return False
@@
- if _is_existing_local_file(file_path):
+ if _is_absolute_local_path(file_path):
return Path(file_path).read_bytes()
@@
- if _is_existing_local_file(file_path):
+ if _is_absolute_local_path(file_path):
return Path(file_path).stat().st_sizeAlso applies to: 91-93, 180-182
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lfx/src/lfx/base/data/storage_utils.py` around lines 27 - 42, The helper
_is_existing_local_file is classifying absolute paths too narrowly by requiring
the file to already exist, which lets missing absolute local paths fall through
to parse_storage_path() and surface the wrong error. Update
_is_existing_local_file so any absolute path is treated as a local file
candidate, and let the later filesystem read/stat path in the storage helpers
surface FileNotFoundError instead of ValueError. Make the same adjustment
anywhere this helper is used so absolute paths are handled consistently before
S3 key parsing.
|
Nice fix! Checking for existing local files before treating paths as S3 keys prevents the Assistant's local component paths from being parsed as storage keys. The absolute-path restriction is a smart guard against false positives. |
What
When
LANGFLOW_STORAGE_TYPE=s3is configured, the storage-aware file readers (read_file_bytes,read_file_text, andget_file_size) incorrectly routed every path through the S3 key parser, including genuine local filesystem paths.As a result, attempting to read an absolute local path raised:
This broke the Langflow Assistant. It injects the installed
lfxcomponents directory into aDirectorynode (inject_lfx_components_path), which scans the local filesystem and passes absolute file paths toparse_text_file_to_data(). Since those paths were incorrectly treated as S3 object keys, the read failed with:Fixes #13798.
Summary by CodeRabbit