feat(python-notebook-migration): add LLM client for notebook-to-workflow conversion #5260

Open
zyratlo wants to merge 6 commits into apache:main from zyratlo:migration-tool-llm-client

Conversation

@zyratlo zyratlo commented May 28, 2026

Contributor

What changes were proposed in this PR?

Introduces the frontend LLM session class that converts a Jupyter notebook into a Texera workflow JSON plus a bidirectional cell-to-operator mapping, along with the prompt library it uses. The change adds two files under frontend/src/app/workspace/service/notebook-migration/, totalling ~700 lines, of which ~410 are prompt text.

migration-llm.ts: defines NotebookMigrationLLM, an @Injectable class that wraps a Vercel AI SDK chat session against the LiteLLM proxy already exposed on main at /api/chat/completion.

  • initialize(modelType, apiKey): builds an OpenAI-compatible chat client via createOpenAI({ baseURL: AppSettings.getApiEndpoint() }) and seeds the message history with Texera documentation as system messages.
  • verifyConnection(): sends a 10-token ping request to validate that the API key works against the configured model.
  • convertNotebookToWorkflow(notebook): extracts the code cells (each tagged with a UUID in metadata.uuid), sends WORKFLOW_PROMPT plus the notebook to obtain a JSON description of UDF operators and edges, then sends MAPPING_PROMPT to obtain the cell↔operator mapping. It assembles a complete Texera workflow JSON (PythonUDFV2 operators with stub input/output ports, links derived from the LLM's edge list, default settings) plus a bidirectional operator_to_cell / cell_to_operator mapping, and returns both as a JSON string.
  • close(): clears the message history and the model reference.
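
For reviewers, the cell extraction and the bidirectional mapping described above can be pictured roughly as follows. This is a minimal sketch, not the code in migration-llm.ts: the names NotebookCell, CodeCell, extractCodeCells, and invertMapping are hypothetical, and it assumes the mapping is one-to-many (an operator may cover several cells). Only cell_type, source, and metadata.uuid come from the notebook format and the description above.

```typescript
// Hypothetical shapes for illustration; only `cell_type`, `source`, and
// `metadata.uuid` are taken from the notebook format and the PR description.
interface NotebookCell {
  cell_type: string;
  source: string | string[];
  metadata?: { uuid?: string };
}

interface CodeCell {
  uuid: string;
  source: string;
}

// Keep only code cells that carry a UUID. Jupyter stores `source` either as a
// single string or as a list of line strings, so both forms are normalized.
function extractCodeCells(cells: NotebookCell[]): CodeCell[] {
  const result: CodeCell[] = [];
  for (const cell of cells) {
    const uuid = cell.metadata?.uuid;
    if (cell.cell_type !== "code" || uuid === undefined) {
      continue;
    }
    const source = Array.isArray(cell.source) ? cell.source.join("") : cell.source;
    result.push({ uuid, source });
  }
  return result;
}

// Assumed mapping shape: an ID on one side maps to a list of IDs on the other.
type Mapping = Record<string, string[]>;

// Derive cell_to_operator by inverting operator_to_cell, so the two directions
// are consistent by construction.
function invertMapping(operatorToCell: Mapping): Mapping {
  const cellToOperator: Mapping = {};
  for (const operatorId of Object.keys(operatorToCell)) {
    for (const cellId of operatorToCell[operatorId]) {
      if (cellToOperator[cellId] === undefined) {
        cellToOperator[cellId] = [];
      }
      cellToOperator[cellId].push(operatorId);
    }
  }
  return cellToOperator;
}
```

Deriving one direction from the other, rather than asking the LLM for both, is one way to keep operator_to_cell and cell_to_operator from disagreeing; whether the actual implementation does this is not stated here.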

migration-prompts.ts: holds the string constants used by migration-llm.ts, namely TEXERA_OVERVIEW, TUPLE_DOCUMENTATION, TABLE_DOCUMENTATION, OPERATOR_DOCUMENTATION, UDF_INPUT_PORT_DOCUMENTATION, EXAMPLE_OF_GOOD_CONVERSION, VISUALIZER_DOCUMENTATION, EXAMPLE_OF_MULTIPLE_UDF_CONVERSION, WORKFLOW_PROMPT, MAPPING_PROMPT.
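
To illustrate how these constants feed the session, here is a rough sketch of seeding a message history with documentation as system messages and then appending a task prompt. The constant values are placeholders, and ChatMessage, seedHistory, and appendUserPrompt are hypothetical names; the actual history handling in migration-llm.ts may differ.

```typescript
// Placeholder stand-ins; the real text lives in migration-prompts.ts.
const TEXERA_OVERVIEW = "<overview text>";
const TUPLE_DOCUMENTATION = "<tuple documentation>";
const OPERATOR_DOCUMENTATION = "<operator documentation>";
const WORKFLOW_PROMPT = "<workflow prompt>";

// Hypothetical message shape, modeled on the common chat-completion format.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Turn each documentation constant into one system message, preserving order.
function seedHistory(docs: string[]): ChatMessage[] {
  return docs.map((content): ChatMessage => ({ role: "system", content }));
}

// Return a new history with the task prompt and its payload appended as a
// single user message; the input history is left untouched.
function appendUserPrompt(history: ChatMessage[], prompt: string, payload: string): ChatMessage[] {
  const message: ChatMessage = { role: "user", content: `${prompt}\n\n${payload}` };
  return [...history, message];
}

const seeded = seedHistory([TEXERA_OVERVIEW, TUPLE_DOCUMENTATION, OPERATOR_DOCUMENTATION]);
const withTask = appendUserPrompt(seeded, WORKFLOW_PROMPT, JSON.stringify({ cells: [] }));
```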

Any related issues, documentation, discussions?

Closes #5259
Parent issue #4301

How was this PR tested?

No unit tests are included, for these reasons:

  • A large portion of the change is prompt text, which can be reviewed but not meaningfully unit tested. The prompt text can still be revised later to improve the LLM's output.
  • The logic in migration-llm.ts mainly parses an LLM response, so testing it would require mocking a significant amount of logic that will only be introduced in later PRs.

However, I am open to writing tests based on review feedback.
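
If tests are requested, one possible shape is to isolate the response parsing into a pure function and feed it canned strings instead of a live LLM. The sketch below is an assumption about how that could look, not code from this PR: LlmWorkflow, its field names (id, code, source, target), and parseWorkflowResponse are hypothetical, and the real response schema is whatever WORKFLOW_PROMPT specifies.

```typescript
// Hypothetical response shape; the real schema is defined by WORKFLOW_PROMPT.
interface LlmWorkflow {
  operators: { id: string; code: string }[];
  edges: { source: string; target: string }[];
}

// Parse and validate a raw LLM reply. Throws on malformed JSON, on missing
// operator/edge lists, and on edges that reference unknown operators.
function parseWorkflowResponse(raw: string): LlmWorkflow {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (error) {
    throw new Error("LLM response is not valid JSON");
  }
  const candidate = parsed as Partial<LlmWorkflow> | null;
  if (!candidate || !Array.isArray(candidate.operators) || !Array.isArray(candidate.edges)) {
    throw new Error("LLM response is missing operators or edges");
  }
  const ids = candidate.operators.map(operator => operator.id);
  for (const edge of candidate.edges) {
    if (ids.indexOf(edge.source) === -1 || ids.indexOf(edge.target) === -1) {
      throw new Error(`Edge references unknown operator: ${edge.source} -> ${edge.target}`);
    }
  }
  return { operators: candidate.operators, edges: candidate.edges };
}
```

A test would then call parseWorkflowResponse with a valid canned reply, with non-JSON text, and with an edge pointing at a missing operator, and assert on the result or the thrown error, without needing the chat client at all.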

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.7)

@github-actions github-actions Bot added the frontend Changes related to the frontend GUI label May 28, 2026

codecov-commenter commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.93%. Comparing base (891d2ad) to head (b900b9d).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5260      +/-   ##
============================================
- Coverage     52.95%   52.93%   -0.03%     
- Complexity     2627     2630       +3     
============================================
  Files          1090     1090              
  Lines         42210    42210              
  Branches       4534     4534              
============================================
- Hits          22353    22344       -9     
- Misses        18546    18556      +10     
+ Partials       1311     1310       -1     
Flag                             Coverage Δ              Carryforward flag*
access-control-service           70.91% <ø> (ø)          Carried forward from a090029
agent-service                    34.36% <ø> (ø)          Carried forward from a090029
amber                            53.15% <ø> (+0.03%) ⬆️   Carried forward from a090029
computing-unit-managing-service   1.65% <ø> (ø)          Carried forward from a090029
config-service                   56.71% <ø> (ø)          Carried forward from a090029
file-service                     57.06% <ø> (ø)          Carried forward from a090029
frontend                         47.86% <ø> (-0.07%) ⬇️
pyamber                          89.77% <ø> (ø)          Carried forward from a090029
python                           90.73% <ø> (ø)          Carried forward from a090029
workflow-compiling-service       58.69% <ø> (ø)          Carried forward from a090029

*This pull request uses carry forward flags.

@Yicong-Huang Yicong-Huang changed the title from "feat(python-notebook-migration, frontend): add LLM client for notebook-to-workflow conversion" to "feat(python-notebook-migration): add LLM client for notebook-to-workflow conversion" May 28, 2026

Labels

frontend Changes related to the frontend GUI

Development

Successfully merging this pull request may close these issues.

[Notebook Migration] Add LLM client for notebook-to-workflow conversion

2 participants