adding docs for the llm chat interface #552
Open
mohalkh5 wants to merge 1 commit into
monaghaa
requested changes
Jun 26, 2026
monaghaa
left a comment
Contributor
This looks like more than it is. To shorten the launch instructions into more of a "Quick start", I rearranged some material in that section and moved several of the configuration-related steps into a drop-down.
| The **LLM Chat Interface** is an interactive Open OnDemand application that provides a browser-based chat experience powered by [Chainlit](https://docs.chainlit.io) and [Ollama](https://ollama.com). When you launch a session, the application starts an Ollama server on a GPU-equipped compute node and connects it to CURC-hosted large language models (LLMs). You can ask questions, draft and debug code, summarize documents, and analyze images (with vision-capable models), all from your web browser. | ||
| ```{important} | ||
| The LLM Chat Interface is currently offered as a beta service. Functionality, available models, and resource allocations may change as we gather feedback and refine the service. Please report issues or suggestions through the [CURC support form](https://colorado.service-now.com/req_portal?id=ucb_sc_rc_form). |
Contributor
Suggested change
| The LLM Chat Interface is currently offered as a beta service. Functionality, available models, and resource allocations may change as we gather feedback and refine the service. Please report issues or suggestions through the [CURC support form](https://colorado.service-now.com/req_portal?id=ucb_sc_rc_form). | |
| The LLM Chat Interface is currently offered as a beta service. Functionality, available models, and resource allocations may change as we gather feedback and refine the service. Please report issues or make suggestions through the [CURC support form](https://colorado.service-now.com/req_portal?id=ucb_sc_rc_form). |
| 2. Navigate to either the **Interactive Apps** drop-down menu or the **My Interactive Sessions** tab and select **LLM Chat Interface**. | ||
| 3. Review the launch form fields: | ||
| - Ollama model path — Select which model library Ollama should load. The default **CURC LLM Models** uses CURC-hosted models. You may also provide the **absolute path** to your own Ollama model directory, if you have pulled or fine-tuned models there. See [Ollama documentation](../ai-ml/llms.md#ollama) for more details. |
Contributor
Suggested change
| - Ollama model path — Select which model library Ollama should load. The default **CURC LLM Models** uses CURC-hosted models. You may also provide the **absolute path** to your own Ollama model directory, if you have pulled or fine-tuned models there. See [Ollama documentation](../ai-ml/llms.md#ollama) for more details. | |
| - Ollama model path — Select which model library Ollama should load. The default **CURC LLM Models** uses CURC-hosted models. You may also provide the **absolute path** to your own Ollama model directory, if you have downloaded or fine-tuned models there. See [Ollama documentation](../ai-ml/llms.md#ollama) for more details. |
| 4. If you selected **Preset configuration**, choose **10 cores, 1 GPU, 1 hour**. This submits your job to Alpine with one GPU, which is required to run the LLM backend. | ||
| ```{important} | ||
| When you launch with **Preset configuration**, your job is submitted to Alpine **testing** hardware (`atesting_a100` partition, `testing` QoS). Testing resources are shared and **limited in capacity**, so sessions may queue during high demand, run for a maximum of **one hour**, and are not intended for sustained or large-scale work. For longer runtimes or heavier workloads, use **Custom configuration** and request appropriate resources. |
Contributor
Suggested change
| When you launch with **Preset configuration**, your job is submitted to Alpine **testing** hardware (`atesting_a100` partition, `testing` QoS). Testing resources are shared and **limited in capacity**, so sessions may queue during high demand, run for a maximum of **one hour**, and are not intended for sustained or large-scale work. For longer runtimes or heavier workloads, use **Custom configuration** and request appropriate resources. |
| @@ -0,0 +1,164 @@ | |||
| # LLM Chat Interface | |||
| The **LLM Chat Interface** is an interactive Open OnDemand application that provides a browser-based chat experience powered by [Chainlit](https://docs.chainlit.io) and [Ollama](https://ollama.com). When you launch a session, the application starts an Ollama server on a GPU-equipped compute node and connects it to CURC-hosted large language models (LLMs). You can ask questions, draft and debug code, summarize documents, and analyze images (with vision-capable models), all from your web browser. | |||
Contributor
Suggested change
| The **LLM Chat Interface** is an interactive Open OnDemand application that provides a browser-based chat experience powered by [Chainlit](https://docs.chainlit.io) and [Ollama](https://ollama.com). When you launch a session, the application starts an Ollama server on a GPU-equipped compute node and connects it to CURC-hosted large language models (LLMs). You can ask questions, draft and debug code, summarize documents, and analyze images (with vision-capable models), all from your web browser. | |
| The **LLM Chat Interface** is an interactive Open OnDemand application that provides a browser-based chat experience to support a variety of [use cases](#use-cases). It is powered by [Chainlit](https://docs.chainlit.io) and [Ollama](https://ollama.com). When you launch a session, the application starts an Ollama server on a GPU-equipped compute node and connects it to CURC-hosted large language models (LLMs). You can ask questions, draft and debug code, summarize documents, and analyze images (with vision-capable models), all from your web browser. |
| - Ollama model path — Select which model library Ollama should load. The default **CURC LLM Models** uses CURC-hosted models. You may also provide the **absolute path** to your own Ollama model directory, if you have pulled or fine-tuned models there. See [Ollama documentation](../ai-ml/llms.md#ollama) for more details. | ||
| - Configuration type — Choose Preset configuration (recommended for most users) or Custom configuration for advanced resource control. For details on these options, see [Configuring Open OnDemand interactive applications](./configuring_apps.md). | ||
| 4. If you selected **Preset configuration**, choose **10 cores, 1 GPU, 1 hour**. This submits your job to Alpine with one GPU, which is required to run the LLM backend. |
Contributor
Suggested change
| 4. If you selected **Preset configuration**, choose **10 cores, 1 GPU, 1 hour**. This submits your job to Alpine with one GPU, which is required to run the LLM backend. | |
| - If you selected **Preset configuration**, choose **10 cores, 1 GPU, 1 hour**. This submits your job to Alpine with one GPU, which is required to run the LLM backend. |
| 5. If you selected **Custom configuration**, you **must** request at least one GPU in the **gres** field (for example, `gpu:1`). See the [Limitations](#limitations) section for guidance on GPU memory (VRAM) and model size. | ||
| 6. Click **Launch** and wait for your session to start. When the job is ready, click **Connect to LLM Chat Interface** to open the chat in a new browser tab. | ||
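
The GPU requirement in step 5 can be sanity-checked before submitting the form. The following is a minimal sketch, not part of the application: the helper names `gpu_count` and `has_required_gpu` are hypothetical, and it assumes the **gres** field follows the usual Slurm `name[:type]:count` form, as in the `gpu:1` example above.

```python
def gpu_count(gres: str) -> int:
    """Return the number of GPUs requested by a Slurm-style gres string.

    Hypothetical helper for illustration; assumes entries look like
    "gpu", "gpu:1", or "gpu:a100:2", optionally comma-separated.
    """
    total = 0
    for item in gres.split(","):
        parts = item.strip().split(":")
        if parts[0] != "gpu":
            continue  # ignore empty entries and non-GPU generic resources
        # A trailing integer is the count; "gpu" or "gpu:a100" implies one GPU.
        total += int(parts[-1]) if parts[-1].isdigit() else 1
    return total


def has_required_gpu(gres: str) -> bool:
    """True when the request includes at least one GPU."""
    return gpu_count(gres) >= 1
```

For example, `has_required_gpu("gpu:1")` is true, while an empty **gres** field fails the check.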
Contributor
Suggested change
| ```{important} |
| 5. If you selected **Custom configuration**, you **must** request at least one GPU in the **gres** field (for example, `gpu:1`). See the [Limitations](#limitations) section for guidance on GPU memory (VRAM) and model size. | ||
| 6. Click **Launch** and wait for your session to start. When the job is ready, click **Connect to LLM Chat Interface** to open the chat in a new browser tab. | ||
Contributor
Suggested change
| When you launch with **Preset configuration**, your job is submitted to Alpine **testing** hardware (`atesting_a100` partition, `testing` QoS). Testing resources are shared and **limited in capacity**, so sessions may queue during high demand. Preset sessions run for a maximum of **one hour**, and are not intended for sustained or large-scale work. For longer runtimes or heavier workloads, use **Custom configuration** and request appropriate resources (jobs will be subject to queue waits and may not start immediately). | |
| 4. If you selected **Preset configuration**, choose **10 cores, 1 GPU, 1 hour**. This submits your job to Alpine with one GPU, which is required to run the LLM backend. | ||
| ```{important} |
Contributor
Suggested change
| ```{important} |
| ```{important} | ||
| When you launch with **Preset configuration**, your job is submitted to Alpine **testing** hardware (`atesting_a100` partition, `testing` QoS). Testing resources are shared and **limited in capacity**, so sessions may queue during high demand, run for a maximum of **one hour**, and are not intended for sustained or large-scale work. For longer runtimes or heavier workloads, use **Custom configuration** and request appropriate resources. | ||
| ``` |
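
To make the preset in the note above concrete, it can be pictured as a set of Slurm options. This is only a sketch under stated assumptions: the `PresetRequest` class is hypothetical, the PR does not show how the Open OnDemand app actually builds its job submission, and mapping "10 cores" to `--ntasks` is a guess (the app may use a different option).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresetRequest:
    """Hypothetical model of the 'Preset configuration' described above."""

    partition: str = "atesting_a100"  # partition named in the note
    qos: str = "testing"              # QoS named in the note
    cores: int = 10
    gpus: int = 1
    hours: int = 1

    def to_sbatch_flags(self) -> list[str]:
        """Render the request as standard sbatch command-line options."""
        return [
            f"--partition={self.partition}",
            f"--qos={self.qos}",
            f"--ntasks={self.cores}",  # assumption: cores map to tasks
            f"--gres=gpu:{self.gpus}",
            f"--time={self.hours:02d}:00:00",
        ]
```

Calling `PresetRequest().to_sbatch_flags()` yields the option list for the preset values. A **Custom configuration** would correspond to different field values, which is why it can request longer runtimes.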
| 1. Log in to [Open OnDemand](https://curc.readthedocs.io/en/latest/open_ondemand/index.html) using your CURC credentials. | ||
| 2. Navigate to either the **Interactive Apps** drop-down menu or the **My Interactive Sessions** tab and select **LLM Chat Interface**. | ||
| 3. Review the launch form fields: | ||
Contributor
Suggested change
| ::::{dropdown} Additional guidance for launch form fields | |
| :icon: note |
No description provided.