Leo/relevance search#1126

Open
bfliao wants to merge 17 commits into main from leo/relevance-search

@bfliao bfliao commented Apr 3, 2026

Problem:

Catalog search currently ranks results purely by fuzzy match score. Users who repeatedly search for the same courses (e.g. their major requirements) have to scroll to the same results every session, even when they type the same search term. There is no personalization: classes a user has clicked on in the past are not prioritized in the returned results.

Scope
Goals:

  • Retain click history (how many times a user visits a particular class)
  • Re-rank search results after fuzzy search using that history (local or server-side)
  • Work for anonymous users (no login required)

Non-Goals:

  • Modifying the fuzzy search itself, which would break its abstraction
  • Collaborative filtering ("people who took X also took Y")
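
A minimal sketch of the re-ranking goal above, not the code in this PR. The `SearchHit` shape, the `rerankByClicks` helper, the click weight, and the log dampening are all assumptions made for illustration. It also assumes a higher score means a better match; Fuse.js reports lower scores for better matches, so a real integration would need to invert the score first.

```typescript
// Hypothetical shapes: the PR description does not show the real types.
interface SearchHit {
  courseId: string;
  fuzzyScore: number; // assumed: higher means a better match, in [0, 1]
}

type ClickCounts = Map<string, number>;

// Blend the fuzzy score with a dampened click count. The weight and the
// log dampening are illustrative choices, not values from the PR.
function rerankByClicks(
  hits: SearchHit[],
  clicks: ClickCounts,
  clickWeight = 0.3
): SearchHit[] {
  const boosted = (hit: SearchHit): number =>
    hit.fuzzyScore + clickWeight * Math.log1p(clicks.get(hit.courseId) ?? 0);
  // Copy before sorting so the fuzzy-search output is left untouched.
  return [...hits].sort((a, b) => boosted(b) - boosted(a));
}

// Usage: course "B" matched slightly worse but was clicked five times.
const exampleHits: SearchHit[] = [
  { courseId: "A", fuzzyScore: 0.9 },
  { courseId: "B", fuzzyScore: 0.8 },
];
const exampleReranked = rerankByClicks(exampleHits, new Map([["B", 5]]));
```

Because the helper only consumes the fuzzy-search output, it stays outside the fuzzy search itself, which matches the non-goal above. The click counts could come from local storage for anonymous users, though where they live is left open here.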

@bfliao bfliao requested a review from PineND April 3, 2026 20:50

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 55905fbd93

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread apps/frontend/src/app/Catalog/index.tsx Outdated

github-actions Bot commented Apr 7, 2026

Linting Failed

Note: The status check will always pass. Run npm run lint -- --continue to see the full output locally.

> lint
> turbo run lint --continue --output-logs=errors-only


Attention:
Turborepo now collects completely anonymous telemetry regarding usage.
This information is used to shape the Turborepo roadmap and prioritize features.
You can learn more, including how to opt-out if you'd not like to participate in this anonymous program, by visiting the following URL:
https://turborepo.com/docs/telemetry

• Packages in scope: @repo/BtLL, @repo/common, @repo/eslint-config, @repo/gql-typedefs, @repo/shared, @repo/sis-api, @repo/storybook, @repo/theme, @repo/typescript-config, ag-frontend, api-sandbox, backend, datapuller, frontend, staff-frontend
• Running lint in 15 packages
• Remote caching disabled
backend:lint
cache miss, executing ed9b309386c61683

> lint
> eslint src/


/home/runner/work/berkeleytime/berkeleytime/apps/backend/src/modules/semantic-search/client.ts
  46:13  error  Empty block statement  no-empty

✖ 1 problem (1 error, 0 warnings)

npm error Lifecycle script `lint` failed with error:
npm error code 1
npm error path /home/runner/work/berkeleytime/berkeleytime/apps/backend
npm error workspace backend
npm error location /home/runner/work/berkeleytime/berkeleytime/apps/backend
npm error command failed
npm error command sh -c eslint src/
[WARN] command finished with error, but continuing...
::error::backend#lint: command (/home/runner/work/berkeleytime/berkeleytime/apps/backend) /opt/hostedtoolcache/node/22.12.0/x64/bin/npm run lint exited (1)

 Tasks:    6 successful, 7 total
Cached:    0 cached, 7 total
  Time:    12.22s 
Failed:    backend#lint

 ERROR  run failed: command  exited (1)

PineND (Member) commented Apr 7, 2026

Moving this request to draft until more code is added

@PineND PineND marked this pull request as draft April 7, 2026 23:50
@bfliao bfliao marked this pull request as ready for review April 16, 2026 05:42

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b0c06bdac

Comment on lines +283 to +287
for (const click of missingClicks) {
  const item = catalogById.get(click.courseNumber);
  if (!item) continue;
  scoreMap.set(item, 0.6);
  items.push(item);

[P2] Deduplicate missing-click entries before appending

RecentType.CourseClick stores each click with a new timestamp, so recentClicks can contain the same courseNumber multiple times. This loop appends one catalog item per click without deduping (and without updating coveredIds), so a single course can be inserted repeatedly into catalogSearch.results and totalCount can be overstated when that course was missing from Fuse hits. Please dedupe by course id (or mark ids as covered while appending).
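
One way the suggested fix could look, as a sketch only. The variable names follow the quoted loop and the review text, but the `Click` and `CatalogItem` shapes and the `appendMissingClicks` wrapper are assumptions, since the surrounding code is not shown here.

```typescript
// Hypothetical shapes; the real types in the PR are not shown.
interface Click {
  courseNumber: string;
  timestamp: number;
}

interface CatalogItem {
  courseNumber: string;
  title: string;
}

// Append each missing clicked course at most once by marking its id as
// covered at the moment it is appended.
function appendMissingClicks(
  missingClicks: Click[],
  catalogById: Map<string, CatalogItem>,
  coveredIds: Set<string>,
  scoreMap: Map<CatalogItem, number>,
  items: CatalogItem[]
): void {
  for (const click of missingClicks) {
    // Skip courses already in the results or appended by an earlier click.
    if (coveredIds.has(click.courseNumber)) continue;
    const item = catalogById.get(click.courseNumber);
    if (!item) continue;
    coveredIds.add(click.courseNumber);
    scoreMap.set(item, 0.6); // same fixed score as the quoted loop
    items.push(item);
  }
}
```

With this shape, repeated clicks on one course add a single entry, so a count derived from `items.length` is no longer inflated.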


bfliao (Author) commented Apr 18, 2026

@PineND merge?

PineND (Member) commented Apr 23, 2026

The implementation blocker is that cache control for pagination is now private, so every time someone fetches the catalog via search we create a unique cached response. The hit rate basically goes to 0 and our system will get a lot more load. We would have to separate the query into two layers: a generic layer with a higher hit rate, then re-ranking afterward. That is very complex and I'm not sure it's worth it.
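
A rough sketch of the two-layer split described above, under stated assumptions. `createSharedSearch`, `personalize`, the `CatalogHit` shape, and the flat boost are hypothetical names and values, and the in-memory map merely stands in for whatever shared cache the real stack uses. The point is only that the cached layer is keyed by the query alone, and per-user click history is applied afterward.

```typescript
interface CatalogHit {
  courseId: string;
  score: number; // assumed: higher is better
}

// Layer 1: generic results cached by normalized query only, so every user
// who types the same term shares one cache entry.
function createSharedSearch(fetchGeneric: (query: string) => CatalogHit[]) {
  const cache = new Map<string, CatalogHit[]>();
  let misses = 0;
  return {
    search(query: string): CatalogHit[] {
      const key = query.trim().toLowerCase();
      let cached = cache.get(key);
      if (cached === undefined) {
        misses += 1;
        cached = fetchGeneric(key);
        cache.set(key, cached);
      }
      return cached;
    },
    missCount: (): number => misses,
  };
}

// Layer 2: per-user re-ranking applied after the shared lookup. Clicked
// courses get a flat boost; the 0.5 value is illustrative only.
function personalize(
  hits: CatalogHit[],
  clickedIds: Set<string>,
  boost = 0.5
): CatalogHit[] {
  const adjusted = (hit: CatalogHit): number =>
    hit.score + (clickedIds.has(hit.courseId) ? boost : 0);
  // Copy before sorting so the shared cached array is never mutated.
  return [...hits].sort((a, b) => adjusted(b) - adjusted(a));
}
```

In this arrangement two users typing the same term cause one generic fetch, and each then sees a different order. Pagination is not handled: re-ranking after the fact only reorders the hits already fetched, which is part of why the split is more complex than it looks.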
