CiteSource 0.2.0 (CRAN Submission Ready) by TNRiley · Pull Request #256 · ESHackathon/CiteSource

TNRiley · 2026-05-15T17:44:37Z

No description provided.

update with current main

- Add expand_metadata_columns() for expanding multiple columns efficiently
- Add expand_single_metadata_column() for single column operations
- Standardizes separator pattern to ',\\s*' for consistent handling
- Reduces code duplication and improves maintainability
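The helpers themselves are not shown in this description. As a rough illustration of what a single-column expansion with the ',\\s*' separator might look like, here is a minimal sketch built on tidyr::separate_rows(). The function name `expand_single_sketch` and the sample data are hypothetical and are not the package's implementation.

```r
library(tidyr)

# Hypothetical stand-in for the internal helper described above.
expand_single_sketch <- function(data, col, sep = ",\\s*") {
  # Split the column on a comma followed by optional whitespace,
  # producing one row per element.
  out <- separate_rows(data, all_of(col), sep = sep)
  # Trim any remaining whitespace and drop empty entries.
  out[[col]] <- trimws(out[[col]])
  out[out[[col]] != "", , drop = FALSE]
}

citations <- data.frame(
  duplicate_id = c("1", "2"),
  cite_source  = c("wos, scopus", "pubmed,embase"),
  stringsAsFactors = FALSE
)

expanded <- expand_single_sketch(citations, "cite_source")
print(expanded)
```

Because the pattern absorbs the whitespace after each comma, "wos, scopus" and "pubmed,embase" split the same way without a separate trimming pass in the caller.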

- Replace base R strsplit/table approach with tidyr-based helper
- Standardizes separator pattern to ',\\s*' for consistent handling
- More readable and maintainable code
- Output format remains compatible with existing code

- Replace three separate separate_rows() calls with single expand_metadata_columns() call
- More efficient single-pass expansion of all metadata columns
- Standardizes separator pattern to ',\\s*'
- Maintains same functionality and output format

- Replace three separate if blocks with a loop using column mapping
- Use expand_single_metadata_column() instead of separate_rows()
- Standardizes separator pattern to ',\\s*'
- More maintainable and efficient code
- Maintains all existing functionality including label warnings

- Update calculate_initial_records() to use expand_single_metadata_column()
- Update calculate_detailed_records() to use expand_single_metadata_column()
- Update calculate_phase_records() to use expand_single_metadata_column() and expand_metadata_columns()
- Standardizes separator pattern to ',\\s*' throughout
- More efficient and maintainable code

- Replace two separate_rows() calls with single expand_metadata_columns() call
- Standardizes separator pattern to ',\\s*'
- More efficient single-pass expansion
- Removes redundant str_trim() calls (handled by helper function)

- Replace separate_rows() with expand_single_metadata_column()
- Use proper column reference instead of positional index
- Standardizes separator pattern to ',\\s*'
- More maintainable and consistent code

- Fix incorrect calculation: use nrow(rv) instead of nrow(rv)
- rv is expanded metadata (count_unique output) which inflates the count
- rv is the actual unique citations after deduplication
- Add calculation of duplicates removed (n_citations - n_unique_records)
- Improve message formatting with:
  - Clear line breaks for readability
  - Number formatting with commas (e.g., 9,175 instead of 9175)
  - More informative structure showing totals, unique, and duplicates removed
- Apply fixes to both app.R and app2.R
- Resolves issue where message showed more unique citations than total citations
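As an illustration of the message formatting described above, here is a minimal base R sketch. The counts and the wording of the message are hypothetical; only the `n_citations - n_unique_records` calculation and the comma formatting come from the commit message.

```r
# Hypothetical counts; in the app these would come from the imported
# citations and the deduplicated citation table.
n_citations      <- 9175
n_unique_records <- 6020
n_duplicates     <- n_citations - n_unique_records

# Format numbers with thousands separators, e.g. 9175 -> "9,175".
fmt <- function(x) format(x, big.mark = ",", trim = TRUE)

msg <- paste0(
  "Total citations: ", fmt(n_citations), "\n",
  "Unique citations: ", fmt(n_unique_records), "\n",
  "Duplicates removed: ", fmt(n_duplicates)
)
cat(msg, "\n")
```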

- Replace separate_rows() calls in unique_separated_phase() with expand_metadata_columns()
- Replace separate_rows() + mutate(trimws) + filter in detailed_table_data() with expand_single_metadata_column()
- Standardizes separator pattern to ',\\s*' throughout
- More efficient and consistent with R package functions

- Replace separate_rows() calls in unique_separated_phase() with expand_metadata_columns()
- Replace separate_rows() + mutate(trimws) + filter in detailed_table_data() with expand_single_metadata_column()
- Replace separate_rows() in completeness_data() with expand_single_metadata_column()
- Standardizes separator pattern to ',\\s*' throughout
- More efficient and consistent with R package functions

- Update all calls to expand_metadata_columns() and expand_single_metadata_column() to use CiteSource::: prefix
- This allows Shiny apps to access internal (non-exported) helper functions
- Fixes error: 'could not find function expand_single_metadata_column'
- Helper functions remain internal (not exported) as per best practices
- Apply changes to both app.R and app2.R

- Exclude 'unknown' cite_source values after expanding metadata columns
- Records with 'screened' or 'final' labels intentionally have empty cite_source
- These get converted to 'unknown' during deduplication with show_unknown_tags=TRUE
- Including 'unknown' in detailed record table causes:
  * Misleading source row that isn't actually a search source
  * Incorrect counts (Records Imported, Distinct Records)
  * Skewed percentage calculations (Source Contribution %, etc.)
  * Affected Total row calculations
- Filtering 'unknown' ensures table only analyzes actual search sources
- Consistent with calculate_phase_records() which already filters 'unknown'
- Does not affect other tables/visuals (precision/sensitivity table uses different function)
- Apply fix to both app.R and app2.R
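A minimal sketch of the filtering step described above, assuming a table with a comma-separated `cite_source` column. The sample data and the per-source count are illustrative only and do not reproduce the package's detailed record table.

```r
library(dplyr)
library(tidyr)

# Hypothetical deduplicated records; 'unknown' stands in for records that
# carried no search source (e.g. screened or final labels).
unique_citations <- data.frame(
  duplicate_id = c("1", "2", "3"),
  cite_source  = c("wos, scopus", "unknown", "pubmed, unknown"),
  stringsAsFactors = FALSE
)

source_counts <- unique_citations |>
  separate_rows(cite_source, sep = ",\\s*") |>  # one row per source
  filter(cite_source != "unknown") |>           # keep real search sources only
  count(cite_source, name = "records")

print(source_counts)
```

Filtering after the expansion matters here: record 3 keeps its 'pubmed' row while its 'unknown' row is dropped, which a filter on the unexpanded column would not achieve.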

- Added interactive "Use This" buttons to Card View for granular selection of Title, Author, Abstract, and Journal.
- Implemented custom JavaScript to handle cross-column button toggling and provide immediate visual feedback.
- Added server-side listener `input$field_preference_click` to store user preferences.
- Implemented `apply_field_preferences` logic to overwrite surviving records with user selections in the final merged dataset.
- Updated "Default" badge logic to correctly handle cases where one value is missing.
- Fixed `record_id` vs `duplicate_id` column resolution during merge.
- Resolved "unknown column" warnings by properly initializing `field_preferences` before JSON serialization.

- Add parameters use_custom, blocking_rounds, validation_criteria to dedup_citations().
- When use_custom=TRUE, call internal dedup_citations_custom() with configurable blocking rounds and validation criteria; fall back to default ASySD on error or if custom implementation is not loaded.
- New R/dedup_custom.R: custom ASySD wrapper with default blocking rounds and validation criteria matching ASySD behaviour, plus optional stats on which criteria identified duplicate pairs.

Shiny app (inst/shiny-app/CiteSource/app2.R):
- Wire in custom deduplication option so users can run configurable dedup from the app (e.g. UI for use_custom and related options).

Documentation:
- Add AUTHOR_HANDLING.md: end-to-end documentation of author name handling, including expected format (e.g. "Last, First and Last, First"), cleaning rules, and behaviour in import, cleaning, deduplication, citation generation, export, and data conversion.
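The use_custom fallback in dedup_citations() described above could be structured roughly as follows. This is only a sketch: `dedup_default()`, `dedup_custom()` and `dedup_sketch()` are hypothetical stand-ins, and the real functions operate on citation data frames rather than character vectors.

```r
# Hypothetical stand-ins for the default and custom deduplication engines.
dedup_default <- function(records) unique(records)
dedup_custom <- function(records, blocking_rounds) {
  if (length(blocking_rounds) == 0) stop("no blocking rounds configured")
  unique(records)
}

dedup_sketch <- function(records, use_custom = FALSE, blocking_rounds = list()) {
  if (!use_custom) {
    return(dedup_default(records))
  }
  # Try the custom path first; on any error, warn and use the default.
  tryCatch(
    dedup_custom(records, blocking_rounds),
    error = function(e) {
      warning("Custom deduplication failed, using default: ",
              conditionMessage(e), call. = FALSE)
      dedup_default(records)
    }
  )
}

# The custom path errors here (no blocking rounds), so the default runs.
res <- suppressWarnings(dedup_sketch(c("a", "a", "b"), use_custom = TRUE))
print(res)
```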

…nette revisions
- Shiny app.R: full UI overhaul with bslib cards, workflow stepper, export hub, bidirectional filters, smart empty states, card view dedup, global.R bootstrap
- Vendor ASySD deduplication engine into R/asys_dedup.R; remove ASySD GitHub dependency
- Drop app2.R (superseded by refactored app.R)
- Migrate all pipes to native |>; vectorize APA citation generation
- CRAN compliance: specific @importFrom declarations, globalVariables, remove plogr
- Add renv and renv.lock for reproducible dependency management
- Revise vignettes: benchmark testing, screening phases, db-validation, topic coverage
- Add search string comparison vignette
- Fix deployment workflow: restore rsconnect + deploy steps, add RENV_CONFIG_SNAPSHOT_VALIDATE
- Update CITATION.cff to v0.2.0 with release date
- NEWS.md: full v0.2.0 changelog
- Add custom_dedup_notes.md to .gitignore (local reference doc, not for commit)

- Remove citesource_working_example and citesource_benchmark_testing (deleted)
- Add citesource_search_string_comparison (new)
- Change url from http to https to match DESCRIPTION

- Remove deprecated record_counts_table(), record_summary_table(), precision_sensitivity_table() from R/tables.R
- Add CITESOURCE2_ANALYSIS.md to .Rbuildignore
- Fix pkgdown workflow: install from local source (devtools::install) instead of remotes::install_github() which installed from main branch

split(.$type) and split(.$facet) use magrittr's dot placeholder, which the native pipe does not support. Replace them with anonymous function wrappers.
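A minimal sketch of the anonymous function wrapper pattern, assuming R 4.1 or later. The data frame and the name `plot_data` are hypothetical; only the `split(.$type)` idiom comes from the commit message.

```r
plot_data <- data.frame(
  type  = c("unique", "shared", "unique"),
  facet = c("a", "a", "b"),
  n     = c(3, 5, 2)
)

# magrittr version, which relies on the dot placeholder:
#   plot_data %>% split(.$type)
# The native pipe has no dot, so the call is wrapped in an anonymous
# function that names its argument explicitly.
by_type <- plot_data |> (\(d) split(d, d$type))()

print(names(by_type))
```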

calculate_initial_records(), calculate_detailed_records(), and calculate_phase_records() moved to count.R; create_initial_record_table(), create_detailed_record_table(), and create_precision_sensitivity_table() moved to tables.R. Delete new_count_and_table.R.

- Add CITATION.cff, REQUIREMENTS.*, ASySD info to .Rbuildignore
- Document show_labels and log_scale params in plot_source_overlap_heatmap
- Fix devtools::install(upgrade = 'never') to upgrade = FALSE (older devtools compat)

…urce

pak (used by extra-packages) marks transitive deps as 'deps' source type, which shinyapps.io rejects. renv::install installs into renv's own library with proper GitHub source metadata that rsconnect and shinyapps.io both understand.

Use renv::install for CiteSource

Fix workflow conditional syntax for GitHub Pages deploy step

- Fix Title to use title case per CRAN policy
- Rewrite Description to not start with package name
- Add CRAN eval guard to all vignettes so chunks are skipped when vignette data is absent (R CMD check), preventing file-not-found errors
- Exclude vignette data directories, shinytest fixtures, and .claude worktree from build via .Rbuildignore, reducing tarball from 31 MB to 2.6 MB
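One common way to write an eval guard like the one described above is to set knitr's global `eval` chunk option from a file-existence check in the vignette's setup chunk. The sketch below assumes that approach; the directory path is hypothetical and the vignettes' actual guard is not shown in this description.

```r
# Hypothetical data location; the vignettes' real paths are not shown here.
data_dir <- file.path("vignette_data", "example_search")
data_available <- dir.exists(data_dir)

# In a vignette setup chunk, this makes every later chunk run only when
# the data directory is present, so R CMD check skips them cleanly.
knitr::opts_chunk$set(eval = data_available)
```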

- Update .Rbuildignore to prevent worktree from entering tarball
- Quote 'RIS' and 'CSV' in Description to avoid spell-check flags
- Replace pre-built with prebuilt in Description to avoid hyphen-split flag
- Fix maintainer email to tnriley@gmail.com
- Replace \dontrun{} with \donttest{} in reimport_csv example
- Update cran-comments.md: correct note count, add reverse dependency statement

- Remove "An R Package for" from Title field per CRAN policy
- Add single quotes around 'shiny' in Description field
- Update RoxygenNote and add Config/roxygen2/version after roxygen2 upgrade

- Add @return/@value tags to all five flagged exported functions: export_bib, export_ris, plot_contributions, plot_source_overlap_upset, reimport_ris; side-effect functions use "No return value" phrasing
- Remove default filename values from export_csv(), export_ris(), and export_bib() so functions no longer write to getwd() by default
- Update examples to use tempfile() instead of bare filenames
- Fix write_refs() fallback (file=TRUE) to write to tempdir()
- Fix reimport_csv() example: replace \donttest{} with if(interactive()) to prevent execution of a non-existent path during --run-donttest check
- Regenerate all affected .Rd files via devtools::document()
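The tempfile() pattern used in the updated examples can be sketched in base R as follows. The data frame is hypothetical, and write.csv() stands in for the package's export functions.

```r
refs <- data.frame(title = c("A", "B"), year = c(2020, 2021))

# Write to a session-specific temporary file rather than into getwd().
out_file <- tempfile(fileext = ".csv")
write.csv(refs, out_file, row.names = FALSE)

# Read it back, then clean up the temporary file.
reread <- read.csv(out_file)
unlink(out_file)

print(reread)
```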

- Add CRAN-SUBMISSION to .Rbuildignore to suppress top-level file note
- Add resubmission section to cran-comments.md with point-by-point responses to CRAN reviewer feedback
- Update R CMD check results to reflect clean 0/0/1 result
- Track CRAN-SUBMISSION artifact from initial submission

Replace .data$id with -"id" in across() call per tidyselect 1.2.0

Previously, confirming manual duplicate pairs triggered generate_dup_id and merge_metadata across all unique citations regardless of how many pairs were selected. For large datasets this was unnecessarily slow. Now only the records whose duplicate_id appears in additional_pairs are processed. Unaffected records are passed through unchanged and recombined with bind_rows, turning an O(N) operation into O(manual pairs).
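The subset-then-recombine approach described above can be sketched with dplyr as below. The column names `id1` and `id2`, the sample data, and the merge step are assumptions for illustration; the package's generate_dup_id and merge_metadata steps are replaced here by a simple placeholder.

```r
library(dplyr)

unique_citations <- data.frame(
  duplicate_id = c("a", "b", "c", "d", "e"),
  title = c("Study one", "Study one ", "Study three", "Study four", "Study five"),
  stringsAsFactors = FALSE
)
# Manually confirmed duplicate pairs (hypothetical structure).
additional_pairs <- data.frame(id1 = "a", id2 = "b", stringsAsFactors = FALSE)

# Split the data into records touched by a manual pair and everything else.
affected_ids <- unique(c(additional_pairs$id1, additional_pairs$id2))
affected   <- filter(unique_citations, duplicate_id %in% affected_ids)
unaffected <- filter(unique_citations, !duplicate_id %in% affected_ids)

# Placeholder for the expensive step: map both members of each pair to the
# first member's ID, then collapse each group to a single row.
group_of <- c(
  setNames(additional_pairs$id1, additional_pairs$id1),
  setNames(additional_pairs$id1, additional_pairs$id2)
)
merged <- affected |>
  mutate(duplicate_id = unname(group_of[duplicate_id])) |>
  group_by(duplicate_id) |>
  summarise(title = first(trimws(title)), .groups = "drop")

# Unaffected records pass through unchanged.
result <- bind_rows(unaffected, merged)
print(result)
```

This placeholder does not handle chained pairs (a-b plus b-c); it only shows that the costly work is limited to the rows named in `additional_pairs`.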

Adds a `fields` parameter to `export_csv()` with three modes:
- "full" (default): all columns, reimportable into CiteSource
- "standard": core bibliographic + cite_source/label/string, suitable for import into RELApp and other screening tools
- character vector: user-defined custom column selection

Non-full exports emit a warning that the file cannot be reimported into CiteSource via reimport_csv(). In the Shiny app, adds a radio button preset selector and a conditional custom column picker to the CSV download card. Also fixes the Shiny CSV download handler to use export_csv() (which correctly sets row.names = FALSE) instead of write.csv().
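A minimal base R sketch of how the three `fields` modes might be resolved into a column selection. The function `select_export_columns()` and the contents of `standard_cols` are hypothetical; the exact fields in the "standard" preset are not listed in this description.

```r
# Hypothetical column set for the "standard" preset.
standard_cols <- c("title", "author", "year",
                   "cite_source", "cite_label", "cite_string")

select_export_columns <- function(data, fields = "full") {
  # "full" keeps every column, so the file stays reimportable.
  if (identical(fields, "full")) {
    return(data)
  }
  # "standard" uses the preset; anything else is a custom column vector.
  cols <- if (identical(fields, "standard")) standard_cols else fields
  warning("Reduced export: this file cannot be reimported with reimport_csv().",
          call. = FALSE)
  data[, intersect(cols, names(data)), drop = FALSE]
}

refs <- data.frame(
  title = "A", author = "X", year = 2020, doi = "10/x",
  cite_source = "wos", cite_label = "search", cite_string = "s1",
  duplicate_id = "1", stringsAsFactors = FALSE
)

full     <- select_export_columns(refs)
standard <- suppressWarnings(select_export_columns(refs, "standard"))
custom   <- suppressWarnings(select_export_columns(refs, c("title", "doi")))
```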

…el position
- Standard preset label now lists exact fields instead of vague description
- Custom column picker marks required CiteSource reimport fields with ' *'
- Adds legend note '* required for CiteSource reimport' in custom picker
- Moves Cite CiteSource panel to top of export tab

Renames citesource_analysis_across_screening_phases.rmd to .Rmd so pkgdown finds it on case-sensitive Linux CI runners. Fixes broken expression syntax in manual deploy workflow condition.

TNRiley and others added 30 commits November 19, 2025 21:25

Merge pull request #241 from ESHackathon/main

1ff003f

update with current main

Refactor export_csv() to use expand_single_metadata_column()

89b799d

removed test metadata completeness feature maybe future dev

c174fcf

removed "go to pair #" function in card view options

5d2c396

added information on how deduplication works

6e94404

Documentation

b7077ec

Fix _pkgdown.yml: remove deleted vignettes, add new one, fix URL scheme

665a765

Documentation

0233ecc

Fix magrittr dot placeholder incompatibility with native pipe

b70e7b6

Documentation

e1c673e

Fix CRAN check warnings and CI workflow error

b520410

Documentation

e98de1a

Update cran-comments.md with clean check results and vendored code note

12e6b5b

Merge branch 'dev' into Trevor-Patch

c2ffc88

TNRiley and others added 15 commits May 13, 2026 20:13

Merge pull request #254 from ESHackathon/fix-citesource-dep

43e9cfb

Use renv::install for CiteSource

Fix workflow conditional syntax for GitHub Pages deploy step

30b6603

Merge pull request #255 from ESHackathon/fix-citesource-dep

fafd647

Fix workflow conditional syntax for GitHub Pages deploy step

updates for CRAN submission

73bdb3a

Documentation

9ab814d

Merge branch 'dev' of https://github.com/ESHackathon/CiteSource into dev

e01038f

Documentation

d33896d

description update - prebuilt caught in spellcheck so removed

ad97566

Merge branch 'dev' of https://github.com/ESHackathon/CiteSource into dev

8c3cd76

update gitignore

3c48992

Updated CRAN comments with win-devel result

a003088

update page URL for landing page and vignettes

febbdd8

TNRiley requested review from DrMattG, LukasWallrich and kaitlynhair May 15, 2026 17:44

TNRiley and others added 12 commits May 18, 2026 09:57

Fix progress bar crash when importing 100+ files in Shiny

05abbea

Fix integer/double type error in cli progress bar total

f4ac05f

Fix DESCRIPTION title formatting and quote package names

cd08b9a

Add Win-devel check results to cran-comments

728bc0a

Fix tidyselect deprecation warning in generate_apa_reference

6b2a4bd

Documentation

c502d73

fix: rename vignette to .Rmd and fix manual workflow deploy condition

0019839

TNRiley wants to merge 68 commits into main from dev