Skip to content

Add trim_cr utility function and apply it to CSV parsing in multiple files#417

Open
drbergman wants to merge 1 commit into
MathCancer:developmentfrom
drbergman:fix-parse-cr-lines-unix
Open

Add trim_cr utility function and apply it to CSV parsing in multiple files#417
drbergman wants to merge 1 commit into
MathCancer:developmentfrom
drbergman:fix-parse-cr-lines-unix

Conversation

@drbergman
Copy link
Copy Markdown
Collaborator

Fix CRLF line endings breaking CSV parsers on Unix/macOS

Problem

On Unix and macOS, std::getline does not strip the carriage return (\r) from lines with Windows-style CRLF (\r\n) endings. This is a common real-world scenario: Excel exports CSV with CRLF line endings even on macOS, and files authored on Windows carry the same encoding. Every line retains a trailing \r after getline, causing string comparisons to silently fail — e.g. "default\r" != "default", "oxygen\r" != "oxygen". This affects cell loading, rules parsing, and substrate initial conditions.

Closes / related: #272

Prior to this fix, loading a CRLF cell CSV would produce console output like:

Loading cells from detailed (v2) CSV file ./config/cells.csv ... 
 not found!ll_Definition for default
Warning! CSV file requests creating cell type default
        at 0 0 0 but I don't recognize that type. Skipping cell!

Creating default (type=0) at 10 0 0 

The first cell is skipped because "default\r" does not match "default". The last line of the file succeeds because it has no trailing \r (no CRLF after the final newline). The garbled not found!ll_Definition for default line is caused by the \r returning the cursor to column 0 mid-output and overwriting part of the previous message.

What changed

Added trim_cr(std::string& line) as a shared inline utility in BioFVM/BioFVM_utilities.h. It strips a trailing \r if present and is a no-op on LF-only files. It is called immediately after every std::getline that reads from a file in the affected parsers.

Files changed

File Change
BioFVM/BioFVM_utilities.h Added inline void trim_cr(std::string& line)
BioFVM/BioFVM_microenvironment.cpp Added #include "BioFVM_utilities.h"; call trim_cr after each file getline in load_initial_conditions_from_csv
modules/PhysiCell_geometry.cpp Call trim_cr after each file getline in load_cells_csv, load_cells_csv_v1, load_cells_csv_v2
core/PhysiCell_rules.cpp Call trim_cr after each file getline in parse_csv_rules_v0, parse_csv_rules_v1, parse_csv_rules_v3

Testing

Create a minimal cell CSV:

x,y,z,type
0,0,0,default
10,0,0,default

To save as CRLF in VS Code: Click the LF or CRLF indicator in the bottom-right status bar → select CRLF → save.

To save as LF in VS Code: Same control → select LF → save.

Confirm cells load correctly with both versions. Prior to this fix, the CRLF version would fail to match cell type names and skip all cells or exit with an error.

Copilot AI review requested due to automatic review settings May 18, 2026 19:46
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses CRLF (\r\n) line endings causing trailing \r characters to remain after std::getline on Unix/macOS, which breaks CSV parsing and downstream string comparisons in PhysiCell/BioFVM loaders and parsers.

Changes:

  • Added a shared inline trim_cr(std::string&) utility in BioFVM/BioFVM_utilities.h.
  • Applied trim_cr immediately after std::getline in multiple CSV-reading code paths (cells CSV loaders, rules CSV parsers, and substrate initial conditions CSV loader).
  • Added an explicit include of BioFVM_utilities.h where needed.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
modules/PhysiCell_geometry.cpp Trims trailing \r for cell CSV parsing (v1/v2 and version detection).
core/PhysiCell_rules.cpp Trims trailing \r for multiple rules CSV parser versions.
BioFVM/BioFVM_utilities.h Introduces trim_cr as a reusable inline helper.
BioFVM/BioFVM_microenvironment.cpp Trims trailing \r in substrate initial conditions CSV loader and adds required include.
Comments suppressed due to low confidence (1)

BioFVM/BioFVM_microenvironment.cpp:1479

  • char c = line.c_str()[0]; (and later fixed-position indexing into line) assumes std::getline succeeded and that the line has enough characters. If the CSV is empty or the first line is blank/short, this is undefined behavior. Add a guard for line.empty() (and length checks before using [2] / [4]) and emit a clear error if the header/first row is missing or malformed.
	std::string line; 
	std::getline( file , line );
	trim_cr(line);
	char c = line.c_str()[0];
	std::vector<int> substrate_indices;
	bool header_provided = false;
	if( c == 'X' || c == 'x' )
	{ 
		// do not support this with a header yet
		if ((line.c_str()[2] != 'Y' && line.c_str()[2] != 'y') || (line.c_str()[4] != 'Z' && line.c_str()[4] != 'z'))
		{

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread BioFVM/BioFVM_microenvironment.cpp
Comment thread BioFVM/BioFVM_utilities.h
Comment thread modules/PhysiCell_geometry.cpp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants