Closed Bug 24712 Opened 25 years ago Closed 25 years ago

Regexp greedy back-tracking failure

Categories

(Core :: JavaScript Engine, defect, P3)

x86
All
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: rogerl, Assigned: rogerl)

Details

(Keywords: js1.5)

From 'Brad Diller <bdiller@my-deja.com>':

After recently upgrading my JS engine to the current source, I discovered that there is a bug in the Regular Expression object. My version of the interpreter is unable to correctly process the following regular expression objects that I employ to parse key/value pairs from an ini-formatted text file. The following code reproduces the regular expression bug:

    function ProcessIni(sections) {
        this.sections = new Object(sections);
    }

    ProcessIni.identifierExpr = "([\\S]+([ \\t]+[\\S]+)*)";
    ProcessIni.keyValuePair = ProcessIni.identifierExpr + "[ \\t]*=[ \\t]*" + ProcessIni.identifierExpr;

    // return an array of section's key/value pairs
    ProcessIni.prototype.getProfileSection = function(appName, buffer) {
        appName = appName.toLowerCase();
        var section = new Object();
        var re = new RegExp("\\s*\\[" + appName + "\\]([^[]*)" + "(\\[" + ProcessIni.identifierExpr + "\\])*", "i");
        if ( (results = re.exec(buffer)) != null ) {
            var data = results[1];
            if (data != null && data.length > 0) {
                re = new RegExp(ProcessIni.keyValuePair, "g");
                RegExp.multiline = true;
                while ( (results = re.exec(data)) != null) {
                    section[results[1].toLowerCase()] = results[3];
                }
            }
        }
        this.sections[appName] = section;
    }

    var processIni = new ProcessIni();
    processIni.getProfileSection("course",
        "[Course]\n" +
        " Course_Creator = Test Suite\n" +
        " Course_ID = TEST-COURSE-1\n" +
        " Course_System = Microsoft Visual Basic 4.0\n" +
        " Course_Title = TEST SUITE GENERATED COURSE\n" +
        " Level = 1\n" +
        " Max_Fields_CST = 5\n" +
        " Total_AUs = 11\n" +
        " Total_Blocks = 4\n" +
        " Version = 2.2\n" +
        " \n" +
        " \n" +
        "[Course_Behavior]\n" +
        " Max_Normal = 3\n"
    );

    var courseTitle = processIni.sections["course"]["course_title"];
    if (courseTitle == null) {
        print("No Course_Title found");
    }
    else
        print("Course.Course_Title = " + courseTitle);

__________

The JS shell is failing to parse the key/value pairs in the course subsection of the passed buffer; consequently, no “Course_Title” key/value is found. I believe revisions 3.19 of jsregexp.c and 3.6 of jsregexp.h are the culprits. After reverting to revisions 3.18 of jsregexp.c and 3.5 of jsregexp.h, and rebuilding the JS engine, the interpreter correctly parsed the key/value pairs in the passed string buffer.

***************************************************************************

I think this can be reduced to the following failure:

    re = /([\S]+([ \t]+[\S]+)*)[ \t]*=[ \t]*[\S]+/
    re.exec("Course_Creator = Test");

which returns null.
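For reference, the reduced case can be written as a self-checking snippet. This is a minimal sketch, not the contents of the regression test later added for this bug; the helper name `runReducedCase` and the variable `reducedCaseOk` are made up for illustration, and the expected values are what standard matching semantics should give once the starred group is allowed to back off to zero iterations.

```javascript
// Hypothetical helper (not from the bug report): runs the reduced regexp
// from this bug and returns the match array, or null on failure.
function runReducedCase() {
  var re = /([\S]+([ \t]+[\S]+)*)[ \t]*=[ \t]*[\S]+/;
  return re.exec("Course_Creator = Test");
}

var m = runReducedCase();

// The inner group first consumes " =" and " Test", then has to give both
// iterations back so that "[ \t]*=" can match the literal "=".
var reducedCaseOk =
  m !== null &&
  m[0] === "Course_Creator = Test" &&
  m[1] === "Course_Creator" &&
  m[2] === undefined; // group 2 ends up taking part in zero iterations
```

An engine with the bug returns null from `runReducedCase()`, so `reducedCaseOk` comes out false there.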
Keywords: js1.5
Fixed: the greedy recurser was mishandling the back-track-to-zero case. Also, parenCount wasn't getting reset, so bogus junk accumulated in the parens state.
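The two behaviours named in the fix can be illustrated from script. The checks below are only a sketch of the observable contract, assuming the symptoms looked roughly like this; they are not derived from the jsregexp.c change itself, and all variable names are invented for the example.

```javascript
// 1. Back-track to zero: (b+)* first takes the "b", then must give it up
//    entirely so that the trailing literal "b" can match.
var zero = /(a+(b+)*)b/.exec("ab");
var backtrackToZeroOk =
  zero !== null &&
  zero[0] === "ab" &&
  zero[1] === "a" &&
  zero[2] === undefined; // capture from the abandoned iteration is dropped

// 2. Fresh paren state: captures from one match must not leak into the next.
var re = /(a)|(b)/g;
var first = re.exec("ab");  // matches "a" through group 1
var second = re.exec("ab"); // matches "b" through group 2
var parenStateOk =
  first !== null && second !== null &&
  first[1] === "a" && first[2] === undefined &&
  second[1] === undefined && second[2] === "b";
```

Both flags should be true in an engine with correct back-tracking and capture bookkeeping.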
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Added testcase ecma_3/RegExp/regress-24712.js
Marking verified.
Status: RESOLVED → VERIFIED