Posted in ColdFusion | Posted on 02-02-2010
A few days ago I saw a request on Twitter for code that would convert Roman numerals to decimal. CFLib has a UDF for going from decimal to Roman, but not the other way. I did a bit of searching, and while I found a bunch of code libraries, I didn't find one that explained the logic behind the translation. Finally I came across this page: Roman Numerals, which I thought explained the issue very nicely. The basic process to convert from Roman to decimal is:
1) Read the numbers from left to right.
2) Add each number to a running total...
3) Except when the next number is larger than the current number. Then you take the pair together and add the difference (the larger minus the smaller).
So with this logic in mind, I came up with the following UDF. It assumes valid Roman numerals for input. But it seems to work ok.
function romantodec(input) {
    var romans = {};
    var result = 0;
    var pos = 1;
    var char = "";
    var thisSum = "";
    var nextchar = "";

    romans["I"] = 1;
    romans["V"] = 5;
    romans["X"] = 10;
    romans["L"] = 50;
    romans["C"] = 100;
    romans["D"] = 500;
    romans["M"] = 1000;

    while(pos lte len(input)) {
        char = mid(input, pos, 1);
        //are we NOT at the end?
        if(pos != len(input)) {
            //check my next character - if bigger, replace with a sub
            nextchar = mid(input, pos+1, 1);
            if(romans[char] < romans[nextchar]) {
                thisSum = romans[nextchar] - romans[char];
                result += thisSum;
                pos+=2;
            } else {
                result += romans[char];
                pos++;
            }
        } else {
            result += romans[char];
            pos++;
        }
    }

    return result;
}
You can see how it follows the basic, 'left to right, add the numbers together' process, and how it notices when the current character has a higher number to the right of it. I wrote up a quick test script for it like so:
<cfset inputs = "XX,XI,IV,VIII,MC,DL,XL">
<cfloop index="input" list="#inputs#">
    <cfoutput>
    #input#=#romantodec(input)#<br/>
    </cfoutput>
</cfloop>
Which produced:
XX=20
XI=11
IV=4
VIII=8
MC=1100
DL=550
XL=40
You can download this UDF at CFLib now: romanToDecimal
p.s. Sorry for those still waiting for UDF approval at CFLib. It is a volunteer process (myself, Scott Pinkston, Todd Sharp) so be patient!


#NumberFormat(1999, "roman")#
Which gives you:
MCMXCIX
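Since NumberFormat already covers the decimal-to-Roman direction, it could double as an oracle for testing a Roman-to-decimal function. A rough sketch, assuming the romantodec UDF from the post is already defined on the page and that the "roman" mask behaves for every value in the loop. The upper bound of 2000 is an arbitrary choice, and the variable names are made up for illustration:

```cfml
<cfscript>
// Round-trip check: decimal -> Roman (built-in) -> decimal (romantodec from the post).
// Assumption: romantodec() is already defined or included on this page.
failures = 0;
for (n = 1; n lte 2000; n++) {
    roman = numberFormat(n, "roman");
    back = romantodec(roman);
    if (back neq n) {
        failures++;
        writeOutput("#n# became #roman#, which converted back to #back#<br/>");
    }
}
writeOutput("Round-trip failures: #failures#");
</cfscript>
```

This only exercises numerals that NumberFormat itself produces, so it says nothing about how the UDF treats odd input such as IIX.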
We could write a rule that looks for IIN and simply replaces it with Val(N)-2.
Now I'm going to ask you to put up or shut up! ;) If you can find me proof that IIX (or IIC, etc) is valid, I'll support it. ;)
IIC is not even a valid Roman numeral (because you can't subtract 2 directly from 100; you would need to write it as XCIIX, for 10 less than 100, then 2 less than 10).
Also...
This form of notation closely follows Latin language usage, in which the number 18 is pronounced as duodeviginti, meaning two [deducted] from twenty (duo-de-viginti), and 19 is pronounced undeviginti, meaning one [deducted] from twenty (un-de-viginti).
So, if you can have 2 from 20, IIXX would be valid and come up with 18.
On a last note, it is clear that the rules are not really rules and have been changed over the last 2000 years. If IIX is not valid, at least it should not return 10.
Good post BTW. Thank you for sharing.
http://en.wikipedia.org/wiki/Roman_numerals
If you can come up with a mod to the UDF to make it support XXY where X < Y, then I'll put it in. Otherwise, I can live with it. ;)
@Raymond +1 - seems as if all the converters out there use a similar approach.
if((pos + 2) <= len(input)) {
    nextchar2 = mid(input, pos+2, 1);
} else {
    //no third character: use 'I', the smallest numeral, so the double-subtraction check cannot match
    nextchar2 = 'I';
}
if(romans[char] == romans[nextchar] && romans[nextchar] < romans[nextchar2]) {
    thisSum = romans[nextchar2] - romans[nextchar] - romans[char];
    result += thisSum;
    //this branch reads three characters; the version that follows advances by 3
    pos+=2;
} else if(romans[char] < romans[nextchar]) {
    thisSum = romans[nextchar] - romans[char];
    result += thisSum;
    pos+=2;
} else {
    result += romans[char];
    pos++;
}
if((pos + 2) <= len(input)) {
    nextchar2 = mid(input, pos+2, 1);
} else {
    //no third character: use 'I', the smallest numeral, so the double-subtraction check cannot match
    nextchar2 = 'I';
}
if(romans[char] == romans[nextchar] && romans[nextchar] < romans[nextchar2]) {
    //two equal numerals before a larger one (IIX): subtract both from the larger
    thisSum = romans[nextchar2] - romans[nextchar] - romans[char];
    result += thisSum;
    pos+=3;
} else if(romans[char] < romans[nextchar]) {
    thisSum = romans[nextchar] - romans[char];
    result += thisSum;
    pos+=2;
} else {
    result += romans[char];
    pos++;
}
Either way, if IIX is not valid, it certainly should not return 10. It should return INVALID.
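One way to get that behaviour would be to check the input against the modern subtractive form before converting. A minimal sketch, assuming the usual 1 to 3999 range. The name isStandardRoman is hypothetical and not part of the CFLib UDF:

```cfml
<cfscript>
// Hypothetical helper: true only for numerals written in the modern standard form.
function isStandardRoman(input) {
    var value = trim(input);
    // thousands, then hundreds, tens and units, each allowing one subtractive pair
    var pattern = "^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$";
    // The pattern also matches an empty string, so require at least one character.
    return len(value) gt 0 and reFindNoCase(pattern, value) eq 1;
}
</cfscript>
```

A caller could then return INVALID whenever isStandardRoman(input) is false and only call the converter otherwise. This accepts MCMXCIX and rejects IIX and IIXX, so it deliberately sides with the strict reading of the rules rather than the historical variants discussed above.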
This post makes me want to play with a very simple tokenizer. I don't know why, this is just really an interesting problem. Take the "comment" tag as an example. It is only meaningful in the following combination:
<!---
This means the parser has to read in 5 characters to build it... but if it can't (say it's an HTML comment, not a CFML one), then suddenly it has to take the 4 preceding characters and treat them as individual tokens.
Maybe this is only interesting to me :)
http://www.jacfb.com/index.cfm/2010/2/4/Translatin...
Hmm, it keeps telling me my comment is spam.
That being said - your mod looks perfect! It works. But my ego forbids me from truly accepting that so I'm going to delete your comment and remove your BlogCFC from the Internet. Thanks for playing!
(No, instead, I'm going to update the CFLib version. Thanks!)
Invalid CFML construct found on line 33 at column 22.
ColdFusion was looking at the following text:
{
The CFML compiler was processing:
* a script statement beginning with "var" on line 33, column 9.
* a script statement beginning with "function" on line 32, column 1.
* a cfscript tag beginning on line 22, column 2.
The error occurred in D:\Inetpub\serv\roman.cfm: line 33
31 : */
32 : function romantodec(input) {
33 : var romans = {};
34 : var result = 0;
35 : var pos = 1;