Twitter: raymondcamden



Question for readers: Site Map of a ColdFusion site?

01-10-2014 5,400 views ColdFusion 18 Comments

This email came in while I was in Ohio and I'm really stumped by it. Anyone out there have a suggestion for creating a "map" of a ColdFusion app?

I recently inherited a ColdFusion application where no one is left who knows it from the code. They tell me that one of the things with this app is that when a feature or fix is implemented, often days or weeks later a new error will pop up from that earlier implementation.

I was hoping you could help direct me to a way to create a map of the site, similar to what Visio does, only instead of following href, it would also follow cfinclude, cfhttp, etc, ignoring any conditional statements. Similar to a database model where you can track table relationships with primary and foreign keys, I'd like to create a map of the application so I know how each page or file accesses and is accessed by each other page or file.

18 Comments

  • Gene Shats #
    Commented on 01-10-2014 at 12:47 PM
    Great question!
  • Commented on 01-10-2014 at 1:09 PM
    I am not sure how you would go about displaying it graphically, but you could CFDirectory over the whole list and search each file for CFINCLUDEs; if one is found, add it to the list.

    1. Use CFDirectory to get list of pages
    2. Search for cfinclude & extract the filename (Regex Probably)
    3. Update the table accordingly

    Table
    -----------
    Pageid, Pagename, ParentPageId, PageChecked

    Then use a recursion function to display the results. I often use these functions for generating Category / Sub Category / Sub Sub Sub Category Navs.

    The only issue I see offhand is that you would have to make sure the script handles multiple parent page_ids.

    That's my thought right off the bat... After this was working one could add cfhttp etc....

    Randy
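
Randy's three steps can also be sketched outside ColdFusion. The Python below is a minimal sketch, not his demo: it assumes templates end in .cfm or .cfc and that includes are written with a literal template="..." attribute. The function names and the regex are illustrative assumptions. Storing one (parent, child) row per include, rather than a single ParentPageId column, is one way to cope with a file that has several parents.

```python
import os
import re

# Matches <cfinclude template="..."> with single or double quotes, any case.
INCLUDE_RE = re.compile(
    r'<cfinclude\b[^>]*?\btemplate\s*=\s*(["\'])(.*?)\1',
    re.IGNORECASE | re.DOTALL,
)

def list_pages(root, extensions=(".cfm", ".cfc")):
    """Step 1: collect every template under root, as paths relative to root."""
    pages = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extensions):
                full = os.path.join(folder, name)
                pages.append(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(pages)

def find_includes(source):
    """Step 2: extract the template attribute of each cfinclude tag."""
    return [match.group(2) for match in INCLUDE_RE.finditer(source)]

def build_rows(root):
    """Step 3: one (parent_page, included_template) row per include found."""
    rows = []
    for page in list_pages(root):
        path = os.path.join(root, page)
        with open(path, encoding="utf-8", errors="replace") as fh:
            for template in find_includes(fh.read()):
                rows.append((page, template))
    return rows
```

The included_template values are still the raw attribute text at this point, so relative paths have not been resolved yet.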
  • Commented on 01-10-2014 at 1:26 PM
    As I start to code a demo for this, I see that you would also need to account for directory structure in the end report.
  • Gene Shats #
    Commented on 01-10-2014 at 1:39 PM
    Randy,
    How would you handle mixtures of relative and absolute paths in the links? Also, do you think there might be a way to use the new html5/css3/javascript libraries to draw a visual map?
  • Chip #
    Commented on 01-10-2014 at 1:41 PM
    I think Randy is on the right track, I'd suggest adding counters to the report so you could develop a heat map type of view based on the commonality of the files.

    Though if you capture that data, you might want to look at feeding it into something like Gephi (https://gephi.org/). Based on that, you could size your nodes using the heat map values from the earlier report and use the relationships to populate the edges.

    I think that outputs to Sigma.js as well (http://sigmajs.org/) for browsing.
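
The counters and the hand-off to a graph tool could look roughly like this. It is a sketch under assumptions: the edge list is a plain list of (source, target) pairs, and the Source/Target/Weight column names are what I understand Gephi's spreadsheet import to recognize, so check its documentation before relying on them. Both function names are made up for illustration.

```python
import csv
import io
from collections import Counter

def inbound_counts(edges):
    """Count how many times each file is referenced (the 'heat' value)."""
    return Counter(target for _source, target in edges)

def edges_to_csv(edges):
    """Render edges as CSV text with Source, Target and Weight columns."""
    weights = Counter(edges)  # repeated (source, target) pairs add weight
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return out.getvalue()
```

A file with a high inbound count is a shared include that many pages depend on, which is exactly where a "fix" is most likely to break something weeks later.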
  • Commented on 01-10-2014 at 1:50 PM
    Gene,

    Good question.

    Probably use expandPath to expand relative paths. I know CF and/or CFLIB has some pretty cool functions for extracting filenames.

    I think the end result would be getting all the files into a full path so you could keep track of which files you have parsed and more importantly you need to know the path to properly map parent files with their child files.

    Not sure on the last part of your question.

    Randy
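
If the scan runs outside ColdFusion, where expandPath is not available, the same normalization can be approximated by hand. This is a sketch with one loud assumption: a template value starting with a slash is treated as relative to the web root, which ignores any ColdFusion mappings the server may define. The function name and the web_root parameter are hypothetical.

```python
import posixpath

def resolve_template(parent, template, web_root="/"):
    """Return a normalized path for `template` as referenced from `parent`.

    parent   -- normalized path of the including file, e.g. "/admin/index.cfm"
    template -- the raw attribute value, e.g. "../inc/header.cfm"
    """
    template = template.replace("\\", "/")
    if template.startswith("/"):
        # Assumption: a leading slash means "relative to the web root".
        # Real ColdFusion mappings can point somewhere else entirely.
        joined = posixpath.join(web_root, template.lstrip("/"))
    else:
        # Relative values are resolved against the including file's folder.
        joined = posixpath.join(posixpath.dirname(parent), template)
    return posixpath.normpath(joined)
```

With every file reduced to one normalized path, "../inc/header.cfm" seen from /admin/index.cfm and "/inc/header.cfm" seen from anywhere end up as the same node.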
  • roger tubby #
    Commented on 01-10-2014 at 2:35 PM
    I've been faced with a similar problem and needed to do some coverage analysis of some sites that have hundreds of (probably) unused templates, CFCs, etc.

    I looked at capturing the GetDebugger() events and parsing the trace of calls, but this is obviously expensive and requires debugging to be turned on (not good for production).

    I then did an injection of a simple Trace(GetCurrentTemplatePath()) call at the beginning of each CFML page and have collected some useful coverage data.

    It's strange that this type of info is not more readily available from the framework. I also find it interesting that adding new templates/resource is so much easier than finding out which ones can be removed - there must be a philosophical reason for this.
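
Once the traced template paths have been collected, turning them into a coverage report is a small set comparison. The sketch below is not roger's implementation; it assumes the trace output has already been reduced to one template path per entry, and it compares paths case-insensitively on the assumption of a Windows server. The function name is hypothetical.

```python
def coverage_report(all_templates, traced_paths):
    """Split templates into those seen in the trace output and those never seen."""
    def norm(path):
        # Assumption: Windows-style, case-insensitive paths.
        return path.strip().replace("\\", "/").lower()

    seen = {norm(path) for path in traced_paths if path.strip()}
    hit = sorted(t for t in all_templates if norm(t) in seen)
    never = sorted(t for t in all_templates if norm(t) not in seen)
    return hit, never
```

The "never" list only says a template was not executed during the collection window, so it is a list of candidates to investigate rather than a list of files to delete.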
  • Michael Long #
    Commented on 01-10-2014 at 5:25 PM
    "I also find it interesting that adding new templates/resource is so much easier than finding out which ones can be removed - there must be a philosophical reason for this."

    Roger, the thing is you can't tell what files might be "safe" to remove from a simple site map or tree walk.

    LANDING_PAGE.CFM might have no links to it from other CFM pages on your site, but zap it and you might take down your company's Google Adwords campaign. Is AFFILIATE.CFM unused, or is a link to it only handed out to clients via email?

    Is an image resource "unused" with no references to it in any file, or is it referenced in a database table?

    Basically, it all boils down to metadata that exists outside of the system itself.
  • Tami Burke #
    Commented on 01-11-2014 at 7:50 AM
    Years ago, Rizal Firmansyah had started an app called 'CF Project Cleaner' that would traverse a CF project looking for unused files, for housekeeping purposes. I don't know if it ever got out of beta. I tried it, and it looked promising at the time (circa CF6-7ish). Perhaps reaching out to him may help. (http://www.masrizal.com/index.cfm?fuseaction=idea....)
  • Tami Burke #
    Commented on 01-11-2014 at 7:57 AM
    Another option I have used in the past, while not elegant, is to open Dreamweaver and search the entire site for a file. It's not automated, and I can also do it in CFBuilder, but Dreamweaver is faster and doesn't interfere with my CFB coding progress.
  • Michael Long #
    Commented on 01-11-2014 at 8:08 AM
    A long time ago I wrote an object-oriented ORM that ran under CF 3.1. Usage was...

    <cf_do class="member" method="new" result="member">

    At some level, that would eventually translate to the cfinclude tag needed to include the code for the desired method.

    <cfinclude template="#classpath##method#.cfm">

    Note that the path and name in the tag are being assembled dynamically. The upshot here is that again a traditional code walk would find none of those "method" files.

    The same would hold true of "dispatch" frameworks like Fusebox (action="xyz").
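
A code walk cannot resolve includes like these, but it can at least flag them so someone knows where the map has holes. Here is a minimal sketch that assumes any template value containing a pound-sign expression is dynamic. The regex and the function name are illustrative.

```python
import re

# Matches <cfinclude template="..."> with single or double quotes, any case.
INCLUDE_RE = re.compile(
    r'<cfinclude\b[^>]*?\btemplate\s*=\s*(["\'])(.*?)\1',
    re.IGNORECASE | re.DOTALL,
)

def classify_includes(source):
    """Return (static, dynamic) lists of cfinclude template values."""
    static, dynamic = [], []
    for match in INCLUDE_RE.finditer(source):
        value = match.group(2)
        # "##" is an escaped literal pound sign in CFML, so drop those pairs
        # first; any "#" that remains belongs to an expression.
        if "#" in value.replace("##", ""):
            dynamic.append(value)
        else:
            static.append(value)
    return static, dynamic
```

Every entry in the dynamic list marks a spot where the real targets live in a variable, a URL parameter, or a database, and has to be traced by hand.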
  • Paul Rowe #
    Commented on 01-13-2014 at 9:37 AM
    I remember considering this sort of thing while I was working at a previous job. My idea, since the site I was working with had a lot of files that were no longer used, was to design something like a spider. There are a few things that can cause trouble for you.

    Start with your entry points and follow a href, form action, cfinclude, cfmodule, cfobject, and cflocation tags. Look for calls to custom tags. When you're evaluating relative paths, I believe they're relative to the file where caller scope is when you're in a custom tag, so your logic needs to account for that.

    Just to throw a wrench in the works, though, the site I was working with would store file paths in a database and you'd have a dynamic value for the a href or cfinclude tags on certain pages.
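
The spider idea comes down to a graph traversal from the entry points. The sketch below assumes the links have already been extracted into a dict that maps each file to the files it references; how that dict gets built (a href, form action, cfinclude, and so on) is left out. The names are hypothetical.

```python
from collections import deque

def reachable_from(entry_points, graph):
    """Breadth-first walk: every file reachable from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def orphan_candidates(all_files, entry_points, graph):
    """Files the walk never reached. Candidates only, not proof of disuse."""
    return sorted(set(all_files) - reachable_from(entry_points, graph))
```

Files whose paths are stored in a database, or that are only linked from outside the site, will show up as orphan candidates even though they are in use.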
  • Bret #
    Commented on 01-14-2014 at 7:34 AM
    I think this might help, but I have not used it personally...

    "Website Cartographer is a tool that allows you to map out your site's structure."
    http://cartographer.riaforge.org/

    "Unlike traditional website mapping tools, which work just by spidering links, Website Cartographer analyses the applications you use (e.g. Fusebox, BlogCFC, etc) and catches all public pages, regardless of where they are linked from (or indeed if the links are correct). In addition to this, Website Cartographer will also make educated guesses to determine additional information (for example, page priority, last modified, and so on) for each page, allowing for exports which are more useful than those produced by other tools."
  • Michael Long #
    Commented on 01-14-2014 at 9:10 AM
    Came up with another example. I have a shopping cart system where people can buy certain products that need special handling on the site.

    In one case, people can buy an email subscription, so code needs to be run to add them to the proper list for the correct length of time. Upgrading a subscription needs to add a year to the length of the current sub, and so on.

    This is done by having a product field that contains a set of keywords that are in fact module names: xyz-new, xyz-renew. When a product is purchased, those modules are called so that their code can be run and their actions performed.

    Again, those are includes where the template is dynamic (e.g. template="#product.module#"). Since the names only exist in a database, they'd look abandoned to a tree-walk routine.
  • roger tubby #
    Commented on 01-14-2014 at 9:21 AM
    The "Website Cartographer" looks promising - at least on paper. There doesn't seem to be any meat behind the writeup and it has been downloaded 0 times.

    If anyone is interested in my implementation of the Trace() insertion logic I could package it up. It will only record templates/components that are actually called so it is more of a coverage analysis than a full site treemap.
  • Commented on 01-14-2014 at 6:00 PM
    I use Xmind for logic mapping. It's not a direct dump from CF, but it's the best tool out there for creating relationships and logic trees. The free version has virtually everything you need. Cross platform and super quick to learn. I love it. We use it at work (ad agency) all the time.

    http://www.xmind.net
  • Commented on 01-18-2014 at 11:11 AM
    A tool like this would be priceless. I end up with projects that nobody else wants or can handle, full of spaghetti code and tons of unused files.
  • Sean #
    Commented on 01-19-2014 at 12:02 AM
    I inherited a large, scatterbrained CF app, parts of the code of which were as much as 12 years old, and there were thousands --really, thousands-- of 'test' and obsolete files included in the dir structure. (You know the drill...you have order.cfm as a production file and then you look in the dir on the server and there are 18 variants of it with people's initials and the word 'test' and dates and all kinds of stuff...people still test in production.) To make things worse, some of the 'test'-named files are actually in use as includes and scheduled tasks and such. Of course, there was also a decade's worth of images and PDFs and such.

    I decided the most efficient way to strip out the crap was to start a blank git repo, and set up a virtual machine with CF/SQL Server/IIS on it, then change the hosts file to point all possible domain names / machine names to localhost, and then seal it off from any other outbound connections. Then I started trying to use the app/site locally on that virtual machine, using a combination of Chrome Dev Tools and IIS logs / CF errors to determine what was missing. It took about a week to get 90% or so into the repo.

    Then I handed out copies of the VHD to my team and had them start testing in the same isolated environment and reporting any missing items or issues. Soon I had 95%. When I felt comfortable, I used the repo to deploy to my integration and staging environments and shifted their testing there, and eventually I started deploying to the existing site, knowing that any files I had missed would simply continue to exist in production as before, and whenever they were noticed to be missing, I could add them to the repo.

    I'm happy to say that I feel I have >99.5% of the necessary files in the repo, I can deploy smoothly, and the repo contains something like 5% of the number of files that the production directory has. Once I hit a certain benchmark, I'm going to archive the existing production directory and deploy only my repo files plus non-tracked assets (thousands of images that change regularly, etc).

    Now to rebuild the whole dang app/site from scratch with a modern approach...wish me luck!
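
The "use the IIS logs to see what is missing" step can be partly automated. This sketch assumes the logs are in the W3C extended format, with a #Fields: header line and both the cs-uri-stem and sc-status fields enabled. It is not Sean's tooling, and the function name is made up.

```python
def missing_from_iis_log(lines):
    """Return the sorted set of request paths that were answered with a 404."""
    fields = []
    missing = set()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns for the data rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if row.get("sc-status") == "404":
            missing.add(row.get("cs-uri-stem", ""))
    return sorted(missing)
```

This only catches files requested over HTTP. A missing cfinclude target shows up as a ColdFusion error instead of a 404, so the CF error logs still need to be read alongside it.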
