If you follow me on Twitter you may have recently seen me mention "Project Picard" a few times. While I still can't publicly talk about what it entails, I'm hoping to share what I've learned while working on it. Picard makes use of ColdFusion 9 and ORM. While I've played a bit with ORM (check out my demo content management system here) this current project is definitely much larger and more complex than anything I've done with ORM before. I recently ran into a problem that I want to share. Please remember this is new to me and that I might not do the best job explaining it. I do think it is something other people will run into so I wanted to share my findings as soon as possible.
So what's the problem? This issue came about when I stored an entity (an entity is just another way of saying a persistent CFC, and yes, I'll probably say CFC and entity interchangeably) in the Session scope. Everything with this particular CFC worked fine while testing, but once I actually stored the entity I ran into odd errors.
The entity was a User component. I'll share the code here just so you can see what it looks like.
component persistent="true" {
    property name="id" generator="native" ormtype="integer" fieldtype="id";
    property name="usertype" fieldtype="many-to-one" cfc="usertype" fkcolumn="typeidfk";
    property name="userstatus" fieldtype="many-to-one" cfc="userstatus" fkcolumn="statusidfk";
    property name="guid" ormtype="string";
    property name="email" ormtype="string";
    property name="nickname" ormtype="string";
    property name="firstname" ormtype="string";
    property name="lastname" ormtype="string";
}
There isn't much to this component yet. But make note of the many-to-one fields. Those were a late addition to my code base. After I added them and tried to use one (in this case, usertype), I got an odd error:
Message: could not initialize proxy - no Session
Detail:
Extended Info:
Tag Context: E (-1)
E (-1)
I was so confused by the E(-1) that I missed the clue as to what was truly going wrong here: "no Session."
What happened here is rather simple if you understand two basic concepts. First off, Hibernate has a concept of a session. Do not confuse this with ColdFusion's Session scope. I think it's best to think of the Hibernate session much like a ColdFusion Request scope instance. The Hibernate session represents the current request and handles all data manipulation. If you, for example, ask for an object and then change it, the session will know this and will handle the updates. As another example, if you ask for the same object twice in one request, Hibernate is smart enough to know it doesn't have to go back to the database. Please read Mark Mandel's excellent blog post, "Explaining Hibernate Sessions," for more details.
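Here is a minimal sketch of those two behaviors, assuming a User row with a primary key of 1 exists (that id is made up for illustration). Within a single request both calls should hand back the same object, so a change made through one variable should be visible through the other:

```cfml
// Assumption: a User with primary key 1 exists in the database.
first  = entityLoadByPK("User", 1);
second = entityLoadByPK("User", 1); // same request, so Hibernate can reuse what it already loaded

// Change the entity through one variable and read it through the other.
// Note: with default ORM settings the Hibernate session tracks this change
// and writes it to the database when the request ends.
first.setNickname("picard");
writeOutput(second.getNickname());
```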
So why do we care? Hibernate tries its best to be as lazy as possible. As we are all good programmers, we know that laziness is simply a way to be as efficient as possible. In this case - notice the many-to-one relationships? Hibernate said to itself, "Hey, this requires me to get more data and create more objects, and you know what, Ray may not even use them. So I'll wait till he asks for them."
This efficiency though is exactly what bit me in the rear. On user login I had asked Hibernate for the User entity. I copied this to my Session scope. On a later request I looked at the related property userType, and since Hibernate had never loaded this I got the error. The "no Session" message basically meant that Hibernate no longer knew how to deal with it and simply gave up.
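To make the sequence concrete, here is a rough sketch of the two requests involved. The form.email lookup is a stand-in for whatever your login code does, not what Picard actually uses:

```cfml
// Request 1 (login): load the user and keep it in the Session scope.
// form.email is a hypothetical login field.
criteria = { email = form.email };
user = entityLoad("User", criteria, true); // true = return a single entity
session.user = user; // usertype and userstatus have not been loaded yet

// Request 2 (any later page): the Hibernate session from request 1 is gone,
// so the lazy property can no longer be fetched. This is where I hit
// "could not initialize proxy - no Session".
type = session.user.getUserType();
```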
I spoke with Rupesh of Adobe on this and with his help was able to come up with two "rules" you may want to keep in mind.
- When running a getWhatever on an entity stored in a persistent scope (such as Session), any lazy-loaded property that was not loaded during the original request will throw an error. I can get around this a few ways.
a) Use entityMerge. This forces the entity back into the current Hibernate session.
b) Disable the lazy loading. This is what I did for Picard. I just added lazy="false" to the two properties.
c) My least favorite option - before storing the entity in the Session, simply load the related properties. For example:
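// userId is a placeholder for however you look the user up
ray = entityLoadByPK("User", userId);
ray.getUserType(); // touch the lazy property so it loads now
session.user = ray;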
Notice I run getUserType but don't actually store the result.
- If I actually want to change the entity stored in the persistent scope I should entityMerge the component before running entitySave. In theory one could build a method in User.cfc to wrap this for you. That way I could do session.user.save() and it would merge itself and entitySave itself as well.
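The save() wrapper mentioned in the second rule might look roughly like this. Treat it as a sketch: the method name and its return value are my own choices, and it relies on entityMerge returning the copy of the entity that is attached to the current Hibernate session, so check that against the documentation for your version. The two relationship properties also show option (b), lazy="false".

```cfml
component persistent="true" {
    property name="id" generator="native" ormtype="integer" fieldtype="id";
    // Option (b): load the related objects right away instead of lazily.
    property name="usertype" fieldtype="many-to-one" cfc="usertype" fkcolumn="typeidfk" lazy="false";
    property name="userstatus" fieldtype="many-to-one" cfc="userstatus" fkcolumn="statusidfk" lazy="false";
    property name="nickname" ormtype="string";
    // ...remaining properties as in the User component above...

    // Hypothetical helper: merge this (possibly detached) entity into the
    // current Hibernate session, then save the merged copy and return it.
    public any function save() {
        var merged = entityMerge(this);
        entitySave(merged);
        return merged;
    }
}
```

Calling it would then be a matter of changing the stored entity and keeping the merged copy that comes back:

```cfml
session.user.setNickname("Ray");
session.user = session.user.save();
```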
I hope this makes sense. If not, please let me know. So many of us are used to storing our CFCs in the persistent scopes that I can definitely see people running into this issue if we aren't careful.
Archived Comments
Oh nice, I didn't know about the EntityMerge() method. Is that an undocumented method? I can't find it in the documentation (although, it wouldn't be the first time that the Adobe docs Search didn't come up with a match).
I don't have access to the Public CF9 docs - but it is definitely documented _post_ that.
Sounds good. Cool stuff.
"Entity" is the logical database term for a table ("table" is the physical term).
nice, thanks for the tip.
Can a persistent component still extend another component? My personal ORM code includes quite a bit of "default" functionality.
I don't see why not. Picard makes use of inheritance in one area but those CFCs aren't persisted.
@Michael,
Here are some blog posts having to do with CF9 ORM and inheritance:
http://www.bennadel.com/blo...
http://www.silverwareconsul...
Ray,
This is exactly what I figured a lot of CF developers are going to do with Hibernate ;o)
This is why it is important to also understand Hibernate Object State when working with CF9 ORM.
I have another article here:
http://www.compoundtheory.c...
That attempts to explain it.
But the short version is - when you put the object in CF Session, and go to another request, it becomes _detached_. Which means it no longer has access to any Hibernate Session, and therefore, can't lazy load (or do quite a few things).
To get it to work again, you have to move it back to a _persistent_ state, and there are several strategies for doing this. Check out my blog post for more details.
By any wild chance is there a way to tell if an object is detached?
I don't believe there is, but you should generally know by what you are doing with your objects, and knowing the life-cycle of your Hibernate Session.
Generally speaking, it's best to avoid detached objects whenever you can.
It would be nice if there was a simple way to tell Hibernate to load _everything_. Ie, get this ob and fully load it. Yes I can force it by calling getX, getY, etc, but I'd like to be able to do that when calling entityNew.
@Ray, I think you can already do it. Check the doc.
Sorry - but I'm not seeing that. If you can find the function/feature, I'd appreciate it.
You can do it with HQL (I'm pretty sure you can do it with Criteria as well, although I tend towards using HQL, and Criteria isn't openly exposed in CF).
Check out fetch joins:
http://docs.jboss.org/hiber...
So in this case it would be something like:
select user from User as user inner join fetch user.usertype inner join fetch user.userstatus
(On a side note, why 'usertype' and 'userstatus', the 'user' portion is kinda redundant, as it's on the User objects anyway... but that's just me being nitpicky ;o) )
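For anyone wanting to try the fetch join from ColdFusion, a rough sketch using ormExecuteQuery might look like the following. The where clause and the userId variable are additions for illustration (they are not part of the query above), so adjust them to your own lookup:

```cfml
// Load one user together with both related objects in a single query.
// userId is a hypothetical variable holding the user's primary key.
hql = "select u from User as u "
    & "inner join fetch u.usertype "
    & "inner join fetch u.userstatus "
    & "where u.id = ?";
params = [ userId ];
user = ormExecuteQuery(hql, params, true); // true = return a single entity

// The related objects were fetched with the user, so storing it should no
// longer depend on lazy loading in a later request.
session.user = user;
```

Note that an inner join will not return users that lack a usertype or userstatus.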
Interesting. I still think it would be something that should belong in entityNew though.
Sorry - I meant entityLoad.
re: 'interesting. I still think it would be something that should belong in entityLoad though'
I don't think this belongs on EntityGet at all.
When you start getting into complex object graphs, saying 'grab everything', could potentially grab your whole database. Eek!
Really, in Hibernate, get()/EntityLoad() is for simple operations, using the default configuration.
When you start wanting to do more complicated things, on a specific basis, that's when the power of HQL (and also Criteria) queries comes to the fore, as you have full power over fetching strategies, and a variety of other options.
You have two options - (1) that is simple and straight forward and (2) that gives you full control over what is going on. Why do you want something in between?
I see your point, but is what I'm doing so complex? It's an object with 2 related props that I'm storing for longer than the Hibernate session. To me that isn't complex.
Yes.. but that is a lack of complexity in *your* example.
Not a lack of complexity in regards to the framework.
It's like saying 'I only need to put one value in a distributed cache that is shared across a cluster. It's only 1 value, that should be a simple thing to implement, right?'
You're basically changing fetching strategies, which if done incorrectly, or in the wrong place in an application, can be a pretty big issue. Having a flag that simply switches fetching strategies all over the place is a pretty bad idea, as it will get really abused in some bad ways.
Let's also not forget there are a slew of other (possibly better) options to solve this problem, some being:
1) just storing the id of the user in session, and retrieving that object as required
2) re-retrieving the object from Hibernate all over again when requested, so that you know the data is completely valid.
(I expound on those approaches some more in my linked article)
This is why dealing with detached objects is generally not a good way to go, it adds a large amount of complexity and management, with not a lot of gain.
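As a rough sketch of option 1, assuming the user variable holds the entity loaded at login (the variable and key names here are only illustrative):

```cfml
// At login: keep only the primary key in the Session scope.
session.userId = user.getId();

// On each later request: reload the entity inside the current Hibernate
// session, so it is not detached and lazy loading works as usual.
user = entityLoadByPK("User", session.userId);
type = user.getUserType();
```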
But I didn't say it would globally disable the fetching. I'd imagine under normal circumstances I'd get a user object and _not_ persist it - but this one, which represents the current user, would be unique.
Not following you on this one... ;o)
Maybe I wasn't clear.
You're essentially doing 2 things, that are actually complex:
1) You want to change fetching strategies at runtime
2) You want to interact with a detached object
These two things are actually kinda tricky, esp. with Hibernate.
Hence the options I've outlined above for alternate strategies for when you want to store an Entity in the CF Session scope.
Does that make more sense?
I guess it just doesn't seem complex to me. I get why Hibernate would be lazy. That makes sense. Don't load X, a related property, unless the user asks you to. That laziness is sensible. But it seems like if I know I'm going to need it, but _after_ the current Hibernate session, then I should be able to ask Hibernate to not be lazy. This one time. Only. ;)
re: 'But it seems like if I know I'm going to need it, but _after_ the current Hibernate session, then I should be able to ask Hibernate to not be lazy. This one time. Only. ;)'
And you can. With HQL or Criteria Queries.
So use them ;)
Heh, ok, you win. (But look, I got the last comment. So there. :)
@Ray, I think you can read this: http://www.rupeshk.org/blog...
see #1: Immediate fetching
:)
Yep, great article. Thanks for posting it here. Option 1 is what I'm doing now - although I _only_ used the lazy attribute.
I think Mark is correct on one aspect, and that storing detached objects in a session scope is a bad idea, especially in terms of scalability. Do you want thousands of detached objects sitting around waiting to die, or simply thousands of ids?
Or if you MUST use session scopes (almost always a bad idea), then get the object, get the three values you wanted, and persist those.
Why is using the session scope a bad idea all of a sudden? We've used it for ages. Moving from storing session.username to session.userOb is a no brainer for me I'd say. And thousands of objects? Well maybe if I have thousands of objects. As it stands, this is not per session, but per _logged in_ session. So this will be a minority of the total traffic.
Using sessions means that you're making a lot of assumptions about site scalability and traffic. You're never going to have a lot of visitors? Never going to have a lot of simultaneous logins? Never going to be popular? Never going to be Slashdotted? Never going to need to maintain uptime and availability?
I almost always design using client variables, which means that I usually won't have to change a thing the second I decide to hang another server out there for load-balancing and fail-over. Sticky-sessions won't wash either, as you lose both load-balancing and fail-over capabilities when you do so. Have a server go down, or take it down manually to do an upgrade, and the user's bound session goes poof.
And if your argument is still that you're only going to have a few dozen users at a time, then the exceedingly minor database hits a dozen users are going to cause to support client-based variables are equally meaningless... and you've still bullet-proofed your future.
I'd say that by storing the user object in the session scope, you are assuming it'll never be modified by another session (such as an admin user changing a flag that allows a feature, etc.). Or is entityMerge() clever enough that things in the database trump things in the variable?
The issue you bring up is something that impacts Session variables whether you use ORM or not.
As for entityMerge being smart enough - I have no idea what would happen if you changed some prop in another session (note lowercase), and then merged it with Session.User. I'd always assume the 'freshest' copy will "win". I'll do a test later this morning.
@Michael: I don't think it makes sense to say you can never use Session variables. Whether I use ORM or not isn't important - it is the size. I can discover that size easily enough by using the Server Monitor and then plan accordingly. I don't think you can say "Lot of traffic means no session variables" as that just isn't true. And as for load balancing and fail over - you have support for that with CFCs/Session variables now.
Size isn't everything. Without knowing how the ORM maintains internal state you could be creating thousands of relatively heavy-weight objects that, say, prevent your database connections from being dropped, closed, or reused.
Worse, you're maintaining them for the entire duration of the page request as opposed to relegating their use inside a single function that obtains what it needs and then drops and frees the object immediately after the function exits.
As to session replication: "When a cluster uses session replication, session data is copied to other servers in the cluster each time it is modified. This can degrade performance if you store a significant amount of information in session scope. If you plan to store a significant amount of information in session scope, consider storing this information in client variables saved in a database."
It also gets worse the more servers you have in your cluster, as session traffic goes up exponentially.
Plus session replication is another one of those $7,500 a copy Enterprise-only features.
No thanks. For the cost of two CFE licenses I can buy two hardware-based LBs and run one with another in warm standby. Those will support, oh, a thousand or so servers, easy.
BTW, if entityMerge hits the database again to "refresh" the object, then there's really no reason whatsoever to persist the darn thing, is there?
Well, if you don't know the amount of data you are using, then you have problems anyway. ;)
As it stands, ORM entities live for an entire page request as that's how long the Hibernate session will live. You can force it to close early, but by default it is going to equal the CF Request.
I guess I'm not saying you are wrong per se - but it almost sounds as if you think one can't use ORM at all. I think like anything it is perfectly fine as long as you know what you are doing.
A possible solution to the Session/client scope issue when it comes to scaling: What about using a distributed memory cache (memcached or EhCache)? You still have to serialize/deserialize your data, but avoid the DB hit. Still not as simple as using the Session scope directly, but makes it easier to scale, without having to worry about round-robin or replication.