Ok, let me begin this by stating that I am not a DBA, so someone may correct me. I found out today that just because you use nvarchar or ntext or any other Unicode field type in SQL Server, the data may not actually be stored as Unicode. If you do not prefix your string literals with N, the data isn't treated as Unicode. So, for example, to store the string 'foo' as Unicode, you need to write N'foo'.
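To make that concrete, here's a minimal sketch (the Greetings table and the Japanese text are just made-up examples):

-- Hypothetical table, just for illustration
CREATE TABLE Greetings (Msg NVARCHAR(50));

-- Without the N prefix the literal is a plain varchar constant, so characters
-- outside the database's default code page are silently converted (often to '?')
INSERT INTO Greetings (Msg) VALUES ('日本語');

-- With the N prefix the literal is a true nvarchar (Unicode) constant and survives intact
INSERT INTO Greetings (Msg) VALUES (N'日本語');

SELECT Msg FROM Greetings;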
Now, this is news to me, and it seems a bit silly. If I specify column X as Unicode, why in the hell do I have to force the issue when I'm writing SQL?
I've actually never even seen that syntax before - has anyone else?
Archived Comments
Yeah, I've been using the N syntax for quite some time now, after finding out the same information myself. It doesn't make much sense to me either why you would need the N.
It's a guess, but I think if you pass SQL a Unicode string (be it UTF-8 or otherwise) it will convert it to its internal Unicode format (presumably UTF-16). If you pass it some crazy string in another encoding (e.g. SHIFT_JIS), then it has no way of knowing what encoding it is, and so it should leave it alone (Unicode-encoding it would cause issues). The N just tells it whether or not it can convert the string.
Personally, I want to be able to just set everything to UTF-8 and continue on my merry way.
I think if you just pass UTF-8 strings to SQL, everything would be alright (except for string functions in SQL).
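For what it's worth, you can watch that conversion happen with something like this (the Ω character is just an example, and the exact result depends on the database's default collation):

-- The plain literal is interpreted through the database's default code page,
-- so a character that code page can't represent is typically replaced with '?'
SELECT 'Ω' AS without_n, N'Ω' AS with_n;
-- On a typical Latin1 collation this returns: ?   Ω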
Jon's hit it, though to nitpick, SQL Server, like Java, etc., uses UCS-2 internally. UTF-16 is "extended" UCS-2.
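You can see the two-byte storage with DATALENGTH (the varchar figure below assumes a single-byte default collation):

-- nvarchar data is stored as 2-byte UCS-2/UTF-16 code units
SELECT DATALENGTH(N'foo');  -- 6 bytes (3 characters x 2 bytes each)
SELECT DATALENGTH('foo');   -- 3 bytes (plain varchar, 1 byte per character)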
I wrote a lengthy blog on this subject recently. It's the third most recent entry as of today. You can use this link..
http://www.trajiklyhip.com/...
..to view the blog, but beware you'll lose the site navigation.
Try PostgreSQL (postgresql.org): it's free, fast, Unicode-compliant, and doesn't need any funny N syntax.
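For comparison, a quick sketch of the same idea in PostgreSQL (assuming the database was created with UTF8 encoding; the table is made up):

-- A plain quoted literal is enough; the server interprets it using the
-- client/database encoding (UTF-8 here), so no N prefix is required
CREATE TABLE greetings (msg TEXT);
INSERT INTO greetings (msg) VALUES ('日本語');
SELECT msg FROM greetings;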